About Archive Tags RSS Feed

 

Entries tagged assembly

Writing a brainfuck compiler.

14 June 2020 19:00

So last night I had the idea that it might be fun to write a Brainfuck compiler, to convert BF programs into assembly language, where they'd run nice and quickly.

I figured I could allocate a day to do the work, and it would be a pleasant distraction on a Sunday afternoon. As it happened it only took me three hours from start to finish.

There are only a few instructions involved in brainfuck:

  • >
    • increment the data pointer (to point to the next cell to the right).
  • <
    • decrement the data pointer (to point to the next cell to the left).
  • +
    • increment (increase by one) the byte at the data pointer.
  • -
    • decrement (decrease by one) the byte at the data pointer.
  • .
    • output the byte at the data pointer.
  • ,
    • accept one byte of input, storing its value in the byte at the data pointer.
  • [
    • if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching ] command.
  • ]
    • if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching [ command.

The Wikipedia link early shows how you can convert cleanly to a C implementation, so my first version just did that:

  • Read a BF program.
  • Convert to a temporary C-source file.
  • Compile with gcc.
  • End result you have an executable which you can run.

The second step was just as simple:

  • Read a BF program.
  • Convert to a temporary assembly language file.
  • Compile with nasm, link with ld.
  • End result you have an executable which you can run.

The following program produces the string "Hello World!" to the console:

++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.

My C-compiler converted that to the program:

extern int putchar(int);
extern char getchar();

char array[30000];
int idx = 0;

int main (int arc, char *argv[]) {
  array[idx]++;
  array[idx]++;
  array[idx]++;
  array[idx]++;
  array[idx]++;
  array[idx]++;
  array[idx]++;
  array[idx]++;

  while (array[idx]) {
    idx++;
    array[idx]++;
    array[idx]++;
    array[idx]++;
    array[idx]++;

    while (array[idx]) {
      idx++;
      array[idx]++;
      array[idx]++;
      idx++;
      array[idx]++;
      array[idx]++;
      array[idx]++;
    ..
    ..

The assembly language version is even longer:


global _start
section .text

_start:
  mov r8, stack
  add byte [r8], 8
label_loop_8:
  cmp byte [r8], 0
  je close_loop_8
  add r8, 1
  add byte [r8], 4
label_loop_14:
  cmp byte [r8], 0
  je close_loop_14
  add r8, 1
  add byte [r8], 2
  add r8, 1
  add byte [r8], 3
  add r8, 1
  add byte [r8], 3
  add r8, 1
  add byte [r8], 1
  sub r8, 4
  sub byte [r8], 1
  jmp label_loop_14
close_loop_14:
  add r8, 1
  add byte [r8], 1
  ..

  mov rax, 60
  mov rdi, 0
  syscall
section .bss
stack: resb 300000

Annoyingly the assembly language version ran slower than the C-version, which I was sneakily compiling with "gcc -O3 .." to ensure it was optimized.

The first thing that I did was to convert it to fold adjacent instructions. Instead of generating separate increment instructions for ">>>" I instead started to generate "add xx, 3". That didn't help as much as I'd hoped, but it was still a big win.

After that I made a minor tweak to the way that loops are handled to compare at the end of the loop as well as the start, and that shaved off a fraction of a second.

As things stand I think I'm "done". It might be nice to convert the generated assembly language to something gcc can handle, to drop the dependency on nasm, but I don't feel a pressing need for that yet.

Was a fun learning experience, and I think I'll come back to optimization later.

| No comments

 

Writing an assembler.

3 October 2020 13:00

Recently I've been writing a couple of simple compilers, which take input in a particular format and generate assembly language output. This output can then be piped through gcc to generate a native executable.

Public examples include this trivial math compiler and my brainfuck compiler.

Of course there's always the nagging thought that relying upon gcc (or nasm) is a bit of a cheat. So I wondered how hard is it to write an assembler? Something that would take assembly-language program and generate a native (ELF) binary?

And the answer is "It isn't hard, it is just tedious".

I found some code to generate an ELF binary, and after that assembling simple instructions was pretty simple. I remember from my assembly-language days that the encoding of instructions can be pretty much handled by tables, but I've not yet gone into that.

(Specifically there are instructions like "add rax, rcx", and the encoding specifies the source/destination registers - with different forms for various sized immediates.)

Anyway I hacked up a simple assembler, it can compile a.out from this input:

.hello   DB "Hello, world\n"
.goodbye DB "Goodbye, world\n"

        mov rdx, 13        ;; write this many characters
        mov rcx, hello     ;; starting at the string
        mov rbx, 1         ;; output is STDOUT
        mov rax, 4         ;; sys_write
        int 0x80           ;; syscall

        mov rdx, 15        ;; write this many characters
        mov rcx, goodbye   ;; starting at the string
        mov rax, 4         ;; sys_write
        mov rbx, 1         ;; output is STDOUT
        int 0x80           ;; syscall


        xor rbx, rbx       ;; exit-code is 0
        xor rax, rax       ;; syscall will be 1 - so set to xero, then increase
        inc rax            ;;
        int 0x80           ;; syscall

The obvious omission is support for "JMP", "JMP_NZ", etc. That's painful because jumps are encoded with relative offsets. For the moment if you want to jump:

        push foo     ; "jmp foo" - indirectly.
        ret

:bar
        nop          ; Nothing happens
        mov rbx,33   ; first syscall argument: exit code
        mov rax,1    ; system call number (sys_exit)
        int 0x80     ; call kernel

:foo
        push bar     ; "jmp bar" - indirectly.
        ret

I'll update to add some more instructions, and see if I can use it to handle the output I generate from a couple of other tools. If so that's a win, if not then it was a fun learning experience:

| 2 comments

 

Simple REPL for CP/M, in Z80 assembly

24 June 2023 13:00

So my previous post documented a couple of simple "scripting languages" for small computers, allowing basic operations in a compact/terse fashion.

I mentioned that I might be tempted to write something similar for CP/M, in Z80 assembly, and the result is here:

To sum up it allows running programs like this:

0m 16k{rP _ _}
C3 03 EA 00 00 C3 06 DC 00 00 00 00 00 00 00 00

Numbers automatically get saved to the A-register, the accumulator. In addition to that there are three dedicated registers:

  • M-register is used to specify which RAM address to read/write from.
    • The instruction m copies the value of accumulator to the M-register.
    • The instruction M copies the value of the M-register to the accumulator.
  • K-register is used to execute loops.
    • The instruction k copies the value of accumulator to the K-register.
    • The instruction K copies the value of the K-register to the accumulator.
  • U-register is used to specify which port to run I/O input and output from.
    • The instruction u copies the value of accumulator to the U-register.
    • The instruction U copies the value of the U-register to the accumulator.

So the program above:

  • 0m
    • 0 is stored in the accumulator.
    • m copies the value of the accumulator to the M-register.
  • 16k
    • 16 is stored in the accumulator.
    • k copies the value of the accumulator (16) to the K-register, which is used for looping.
  • { - Starts a loop.
    • The K-register is decremented by one.
    • If the K-register is greater than zero the body is executed, up to the closing brace.
  • Loop body:
    • r Read a byte to the accumulator from the address stored in the M-register, incrementing that register in the process.
    • P: Print the contents of the accumulator.
    • _ _ Print a space.
  • } End of the loop, and end of the program.

TLDR: Dump the first sixteen bytes of RAM, at address 0x0000, to the console.

Though this program allows delays, RAM read/write, I/O port input and output, as well as loops it's both kinda fun, and kinda pointless. I guess you could spend hours flashing lights and having other similar fun. But only if you're like me!

All told the code compiles down to about 800 bytes and uses less than ten bytes of RAM to store register-state. It could be smaller with some effort, but it was written a bit adhoc and I think I'm probably done now.

| No comments