So last night I had the idea that it might be fun to write a Brainfuck compiler, to convert BF programs into assembly language, where they'd run nice and quickly.
I figured I could allocate a day to do the work, and it would be a pleasant distraction on a Sunday afternoon. As it happened it only took me three hours from start to finish.
There are only a few instructions involved in brainfuck:
>
- increment the data pointer (to point to the next cell to the right).
<
- decrement the data pointer (to point to the next cell to the left).
+
- increment (increase by one) the byte at the data pointer.
-
- decrement (decrease by one) the byte at the data pointer.
.
- output the byte at the data pointer.
,
- accept one byte of input, storing its value in the byte at the data pointer.
[
- if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching
]
command.
- if the byte at the data pointer is zero, then instead of moving the instruction pointer forward to the next command, jump it forward to the command after the matching
]
- if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching
[
command.
- if the byte at the data pointer is nonzero, then instead of moving the instruction pointer forward to the next command, jump it back to the command after the matching
The Wikipedia link early shows how you can convert cleanly to a C implementation, so my first version just did that:
- Read a BF program.
- Convert to a temporary C-source file.
- Compile with
gcc
. - End result you have an executable which you can run.
The second step was just as simple:
- Read a BF program.
- Convert to a temporary assembly language file.
- Compile with
nasm
, link withld
. - End result you have an executable which you can run.
The following program produces the string "Hello World!" to the console:
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
My C-compiler converted that to the program:
extern int putchar(int);
extern char getchar();
char array[30000];
int idx = 0;
int main (int arc, char *argv[]) {
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
while (array[idx]) {
idx++;
array[idx]++;
array[idx]++;
array[idx]++;
array[idx]++;
while (array[idx]) {
idx++;
array[idx]++;
array[idx]++;
idx++;
array[idx]++;
array[idx]++;
array[idx]++;
..
..
The assembly language version is even longer:
global _start
section .text
_start:
mov r8, stack
add byte [r8], 8
label_loop_8:
cmp byte [r8], 0
je close_loop_8
add r8, 1
add byte [r8], 4
label_loop_14:
cmp byte [r8], 0
je close_loop_14
add r8, 1
add byte [r8], 2
add r8, 1
add byte [r8], 3
add r8, 1
add byte [r8], 3
add r8, 1
add byte [r8], 1
sub r8, 4
sub byte [r8], 1
jmp label_loop_14
close_loop_14:
add r8, 1
add byte [r8], 1
..
mov rax, 60
mov rdi, 0
syscall
section .bss
stack: resb 300000
Annoyingly the assembly language version ran slower than the C-version, which I was sneakily compiling with "gcc -O3 ..
" to ensure it was optimized.
The first thing that I did was to convert it to fold adjacent instructions. Instead of generating separate increment instructions for ">>>
" I instead started to generate "add xx, 3
". That didn't help as much as I'd hoped, but it was still a big win.
After that I made a minor tweak to the way that loops are handled to compare at the end of the loop as well as the start, and that shaved off a fraction of a second.
As things stand I think I'm "done". It might be nice to convert the generated assembly language to something gcc
can handle, to drop the dependency on nasm
, but I don't feel a pressing need for that yet.
Was a fun learning experience, and I think I'll come back to optimization later.