

Before I forget, a simple virtual machine

5 October 2014 21:50

Before I forget, I had meant to write about a toy virtual machine which I've been playing with.

It is register-based with ten registers, each of which can hold either a string or int, and there are enough instructions to make it fun to use.
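To make the "string or int" idea concrete, here is a minimal sketch of such a register file in Python. It is not the real implementation; the class and method names (`Registers`, `get_int`, `get_str`) are invented for illustration, and the only details taken from the description above are the ten registers and the two value types.

```python
from typing import List, Union

Value = Union[int, str]


class Registers:
    """Ten registers, each holding either an int or a string (illustrative only)."""

    COUNT = 10

    def __init__(self) -> None:
        # Assumption: registers start out holding the integer zero.
        self._slots: List[Value] = [0] * self.COUNT

    def _check(self, index: int) -> None:
        if not 0 <= index < self.COUNT:
            raise IndexError(f"no such register: {index}")

    def set(self, index: int, value: Value) -> None:
        self._check(index)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, str)):
            raise TypeError("registers hold only ints or strings")
        self._slots[index] = value

    def get_int(self, index: int) -> int:
        self._check(index)
        value = self._slots[index]
        if not isinstance(value, int):
            raise TypeError(f"register {index} holds a string, not an int")
        return value

    def get_str(self, index: int) -> str:
        self._check(index)
        value = self._slots[index]
        if not isinstance(value, str):
            raise TypeError(f"register {index} holds an int, not a string")
        return value
```

The cost of the flexibility shows up in the getters: every instruction that reads a register has to decide what to do when it finds the wrong kind of value there.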

I didn't go overboard and write a complete grammar, or a real compiler, but I did do enough that you can compile and execute obvious programs.

First compile from the source to the bytecodes:

$ ./compiler examples/loop.in

Mmm bytecodes are fun:

$ xxd  ./examples/loop.raw
0000000: 3001 1943 6f75 6e74 696e 6720 6672 6f6d  0..Counting from
0000010: 2074 656e 2074 6f20 7a65 726f 3101 0101   ten to zero1...
0000020: 0a00 0102 0100 2201 0102 0201 1226 0030  ......"......&.0
0000030: 0104 446f 6e65 3101 00                   ..Done1..

Now the compiled program can be executed:

$ ./simple-vm ./examples/loop.raw
[stdout] register R01 = Counting from ten to zero
[stdout] register R01 = 9 [Hex:0009]
[stdout] register R01 = 8 [Hex:0008]
[stdout] register R01 = 7 [Hex:0007]
[stdout] register R01 = 6 [Hex:0006]
[stdout] register R01 = 5 [Hex:0005]
[stdout] register R01 = 4 [Hex:0004]
[stdout] register R01 = 3 [Hex:0003]
[stdout] register R01 = 2 [Hex:0002]
[stdout] register R01 = 1 [Hex:0001]
[stdout] register R01 = 0 [Hex:0000]
[stdout] register R01 = Done
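Reading the hex dump against this output suggests a simple layout: one opcode byte, then its operands, with 16-bit values stored low byte first. The Python sketch below runs the same 57 bytes under that reading. The opcode meanings in the comments are guesses inferred from the dump and the output, not taken from the project's source, and the zero flag and 16-bit wrap-around are assumptions too.

```python
from typing import Dict, List, Union

# The 57 bytes shown by xxd above.
PROGRAM = bytes.fromhex(
    "3001 1943 6f75 6e74 696e 6720 6672 6f6d "
    "2074 656e 2074 6f20 7a65 726f 3101 0101 "
    "0a00 0102 0100 2201 0102 0201 1226 0030 "
    "0104 446f 6e65 3101 00"
)


def run(program: bytes) -> List[str]:
    """Interpret the bytes and return the lines that would be printed."""
    regs: Dict[int, Union[int, str]] = {}
    out: List[str] = []
    zero = False  # assumed flag: set by subtraction, read by the jump
    ip = 0
    while True:
        op = program[ip]
        if op == 0x00:    # guessed: exit
            return out
        elif op == 0x01:  # guessed: store int; operands: reg, 16-bit value
            reg = program[ip + 1]
            regs[reg] = int.from_bytes(program[ip + 2:ip + 4], "little")
            ip += 4
        elif op == 0x02:  # guessed: print int; operand: reg
            reg = program[ip + 1]
            value = regs[reg]
            out.append(f"[stdout] register R{reg:02d} = {value} [Hex:{value:04x}]")
            ip += 2
        elif op == 0x12:  # guessed: jump if last result was not zero
            target = int.from_bytes(program[ip + 1:ip + 3], "little")
            ip = ip + 3 if zero else target
        elif op == 0x22:  # guessed: subtract; operands: dest, a, b
            dest, a, b = program[ip + 1:ip + 4]
            regs[dest] = (regs[a] - regs[b]) & 0xFFFF
            zero = regs[dest] == 0
            ip += 4
        elif op == 0x30:  # guessed: store string; operands: reg, length, bytes
            reg, length = program[ip + 1], program[ip + 2]
            regs[reg] = program[ip + 3:ip + 3 + length].decode("ascii")
            ip += 3 + length
        elif op == 0x31:  # guessed: print string; operand: reg
            reg = program[ip + 1]
            out.append(f"[stdout] register R{reg:02d} = {regs[reg]}")
            ip += 2
        else:
            raise ValueError(f"unknown opcode {op:#04x} at offset {ip}")


for line in run(PROGRAM):
    print(line)
```

Under this reading the jump operand `26 00` is offset 0x26, which is the subtract instruction, so the loop body is "subtract, print, jump back unless zero".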

There could be more operations added, but I'm pleased with the general behaviour, and embedding is trivial. The only two things that make this even remotely interesting are:

  • Most toy virtual machines don't cope with labels and jumps. This does.
    • Even though it was a real pain to go patching up the offsets.
    • Having labels be callable before they're defined is pretty mandatory in practice.
  • Most toy virtual machines don't allow integers and strings to be stored in registers.
    • Now I've done that I'm not 100% sure it's a good idea.
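The offset patching mentioned above is usually handled by emitting a placeholder for each jump target, remembering where it sits, and filling it in once every label has been seen. Here is a minimal sketch of that approach in Python. The source syntax (`:name`, `jump name`, `nop`) and the opcode values are hypothetical, chosen only for this example; they are not the real compiler's input format or instruction set.

```python
from typing import Dict, List, Tuple

JUMP = 0x10  # hypothetical opcode: jump to a 16-bit little-endian offset
NOP = 0x50   # hypothetical one-byte filler instruction


def assemble(lines: List[str]) -> bytes:
    code = bytearray()
    labels: Dict[str, int] = {}         # label name -> byte offset
    fixups: List[Tuple[int, str]] = []  # (offset of placeholder, label name)

    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(":"):        # ":name" defines a label at this offset
            labels[line[1:]] = len(code)
        elif line.startswith("jump "):  # "jump name" may refer forwards
            code.append(JUMP)
            fixups.append((len(code), line.split()[1]))
            code += b"\x00\x00"         # placeholder, patched below
        elif line == "nop":
            code.append(NOP)
        else:
            raise ValueError(f"cannot parse: {line!r}")

    # All labels are known now, so go back and fill in each placeholder.
    for offset, name in fixups:
        if name not in labels:
            raise ValueError(f"undefined label: {name}")
        code[offset:offset + 2] = labels[name].to_bytes(2, "little")
    return bytes(code)


# A forward reference: "end" is used before it is defined.
print(assemble(["jump end", "nop", ":end", "nop"]).hex())
```

Because the fix-up list is processed only after the whole input has been read, backward and forward jumps go through exactly the same code path.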

Anyway, that concludes today's computer-fun.



Comments on this entry

Stan Schwertly at 16:21 on 5 October 2014

Did you just decide to do this one day? Can you talk more about what inspired you to write a toy VM?

Really enjoyed the code, it's very easy to read. It helped make the idea less scary to see it implemented too :)

Steve Kemp at 16:48 on 5 October 2014

In the past I used to do a lot of programming in assembly, initially on the Z80, and later on Intel.

I've always enjoyed such things, and used to love both patching binaries for extra lives/free-registration, and shaving bytes from programs, making them ever so slightly smaller. These days those kinds of skills aren't so useful, except in security research, and so I'm rusty. I wanted to see how hard/easy it was to design an environment for a virtual CPU with a random ad hoc instruction-set. Partly because it would be fun, and partly because I thought I could experiment with LLVM / JIT stuff later.

As it happens, getting it working was very simple. The first code to read instructions only had two, "EXIT" and "PRINTSTEVE", but that was enough to make me add more and more. Deciding which kinds of opcodes to implement was the hardest part, along with handling the JUMP/LABEL support in the compiler. I just added what seemed useful at the time, and left gaps to fill in instructions later.

The "compiler" was written last because outputting hex-codes by hand, via emacs, was a real pain. I think I'm glad I did it that way round, but it perhaps wasn't the ideal way to start!

I suspect I'll write a decompiler in the near future, then leave it alone for a while.

Kevin Veroneau at 03:22 on 6 October 2014

Hello Steve, your toy VM has a similar feel to one I built using Python a while back (and that I'm still improving on). I found it very educational and entertaining to create a toy virtual machine, and it taught me some rather interesting concepts and ideas about how to write programs. If you want to check out my toy VM, you can find it on my Bitbucket page: https://bitbucket.org/kveroneau/simple-cpu

I actually built the assembler at the same time I wrote the VM, as it was easier to play around and test it out during development. One idea which you may want to bring over to your toy VM is the concept of memory segments, like a code segment, data segment, and stack segment. Tell me what you think about my VM, if you have some time. I wouldn't mind hearing some feedback from someone else who also created a toy VM. Feel free to go back in the revision history to see how it evolved.

Steve Kemp at 08:45 on 6 October 2014

Kevin, I like what you've done, even though I'm not a fan of Python ;)

I can see you were influenced by Intel stuff, with the interrupt instruction and the register naming, though for me the biggest shock was reading your example programs and realizing I didn't implement a stack and the push/pop instructions. D'oh!