Updated myy compiler, and bought a watch.

Saturday, 16 February 2019

The simple math-compiler I introduced in my previous post has had a bit of an overhaul, so that now it is fully RPN-based.

Originally the input was RPN-like, now it is RPN for real. It handles error-detection at run-time, and generates a cleaner assembly-language output:

In other news I bought a new watch, which was a fun way to spend some time.

I love mechanical watches, clocks, and devices such as steam-engines. While watches are full of tiny and intricate parts I like the pretence that you can see how they work, and understand them. Steam engines are seductive because their operation is similar; you can almost look at them and understand how they work.

I've got a small collection of watches at the moment, ranging from €100-€2000 in price, these are universally skeleton-watches, or open-heart watches.

My recent purchase is something different. I was looking at used Rolexs, and found some from 1970s. That made me suddenly wonder what had been made the same year as I was born. So I started searching for vintage watches, which had been manufactured in 1976. In the end I found a nice Soviet Union piece, made by Raketa. I can't prove that this specific model was actually manufactured that year, but I'll keep up the pretence. If it is +/- 10 years that's probably close enough.

My personal dream-watch is the Rolex Oyster (I like to avoid complications). The Oyster is beautiful, and I can afford it. But even with insurance I'd feel too paranoid leaving the house with that much money on my wrist. No doubt I'll find a used one, for half that price, sometime. I'm not in a hurry.

(In a horological-sense a "complication" is something above/beyond the regular display of time. So showing the day, the date, or phase of the moon would each be complications.)

| No comments

 

I decided it was time to write a compiler

Thursday, 31 January 2019

I've spent some time in the recent past working with interpreters, and writing a BASIC interpreter, but one thing I'd not done is write a compiler.

Once upon a time I worked for a compiler-company, but I wasn't involved with the actual coding at that time. Instead I worked on other projects, and did a minor amount of system-administration.

There are enough toy-languages that it didn't seem worthwhile to write a compiler for yet another one. At the same time writing a compiler for a full-language would get bogged down in a lot of noise.

So I decided to simplify things: I would write a compiler for "maths". Something that would take an expression and output assembly-language, which could then be compiled.

The end result is this simple compiler:

Initially I wrote something that would parse expressions such as 3 + 4 * 5 and output an abstract-syntax-tree. I walked the tree and started writing logic to pick registers, and similar. It all seemed like more of a grind than a useful exercise - and considering how ludicrous compiling simple expressions to assembly language already was it seemed particularly silly.

So once again I simplified, deciding to accept only a simple "reverse-polish-like" expression, and outputing the assembly for that almost directly.

Assume you want to calculate "((3 * 5) +2)" you'd feed my compiler:

  3 5 * 2 +

To compile that we first load the initial state 3, then we walk the rest of the program always applying an operation with an operand:

  • Store 3
  • 5 * -> multiply by 5.
  • 2 + -> add 2.
  • ..

This approach is trivial to parse, and trivial to output the assembly-language for: Pick a register and load your starting value, then just make sure all your operations apply to that particular register. (In the case of Intel assembly it made sense to store the starting value in EAX, and work with that).

A simple program would then produce a correspondingly simple output. Given 1 1 + we'd expect this output:

  .intel_syntax noprefix
  .global main

  .data
    result: .asciz "Result %d\n"

  main:
    mov rax, 1
    add rax, 1

    lea rdi,result
    mov rsi, rax
    xor rax, rax
    call printf
    xor rax, rax
    ret

With that output you can assemble the program, and run it:

 $ gcc -static -o program program.s
 $ ./program
 Result 2

I wrote some logic to allow calculating powers too, so you can output 2 ^ 8, etc. That's just implemented the naive-way, where you have a small loop and multiply the contents of EAX by itself the appropriate number of times. Modulus is similarly simple to calculate.

Adding support for named variables, and other things, wouldn't be too hard. But it would involve register-allocation and similar complexity. Perhaps that's something I need to flirt with, to make the learning process complete, but to be honest I can't be bothered.

Anyway check it out, if you like super-fast maths. My benchmark?

$  time perl -e 'print 2 ** 8 . "\n"'
256
real    0m0.006s
user    0m0.005s
sys     0m0.000s

vs.

$ math-compiler -compile '2 8 ^'
$ time ./a.out
Result 256

real   0m0.001s
user   0m0.001s
sys    0m0.000s

Wow. Many wow. Much speed. All your base-two are belong to us.

| No comments

 

This is mostly how I make bread

Sunday, 28 October 2018

I've talked about making bread in the past on this blog, here's my typical recipe - this recipe requires only two bowls, a spatula, a dutch-oven, oven-gloves, and some scales.

Ingredients:

  • 1kg flour
  • 950g water.
    • If you want to keep it simple you could use 1kg of water instead. It simplifies the measuring if you're using a balance-scale
  • 1.5 teaspoon salt.
  • 1 teaspoon instant/dried yeast.

Here is the flour, water, & yeast - not pictured salt!

Mix all ingredients together. It will be sticky, and your hands will become messy. Embrace it. (ProTip: Take off your watch and wedding-ring(s) if applicable.) Expect it to take 2-5 minutes to do a decent job. Ensure you scoop your hand right into the bottom of the mixture, to make sure there is no flour clumped together at the bottom of the bowl which is not fully mixed in.

The end result is a sticky mess which will look something like this, perhaps your bowl will be cleaner and you'll have done a better job at mixing all the flour!

Cover the bowl with cling-film, and stick in the fridge overnight. (I tend to mix stuff at 6PM in the evening, then come back to it around 9AM the following morning, which means the bowl sits in the fridge for 14 hours or so.)

Take the bowl out of the fridge and you should see it has "grown", and it will have a lot of bubbles on the top, as growth of the yeast emitted CO2.

You'll also see that it is significantly more gloopy, as chains of gluten have formed

Anyway now your bowl is on the counter, out of the fridge, you want turn on your oven and set it to 250°C, with the dutch oven inside it. While you're waiting for the oven to heat up transfer the sticky mess to a new bowl, lined with baking-paper. This will make it easier to add to the pot when we're ready to actually cook it.

As per the previous video the mixture will be very sticky, but you should be able to manage it. Don't worry too much about the shape, it'll become a "loaf-shape" when it cooks, the only reason we're moving it is because it is much easier to lift the mixture into the pot by holding the paper, than trying to scrape it from your cold bowl to your very hot dutch-oven. Anyway once you've moved it to a new bowl you'll have something like this:

When your oven has reached the right temperature carefully transfer the mixture, in its paper, to the dutch oven which you'll then return to the oven.

  • Cook for 40 minutes at 250°C
  • Cook for an additional 20 minutes at 200°C
    • Just turn down the temperature-dial.
  • Finally open the oven, remove the lid from the pot, and cook for a further 15 minutes (still at 200°C)

The end result will be something similar to this:

Enjoy!

Notes:

  • You can see vestiages of the paper-wrapper in the final result.
  • I like my bread dark.
  • Let it cool down before you eat it, something like 45-60 minutes once you've removed from the oven.

| 7 comments.

 

A visual basic server

Tuesday, 23 October 2018

So my previous post described a BASIC interpreter I'd written.

Before the previous release I decided to ensure that it was easy to embed, and that it was possible to extend the BASIC environment such that it could call functions implemented in golang.

One of the first things that came to mind was to allow a BASIC script to plot pixels in a PNG. So I made that possible by adding "PLOT x,y" and "SAVE" primitives.

Taking that step further I then wrote a HTTP-server which would allow you to enter a BASIC program and view the image it created. It's a little cute at least.

Install it from source, or fetch a binary if you prefer, via:

$ go get -u github.com/skx/gobasic/goserver

Then launch it and point your browser at http://localhost:8080, and you'll be presented with something like this:

Fun times.

| 2 comments.

 

So I wrote a basic BASIC

Saturday, 20 October 2018

So back in June I challenged myself to write a BASIC interpreter in a weekend. The next time I mentioned it was to admit defeat. I didn't really explain in any detail, because I thought I'd wait a few days and try again and I was distracted at the time I wrote my post.

As it happened that was over four months ago, so clearly it didn't work out. The reason for this was because I was getting too bogged down in the wrong kind of details. I'd got my heart set on doing this the "modern" way:

  • Write a lexer to spit the input into tokens
    • LINE-NUMBER:10, PRINT, "Hello, World"
  • Then I'd take those tokens and form an abstract syntax tree.
  • Finally I'd walk the tree evaluating as I went.

The problem is that almost immediately I ran into problems, my naive approach didn't have a good solution for identifying line-numbers. So I was too paralysed to proceed much further.

I sidestepped the initial problem and figured maybe I should just have a series of tokens, somehow, which would be keyed off line-number. Obviously when you're interpreting "traditional" BASIC you need to care about lines, and treat them as important because you need to handle fun-things like this:

10 PRINT "STEVE ROCKS"
20 GOTO 10

Anyway I'd parse each line, assuming only a single statement upon a line (ha!) you can divide it into:

  • Number - i.e. line-number.
  • Statement.
  • Newline to terminate.

Then you could have:

code{blah} ..
code[10] = "PRINT STEVE ROCKS"
code[20] = "GOTO 10"

Obviously you spot the problem there, if you think it through. Anyway. I've been thinking about it off and on since then, and the end result is that for the past two evenings I've been mostly writing a BASIC interpreter, in golang, in 20-30 minute chunks.

The way it works is as you'd expect (don't make me laugh ,bitterly):

  • Parse the input into tokens.
  • Store those as an array.
  • Interpet each token.
    • No AST
    • No complicated structures.
    • Your program is literally an array of tokens.

I cheated, horribly, in parsing line-numbers which turned out to be exactly the right thing to do. The output of my naive lexer was:

INT:10, PRINT, STRING:"Hello World", NEWLINE, INT:20, GOTO, INT:10

Guess what? If you (secretly) prefix a newline to the program you're given you can identify line-numbers just by keeping track of your previous token in the lexer. A line-number is any number that follows a newline. You don't even have to care if they sequential. (Hrm. Bug-report?)

Once you have an array of tokens it becomes almost insanely easy to process the stream and run your interpreter:

 program[] = { LINE_NUMBER:10, PRINT, "Hello", NEWLINE, LINE_NUMBER:20 ..}

 let offset := 0
 for( offset < len(program) ) {
    token = program[offset]

    if ( token == GOTO ) { handle_goto() ; }
    if ( token == PRINT ) { handle_print() ; }
    .. handlers for every other statement
    offset++
 }

Make offset a global. And suddenly GOTO 10 becomes:

  • Scan the array, again, looking for "LINE_NUMBER:10".
  • Set offset to that index.

Magically it all just works. Add a stack, and GOSUB/RETURN are handled with ease too by pushing/popping the offset to it.

In fact even the FOR-loop is handled in only a few lines of code - most of the magic happening in the handler for the "NEXT" statement (because that's the part that needs to decide if it needs to jump-back to the body of the loop, or continue running.

OK this is a basic-BASIC as it is missing primtives (CHR(), LEN,etc) and it only cares about integers. But the code is wonderfully simple to understand, and the test-case coverage is pretty high.

I'll leave with an example:

10 REM This is a program
00 REM
 01 REM This program should produce 126 * 126 * 10
 02 REM  = 158760
 03 REM
 05 GOSUB 100
 10 FOR i = 0 TO 126
 20  FOR j = 0 TO 126 STEP 1
 30   FOR k = 0 TO 10
 40    LET a = i * j * k
 50   NEXT k
 60  NEXT j
 70 NEXT i
 75 PRINT a, "\n"
 80 END
100 PRINT "Hello, I'm multiplying your integers"
110 RETURN

Loops indented for clarity. Tokens in upper-case only for retro-nostalgia.

Find it here, if you care:

I had fun. Worth it.

I even "wrote" a "game":

| 2 comments.

 

Pulling back

Saturday, 29 September 2018

I've updated my fork of the monkey programming language to allow object-based method calls.

That's allowed me to move some of my "standard-library" code into Monkey, and out of Go which is neat. This is a simple example:

 //
 // Reverse a string,
 //
 function string.reverse() {
   let r= "";
   let l = len(self);

   for( l > 0 ) {
      r += self[l-1];
      l--;
   }
   return r;
 }

Usage is the obvious:

  puts( "Steve".reverse() );

Or:

  let s = "Input";
  s = s.reverse();
  puts( s + "\n" );

Most of the pain here was updating the parser to recognize that "." meant a method-call was happening, once that was done it was otherwise only a matter of passing the implicit self object to the appropriate functions.

This work was done in a couple of 30-60 minute chunks. I find that I'm only really able to commit to that amount of work these days, so I've started to pull back from other projects.

Oiva is now 21 months old and he sucks up all my time & energy. I can't complain, but equally I can't really start concentrating on longer-projects even when he's gone to sleep.

And that concludes my news for the day.

Goodnight dune..

| No comments

 

PAM HaveIBeenPwned module

Monday, 17 September 2018

So the PAM module which I pondered about in my previous post now exists:

I did mention "sponsorship" in my post which lead to a couple of emails, and the end result of that was that a couple of folk donated to charity in my/its name. Good enough.

Perhaps in the future I'll explore patreon/similar, but I don't feel very in-demand so I'll avoid it for the moment.

Anyway I guess it should be Debian-packaged for neatness, but I'll resist for the moment.

| No comments

 

Recommendations for software?

Saturday, 15 September 2018

A quick post with two questions:

  • What spam-filtering software do you recommend?
  • Is there a PAM module for testing with HaveIBeenPwnd?
    • If not would you sponsor me to write it? ;)

So I've been using crm114 to perform spam-filtering on my incoming mail, via procmail, for the past few years.

Today I discovered it had archived about 12Gb of my email history, because I'd never pruned it. (Beneath ~/.crm/.)

So I wonder if there are better/simpler/different Bayesian-filters out there at that I should be switching to? Recommendations welcome - but don't say "SpamAssassin", thanks!

Secondly the excellent Have I Been Pwned site provides an API which allows you to test if a password has been previously included in a leak. This is great, and I've integrated their API in a couple of my own applications, but I was thinking on the bus home tonight it might be worth tying into PAM.

Sure in the interests of security people should use key-based authentication for SSH, but .. most people don't. Even so, if keys are used exclusively, a PAM module would allow you to validate the password which is used for sudo hasn't previously been leaked.

So it seems like there is value in a PAM module to do a lookup at authentication-time, via libcurl.

| 4 comments.

 

Improving my release process

Tuesday, 4 September 2018

I have written a lot of software which I deploy upon my own systems. Typically I deploy these things via puppet, but some things are more ad-hoc. For the ad-hoc deployments I tend to rely upon ssh to deploy them.

I have a bunch of shell-scripts, each called .deploy, which carries out the necessary steps. Here is an example script which deploys my puppet-summary service upon a host:

  #!/bin/sh

  HOST=master.steve.org.uk
  RELEASE=1.2

   # download
  ssh -t ${HOST} "wget --quiet -O /srv/puppet-summary/puppet-summary-linux-amd64-${RELEASE} https://github.com/skx/puppet-summary/releases/download/release-${RELEASE}/puppet-summary-linux-amd64"

   # symlink
   ssh -t ${HOST} "ln -sf /srv/puppet-summary/puppet-summary-linux-amd64-${RELEASE}  /srv/puppet-summary/puppet-summary"

   # make executable
   ssh -t ${HOST} "chmod 755 /srv/puppet-summary/puppet-summary*"

   # restart
   ssh -t ${HOST} "systemctl restart puppet-summary.service"

As you can see this script is very obvious:

  • Download a binary release from a github-page.
  • Symlinks a fixed name to point to this numbered-release.
  • Ensures the download is executable.
  • Then restarts a service.

This whole process is pretty painless, but it assumes prior-setup. For example it assumes that the systemd unit-file is in-place, and any corresponding users, directories, and configuration-files.

I wanted to replace it with something simple to understand, and that replacement system had to have the ability to do two things:

  • Copy a file from my local system to the remote host.
  • Run a command upon the remote host.

That would let me have a local tree of configuration-files, and allow them to be uploaded, and then carry out similar steps to the example above.

Obviously the simplest way to go would be to use fabric, ansible, salt, or even puppet. That would be too easy.

So I wondered could I write a script like this:

   # Copy the service-file into place
   CopyFile puppet-summary.service /lib/systemd/system/puppet-summary.service
   IfChanged "systemctl daemon-reload"
   IfChanged "systemctl enable puppet-summary.service"

   # Fetch the binary
   Run "wget --quiet -O /srv/puppet-summary/puppet-summary-linux-amd64-${RELEASE} https://github.com/skx/puppet-summary/releases/download/release-${RELEASE}/puppet-summary-linux-amd64"

   # Run the rest here ..
   Run "ln -sf .."

   ..
   Run "systemctl restart puppet-summary.service"

Turns out that connecting to a remote-host, via SSH, and using that single connection to either upload/download files or run commands is very simple in golang. So I'm quite placed with that.

I've only added three primitives, and the ability to set/expand variables:

  • CopyFile [local-file-name] [remote-file-name]
    • This sets a flag if the remote file was missing, or the contents changed.
  • IfChanged
    • Run a command only if the file-copy flag was set.
  • Run
    • Run a command. Unconditionally.

I suspect if I needed much more than that I should use something else. But having dealt with Ansible deployments in the past I'm very glad I didn't use that.

(It makes a lot of sense to have the repositories for deploying application A inside the repository of project A. Another reason to not use ansible, etc. Though I guess fabric would work well with that, as recipes are typically contained in ./fabfile.py. Of course .. python.)

| 1 comment.

 

Project cleanup

Friday, 27 July 2018

For the past couple of days I've gone back over my golang projects, and updated each of them to have zero golint/govet warnings.

Nothing significant has changed, but it's nice to be cleaner.

I did publish a new project, which is a webmail client implemented in golang. Using it you can view the contents of a remote IMAP server in your browser:

  • View folders.
  • View messages.
  • View attachments
  • etc.

The (huge) omission is the ability to reply to messages, compose new mails, or forward/delete messages. Still as a "read only webmail" it does the job.

Not a bad hack, but I do have the problem that my own mailserver presents ~/Maildir over IMAP and I have ~1000 folders. Retrieving that list of folders is near-instant - but retrieving that list of folders and the unread-mail count of each folder takes over a minute.

For the moment I've just not handled folders-with-new-mail specially, but it is a glaring usability hole. There are solutions, the two most obvious:

  • Use an AJAX call to get/update the unread-counts in the background.
    • Causes regressions as soon as you navigate to a new page though.
  • Have some kind of open proxy-process to maintain state and avoid accessing IMAP directly.
    • That complicates the design, but would allow "instant" fetches of "stuff".

Anyway check it out if you like. Bug reports welcome.

| No comments

 

Recent Posts

Recent Tags