Monkeying around with interpreters - Result

Monday, 18 June 2018

So I challenged myself to write a BASIC interpreter over the weekend; unfortunately I did not succeed.

What I did was take an existing monkey-repl and extend it with a series of changes to make sure that I understood all the various parts of the interpreter design.

Initially I was just making basic changes:

  • Added support for single-line comments.
    • For example "// This is a comment".
  • Added support for multi-line comments.
    • For example "/* This is a multi-line comment */".
  • Expand \n and \t in strings.
  • Allow the index operation to be applied to strings.
    • For example "Steve Kemp"[0] would result in S.
  • Added a type function.
    • For example "type(3.13)" would return "float".
    • For example "type(3)" would return "integer".
    • For example "type("Moi")" would return "string".

Once I did that I overhauled the built-in functions, allowing callers to register golang functions to make them available to their monkey-scripts. Using this I wrote a simple "standard library" with some math, string, and file I/O functions.
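I won't paste the real code here, but the mechanism is roughly the following - a map from the name visible to monkey-scripts onto a golang function. (This is a minimal sketch; names such as RegisterBuiltin and Object are illustrative rather than the actual API in my fork.)

 package main

 import (
     "fmt"
     "strings"
 )

 // Object is a stand-in for the interpreter's value type.
 type Object interface{}

 // Builtin is the signature used for golang functions exposed to scripts.
 type Builtin func(args ...Object) Object

 // registry maps the name seen by monkey-scripts onto the golang code.
 var registry = map[string]Builtin{}

 // RegisterBuiltin makes a golang function callable from monkey code;
 // the evaluator consults this map when it sees a call-expression.
 func RegisterBuiltin(name string, fn Builtin) {
     registry[name] = fn
 }

 func main() {
     RegisterBuiltin("string.toupper", func(args ...Object) Object {
         s, ok := args[0].(string)
         if !ok {
             return fmt.Errorf("string.toupper: expected a string")
         }
         return strings.ToUpper(s)
     })

     // What the evaluator would do when a script calls string.toupper("moi"):
     fmt.Println(registry["string.toupper"]("moi"))
 }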

The end result was that I could read files, line-by-line, or even just return an array of the lines in a file:

 // "wc -l /etc/passwd" - sorta
 let lines = file.lines( "/etc/passwd" );
 if ( lines ) {
    puts( "Read ", len(lines), " lines\n" )
 }

Adding file I/O was pretty neat, although I only did reading. Looping over a file's contents is a little verbose:

 // wc -c /etc/passwd, sorta.
 let handle = file.open("/etc/passwd");
 if ( handle < 0 ) {
   puts( "Failed to open file" )
 }

 let c = 0;       // count of characters
 let run = true;  // still reading?

 for( run == true ) {

    let r = read(handle);
    let l = len(r);
    if ( l > 0 ) {
        let c = c + l;
    }
    else {
        let run = false;
    }
 };

 puts( "Read " , c, " characters from file.\n" );
 file.close(handle);

This morning I added some code to interpolate hash-values into a string:

 // Hash we'll interpolate from
 let data = { "Name":"Steve", "Contact":"+358449...", "Age": 41 };

 // Expand the string using that hash
 let out = string.interpolate( "My name is ${Name}, I am ${Age}", data );

 // Show it worked
 puts(out + "\n");

Finally I added some type-conversions, allowing strings/floats to be converted to integers, and allowing other values to be converted to strings. With the addition of a math.random function we then got:

 // math.random() returns a float between 0 and 1.
 let rand = math.random();

 // modify to make it from 1-10 & show it
 let val = int( rand * 10 ) + 1 ;
 puts( "math.random() -> ", val , "\n");

The only other significant change was the addition of a new form of function definition. Rather than defining functions like this:

 let hello = fn() { puts( "Hello, world\n" ) };

I updated things so that you could also define a function like this:

 function hello() { puts( "Hello, world\n" ) };

(The old form still works, but this is "clearer" in my eyes.)

Maybe next weekend I'll try some more BASIC work, though for the moment I think my monkeying around is done. The world doesn't need another scripting language, and as I mentioned there are a bunch of implementations of this around.

The new structure I made makes adding a real set of standard-libraries simple, and you could embed the project, but I'm struggling to think of why you would want to. (Though I guess you could pretend you're embedding something more stable than anko and not everybody loves javascript as a golang extension language.)


 

Monkeying around with interpreters

Saturday, 16 June 2018

Recently I've had an overwhelming desire to write a BASIC interpreter. I can't think why, but the idea popped into my mind, and wouldn't go away.

So I challenged myself to spend the weekend looking at it.

Writing an interpreter is a pretty well-understood problem (a small lexer-sketch follows the list):

  • Parse the input into tokens, such as "LET", "GOTO", "INT:3"
    • This is called lexical analysis / lexing.
  • Take those tokens and build an abstract syntax tree.
    • The AST
  • Walk the tree, evaluating as you go.
    • Hey ho.
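
To make that first stage a little more concrete, here is a deliberately tiny lexer-sketch in go. It only splits on whitespace and classifies a handful of token-types, which is far cruder than a real lexer, but it shows the shape of the thing:

 package main

 import (
     "fmt"
     "strings"
     "unicode"
 )

 // Token is what the lexing stage produces.
 type Token struct {
     Type    string // e.g. "KEYWORD", "INT", "IDENT"
     Literal string
 }

 // lex is a toy: it splits on whitespace and classifies each field.
 // A real lexer walks runes, and handles strings, operators, etc.
 func lex(line string) []Token {
     var out []Token
     for _, f := range strings.Fields(line) {
         switch {
         case f == "LET" || f == "GOTO" || f == "PRINT":
             out = append(out, Token{"KEYWORD", f})
         case unicode.IsDigit(rune(f[0])):
             out = append(out, Token{"INT", f})
         default:
             out = append(out, Token{"IDENT", f})
         }
     }
     return out
 }

 func main() {
     // Prints: [{KEYWORD GOTO} {INT 10}]
     fmt.Println(lex("GOTO 10"))
 }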

Of course BASIC is annoying because each line of a program is prefixed by a line-number, for example:

 10 PRINT "HELLO, WORLD"
 20 GOTO 10

The naive way of approaching this is to repeat the whole process for each line. So a program would consist of an array of input-strings, with each line being treated independently.
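
As a sketch of that naive approach (in go, since that's what I ended up playing with): the program is just a map from line-number to source-line, and GOTO only has to change which entry gets evaluated next. The details here are invented for illustration:

 package main

 import "fmt"

 func main() {
     // Each line is stored independently, keyed by its line-number.
     program := map[int]string{
         10: `PRINT "HELLO, WORLD"`,
         20: `GOTO 10`,
     }

     // Keep the line-numbers in order so we know what "the next line" is.
     order := []int{10, 20}

     // A real evaluator would lex/parse each line; here we only show
     // how GOTO works with this layout, and we stop after a few steps.
     pc := 0
     for steps := 0; steps < 4; steps++ {
         line := program[order[pc]]
         fmt.Printf("%d %s\n", order[pc], line)
         if line == `GOTO 10` {
             pc = 0 // jump back to the entry for line 10
         } else {
             pc++
         }
     }
 }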

Anyway reminding myself of all this fun took a few hours, and during the course of that time I came across Writing an interpreter in Go, which seems to be well-regarded. The book walks you through creating an interpreter for a language called "Monkey".

I found a bunch of implementations, which were nice and clean. So to give myself something to do I started by adding a new built-in function rnd(). Then I tested this:

let r = 0;
let c = 0;

for( r != 50 ) {
   let r = rnd();
   let c = c + 1;
}

puts "It took ";
puts c;
puts " attempts to find a random-number equalling 50!";

Unfortunately this crashed. It crashed inside the body of the loop, and it seemed that the projects I looked at each handled the let statement in a slightly odd way - the statement wouldn't return a value, and would instead fall through a case statement, hitting the next implementation.

For example in monkey-interpreter we see that happen in this section. (Notice how there's no return after the env.Set call?)
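
To show the shape of the problem, here is an illustration written from memory - it is not code copied from any of the projects. The let-handling case updates the environment but forgets to return, so evaluation carries on past the switch and the caller sees whatever the fall-through code produces:

 package main

 import "fmt"

 // Minimal stand-ins for the interpreter's types, just enough to show
 // the shape of the bug; not the real code from any of the projects.
 type Node interface{}

 type LetStatement struct {
     Name  string
     Value int
 }

 type Environment struct{ store map[string]int }

 func (e *Environment) Set(name string, val int) { e.store[name] = val }

 func Eval(node Node, env *Environment) interface{} {
     switch node := node.(type) {
     case *LetStatement:
         env.Set(node.Name, node.Value)
         // BUG: a "return ..." is missing here, so we keep going.
     }
     // ...and fall into whatever generic handling follows the switch.
     return fmt.Errorf("unhandled node type: %T", node)
 }

 func main() {
     env := &Environment{store: map[string]int{}}
     // The binding *is* made, but the caller gets an error-object back.
     fmt.Println(Eval(&LetStatement{Name: "x", Value: 3}, env))
 }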

So I reported this as a meta-bug to the book author. It might be that the master source is wrong, or it might be that the unrelated individuals all made the same error - meaning the text is unclear.

Anyway the end result is I have a language, in go, that I think I understand and have been able to modify. Now I'll have to find some time to go back to BASIC-work.

I found a bunch of basic-interpreters, including ubasic, but unfortunately almost all of them were missing many, many features - such as operations like RND(), ABS(), and COS().

Perhaps room for another interpreter after all!


 

A brief metric-update, and notes on golang-specific metrics

Saturday, 2 June 2018

My previous post briefly described the setup of system-metric collection. (At least the server-side setup required to receive the metrics submitted by various clients.)

When it came to the clients I was complaining that collectd was too heavyweight, as installing it pulled in a ton of packages. A kind twitter user pointed out that you can get most of the stuff you need via the collectd-core package:

 # apt-get install collectd-core

I guess I should have known that! So for the moment that's what I'm using to submit metrics from my hosts. In the future I will spend more time investigating telegraf, and other "modern" solutions.

Still, with collectd-core installed we've got the host-system metrics pretty well covered. Some other things I've put together also support metric-submission, so that's good.

I hacked up a quick package for automatically submitting metrics to a remote server, specifically for golang applications. To use it simply add an import to your golang application:

  import (
    ..
    _ "github.com/skx/golang-metrics"
    ..
  )

Add the import, rebuild your application, and that's it! Configuration is carried out solely via environment variables, and the only one you need to specify is the end-point for your metrics host:

$ METRICS=metrics.example.com:2003 ./foo

Now your application will be running as usual and will also be submitting metrics to your central host every 10 seconds or so. Metrics include the number of running goroutines, application-uptime, and memory/cpu stats.
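
Under the hood the submission is nothing magic; carbon's plaintext protocol is just "name value timestamp\n" sent to the host named in the METRICS variable. A stripped-down illustration (the metric-name here is invented, and the real package does this on a timer):

 package main

 import (
     "fmt"
     "net"
     "os"
     "runtime"
     "time"
 )

 func main() {
     // Connect to the carbon receiver named in the environment.
     conn, err := net.Dial("tcp", os.Getenv("METRICS"))
     if err != nil {
         panic(err)
     }
     defer conn.Close()

     // One sample in the plaintext protocol: "name value timestamp".
     fmt.Fprintf(conn, "demo.goroutines %d %d\n",
         runtime.NumGoroutine(), time.Now().Unix())
 }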

I've added a JSON-file to import as a grafana dashboard, and you can see an example of what it looks like there too:


 

On collecting metrics

Saturday, 26 May 2018

Here are some brief notes about metric-collection, for my own reference.

Collecting server and service metrics is a good thing because it lets you spot degrading performance, and see the effect of any improvements you've made.

Of course it is hard to know what metrics you might need in advance, so the common approach is to measure everything, and the most common way to do that is via collectd.

To collect/store metrics the most common approach is to use carbon and graphite-web. I tend to avoid that as being a little more heavyweight than I'd prefer. Instead I'm all about the modern alternatives:

  • Collect metrics via go-carbon
    • This will listen on :2003 and write metrics beneath /srv/metrics
  • Export the metrics via carbonapi
    • This will talk to the go-carbon instance and export the metrics in a compatible fashion to what carbon would have done.
  • Finally you can view your metrics via grafana
    • This lets you make pretty graphs & dashboards.

Configuring all this is pretty simple. Install go-carbon, and give it a path to write data to (/srv/metrics in my world). Enable the receiver on :2003. Enable the carbonserver and make it bind to 127.0.0.1:8888.

Now configure the carbonapi with the backend of the server above:

  # Listen address, should always include hostname or ip address and a port.
  listen: "localhost:8080"

  # "http://host:port" array of instances of carbonserver stores
  # This is the *ONLY* config element in this section that MUST be specified.
  backends:
    - "http://127.0.0.1:8888"

And finally you can add 127.0.0.1:8080 as a data-source in grafana, and graph away.

The only part that I'm disliking at the moment is the sheer size of collectd. Getting metrics from your servers (uptime, I/O performance, etc) is very useful, but it feels like installing 10Mb of software to do that is a bit excessive.

I'm sure there must be more lightweight systems out there for collecting "everything". On the other hand I've added metrics-exporting to my puppet-master and similar tools very easily, so I have lightweight support for that in the tools themselves.

I have had a good look at metricsd which is exactly the kind of tool I was looking for, but I've not searched too far afield for other alternatives and choices just yet.

I should write more about application-specific metrics in the future, because I've quizzed a few people recently:

  • What's the average response-time of your application? What's the effectiveness of your (gzip) compression?
    • You don't know?
  • What was the quietest time over the past 24 hours for your server?
    • You don't know?
  • What proportion of your incoming HTTP-requests were for plain HTTP, rather than HTTPS?
    • Do you monitor HTTP-status-codes? Can you see how many times people were served redirects to the SSL version of your site? Will using HSTS save you bandwidth, and if so how much?

Fun times. (Terrible pun is terrible, but I was talking to a guy called Tim. So I could have written "Fun Tims".)


 

This month has been mostly golang-based

Monday, 21 May 2018

This month has mostly been about golang. I've continued work on the protocol-tester that I recently introduced:

This has turned into a fun project, and now all my monitoring is done with it. I've simplified the operation, such that everything uses Redis for storage, and there are now new protocol-testers for finger, nntp, and more.

Sample tests are as basic as this:

  mail.steve.org.uk must run smtp
  mail.steve.org.uk must run smtp with port 587
  mail.steve.org.uk must run imaps
  https://webmail.steve.org.uk/ must run http with content 'Prayer Webmail service'

Results are stored in a redis-queue, where they can be picked off and announced to humans via a small daemon. In my case alerts are routed to a central host via HTTP-POSTs, and eventually reach me via pushover.
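
The daemon side is simple enough that a sketch shows the idea. The struct fields and the URL below are invented for the example, not the real ones:

 package main

 import (
     "bytes"
     "encoding/json"
     "log"
     "net/http"
 )

 // TestResult is an illustrative shape for what gets queued; the real
 // fields, and the URL below, are made up for this example.
 type TestResult struct {
     Target string `json:"target"`
     Type   string `json:"type"`
     Error  string `json:"error,omitempty"`
 }

 // notify forwards a failed test to the central host as a HTTP-POST.
 func notify(r TestResult) error {
     body, err := json.Marshal(r)
     if err != nil {
         return err
     }
     resp, err := http.Post("https://alerts.example.com/events",
         "application/json", bytes.NewReader(body))
     if err != nil {
         return err
     }
     return resp.Body.Close()
 }

 func main() {
     err := notify(TestResult{Target: "mail.steve.org.uk", Type: "smtp",
         Error: "connection refused"})
     if err != nil {
         log.Println("failed to submit:", err)
     }
 }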

Beyond the basic network testing though I've also reworked a bunch of code - so the markdown sharing site is now golang powered, rather than running on the previous perl-based code.

As a result of this rewrite, and a little more care, I now score 99/100 + 100/100 on Google's pagespeed testing service. A few more of my sites do the same now, thanks to inline-CSS, inline-JS, etc. Nothing I couldn't have done before, but this was a good moment to attack it.

Finally my "silly" Linux security module, for letting user-space decide if binaries should be executed, can-exec has been forward-ported to v4.16.17. No significant changes.

Over the coming weeks I'll be trying to move more stuff into the cloud, rather than self-hosting. I'm doing a lot of trial-and-error at the moment with Lambdas, containers, and dynamic-routing to that end.

Interesting times.


 

A filesystem for known_hosts

Thursday, 19 April 2018

The other day I had an idea that wouldn't go away, a filesystem that exported the contents of ~/.ssh/known_hosts.

I can't think of a single useful use for it, beyond simple shell-scripting, and yet I couldn't resist.

 $ go get -u github.com/skx/knownfs
 $ go install github.com/skx/knownfs

Now make it work:

 $ mkdir ~/knownfs
 $ knownfs ~/knownfs

Beneath our mount-point we can expect one directory for each known host. So we'll see entries like these:

 ~/knownfs $ ls | grep \.vpn
 builder.vpn
 deagol.vpn
 master.vpn
 www.vpn

 ~/knownfs $ ls | grep steve
 blog.steve.fi
 builder.steve.org.uk
 git.steve.org.uk
 mail.steve.org.uk
 master.steve.org.uk
 scatha.steve.fi
 www.steve.fi
 www.steve.org.uk

The host-specific entries will each contain a single file, fingerprint, containing the fingerprint of the remote host:

 ~/knownfs $ cd www.steve.fi
 ~/knownfs/www.steve.fi $ ls
 fingerprint
 frodo ~/knownfs/www.steve.fi $ cat fingerprint
 98:85:30:f9:f4:39:09:f7:06:e6:73:24:88:4a:2c:01

I've used it in a few shell-loops to run commands against hosts matching a pattern, but beyond that I'm struggling to think of a use for it.

If you like the idea I guess have a play:

It was perhaps more useful and productive than my other recent work - which involves porting an existing network-testing program from Ruby to golang, and in the process making it much more uniform and self-consistent.

The resulting network tester is pretty good, and can now notify via MQ to provide better decoupling too. The downside is of course that nobody changes network-testing solutions on a whim, and so these things are basically always in-house only.


 

Bread and data

Wednesday, 11 April 2018

For the past two weeks I've mostly been baking bread. I'm not sure what made me decide to make some the first time, but it actually turned out pretty good so I've been baking every day or two ever since.

This is the first time I've made bread in the past 20 years or so - I recall in the past I got frustrated that it never rose, or didn't turn out well. I can't see that I'm doing anything differently, so I'll just write it off as younger-Steve being daft!

No doubt I'll get bored of the delicious bread in the future, but for the moment I've got a good routine going - juggling going to the shops, child-care, and making bread.

Bread I've made includes the following:

Beyond that I've spent a little while writing a simple utility to embed resources in golang projects, after discovering the tool I'd previously been using, go-bindata, had been abandoned.

In short you feed it a directory of files and it will generate a file static.go with contents like this:

files[ "data/index.html" ] = "<html>....
files[ "data/robots.txt" ] = "User-Agent: * ..."

It's a bit more complex than that, but not much. As expected getting the embedded data at runtime is trivial, and it allows you to distribute a single binary even if you want/need some configuration files, templates, or media to run.
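
Getting at an embedded resource at runtime can be as simple as a map-lookup, optionally preferring a real file on disk during development. (This is a sketch; the generated code may expose a helper function rather than a bare map, and the names here are mine.)

 package main

 import (
     "fmt"
     "os"
 )

 // files mirrors what the generated static.go provides: resource-path
 // to contents.
 var files = map[string]string{
     "data/index.html": "<html>...",
     "data/robots.txt": "User-Agent: *\n",
 }

 // getResource prefers a real file on disk, which is handy during
 // development, and falls back to the embedded copy otherwise.
 func getResource(path string) (string, error) {
     if data, err := os.ReadFile(path); err == nil {
         return string(data), nil
     }
     if data, ok := files[path]; ok {
         return data, nil
     }
     return "", fmt.Errorf("resource not found: %s", path)
 }

 func main() {
     data, err := getResource("data/robots.txt")
     if err != nil {
         panic(err)
     }
     fmt.Print(data)
 }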

For example in the project I discussed in my previous post there is an HTTP-server which serves a user-interface based upon bootstrap. I want the HTML-files which make up that user-interface to be embedded in the binary, rather than distributing them separately.

Anyway it's not unique, but it was a fun experience to write, and I've switched to using it now:


 

Rewriting some services in golang

Friday, 30 March 2018

The past couple of days I've been reworking a few of my existing projects, and converting them from Perl into Golang.

Bytemark had a great alerting system for routing alerts to different engineers, via email, SMS, and chat-messages. The system is called mauvealert and is available here on github.

The system is built around the notion of alerts which have different states (such as "pending", "raised", or "acknowledged"). Each alert is submitted via a UDP packet getting sent to the server with a bunch of fields:

  • Source IP of the submitter (this is implicit).
  • A human-readable ID such as "heartbeat", "disk-space-/", "disk-space-/root", etc.
  • A raise-field.
  • More fields here ..

Each incoming submission is stored in a database, and events are considered unique based upon the source+ID pair, such that if you see a second submission from the same IP, with the same ID, then any existing details are updated. This update-on-receive behaviour is pretty crucial to the way things work, especially when coupled with the "raise"-field.
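
The update-on-receive behaviour is easy to picture with a sketch. This is my paraphrase of the idea in go, with invented field names - not mauvealert's actual code:

 package main

 import (
     "fmt"
     "time"
 )

 // Alert is a cut-down, illustrative version of what gets stored for
 // each submission.
 type Alert struct {
     Source  string
     ID      string
     RaiseAt time.Time
 }

 // store keys alerts on the source+ID pair, so a repeat submission
 // updates the existing record instead of creating a new one.
 var store = map[string]*Alert{}

 func receive(a Alert) {
     key := a.Source + "/" + a.ID
     if existing, ok := store[key]; ok {
         existing.RaiseAt = a.RaiseAt // e.g. bumped to now+5m for "+5m"
         return
     }
     store[key] = &a
 }

 func main() {
     receive(Alert{Source: "1.2.3.4", ID: "heartbeat",
         RaiseAt: time.Now().Add(5 * time.Minute)})
     receive(Alert{Source: "1.2.3.4", ID: "heartbeat",
         RaiseAt: time.Now().Add(5 * time.Minute)})
     fmt.Println(len(store), "alert(s) stored") // 1, not 2
 }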

A raise field might have values such as:

  • +5m
    • This alert will be raised in 5 minutes.
  • now
    • This alert will be raised immediately.
  • clear
    • This alert will be cleared immediately.

One simple way the system is used is to maintain heartbeat-alerts. Imagine a system sends the following message, every minute:

  • id:heartbeat raise:+5m [source:1.2.3.4]
    • The first time this is received by the server it will be recorded in the database.
    • The next time this is received the existing event will be updated, and crucially the time to raise an alert will be bumped (i.e. it will become current-time + 5m).
    • The next time the update is received the raise-time will also be bumped.
    • ..

At some point the submitting system crashes, and five minutes after the last submission the alert moves from "pending" to "raised" - which will make it visible in the web-based user-interface, and also notify an engineer.
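
The server side of that is just a periodic sweep over the stored alerts. Again a minimal sketch in go, with invented names:

 package main

 import (
     "fmt"
     "time"
 )

 // A sketch of the periodic check: anything whose raise-time has
 // passed moves from "pending" to "raised" and gets handed to the
 // notification side.
 type pendingAlert struct {
     id      string
     raiseAt time.Time
 }

 func main() {
     alerts := []pendingAlert{
         {id: "heartbeat", raiseAt: time.Now().Add(-time.Minute)},       // overdue
         {id: "disk-space-/", raiseAt: time.Now().Add(5 * time.Minute)}, // still fine
     }
     for _, a := range alerts {
         if time.Now().After(a.raiseAt) {
             fmt.Println("raising:", a.id)
         }
     }
 }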

With this system you could easily write trivial and stateless ad-hoc monitoring scripts like this, which would raise or clear an alert:

 curl https://example.com && send-alert --id http-example.com --raise clear --detail "site ok" || \
  send-alert  --id http-example.com --raise now --detail "site down"

In short mauvealert allows aggregation of events, and centralises how/when engineers are notified. There's the flexibility to look at events, and send them to different people at different times of the day, decide some are urgent and must trigger SMSs, and some are ignorable and just generate emails.

(In mauvealert this routing is done via a configuration file containing ruby which attempts to match events, so you could do things like say "if the event-id contains 'failed-disc' then notify a DC-person, or if the event was raised from $important-system then notify everybody".)

I thought the design was pretty cool, and wanted something similar for myself. My version, which I set up a couple of years ago, was based around HTTP+JSON, rather than UDP-messages, and written in perl:

The advantage of using HTTP+JSON is that writing clients to submit events to the central system could easily and cheaply be done in multiple environments for multiple platforms. I didn't see the need for the efficiency of using binary UDP-based messages for submission, given that I have ~20 servers at the most.

Anyway the point of this blog post is that I've now rewritten my simplified personal-clone as a golang project, which makes deployment much simpler. Events are stored in an SQLite database and when raised they get sent to me via pushover:

The main difference is that I don't allow you to route events to different people, or notify via different mechanisms. Every raised alert gets sent to me, and only me, regardless of time of day. (Albeit via a pluggable external process, such that you could add your own local logic.)

I've written too much already, getting sidetracked by explaining how neat mauvealert (and by extension purple) was, but I also rewrote the Perl DNS-lookup service at https://dns-api.org/ in golang too:

That had a couple of regressions which were soon reported and fixed by a kind contributor (lack of CORS headers, most obviously).


 

Serverless deployment via docker

Monday, 19 March 2018

I've been thinking about serverless-stuff recently, because I've been re-deploying a bunch of services and some of them are almost microservices. One thing that a lot of them have in common is that they're all simple HTTP-servers, presenting an API or end-point over HTTP. There is no state, no database, and no complex dependencies.

These should be prime candidates for serverless deployment, but at the same time I don't want to have to recode them for AWS Lambda, or any similar locked-down service. So docker is the obvious answer.

Let us pretend I have ten HTTP-based services, each of which binds to port 8000. To make these available I could just set up a simple HTTP front-end:

  • https://api.example.fi/

We'd need to route the request to the appropriate back-end, so we'd start to present URLs like:

  • https://api.example.fi/steve/foo
  • https://api.example.fi/steve/bar

Here any request which had the prefix steve/foo would be routed to a running instance of the docker container steve/foo. In short the name of the (first) path component performs the mapping to the back-end.

I wrote a quick hack, in golang, which would bind to port 80 and dynamically launch the appropriate containers, then proxy back and forth. I soon realized that this is a terrible idea though! The problem is a malicious client could start making requests for things like:

  • https://api.example.fi/wordpress/wordpress
  • https://api.example.fi/blah/blah

That would trigger my API-proxy to download the containers and spin them up, allowing arbitrary (albeit "sandboxed") code to run. So taking a step back, we want to use the path-component of a URL to decide where to route the traffic? Each container will bind to :8000 on its private (docker) IP? There's an obvious solution here: HAProxy.

So I started again, and wrote a trivial golang daemon which reacts to docker events - containers starting and stopping - and generates a suitable haproxy configuration file, which can then be used to reload haproxy.
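
The generated file is just text, so the daemon mostly boils down to walking a template over the list of running containers. A minimal sketch of that generation - container names and addresses are invented for the example, and the real template contains rather more:

 package main

 import (
     "os"
     "text/template"
 )

 // Backend describes one running container: the name doubles as the
 // URL-prefix, and Addr is its private docker IP plus port 8000.
 type Backend struct {
     Name string
     Addr string
 }

 // A rough sketch of the configuration the daemon regenerates whenever
 // containers start or stop.
 const cfg = `frontend api
     bind :80
 {{range .}}    acl is_{{.Name}} path_beg /{{.Name}}
     use_backend be_{{.Name}} if is_{{.Name}}
 {{end}}
 {{range .}}backend be_{{.Name}}
     server {{.Name}} {{.Addr}}
 {{end}}`

 func main() {
     backends := []Backend{
         {Name: "foo", Addr: "172.17.0.2:8000"},
         {Name: "bar", Addr: "172.17.0.3:8000"},
     }

     t := template.Must(template.New("haproxy").Parse(cfg))

     // The real daemon writes a file and reloads haproxy; here we just
     // print the generated configuration.
     if err := t.Execute(os.Stdout, backends); err != nil {
         panic(err)
     }
 }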

The end result is that if I launch a container named "foo" then requests to http://api.example.fi/foo will reach it. Success! The only downside to this approach is that you must manually launch your back-end docker containers - but if you do so they'll become immediately available.

I guess there is another advantage. Since you're launching the containers (manually) you can set up links, volumes, and what-not. Much more so than if your API layer spun them up with zero per-container knowledge.


 

A change of direction ..

Friday, 9 March 2018

In my previous post I talked about how our child-care works here in wintery Finland, and suggested there might be a change in the near future.

So here is the predictable update; I've resigned from my job and I'm going to be taking over childcare/daycare. Ideally this will last indefinitely, but it is definitely going to continue until November. (Which is the earliest any child could be moved into public day-care if there are problems.)

I've loved my job, twice, but even though it makes me happy (in a way that several other positions didn't) there is no comparison. Child-care makes me happier-still. Sure there are days when your child just wants to scream, refuse to eat, and nothing works. But on average everything is awesome.

It's a hard decision, a "brave" decision too apparently (which I read negatively!), but also an easy one to make.

It'll be hard. I'll have no free time from 7AM-5PM, except during nap-time (11AM-1PM, give or take). But it will be worth it.

And who knows, maybe I'll even get to rant at people who ask "Where's his mother?" I live for those moments. Truly.


 
