About Archive Tags RSS Feed

 

Entries tagged basic

So I wrote a basic BASIC

20 October 2018 12:01

So back in June I challenged myself to write a BASIC interpreter in a weekend. The next time I mentioned it was to admit defeat. I didn't really explain in any detail, because I thought I'd wait a few days and try again and I was distracted at the time I wrote my post.

As it happened that was over four months ago, so clearly it didn't work out. The reason for this was because I was getting too bogged down in the wrong kind of details. I'd got my heart set on doing this the "modern" way:

  • Write a lexer to spit the input into tokens
    • LINE-NUMBER:10, PRINT, "Hello, World"
  • Then I'd take those tokens and form an abstract syntax tree.
  • Finally I'd walk the tree evaluating as I went.

The problem is that almost immediately I ran into problems, my naive approach didn't have a good solution for identifying line-numbers. So I was too paralysed to proceed much further.

I sidestepped the initial problem and figured maybe I should just have a series of tokens, somehow, which would be keyed off line-number. Obviously when you're interpreting "traditional" BASIC you need to care about lines, and treat them as important because you need to handle fun-things like this:

10 PRINT "STEVE ROCKS"
20 GOTO 10

Anyway I'd parse each line, assuming only a single statement upon a line (ha!) you can divide it into:

  • Number - i.e. line-number.
  • Statement.
  • Newline to terminate.

Then you could have:

code{blah} ..
code[10] = "PRINT STEVE ROCKS"
code[20] = "GOTO 10"

Obviously you spot the problem there, if you think it through. Anyway. I've been thinking about it off and on since then, and the end result is that for the past two evenings I've been mostly writing a BASIC interpreter, in golang, in 20-30 minute chunks.

The way it works is as you'd expect (don't make me laugh ,bitterly):

  • Parse the input into tokens.
  • Store those as an array.
  • Interpet each token.
    • No AST
    • No complicated structures.
    • Your program is literally an array of tokens.

I cheated, horribly, in parsing line-numbers which turned out to be exactly the right thing to do. The output of my naive lexer was:

INT:10, PRINT, STRING:"Hello World", NEWLINE, INT:20, GOTO, INT:10

Guess what? If you (secretly) prefix a newline to the program you're given you can identify line-numbers just by keeping track of your previous token in the lexer. A line-number is any number that follows a newline. You don't even have to care if they sequential. (Hrm. Bug-report?)

Once you have an array of tokens it becomes almost insanely easy to process the stream and run your interpreter:

 program[] = { LINE_NUMBER:10, PRINT, "Hello", NEWLINE, LINE_NUMBER:20 ..}

 let offset := 0
 for( offset < len(program) ) {
    token = program[offset]

    if ( token == GOTO ) { handle_goto() ; }
    if ( token == PRINT ) { handle_print() ; }
    .. handlers for every other statement
    offset++
 }

Make offset a global. And suddenly GOTO 10 becomes:

  • Scan the array, again, looking for "LINE_NUMBER:10".
  • Set offset to that index.

Magically it all just works. Add a stack, and GOSUB/RETURN are handled with ease too by pushing/popping the offset to it.

In fact even the FOR-loop is handled in only a few lines of code - most of the magic happening in the handler for the "NEXT" statement (because that's the part that needs to decide if it needs to jump-back to the body of the loop, or continue running.

OK this is a basic-BASIC as it is missing primtives (CHR(), LEN,etc) and it only cares about integers. But the code is wonderfully simple to understand, and the test-case coverage is pretty high.

I'll leave with an example:

10 REM This is a program
00 REM
 01 REM This program should produce 126 * 126 * 10
 02 REM  = 158760
 03 REM
 05 GOSUB 100
 10 FOR i = 0 TO 126
 20  FOR j = 0 TO 126 STEP 1
 30   FOR k = 0 TO 10
 40    LET a = i * j * k
 50   NEXT k
 60  NEXT j
 70 NEXT i
 75 PRINT a, "\n"
 80 END
100 PRINT "Hello, I'm multiplying your integers"
110 RETURN

Loops indented for clarity. Tokens in upper-case only for retro-nostalgia.

Find it here, if you care:

I had fun. Worth it.

I even "wrote" a "game":

| 2 comments

 

A visual basic server

23 October 2018 12:01

So my previous post described a BASIC interpreter I'd written.

Before the previous release I decided to ensure that it was easy to embed, and that it was possible to extend the BASIC environment such that it could call functions implemented in golang.

One of the first things that came to mind was to allow a BASIC script to plot pixels in a PNG. So I made that possible by adding "PLOT x,y" and "SAVE" primitives.

Taking that step further I then wrote a HTTP-server which would allow you to enter a BASIC program and view the image it created. It's a little cute at least.

Install it from source, or fetch a binary if you prefer, via:

$ go get -u github.com/skx/gobasic/goserver

Then launch it and point your browser at http://localhost:8080, and you'll be presented with something like this:

Fun times.

| 2 comments

 

Trivial benchmarks of toy languages

8 October 2022 15:00

Over the past few months (years?) I've posted on my blog about the various toy interpreters I've written.

I've used a couple of scripting languages/engines in my professional career, but in public I think I've implemented

Each of these works in similar ways, and each of these filled a minor niche, or helped me learn something new. But of course there's always a question:

  • Which is fastest?

In the real world? It just doesn't matter. For me. But I was curious, so I hacked up a simple benchmark of calculating 12! (i.e. The factorial of 12).

The specific timings will vary based on the system which runs the test(s), but there's no threading involved so the relative performance is probably comparable.

Anyway the benchmark is simple, and I did it "fairly". By that I mean that I didn't try to optimize any particular test-implementation, I just wrote it in a way that felt natural.

The results? Evalfilter wins, because it compiles the program into bytecode, which can be executed pretty quickly. But I was actually shocked ("I wrote a benchmark; The results will blow your mind!") at the second and third result:

BenchmarkEvalFilterFactorial-4      61542     17458 ns/op
BenchmarkFothFactorial-4            44751     26275 ns/op
BenchmarkBASICFactorial-4           36735     32090 ns/op
BenchmarkMonkeyFactorial-4          14446     85061 ns/op
BenchmarkYALFactorial-4              2607    456757 ns/op
BenchmarkTCLFactorial-4               292   4085301 ns/op

here we see that FOTH, my FORTH implementation, comes second. I guess this is an efficient interpreter too, bacause that too is essentially "bytecode". (Looking up words in a dictionary, which really maps to indexes to other words. The stack operations are reasonably simple and fast too.)

Number three? BASIC? I expected better from the other implementations to be honest. BASIC doesn't even use an AST (in my implementation), just walks tokens. I figured the TCO implemented by my lisp would make that number three.

Anyway the numbers mean nothing. Really. But still interesting.

| No comments