How to Write a BrainF Interpreter From Scratch!
IMayBeMe

Introduction

Writing interpreters, as well as compilers, has always been a fascination of mine, but I never really had the time or the effort to write a compiler for a language let alone make my own. Prologue aside, what I mean to say is that writing interpreters doesn't have to be hard, especially when it's for BrainF.

What's BrainF anyways one might ask? Well to put it in light terns, BrainF is essentially a really weird but cool programming language. The idea behind BrainF stems from Urban Muller a computer scientist who devised this diabolical spectacle of a language. BrainF only has 8 characters consisting of, >, <, +, -, [, ], ., and ,. This might seem like a lot of gibberish, which quite frankly it is, but we will make sense of this later in the tutorial.

As far as the language of choice goes for the program, I was originally planning on using pseudo-code as it's easy to translate into any language but I soon realized that Python is nearly identical and super popular so I felt that it was best to go with Python.

Understanding BrainF

Now my goal here isn't to teach you BrainF, but due to its lack of popularity, I felt it would be fitting to include a small portion on actually coding in BrainF. To simplify things, BrainF is essentially a grid of memory cells that can be mutated in various ways. Let's go through the 8 instructions:

  • > is used to move to memory pointer forward by a degree of one. For someone not familiar with how memory works, let me just give a quick example of this. Say we have 3 memory cells all set to 0 like this: [0, 0, 0]. Now we would have a pointer that would keep track of the current cell and let's also set that to 0. For those of you who know how arrays work, you would know that the pointer is currently referencing the first of the 3 memory cells. By using >, we can increment the pointer by one selecting the second cell as our active cell.
  • < is the dead opposite of > and is used to decrement the memory pointer.
  • + increments the current memory cell by 1 and would pretty much be: memory_cell += 1 in Python.
  • - is the opposite of + and decrements the current memory cell.
  • [ is the start of a loop in BrainF. Now, this is where it gets super interesting as loops have quite an interesting logic. The way loops work is that if the current memory cell is equal to 0, then the program will skip to the corresponding ] or in the latter will just continue running. This might be a bit confusing right now but don't worry as it will start to make a lot more sense when we start writing the interpreter.
  • ] is the closing bracket of the loop and will skip back to the corresponding [ if the memory cell is not equal to 0. In the case that it is, the program will keep running and the loop is complete.
  • . is used the output the contents of the memory cell by ASCII values, so 73 would output: I
  • , is used to take a single character input and is stored in the current memory cell after finding the ASCII value.

This explanation was a bit brief, so here is a great tutorial by Nayoar which explains BrainF well.

Writing the interpreter

I don't want to drone on and on about some interpreting concept, so I better save you the headache and show you the code before I explain it. To get started we would write:

Nothing about the code is particularly impressive but it does serve a purpose to set up some basic variables and the function itself. char_index will be used to keep track of where in the program we are while pointer and memory are used in connection to each other for storing data. loop might be the only one that is not self-explanatory, but it essentially keeps track of brackets to deal with nested loops.

Moving on, we first need to set up a while loop to iterate through the code and create our first few basics instructions like so:

So pretty much, we added a while loop which is the heart that keeps the interpreter going as well as a few simple instructions. I hope most of this is decently self-explanatory but let me explain the logic behind it:

In essence, this allocates another cell of memory into the list if the pointer attempts to move to a value that is greater than the largest index of the memory list. I found this an easy way not to assume the size of the program and a way to make the memory more dynamic. Everything else looks like it doesn't need an explanation so I better skip over it.

Next up we have handling loops which will probably be the most complicated part of this tutorial, so I will try my best to go through it slowly.

Let's start with the open brackets first. Simply speaking, if you remember back when I mentioned how they work, you would see that we can completely ignore them if the current memory cell is not equal to 0. But what do we do when the memory cell is equal to 0? Well to answer that, the program needs to skip to the corresponding closed bracket but it's not as simple as you would think.

I bet most of you would try to find the nearest closing bracket but take a look at this example: [++>[<-]>]. Now the actual code here is gibberish but look how the loops are nested. This would create an issue if we were to take a naive approach so rather we set loop = 1 as in saying that we have 1 open bracket. We then continue to iterate through the code, adding 1 to loop every time we land on an open bracket and subtracting 1 every time we land on a closed one.

This is clever as a allows us to find the direct corresponding bracket without having to worry about the complex structure of nested loops.

For those of you wondering, the closed brackets work the same way except in reverse, so as to skip back to the open bracket if the memory value is greater than 0. The rest of the code is pretty simple and I like to think of it almost like housekeeping as it just makes sure that everything runs smoothly.

Last but not least we have the input and output shown below:

I hope that I don't need to explain this but I'll go over the two things that aren't completely obvious. Firstly, all end = '' does is it makes sure that everything prints on the same line without non-specified linebreaks. char_index += 1 just moves through the code to get ready for the next iteration of the while loop.

All in all the complete code is right down here:

To finish off this tutorial, we can test a simple fibonacci program by running the function like so,

This should output:

Conclusion

That's going to be it for this tutorial and I hoped you enjoyed it. Please leave any sort of feedback/comments as I would appreciate them. Also please don't hesitate to ask questions and I will get back to you with an answer.

You are viewing a single comment. View All
MrVoo

Noice!

IMayBeMe

@MrVoo thank you!