Custom Compiler

I've been interested in compilers and assemblers for a long time now. Among other things, I enjoy designing my own CPUs in programs like Logisim or Digital and then writing assemblers and compilers to run code on them. I've made a few compilers at this point, of varying quality, as I've become better at programming. For now, though, I thought I'd write about the latest compiler I made...

The Setup

There's this program called "OCEmu". It's an emulator for a Lua framework originally created for a Minecraft mod, of all things. That mod, called "OpenComputers", lets the user write Lua to control a late-'80s-style terminal computer inside the game, with surprisingly period-accurate limitations. To give a sense of the depth we're speaking of here: your computers need hard drives, EEPROMs, RAM, CPUs, GPUs and more to function correctly, and you get to write your own BIOS and operating system for them. OCEmu, then, emulates these computers, allowing you to work with this neat framework without needing to play Minecraft.

The Problem

This framework is great and all, but there's one thing that's always bothered me about it: it only runs Lua! And as anyone who's worked with Lua for any extended period of time can attest, it's just a pain to deal with sometimes...

  • No type checking

  • Varying numbers of both arguments and return values from functions

  • All indexing operations begin at 1 instead of 0

  • No language support for OOP
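
To make the first three complaints concrete, here is a small sketch in plain Lua. The function and variable names are made up for illustration; the point is only that none of these lines raises an error.

```lua
-- Plain Lua: every call below runs without raising an error.

-- No type checking: the parameter names suggest a string and a number,
-- but any values are accepted.
local function describe(name, age)
  return tostring(name) .. " is " .. tostring(age)
end
local swapped = describe(42, true)      -- wrong types, no complaint

-- Varying argument counts: missing arguments become nil, extras are dropped.
local too_few = describe("Ada")
local too_many = describe("Ada", 36, "ignored")

-- Varying return counts: the second return value is silently discarded here.
local function min_max(a, b)
  if a < b then return a, b end
  return b, a
end
local smallest = min_max(5, 1)

-- 1-based indexing: index 0 is simply an absent key, not the first element.
local items = { "a", "b", "c" }
local first, zeroth = items[1], items[0]
```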

These qualities can be desirable in some applications of Lua, but I feel they don't suit my uses very well, since I develop whole operating systems in it... So I decided to make my own language on top of Lua and extend it so that the framework is less painful for me to use.

Plus, it's also just a great excuse to learn more about compilers!

The Design and Implementation

I wanted to keep the language as simple as possible so that I could develop a finished product in a reasonable amount of time. To that end, I chose to use Lua as the base for my own language. The idea is that I can simplify all the stages of the compiler by copying function blocks straight from source to output, which removes any need to parse them as long as I leave the syntax inside function blocks unaltered.
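
Copying a function body verbatim still requires knowing where it ends. One way to do that without a real parser is to count block keywords until the matching "end" shows up. The sketch below shows that idea in Lua for brevity; it is not the actual LuaZ compiler code (which is written in C#), the name copy_body is hypothetical, and it assumes no keywords appear inside strings or comments.

```lua
-- Keywords that open a block closed by `end` or `until`.
-- `for` and `while` are covered by their `do`.
local OPENERS = { ["function"] = true, ["if"] = true, ["do"] = true, ["repeat"] = true }
local CLOSERS = { ["end"] = true, ["until"] = true }

-- Hypothetical helper: `start` is the index right after a function's
-- parameter list. Returns the body text and the index just past the
-- closing `end`, or nil if the body is never closed.
local function copy_body(src, start)
  local depth = 1
  local pos = start
  while true do
    -- Scan word by word; anything that is not a block keyword is ignored.
    local s, e, word = src:find("([%a_][%w_]*)", pos)
    if not s then return nil end
    if OPENERS[word] then
      depth = depth + 1
    elseif CLOSERS[word] then
      depth = depth - 1
      if depth == 0 then
        return src:sub(start, s - 1), e + 1
      end
    end
    pos = e + 1
  end
end

local src = "function f(a) if a then return 1 end while a do a = nil end return 2 end print(f(1))"
local _, header_end = src:find("function f(a)", 1, true)
local body, next_pos = copy_body(src, header_end + 1)
```

Here body holds everything between the parameter list and the function's own "end", and next_pos points at the text that follows it. A real implementation would also need to skip string literals and comments.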

Since this entire language is going to be more or less based around adding OOP to Lua, I first thought I'd name the language LuaC and have the C stand for Class. But after realizing that the .luac extension is apparently already in use by compiled Lua files, I settled for the name LuaZ as it sounds somewhat similar.

The first feature to add is classes! I took a lot of inspiration from C# while making the compiler, so I chose to add the "class" keyword to the language. Inside a class, the "field", "function", and "prop" keywords can occur, and the class is closed with the "end" keyword in the classic Lua spirit.

The "field" keyword simply adds an instance variable to the class with a given name and initial value.

The "function" keyword has the same syntax as a normal Lua function. The only difference is that this variant acts as a member method of the class it's inside and has the benefit of allowing some type checking: any given argument can carry a <> delimiter with a comma-separated list of allowed types inside it.
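
Since the output is plain Lua, a type annotation like that presumably has to turn into a runtime check of some kind. The following is a minimal sketch of what such a check could look like. The helper name check_arg and the error message format are assumptions, not the code LuaZ actually generates.

```lua
-- Hypothetical runtime helper: raise an error unless type(value) is one of
-- the names in `allowed`, otherwise pass the value through unchanged.
local function check_arg(func_name, arg_name, value, allowed)
  local actual = type(value)
  for _, expected in ipairs(allowed) do
    if actual == expected then return value end
  end
  -- Level 3 blames the caller of the checked method, not this helper.
  error(("%s: argument '%s' expected %s, got %s"):format(
    func_name, arg_name, table.concat(allowed, " or "), actual), 3)
end

-- A method whose `title` argument may be a string or a number.
local function set_title(title)
  check_arg("set_title", "title", title, { "string", "number" })
  return "title=" .. tostring(title)
end

local ok_result = set_title("Files")
local ok, err = pcall(set_title, {})   -- a table is not an allowed type
```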

The "prop" keyword adds a C#-style property to the class instance. When accessed, it looks just like a field, but it invokes the "get" and "set" functions when you read from or write to it. The setter also gets a hidden argument named "value" passed to it, holding what the caller wants to set the property to. This argument can be renamed and type checked exactly as with functions by simply adding parentheses after the "set" keyword and adding the argument inside.
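
In plain Lua, this kind of property behaviour is usually built on the __index and __newindex metamethods. The sketch below shows one way to do it. It is an illustration of the general technique under that assumption, not the code LuaZ emits, and with_properties is a made-up name.

```lua
-- Hypothetical helper: returns an object whose keys listed in `props` are
-- routed through get/set functions, while all other keys act as plain fields.
local function with_properties(props)
  local store = {}   -- holds the ordinary fields
  local proxy = {}   -- stays empty so every access reaches the metamethods
  return setmetatable(proxy, {
    __index = function(_, key)
      local prop = props[key]
      if prop and prop.get then return prop.get(proxy) end
      return store[key]
    end,
    __newindex = function(_, key, value)
      local prop = props[key]
      if prop then
        if not prop.set then
          error("property '" .. key .. "' is read-only", 2)
        end
        prop.set(proxy, value)   -- `value` plays the role of the hidden argument
      else
        store[key] = value
      end
    end,
  })
end

local thermo = with_properties({
  fahrenheit = {
    get = function(self) return self.celsius * 9 / 5 + 32 end,
    set = function(self, value) self.celsius = (value - 32) * 5 / 9 end,
  },
})

thermo.celsius = 100                 -- plain field
local boiling = thermo.fahrenheit    -- reads through the getter
thermo.fahrenheit = 32               -- writes through the setter
```

The proxy table is kept empty on purpose: Lua only consults __index and __newindex for keys that are absent from the table itself.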

Any of these three keywords can also be prefixed with "static" to instead add them globally on the class itself.

Beyond the class feature, I also wanted all existing Lua files to be valid LuaZ files. To achieve this, I allow arbitrary Lua code outside any class blocks.

I also added a "using" keyword to replace the normal Lua "require" function, so that the compiler can see the references between files and help resolve cyclic includes for me.
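
Before a cyclic include can be resolved, it first has to be found. Once every "using" statement is known, that is a depth-first search over the file graph. The sketch below covers only this detection step, on a hypothetical graph table that maps each file name to the files it pulls in. How LuaZ actually resolves the cycles it finds is not covered here.

```lua
-- Hypothetical helper: returns the list of files forming a cycle reachable
-- from `start`, or nil if there is none.
local function find_cycle(graph, start)
  local state = {}   -- nil = unvisited, "visiting" = on current path, "done"
  local path = {}    -- files on the current path, in visiting order

  local function visit(node)
    if state[node] == "done" then return nil end
    if state[node] == "visiting" then
      -- Found a back edge: copy the part of the path that forms the loop.
      local cycle = {}
      local recording = false
      for _, name in ipairs(path) do
        if name == node then recording = true end
        if recording then cycle[#cycle + 1] = name end
      end
      cycle[#cycle + 1] = node
      return cycle
    end
    state[node] = "visiting"
    path[#path + 1] = node
    for _, dep in ipairs(graph[node] or {}) do
      local cycle = visit(dep)
      if cycle then return cycle end
    end
    path[#path] = nil
    state[node] = "done"
    return nil
  end

  return visit(start)
end

-- Window uses Button, Button uses Theme, Theme uses Window again.
local graph = {
  Window = { "Button" },
  Button = { "Theme" },
  Theme  = { "Window" },
}
local cycle = find_cycle(graph, "Window")
local report = cycle and table.concat(cycle, " -> ")
```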

After a lot of learning and work, I eventually managed to implement this language spec and write a compiler for it in C#. It takes in a file or folder of LuaZ source code and spits out a new folder of pure Lua files full of function calls to a "ClassBuilder" object I've written, which acts as a runtime or standard library of sorts. When invoked, this object builds up the class before returning the final class object.
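
To give a feel for what a builder-style runtime can look like, here is a minimal sketch. Only the name "ClassBuilder" comes from the project itself. The methods shown (new, field, method, build) and the shape of the calls are assumptions made for illustration, and properties, static members, and type checks are left out.

```lua
-- Hypothetical, simplified builder: collects fields and methods, then
-- produces a class table with a `new` constructor.
local ClassBuilder = {}
ClassBuilder.__index = ClassBuilder

function ClassBuilder.new(name)
  return setmetatable({ name = name, fields = {}, methods = {} }, ClassBuilder)
end

function ClassBuilder:field(name, default)
  self.fields[name] = default
  return self   -- returning self allows calls to be chained
end

function ClassBuilder:method(name, fn)
  self.methods[name] = fn
  return self
end

function ClassBuilder:build()
  local class = { name = self.name }
  local fields = self.fields
  local instance_mt = { __index = self.methods }
  function class.new()
    local instance = {}
    -- Copy defaults per instance; note that table defaults would be shared.
    for key, default in pairs(fields) do
      instance[key] = default
    end
    return setmetatable(instance, instance_mt)
  end
  return class
end

-- Roughly what generated code for a small class could look like.
local Counter = ClassBuilder.new("Counter")
  :field("count", 0)
  :method("increment", function(self)
    self.count = self.count + 1
    return self.count
  end)
  :build()

local a, b = Counter.new(), Counter.new()
a:increment()
a:increment()
```

Fields are copied into each instance while methods are shared through the metatable, so incrementing a leaves b untouched.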

For a brief overview of how I use this language: most LuaZ files begin with a set of using statements to pull in dependencies. Next comes some normal Lua code to set up things like settings and lookup tables. Then a class block implements the class, and a short Lua block after it returns the class object from the file so that it may be used by other files.

The Result

Since making LuaZ, I've gone on to use it in a few Lua projects of mine. On the side you can see an example of a project written in LuaZ running in OCEmu. (The lag is due to the framework. It's meant to be an '80s computer, and those didn't run very fast. I'm pushing the framework to its limits with that GUI...)

The fun thing about this entire project is that all this can run inside Minecraft! That's still just insane to me! Even now...

This entire compiler project was a great deal of fun in and of itself, and it has even proved useful after its completion: I continue to actively work in the LuaZ language I designed whenever I make large projects that need to build on Lua, which is a common scripting language for games.

The language is far, far from perfect but I think it's a great first iteration.

I've also got ambitious plans for a new and improved LuaZ V2.0, with a full parser for the function bodies. That would allow me to completely ditch the Lua syntax for a more C-like feel and to better optimize the generated Lua code. I've also started looking into LLVM, with the aim of adding it as a secondary backend so the language can produce a runnable standalone binary. I also want to try including more novel language design concepts, add compile-time execution to the language, and so much more!