Platform
Docs
Solutions
ContactLog In

Start Routing Notifications Today!

Courier is a notification service that centralizes all of your templates and messaging channels in one place which increases visibility and reduces engineering time.

Sign-up

wasm-code-generation-header
ENGINEERING

Build a WebAssembly Language for Fun and Profit: Code Generation

Drew Youngwerth

September 01, 2022

The final phase of our compiler is code generation. This phase takes the AST and converts it to a set of executable instructions. In our case, WebAssembly. To accomplish this, we are going to use a popular WebAssembly compiler toolchain called binaryen.

Binaryen does most of the heavy compiling tasks for us. Tricky optimizations like dead-code removal, code folding, and more are all handled out of the box. Thanks to binaryen, we can make a powerful language with very little code.

This is the third article in the Build a programming language series. If you’re just starting out, start with the first post here before continuing on.

The Compile function

One of the trickiest tasks in code generation is what to call the code generation function. For this article, I have opted to call the main function compile. The compile function will take our root AST (a BlockNode) and return a new binaryen.Module. The binaryen.Module exposes functions that can optimize and emit wasm in various formats.

Here's the outline:

Generating A Function Map

Next we define the generateFunctionMap function. This function crawls the entire expression tree to find and register function definitions. Its important we do this before actually compiling the functions as some functions may call other functions before they've been defined.

The return type of generateFunctionMap is a map where the key is the function name and the value is an object containing all the important information about the function the compiler needs to know about. For now, all we need is returnType.

Here’s the return type definition:

Now that we have defined our return type, we can define the actual function:

Onto getFunctionIdentifier we called earlier. This function is simple. It takes a BlockNode, ensures that the second identifier is a TypedIdentifierNode, and then returns the identifier and return type:

As noted in the getFunctionIdentifier function, Binaryen doesn't understand what type the string typeIdentifier is. To handle this we have to map our defined types to binaryen types. For now, we'll just support i32 and f32. Thankfully binaryen uses the same nomenclature we do. So the map function is pretty simple:

getFunctionIdentifier Made a call to a new function, isNodeType. This function is essentially the same concept as the isTokenType function we defined in the parsing article, only for ASTNode instead of Token.

Here's the definition:

With the mapper finished we can start generating some code.

Compiling Expressions

The compileExpression function is where we really start to make use of binaryen to model the generated machine code. Because of the tree structure of an, ahem, abstract syntax tree, compileExpression is highly recursive. This is one of my favorite things about programming languages, their patterns tend to lend themselves to elegant recursive functions with high levels of code re-use.

Let's start with defining the parameters of compileExpression. We will need to pass the binaryen.Module and the functionMap we created earlier, the actual expression we are compiling, and any parameters of the function this expression may be a part of (if it's inside) of a function. When there are more than two parameters of a function it can be difficult to visually keep track of what is what. So I like to make it clear by grouping them together in an object. This enforces labeling the parameters on call and as a result, improves code readability.

Here's the interface of that object:

Now that we have the options for compileExpression defined, we can define the actual function. compileExpression takes CompileExpressionOpts as a parameter and returns a number. The job of this function is to take an expression and determine what type of expression it is, from there it can pass the expression to another compiler function that can handle that specific type of expression.

Why return a number? When we build an expression with binaryen it returns a number as an identifier for that expression. This allows us to compile an expression ahead of time and then reference that expression later down the line.

Here's the definition:

Let's define the compileBlock functions. Because this function is also compiling an expression, we can re-use the previously defined CompileExpressionOpts, but we'll narrow the expression field to the BlockNode type, since we know we are compiling a block by the time this function is called:

Note: If you're curious to see how looping works in binaryen / WebAssembly works, check out my blog post on the subject here. Spoiler alert, its pretty weird.

The last simple expression we'll compile in this section is the identifier expression. If compileExpression was passed a lone IdentifierNode it means that the expression evaluates to the value of the identifier. In wispy, we don't have variables and function identifiers are caught before the could've been passed here. That means the only thing IdentifierNode can resolve to is a parameter.

Here's the definition:

The final expression type left to compile is a function call. This is interesting enough to warrant its own section.

Compiling Function Calls

In wispy function calls are blocks with multiple expressions where the first expression is an identifier. The job of compileFunction is to determine which function is being called, what its parameters and return type are, and finally, building the call instruction with binaryen.

Here's the definition:

Let's define the compileIf function before we move onto the compileFunction... function.

Compiling Function Definitions

Function definitions are a whole lot like function calls, so the function structure is pretty similar. We take CompileBlockOpts and return a number (the binaryen expression reference).

Here's the definition:

Now, let's define the getFunctionParameters function. This function takes the function BlockNode, that is, the entire unmodified function definition, and extracts its parameters. The function returns two values, parameters and parameterTypes.

The first returned value, parameters, is a map where the key is the parameter identifier, and the value is the information needed to access the parameter down the line within the function body.

The second returned value is an array of binaryen types. There is one type for each defined parameter, and they must remain in the order they are defined. This is because binaryen doesn't reference parameters by their names, instead it references them by the index of the array in which they are defined. Don't worry if this is confusing to you, the code should make things a little more clear. If you need, refer to the compileIdentifier definition, to get a better understanding of how this works in practice.

Here's the definition:

Now all that's left is to define the standard library. This part of the code isn't super interesting. We are essentially just mapping primitive WebAssembly instructions to a name to be referenced within wispy.

Here's the definitions. The only important information is the name we are associating with each instruction:

By now src/compiler.mts should [look something like this](https://github.com/drew-y/wispy/blob/58d5872f8d927358c1f8f70ebb4dda6d9458a8c8/src/compiler.mts). With that, our compiler is finished. It's time to execute some wispy!

Putting It All Together

Now that we have finished our compiler, we can finally run our code.

First, replace the contents of src/index.mts with this:

Now build and run project:

If all goes well (and you passed the number 15 to fib), you should see the number 610 in the output of your console. If so, you've done it, you've made a working WebAssembly language. Congrats!

A full copy of the wispy language implementation is available under an open-source license at https://github.com/drew-y/wispy.

Start Routing Notifications Today!

Courier is a notification service that centralizes all of your templates and messaging channels in one place which increases visibility and reduces engineering time.

Sign-up

More from Engineering

courier-ios-thumbnail
PRODUCT NEWSENGINEERING

Simplifying notifications with the Courier iOS SDK

Push notifications are a valuable tool for keeping users informed and increasing their engagement with your app. You can use push notifications to alert users about promotions, new content, or any other important updates. While push notifications are a powerful tool, setting up push notifications in iOS can be a daunting task that requires a significant amount of effort and time. Fortunately, the Courier iOS Mobile Notifications Software Development Kit (SDK) simplifies this process.

Mike Miller

Mike Miller

March 23, 2023

Courier Android SDK thumbnail
PRODUCT NEWSENGINEERING

Building Android push notifications with Firebase and Courier’s SDK

Push notifications have become an essential part of modern mobile apps, allowing you to keep your users engaged and informed. However, implementing push for different platforms can be a complex and time-consuming task, requiring developers to set up and handle token management, testing, and other logistical details.

Mike Miller

Mike Miller

March 21, 2023

Build your first notification in minutes

Send up to 10,000 notifications every month, for free.

Get started for free

Email & push notification

Build your first notification in minutes

Send up to 10,000 notifications every month, for free.

Get started for free

Email & push notification

Platform

Users

Content

Channels

Sending

Workflows

Preferences

Inbox

Workspaces

Observability

API Status

Changelog

© 2024 Courier. All rights reserved.