My Big Ninkasi Presentation
So quite a while ago I started building a scripting language and, over the course of about a year, actually finished it (for some definitions of "finished"). The scripting language was named "Ninkasi", after the Sumerian beer goddess.
Years later, as part of a job interview, the company I was applying to wanted me to do a presentation to the programming team. The presentation could be whatever I wanted it to be, so I picked the one thing I'm more familiar with than anyone else, more complex than most stuff I've worked on, has well-defined goals, and is a project that I actually finished, and that was my scripting language.
I've gone ahead and transcribed the slides I made from that presentation here, for anyone interested in the inner workings or development of it.
It's a project I'm still damn proud of, even if there are a few things I'd do differently if I did it again, and I cover those things near the end of the presentation.
Original slides are available here, created on Aug 8th, 2020: Google Slides deck. (Going forward, any updates are going to happen on this article, so the slides may be outdated.)
The scripting language source itself is available here:
The preprocessor is available here:
1. Ninkasi
I made a scripting language and now you have to hear me talk about it.
2. Why on Earth would you do something like that?
I spent years on this so I better make this answer good.
3. Features I wanted
- State serialization.
- Not provided by Lua in ways I found satisfactory.
- Serialization of external objects through callbacks.
- Functioning sandboxing all the way through.
- Variable number of ticks.
- Can technically do this in Lua with a modified VM.
- Previous attempt at a scripting system was all-or-nothing for
execution because it just recursively eval’d an AST.
- Hard to pause and return when you’re that far down the stack.
4. More features I wanted
- Sandboxing.
- Memory usage restrictions (can't fill hosting application's address space).
- (Wrapped) malloc failure leaves VM in stable state to be cleaned up with no leaks and keeps Valgrind happy.
- Tick count restrictions (prevent abusive infinite loops).
- Parser/deserializer both sandboxable.
- (Up to my ability with AFL and Valgrind.)
5. Even more features I wanted
- Determinism.
- Lua doesn't have it. Some things indexed by pointer.
- Floating-point determinism only cared about as far as running on the same platform.
- Should run the exact same on every platform, aside from that floating point determinism.
- Verified in testing by hashing the entire VM state and comparing serialized state snapshots
6. Features I wanted, last one
- External object lifetime management.
- Fire off callbacks when external object handle is GCed. (Can do in Lua through metatables, but as a result sandboxing involves removing metatable manipulation.)
- Hosting application can hold external handles to internal VM objects. (Can do in Lua with gross registry tricks.)
7. Okay I lied. This is the most important one
-
FOR FUNSIES.
-
BECAUSE I CAN AND NOBODY CAN STOP ME.
8. Other cool stuff
Wasn’t part of the original reasoning but I like it anyway!
9. Cool stuff
- C-like syntax.
- Maybe unintentionally closer to Javascript.
- No dependencies beyond C standard lib.
- Might even be able to run without that with some tweaks.
- Coroutines!
- Added late in development because I missed having it in Lua.
- Written entirely in C89.
- Except for // comments everywhere.
- Compiles in C++ mode, too.
10. Cool stuff (cont.)
- Runs in DOS in real-mode and 32-bit protected mode, up through
modern platforms like 64-bit Linux.
- DOS version compiled on Watcom 10.0a from 1994.
- Other compilers might need some minor work in nktypes.h to figure out object sizes through preprocessor. (int32_t not in C89, int in real-mode = 16-bit, int in 32-bit mode = 32-bit, etc).
- VM state snapshots are binary compatible with VM running on other
platforms.
- Not cross-endian compatible, could be with some work.
11. Cool stuff (cont.)
- Handles malloc() failures.
- Will get into this later.
12. Example code
What the heck does this thing even look like?
13. C API Example
This is an extremely simple program that creates a VM, hooks up an output function, compiles a one-line hard-coded script, executes it, and cleans up. Error reporting is minimal.
#include "nkx.h"
#include <stdio.h>
#include <assert.h>
// "print" function callback.
void printFunc(struct NKVMFunctionCallbackData *data)
{
nkuint32_t i;
for(i = 0; i < data->argumentCount; i++) {
printf("%s", nkxValueToString(data->vm, &data->arguments[i]));
}
}
int main(int argc, char *argv[])
{
// Create the VM.
struct NKVM *vm = nkxVmCreate();
// Create a compiler so we can compile new code. (Loading binary
// state snapshots doesn't need this.)
struct NKCompilerState *compiler = nkxCompilerCreate(vm);
// The most basic function we will need is "print". Otherwise it's
// not possible to get anything out. (Well, it is. You just have
// to have the hosting application explicitly pull data out of the
// finished VM.)
nkxVmSetupExternalFunction(
vm, compiler,
"print", // Used as as internal name for matching up during
// deserialization AND as a variable name so the
// script can call it.
printFunc, // The function itself.
nktrue, // True to add this as a global variable (otherwise
// there is no way to call it right away).
NK_INVALID_VALUE // Everything else is optional argument type
// and argument count checking.
);
// Compile the script. (You would do this multiple times before
// finalizing for multi-file scripts, or use a preprocessor to
// make it all go in as a single string.)
const char *scriptText = "print(\"foobar\\n\");\n";
nkxCompilerCompileScript(
compiler,
scriptText,
"internal" // Source file name (for error reporting).
);
// Done adding source files to compile. Write the end and finish
// setting up exported data.
nkxCompilerFinalize(compiler);
// TODO: Error check the compiler output for real (and display
// error messages if needed).
assert(!nkxVmHasErrors(vm));
// Run the program. (There are other ways to trigger execution,
// but this is the simplest.)
nkxVmExecuteProgram(vm);
// TODO: Error check execution.
assert(!nkxVmHasErrors(vm));
// Clean up and return success.
nkxVmDelete(vm);
return 0;
}
14. Ninkasi Code Example
Here's a simple example of Ninkasi script code. It generates a Mandelbrot pattern (using numbers instead of colors), and prints it to the console.
For this script to work, it would need a "print" function created, just like above in the C API example. ("print" is not a build-in function.)
// Mandelbrot generator demo for Ninkasi
for(var y = -1.0; y < 1.0; y = y + 0.05) {
for(var x = -2.0; x < 1.0; x = x + 0.03) {
var u = 0.0;
var v = 0.0;
var u2 = u * u;
var v2 = v * v;
var k;
for(k = 1; k < 100 && u2 + v2 < 4.0; ++k) {
v = 2.0 * u * v + y;
u = u2 - v2 + x;
u2 = u * u;
v2 = v * v;
}
if(k < 40) {
print(k % 10);
} else {
print(".");
}
}
print("\n");
}
15. The language itself
16. Types
- integers
- We use this (1 and 0) for booleans, too.
- floats
- strings
- Immutable.
- ID into string table.
- functions
- ID into function table.
- arg count + start index.
- Optional external function ID (for C callbacks).
- "objects"
- ID into object table.
- Analogous to Lua tables.
- foo.bar and foo["bar"] are the same.
- Can be external references with type info.
- Coroutines are treated as one of these.
- nil
- Just nil.
17. Syntax
- Syntax is C-like.
- Comments:
// This is a comment.
- Variable declarations:
var thisIsAVariable;
var thisIsAVariableWithAValue = 1234;
- if/else:
if(expression) statement;
if(expression) { statement; } else { statement; }
18. Syntaxes
- while/for/do while:
while(expression) statement;
for(x = 0; x < 20; ++x) print(x, "\n");
do { statement; } while(condition);
- Variable declaration supported in for loop. Scope is limited to loop.
for(var x = 0; x < 20; ++x) print(x, "\n");
- Function declarations:
function functionName(arg1, arg2) { return arg1 + arg2; }
19. Syntax 3
- Inline function declarations:
var functionName = function(arg1, arg2) { return arg1 + arg2; }
- Note: Not a closure. Doesn't capture any variables.
- Function calls:
functionName(1, 2);
- Object creation/manipulation:
var thing = object();
thing["foo"] = "bar";
thing.foo = "bar";
20. Syntax: Resurrection
Coroutines:
// This is the function that the coroutine will execute.
function functionToCall()
{
print("Hello\n");
// Yield control back to the parent context.
yield();
print("there!\n");
}
// Create the coroutine object.
var coroutineObject = coroutine(functionToCall);
// Run it.
print("1... ");
resume(coroutineObject);
print("2... ");
resume(coroutineObject);
Output:
1... Hello
2... there!
21. Error handling
What to do when everything explodes so you don’t HCF for real.
22. “Normal” runtime/compile errors
- Error report recorded as a string.
- Theoretically possible to clear the stack, reset IP to program start, and restart. Not recommended, though.
- Leaves VM object itself in a "valid" state (objects can be destroyed/cleaned up normally).
- Program itself is probably not in a valid state (remnants left on stack, half-completed instructions).
23. malloc() failure handling
- Almost every nkx function starts with a setjmp.
- Allocation failures cause a longjmp to the most recent setjmp.
- Internal allocations are tracked so they can be freed without having to actually unwind the stack.
- Never longjmps past application code, if used correctly.
24. malloc() failure is considered a "catastrophic error"
- VM will refuse to attempt more allocations.
- VM state cannot be serialized. (Can with normal errors.)
- Possible lurking NULL pointers left in VM.
- NO DANGLING POINTERS.
- Can still be cleaned up, but needs care, especially with respect to external objects.
- External object tables/state MUST BE VALID because we don't really own those objects and can't clean them up ourselves.
25. Multi-module compilation
Or how I learned to stop worrying and wrote a C89 preprocessor from scratch.
26. I wrote a C89 preprocessor from scratch.
- Does that count as multi-module compilation? It should count. I’m
counting it.
- Every build is just a unity build.
- You can use it to compile C89 programs.
- Works with GNU headers.
- Tested against mcpp's test suite.
- Valid programs worked.
- Some invalid programs made it through.
- Preprocessor is too permissive.
- Not really a big deal.
27. … in C89.
- Didn't use setjmp/longjmp for malloc() failure handling like Ninkasi itself.
- Literally just checked every malloc().
- Debatable as to which approach is less gross.
- Sometimes "goto" is okay.
- Used to be religiously against goto in C.
- Every use of goto in nkpp is super clear.
- Verbose names on your labels.
- Very limited uses!
- Pretty much just for cleanup to jump to the end of a function where all the temporary data teardown happens.
28. Bytecode
There’s no assembler so if you want to go lower-level you get to do it
the hard fun way.
29. 42 instructions
- 1 no-op
- 6 basic arithmetic (+,-,*,/,-,%)
- 5 "push literal value onto stack" (int/float/str/func/nil)
- 4 misc stack manipulation (pop,pop#,peek,poke)
- 2 static space manipulation (peek,poke)
- 4 jumps and branches (jmp,jz,call,ret)
- 1 "end"
- 10 comparison and expression logic (>,<,>=,<=,==,!=,===,!,&&,||)
- 4 "object" related (create/set/get/len)
- 2 that are gross and should be removed soon related to an anti-feature I never should have done.
- 3 coroutine (create/yield/resume)
30. Memory
- All instructions address either the stack, a space in the static
address space, or a field on an object.
- Objects are the closest equivalent to a heap it has.
- No other registers.
31. Serialized binary format
- Contains the entire state of the VM.
- Includes even things like...
- the bytecodes themself.
- what errors have been thrown.
- garbage collector state.
- external data owned by hosting application that it decides to serialize with external objects.
- file/line markers for debugging.
- exported function/variable names.
- A “compiled” Ninkasi script (.nkb) is a serialized state that was saved after compilation but before execution.
32. Difficult Fun bugs found during development
33. Real-mode DOS specific stuff
- In the “Huge” memory model with the 1MB address space, size_t is
still 16-bits.
- Internally Ninkasi used 32-bit for everything. 32-bit allocation of > 65535 bytes got truncated. Worse yet, test was a string copy. Allocation was truncated but copy was not, so it just filled the next chunk of memory with whatever the scripter wanted.
- Most C standard library functions for stuff like string.h were replaced with hand-built versions to operate identically regardless of size_t limits, because of this limitation.
- Default malloc wrapper now checks to see if allocation size is over ~(size_t)0 bytes.
34. PowerPC (Linux) specific
- 32-bit structs get packed differently from 32-bit ints in arguments
on whatever calling convention was the default for GCC.
- Passed a structure containing a single uint32 in a varags field that turned it into a 32-bit uint on the other side. Stack corruption. (Function was for debug output.)
35. Everywhere
- SIGFPE without dividing by zero.
- Integer division instruction protected from divide-by-zero. Still got SIGFPE through AFL run.
- -2147483648 / -1 = +2147483648
- +2147483648 is outside of 32-bit signed twos complement integer.
- SIGFPE generated when the resulting value can’t expressed in the output range.
- Actually ran into this one twice - runtime and optimizer had the same bug.
36. Everywhere (cont.)
- AFL discovered lots of places where a buffer size in the deserializer could be specified as 0xffffffff. 0xffffffff + allocation header = truncated tiny allocation.
- AFL discovered many instances where allocation count * object size
were > max size_t.
- Wrote an array allocator/reallocator wrapper that checks for overflow.
37. Lessons learned
- AFL is amazing. Always use AFL.
- Valgrind too.
- If you really want to prevent all integer overflows, you have a lot
of checking to do.
- Ninkasi has overflow checks everywhere.
38. Future Work
39. Faster hash tables
- Current hash tables are "buckets with linked lists".
- Terrible for cache locality.
- Flat array with quadratic probing would be better.
- Tradeoff of performance vs simplicity.
40. PUSHLITERAL_INT is like 1/4 to 1/2 of my generated instructions
- Many instructions take two parameters with one immediate.
- Wasting cycles just to get the immediate value onto the stack before immediately being used.
- Will mean extra instructions - more complicated instruction set.
- Tradeoff of performance vs simplicity, again.
41. Documentation’s not great
- Finish incomplete documentation.
- Example program.
- Complete example library.
- Custom types.
- Garbage collection handling.
- Error recovery.
- I’ve probably put more into this presentation than all of the documentation combined.
42. Add postfix increment/decrement
- Some minor extra stack juggling.
- Optimizer should be able to turn postincrement/postdecrement into preincrement/predecrement if return value is unused.
43. Remove the previously-mentioned anti-feature
- foo.bar(x) becomes foo.bar(foo, x) internally.
- I have better ideas about how to handle this.
- This will be the first thing I fix next time I touch the project.
44. “Closures” handled in a mediocre way
- Make an object (“ob”) with _exec and _data fields, and when you call
it like a function, it turns into ob._exec(_data).
- The names are given special meaning.
- This is a temporary stand-in for a real solution.
45. Variadic functions
- Add functions that take variable argument counts.
- C callback functions can do this.
- Functions written in the language itself cannot at the moment.
46. Inline function compilation
- Right now it literally just inserts the function code right there with a big JMP at the start to skip over it during normal execution of the parent function.
- This is in-elegant, but it works for now.
47. Type queries
- Should be able to see if an object is a given type.
- We can trivially implement this in C callbacks for basic types, but external types need a little bit of API work.
48. An actual standard library
- Because I’m tired of implementing a new version of print() with every new project.
- rand(), sin(x), cos(x), abs(x), array(...) (inline array initialization), int(x), float(x) all had to be implemented for a game project. These should be built-in.
- Even len(x) used to be like this.
That’s all!
Gruedorf
Also, I'm counting this as my official first Gruedorf entry.
It's a bit coincidental, because I didn't realize it had been going until right before I decided to publish it (and also, hopefully, have it be the start of me blogging regularly again).
Posted: 2021-08-02