Neko JIT

posted on 2005-12-31

This week I’ve been working on a JIT X86 code generator for Neko. After 4 days most of the opcodes are implemented and it seems to work very well. This was possible with only a few changes to the VM itself, since the JIT engine is written entirely in NekoML, so it’s actually a Neko library (whereas most JIT engines are written in C and are part of the VM).

At first, I was thinking about a two-layer approach: a first generator that would translate the Neko bytecode into abstract CPU-independent opcodes, and then a second generator that would generate the corresponding X86 opcodes. After two days I realized that my abstract opcodes were actually very close to the X86 ones, for the good reason that I didn’t know exactly what should be abstracted.

I was studying the X86 opcodes at the same time as I was writing the JIT. Although I did a little ASM back in my Pascal days, I didn’t know the binary representations of the opcodes at all (they are quite difficult, actually, since there are a lot of different forms for the same opcode). Since I don’t know precisely the opcodes of the other platforms the JIT might be ported to (AMD64?), it was a bad idea from the beginning to abstract something I didn’t know about.
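To give an idea of what “different forms for the same opcode” means, here is a small Python sketch (not taken from the Neko sources; the function names are made up for illustration) that encodes the same instruction, ADD EAX with an immediate value, in three valid ways:

```python
import struct

def add_eax_imm32_short(imm):
    # 05 id: ADD EAX, imm32 (short form reserved for the accumulator)
    return b"\x05" + struct.pack("<i", imm)

def add_eax_imm32_modrm(imm):
    # 81 /0 id: ADD r/m32, imm32
    # ModRM byte 0xC0 = mod 11 (register), reg 000 (/0 = ADD), rm 000 (EAX)
    return b"\x81\xC0" + struct.pack("<i", imm)

def add_eax_imm8(imm):
    # 83 /0 ib: ADD r/m32, imm8 (the byte is sign-extended to 32 bits)
    return b"\x83\xC0" + struct.pack("<b", imm)

if __name__ == "__main__":
    for encode in (add_eax_imm32_short, add_eax_imm32_modrm, add_eax_imm8):
        print(encode.__name__, encode(1).hex())
```

All three byte sequences add 1 to EAX; a code generator has to pick one of them, usually the shortest that fits the operand.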

So I ended up merging my two layers into a one-layer generator, with a small separate X86 library that takes care of the binary representation of opcodes. Basically, the JIT does the following things:

  1. read the module that needs to be JIT-compiled. This is done using the module API of the Neko standard library, which allows reading and safely manipulating a module without initializing it.
  2. from the decoded integers representing opcodes, rebuild the NekoML opcodes (this requires some NekoML magic)
  3. for each opcode, generate the corresponding X86 opcodes: you need to keep some VM state in registers (stack, accumulator, VM instance…). This actually means rewriting the VM bytecode interpreter loop in assembler.
  4. use the X86 library to build a string representing the opcodes
  5. use the $jit builtin to execute the code: this builtin is the only unsafe builtin in all of Neko, which means that using it is the only way to crash the VM. For security sandboxes it might be nice to change it into a primitive, but right now it’s easier to debug this way.
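Steps 2 to 4 can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the opcode numbering, the three-opcode subset, and the `decode`/`emit` helpers are hypothetical, Neko's value representation is ignored, and the only VM state kept in a register is the accumulator (in EAX). The real Jit.nml is written in NekoML and handles far more.

```python
import struct

# Hypothetical numbering of a tiny opcode subset, for illustration only.
ACC_INT, ADD_INT, RET = range(3)
HAS_ARG = {ACC_INT: True, ADD_INT: True, RET: False}

def decode(ints):
    """Step 2: rebuild (opcode, argument) pairs from a flat list of integers."""
    ops, i = [], 0
    while i < len(ints):
        op = ints[i]
        i += 1
        arg = None
        if HAS_ARG[op]:
            arg = ints[i]
            i += 1
        ops.append((op, arg))
    return ops

def emit(ops):
    """Steps 3 and 4: emit X86 bytes, keeping the accumulator in EAX."""
    out = bytearray()
    for op, arg in ops:
        if op == ACC_INT:
            out += b"\xB8" + struct.pack("<i", arg)  # mov eax, imm32
        elif op == ADD_INT:
            out += b"\x05" + struct.pack("<i", arg)  # add eax, imm32
        elif op == RET:
            out += b"\xC3"                           # ret
        else:
            raise ValueError("unsupported opcode %r" % op)
    return bytes(out)

if __name__ == "__main__":
    code = emit(decode([ACC_INT, 40, ADD_INT, 2, RET]))
    print(code.hex())
```

The resulting byte string is what would then be handed to something like the $jit builtin in step 5; actually executing it is deliberately left out here, since that is the unsafe part.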

Once the code is generated, the work is not finished. The bytecode functions are stored in the module globals table, so if you run the JIT code at this point, all function calls will go back to the bytecode interpreter instead of running the JIT version of the function. I therefore needed to introduce a new type of function (after “bytecode function” and “C primitive”) which calls back into the JIT code, and to transform all module globals storing bytecode functions into the corresponding JIT functions.
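The globals rewrite can be pictured roughly as follows. This is a Python sketch with made-up types (`BytecodeFunction`, `JitFunction`) and a made-up `jit_offsets` mapping from bytecode positions to offsets in the generated code; the actual VM structures differ.

```python
from dataclasses import dataclass

@dataclass
class BytecodeFunction:
    entry: int    # position of the function's first opcode in the module
    nargs: int

@dataclass
class JitFunction:
    address: int  # offset of the function in the generated machine code
    nargs: int

def patch_globals(globals_table, jit_offsets):
    """Replace each bytecode function in the globals by its JIT counterpart."""
    for i, g in enumerate(globals_table):
        if isinstance(g, BytecodeFunction) and g.entry in jit_offsets:
            globals_table[i] = JitFunction(jit_offsets[g.entry], g.nargs)
    return globals_table

if __name__ == "__main__":
    module_globals = ["a string", BytecodeFunction(0, 1), 42, BytecodeFunction(17, 2)]
    print(patch_globals(module_globals, {0: 0, 17: 96}))
```

Globals that are not functions are left untouched, so only the call targets change.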

Works like a charm.

However, there are still a few things left:

  • implement the missing Neko opcodes (mainly numeric operations), plus the JmpTable opcode, which is more difficult
  • right now the VM cannot call back into JIT functions: before it can actually start executing the JIT code, the registers need to be set up correctly for JIT mode
  • same for exceptions: right now, when an exception is caught, control returns to the bytecode version of the module instead of the JIT version

Once that is done, it will be pretty nice, since the JIT code and the VM interpreter will be able to call back into each other transparently while still keeping nice Neko features such as stack traces fully working.

You can look at the JIT sources in nekoCVS/src/jit/Jit.nml; it’s only a 26K NekoML file right now.

haXe updates

posted on 2005-12-23

The haXe project was started two months ago, and even though I’m far from working full-time on it, it’s making good progress with today’s Alpha 5 release.

Right now there is only SWF, Neko and XML output, but I’m planning to add JavaScript output soon - so it should be available in January. After that, some work will be needed on testing, on reducing the platform differences as much as possible, and on increasing the number of libraries available in the distribution. Here are some of the problems that need to be tackled:

  • one big platform difference between Neko and Flash is that in Flash you can call a function with any number of arguments, while in Neko you must call it with the exact number of arguments (this does not prevent adding optional arguments to the language). It’s not that big a problem, since as long as you’re not writing dynamic code (95% of the time?) the argument count is enforced by the typechecker.
  • dealing with Flash obscurities. For example, it turned out that the XML attributes field does not inherit from Object while still being an Object. Call it a bug… Making “instanceof” work correctly was also a bit difficult, since in Flash “hello” instanceof String is false while new String(“hello”) instanceof String is true…
  • Class organization. I don’t much like the current model. I was thinking of moving all Flash classes into the “flash” package (just like AS3 does) and the Neko classes into the “neko” package, keeping only the standard haXe classes in the root package. I’m not sure yet, because it would cause confusion for existing JS/Flash users
  • Exception traces in Flash: I wanted to get the same CallStackTrace and ExceptionStackTrace features as in Neko, but it’s not possible without adding some extra computations that update a global stack variable. This would cost CPU, but it might be a debug option that could be turned on/off with conditional compilation flags
  • AVM2 bytecode : some people are starting to work on it. I don’t have time yet, I will start looking at it more deeply after haXe 1.0
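The “global stack variable” idea from the exception-trace item above can be illustrated like this. It is a Python sketch of the technique only, not haXe output: `DEBUG_STACK` stands in for a conditional compilation flag, and the `traced` decorator stands in for code the compiler would insert at function entry and exit.

```python
import functools

DEBUG_STACK = True  # stand-in for a conditional compilation flag
call_stack = []     # the global stack variable updated on every call

def traced(fn):
    if not DEBUG_STACK:
        return fn   # tracking disabled: no extra cost per call
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        call_stack.append(fn.__name__)
        result = fn(*args, **kwargs)
        call_stack.pop()  # only reached on normal return
        return result
    return wrapper

@traced
def inner():
    raise ValueError("boom")

@traced
def outer():
    inner()

def run():
    depth = len(call_stack)
    try:
        outer()
    except ValueError:
        # Entries above the catch depth form the exception stack trace.
        trace = call_stack[depth:]
        del call_stack[depth:]  # restore the stack for the catching function
        return trace

if __name__ == "__main__":
    print(run())
```

Every call pays for a push and a pop, which is the CPU cost mentioned above and the reason to make it a debug-only option.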