Neko 1.2

posted on 2006-01-09

Neko 1.2 has just been released on http://nekovm.org

This release includes several major changes :

  • runtime exceptions : several operations that previously returned “null” now raise an exception. This is the case for invalid function calls (the value is not a function, or the number of arguments is wrong), object field access (for example null.x), array access (which still returns null when the index is out of bounds) and numerical operations.
  • linker : using “nekoc -link” you can now link several .n bytecode files together into a single standalone .n file
  • nekoboot : this utility enables you to create standalone executables from a single bytecode file
  • renaming : the Neko virtual machine is now named “neko” (instead of nekovm), and the Neko and NekoML compilers are named “nekoc” and “nekoml” respectively. The compilers are built using nekoboot and are therefore standalone executables (this is easier to use : simply run “nekoc myfile.neko”).
  • TCO : tail recursion optimization in Neko
  • Object Prototypes : objects can now have chained prototypes (see the Language Reference Documentation)
  • standard library : added UTF8 support, an improved XML parser, and other useful primitives
  • licence change : Neko 1.2 is now LGPL while Neko 1.1 was GPL

There is also an experimental JIT, but it is not yet stable and is currently very slow (the JIT generator, not the generated code).

I’m looking for people interested in using Neko as a target for compilers and research papers; contact me directly or on the Neko mailing list if you are interested.

haXe Type System Extensions for Async Calls

posted on 2006-01-06

Here’s an interesting problem to tackle. haXe is a common language targeting different platforms (Flash, JavaScript and Neko) with different APIs. It would be nice to have some kind of integrated RPC so that you can directly and transparently call JS code from Flash, or Neko (server side) from JS (client side).

For example, let’s say you want to access the server filesystem from Flash. First you’ll have to connect to the server and negotiate the rights to do this, but that’s not the difficult part. Ideally, you would just use the File class directly and, if you have the rights, all calls would be forwarded to the server and the responses sent back transparently.

class File {
    public function new( name : String ) {
         ...
    }
    public function write( data : String ) : Void {
         ...
    }
    public function close() : Void {
         ...
    }
    public function size() : Int {
         ...
    }
}

The problem here is that File is implemented using the haXe Neko APIs, which are not available from Flash. Actually, when compiling a haXe program for Flash, you don’t even have access to the File class since it’s in a separate directory.

In that situation, you would normally end up writing a FileProxy class that implements the methods in terms of crossplatform RPC calls, and use this proxy class instead of the original one. The idea behind Type System Extensions is to have these proxies generated automatically and handled correctly by the type system.

In order to make this work, we need to introduce a special parameterized type named Proxy. For instance, you first instantiate a Proxy<File> and then manipulate it as a File. A Proxy has the same methods as its parameter, but all calls are forwarded using RPC.

To pass the typechecker, the compiler needs to search additional directories when typing a Proxy, so that it can find the File class. Then, instead of fully typing and compiling that class, we only need to parse it and extract the types of its public methods and fields. This builds the type of the Proxy. As a consequence you cannot rely on haXe type inference for the proxied class : all the File methods must be explicitly typed (when they are not, simply removing them from the Proxy type may be a sound possibility).

But this is not enough. RPC sometimes assumes that calls are synchronous, so every call is “waiting” for the answer of the server before returning. This is not possible with Flash and JS, since all network calls are asynchronous. The Proxy then needs to make the return values of all methods asynchronous, for example transforming the method :

public function size() : Int

into the following :

public function size() : Async<Int>

with Async having the following definition for example :

class Async<T> {
      public var onValue : T -> Void; // event callback
}

This way, instead of using the File class transparently, you’ll do something like :

      var f : Proxy<File> = ... // instantiate
      f.write("hello world!"); // transparent call
      var size : Async<Int> = f.size(); // async call
      size.onValue = function(s : Int) {
           trace("file size is " + s);
      }

(the local variable types here are optional : they’re not needed since there is local type inference, but they help clarify the example).
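
To make this more concrete, here is a rough sketch of the same idea written in TypeScript rather than haXe, since its mapped types can express the Proxy rewrite directly. It is only an illustration under assumptions : RemoteProxy, FileApi, FakeConnection, makeProxy and the resolve() method are names made up for the example, and the connection is an in-memory queue standing in for a real network.

```typescript
// Same shape as the Async<T> class above; resolve() is an assumed helper
// that the RPC layer calls once the answer has arrived.
class Async<T> {
  onValue: ((value: T) => void) | null = null;
  resolve(value: T): void {
    if (this.onValue !== null) this.onValue(value);
  }
}

// Type-level rewrite: a method returning void keeps its signature,
// any other return type R becomes Async<R>.
type RemoteProxy<T> = {
  [K in keyof T]: T[K] extends (...args: infer A) => infer R
    ? (...args: A) => ([R] extends [void] ? void : Async<R>)
    : never;
};

// Hypothetical description of the remote class (typed public methods only).
interface FileApi {
  write(data: string): void;
  close(): void;
  size(): number;
}

// One pending remote call; reply is invoked when the answer comes back.
type Call = { method: string; args: unknown[]; reply: (result: unknown) => void };

// In-memory stand-in for the network: calls are queued and only answered
// when flush() runs, which mimics asynchronous delivery.
class FakeConnection {
  private queue: Call[] = [];
  private target: Record<string, (...args: any[]) => unknown>;
  constructor(target: Record<string, (...args: any[]) => unknown>) {
    this.target = target;
  }
  send(call: Call): void {
    this.queue.push(call);
  }
  flush(): void {
    for (const call of this.queue.splice(0)) {
      const fn = this.target[call.method];
      if (fn === undefined) throw new Error("no such method: " + call.method);
      call.reply(fn(...call.args));
    }
  }
}

// Builds the forwarding object: every method call becomes a queued remote
// call and immediately returns an Async for its result.
function makeProxy<T>(conn: FakeConnection): RemoteProxy<T> {
  const handler: ProxyHandler<object> = {
    get(_target, method) {
      return (...args: unknown[]) => {
        const result = new Async<unknown>();
        conn.send({ method: String(method), args, reply: (v) => result.resolve(v) });
        return result;
      };
    },
  };
  return new Proxy({}, handler) as unknown as RemoteProxy<T>;
}

// Usage, with a fake "server side" file kept in a local string.
let contents = "";
const server = {
  write: (data: string) => { contents += data; },
  close: () => {},
  size: () => contents.length,
};
const conn = new FakeConnection(server);
const f = makeProxy<FileApi>(conn);
f.write("hello world!");                      // transparent call
let reported = -1;
f.size().onValue = (s) => { reported = s; };  // async call
conn.flush(); // the queued calls are answered here and onValue fires
```

After flush(), reported holds the size of the data written through the proxy. The write call has nothing to wait for, while size() hands back an Async<number>, which matches the distinction between transparent and asynchronous calls made above.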

There are several problems left :

  • Classes platform : right now there is no “platform” information for a class, so if File had a method returning another File, the Proxy couldn’t know that it should turn the result into Async<Proxy<File>> instead of just Async<File>, which is not correct.
  • Object persistence : since you create objects on the “other side”, you need to keep references to them. This can be done by keeping a connection open. All objects are freed when the connection closes (with maybe some chance to reconnect). But when an object is no longer needed, the client needs to free it explicitly (by adding free() to the Proxy type). This is a limitation of the platforms that don’t offer finalizers for objects.
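
The object persistence point can be sketched as a small handle table on the side that owns the real objects. This is a TypeScript illustration only : ObjectTable and its method names are hypothetical, and the handles are assumed to be plain integers sent over the connection.

```typescript
// Hypothetical server-side table: remote objects are kept alive by an
// integer handle until the client frees them or the connection closes.
class ObjectTable {
  private nextId = 1;
  private objects = new Map<number, object>();

  // Stores the object and returns the handle sent back to the client.
  register(obj: object): number {
    const id = this.nextId++;
    this.objects.set(id, obj);
    return id;
  }

  // Resolves a handle received from the client; fails for freed handles.
  lookup(id: number): object {
    const obj = this.objects.get(id);
    if (obj === undefined) throw new Error("unknown or freed object #" + id);
    return obj;
  }

  // Explicit release, the counterpart of the free() added to the Proxy type.
  free(id: number): boolean {
    return this.objects.delete(id);
  }

  // Called when the connection closes: everything still alive is dropped.
  closeConnection(): void {
    this.objects.clear();
  }

  liveCount(): number {
    return this.objects.size;
  }
}

const table = new ObjectTable();
const a = table.register({ path: "a.txt" });
const b = table.register({ path: "b.txt" });
table.free(a); // the client no longer needs the first file
```

Without finalizers on the client, nothing would ever call free(a) automatically, which is why the release has to be explicit and why closing the connection acts as the last-resort cleanup.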

After being able to specify “platform” information for a class, it should be possible to write an Asynchronous RPC engine directly in haXe. This would enable inter-platform Proxy-transparent communications, with only some extensions to the type system.

Neko Boot

posted on 2006-01-05

I spent a few hours modifying the NekoVM boot so that it’s possible to create standalone binaries.

The idea is first to link the bytecode (using the -link neko compiler parameter). This looks for all $loader.loadmodule("constant string",$loader) calls inside the bytecode and replaces each of them with the corresponding inlined module. It works very nicely with NekoML and MotionTypes, and should work for all other languages that use such statements for module resolution, which is the normal way of doing things.

neko -link myapp.n MyModule

The output is one big linked bytecode file that shouldn’t need any other .n bytecode library anymore. Only when there are unpredictable loadmodule calls are these left as-is, and the corresponding bytecode files are then still needed.

Then a simple Neko program called nekoboot concatenates the NekoVM boot with the linked bytecode, and modifies a special value, located somewhere in the boot, that stores the filesize so that the boot can access it. At runtime, since the filesize is defined, the boot loads the module directly by reading itself instead of taking an argument. You only need std.ndll, which is required by the boot loader, and of course the VM library (neko.dll or libneko.so) and the other ndll C libraries that you are using.

nekoboot myapp.n

This will build a standalone myapp.exe binary (without the extension on Linux/OSX).
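
The concatenate-and-patch step can be sketched as follows, in TypeScript for illustration. The layout is entirely an assumption : a made-up "BOOT" marker followed by a 4-byte little-endian slot, and the stored value is taken to be the size of the original boot, which is also the offset where the bytecode starts. The real nekoboot layout is not described here.

```typescript
// Hypothetical marker that precedes the 4-byte size slot inside the boot.
const MARKER = [0x42, 0x4f, 0x4f, 0x54]; // "BOOT"

// Returns the offset of the marker, or -1 when there is none.
function findMarker(bytes: Uint8Array): number {
  for (let i = 0; i + MARKER.length + 4 <= bytes.length; i++) {
    if (MARKER.every((b, j) => bytes[i + j] === b)) return i;
  }
  return -1;
}

// nekoboot side: append the bytecode and record the boot size in the slot.
function makeStandalone(boot: Uint8Array, bytecode: Uint8Array): Uint8Array {
  const at = findMarker(boot);
  if (at < 0) throw new Error("boot binary has no size slot");
  const out = new Uint8Array(boot.length + bytecode.length);
  out.set(boot, 0);
  out.set(bytecode, boot.length);
  new DataView(out.buffer).setUint32(at + MARKER.length, boot.length, true);
  return out;
}

// Boot side: if the slot is set, the module is read from the binary itself.
function embeddedBytecode(image: Uint8Array): Uint8Array | null {
  const at = findMarker(image);
  if (at < 0) return null;
  const view = new DataView(image.buffer, image.byteOffset, image.byteLength);
  const bootSize = view.getUint32(at + MARKER.length, true);
  return bootSize === 0 ? null : image.subarray(bootSize);
}

// Tiny fake boot: two bytes of "code", the marker, an empty slot, one more byte.
const boot = new Uint8Array([1, 2, 0x42, 0x4f, 0x4f, 0x54, 0, 0, 0, 0, 9]);
const bytecode = new Uint8Array([7, 8, 9]);
const exe = makeStandalone(boot, bytecode);
```

An unpatched boot has a zero slot, so embeddedBytecode returns null and the boot would fall back to taking its module as an argument.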

Neko JIT

posted on 2005-12-31

This week I’ve been working on a JIT X86 code generator for Neko. After 4 days most of the opcodes are implemented and it seems to work very well. Also, this has been possible with very few changes to the VM itself, since the JIT engine is entirely written in NekoML, so it’s actually a Neko library (while most JIT engines are usually written in C and are part of the VM).

At first, I was thinking about a two-layer approach : a first generator that would translate the Neko bytecode into abstract CPU-independent opcodes, and then a second generator that would generate the corresponding X86 opcodes. After two days I realized that my abstract opcodes were actually very close to the X86 ones, for the good reason that I didn’t know exactly what should be abstracted.

I was studying the X86 opcodes at the same time as I was writing the JIT. Although I did a little ASM back in my Pascal days, I didn’t know the binary representations of the opcodes at all (they are quite difficult, actually, since there are a lot of different forms for the same opcode). Since I don’t know precisely the opcodes of the other platforms the JIT might be ported to (AMD64 ?), it was a bad idea from the beginning to abstract something I didn’t know about.

So I ended up merging my two layers into a one-layer generator, with a small separate X86 library that takes care of the binary representation of opcodes. Basically, the JIT does the following things :

  1. read the module that needs to be JIT-compiled. This is done using the module API of the Neko standard library, which allows you to read and safely manipulate a module without initializing it.
  2. from the decoded integers representing opcodes, rebuild the NekoML opcodes (this requires some NekoML magic)
  3. for each opcode, generate the corresponding X86 opcodes : you need to keep some VM state in registers (stack, accumulator, VM instance…). It actually means rewriting the VM bytecode interpreter loop in assembler.
  4. use the X86 library to build a string representing the opcodes
  5. use the $jit builtin to execute the code : this builtin is the only unsafe builtin in all of Neko, which means that the only way to crash the VM is by using this builtin. For security sandboxes, it might be nice to change it into a primitive, but right now it’s easier to debug this way.
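
To give an idea of what step 4 involves, here is a small sketch of an opcode emitter, written in TypeScript for illustration rather than in NekoML. The X86Buffer class and its method names are hypothetical and are not the API of the actual X86 library; the byte encodings themselves are the standard 32-bit X86 ones. The addImm method shows the "different forms for the same opcode" problem mentioned above.

```typescript
// Register numbers as used in X86 encodings (ecx = 1 and edx = 2 are omitted).
const EAX = 0;
const EBX = 3;

class X86Buffer {
  bytes: number[] = [];

  // Appends a 32-bit immediate in little-endian order.
  private imm32(value: number): void {
    for (let i = 0; i < 4; i++) this.bytes.push((value >>> (8 * i)) & 0xff);
  }

  // mov reg, imm32 : opcode B8+reg followed by the immediate
  movImm(reg: number, value: number): void {
    this.bytes.push(0xb8 + reg);
    this.imm32(value);
  }

  // mov dst, src : opcode 89 with a register-to-register ModRM byte
  movReg(dst: number, src: number): void {
    this.bytes.push(0x89, 0xc0 | (src << 3) | dst);
  }

  // add reg, imm : three encodings exist, so pick the shortest one
  addImm(reg: number, value: number): void {
    if (value >= -128 && value <= 127) {
      this.bytes.push(0x83, 0xc0 | reg, value & 0xff); // sign-extended imm8
    } else if (reg === EAX) {
      this.bytes.push(0x05); // short form reserved for eax
      this.imm32(value);
    } else {
      this.bytes.push(0x81, 0xc0 | reg); // general form
      this.imm32(value);
    }
  }

  ret(): void {
    this.bytes.push(0xc3);
  }

  // Hex dump of the generated code, two digits per byte.
  hex(): string {
    return this.bytes.map((b) => (b < 16 ? "0" : "") + b.toString(16)).join(" ");
  }
}

const code = new X86Buffer();
code.movImm(EAX, 1);    // b8 01 00 00 00
code.addImm(EAX, 2);    // 83 c0 02
code.addImm(EAX, 1000); // 05 e8 03 00 00
code.addImm(EBX, 1000); // 81 c3 e8 03 00 00
code.movReg(EBX, EAX);  // 89 c3
code.ret();             // c3
```

Keeping this encoding knowledge in one place is what lets the generator itself reason in terms of instructions instead of bytes.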

Once the code is generated, the work is not finished. The bytecode functions are stored in the module globals table, so if you run the JIT code at this point, all function calls will go back to the bytecode interpreter instead of running the JIT version of the function. It was therefore necessary to introduce a new type of function (after “bytecode-function” and “C primitive”) that calls back into the JIT, and to transform all module globals storing bytecode functions into the corresponding JIT-functions.
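
As a toy model of that transformation (in TypeScript, with made-up data structures that do not reflect the real VM layout), the globals table can be seen as an array in which each bytecode function is swapped for a JIT function pointing at the generated code :

```typescript
// Toy model of the three kinds of function value described above.
type FunctionValue =
  | { kind: "bytecode"; address: number } // runs in the interpreter
  | { kind: "primitive"; id: string }     // implemented in C (id is a made-up field)
  | { kind: "jit"; entry: number };       // jumps into generated code

type GlobalValue = FunctionValue | number | string | null;

// Replaces every bytecode function in the globals table by its JIT
// counterpart, using a map from bytecode address to generated-code offset.
// Functions that were not compiled are left untouched.
function patchGlobals(globals: GlobalValue[], compiled: Map<number, number>): number {
  let patched = 0;
  for (let i = 0; i < globals.length; i++) {
    const g = globals[i];
    if (g !== null && typeof g === "object" && g.kind === "bytecode") {
      const entry = compiled.get(g.address);
      if (entry !== undefined) {
        globals[i] = { kind: "jit", entry };
        patched++;
      }
    }
  }
  return patched;
}

const moduleGlobals: GlobalValue[] = [
  "hello",
  { kind: "bytecode", address: 10 },
  { kind: "primitive", id: "print" },
  { kind: "bytecode", address: 42 },
];
const patchedCount = patchGlobals(moduleGlobals, new Map([[10, 0], [42, 128]]));
```

Once the table is patched, any call that goes through a global reaches the generated code directly, while primitives and plain values are unaffected.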

Works like a charm.

However, there are still a few things left :

  • implement the missing Neko opcodes (mainly numeric operations) plus the JmpTable opcode, which is more difficult
  • right now the VM cannot call back JIT-functions : before they can actually start executing the JIT code, the registers need to be set up correctly for JIT-mode
  • same for exceptions : right now when an exception is caught, control returns to the bytecode version of the module instead of the JIT version

Once that is done, it will be pretty nice, since both the JIT and the VM interpreter will be able to call back into each other transparently while still keeping nice Neko features such as stack traces fully working.

You can look at the JIT sources in nekoCVS/src/jit/Jit.nml; that’s only a 26K NekoML file right now.

haXe updates

posted on 2005-12-23

The haXe project was started two months ago, and even if I’m far from working fulltime on it, it’s making good progress with today’s Alpha 5 release.

Right now there are only SWF, Neko and XML outputs, but I’m planning to add JavaScript output soon, so it should be available in January. Then it will need some work on testing, on reducing the platform differences as much as possible, and on increasing the number of libraries available in the distribution. Here are some of the problems that need to be tackled :

  • one big platform difference between Neko and Flash is that you can call a function with any number of arguments in Flash, while you must call it with the exact number of arguments in Neko (this does not prevent adding optional arguments to the language). It’s not that big a problem, since if you’re not writing dynamic code (95% of the time ?) the argument count is enforced by the typechecker.
  • dealing with Flash obscurities. For example, it was found that the XML attributes field does not inherit from Object while still being an Object. Call it a bug… Making “instanceof” work correctly was also a bit difficult, since in Flash “hello” instanceof String is false while new String(“hello”) instanceof String is true…
  • Classes organization. I don’t much like the current model. I was thinking of moving all Flash classes into the “flash” package (just like AS3 is doing), the Neko classes into the “neko” package, and keeping only the standard haXe classes in the root package. I’m not sure yet, because it will cause confusion for existing JS/Flash users
  • Exception traces in Flash : I wanted to get the same CallStackTrace and ExceptionStackTrace features as in Neko, but it’s not possible without adding some extra computations that update a global stack variable. This would cost CPU, but it might be a debug option that could be turned on/off with conditional compilation flags
  • AVM2 bytecode : some people are starting to work on it. I don’t have time yet; I will start looking at it more closely after haXe 1.0
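
The “instanceof” oddity mentioned above also exists in JavaScript, so it can be shown with a short TypeScript sketch. The isString helper is only an assumption about what a corrected check could look like, not the way haXe actually implements it.

```typescript
// True for both primitive strings and String wrapper objects.
function isString(value: unknown): boolean {
  return typeof value === "string" || value instanceof String;
}

const literal: unknown = "hello";
const wrapped: unknown = new String("hello");

// A primitive string is not an object, so instanceof rejects it.
const literalIsInstance = literal instanceof String; // false
const wrappedIsInstance = wrapped instanceof String; // true
```

Checking typeof first makes the test give the same answer for both representations, which is the behaviour a user of a typed language would expect.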