Jazelle

I've changed my mind about how the unhandled bytecodes are interpreted. The information below is still valid (I'll sort it out and update it eventually), but in the meantime take a look at these posts:
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010853.html
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010855.html
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010870.html

Jazelle is a hardware extension of the ARM architecture provided on some ARM processors. The Nokia N800 uses an OMAP 2420 (ARM1136), which is Jazelle-enabled.
For more general information, take a look at the ARM website (http://www.arm.com/products/esd/jazelle_home.html) and the Wikipedia page (http://en.wikipedia.org/wiki/ARM_architecture#Jazelle).

As you'll see from the ARM website (http://www.arm.com/products/esd/jazelle_architecture.html), there are two incarnations of Jazelle: Jazelle DBX (Direct Bytecode eXecution), which is the one in the N800 and the one I'm interested in, and Jazelle RCT (Runtime Compilation Target).

Jazelle DBX gives the processor the ability to switch modes and execute Java bytecode instructions directly. This means that the ARM processor can replace the bytecode interpreter loop in a Java virtual machine. It's not quite that easy, though: some Java bytecode instructions (the very complex or infrequently used ones) are not directly supported, so some form of unrecognised-instruction trap must be used alongside the Java mode to ensure that these instructions are caught and can be implemented in software.
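As a rough illustration of that split, here is a minimal C sketch of an interpreter loop in which most bytecodes take a "direct" path and the rest fall through to a software handler. The opcode values are the real JVM ones, but everything else is an assumption made for illustration: the names (hw_execute, sw_emulate, run_bytecode), the tiny opcode subset, and in particular the choice of imul as the "unhandled" bytecode. It says nothing about which bytecodes Jazelle actually traps on.

```c
#include <stddef.h>
#include <stdint.h>

/* Real JVM opcode values; the subset chosen here is arbitrary. */
enum { OP_BIPUSH = 0x10, OP_IADD = 0x60, OP_IMUL = 0x68, OP_IRETURN = 0xac };

#define STACK_MAX 16

typedef struct {
    int32_t  stack[STACK_MAX];
    size_t   sp;     /* number of items currently on the operand stack */
    size_t   pc;     /* index of the next bytecode to execute */
    unsigned traps;  /* how many bytecodes were handled in software */
} vm_state;

/* Hypothetical stand-in for the hardware path. Returns 1 if the bytecode
 * was executed, 0 if it is "unhandled" and must go to the software path. */
static int hw_execute(vm_state *vm, const uint8_t *code, size_t len)
{
    switch (code[vm->pc]) {
    case OP_BIPUSH:
        if (vm->pc + 1 >= len || vm->sp >= STACK_MAX)
            return 0;
        vm->stack[vm->sp++] = (int8_t)code[vm->pc + 1];
        vm->pc += 2;
        return 1;
    case OP_IADD:
        if (vm->sp < 2)
            return 0;
        vm->sp--;
        /* unsigned arithmetic gives Java-style wrap-around */
        vm->stack[vm->sp - 1] = (int32_t)((uint32_t)vm->stack[vm->sp - 1]
                                          + (uint32_t)vm->stack[vm->sp]);
        vm->pc += 1;
        return 1;
    default:
        return 0;
    }
}

/* Software fallback for bytecodes the "hardware" did not take (assumed set). */
static int sw_emulate(vm_state *vm, const uint8_t *code)
{
    switch (code[vm->pc]) {
    case OP_IMUL:
        if (vm->sp < 2)
            return 0;
        vm->sp--;
        vm->stack[vm->sp - 1] = (int32_t)((uint32_t)vm->stack[vm->sp - 1]
                                          * (uint32_t)vm->stack[vm->sp]);
        vm->pc += 1;
        vm->traps++;
        return 1;
    default:
        return 0;
    }
}

/* Runs until ireturn. Returns 0 and fills *result and *traps on success,
 * -1 on an unknown bytecode, a stack error, or a missing ireturn. */
int run_bytecode(const uint8_t *code, size_t len, int32_t *result, unsigned *traps)
{
    vm_state vm = {0};

    while (vm.pc < len) {
        if (code[vm.pc] == OP_IRETURN) {   /* handled in the loop for brevity */
            if (vm.sp < 1)
                return -1;
            *result = vm.stack[vm.sp - 1];
            *traps = vm.traps;
            return 0;
        }
        if (!hw_execute(&vm, code, len) && !sw_emulate(&vm, code))
            return -1;
    }
    return -1;
}
```

Running the sequence bipush 2, bipush 3, iadd, bipush 4, imul, ireturn through run_bytecode takes the software path once, for the imul. On real hardware the first function would not exist at all; the processor would do that work itself and only the fallback would be written in software.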

So, it would be nice to be able to use this hardware acceleration in open-source Java virtual machines.

Sources of information

This is an interesting article about Jazelle (http://www.ftponline.com/javapro/2002_06/magazine/columns/javatogo/), which mentions that the ARM registers R0 to R3 hold the top items of the Java stack. Although other articles talk about a partially register-based stack, this is the only one that says which registers are used.

The Jazelle patent itself is available here (http://www.google.com/patents?id=iMt6AAAAEBAJ&dq=7089539) and contains some more interesting information about the registers used and programmatic flow when dealing with one of the unrecognised Java bytecodes.

What to do

I'm neither an expert ASM programmer nor a kernel hacker, but I thought this would be a good way of gaining experience (or at least knowledge) in both, so I've been looking into how a handler for unrecognised instructions (probably better called illegal instructions) could be written to work with the Jazelle hardware.

My first idea was to simply alter the CPU exception vector to point to some code that handles the illegal instruction exception. This ought to work; however, it's not possible to access this memory location (0x04, or 0xFFFF0004 when high vectors are enabled) from user space, so you'd need a kernel module to make the change. If you were to use a kernel module, you'd need to encapsulate all of the functionality of the handler within it, otherwise there might be a security risk of sorts. This is an option, and probably the easiest one to understand.
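For reference, the two addresses mentioned above follow from the standard ARM vector layout: the undefined instruction vector sits at offset 0x04 from the vector base, and the base is either 0x00000000 or 0xFFFF0000 depending on whether high vectors are selected. The small C sketch below just does that arithmetic; the enum, macro and function names are made up for illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* Offsets of the ARM exception vectors from the vector base. */
enum arm_vector {
    VEC_RESET          = 0x00,
    VEC_UNDEFINED      = 0x04,  /* the one an illegal instruction goes through */
    VEC_SWI            = 0x08,
    VEC_PREFETCH_ABORT = 0x0c,
    VEC_DATA_ABORT     = 0x10,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1c
};

#define VECTORS_LOW  0x00000000u
#define VECTORS_HIGH 0xffff0000u

/* Address of a vector, given whether high vectors are in use. */
uint32_t vector_address(int high_vectors, enum arm_vector v)
{
    return (high_vectors ? VECTORS_HIGH : VECTORS_LOW) + (uint32_t)v;
}
```

Either way the address lies in memory that a user-space process cannot write to, which is why this route needs kernel code.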

However, the Linux kernel already traps these exceptions, so rather than intercepting the exception it would be possible to have the kernel hand the processing over. This is something that the old FPA/FPE floating point emulation did. I don't know whether a module such as this can simply be plugged into the system, or whether it would require alterations to the kernel.

Along these lines, the kernel informs threads of exceptions by means of signals. Signals (such as the SIGILL that this ought to produce) can be trapped and handled. I note that the ARM trap.c code only saves registers R0 to R10 (there are many more), while the Jazelle hardware obviously uses higher registers (when returning from an unrecognised instruction; see the patent), but is it necessary to save these? So this is another option, and one which 'plays' better with the kernel. The one problem with this technique is that the return branch to the Java bytecode address requires a specific BXJ instruction, with the bytecode address in R14 and the address of an emulation function (just in case the hardware is not present) in R12. Chances are that the code that returns after the SIGILL handler doesn't do this. Then again, I've just read that the handler pushes some code onto the stack to allow it to reset the register state, so perhaps a bit of extra code could be pushed to fill the required registers and do the BXJ.
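The user-space half of the signal approach can be sketched with plain POSIX calls. The example below installs a SIGILL handler with sigaction and SA_SIGINFO, then sends the process a synthetic SIGILL with raise() to show that the handler runs. It is only a sketch: the function names are invented for illustration, nothing in it is Jazelle-specific, and it does not attempt the BXJ return discussed above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counts how many times the handler has run. */
static volatile sig_atomic_t sigill_seen = 0;

static void on_sigill(int signo, siginfo_t *info, void *context)
{
    (void)signo;
    (void)info;     /* info->si_addr is the address of the faulting instruction */
    (void)context;  /* really a ucontext_t * holding the saved register state */
    sigill_seen++;
}

/* Installs the handler; returns 0 on success, -1 on failure (as sigaction does). */
int install_sigill_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigill;
    sa.sa_flags = SA_SIGINFO;   /* ask for the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}

/* Installs the handler, raises a synthetic SIGILL, and returns how many
 * times the handler ran as a result (-1 if installation failed). */
int sigill_round_trip(void)
{
    int before = (int)sigill_seen;

    if (install_sigill_handler() != 0)
        return -1;
    raise(SIGILL);  /* the handler runs before raise() returns */
    return (int)sigill_seen - before;
}
```

With a real illegal instruction, returning normally from the handler would simply re-run the same instruction, so a real handler would have to change the saved program counter (and whatever other registers matter) in the context argument before returning. That saved context is where the question of which registers get preserved would be answered.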

That's my current thinking (the thinking of someone who's rather enjoying having something interesting to learn about, but doesn't necessarily have any idea what he's talking about). I must read more of the patent and see whether there are any other important/useful tidbits in there.

Cheers,

Simon

P.S. If anyone has any comments or ideas, please email me at my uni address (go up to http://bath.ac.uk/enpsgp to see what it is). Thanks!