Jazelle
I've changed my mind about how the unhandled
bytecodes are interpreted. The information below is still valid (I'll
sort it out and update it eventually), but in the meantime take a look
at these posts:
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010853.html
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010855.html
http://lists.maemo.org/pipermail/maemo-developers/2007-July/010870.html
Jazelle is a hardware extension of the ARM architecture provided on some
ARM processors. The Nokia N800 uses an OMAP 2420 (ARM1136), which is
Jazelle-enabled.
For more general information, take a look at the ARM
website (http://www.arm.com/products/esd/jazelle_home.html) and
the Wikipedia page
(http://en.wikipedia.org/wiki/ARM_architecture#Jazelle).
As you'll see from the ARM website
(http://www.arm.com/products/esd/jazelle_architecture.html), there are
two incarnations of Jazelle: Jazelle DBX (Direct Bytecode eXecution),
which is the one in the N800 and the one I'm interested in, and Jazelle
RCT (Runtime Compilation Target).
Jazelle DBX gives the processor the ability to switch modes and directly
execute Java bytecode instructions. This means that the ARM processor
can replace the bytecode interpreter loop in a Java virtual machine. It's
not quite that easy, though: some Java bytecode instructions (the very
complex or infrequently used ones) are not directly supported, so some
form of unrecognised-instruction trap must be used alongside the Java
mode to catch these instructions and implement them in software.
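To make the split between directly executed and trapped bytecodes concrete, here is a minimal sketch in plain C. It is only a software simulation: fast_path() stands in for the hardware and slow_path() for the software handler. The opcode values are the real JVM ones, but everything else (the function names, the struct, and in particular the choice of imul as the "unsupported" instruction) is an assumption made for illustration, not a statement about what Jazelle actually handles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A handful of real JVM opcodes (values from the JVM specification). */
enum {
    OP_ICONST_1 = 0x04,
    OP_ICONST_2 = 0x05,
    OP_IADD     = 0x60,
    OP_IMUL     = 0x68,
    OP_IRETURN  = 0xAC
};

#define STACK_MAX 16

struct vm {
    int32_t  stack[STACK_MAX];
    size_t   sp;     /* number of items currently on the operand stack */
    unsigned traps;  /* opcodes that fell through to the software path */
};

static void push(struct vm *vm, int32_t v)
{
    assert(vm->sp < STACK_MAX);
    vm->stack[vm->sp++] = v;
}

static int32_t pop(struct vm *vm)
{
    assert(vm->sp > 0);
    return vm->stack[--vm->sp];
}

/* Stand-in for the hardware: returns 1 if the opcode was executed,
 * 0 if it "trapped". ASSUMPTION: the set of opcodes handled here is
 * arbitrary and does not reflect the real Jazelle hardware. */
static int fast_path(struct vm *vm, uint8_t op)
{
    int32_t a, b;

    switch (op) {
    case OP_ICONST_1: push(vm, 1); return 1;
    case OP_ICONST_2: push(vm, 2); return 1;
    case OP_IADD:
        b = pop(vm);
        a = pop(vm);
        push(vm, a + b);
        return 1;
    default:
        return 0;
    }
}

/* Stand-in for the software handler that deals with trapped opcodes.
 * Returns 1 if the opcode was handled, 0 if it is unknown here too. */
static int slow_path(struct vm *vm, uint8_t op)
{
    int32_t a, b;

    switch (op) {
    case OP_IMUL:
        b = pop(vm);
        a = pop(vm);
        push(vm, a * b);
        return 1;
    default:
        return 0;
    }
}

/* Runs bytecode until ireturn. Returns 0 on success, -1 on error. */
int run_bytecode(const uint8_t *code, size_t len,
                 int32_t *result, unsigned *traps)
{
    struct vm vm = { {0}, 0, 0 };
    size_t pc;

    for (pc = 0; pc < len; pc++) {
        uint8_t op = code[pc];

        if (op == OP_IRETURN) {
            if (vm.sp == 0)
                return -1;
            *result = pop(&vm);
            *traps = vm.traps;
            return 0;
        }
        if (fast_path(&vm, op))
            continue;
        vm.traps++;                 /* the "unrecognised instruction" case */
        if (!slow_path(&vm, op))
            return -1;
    }
    return -1;                      /* ran off the end without an ireturn */
}
```

Feeding it the sequence iconst_2, iconst_1, iadd, iconst_2, imul, ireturn computes (2 + 1) * 2, with exactly one opcode (the imul) taking the software path.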
So, it would be nice to be able to use this hardware acceleration in open-source Java virtual machines.
Sources of information
This
is an interesting article about Jazelle
(http://www.ftponline.com/javapro/2002_06/magazine/columns/javatogo/),
which mentions the fact that the ARM registers R0 to R3 are used to
contain the top items on the Java stack. Although other articles talk
about a partially register-based stack, this is the only one that says
which registers are used.
The Jazelle patent itself is available
here (http://www.google.com/patents?id=iMt6AAAAEBAJ&dq=7089539) and
contains some more interesting information about the registers used and
programmatic flow when dealing with one of the unrecognised Java
bytecodes.
What to do
I'm neither an expert ASM programmer nor a kernel hacker, but I thought
this would be a good way of gaining experience (or at least knowledge)
in both, so I've been looking into how an unrecognised instruction
(probably better called an illegal instruction) handler could be written
to work with the Jazelle hardware.
My first idea was to simply alter the CPU exception vector to point to
some code to handle the illegal instruction exception. This ought to
work, but it's not possible to access this memory location (0x04 or
0xFFFF0004 iirc) from user space, so you'd need a kernel module to make
the change. If you were to use a kernel module, you'd need to
encapsulate all of the functionality of the handler within it,
otherwise there might be a security risk of sorts. This is an option,
and probably the easiest one to understand.
However, the Linux kernel already traps these exceptions, so rather than
intercepting the exception it would be possible to let the kernel hand
the processing over to a handler. This is something that the old FPA/FPE
floating point emulation did. I don't know whether a module such as this
can simply be plugged into the system, or whether it would require
alterations to be made to the kernel.
Along these lines, the kernel informs threads of exceptions by means of
signals. Signals (such as the SIGILL that this ought to produce) can be
trapped and handled. I note that the ARM traps.c code only saves
registers R0 to R10 (there are many more), while the Jazelle hardware
obviously uses higher registers when returning from an unrecognised
instruction (see the patent). Is it necessary to save these as well?
So this is another option, and one which 'plays' better with the
kernel. The one problem one might find when trying to use this
technique is that the return branch to the Java bytecode address
requires a specific BXJ instruction, with the bytecode address in R14
and the address of an emulation function (just in case the hardware is
not present) in R12. Chances are that the code that returns after the
SIGILL handler doesn't do this. Then again, I've just read that the
handler pushes some code onto the stack to allow it to reset the
register state, so perhaps a bit of extra code could be pushed to fill
the required registers and perform the BXJ.
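The user-space half of the signal route can be sketched with the standard POSIX sigaction() call. This is only a rough illustration under a few assumptions: the names on_sigill, install_sigill_handler and sigill_demo are made up for the example, the handler merely counts signals rather than emulating a bytecode, and the signal is produced with raise() instead of a real illegal instruction. A real handler would also have to inspect and adjust the saved register state passed in the third argument, because simply returning after a genuine illegal instruction re-executes the faulting instruction. Nothing here attempts the BXJ return.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Number of SIGILLs seen so far; sig_atomic_t is safe to touch from a
 * signal handler. */
static volatile sig_atomic_t sigill_count = 0;

/* Hypothetical handler: a real one would decode the unhandled bytecode
 * and emulate it. The ucontext argument carries the saved registers,
 * which is where the return address would have to be adjusted. */
static void on_sigill(int signo, siginfo_t *info, void *ucontext)
{
    (void)info;
    (void)ucontext;
    if (signo == SIGILL)
        sigill_count++;
}

/* Installs the handler. Returns 0 on success, -1 on failure. */
int install_sigill_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_sigill;   /* three-argument handler form */
    sa.sa_flags = SA_SIGINFO;      /* ask for siginfo_t and ucontext */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGILL, &sa, NULL);
}

/* Usage sketch: installs the handler, sends this process a SIGILL with
 * raise() (standing in for a trapped instruction) and returns the number
 * of signals handled so far, or -1 if installation failed. */
int sigill_demo(void)
{
    int before;

    if (install_sigill_handler() != 0)
        return -1;
    before = sigill_count;
    raise(SIGILL);
    assert(sigill_count == before + 1);
    return sigill_count;
}
```

Because raise() delivers the signal before it returns in a single-threaded program, the counter has already been bumped by the time sigill_demo() checks it.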
That's
my current thinking (the thinking of someone who's rather enjoying
having something interesting to learn about, but doesn't necessarily
have any idea what he's talking about). I must read more of the patent
and see whether there are any other important/useful tidbits in there.
Cheers,
Simon
P.S.
If anyone has any comments or ideas, please email me at my uni address
(go up to http://bath.ac.uk/enpsgp to see what it is). Thanks!