Next: Run-time Systems Up: C73-Compiler Techniques Previous: Basic Code Generation

Target Architectures

Virtual Architectures

Language oriented instructions sets: makes compilation simpler, helps portability and the level abstraction introduced simplifies the task of native code generation (if required) or allows direct interpretation.

A prototypical byte code interpreter might look like this:

switch code[pc++] {
case instruction0:
...
case instruction1:
...
...
case instruction255:
...
}

Of course, byte-code interpretation has a cost, typically put at about a factor of 10 over native code, although it can be as low as 2. An initial reaction to designing a byte-code instruction set might be to make each one simple, so that its execution be quick. Perhaps surprisingly, this is not a good strategy, since the more intructions that are executed, the higher the proportion of time spent in the interpreter and the lower the proportion spent doing work for the program being interpreted. In consequence, it is better to make each instructions do as much work as possible, ie. a complex instruction set, which is contrary to the trend in hardware. However, this is not necessarily easy and there has been some interesting work on the static and dynamic behaviour of simple instruction sets to synthesize more complex ones.

P-code

Designed for Pascal. Implemented as a micro-code on the ICL PERQ. Variants/extensions for Modula and Oberon. Special instructions for range checking (CHK) and for non-local variable access (LDA). Stack model, suitable for byte-code interpretation. Also interesting are ORD, SQI, SQR. See handout.

ALM

Conceived for Lisp-shows its age-small set of relatively general instructions (note freedom of operand forms). Register model, unsuitable for interpretation because the overhead of operand decoding is too high. See handout.

WAM

The only fast way to execute Prolog. Uses four stacks: heap, stack (locals), trail and push-down list or PDL (saved bindings).

SECD

Actaully a finite state machine for interpreting programs (primarily functional, but not necessarily). Control can be any kind of instruction sequence as long as you define the transition tuples to go with it (see Henderson, Functional Programming). See handout. Devised by Landin-earliest example of a VM-[Landin, 1964]. This version is based on the description appearing in Field and Harrison. Note this is an applicative order machine, ie. call by value; it can be modified to support call by need.

The state of the SECD machine is stored in four stacks: store, environment, control and dump. The entire machine can be described in terms of the movement of data between these stacks. The main data structure is the closure, which is a 3-tuple of identifier, expression and environment (set of bindings). The execution cycle is described in detail in the handout.

In the example, we are computing (twice succ) 0, where twice = lambda f . lambda x . f(f x) and succ = lambda x . + x 1, therefore the result should be 2. In consequence, the input expression is (comb (comb twice succ) 0).

Smalltalk

An object-oriented language, which was byte-coded from the start (micro-code on Alto, Dorado, and later on 68K etc.) Byte code is OO. For full details see [Goldberg \& Robson, 1985]. Byte codes come in four classes:

stack:: There are 126 of these, of which 107 are varieties of push, 18 of store into a location and 1 which removes. Most interesting is the push active context operation-reification-like a process.
jump:: Thirty one kinds of jumps. Change the instruction pointer in the active context. Short and long (2 byte) forms.
send:: Eighty-three kinds of send, but most are to primitive operations. Cause a message to be sent: receiver and arguments are already on the active context's evaluation stack.
return:: Six different flavours of return: self, true, false, nil, tos and block return tos.

Typical Physical Architectures

CISC and RISC. RISC philosophy. Co-processors. Hardware trends. Multiprocessor architectures: shared (Unix boxes) versus distributed memory (transputers and i860). Multiple instruction multiple data. Single instruction multiple data (vector and array processors). Superscalar/pipeline architectures.

Motorola 680x0: CISC. Complexity for compiler writer from two register classes. Many complex addressing modes-hard to decide which to use. Instructions targetted at HLL constructs: chk, dbcc-impossible to use. Opt for use as RISC machine-shorter intructions (time and space).
Mips R3300: RISC. 32 general purpose registers. Simple addressing modes. Delayed branch. Multiple and divide instructions, rather than steps. No semaphore operation. Mips R4400: 64 bit word. TI C60: 6 delay slots. Alpha:???
Sparc: RISC. Four register groups (not classes): global, local, input (parameter receipt) and output (parameter preparation). Observation on stack depth in running programs. Stack cache with register access: register windows-256 registers. But save and restore are not RISC instructions. See overheads.
transputer: CISC. Stack architecture designed for distributed processing-hardware support for process scheduling and interprocess communications (links): note that stack is essential since ensures process state is stored on context switch. Byte code instruction set of curious design: 4 bits instruction and 4 bits data. Six indirectly manipulable 32 bit registers: A, B, C (three-level stack), W (workspace), I (instruction), O (operate). Physical memory only. 2K/4K on-chip memory-signed, starting at #x80000000.

Next: Run-time Systems Up: C73-Compiler Techniques Previous: Basic Code Generation

masjap@
Fri Oct 21 18:42:03 BST 1994