197 lines
7.5 KiB
Text
197 lines
7.5 KiB
Text
rt "mterp" README
|
|
|
|
NOTE: Find rebuilding instructions at the bottom of this file.
|
|
|
|
|
|
==== Overview ====
|
|
|
|
Every configuration has a "config-*" file that controls how the sources
|
|
are generated. The sources are written into the "out" directory, where
|
|
they are picked up by the Android build system.
|
|
|
|
The best way to become familiar with the interpreter is to look at the
|
|
generated files in the "out" directory.
|
|
|
|
|
|
==== Config file format ====
|
|
|
|
The config files are parsed from top to bottom. Each line in the file
|
|
may be blank, hold a comment (line starts with '#'), or be a command.
|
|
|
|
The commands are:
|
|
|
|
handler-style <computed-goto|jump-table>
|
|
|
|
Specify which style of interpreter to generate. In computed-goto,
|
|
each handler is allocated a fixed region, allowing transitions to
|
|
be done via table-start-address + (opcode * handler-size). With
|
|
jump-table style, handlers may be of any length, and the generated
|
|
table is an array of pointers to the handlers. This command is required,
|
|
and must be the first command in the config file.
|
|
|
|
handler-size <bytes>
|
|
|
|
Specify the size of the fixed region, in bytes. On most platforms
|
|
this will need to be a power of 2. For jump-table implementations,
|
|
this command is ignored.
|
|
|
|
import <filename>
|
|
|
|
The specified file is included immediately, in its entirety. No
|
|
substitutions are performed. ".cpp" and ".h" files are copied to the
|
|
C output, ".S" files are copied to the asm output.
|
|
|
|
asm-alt-stub <filename>
|
|
|
|
When present, this command will cause the generation of an alternate
|
|
set of entry points (for computed-goto interpreters) or an alternate
|
|
jump table (for jump-table interpreters).
|
|
|
|
fallback-stub <filename>
|
|
|
|
Specifies a file to be used for the special FALLBACK tag on the "op"
|
|
command below. Intended to be used to transfer control to an alternate
|
|
interpreter to single-step a not-yet-implemented opcode. Note: should
|
|
note be used on RETURN-class instructions.
|
|
|
|
op-start <directory>
|
|
|
|
Indicates the start of the opcode list. Must precede any "op"
|
|
commands. The specified directory is the default location to pull
|
|
instruction files from.
|
|
|
|
op <opcode> <directory>|FALLBACK
|
|
|
|
Can only appear after "op-start" and before "op-end". Overrides the
|
|
default source file location of the specified opcode. The opcode
|
|
definition will come from the specified file, e.g. "op OP_NOP arm"
|
|
will load from "arm/OP_NOP.S". A substitution dictionary will be
|
|
applied (see below). If the special "FALLBACK" token is used instead of
|
|
a directory name, the source file specified in fallback-stub will instead
|
|
be used for this opcode.
|
|
|
|
alt <opcode> <directory>
|
|
|
|
Can only appear after "op-start" and before "op-end". Similar to the
|
|
"op" command above, but denotes a source file to override the entry
|
|
in the alternate handler table. The opcode definition will come from
|
|
the specified file, e.g. "alt OP_NOP arm" will load from
|
|
"arm/ALT_OP_NOP.S". A substitution dictionary will be applied
|
|
(see below).
|
|
|
|
op-end
|
|
|
|
Indicates the end of the opcode list. All kNumPackedOpcodes
|
|
opcodes are emitted when this is seen, followed by any code that
|
|
didn't fit inside the fixed-size instruction handler space.
|
|
|
|
The order of "op" and "alt" directives are not significant; the generation
|
|
tool will extract ordering info from the VM sources.
|
|
|
|
Typically the form in which most opcodes currently exist is used in
|
|
the "op-start" directive.
|
|
|
|
==== Instruction file format ====
|
|
|
|
The assembly instruction files are simply fragments of assembly sources.
|
|
The starting label will be provided by the generation tool, as will
|
|
declarations for the segment type and alignment. The expected target
|
|
assembler is GNU "as", but others will work (may require fiddling with
|
|
some of the pseudo-ops emitted by the generation tool).
|
|
|
|
A substitution dictionary is applied to all opcode fragments as they are
|
|
appended to the output. Substitutions can look like "$value" or "${value}".
|
|
|
|
The dictionary always includes:
|
|
|
|
$opcode - opcode name, e.g. "OP_NOP"
|
|
$opnum - opcode number, e.g. 0 for OP_NOP
|
|
$handler_size_bytes - max size of an instruction handler, in bytes
|
|
$handler_size_bits - max size of an instruction handler, log 2
|
|
|
|
Both C and assembly sources will be passed through the C pre-processor,
|
|
so you can take advantage of C-style comments and preprocessor directives
|
|
like "#define".
|
|
|
|
Some generator operations are available.
|
|
|
|
%include "filename" [subst-dict]
|
|
|
|
Includes the file, which should look like "arm/OP_NOP.S". You can
|
|
specify values for the substitution dictionary, using standard Python
|
|
syntax. For example, this:
|
|
%include "arm/unop.S" {"result":"r1"}
|
|
would insert "arm/unop.S" at the current file position, replacing
|
|
occurrences of "$result" with "r1".
|
|
|
|
%default <subst-dict>
|
|
|
|
Specify default substitution dictionary values, using standard Python
|
|
syntax. Useful if you want to have a "base" version and variants.
|
|
|
|
%break
|
|
|
|
Identifies the split between the main portion of the instruction
|
|
handler (which must fit in "handler-size" bytes) and the "sister"
|
|
code, which is appended to the end of the instruction handler block.
|
|
In jump table implementations, %break is ignored.
|
|
|
|
The generation tool does *not* print a warning if your instructions
|
|
exceed "handler-size", but the VM will abort on startup if it detects an
|
|
oversized handler. On architectures with fixed-width instructions this
|
|
is easy to work with, on others this you will need to count bytes.
|
|
|
|
|
|
==== Using C constants from assembly sources ====
|
|
|
|
The file "art/runtime/asm_support.h" has some definitions for constant
|
|
values, structure sizes, and struct member offsets. The format is fairly
|
|
restricted, as simple macros are used to massage it for use with both C
|
|
(where it is verified) and assembly (where the definitions are used).
|
|
|
|
If a constant in the file becomes out of sync, the VM will log an error
|
|
message and abort during startup.
|
|
|
|
|
|
==== Development tips ====
|
|
|
|
If you need to debug the initial piece of an opcode handler, and your
|
|
debug code expands it beyond the handler size limit, you can insert a
|
|
generic header at the top:
|
|
|
|
b ${opcode}_start
|
|
%break
|
|
${opcode}_start:
|
|
|
|
If you already have a %break, it's okay to leave it in place -- the second
|
|
%break is ignored.
|
|
|
|
|
|
==== Rebuilding ====
|
|
|
|
If you change any of the source file fragments, you need to rebuild the
|
|
combined source files in the "out" directory. Make sure the files in
|
|
"out" are editable, then:
|
|
|
|
$ cd mterp
|
|
$ ./rebuild.sh
|
|
|
|
The ultimate goal is to have the build system generate the necessary
|
|
output files without requiring this separate step, but we're not yet
|
|
ready to require Python in the build.
|
|
|
|
==== Interpreter Control ====
|
|
|
|
The mterp fast interpreter achieves much of its performance advantage
|
|
over the C++ interpreter through its efficient mechanism of
|
|
transitioning from one Dalvik bytecode to the next. Mterp for ARM targets
|
|
uses a computed-goto mechanism, in which the handler entrypoints are
|
|
located at the base of the handler table + (opcode * 128).
|
|
|
|
In normal operation, the dedicated register rIBASE
|
|
(r8 for ARM, edx for x86) holds a mainHandlerTable. If we need to switch
|
|
to a mode that requires inter-instruction checking, rIBASE is changed
|
|
to altHandlerTable. Note that this change is not immediate. What is actually
|
|
changed is the value of curHandlerTable - which is part of the interpBreak
|
|
structure. Rather than explicitly check for changes, each thread will
|
|
blindly refresh rIBASE at backward branches, exception throws and returns.
|