TIL
----
TIL stands for Threaded Interpretive Language.
The TILs uses RPN. TILs use blank/space character as delimiter. Parameters
are passed to methods by pushing values onto a Data Stack(DS). Here is an example expression
that adds two numbers "2" and "3" and puts the result on the stack, in the language we are
talking about:
2 3 +
A TIL implementation uses the register provided by the hardware for certain purposes. TIL uses
a register for holding the address of current word being executed (known as W) and a register for
holding the address of the next word to execute (known as IP- Instruction Pointer). TIL uses a
register for acting as Data Stack pointer and Return Stack pointer.
Forth, work of Chuck Moore, is an example of TIL.
Outer Interpreter
------------------
The system provides REPL, where one can type statements and the system executes the statement
by interpreting it. This REPL is termed "Outer Interpreter". It tokenizes the input stream
(using blank/space character) and each token is looked up in the systems Dictionary. If an
entry is present, the corresponding code is executed. If an entry is not found, the token
is checked for being a number. If number conversion test passes, then the token is pushed
onto the DS. If the number converstion test fails, then the "Outer Interpreter" reports error
(Unknown Word) and stops processing. If no error has occurred, the "Outer Interpreter"
keeps repeating the above steps until the stream is exhausted.
Primary Words
-------------
Core words, also termed as Primary Words, are implemented using the assembly instructions of the
target hardware. A batch of assembly instructions that does meaningful work (add two numbers, fetch
value at specified address, store value at specified address...) are named for later accessing.
Secondary Words
---------------
Secondary words are implemented using existing primaries or other secondaries. Secondary word
implementation contains just a sequence of addresses. Suppose one wants to write some code that
doubles a given number. The corresponding TIL code looks like this:
2 DUP + -- (answer will be 4)
3 DUP + -- (answer will be 6)
This could be reused, let this be written as a secondary. The secondary word DBL can be defined as
DBL => DUP +
The memory map of the secondary word DBL will look like:
83 00 : <address of NEST+2>
83 02 : <address of DUP>
83 04 : <address of +>
83 06 : <address of UNNEST>
Words NEST and UNNEST will be explained later.
Dictionary
----------
A TIL's dictionary is a generally a linked list consisting of primary and Secondary Words. Each word
is represented by an entry, that has some meta information (name by which the word is known, length
of the word name, some flags) and is followed by the code corresponding to the word.
Follows the dictionary entry for word named "+" - it expects two values on the stack. It adds them
up and puts the value on the top of the stack. The memory map shown below starts at address location "81 00".
The first two bytes contains the address of the previous entry (this is the link in the linked list).
The next byte "2B" is the ASCII value of "+" and the next byte is the length of the word name: 1.
Next two bytes are flags. The next two bytes point to the executable code for this word.
81 00 : <address of previous entry> 2B 01 00 00
81 06 : 81 08
81 08 : MOV @RDSP+, R0
: ADD R0, RTOPDS
: JMP NEXT
Follows the dictionary entry for word named "DUP" - it expects a value on the stack. It duplicates
the entry on top of the data stack. The memory map shown below starts at address location "81 20"
(some arbitrary value used for illustration). The first two bytes contains address of the previous
dictionary entry (in this case "+" is used as previous entry). The next three bytes "44 55 50" is
ASCII value for "DUP" and the next byte is the length of the word name: 3. Next two bytes are flags.
The next two bytes point to the executable code for this word.
81 20 : 81 00 44 55 50 03 00 00
81 28 : 81 2A
81 2A : MOV R6, @-R7
: JMP NEXT
Follows the dictionary entry for the word named "DBL". It doubles the number on top of stack.
The memory map shown below starts at address location "81 40" (some arbitrary value used for illustration).
The first two bytes contains address of previous dictionary entry (in this case "DUP"). The next three
bytes "44 42 4C" is the ASCII value for "DBL" and the next byte is the lengh of the word name: 3. Next
two bytes are flags. The next two bytes point to executable code location of some word named NEST.
The next two bytes point to word DUP. The next two bytes point to word +. The next two bytes point to
some word named UNNEST.
81 40 : 81 20 44 42 4C 03 00 00
81 48 : <address of NEST+2>
81 4A : 81 28
81 4C : 81 06
81 4E : <address of UNNEST>
Inner Interpreter
-----------------
One might have noticied that dictionary entries above has code for each word ending with a JUMP
instruction to a label NEXT. NEXT is a primary word, also known as the Inner Interpreter.
To understand what NEXT does, consider that user has typed "2 DBL". The outer interpreter realizes
"2" is a number and puts it on top of the stack. Outer Interpreter finds the word "DBL" in dictionary.
The execution starts by loading executable code onto the Program Counter, it is done by loading
data at address pointed to from location 81 48 - data at 81 48 points to (address of NEST)+2, hence code
present at (address of NEST)+2 will be loaded onto Program Counter. The word NEST does few things (saves
the IP) The last instruction of NEST is "JMP NEXT". When the processor is executing code related to NEST,
TIL's W register (current word Register) will be pointing to 81 48 (address of the current word DBL). At the
end of the execution of NEST (before jumping to NEXT), IP will be made to point to 81 4A. Code related
to NEXT loads value (81 28) at address pointed (81 4A) to by IP onto the W register, it increments IP (so that
IP points to 81 4c- next word to execute) and then loads value data (executable code of DUP at 81 2A) onto
program counter from address pointed to by W (81 28).
Thus NEXT does the job of ticking the Program Counter with the executable code of the next word
to be executed. Hence NEXT is called the inner interpreter. Follows the code related to NEXT, NEST, UNNEST
NEXT:
MOV @IP+, W
MOV @W+, PC
NEST:
MOV IP, @-RP
MOV W, IP
JMP NEXT
UNNEST:
MOV @RP+, IP
JMP NEXT
IP - Instruction Pointer
W - Current Word Address Pointer
RP - Return Stack Pointer
Essentially what NEXT, NEST, UNNEST accomplishes is this: these words make sure that executable code
is properly loaded onto the Program Counter. When a TIL is executing a secondary (which is just a sequence
of addresses), NEXT threads the executable code of each word so that the executable code of next word in the
list (of addresses of the secondary) is properly loaded onto Program Counter. Hence the name 'Threaded Code'.
Self-Generating
---------------
Once a set of core words are implemented (as primary), then secondaries can be created using the
existing vocabulary. When the user types in TIL code, there might be some useful abstractions. A TIL provides
what is known as COLON word (a secondary) which allows for creating new secondaries. So the word DBL can be
created by user using COLON word as follows:
: MYDBL DUP + ;
When the interpreter executes the above code, the ":" creates a dictionary entry with name being "MYDBL" and ";" compiles
in the address of words DUP and + (with proper threading using NEST and UNNEST)
88 00 : <address of previous entry> 4D 59 44 42 4C 05 00 00
88 0A : <address of NEST+2>
88 0C : <address of DUP>
88 0E : <address of +>
88 10 : <address of UNNEST>
Thus a TIL allows the user to extend the core language allowing the user to compile in new words representing new
abstractions.
References
----------
Threaded Interpreive Languages (book) http://portal.acm.org/citation.cfm?id=542825
http://www.zetetics.com/bj/papers/moving1.htm
http://www.complang.tuwien.ac.at/forth/threaded-code.html