Core Words Of Flight - Take 7 Charles Eric LaForest laforest AT eecg.utoronto.ca 2008/03/26 Added RTU instruction compilation 2006/12/11 Merged VM branch with LOOK alteration into the COMSIM kergen which contains defines to create single and dual-core kernels 2006/03/02 Altered LOOK to not pop the string on the input stack and to return the value of NAME-END if no match so we can test against that for error checking. This means changing some low-level words in the Flight extensions so as to keep things working as before since most of the time the input string doesn't need to be kept. 2006/02/25 Altered LOOK to use NAME-END as end of dictionary address, so it can be temporarily altered for retargeted compilation. 2006/01/05 Fixed PUSH-STRING to set the tail 2005/11/01: FOR FUTURE WORK PUSH-STRING should create the tail, instead of SCAN-STRING. Otherwise, every function afterwards that allocates memory using PUSH-STRING (which is what it really does) has to create the tail, which we already know has to contain its own address (unless it was converted into a name entry). 2005/10/18: Altered COMPILE-OPCODE to NEXT-SLOT at the beginning. Altered the DEFN sub-words. Mainly added the concept of a null slot to denote a HERE location in which nothing has been compiled yet. This is definitely not optimal, but it fixed a subtle bug. Other words should check for null slot condition and thus not ALIGN needlessly, saving the occasional skipped, wasted memory word. FOR FUTURE WORK: Noted that the whole slot shifting mechanism is wasteful: Instead, shift the contents of HERE, and add in the opcode in the last slot. The slot mask then devolves into a counter, which since it is less than 32, can be a simple one-hot count, eliminating the need for an addition. However, ALIGN will then have to check and shift the opcodes into the zeroth sloth if required. 2005/09/13: Factored DEFN into DEFN-AS, NEW-WORD, and DEFN, to allow for naming a chunk of code after it has been compiled, so one can overlay a word with another of the same name which refers to the original word internally. Tweaked COMPILE-OPCODE (moved a DROP to eliminate a >R...R> pair). Substituted '|' with '+'. (since PC@ is 0, this works, but maybe it's slower...) 2005/03/18: NOTE: It someone never occured to me that the sequence SCAN 5 SCAN NUMI can't work if each SCAN'ed in string overwrites the previous one (NUMI would go and convert itself!). Thus SCAN has to accumulate strings and LOOK must eat them. This does mean that a function can have multiple string arguments now, as a generalization of the Data Stack. NOTE: Important edit! Conditional jumps no longer always reload the ISR. If the branch is not taken, the next instruction executes as usual. The previous behaviour was intended to make all flow-control instructions behave the same, to avoid exceptions, but this new behaviour seems better. The code density improves, since 'JMP0 JMP' can now be consecutive in a memory word. However, make sure this does not slow things down or bloat the state machine. This shouldn't break this code, assuming the subsequent instructions within a memory word are NOPs, but the compilation words will have to be altered to take this into account. NOTE: Since SCAN, DEFN, and LOOK are not in a hierarchical relationship, passing the string address via the stack is wrong. It complicates the code. It should be stored in a named variable. This recapitulates scope blocks, interestingly. 0 Misc. These are the lowest level input buffer manipulation routines. The input buffer behaves like a simple stack of counted strings. The top of stack pointer is INPUT, the bottom pointer is THERE. 0.1 STRING-TAIL Returns the address of the tail of a string. Takes the head address of the string. The tail contains an address used for referencing code or terminating string comparisons. >A A@+ (get string length and skip it) A> + ; (string tail address) 0.2 PUSH-STRING Alters INPUT to allocate the space for a counted string. Sets the tail of the string to point to its own address. Takes the length of the string from TOS. DUP LIT 1 + NOT (account for string count and address tail) LIT [address of INPUT] >A A@ + DUP A! (adjust INPUT) >A A! (store string length count at (INPUT)) LIT [address of INPUT] >A A@ CALL STRING-TAIL DUP >A A! ; 0.3 POP-STRING Alters INPUT to discard the most recently SCAN'ed string. LIT [address of INPUT] >A A@ call STRING-TAIL LIT 1 + (point to head of next string) LIT [address of INPUT] >A A! ; Or for comparison: LIT [address of INPUT] DUP >R >A A@ DUP (get (INPUT)) >A A@ (get string length) + LIT 2 + (add length+2 to (INPUT) to point to next string) R> >A A! ; (update INPUT) 0.4 COMPARE-STRINGS Takes the head address of two strings. Returns the addresses (in the same order) of the first non-matching pair of symbols. Note that the practice of tailling each string with the address of the tail guarantees that the comparison will terminate (at the tail). >A >R LOOP: R@+ A@+ XOR JMP0 LOOP LIT -1 R> + LIT -1 A> + ; 1 SCAN 1.1 READ1 Presumed here is that READ1 always returns a machine word. Implementation depends on interface to outside world. Here I assume a memory port for illustration purposes. LIT [address of input port] >A A@ ; (read in one machine word) 1.2 SCAN-STRING Called to read a string into a pushed string entry. LIT [address of INPUT] >A A@ >R R@+ (get string length and prep) SCAN-STRING-LOOP: call READ1 R!+ (READ1 uses A) LIT -1 + DUP JMP0 DONE JMP SCAN-STRING-LOOP DONE: (storage address should be == (INPUT)+length+1 now) DROP R> DUP >A A! ; (store address of end of string in itself for LOOK termination) 1.4 SCAN The location after the string is kept free to hold (THERE) as a LOOK termination marker, or an actual code address should it become a name entry. Input is a counted string, where the first machine word contains the following number of machine words used by the string, regardless of symbol encoding. Always reads in the string. There is no check. The sender of the string can check if "INPUT HERE -" is greater than the string length to see if there is space. The check depends on the fact that HERE points to an address that begins at 0 and increments, while INPUT points to a location that begins at THERE and decrements. THERE begins at the end of memory and decrements towards zero. If they meet, then there is no more free memory. Thus if the start address is lesser than the one HERE points at, there is no room for the string (or for anything else at all!). call READ1 call PUSH-STRING jmp SCAN-STRING 2 DEFN 2.1 FIRST-SLOT Begin compiling at zeroth slot. LIT [address of SLOT] >A LIT [mask for slot 0] A! ; 2.2 LAST-SLOT Setup at fifth slot, so the next compilation will occur at the zeroth slot. LIT [address of SLOT] >A LIT [mask for slot 5] A! ; 2.3 NULL-SLOT Null slot mask. Denotes that HERE is empty. LIT [address of SLOT] >A LIT [mask for null slot] A! ; 2.4 ALIGN Makes HERE point to the next free location, so we don't point a name at the tail of the previous procedure or clobber literals. Make HERE-NEXT point to the following location. Mark HERE as empty with NULL-SLOT. LIT [address of HERE-NEXT] >A A@ DUP >R LIT 0 R!+ (zero (PC@) for compilation and increment) R> A! (update HERE-NEXT) LIT [address of HERE] >A A! (update HERE) JMP NULL-SLOT 2.5 NEXT-SLOT Point to first slot if HERE is empty (NULL-SLOT), else point to next free slot at HERE, else ALIGN. ( N.B.: Major hardware change: slot 0 is now on LSB side (so we don't have to worry about the MSB with '2/') ) LIT [address of SLOT] >A A@ (get slot mask) LIT [null slot mask] OVER XOR JMP0 HEREEMPTY LIT [fifth slot mask] OVER XOR JMP0 HEREFULL 2* 2* 2* 2* 2* (next 5-bit slot) A! ; HEREFULL: DROP CALL ALIGN JMP FIRST-SLOT HEREEMPTY: DROP JMP FIRST-SLOT 2.6 DEFN-AS Takes the aligned address of a code entry (usually (HERE) ) and the address of a SCANed string (from INPUT) and converts it to a name entry. Updates THERE. String must be the only one in the input buffer stack as it is converted in place, so INPUT must be pointing to it and its tail must be at THERE. Because of this, INPUT does not need to be changed. LIT [address of THERE] >A A@ >A A! (store address in (THERE)) LIT [address of INPUT] >A A@ (get string start address) LIT -1 + (move to first free location before it) LIT [address of THERE] >A A! ; 2.7 NEW-WORD Returns an aligned address to compile at. call ALIGN LIT [address of HERE] >A A@ (get (HERE)) ; 2.8 DEFN Defines a name entry for the current code entry location. CALL NEW-WORD JMP DEFN-AS 3 LOOK 3.2 LOOK Eats the topmost name string on the input buffer stack. Returns address of code associated with name or new (INPUT) if no match. No error checking other than for end of dictionary. LIT [address of THERE] >A A@ LIT 1 + DUP (address of latest name entry string) LOOP: LIT [address of INPUT] >A A@ (address of topmost SCANed string) call COMPARE-STRINGS LIT [address of INPUT] >A A@ call STRING-TAIL XOR JMPO MATCH DROP call STRING-TAIL (uses the DUPed name entry address) LIT 1 + (point to start of next name entry) DUP LIT [address of NAME-END] >A A@ XOR JMP0 NOMATCH (are we at end of name dict?) DUP JMP LOOP MATCH: >A DROP A@ ; (return code address for name entry) NOMATCH: DROP LIT [address of NAME-END] >A A@ ; (return address of end of dict) 4 Compilation words 4.1 COMPILE-INIT Sets up HERE/HERE-NEXT/SLOT for compilation LIT [address of HERE] >A LIT [initial compilation address] A!+ A> LIT [address of HERE-NEXT] >A A! LIT [address of SLOT] >A LIT [mask for slot 5] A! LIT [address of HERE] >A A@ >A LIT [PC@|PC@|PC@|PC@|PC@|PC@] A! ; 4.2 COMPILE-OPCODE Takes an opcode (in slot 0) and compiles it at the next empty HERE/SLOT location. Assumes the current slot is full. Leaves SLOT pointing to next full slot. This fails for PC@, since its opcode is all zeroes. (But then, it shouldn't be necessary to compile it explicitely, that's what ALIGN is for) Note the "+" instead of "|" at the end. We can do this since empty slots are zeroed (since PC@ is 0). call NEXT-SLOT >R LIT [address of SLOT] >A A@ ~ (get slot mask and invert) R> (place opcode on top) OVER OVER & JMP0 COMPILE (slot 0) 2* 2* 2* 2* 2* (shift opcode by one slot) OVER OVER & JMP0 COMPILE (slot 1) 2* 2* 2* 2* 2* OVER OVER & JMP0 COMPILE (slot 2) 2* 2* 2* 2* 2* OVER OVER & JMP0 COMPILE (slot 3) 2* 2* 2* 2* 2* OVER OVER & JMP0 COMPILE (slot 4) 2* 2* 2* 2* 2* COMPILE: (then we *have* to be in slot 5) LIT [address of HERE] >A A@ >A A@ + A! DROP ; (compile opcode, drop mask, return) 4.3 COMPILE-LITERAL Take a procedure address/literal (for CALL/JMP/JMP0/LIT). Compile it at HERE-NEXT. Increments HERE-NEXT. LIT [address of HERE-NEXT] >A A@ >A A!+ (store address/literal and increment) A> LIT [adress of HERE-NEXT] >A A! ; (update HERE-NEXT) 4.4 CMPC, (CALL) Takes the address of a procedure an compiles a call to it. Then aligns to the next free word since later slots in current word will never execute. LIT [opcode for CALL] CALL COMPILE-OPCODE CALL COMPILE-LITERAL JMP ALIGN 4.5 CRTU, (RTU) Compiles a Return To User instruction then aligns to the next free word since later slots in current word will never execute since ISR is not saved. Upon a trap, exection will resume in the following memory word. LIT [opcode for RTU] CALL COMPILE-OPCODE JMP ALIGN 4.6 CMPJ, (JMP) LIT [opcode for JMP] CALL COMPILE-OPCODE CALL COMPILE-LITERAL JMP ALIGN 4.7 CMP0, (JMP0) If JMP0 not taken, the next opcodes run instead, unlike calls and jumps LIT [opcode for JMP0] CALL COMPILE-OPCODE JMP COMPILE-LITERAL 4.8 CMP+, (JMP+) LIT [opcode for JMP+] CALL COMPILE-OPCODE JMP COMPILE-LITERAL 4.9 NUMC, (LIT) LIT [opcode for LIT] CALL COMPILE-OPCODE JMP COMPILE-LITERAL 5. EXEC/NXEC 5.1 EXECUTE If called: Take a procedure address and call to it. If inlined/jumped to: Take a procedure address and jump to it. (take note of tail-recursion elimination) >R ; 5.2 EXEC Read in a procedure address and call to it. call READ1 call EXECUTE jmp EXEC 5.3 NXEC Read in a string, look up the function address, call to it. call SCAN call LOOK call POP_STRING call EXECUTE jmp NXEC 6. Inline compilation Copying a word slot-by-slot requires decompiling it instruction by instruction and recompiling it at the new location. This requires reimplementing the compilation words 'in reverse'. Complicated and ugly. Simply ALIGNing and copying existing code is no good since the words that will be worthwhile to inline are the shortest, and so many slots will get proportionally wasted. The extreme case is when inlining machine instructions: you would end up with one opcode per word! So that's the key: we know in advance which words will end up inlined, and so we write a word which when run compiles the code in situ. It's a macro! 6.1 (DUP), (DROP), (+), (A!+), etc... LIT [opcode in slot 0] JMP COMPILE-OPCODE 7 NUMI This is where we can't avoid a predefined character encoding, even though it's disjoint from higher symbolic representation. Ideally, this would be some UNICODE code, but I'll settle for decimal ASCII numerals for now (0 -> 48,...9 -> 57), one per word. Wasteful, but simple. 7.1 10* Multiply an integer by 10. DUP 2* 2* 2* OVER + + ; 7.2 NUMI Takes a string address of an unsigned decimal number and returns the corresponding integer on the stack. No error or overflow checking! LIT [address of INPUT] >A A@ >R R@+ (read in length) LIT 0 >A DUP JMP0 DONE (if N == 0) LOOP: A> call 10* R@+ LIT -48 + + >A (shift total by radix, get number, convert to int, add) LIT -1 + DUP JMP0 DONE JMP LOOP DONE: DROP R> DROP A> JMP POP-STRING 8 Interesting finds 8.1 10^N Take a number, returns its power of 10. No range checking. (This formed the basis for NUMI, it's also a generic framework for run-time iteration. The substitution of SWAPs with 'A' instructions also seems significant.) LIT 1 >A DUP JMP0 DONE (if N==0) LOOP: A> call 10* >A LIT -1 + DUP JMP0 DONE JMP LOOP DONE: DROP A> ;