view docs/Interpreter Operation.txt @ 125:0607e4e20702
Correct offset error for keyword table lookup
author: William Astle <lost@l-w.ca>
date: Sun, 07 Jan 2024 20:35:51 -0700
parents: 1c1a0150fdda
This document is intended to describe the operation of the interpreter, including program text management, parsing, and execution.

In general, LWBasic preserves the line oriented nature of Color Basic but extends it somewhat to be more flexible and more efficient to interpret. The primary way it does this is to split the parsing and execution processes and to pre-parse numeric and string constants. By doing this, it removes a lot of complexity from the interpretation loop. Parsing is done when a line is entered into the program, meaning that syntax errors can be detected immediately instead of at run time. It also means the interpretation loop does not have to do slow processing like finding the end of a statement. This will be most noticeable for things like IF statements.

Parsing transforms the program into a byte code. This byte code will often end up being larger than the original program text. The byte code consists of a sequence of line structures, each of which begins with a pointer to the next line followed by a 16 bit binary line number. If the pointer is NULL, the end of the program has been reached. Each line consists of a sequence of "operations", each with zero or more operands. Each of these is described below.

Each section below starts with a header line containing the operation code, which may be more than 8 bits, a symbolic abbreviation for the operation, and a short English description. Following that is a longer description of the operation code and the encoding of its parameters.

It should be noted that various syntactic particles do not get encoded in the final result even though they are keywords. The list of those is: TAB, TO, SUB, THEN, ELSE, STEP, OFF, FN, USING, AS, ERR, ERROR, BRK, BREAK, RGB, CMP. Thus they will appear in the keyword tables but not in the encoded program.

Note also that some keywords serve as both commands and functions. In those cases, there will be separate operation codes.
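
As a rough illustration of the line structures described in the overview above, the following Python sketch walks the chain of lines in a byte buffer. The document does not state the pointer width, the byte order, or whether the NULL pointer belongs to the last line or stands alone; this sketch assumes 16 bit big-endian pointers, treats them as offsets into the buffer, and assumes a NULL pointer appears on its own as a terminator. The name `iter_lines` is hypothetical.

```python
import struct


def iter_lines(memory: bytes, start: int = 0):
    """Yield (line_number, body_offset) for each line structure in memory.

    Assumptions (not stated in the document): pointers are 16 bit,
    big-endian, and are offsets into `memory`; a NULL pointer stands alone
    as the end-of-program marker.
    """
    addr = start
    while True:
        # Read the pointer to the next line first; NULL ends the program.
        (next_ptr,) = struct.unpack_from(">H", memory, addr)
        if next_ptr == 0:
            return
        # The 16 bit binary line number follows the pointer.
        (line_no,) = struct.unpack_from(">H", memory, addr + 2)
        # The operations of the line start right after the 4 byte header.
        yield line_no, addr + 4
        addr = next_ptr


# Two lines (10 and 20), each holding only an EOL (00) operation,
# followed by the NULL terminator.
program = bytes([
    0x00, 0x05, 0x00, 0x0A, 0x00,   # next = 5,  line 10, EOL
    0x00, 0x0A, 0x00, 0x14, 0x00,   # next = 10, line 20, EOL
    0x00, 0x00,                     # NULL pointer: end of program
])
lines = list(iter_lines(program))
```

Because every line carries a pointer to its successor, a walker like this can find a given line number without decoding any of the operations in between.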

00 EOL End of Line

This operation signals the end of a program line. Interpretation will continue with the next program line, or end if no further lines exist.

01 CONST0 Zero constant

Exactly what it says on the tin. This evaluates to an integer constant zero. Because zero values are common, having a dedicated code for this is beneficial for overall byte code compactness.

02 CONST1 One constant

Exactly what it says on the tin. Because one is a very common constant, encoding it specifically in a single byte seems sensible as a means to keep the byte code size smaller.

03 INT8 8 bit signed integer constant

This is a signed 8 bit integer constant. Most constants in programs are small integers. By encoding these specially, we keep the byte code more compact. This saves three bytes over encoding integers at 32 bits.

04 INT16 16 bit signed integer constant

This encoding is present to avoid taking up 32 bits for the integer data when 16 bits will do. Again, this is intended to keep the byte code a bit more compact. This saves two bytes over encoding integers at 32 bits.

05 INT32 32 bit signed integer constant

Exactly what it says on the tin.

06 BCD48 BCD Floating Point

This is a 48 bit BCD floating point value where the first byte contains the sign bit and 7 bit exponent (stored with a bias of 64). The remaining five bytes contain the 10 BCD digits of the significand.

07 BCD16 BCD Floating Point (2 significant digits)
08 BCD24 BCD Floating Point (4 significant digits)
09 BCD32 BCD Floating Point (6 significant digits)

Because many numbers will only need a small number of significant digits, encodings for numbers needing only two, four, or six significant digits are provided. These are intended to keep the byte code more compact.

0A STRING String constant

This encodes a string constant whose length fits in an 8 bit unsigned byte. The first byte is the length, which may be zero, with the remaining bytes being the string data. The string data may contain any binary values.
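
The integer constant encodings above (CONST0, CONST1, INT8, INT16, INT32) suggest a parser that picks the shortest form that can hold a value. The following Python sketch shows one way that selection could look. The opcode values come from the section headers; the function name `encode_int_constant` and the big-endian operand byte order are assumptions, since the document does not specify either.

```python
import struct


def encode_int_constant(value: int) -> bytes:
    """Return the shortest integer constant encoding for `value`.

    Assumption: multi-byte operands are stored big-endian.
    """
    if value == 0:
        return b"\x01"                              # CONST0, no operand
    if value == 1:
        return b"\x02"                              # CONST1, no operand
    if -0x80 <= value <= 0x7F:
        return b"\x03" + struct.pack(">b", value)   # INT8, 1 operand byte
    if -0x8000 <= value <= 0x7FFF:
        return b"\x04" + struct.pack(">h", value)   # INT16, 2 operand bytes
    if -0x80000000 <= value <= 0x7FFFFFFF:
        return b"\x05" + struct.pack(">i", value)   # INT32, 4 operand bytes
    raise OverflowError("value does not fit a 32 bit signed integer")
```

For example, `encode_int_constant(300)` yields three bytes (opcode plus a 16 bit operand) where a fixed 32 bit encoding would need five, which matches the two byte saving quoted for INT16.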

0B LSTRING Long string constant

This is exactly like STRING above but uses a 16 bit length field for encoding very long strings. This will not normally occur in programs but is included in case it is required.

1D VARS Scalar variable reference

This is a reference to a scalar variable. It is followed by a byte holding the variable type (integer, floating point, or string) in the upper 3 bits and the name length in the lower 5 bits, followed by the variable name *without* a type sigil. Note that this encoding is also used in the DIM command. Note that type 0 indicates an unspecified type (no sigil) which will be looked up at runtime and defaults to floating point.

1E VARA Array variable reference

This is exactly like VARS except that the variable name string is followed by a sequence of expressions specifying the subscript values. The sequence of expressions begins with a count (8 bits) followed by the expressions. The expression count is required to allow skipping over the subscript references without having to know how many dimensions an array has. Further, it is not possible to know how many dimensions are required at parse time. Note that this encoding is also used in the DIM command. Note that type 0 indicates an unspecified type (no sigil) which will be looked up at runtime and defaults to floating point.

1F EXPR Expression

This indicates an expression to be evaluated. It is followed by a sequence of terms and operators to be evaluated. The expression is stored in postfix order and will be evaluated using an expression evaluation stack. Each operation will fetch zero or more operands from the evaluation stack, do its calculation, and then push its result back onto the evaluation stack. When an "end of expression" operator is encountered, the result is popped from the stack and left in the result destination. Note that an end of expression operator is required because unary operators exist.
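
The stack-based evaluation just described can be sketched in a few lines of Python. This is a minimal illustration, not the interpreter's actual code: it handles only CONST0, CONST1, INT8, NEG, ADD, SUB, MUL, and EOE, using the opcode values from this document's tables. The name `eval_expr`, the choice to start decoding immediately after the EXPR byte, and the convention that the left operand is pushed before the right one are all assumptions.

```python
import operator

# Binary operators handled by this sketch, keyed by opcode.
BINARY = {
    0x22: operator.add,   # ADD
    0x23: operator.sub,   # SUB
    0x24: operator.mul,   # MUL
}


def eval_expr(code: bytes, pos: int = 0):
    """Evaluate a postfix expression starting at `pos`.

    Returns (result, position just past the EOE byte).
    """
    stack = []
    while True:
        op = code[pos]
        pos += 1
        if op == 0x20:                      # EOE: pop and return the result
            return stack.pop(), pos
        if op == 0x01:                      # CONST0
            stack.append(0)
        elif op == 0x02:                    # CONST1
            stack.append(1)
        elif op == 0x03:                    # INT8: one signed operand byte
            stack.append(int.from_bytes(code[pos:pos + 1], "big", signed=True))
            pos += 1
        elif op == 0x21:                    # NEG: unary, one operand
            stack.append(-stack.pop())
        elif op in BINARY:
            rhs = stack.pop()               # assumed: right operand is on top
            lhs = stack.pop()
            stack.append(BINARY[op](lhs, rhs))
        else:
            raise ValueError(f"opcode {op:02X} not handled in this sketch")


# (3 - 1) * -5 in postfix: 3 1 SUB 5 NEG MUL EOE
expr = bytes([0x03, 0x03, 0x02, 0x23, 0x03, 0x05, 0x21, 0x24, 0x20])
result, end = eval_expr(expr)
```

No precedence or parenthesis handling appears anywhere in the loop; the order of the bytes alone determines the order of evaluation.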

Note that an expression will be converted back to infix notation when listed, using parentheses only as required to account for operator precedence. This means that an expression entered with parentheses may be listed back out without parentheses. Postfix notation is used to store expressions because it avoids having to deal with operator precedence at run time.

20 EOE End of expression operator

This signifies the end of an expression and triggers the expression evaluator to return its result.

21 NEG Negation
22 ADD Addition
23 SUB Subtraction
24 MUL Multiplication
25 DIV Division
26 MOD Modulus
27 NOT Boolean not
28 AND Boolean and
29 OR Boolean or
2A XOR Boolean exclusive or
2B COM Bitwise complement
2C LAND Bitwise and
2D LOR Bitwise or
2E LXOR Bitwise exclusive or
2F CONCAT String concatenation
30 EQ Equality comparison
31 NE Inequality comparison
32 GT Greater than comparison
33 LT Less than comparison
34 GE Greater than or equal comparison
35 LE Less than or equal comparison
36 EXP Exponentiation

These are the basic arithmetic, boolean, and logical operators.
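
The listing behaviour described for EXPR above (postfix converted back to infix, with parentheses only where precedence requires them) can be sketched as follows. The document does not give LWBasic's precedence table, so this Python sketch assumes the conventional ordering (NEG binds tighter than MUL and DIV, which bind tighter than ADD and SUB) and treats all binary operators as left associative. It works on symbolic tokens rather than bytes, and `postfix_to_infix` is a hypothetical name.

```python
# Assumed precedence levels; larger numbers bind more tightly.
PREC = {"ADD": 1, "SUB": 1, "MUL": 2, "DIV": 2}
SYMBOL = {"ADD": "+", "SUB": "-", "MUL": "*", "DIV": "/"}
NEG_PREC = 3
ATOM_PREC = 4


def postfix_to_infix(tokens):
    """Rebuild infix text from postfix tokens with minimal parentheses."""
    stack = []  # entries are (text, precedence of the outermost operator)
    for tok in tokens:
        if tok == "EOE":
            break
        if tok == "NEG":
            text, prec = stack.pop()
            if prec < NEG_PREC:
                text = f"({text})"
            stack.append(("-" + text, NEG_PREC))
        elif tok in PREC:
            rtext, rprec = stack.pop()
            ltext, lprec = stack.pop()
            prec = PREC[tok]
            # Left operand needs parentheses only if it binds more loosely.
            if lprec < prec:
                ltext = f"({ltext})"
            # Right operand also needs them at equal precedence, so that
            # A - (B - C) is not listed as A - B - C.
            if rprec <= prec:
                rtext = f"({rtext})"
            stack.append((f"{ltext} {SYMBOL[tok]} {rtext}", prec))
        else:
            stack.append((str(tok), ATOM_PREC))  # variable or constant
    return stack.pop()[0]
```

Under these assumptions, an expression entered as `(A * B) + C` is stored as `A B MUL C ADD EOE` and lists back as `A * B + C`, while `A B ADD C MUL EOE` keeps its parentheses and lists as `(A + B) * C`.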

40...7F: built in functions

40 SGN
41 INT
42 ABS
43 USR
44 RND
45 SIN
46 PEEK
47 LEN
48 STR$
49 VAL
4A ASC
4B CHR$
4C EOF
4D JOYSTK
4E LEFT$
4F RIGHT$
50 MID$
51 POINT
52 INKEY$
53 MEM
54 ATN
55 COS
56 TAN
57 EXP
58 FIX
59 LOG
5A POS
5B SQR
5C HEX$
5D VARPTR
5E INSTR
5F TIMER
60 PPOINT
61 STRING$
62 CVN
63 FREE
64 LOC
65 LOF
66 MKN$
67 LPEEK
68 BUTTON
69 ERNO/ERRNO
6A ERLIN/ERRLINE
6B ATTR

80...DF: commands

80 FOR
81 GOTO
82 GOSUB
83 REM
84 ' (Separate to REM because of different semantics)
85 IF
86 DATA
87 PRINT
88 ON
89 INPUT
8A END
8B NEXT
8C DIM
8D READ
8E RUN
8F RESTORE
90 RETURN
91 POP
92 STOP
93 POKE
94 CONT
95 LIST
96 CLEAR
97 NEW
98 OPEN
99 CLOSE
9A LLIST
9B SET
9C RESET
9D CLS
9E MOTOR
9F SOUND
A0 EXEC
A1 DEL
A2 EDIT
A3 TRON
A4 TROFF
A5 DEF
A6 LET
A7 LINE
A8 PCLS
A9 PSET
AA PRESET
AB SCREEN
AC PCLEAR
AD COLOR
AE CIRCLE
AF PAINT
B0 GET
B1 PUT
B2 DRAW
B3 PCOPY
B4 PMODE
B5 PLAY
B6 RENUM
B7 DIR
B8 DRIVE
B9 FIELD
BA FILES
BB KILL
BC LOAD
BD LSET
BE MERGE
BF RENAME
C0 RSET
C1 SAVE
C2 WRITE
C3 VERIFY
C4 UNLOAD
C5 DSKINI
C6 BACKUP
C7 COPY
C8 DSKI$
C9 DSKO$
CA DOS
CB WIDTH
CC PALETTE
CD LPOKE
CE LOCATE
CF ATTR
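
The opcode ranges used throughout this document lend themselves to a simple range check when decoding or listing a line. The Python sketch below groups single-byte operation codes by the ranges given in the tables above. The function name `classify_opcode` and the category labels are illustrative only, and the sketch does not cover operation codes wider than 8 bits, which the document allows for but does not define here.

```python
def classify_opcode(op: int) -> str:
    """Return a rough category for a single-byte operation code."""
    if op == 0x00:
        return "end of line"          # EOL
    if 0x01 <= op <= 0x0B:
        return "constant"             # CONST0 through LSTRING
    if op in (0x1D, 0x1E):
        return "variable reference"   # VARS, VARA
    if op == 0x1F:
        return "expression start"     # EXPR
    if op == 0x20:
        return "end of expression"    # EOE
    if 0x21 <= op <= 0x36:
        return "operator"             # NEG through EXP
    if 0x40 <= op <= 0x7F:
        return "function"             # built in functions
    if 0x80 <= op <= 0xDF:
        return "command"              # commands
    return "unassigned"               # not listed in this document
```

Keeping functions and commands in separate ranges is what lets a keyword such as ATTR have one code as a function (6B) and another as a command (CF).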