Mercurial > hg > index.cgi
view src/fps.s @ 92:6ac267da2216
Correctly calculate existing significand digits for fp to ascii conversion
You need to start at the end of the digits and work left to find the first
nonzero digit if you want the number of real digits in the significand.
Going the other way looking for zeroes causes it to bail out too soon and
lop off digits at the first zero encountered within the number, even if it
isn't a trailing zero.
author | William Astle <lost@l-w.ca> |
---|---|
date | Tue, 17 Oct 2023 17:12:36 -0600 |
parents | ecca1fcfc34b |
children | 69af7224f614 |
line wrap: on
line source
*pragmapush list *pragma list ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Single precision floating point arithmetic package ; ; The single precision floating point values are stored as follows (unpacked): ; ; Byte Length What ; 0 1 Exponent with a +63 bias, 0 = number is zero ; 1 5 BCD significand (10 digits) ; 6 1 Sign (00 = positive, FF = negative) ; ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Unpack single precision BCD floating point at (X) to fpa1 fps_unpack1 ldb ,x ; get sign and exponent bne fps_unpack1b ; brif not zero ldd zero ; zero out the entire value to represent 0 std fpa1 std fpa1+2 std fpa1+4 sta fpa1+6 rts fps_unpack1b sex ; get sign flag to store sta fpa1+fpa.sign ; set accumulator andb #0x7f ; lose sign from exponent stb fpa1+fpa.exp ; set exponent ldd 1,x ; copy signficand over std fpa1+fpa.sig ldd 3,x std fpa1+fpa.sig+2 lda 5,x sta fpa1+fpa.sig+4 rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Unpack single precision BCD floating point at (X) to fpa0 fps_unpack0 ldb ,x ; get sign and exponent bne fps_unpack0b ; brif not zero ldd zero ; zero out the entire value to represent 0 std fpa0 std fpa0+2 std fpa0+4 sta fpa0+6 rts fps_unpack0b sex ; get sign flag to store sta fpa0+fpa.sign ; set accumulator andb #0x7f ; lose sign from exponent stb fpa0+fpa.exp ; set exponent ldd 1,x ; copy signficand over std fpa0+fpa.sig ldd 3,x std fpa0+fpa.sig+2 lda 5,x sta fpa0+fpa.sig+4 rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Pack single precision BCD floating point in fpa1 to (X) fps_pack1 lda fpa1+fpa.sign ; get sign bits anda #0x80 ; only keep high bit ora fpa1+fpa.exp ; merge with exponent sta ,x ; put in destination ldd fpa1+fpa.sig ; copy significand over std 1,x ldd fpa1+fpa.sig+2 std 3,x lda fpa1+fpa.sig+4 sta 5,x rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Pack single precision BCD floating point in fpa0 to (X) fps_pack0 lda fpa0+fpa.sign ; get sign bits anda #0x80 ; only keep high bit ora fpa0+fpa.exp ; merge with exponent sta ,x ; put in destination ldd fpa0+fpa.sig ; copy significand over std 1,x ldd fpa0+fpa.sig+2 std 3,x lda fpa0+fpa.sig+4 sta 5,x rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Copy binary arguments fps_copyargs clr fpa0+fpa.extra ; clear extra precision bits clr fpa1+fpa.extra bsr fps_unpack0 ; unpack first argument to fpa0 leax ,u ; point to second argument bra fps_unpack1 ; now unpack it to fpa1 and return ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Single precision BCD floating point subtraction ; (X) - (U) -> (Y) fps_sub bsr fps_copyargs ; copy input arguments ldb fpa1+fpa.exp ; subtracting zero? beq fps_add0 ; brif so - do nothing com fpa1+fpa.sign ; invert sign bra fps_add1 ; go handle addition of negative ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Single precision BCD floating point addition ; (X) + (U) -> (Y) ; ; Note: must denormalize the *shorter* value. However, we may end up with values the same length so we may not be able ; to easily identify which one is larger. ; ; Note that the magnitude can get substantially smaller after addition of a negative so renormalization is necessary ; after the operation. As a result, some additional precision is maintained during the addition operation to allow for ; correct rounding. ; ; Rounding will be according to the standard rules: 0...4 go toward zero, 5...9 go away from zero. fps_add bsr fps_copyargs ; fetch arguments ldb fpa1+fpa.exp ; is second argument zero? bne fps_add1 ; brif not fps_add0 leax ,y ; if second argument is zero, do nothing and return first operand bra fps_pack0 fps_add1 ldb fpa0+fpa.exp ; is first argument zero? bne fps_add3 ; brif not fps_add2 leax ,y ; point to output argument bra fps_pack1 ; pack second argument to result and return fps_add3 ldd zero ;* initialize extra precision bytes following significand which we need std fpa0+fpa.sig+5 ;* to handle denormalization of the smaller operand std fpa0+fpa.sig+7 sta fpa0+fpa.sig+9 std fpa1+fpa.sig+5 std fpa1+fpa.sig+7 sta fpa1+fpa.sig+9 subb fpa0+fpa.exp ; calculate exponent difference lbeq fps_add8 ; brif same exponent - no need to denormalize bcs fps_add4 ; brif second operand is bigger ldx #fpa1 ; point to second operand to denormalize cmpb #10 ; are we going to shift more than the precision? bhs fps_add0 ; brif so - return first operand bra fps_add5 ; go shift operand right by B places fps_add4 negb ; get positive number of shifts ldx #fpa0 ; shift left operand right B places lda fpa1+fpa.exp ; set exponent of result to larger of two sta fpa0+fpa.exp cmpb #10 ; shifting more than precision? bhs fps_add2 ; brif so - return second operand fps_add5 subb #2 ; do we need at least two places? bcs fps_add6 ; brif not lda fpa.sig+8,x ; shift right into extra digits sta fpa.sig+9,x lda fpa.sig+7,x sta fpa.sig+8,x lda fpa.sig+6,x sta fpa.sig+7,x lda fpa.sig+5,x sta fpa.sig+6,x lda fpa.sig+4,x sta fpa.sig+5,x lda fpa.sig+3,x sta fpa.sig+4,x lda fpa.sig+2,x sta fpa.sig+3,x lda fpa.sig+1,x sta fpa.sig+2,x lda fpa.sig,x sta fpa.sig+1,x clr fpa.sig,x bra fps_add5 ; go see if we have more to shift fps_add6 incb ; do we still have a digit to shift? bne fps_add8 ; brif not lsr fpa.sig,x ; shift a digit right ror fpa.sig+1,x ror fpa.sig+2,x ror fpa.sig+3,x ror fpa.sig+4,x ror fpa.sig+5,x ror fpa.sig+6,x ror fpa.sig+7,x ror fpa.sig+8,x ror fpa.sig+9,x lsr fpa.sig,x ror fpa.sig+1,x ror fpa.sig+2,x ror fpa.sig+3,x ror fpa.sig+4,x ror fpa.sig+5,x ror fpa.sig+6,x ror fpa.sig+7,x ror fpa.sig+8,x ror fpa.sig+9,x lsr fpa.sig,x ror fpa.sig+1,x ror fpa.sig+2,x ror fpa.sig+3,x ror fpa.sig+4,x ror fpa.sig+5,x ror fpa.sig+6,x ror fpa.sig+7,x ror fpa.sig+8,x ror fpa.sig+9,x lsr fpa.sig,x ror fpa.sig+1,x ror fpa.sig+2,x ror fpa.sig+3,x ror fpa.sig+4,x ror fpa.sig+5,x ror fpa.sig+6,x ror fpa.sig+7,x ror fpa.sig+8,x ror fpa.sig+9,x fps_add8 clra ; clear carry so regular addition works below lda fpa0+fpa.sign ; do signs differ? eora fpa1+fpa.sign sta fpa1+fpa.exp ; non-zero if signs differ; we don't need fpa1 exponent any more beq fps_add11 ; brif not - just add ldx #fpa1 ; default to second argument being smaller lda fpa0+fpa.sig ; compare high digits of significand cmpa fpa1+fpa.sig bne fps_add9 ; brif top digits differ lda fpa0+fpa.sig+1 ; next digits? cmpa fpa1+fpa.sig+1 bne fps_add9 ; brif digits differ; pattern continues lda fpa0+fpa.sig+2 cmpa fpa1+fpa.sig+2 bne fps_add9 lda fpa0+fpa.sig+3 cmpa fpa1+fpa.sig+3 bne fps_add9 lda fpa0+fpa.sig+4 cmpa fpa1+fpa.sig+4 bne fps_add9 lda fpa0+fpa.sig+5 cmpa fpa1+fpa.sig+5 ; don't have to check other extras; only one set will be set fps_add9 bhs fps_add10 ; brif first operand is bigger ldx #fpa0 ; point to first operand as smaller lda fpa1+fpa.sign ; set result sign to that of larger operand sta fpa0+fpa.sign fps_add10 ldd #0x9999 ;* calculate 9's complement of smaller operand (X must already point to it) subd fpa.sig,x ;* we'll complete the 10's complement by setting C on the way into the std fpa.sig,x ;* addition sequence ldd #0x9999 subd fpa.sig+2,x std fpa.sig+2,x ldd #0x9999 subd fpa.sig+4,x std fpa.sig+4,x ldd #0x9999 subd fpa.sig+6,x std fpa.sig+6,x ldd #0x9999 subd fpa.sig+8,x std fpa.sig+8,x coma ; set carry going into add fps_add11 lda fpa0+fpa.sig+9 ; do the addition (10 bytes) adca fpa1+fpa.sig+9 daa sta fpa0+fpa.sig+9 lda fpa0+fpa.sig+8 adca fpa1+fpa.sig+8 daa sta fpa0+fpa.sig+8 lda fpa0+fpa.sig+7 adca fpa1+fpa.sig+7 daa sta fpa0+fpa.sig+7 lda fpa0+fpa.sig+6 adca fpa1+fpa.sig+6 daa sta fpa0+fpa.sig+6 lda fpa0+fpa.sig+5 adca fpa1+fpa.sig+5 daa sta fpa0+fpa.sig+5 lda fpa0+fpa.sig+4 adca fpa1+fpa.sig+4 daa sta fpa0+fpa.sig+4 lda fpa0+fpa.sig+3 adca fpa1+fpa.sig+3 daa sta fpa0+fpa.sig+3 lda fpa0+fpa.sig+2 adca fpa1+fpa.sig+2 daa sta fpa0+fpa.sig+2 lda fpa0+fpa.sig+1 adca fpa1+fpa.sig+1 daa sta fpa0+fpa.sig+1 lda fpa0+fpa.sig adca fpa1+fpa.sig daa sta fpa0+fpa.sig ror fpa1+fpa.exp ;* do sign flag and carry differ? will set V if so; we will never have rol fpa1+fpa.exp ;* a real carry on the subtract case but might on the add case bvc fps_normalize ; brif we didn't overflow significand - we can normalize the result inc fpa0+fpa.exp ; bump exponent to account for overflow lbmi OVERROR ; brif it overflowed ror fpa0+fpa.sig ; do a bit dance ror fpa0+fpa.sig+1 ror fpa0+fpa.sig+2 ror fpa0+fpa.sig+3 ror fpa0+fpa.sig+4 ror fpa0+fpa.sig+5 ror fpa0+fpa.sig+6 ror fpa0+fpa.sig+7 ror fpa0+fpa.sig+8 ror fpa0+fpa.sig+9 ror fpa0+fpa.sig ror fpa0+fpa.sig+1 ror fpa0+fpa.sig+2 ror fpa0+fpa.sig+3 ror fpa0+fpa.sig+4 ror fpa0+fpa.sig+5 ror fpa0+fpa.sig+6 ror fpa0+fpa.sig+7 ror fpa0+fpa.sig+8 ror fpa0+fpa.sig+9 ror fpa0+fpa.sig ror fpa0+fpa.sig+1 ror fpa0+fpa.sig+2 ror fpa0+fpa.sig+3 ror fpa0+fpa.sig+4 ror fpa0+fpa.sig+5 ror fpa0+fpa.sig+6 ror fpa0+fpa.sig+7 ror fpa0+fpa.sig+8 ror fpa0+fpa.sig+9 ror fpa0+fpa.sig ror fpa0+fpa.sig+1 ror fpa0+fpa.sig+2 ror fpa0+fpa.sig+3 ror fpa0+fpa.sig+4 ror fpa0+fpa.sig+5 ror fpa0+fpa.sig+6 ror fpa0+fpa.sig+7 ror fpa0+fpa.sig+8 ror fpa0+fpa.sig+9 ; fall through to normalization routine ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Normalize a floating point value in fpa0 with extra precision digits at fpaextra (up to 5 bytes) and return the ; packed result at (Y) ; ; The first step is to shift the significand left until a nonzero digit is in the leftmost position. This will bring ; in extra precision digits from fpaextra through fpaextra4 until either 10 shifts are done or the decimal exponent ; underflows. In both of those cases, the result will be zero. ; ; Once the leftmost digit is nonzero, the leftmost extra precision digit is checked to see if it is >= 5. If so, the ; significand will have 1 added to it. If that triggers a carry, the decimal exponent will be incremented and the ; significand will have its leftmost digit set to 1. ; ; This will trigger an overflow if the decimal exponent exceeds the allowed range. fps_normalize clrb ; initialize the exponent adjustment counter fps_normalize0 lda fpa0+fpa.sig ; do we have a nonzero digit in the first pair? bne fps_normalize1 ; brif so lda fpa0+fpa.sig+1 ; shift everything left 2 spaces sta fpa0+fpa.sig lda fpa0+fpa.sig+2 sta fpa0+fpa.sig+1 lda fpa0+fpa.sig+3 sta fpa0+fpa.sig+2 lda fpa0+fpa.sig+4 sta fpa0+fpa.sig+3 lda fpaextra sta fpa0+fpa.sig+4 lda fpaextra+1 sta fpaextra lda fpaextra+2 sta fpaextra+1 lda fpaextra+3 sta fpaextra+2 lda fpaextra+4 sta fpaextra+3 clr fpaextra+4 subb #2 ; account for two digit positions cmpb #-10 ; have we shifted the whole set of digits (assumes originally normalized) bgt fps_normalize0 ; brif not fps_normalize4 clr fpa0+fpa.exp ; set result to zero clr fpa0+fpa.sign fps_normalize5 leax ,y ; point to return location jmp fps_pack0 ; return result fps_normalize1 bita #0xf0 ; is the high digit zero? bne fps_normalize3 ; brif not lsl fpaextra ; only need to shift one extra position here since there won't be more shifts rol fpa0+fpa.sig+4 rol fpa0+fpa.sig+3 rol fpa0+fpa.sig+2 rol fpa0+fpa.sig+1 rol fpa0+fpa.sig lsl fpaextra rol fpa0+fpa.sig+4 rol fpa0+fpa.sig+3 rol fpa0+fpa.sig+2 rol fpa0+fpa.sig+1 rol fpa0+fpa.sig lsl fpaextra rol fpa0+fpa.sig+4 rol fpa0+fpa.sig+3 rol fpa0+fpa.sig+2 rol fpa0+fpa.sig+1 rol fpa0+fpa.sig lsl fpaextra rol fpa0+fpa.sig+4 rol fpa0+fpa.sig+3 rol fpa0+fpa.sig+2 rol fpa0+fpa.sig+1 rol fpa0+fpa.sig decb ; account for digit shift fps_normalize3 addb fpa0+fpa.exp ; adjust exponent stb fpa0+fpa.exp ble fps_normalize4 ; brif we underflowed to zero ldb fpaextra ; get extra precision digit andb #0xf0 ; keep only the highest extra precision digit cmpb #0x50 ; do we need to round? blo fps_normalize5 ; brif not lda fpa0+fpa.sig+4 ; bump low digits adda #1 daa sta fpa0+fpa.sig+4 bcc fps_normalize5 ; brif no carry - done lda fpa0+fpa.sig+3 ; keep going until all significand bytes handled adda #1 daa sta fpa0+fpa.sig+3 bcc fps_normalize5 lda fpa0+fpa.sig+2 adda #1 daa sta fpa0+fpa.sig+2 bcc fps_normalize5 lda fpa0+fpa.sig+1 adda #1 daa sta fpa0+fpa.sig+1 bcc fps_normalize5 lda fpa0+fpa.sig adda #1 daa sta fpa0+fpa.sig bcc fps_normalize5 lda #0x10 ; overflowed the significand - shift a 1 digit in sta fpa0+fpa.sig inc fpa0+fpa.exp ; and bump exponent to reflect that bpl fps_normalize5 ; brif we didn't overflow - return result OVERROR ldb #err_ov ; raise overflow jmp ERROR ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Single precision BCD multiply (X) by (U) ; ; Calculate: (X) * (U) -> (Y) ; ; First, the routine calculates the new exponent and sign. The exponents simply add together while the sign is the ; exclusive OR of the two input signs. That gives negative for differing signs and positive for matching signs. ; Then the multiplication of the significands works by initializing an accumulator large enough to hold twice as ; many digits as each significand to zeroes. It also needs an extra byte to the left to handle overflow for each ; digit operation. It then works through each digit of the multiplier and uses that as a count to add the ; multiplicand to the *high digits* of the accumulator. Then, before moving to the next digit, the accumulator and ; mulltiplier are shifted right and the process is repeated. Once the process is done, the result required for the ; floating point calculation will be the high 10 digits of the accumulator with extra digits stored in the ; remaining bytes. ; ; Once the multiplication is complete, the significand is normalized. See above. fps_mul jsr fps_copyargs ; fetch arguments lda fpa0+fpa.exp ; first argumennt zero? bne fps_mul0 ; brif not leax ,y ; return first argument (0) jmp fps_pack0 fps_mul0 lda fpa1+fpa.exp ; second argument zero? bne fps_mul1 ; brif not leax ,y ; return second argument (0) jmp fps_pack1 fps_mul1 lda fpa0+fpa.sign ; calculate result sign eora fpa1+fpa.sign sta fpa0+fpa.sign ; result will go into fpa0 lda fpa0+fpa.exp ; fetch exponent of multiplicand suba #64 ; remove bias sta fpa0+fpa.exp ; save it lda fpa1+fpa.exp ; get exponent of multplier suba #64 ; remove bias adda fpa0+fpa.exp ; add exponents cmpa #63 ; did we overflow upward? bgt OVERROR ; brif so cmpa #-63 ; did we underflow? bhs fps_mul3 ; brif not fps_mul2 clr fpa0+fpa.exp ; return zero if we underflow leax ,y jmp fps_pack0 fps_mul3 adda #64 ; add bias to resulting exponent sta fpa0+fpa.exp ; save result exponent ldd zero ;* zero out result buffer high digits (extra digits + significand and std fpaextra ;* one extra byte to make sure lowest extra digit is zero) std fpaextra+2 std fpaextra+4 sta fpaextra+6 ldb fpa1+fpa.sig+4 ;* do each byte of multiplier significand in sequence bsr fps_mul4 ldb fpa1+fpa.sig+3 bsr fps_mul4 ldb fpa1+fpa.sig+2 bsr fps_mul4 ldb fpa1+fpa.sig+1 bsr fps_mul4 ldb fpa1+fpa.sig bsr fps_mul4 ldd fpaextra+1 ; copy result into fpa0 significand (overflow byte will be zero) std fpa0+fpa.sig ldd fpaextra+3 std fpa0+fpa.sig+2 ldd fpaextra+5 sta fpa0+fpa.sig+4 ldd fpaextra+7 std fpa0+fpa.sig+6 ldd fpaextra+9 std fpa0+fpa.sig+8 jmp fps_normalize ; go normalize the result fps_mul4 bne fps_mul6 ; brif at least one digit is nonzero ldd fpaextra+8 ; shift right by 8 bits std fpaextra+9 ldd fpaextra+6 std fpaextra+7 ldd fpaextra+4 std fpaextra+5 ldd fpaextra+2 sta fpaextra+3 ldd fpaextra std fpaextra+1 clr fpaextra rts fps_mul5 bsr fps_mul11 ; add multiplicand to accumulator decb ; account for iteration fps_mul6 bitb #0x0f ; done everything for this digit? bne fps_mul7 bsr fps_mul9 ; shift accumulator bra fps_mul8 fps_mul7 bsr fps_mul11 ; add multiplicand to accumulator subb #0x10 ; account for iteration fps_mul8 bitb #0xf0 ; done all iterations? bne fps_mul7 ; brif not fps_mul9 lsr fpaextra ; shift result ror fpaextra+1 ror fpaextra+2 ror fpaextra+3 ror fpaextra+4 ror fpaextra+5 ror fpaextra+6 ror fpaextra+7 ror fpaextra+8 ror fpaextra+9 ror fpaextra+10 lsr fpaextra ror fpaextra+1 ror fpaextra+2 ror fpaextra+3 ror fpaextra+4 ror fpaextra+5 ror fpaextra+6 ror fpaextra+7 ror fpaextra+8 ror fpaextra+9 ror fpaextra+10 lsr fpaextra ror fpaextra+1 ror fpaextra+2 ror fpaextra+3 ror fpaextra+4 ror fpaextra+5 ror fpaextra+6 ror fpaextra+7 ror fpaextra+8 ror fpaextra+9 ror fpaextra+10 lsr fpaextra ror fpaextra+1 ror fpaextra+2 ror fpaextra+3 ror fpaextra+4 ror fpaextra+5 ror fpaextra+6 ror fpaextra+7 ror fpaextra+8 ror fpaextra+9 ror fpaextra+10 rts fps_mul11 lda fpa0+fpa.sig+4 ; add digits starting at the bottom adda fpaextra+5 daa sta fpaextra+5 lda fpa0+fpa.sig+3 adca fpaextra+4 daa sta fpaextra+4 lda fpa0+fpa.sig+2 adca fpaextra+3 daa sta fpaextra+3 lda fpa0+fpa.sig+1 adca fpaextra+2 daa sta fpaextra+2 lda fpa0+fpa.sig adca fpaextra+1 daa sta fpaextra+1 bcc fps_mul12 inc fpaextra ; handle carry out fps_mul12 rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Single precision BCD divide ; ; Calculate (X) รท (U) -> (Y) ; ; First, calculate the result exponent by subtracting the decimal exponent of (U) from that of (X). If the result goes ; above the exponent range, raise overflow. If it goes below the exponent range, return zero. ; ; The calculate the result sign the same as for multiplcation. Also calculate the 10's complement of the divisor. ; ; First, we copy the divdend into the extra precision digits in fpa1 since they aren't needed for anything else ; here. Then we set the desired nonzero digit counter to 11 to indicate 10 significant digits plus a digit for ; rounding and set the flag indicating a nonzero digit was seen to zero to indicate it hasn't been seen. Also set ; the extra carry byte to zero. Then we run through the digit pair routine 10 times to set 20 digits of the ; significand and extra precision for fpa0. The digit pair routine just calls the digit loop twice while doing ; some bookkeeping. ; ; We also pre-cacluate the 9's complement of the divisor and store that in fpaextra to save time during the ; subtraction steps. ; ; For each digit, the process is as follows: ; ; 1. If the desired digit count is zero, do nothing and just return a zero for the quotient digit. ; 2. Initialize the quotient digit to 0 ; 3. If the extra carry is nonzero, we know the result "goes" so go on to step 6 ; 4. Compare the divisor to the residue; this is faster on average than doing the subtraction and then ; reversing the last subtraction loop. ; 5. If the divisor goes *exactly* into the residue, bump the quotient digit, clear the desired digit count ; and return the quotient digit; we have no need to do anything more ; 6. If the divisor does go, subtract the divisor from the residue, bump the digit, and go back to step 3 ; 7. If the resulting digit is nonzero, set the "nonzero digit seen" flag to 1 (exactly one) ; 8. Subtract the nonzero digit flag from the desired digits flag so after enough digits, we stop doing the ; division loops. ; 9. Return the quotient digit. fps_div jsr fps_copyargs ; get arguments to the accumulators lda fpa1+fpa.exp ; get divisor exponent bne fps_div0 ; brif not zero DIV0ERROR ldb #err_div0 ; raise division by zero jmp ERROR fps_div0 suba #64 ; remove bias sta fpa1+fpa.exp ; save for later calculation lda fpa0+fpa.exp ; get quotient exponent bne fps_div2 ; brif not zero - we have to do work fps_div1 clr fpa0+fpa.exp ; make sure result is zero leax ,y ; return zero result jmp fps_pack0 fps_div2 suba #64 ; remove bias suba fpa1+fpa.exp ; subtract divisor exponent cmpa #64 ; did we overflow upward? lbge OVERROR ; brif so cmpa #-64 ; did we overflow downward (underflow)? bls fps_div1 ; brif we underflow adda #64 ; add back the bias sta fpa0+fpa.exp ; set result exponent lda fpa0+fpa.sign ; calculate result sign (XOR of argument signs) eora fpa1+fpa.sign sta fpa0+fpa.sign ldd fpa0+fpa.sig ; initialize residue to dividend std fpa1+fpa.sig ldd fpa0+fpa.sig+2 std fpa1+fpa.sig+2 lda fpa0+fpa.sig+4 sta fpa1+fpa.sig+4 ldd #11 ; initialize digit counter and nonzero seen flag std fpaextra+5 sta fpaextra+7 ; set ongoing extra carry digits to zero ldd #0x9999 ;* calculate 9's complement of divisor for later; we'll introduce a carry subd fpa1+fpa.sig ;* to the first byte to complete the 10's complement's +1 to save doing std fpaextra ;* the extra work here ldd #0x9999 subd fpa1+fpa.sig+2 std fpaextra+2 lda #0x99 suba fpa1+fpa.sig+4 sta fpaextra+4 bsr fps_div3 ; calculate the quotient byte by byte stb fpa0+fpa.sig bsr fps_div3 stb fpa0+fpa.sig+1 bsr fps_div3 stb fpa0+fpa.sig+2 bsr fps_div3 stb fpa0+fpa.sig+3 bsr fps_div3 stb fpa0+fpa.sig+4 bsr fps_div3 stb fpa0+fpa.sig+5 bsr fps_div3 stb fpa0+fpa.sig+6 bsr fps_div3 stb fpa0+fpa.sig+7 bsr fps_div3 stb fpa0+fpa.sig+8 bsr fps_div3 stb fpa0+fpa.sig+9 jmp fps_normalize ; go normalize the result and return fps_div3 bsr fps_div5 ; do a digit lslb ; shift it over lslb lslb lslb sta fpaextra+8 ; save it bsr fps_div5 ; do next digit addb fpaextra+8 ; combine the two quotient digits fps_div4 rts fps_div5 ldb fpaextra+6 ; do we even need to do the division? beq fps_div4 ; brif not - return 0 clrb ; initialize quotient digit fps_div6 lda fpaextra+7 ; did we have a carry last time? bne fps_div7 ; brif so - we know it "goes" lda fpa1+fpa.sig ; is the divisor less than the dividend residue? cmpa fpa1+fpa.sig+5 bhi fps_div8 ; brif high byte is larger than residue lda fpa1+fpa.sig+1 ; and keep going for all 5 bytes cmpa fpa1+fpa.sig+6 bhi fps_div8 lda fpa1+fpa.sig+2 cmpa fpa1+fpa.sig+7 bhi fps_div8 lda fpa1+fpa.sig+3 cmpa fpa1+fpa.sig+8 bhi fps_div8 lda fpa1+fpa.sig+4 cmpa fpa1+fpa.sig+9 bhi fps_div8 ; brif divisor is greater than the residue bne fps_div7 ; brif it didn't go exactly incb ; bump quotient for this "go" clr fpaextra+6 ; indicate no more digits needed rts fps_div7 coma ; set carry to complete 10's complement of divisor lda fpa1+fpa.sig+9 ; do the "subtraction" adca fpaextra+4 daa sta fpa1+fpa.sig+9 lda fpa1+fpa.sig+8 adca fpaextra+3 daa sta fpa1+fpa.sig+8 lda fpa1+fpa.sig+7 adca fpaextra+2 daa sta fpa1+fpa.sig+7 lda fpa1+fpa.sig+6 adca fpaextra+1 daa sta fpa1+fpa.sig+6 lda fpa1+fpa.sig+5 adca fpaextra daa sta fpa1+fpa.sig+5 lda fpaextra+7 ; and handle the carry byte adca #0x99 sta fpaextra+7 incb ; bump digit count bra fps_div6 ; go see if we need another subtraction fps_div8 tstb ; nonzero digit? beq fps_div9 ; brif not lda #1 ; set nonzero flag sta fpaextra+5 fps_div9 lda fpaextra+6 ; adjust digit count suba fpaextra+5 sta fpaextra+6 lsl fpa1+fpa.sig+9 ; shift residue one decimal digit rol fpa1+fpa.sig+8 rol fpa1+fpa.sig+7 rol fpa1+fpa.sig+6 rol fpa1+fpa.sig+5 rol fpaextra+7 lsl fpa1+fpa.sig+9 rol fpa1+fpa.sig+8 rol fpa1+fpa.sig+7 rol fpa1+fpa.sig+6 rol fpa1+fpa.sig+5 rol fpaextra+7 lsl fpa1+fpa.sig+9 rol fpa1+fpa.sig+8 rol fpa1+fpa.sig+7 rol fpa1+fpa.sig+6 rol fpa1+fpa.sig+5 rol fpaextra+7 lsl fpa1+fpa.sig+9 rol fpa1+fpa.sig+8 rol fpa1+fpa.sig+7 rol fpa1+fpa.sig+6 rol fpa1+fpa.sig+5 rol fpaextra+7 rts ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ; Convert a floating point number in val0 to a string in strbuff ; The maximum size of a string generated here is 19 bytes representing 10 significant digits, 5 leading zeroes after the ; decimal point, the decimal point, a leading zero, a leading sign, and a trailing NUL. fps_toascii ldu #strbuff+1 ; point to output buffer lda #0x20 ; set sign to "blank" sta -1,u clr fpaextra+1 ; set decimal point offset; zero means none clr fpaextra+3 ; disable "E" notation ldx #val0+val.value ; unpack value to fpa0 so we can mess with it jsr fps_unpack0 lda fpa0+fpa.exp ; check exponent bne fps_toascii0 ; brif not zero lda #'0 ; make the number zero and print it sta ,u clr 1,u rts fps_toascii0 lda fpa0+fpa.sign ; negative? bpl fps_toascii1 ; brif not lda #'- ; negative sign sta -1,u fps_toascii1 lda #10 ; assume 10 significant digits exist ldx #fpa0+fpa.sig+5 ; point to significand fps_toascii2 ldb ,-x ; get digit pair bitb #0x0f ; is right digit zero? bne fps_toascii3 ; brif not - we've counted all the significant digits deca ; trailing zero - uncount it bitb #0xf0 ; is left digit zero? bne fps_toascii3 ; brif not - we've counted all the sigificant digits deca ; trailing zero - uncount it cmpx #fpa0+fpa.sig ; done all significant digits? bhi fps_toascii2 ; brif not fps_toascii3 sta fpaextra ; save significant digits that exist lda fpa0+fpa.exp ; fetch exponent suba #64 ; remove bias sta fpa0+fpa.exp ; save for later bpl fps_toascii14 ; brif no leading zeroes suba fpaextra ; get number of significant digits plus decimal count cmpa #-15 ; do we end up with too many digits (leading 0 + significant digits) blt fps_toascii15 ; brif too many - do scientific notation ldd #'0*256+'. ; put a leading "0." std ,u++ ldb fpa0+fpa.exp ; get exponent back, which is the number of leading digits (negative) bra fps_toascii4a ; go add leading zeroes if needed fps_toascii4 sta ,u+ ; put a zero fps_toascii4a incb ; done all bne fps_toascii4 ; brif not fps_toascii5 ldx #fpa0+fpa.sig ; point to significand ldd #0xf000 ; set digit mask and counter sta fpaextra+2 fps_toascii6 lda ,x ; get digit pair anda fpaextra+2 ; keep the desired digit bita #0xf0 ; is it the left hand digit? beq fps_toascii7 ; brif not or digit is 0 lsra ; right justify digit lsra lsra lsra fps_toascii7 adda #0x30 ; turn it into ascii sta ,u+ ; stuff it in the output dec fpaextra ; done all digits? beq fps_toascii9 ; brif so dec fpaextra+1 ; are we at the decimal point? bne fps_toascii8 ; brif not lda #'. ; put a decimal sta ,u+ fps_toascii8 com fpaextra+2 ; flip digit mask bpl fps_toascii6 ; handle another digit leax 1,x ; move to next digit byte bra fps_toascii6 ; now go handle next digit fps_toascii9 ldb fpaextra+3 ; get decimal exponent to display beq fps_toascii13 lda #'E ; output "E" sta ,u+ tstb ; negative? bpl fps_toascii10 ; brif not negb ; positivize it lda #'- ; put a minus for the exponent sta ,u+ fps_toascii10 cmpb #10 ; do we have two digits for exponent? blo fps_toascii12 ; brif not lda #0x2f ; initialize left digit fps_toascii11 inca ; bump digit subb #10 ; are we at the right digit? bpl fps_toascii11 ; brif not yet addb #10 ; undo extra subtraction sta ,u+ ; save left digit fps_toascii12 addb #0x30 ; turn right digit to ascii stb ,u+ ; save right digit fps_toascii13 clr ,u ; put a NUL at the end of the result rts fps_toascii14 cmpa #9 ; is it in range for number of significant digits? bgt fps_toascii15 ; brif not - do scientific notation inca ; exponent 0 has decimal point after first digit sta fpaextra+1 ; save decimal point location cmpa fpaextra ; is it more than the number of significant digits? ble fps_toascii5 ; brif not - just convert the significand sta fpaextra ; make sure we include the pre-decimal zeroes bra fps_toascii5 ; go convert the significand fps_toascii15 ldb #1 ; put decimal after the first digit stb fpaextra+1 sta fpaextra+3 ; enable the "E" notation with the correct exponent bra fps_toascii5 ; actually convert the number *pragmapop list