Monday, February 12, 2018

Extending The Commodore 64 BASIC with 1st-class tokens

There are a number of nice tutorials that talk about extending the C64's BASIC. However, all the ones I could find talk about having a single-character prefix, e.g., @, and then making single-character commands that follow that. This is the usual approach, because it is much easier, as you don't need to make the LIST command able to interpret the new commands.  However, it makes for an ugly result and hard-to-read code. For the new MEGA BASIC on the MEGA65, I wanted to avoid this, and have "real" BASIC keywords for the extensions, so that the resulting programs would be easier to write and easier to read, and so that it would be much easier to learn how to do things by reading other people's programs.  So, I had to find out how to make a proper BASIC extension myself.

There are three main functions that you have to modify for this: (1) tokenisation (the part where words get turned into single-byte values when stored in a program); (2) detokenisation (the part where those get turned back into readable words, which really only matters for the LIST command); and (3) executing tokens. In this post, I'll show you how I implemented each one. This post goes into quite low-level detail, which might not be interesting to some, so please accept my apologies in advance if that is the case. However, for others who wish to implement an extension for the C64 BASIC, it would seem to be the only publicly released breakdown of how to make one, so I think it is warranted.

Before I dive into the tokenisation and detokenisation routines, it is worth mentioning a problem common to them both: The existing list of BASIC keywords as stored in the BASIC ROM is 255 bytes long, or 256 bytes including its $00 end marker.  By keeping it within a single 256-byte page, the tokenisation and detokenisation routines are able to be somewhat simpler, by using only an index register to scan through the list.  This has to change if we want to allow more keywords. As I don't anticipate making more keywords than will fit in 2*256 bytes, I was able to make changes that only worry about checking which half of the list is being read.


My token list is simply formed by taking the standard 255 bytes from the BASIC ROM at $A09E, and appending my new ones to the end:


tokenlist:
        ;; Reserve space for C64 BASIC token list, less the end $00 marker

        ;; (I copy these in place with a little loop elsewhere)
        .res ($A19C - $A09E + 1), $00
        ;; End of list marker (remove to enable new tokens)
                ;.byte $00
        ;; Now we have our new tokens
        ;; extra_token_count must be correctly set to the number of tokens
        ;; (This lists only the number of tokens that are good for direct use.
        ;; Keywords found only within statements are not in this tally.)
        extra_token_count = 5
        token_fast = $CC + 0
        .byte "FAS",'T'+$80
        token_slow = $CC + 1
        .byte "SLO",'W'+$80

        token_canvas = $CC + 2
        .byte "CANVA",'S'+$80
        token_colour = $CC + 3
        .byte "COLOU",'R'+$80
        token_tile = $CC + 4
        .byte "TIL",'E'+$80

        token_first_sub_command = token_tile + 1
       
        ;; These tokens are keywords used within other
        ;; commands, not as executable commands. These
        ;; will all generate syntax errors.
        token_text = token_first_sub_command + 0
        .byte "TEX",'T'+$80
        token_sprite = token_first_sub_command + 1
        .byte "SPRIT",'E'+$80
        token_screen = token_first_sub_command + 2
        .byte "SCREE",'N'+$80
        token_border = token_first_sub_command + 3
        .byte "BORDE",'R'+$80
        token_set = token_first_sub_command + 4
        .byte "SE",'T'+$80
        token_delete = token_first_sub_command + 5
        .byte "DELET",'E'+$80
        token_stamp = token_first_sub_command + 6
        .byte "STAM",'P'+$80
        token_at = token_first_sub_command + 7
        .byte "A",'T'+$80
        token_from = token_first_sub_command + 8
        .byte "FRO",'M'+$80
        ;; And the end byte
        .byte $00       


Another key issue is that most of the possible token values are already used.  BASIC 2 uses $80 to $CB, and $FF is PI, so only token values between $CC and $FE are available, i.e., only 51 of them.  If I want to avoid double-byte tokens (which I really do), then I need to keep my new keywords to no more than that number.  We'll see if I manage to do that, but for now, we will try. On both these points, I suspect that BASIC 10 on the C65 probably exceeds these limits, thus making the tokeniser and detokeniser more complex than they are in BASIC 2 on the C64.
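
To make the layout of the extended list concrete, here is a minimal Python sketch (a model for illustration, not part of MEGA BASIC itself) that encodes the new keywords the same way as the assembly listing above, with bit 7 set on the last letter, and assigns token values starting at $CC. The names encode_keyword and build_extension are made up for this example.

```python
# Model of the keyword list extension: each keyword is stored with bit 7 set
# on its final character, and the list ends with a $00 marker.
BASIC2_LAST_TOKEN = 0xCB                 # GO, the last stock BASIC 2 token
FIRST_NEW_TOKEN = BASIC2_LAST_TOKEN + 1  # $CC
PI_TOKEN = 0xFF                          # reserved for the PI character

# Same order as the assembly listing: five commands, then the sub-keywords.
NEW_KEYWORDS = [
    "FAST", "SLOW", "CANVAS", "COLOUR", "TILE",
    "TEXT", "SPRITE", "SCREEN", "BORDER", "SET",
    "DELETE", "STAMP", "AT", "FROM",
]


def encode_keyword(word):
    """Return the keyword bytes with bit 7 set on the last character."""
    raw = word.encode("ascii")
    return raw[:-1] + bytes([raw[-1] | 0x80])


def build_extension(keywords):
    """Return (encoded list with $00 end marker, keyword -> token value)."""
    table = b"".join(encode_keyword(w) for w in keywords) + b"\x00"
    tokens = {w: FIRST_NEW_TOKEN + i for i, w in enumerate(keywords)}
    return table, tokens


table, tokens = build_extension(NEW_KEYWORDS)
# Single-byte token values still unused below PI ($FF).
free_tokens = (PI_TOKEN - 1) - max(tokens.values())
```

With the fourteen keywords listed so far, the extension occupies 67 bytes including the end marker, so it fits comfortably in the second page of the list.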

The tokeniser

The tokeniser is the routine that scans a line of BASIC text, and changes keywords into single-byte tokens.  It has to take account of various quirks, such as the PI character officially being a token with the special value $FF, and know whether parts of the line are inside quotation marks or follow a REM statement. This all makes the logic of the tokeniser routine rather more subtle and complex than one might first imagine.  As a result, I figured it was safer to essentially replicate the functionality of the standard tokeniser routine as exactly as possible, adding only the ability to scan a keyword list that is up to 511 bytes long, instead of 255 bytes long.

This routine is effectively the same as in the C64 BASIC ROM at $A57C, but with the changes to allow the token list to be two pages long. I have also tried to more fully document the routine in the comments and labels. Some of the labels have address suffixes on them, that indicate where in the original C64 BASIC ROM routine they were located, to make it easier to reference between the two of them. 

The algorithm is basically one of seeing if the current character might be the start of a keyword that should be tokenised, and if so, doing a string match against it.  Tokens are stored in the token list with bit 7 set on the last character, so, for example RUN is stored as $52 $55 $CE, where $CE = $4E + $80 to represent the terminal N of the word.  This trick is also how the short-cuts for keywords using shifted letters work: If the result of subtracting the keyword letter from the input character is $80, then it is either the terminal character of the keyword, or a shifted letter in the input being tokenised.  This is also why typing something like RU(shift-N) won't work: the shifted N ($CE) is exactly equal to the stored terminal byte, so the difference is $00 rather than $80, and the comparison simply carries on into the next keyword.

It also occurs to me that this allows for a rather subtle bug, where if you typed a keyword with the last character shifted, and then characters for the following token, it will match against the first token, so typing RU(shift-N)I(shift-F)RESTORE would actually be interpreted and tokenised as though you had just typed RUN. Quite what use you might make of this, I have no idea, but I have confirmed that it does indeed work.  I haven't bothered to try to fix this little problem in my extension, but just report it here in case it is of curiosity to anyone.  I'd be interested to know if anyone else had previously discovered this quirk.
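
The matching loop and the quirk just described can be modelled in a few lines of Python. This is only a sketch of the keyword search (no quote, DATA or REM handling), using the first thirteen BASIC 2 keywords so that RUN, IF and RESTORE have their usual token values. It assumes shifted letters arrive in the input as the unshifted code plus $80 (shift-N as $CE), and the function names are invented for this example.

```python
# First thirteen BASIC 2 keywords, in ROM order, so END = $80 ... RESTORE = $8C.
KEYWORDS = ["END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM",
            "READ", "LET", "GOTO", "RUN", "IF", "RESTORE"]


def build_table(keywords):
    """Encode keywords with bit 7 set on the last character, plus $00 marker."""
    out = bytearray()
    for word in keywords:
        raw = word.encode("ascii")
        out += raw[:-1] + bytes([raw[-1] | 0x80])
    out.append(0)
    return bytes(out)


TABLE = build_table(KEYWORDS)


def match_keyword(line, start, table):
    """Return (token, next_input_index), or None if no keyword matches."""
    t = 0          # offset into the keyword list (Y in the assembly)
    index = 0      # keyword number minus $80 (location $0B in the assembly)
    x = start      # offset into the input line (X in the assembly)
    while t < len(table) and table[t] != 0:
        ch = line[x] if x < len(line) else 0
        diff = (ch - table[t]) & 0xFF
        if diff == 0:
            # Characters are identical: keep comparing. Note that this also
            # happens when a shifted letter equals a keyword's final byte,
            # in which case we run on into the next keyword.
            t += 1
            x += 1
            continue
        if diff == 0x80:
            # End of keyword reached, or a shifted-letter abbreviation.
            return 0x80 | index, x + 1
        # Mismatch: rewind the input and skip to the start of the next keyword.
        x = start
        index += 1
        while table[t] < 0x80:
            t += 1
        t += 1
    return None


plain = match_keyword(b"RUN", 0, TABLE)
abbreviated = match_keyword(b"R\xd5", 0, TABLE)          # R then shift-U
quirk_line = b"RU\xceI\xc6RESTORE"                       # RU(shift-N)I(shift-F)RESTORE
quirk = match_keyword(quirk_line, 0, TABLE)
```

In this model all three inputs come out as the RUN token ($8A), and the quirky line is consumed in its entirety, which matches the behaviour described above.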

Back to the task at hand, for the longer token list, I have added a "hi page flag" that is set when accessing the upper of the two pages of tokens, and clear for the lower.  In places this is initialised to $FF when set, so that INC token_hi_page_flag causes it to be cleared.  This is used in particular near the start following tokeniseNextChar, so that the routine can continue to pre-increment the pointer into the token list (which is normally just held in the Y register).

This means that the logic for advancing and retreating the token pointer is no longer as simple as INY or DEY. To keep things understandable, I have broken out the code to do this into tokenListReadByte, tokenListAdvancePointer etc.
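
As a rough illustration of the idea, here is a small Python model of an 8-bit index plus a page flag. The class and method names are hypothetical; only the starting values ($FF for both) and the flip-on-wrap behaviour are taken from the routines in this post.

```python
class TokenListPointer:
    """Model of the Y register plus token_hi_page_flag used to scan the list."""

    def __init__(self, table):
        self.table = table
        self.y = 0xFF             # pre-incremented, so start one before zero
        self.hi_page_flag = 0xFF  # first wrap of Y flips this to $00 (low page)

    def advance(self):
        """Equivalent of INY, flipping the page flag whenever Y wraps to zero."""
        self.y = (self.y + 1) & 0xFF
        if self.y == 0:
            self.hi_page_flag ^= 0xFF

    def offset(self):
        """Effective offset into the list: bit 7 of the flag selects the page."""
        page = 0x100 if self.hi_page_flag & 0x80 else 0
        return page + self.y

    def read(self):
        return self.table[self.offset()]


# Walk a dummy 512-byte list from start to end.
dummy = bytes(range(256)) + bytes(reversed(range(256)))
pointer = TokenListPointer(dummy)
seen = bytearray()
for _ in range(512):
    pointer.advance()
    seen.append(pointer.read())
```

One consequence that the model makes easy to see: a 513th advance wraps back to offset 0, so this scheme really is limited to two pages.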

Beyond that, there is not really much more to say about it, without having to dive deep into explaining every little subtlety of the C64 BASIC tokenise routine, other than to repeat that this tokeniser supports only single-byte tokens. To allow more tokens, we would need to maintain a 16-bit counter for the token value, and if it were >253, come up with a double-byte token, e.g., $FE <second byte>.  That is, token 254 would be the first token to require two bytes, because token $FF is already used for PI, and we need to have at least one token reserved as a kind of escape code, for which I have suggested $FE above as the logical candidate. If I end up using too many tokens, that is probably the method that I would use.
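
Nothing like this exists in MEGA BASIC yet, but one possible shape for such an escape scheme can be sketched in Python. The mapping of the second byte (token value minus $FE) is purely my assumption for the sake of the example, as are the function names.

```python
ESCAPE = 0xFE  # hypothetical escape byte introducing a two-byte token


def encode_token(value):
    """Encode a token counter value ($80 upwards) as one or two bytes."""
    if value < 0x80:
        raise ValueError("not a token value")
    if value < ESCAPE:
        return bytes([value])            # ordinary single-byte token
    second = value - ESCAPE              # assumed mapping: $FE -> $FE $00
    if second > 0xFF:
        raise ValueError("token value too large for this scheme")
    return bytes([ESCAPE, second])


def decode_token(data, pos=0):
    """Return (token counter value, index of the next byte)."""
    first = data[pos]
    if first == ESCAPE:
        return ESCAPE + data[pos + 1], pos + 2
    return first, pos + 1


encoded = encode_token(0x0123)
decoded = decode_token(encoded)
```

A real implementation would also have to keep the literal PI byte ($FF) out of the way of the escape handling, which this sketch does by never emitting $FF as a first byte.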

megabasic_tokenise:

        ;; Get the basic execute pointer low byte
        LDX    $7A
        ;; Set the save index
        LDY    #$04
        ;; Clear the quote/data flag
        STY    $0F

@tokeniseNextChar:
        ;; Get hi page flag for tokenlist scanning, so that if we INC it, it will
        ;; point back to the first page.  As we start with offset = $FF, the first
        ;; increment will do this. Since offsets are pre-incremented, this means
        ;; that it will switch to the low page at the outset, and won't switch again
        ;; until a full page has been stepped through.
        PHA
        LDA     #$FF
        STA    token_hi_page_flag
        PLA
       
        ;; Read a byte from the input buffer
        LDA    $0200,X
        ;; If bit 7 is clear, try to tokenise
        BPL    @tryTokenise
        ;; Now check for PI (char $FF)
        CMP    #$FF     ; = PI
        BEQ    @gotToken_a5c9
        ;; Not PI, but bit 7 is set, so just skip over it, and don't store
        INX
        BNE    @tokeniseNextChar
@tryTokenise:
        ;; Now look for some common things
        ;; Is it a space?
        CMP    #$20    ; space
        BEQ    @gotToken_a5c9
        ;; Not space, so save byte as search character
        STA    $08
        CMP    #$22    ; quote marks
        BEQ    @foundQuotes_a5ee
        BIT    $0F    ; Check quote/data mode
        BVS    @gotToken_a5c9 ; If data mode, accept as is
        CMP    #$3F           ; Is it a "?" (short cut for PRINT)
        BNE    @notQuestionMark
        LDA    #$99    ; Token for PRINT
        BNE    @gotToken_a5c9 ; Accept the print token (branch always taken, because $99 != $00)
@notQuestionMark:
        ;; Check for 0-9, : or ;
        CMP     #$30
        BCC    @notADigit
        CMP    #$3C
        BCC    @gotToken_a5c9
@notADigit:
        ;; Remember where we are up to in the BASIC line of text
        STY    $71
        ;; Now reset the pointer into tokenlist
        LDY    #$00
        ;; And the token number minus $80 we are currently considering.
        ;; We start with token #0, since we search from the beginning.
        STY    $0B
        ;; Decrement Y from $00 to $FF, because the inner loop increments before processing
        ;; (Y here represents the offset in the tokenlist)
        DEY
        ;; Save BASIC execute pointer
        STX    $7A
        ;; Decrement X also, because the inner loop pre-increments
        DEX
@compareNextChar_a5b6:
        ;; Advance pointer in tokenlist
        jsr tokenListAdvancePointer
        ;; Advance pointer in BASIC text
        INX
@compareProgramTextAndToken:
        ;; Read byte of basic program
        LDA    $0200, X
        ;; Now subtract the byte from the token list.
        ;; If the character matches, we will get $00 as result.
        ;; If the character matches, but was ORd with $80, then $80 will be the
        ;; result.  This allows efficient detection of whether we have found the
        ;; end of a keyword.
        bit     token_hi_page_flag
        bmi    @useTokenListHighPage
        SEC
        SBC    tokenlist, Y
        jmp    @dontUseHighPage
@useTokenListHighPage:
        SEC
        SBC    tokenlist+$100,Y
@dontUseHighPage:
        ;; If zero, then compare the next character
        BEQ    @compareNextChar_a5b6
        ;; If $80, then it is the end of the token, and we have matched the token
        CMP    #$80
        BNE    @tokenDoesntMatch
        ;; A = $80, so if we add the token number stored in $0B, we get the actual
        ;; token number
        ORA    $0B
@tokeniseNextProgramCharacter:
        ;; Restore the saved index into the BASIC program line
        LDY    $71
@gotToken_a5c9:
        ;; We have worked out the token, so record it.
        INX
        INY
        STA    $0200 - 5, Y
        ;; Now check for end of line (token == $00)
        LDA    $0200 - 5, Y
        BEQ @tokeniseEndOfLine_a609

        ;; Now think about what we have to do with the token
        SEC
        SBC    #$3A
        BEQ    @tokenIsColon_a5dc
        CMP    #($83 - $3A) ; (=$49) Was it the token for DATA?
        BNE    @tokenMightBeREM_a5de
@tokenIsColon_a5dc:
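        ;; Token was either ":" (A = $00) or DATA (A = $49).
        ;; Storing token - $3A works as the flag: $00 clears DATA mode at a
        ;; colon, while $49 has bit 6 set, which is what the BIT $0F / BVS
        ;; test near the top of the loop checks for.
        STA    $0F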
@tokenMightBeREM_a5de:
        SEC
        SBC    #($8F - $3A) ; (=$55) Was it the token for REM?
        BNE    @tokeniseNextChar
        ;; Was REM, so say we are searching for end of line (== $00)
        ;; (which is conveniently in A now)
        STA    $08   
@label_a5e5:
        ;; Read the next BASIC program byte
        LDA    $0200, X
        BEQ    @gotToken_a5c9
        ;; Does the next character match what we are searching for?
        CMP    $08
        ;; Yes, it matches, so indicate we have the token
        BEQ    @gotToken_a5c9

@foundQuotes_a5ee:
        ;; Not a match yet, so advance index for tokenised output
        INY
        ;; And write token to output
        STA    $0200 - 5, Y
        ;; Increment read index of basic program
        INX
        ;; Read the next BASIC byte (X should never be zero)
        BNE    @label_a5e5

@tokenDoesntMatch:
        ;; Restore BASIC execute pointer to start of the token we are looking at,
        ;; so that we can see if the next token matches
        LDX    $7A
        ;; Increase the token ID number, since the last one didn't match
        INC    $0B
        ;; Advance pointer in tokenlist from the end of the last token to the start
        ;; of the next token, ready to compare the BASIC program text with this token.
@advanceToNextTokenLoop:
        jsr     tokenListAdvancePointer
        jsr     tokenListReadByteMinus1
        BPL    @advanceToNextTokenLoop
        ;; Check if we have reached the end of the token list
        jsr    tokenListReadByte
        ;; If not, see if the program text matches this token
        BNE    @compareProgramTextAndToken

        ;; We reached the end of the token list without a match,
        ;; so copy this character to the output, and
        LDA    $0200, X
        ;; Then advance to the next character of the BASIC text
        ;; (BPL acts as unconditional branch, because only bytes with bit 7
        ;; cleared can get here).
        BPL    @tokeniseNextProgramCharacter
@tokeniseEndOfLine_a609:
        ;; Write end of line marker (== $00), which is conveniently in A already
        STA    $0200 - 3, Y
        ;; Decrement BASIC execute pointer high byte
        DEC    $7B
        ;; ... and set low byte to $FF
        LDA    #$FF
        STA    $7A
        RTS

tokenListAdvancePointer:   
        INY
        BNE    @dontAdvanceTokenListPage
        PHP
        PHA
        LDA    token_hi_page_flag
        EOR    #$FF
        STA    token_hi_page_flag
        ;; XXX Why on earth do we need these three NOPs here to correctly parse the extra
        ;; tokens? If you remove one, then the first token no longer parses, and the later
        ;; ones get parsed with token number one less than it should be!
        NOP
        NOP
        NOP
        PLA
        PLP
@dontAdvanceTokenListPage:
        PHP
        PHX
        PHA
        tya
        tax
        bit    token_hi_page_flag
        bmi    @page2
        jmp    @done
@page2:       
        @done:
       
        PLA
        PLX
        PLP
        RTS

tokenListReadByte:   
        bit     token_hi_page_flag
        bmi    @useTokenListHighPage
        LDA    tokenlist, Y
        RTS
@useTokenListHighPage:
        LDA    tokenlist+$100,Y
        RTS       

tokenListReadByteMinus1:   
        bit     token_hi_page_flag
        bmi    @useTokenListHighPage
        LDA    tokenlist - 1, Y
        RTS
@useTokenListHighPage:
        LDA    tokenlist - 1 + $100,Y
        RTS       


The detokenisation routine also draws heavily on the original C64 BASIC ROM's routine (located at $A71A).  This routine is rather simpler, because it doesn't need to do any string matching, or handling of abbreviated tokens using SHIFT+letter:

megabasic_detokenise:
        ;; The C64 detokenise routine lives at $A71A-$A741.
        ;; The routine is quite simple, reading through the token list,
        ;; decrementing the token number each time the end of a token is
        ;; found.  The only complication for us is that we need to change
        ;; the parts where the token bytes are read from the list to allow
        ;; the list to be two pages long.

        ;; Print non-tokens directly
        bpl     jump_to_a6f3
        ;; Print PI directly
        cmp    #$ff
        beq    jump_to_a6f3
        ;; If in quote mode, print directly
        bit    $0f
        bmi     jump_to_a6f3

        ;; At this point, we know it to be a token

        ;; Tokens are $80-$FE, so subtract #$7F, to renormalise them
        ;; to the range $01-$7F
        SEC
        SBC    #$7F
        ;; Put the normalised token number into the X register, so that
        ;; we can easily count down
        TAX
        STY    $49     ; and store it somewhere necessary, apparently

        ;; Now get ready to find the string and output it.
        ;; Y is used as the offset in the token list, and gets pre-incremented
        ;; so we start with it equal to $00 - $01 = $FF
        LDY    #$FF
        ;; Set token_hi_page_flag to $FF, so that when Y increments for the first
        ;; time, it increments token_hi_page_flag, making it $00 for the first page of
        ;; the token list.
        STY    token_hi_page_flag

       
@detokeniseSearchLoop:
        ;; Decrement token index by 1
        DEX
        ;; If X = 0, this is the token, so read the bytes out
        beq    @thisIsTheToken
        ;; Since it is not this token, we need to skip over it
@detokeniseSkipLoop:
        jsr tokenListAdvancePointer
        jsr tokenListReadByte
        BPL    @detokeniseSkipLoop
        ;; Found end of token, loop to see if the next token is it
        BMI    @detokeniseSearchLoop
@thisIsTheToken:
        jsr tokenListAdvancePointer
        jsr tokenListReadByte
        ;; If it is the last byte of the token, return control to the LIST
        ;; command routine from the BASIC ROM
        BMI    jump_list_command_finish_printing_token_a6ef
        ;; As it is not the end of the token, print it out
        JSR    $AB47
        BNE    @thisIsTheToken

        ;; This can only be reached if the next byte in the token list is $00
        ;; This could only happen in C64 BASIC if the token ID following the
        ;; last is attempted to be detokenised.
        ;; This is the source of the REM SHIFT+L bug, as SHIFT+L gives the
        ;; character code $CC, which is exactly the token ID required, and
        ;; the C64 BASIC ROM code here simply fell through into the FOR routine.
        ;; Actually, understanding this, makes it possible to write a program
        ;; that when LISTed, actually causes code to be executed!
        ;; However, this vulnerability appears not possible to be exploited,
        ;; because $0201, the next byte to be read from the input buffer during
        ;; the process, always has $00 in it when the FOR routine is run,
        ;; causing a failure when attempting to execute the FOR command.
        ;; Were this not the case, REM (SHIFT+L)I=1TO10:GOTO100, when listed
        ;; would actually cause GOTO100 to be run, thus allowing LIST to
        ;; actually run code. While still not a very strong form of source
        ;; protection, it could have been a rather fun thing to try.

        ;; Instead of having this error, we will just cause the character to
        ;; be printed normally.
        LDY    $49
jump_to_a6f3:   
        JMP     $A6F3
jump_list_command_finish_printing_token_a6ef:
        JMP    $A6EF
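
The search in the routine above boils down to counting keyword terminators. A minimal Python sketch of that idea, again using only the first thirteen BASIC 2 keywords and invented function names, looks like this. Returning None for a token past the end of the list stands in for the "print it as a normal character" behaviour chosen above.

```python
KEYWORDS = ["END", "FOR", "NEXT", "DATA", "INPUT#", "INPUT", "DIM",
            "READ", "LET", "GOTO", "RUN", "IF", "RESTORE"]


def build_table(keywords):
    """Encode keywords with bit 7 set on the last character, plus $00 marker."""
    out = bytearray()
    for word in keywords:
        raw = word.encode("ascii")
        out += raw[:-1] + bytes([raw[-1] | 0x80])
    out.append(0)
    return bytes(out)


TABLE = build_table(KEYWORDS)


def detokenise(token, table):
    """Return the keyword text for a token, or None if it is past the list."""
    remaining = token - 0x7F      # $80 -> 1, $81 -> 2, ...
    t = 0
    while remaining > 1:
        if table[t] == 0:         # hit the end marker while skipping
            return None
        while table[t] < 0x80:    # skip to the byte with bit 7 set
            t += 1
        t += 1                    # step past the terminating byte
        remaining -= 1
    if table[t] == 0:             # the token just after the last keyword
        return None
    chars = bytearray()
    while table[t] < 0x80:
        chars.append(table[t])
        t += 1
    chars.append(table[t] & 0x7F)  # strip bit 7 from the final letter
    return chars.decode("ascii")


listed = [detokenise(tok, TABLE) for tok in (0x80, 0x8A, 0x8C, 0x8D)]
```

In this cut-down list, $8D plays the role that $CC plays in the real BASIC 2 list: it is the first token value with no keyword behind it.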


The one piece of interest is that this routine is responsible for the REM SHIFT+L bug when listing a program on the C64. This bug *almost* creates a "code injection" bug in the C64's BASIC, as described in the comments above.  The bug arises because detokenising SHIFT+L causes a fall-through at the end of the detokenisation routine into the start of the FOR command's routine in the ROM. However, as I describe in the comments above, the contents of the next byte of input when detokenising seems to always be $00 or unusable, which will cause a syntax error in the FOR routine, when it tries to parse the variable it should be using as the loop iterator.  Actually, digging a bit deeper, the next token byte it reads is whatever followed the LIST command.  However, it is only really possible to have a colon or $00 there, so I think it is still not exploitable.
I'd be very interested to hear if anyone can think of a way to exploit this, so that LISTing a newly loaded program causes code of your choosing to be executed.

Speaking of execution, that just leaves our token execution routine to present:

megabasic_execute:       
        JSR    $0073
        ;; Is it a MEGA BASIC primary keyword?
        CMP    #$CC
        BCC    @basic2_token
        CMP    #token_first_sub_command
        BCC    megabasic_execute_token
        ;; Handle PI
        CMP    #$FF
        BEQ    @basic2_token
        ;; Else, it must be a MEGA BASIC secondary keyword
        ;; You can't use those alone, so ILLEGAL DIRECT ERROR
        jmp megabasic_perform_illegal_direct_error
@basic2_token:
        ;; $A7E7 expects Z flag set if ==$00, so update it
        CMP    #$00
        JMP    $A7E7

megabasic_execute_token:
        ;; Normalise index of new token
        SEC
        SBC     #$CC
        ASL
        ;; Clip it to make sure we don't have any overflow of the jump table
        AND    #$0E
        TAX
        PHX
        ;; Get next token/character ready
        JSR    $0073
        PLX
        JMP     (newtoken_jumptable,X)

        ;; Tokens are $CC-$FE, so to be safe, we need to have a jump
        ;; table entry for every index that the AND #$0E mask above can produce.
newtoken_jumptable:
        .word     megabasic_perform_fast
        .word     megabasic_perform_slow
        .word    megabasic_perform_canvas ; canvas operations, including copy/stamping, clearing, creating new
        .word    megabasic_perform_colour ; set colours
        .word    megabasic_perform_tile ; "TILE" command, used for TILESET and other purposes
        .word    megabasic_perform_syntax_error ; "SET" SYNTAXERROR: Used only with TILE to make TILESET
        .word     megabasic_perform_syntax_error
        .word     megabasic_perform_syntax_error
        .word     megabasic_perform_syntax_error

        basic2_main_loop     =    $A7AE

This routine is really quite simple: Check if the token is not one of ours, in which case jump to the normal BASIC 2 routine, else work out which one, and jump to the appropriate routine.  I have cheated slightly, and used a C65/M65 new addressing mode for JMP, that makes jump tables much easier to implement.
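
The decision logic can be summarised with a short Python model. It only mirrors the range checks in megabasic_execute; the handler names are placeholders standing in for the assembly routines, and the returned strings are just labels for this example.

```python
FIRST_NEW_TOKEN = 0xCC                       # token_fast
EXTRA_TOKEN_COUNT = 5                        # FAST, SLOW, CANVAS, COLOUR, TILE
FIRST_SUB_COMMAND = FIRST_NEW_TOKEN + EXTRA_TOKEN_COUNT
PI_TOKEN = 0xFF

# Placeholder names for the routines reached through the jump table.
HANDLERS = ["fast", "slow", "canvas", "colour", "tile"]


def dispatch(token):
    """Say where execution goes for the byte fetched at the start of a statement."""
    if token < FIRST_NEW_TOKEN or token == PI_TOKEN:
        return "basic2"                       # let the stock ROM handle it
    if token < FIRST_SUB_COMMAND:
        return HANDLERS[token - FIRST_NEW_TOKEN]
    return "illegal direct error"             # sub-keyword used on its own


results = {hex(tok): dispatch(tok) for tok in (0x99, 0xCC, 0xD0, 0xD1, 0xFF)}
```

Here $99 (PRINT) and $FF (PI) go to BASIC 2, $CC and $D0 reach FAST and TILE, and $D1 (TEXT) is rejected because it is only meaningful inside another command.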

Each of the routines then does what is required of it, before jumping to an error routine, or back to the BASIC 2 main loop. Here is the FAST command, which is nice and simple:


megabasic_perform_fast:
        jsr    enable_viciv
        LDA    #$40
        TSB    $D054
        TSB    d054_bits
        JMP    basic2_main_loop


As you can see, there is really nothing to it.  If you want to read arguments and the like, then you have to call the same routines that the BASIC ROM would use to do the same.  The best way to discover these is to look at an existing routine that implements a BASIC command, and see how it does it.  If you want to see how we have used them, or indeed to look at how these routines are strung together to make a fully-functioning BASIC extension, the source code for MEGA BASIC (in its currently unfinished state) is all here on GitHub.

So there you have it.  I'll hopefully post an update soon that shows some more of the progress that has been made on MEGA BASIC. In particular, the CANVAS command and its variations are now almost complete, with the exception of the commands to get and set individual tiles in a canvas.

Monday, February 5, 2018

First steps towards MEGA BASIC for the MEGA65

It has got late, so just a quick post with some fun screen shots and a video.  I'll do another post another day talking about how I have extended the C64 BASIC tokeniser with proper keywords, as there don't seem to be many examples of this that I can find online.

But for now, we have a demo of the new CANVAS STAMP command.

CANVASes in MEGA BASIC are things you can put graphics on, more precisely, they are screen-like things that you can cover with characters. However, as these are MEGA65 characters, they can be 8x8 blocks where each pixel is specified by an 8-bit palette entry, i.e., you can freely use any of 256 colours in each of these 8x8 blocks. 

Each CANVAS has a size, up to 255x255, and you can have up to 255 of them (you will run out of RAM before then in any case, most likely).

There is also the special CANVAS 0, which is the screen you see. If you STAMP something onto a visible part of CANVAS 0, then it will appear on the screen, e.g., if I do:

CANVAS 1 STAMP ON CANVAS 0 AT 0,0

This tells MEGA BASIC that I want to draw whatever is in CANVAS 1 onto the screen, beginning at the top left.  You can also STAMP just a part of a CANVAS, but that is too advanced for today.

Notice that I haven't said anything about how to turn on "graphics mode" or pick the right mode etc. That's because you don't need to do that in MEGA BASIC.  The text mode screen is automatically overlain onto CANVAS 0 in real time. There is a raster interrupt that uses a few % of the CPU time to do this each frame, so you can even POKE onto the screen, and it will still magically update and appear together with the graphics.

However, that's all a bit abstract. What we need is to see this in action.

So first we start at the READY prompt for MEGA BASIC. It really is a V0.1 at the moment, with lots of bugs, and most things are not yet implemented.


So let's do that CANVAS STAMP command I mentioned above, to draw the contents of CANVAS 1 (a picture of the MEGA65 logo) onto the screen (CANVAS 0):

Oh dear, we can't see part of it, because it is hidden behind the text.  That's okay, because we can just do shift-HOME to rub the text out, and then we can see the full logo behind:

We could also get a bit more excited and draw three logos, instead of 1:

This ability to seamlessly work with text and graphics using simple commands is very much the philosophy of MEGA BASIC. There will also be commands for easy manipulation of sprites, and most likely also using SID tunes in the standard format that loads at $1000-$1FFF, perhaps with commands like TUNE LOAD "some.sid",8 , TUNE 1 and TUNE OFF.  There will also be CANVAS and SPRITE editors on the MEGA65, as well as cross-platform tools, to suit various needs. Basically if you want to make a nice little game or graphically rich program, it will be super easy and super fun to do.

So let's make something a bit more sophisticated. Right now we don't have those nice editors etc, and it is VERY late, so we will just draw MEGA65 logos randomly on the screen. To do this, we need only four lines of quite understandable BASIC:


And then, pronto, we have logos being drawn all over the screen!

It should also be mentioned that because it is based on characters, the drawing is really quite fast.  You can paste quite a few CANVASes per second, even at 1MHz. And with the FAST command to select 50MHz, you can do even more. Here is a little video (sorry for the bad audio, my microphone is quite sick at the moment) to see it all in action, including stamping many of these big logos on the screen per second:



Wednesday, January 31, 2018

Creating some documentation on the MEGA65's enhanced text mode

This is a bit of a work in progress, but I have created some documentation for the enhanced text mode of the MEGA65. It is still missing the 256-colour mode information, but does explain the core of the memory layout when using 2 bytes per char instead of one.

https://github.com/MEGA65/mega65-core/blob/px100mhz/doc/viciv-modes.md

Hopefully I will be able to expand on this over the coming week, however, this post also marks the end of my annual leave, and thus, sadly, progress will almost certainly slow down again for a while. That said, I am very happy with the progress made over the past six weeks or so. I tried to keep a bit of a progress log for my own personal satisfaction, and while I know I missed a bunch of stuff, it is still quite a list of things that have been dealt with:

24DEC17 - 800x600 video modes work
24DEC17 - Joystick input not working
24DEC17 - CPU bug fixed (Boulder Mark etc runs fine)
24DEC17 - b0 command in UART monitor stops CPU on BRK instruction
25DEC17 - $DC00 always reads as zeros
26DEC17 - Fix sprite fine horizontal placement problem
26DEC17 - PDM/Sigma-Delta audio output working
26DEC17 - Kickstart looks for file "NTSC", if not present, switches to PAL
26DEC17 - CIA clock speed is always 1MHz, except in C128 2MHz mode.
26DEC17 - Fix CIA clock halving bug
26DEC17 - $D016 smooth scroll in 320H mode fixed
26DEC17 - CIA is 1MHz even in 2MHz mode
26DEC17 - NumLock on PS/2 / USB keyboard is now "joystick lock" (WASD+shift, cursors+space)
27DEC17 - Got rid of single stray pixel by right border
27DEC17 - Make On-screen-keyboard X position variable via $D619
27DEC17 - C= + <- key to toggle matrix mode on C64 keyboards
27DEC17 - On-screen-keyboard again shows key events
28DEC17 - Stop kickstart screen format getting clobbered when setting PAL/NTSC
28DEC17 - Fix doubled first row of pixels in chargen/bitmap
28DEC17 - Joystick input on MEGA65 r1 PCB
28DEC17 - Stereo channel swap/merge on $D6F9
29DEC17 - Speed up PCB synthesis via map command line option
29DEC17 - Find bug stopping IEC serial working (was driving lines high)
29DEC17 - Find hardware errata; No SRQ line on IEC serial port
29DEC17 - Make joystick controlled quick-synthesising debug rig
29DEC17 - Right SID is now on right channel
29DEC17 - $D612 bits 6-7 allow rotation of joystick inputs by 180 degrees
30DEC17 - $D03x can no longer be written over in VIC-II mode
30DEC17 - IEC serial port works, at least partially
30DEC17 - Digital audio outputs reduced volume to prevent amplifier complaining
30DEC17 - Find and fix timer b and ISR reading bugs in CIAs
31DEC17 - Investigate and fix lack of shift register in CIAs for C65 DOS fast mode
31DEC17 - C65 disk drive check succeeds without shift-register status kludge
01JAN18 - VIC-II Bitmap mode displays last pixel row as though from next char row
01JAN18 - Fix trimming of sprite pixels vertically and when expanded horizontally
01JAN18 - Cartridge port accesses now work
01JAN18 - Cartridge ROMs now map to memory
02JAN18 - International Soccer cartridge works
03JAN18 - Ultimax-Mode cartridges work
04JAN18 - 1351/POT interface proof-of-concept
04JAN18 - 3.5" floppy drive proof-of-concept
05JAN18 - dotclock on cartridge port sped up from 4MHz to ~6.5MHz
06JAN18 - dotclock on cartridge port correctly at 8MHz
06JAN18 - 1351 mouse/POT inputs work correctly, and are in MEGA65 VHDL
07JAN18 - 16-colour sprite mode
07JAN18 - Sprite rendering bugs fixed
08JAN18 - Top row of pixels in sprite can now collide
08JAN18 - Sprite:sprite collision detection much more accurate (one Impossible Mission bug remains)
08JAN18 - Sprite Y position corrected
09JAN18 - Ethernet can receive (but frames lack CRC!)
09JAN18 - 1581 repaired, and 3.5" test disks prepared
09JAN18 - Ethernet TX phase correction for r1 PCB (but CRC received by nexys still wrong)
09JAN18 - Ethernet TX and RX fully working on both r1 PCB and Nexys4DDR
10JAN18 - Resistor pull-up pack for floppy interface
10JAN18 - Floppy drive reads from real disk
10JAN18 - Worked out how to decode floppy data
10JAN18 - Real drive tracks F011 activity
10JAN18 - Step and spin-up delays set for real floppy drive
11JAN18 - MFM gap finder
11JAN18 - MFM gap quantiser
11JAN18 - MFM byte decoder
11JAN18 - 1581 sector decoder
11JAN18 - CRC16
11JAN18 - Real floppy mode for F011
12JAN18 - MFM decoder decodes real disks (but MEGA65 doesn't get the data for some reason)
13JAN18 - Amiga/1351 mouse / joystick auto detection
13JAN18 - Copy 1351 mouse status to Amiga mouse status to avoid mouse cursor jumping
14JAN18 - Fix problems with buffer writing when reading from FDC
14JAN18 - $14 F011 command waits instead of steps, as expected
15JAN18 - Can load sector from FDC into sector buffer (but job doesn't complete properly)
15JAN18 - Ethernet MIIM now working
15JAN18 - FDC sector rotation bug fixed
15JAN18 - sector buffer collapsed to one physical copy (saves 2 BRAMs)
15JAN18 - FDC sector reading now works with C65 ROM (can DIR a real disk)
20JAN18 - Pull in Daniel England's diskmenu improvements
19JAN18 - Fix FDISK bugs for system partition creation
20JAN18 - Implement basic system partition reading
20JAN18 - Use CTRL,ALT and SHIFT to control boot process instead of FPGA switches
21JAN18 - Enhanced DMA list mode (and update FDISK, Kickstart to use it)
21JAN18 - Multiplier in CPU
21JAN18 - Hold RESTORE for HyperTrap, instead of double-tap (no stray NMI caused)
26JAN18 - Config program loads config from SD card
26JAN18 - Psygnosis owl sprite demo
26JAN18 - Fix logo display bug with new enhanced DMA
28JAN18 - Config utility works and saves and loads
28JAN18 - V400 character rendering bugs fixed
28JAN18 - V400 border position bugs fixed
29JAN18 - Virtual D81 F011 reading works again
29JAN18 - VD81 (buffered) writing seems to work

The main thing is that the hardware has been tested well enough to allow for the second revision of the motherboard to be designed and assembled over the next couple of months.

Raster splits in BASIC

I have started trying to generate some documentation on the advanced character modes of the MEGA65's VIC-IV video controller, and in the process decided to write some simple example programs, so that people can more easily see what each feature does, and experiment with them themselves.  To make the tests more accessible, I figured that I would write them in BASIC, rather than assembly language. 

The catch is that if I am changing the character mode, and yet want some kind of useful information on the display, then I need a raster split, so that part of the screen can be in the special mode while the information stays in ordinary text mode. But hang on, I am programming in BASIC, not assembly, so how can I possibly have a raster split?  The answer is: with 50MHz! What used to take a second on a C64 can be done in a single frame on the MEGA65, and what used to take a frame can be done in just a few rasters.  Thus I figured that if I used the rather underrated WAIT command in C64 BASIC, I could probably get a stable enough raster split for this purpose. It won't be perfect, but it will still be pretty good.

WAIT is a bit of a strange beast.  It takes three arguments: the address to watch, a value to AND with the contents of that location, and a value to XOR with the contents before doing the AND.  So:

WAIT 53265,128,0

will wait while ((PEEK(53265) XOR 0) AND 128) = 0

That is, it will wait until the VIC-II (or III or IV) is near the bottom of the screen, and:

WAIT 53266,64,0

will wait while ((PEEK(53266) XOR 0) AND 64) = 0

That is, until the VIC-II/III/IV has just about finished drawing the top two lines of text.  If I only want to put text in the top line, that's close enough for my little test program.
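To make the WAIT rule concrete, here is a minimal Python model of it. It is only a sketch: `wait_condition_met`, `wait`, and the `peek` callback are hypothetical names standing in for BASIC's WAIT and PEEK, and `max_polls` exists only so the model cannot spin forever.

```python
def wait_condition_met(value, mask, invert=0):
    """Return True once WAIT would stop: ((value XOR invert) AND mask) <> 0."""
    return ((value ^ invert) & mask) != 0


def wait(peek, address, mask, invert=0, max_polls=1_000_000):
    """Poll peek(address) until the WAIT condition is met.

    Returns the number of polls that were needed.
    """
    for polls in range(1, max_polls + 1):
        if wait_condition_met(peek(address), mask, invert):
            return polls
    raise TimeoutError("WAIT condition never became true")


if __name__ == "__main__":
    # A fake raster low byte that counts up, as if read from 53266.
    rasters = iter(range(256))
    polls = wait(lambda address: next(rasters), 53266, 64)
    print(polls)  # 65: values 0..63 fail, the 65th value read (64) has bit 6 set
```

Passing 128 as the third argument inverts bit 7 first, so the same call then waits for that bit to become clear instead of set.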

So, let's put this together into a little loop that changes the border colour based on where we are on the screen:

2000 R = 53248 + 17: R2 = R + 1
2010 REM WAIT FOR BOTTOM OF SCREEN
2020 WAIT R,128,0: POKE 53280,0
2050 REM WAIT UNTIL AFTER A COUPLE OF ROWS OF TEXT
2060 WAIT R,128,128: POKE 53280,2
2070 WAIT R2,64,0: POKE 53280,1
2090 GET A$: IF A$="" GOTO 2010

Line 2000 works out $D011 and $D012, the two raster indicator registers on the VIC-II in decimal, so that we can easily use them in the routine.

Line 2020 waits until bit 7 of $D011 = 1, i.e., for the bottom part of the screen to start being drawn, and then sets the border colour to black.

Line 2060 waits until we are out of the vertical flyback, with bit 7 of $D011 = 0 again, and then sets the border colour to red.

Line 2070 waits for raster #64, i.e., almost the end of the second row of text, and then makes the border white.
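The three waits described above can be checked with a small simulation. This is a rough sketch under loud assumptions: a 312-line PAL frame, BASIC treated as taking no time between the WAIT and the POKE, and only the raster bits of $D011 and $D012 modelled. The names `STEPS`, `read_register`, and `simulate` are made up for the illustration.

```python
RASTERS_PER_FRAME = 312  # assumption: PAL timing

BLACK, WHITE, RED = 0, 1, 2  # C64 colour numbers

# (register, AND mask, XOR value, border colour POKEd once the WAIT returns)
STEPS = [
    ("D011", 128, 0, BLACK),   # line 2020: wait for raster bit 8 to be set
    ("D011", 128, 128, RED),   # line 2060: wait for raster bit 8 to clear
    ("D012", 64, 0, WHITE),    # line 2070: wait for bit 6 of the low byte
]


def read_register(name, raster):
    """Raster bit 8 appears as bit 7 of $D011; bits 0-7 appear in $D012."""
    if name == "D011":
        return 0x80 if raster & 0x100 else 0x00
    return raster & 0xFF


def simulate(frames=3):
    """Return the border colour in effect on each raster of the last frame."""
    step = 0
    colour = None
    last_frame = []
    for _ in range(frames):
        last_frame = []
        for raster in range(RASTERS_PER_FRAME):
            # Idealised: every WAIT that is already satisfied completes at once.
            # Steps 0 and 1 are mutually exclusive, so this loop always ends.
            while True:
                name, mask, invert, new_colour = STEPS[step]
                if ((read_register(name, raster) ^ invert) & mask) == 0:
                    break
                colour = new_colour
                step = (step + 1) % len(STEPS)
            last_frame.append(colour)
    return last_frame


if __name__ == "__main__":
    frame = simulate()
    print(frame.count(RED), frame.count(WHITE), frame.count(BLACK))  # 64 192 56
```

In this idealised model rasters 0-63 are red, 64-255 are white, and 256-311 are black. Real BASIC takes some time between the WAIT and the POKE, so on hardware the edges will land a little later and wobble slightly.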

So, if this works, we should see white border for most of the screen, but red at the top, and black at the bottom, and indeed we do:


So now you don't need to learn assembly language to do a raster split any more ;)

Tuesday, January 30, 2018

Displaying 256 colour images and 16 colour sprites

The material in this post is from Daniel, who has been working hard on tools for preparing and displaying 256-colour images on the MEGA65, using the full-colour text mode, where each pixel is represented by a byte, and where characters can be re-used to save space when drawing a low-entropy image. But over to Daniel...

As you all might know, I've been working on a few demos getting a few things tested and working...  At first, I needed a nice sprite pointer and there were some problems I couldn't solve so I handed the tests over to Paul and he sent me back an awesome demo.  After implementing my pointer, I was quite inspired and wanted to try to improve on what he had started.  So I upped-the-ante and released a three sprite animation with a few more frames.  I had to generate all of the data myself, including extracting each frame of the animation.  I wrote a GUI tool to handle the data conversion because I couldn't find Paul's for looking.  I had tried to do a five sprite version, but I couldn't figure out the memory utilisation at the time; I'll be revisiting it in the future.

Next, I wanted to get some 320x200x256 images on the screen.  Quite some time ago (gosh, is it two years already?) I built a tool to convert images into a format that I could use with the 16bit tile feature on the Mega65.  I dragged it out and tried to get it working.  I found out that it was well out-of-date and didn't work.  In fact, when I built the tool, the feature was still in the design stage and it was never actually used to produce an image on the M65 (at the time the C64GS).  

After some consultation with Paul I started to rework the tool.  I also had to build a loader program for the M65 side.  After quite a bit of trial-and-error and annoying Paul with some silly questions, I finally got an image on the screen.  I didn't actually release that demo because it was too simple and seemed to have a few problems.  And...  The image was a very basic outline of something that I might want to reuse later...

I needed to test the colour use so I made the very pretty spectrum example.  I just used a gradient brush in GIMP to paint out the gradient and converted it to 256 colours using dithering, which I don't normally do, because I thought it might be nice for this example.  However, when I tried to convert it, my tool came back with some 800 tiles.  That was impossible for my loader to handle at the time because it required that the tiles be stored in a contiguous block.  So, I reworked the image, copying sections such that it is actually comprised of only two unique sections (one of them repeated three times).  I was worried it would look terrible but I think it's okay.




[Ed: each 8x8 tile in this mode requires 64 bytes, one byte for each pixel, so 800 tiles requires 51,200 bytes, so some would need to go under IO or under ROMs to fit in the C64 memory map].
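The arithmetic in that note is easy to check. The short sketch below uses only the figures from the post, plus the 40x25 character grid that gives the 1000-tile case mentioned later. `tile_bytes` is just a helper name for the illustration.

```python
BYTES_PER_TILE = 8 * 8  # one byte per pixel in an 8x8 tile


def tile_bytes(tile_count):
    """Memory needed for tile_count unique full-colour tiles."""
    return tile_count * BYTES_PER_TILE


if __name__ == "__main__":
    print(tile_bytes(800))           # 51200
    print(tile_bytes(40 * 25))       # 64000, one unique tile per screen cell
    print(64 * 1024 - tile_bytes(40 * 25))  # 1536 bytes left in a 64KB space
```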

So the general principle was now sound but I needed to be sure that the tile reuse by foreground colour replacement was actually working.  I came up with the idea of making a little Andy Warhol tribute using the breadbox image which I had made some time ago.  For something that seems so simple, I had to struggle with GIMP to get it to do what I wanted.  I had to munge the images a few times before I got what I was looking for.  In the end, I think it's rather pretty and the colour substitution was working.  I did have a problem with the flip bits initially but I checked some information Paul gave me and corrected the problem in my tool.








Now onto the big stuff...  In starting the process I had wanted to put a 1000 tile image on the screen.  That's a unique tile for every location, a real 320x200x256 image and I wanted to have sprites, too.  I had already made the image I wanted to use (a very long process in Photoshop, let me tell you!  I recoloured the whole image - every single detail and that was just the start!).  It was actually for a little animation demo I did a long time ago...

I wanted to replicate some of that demo but my tool and loader were not up to the task in any shape or form.  Firstly, I was having to modify my loader for each individual image and secondly, my tool could only output to a contiguous block of RAM.  If you look at the memory map for the M65, you'll see that there just isn't a contiguous 64kB block of RAM that you can use when you need screen RAM and so on, too.

When I was asking Paul about it he quite casually said, "just don't make it contiguous and use whatever memory you can."  Hmm...  I don't know if he knew the complexity of what he was suggesting but I knew what I had to do.  I had to rework my tool and image format to allow for segment based mapping.  I also wanted it to be able to be processed by a generic loader because I was really over all of the complicated calculations I had to do to get the data loading loops to work and it just wasn't something that could be sold as a solution.

I built a test application that would allow me to specify free, reserved and system segments in the memory range that I could use.  It actually came together rather quickly.  Next, I planned how to change my data format and make it loadable by a generic loader.  I then incorporated my changes into my conversion tool.  That wasn't quite so easy...  I found some issues with the GUI messaging that were preventing me from knowing when the user had actually done the mapping, of all things that could go wrong.  Other than a few minor bugs, the tool side was completed more quickly than I had anticipated.  
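The idea of segment based mapping can be sketched in a few lines of Python. This is not Daniel's tool or data format, only an illustration: `place_tiles` is a hypothetical helper, the segment addresses in the usage example are invented rather than taken from the real memory map, and the 64-byte alignment of tiles is an assumption.

```python
TILE_SIZE = 64  # bytes per 8x8 tile at one byte per pixel


def place_tiles(tile_count, free_segments):
    """Assign an address to each tile, filling the free segments in order.

    free_segments is a list of (start_address, length_in_bytes) tuples.
    Reserved and system areas are simply left out of that list.
    """
    addresses = []
    for start, length in free_segments:
        end = start + length
        # Assumption: tiles must start on a 64-byte boundary.
        addr = (start + TILE_SIZE - 1) // TILE_SIZE * TILE_SIZE
        while addr + TILE_SIZE <= end and len(addresses) < tile_count:
            addresses.append(addr)
            addr += TILE_SIZE
    if len(addresses) < tile_count:
        raise MemoryError(
            f"only room for {len(addresses)} of {tile_count} tiles"
        )
    return addresses


if __name__ == "__main__":
    # Hypothetical free segments, for illustration only.
    segments = [(0x1010, 0x80), (0x4000, 0x100)]
    print([hex(a) for a in place_tiles(3, segments)])
    # ['0x1040', '0x4000', '0x4040']
```

A generic loader then only needs the list of segments and the tile data in the same order, rather than a hand-written copy loop per image.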

Next I had to write the loader.  Oh boy.  My loaders up to that point were horribly hacked so I pretty much had to start from scratch.  That was okay because the data format had changed quite significantly.  After a few calculation problems, complete and utter system failures (I'm sure it was my fault), copy-and-paste bugs and a weird "you have to do this, in this order" thing, I got the loader working.  It took me a few more hours than I'd planned for, but it didn't stop me being ecstatic at my success!

While I was waiting for something, I'd written the small sprite test that I wanted to use in the final demo.  I quickly incorporated it into my image program (and bothered Paul with silly questions again which I figured out before he got back to me) and voila!  A true 320x200x256 image with 16 colour sprites!  1000 unique tiles and 256 colours!

There is still a bug though...  In the middle of the screen, offset a little to the right, a tile is incorrect.  I have no idea why.  I'll need to consult Paul.  Also, as the sprite approaches the bottom border, it gets a strange warping effect on it.  Something Paul will need to fix.

My tool allows for arbitrary sizes of images (up to 255x255 cells).  You may be able to guess what's coming next...

I'll release the updated version of my tool and loader in a day or two.  I promised someone I would be available tomorrow and have a day away from coding.

ARGH!  I was supposed to sleep but I just had to add the music I wanted, too...  Please enjoy "Beachparty" by Zyron.  It sounds a little off pitch, though.  My humble apologies!

Edit:  I got up first thing in the morning to look at why that tile was broken.  I finally figured it out and with some direction from Paul was able to fix it. [Ed: The sprite pointer list that normally lives directly after the screen memory was in the middle of one of the tiles. Fortunately, this list can be moved around easily on the MEGA65 by writing to $D06C/D/E, so that was easily solved.]


And the following video shows all three loading and being run on the MEGA65.  Apologies for the effectively silent audio level in the video, as I don't have my decent camera here, and the microphone on my phone is half dead.


MEGA65 configuration utility / Konfigurations Programme

[Diese Post versuche ich nochmal auf Englisch und Deutsch beide vor zu stellen. Hoffentlich ist mein Deutsch nicht so schlecht, dass es noch bequem zu lesen wird. Auch danke Arndt für einige Korrigierungen dabei.]

After lots of posts full of words, here is one mostly full of pictures.  Daniel has been beavering away on a configuration utility for the MEGA65, so that you can set all the important settings, without having to load anything.  Daniel has done a great job, and freed me up to work on various other things, mostly bug fixing various little things, in the meantime.

Die letzten Posts haben alle viele Wörter gemacht, hier ist mal einer mit vielen Bildern. Daniel hat bis zum Umfallen geschuftet und ein "Konfigurations-Utility" für den MEGA65 geschrieben. Jetzt kann man alle wichtigen Einstellungen vornehmen, ohne irgendwas dafür laden zu müssen. Daniel hat das richtig gut gemacht und mir damit Gelegenheit gegeben, verschiedene Dinge voranzubringen, meist nötige Bug-Fixes. 

So first, we start by holding C= and resetting the MEGA65, and then pressing ALT while still holding C= after kickstart says "release control to continue booting". This makes the Utility Menu appear. Think of it like Batman's utility belt, only 8-bit, and generally lacking in the shark repellant department:

Als Erstes: Wenn wir beim Starten oder Resetten C= drücken, meldet das Kickstart "Release control to continue booting". Wenn stattdessen die ALT-Taste dazu gedrückt wird, erscheint das Utility-Menü. Das muss man wie Batmans Ausrüstungsgürtel vorstellen, nur in 8-bit,  und natürlich ohne die Haiabwehr-Sachen:



When we press 1 to load the new configuration utility, it quickly appears, and can be controlled by mouse or keyboard to check and set various configuration options. The following few photos show the current contents of the screens (which is likely to change over time).
 
Wenn wir die 1 drücken, kommt ziemlich schnell das Konfigurations-Utility. Es wird entweder mit der Maus oder per Tastatur bedient, um damit alle möglichen Konfigurationsoptionen einzustellen. Die nächsten Bilder zeigen den gegenwärtigen Inhalt dieses Bildschirms (was sich in Zukunft sehr wahrscheinlich noch ändern wird).







Then when you are finished checking and setting everything, you can save and exit as you wish.  There is a confirmation prompt for added comfort.

Nach dem Abschluss der Einstellungen kann man (wenn man möchte) speichern und die Konfiguration verlassen. Es gibt eine Abfrage, die einen an die verschiedenen Speichermöglichkeiten erinnert.





Friday, January 26, 2018

Testing 16-colour sprite mode

What originally started out as debugging the 16-colour sprite mode to allow for a more colourful mouse pointer in the MEGA65 configuration program ended up going a bit further.

Daniel wrote a test program for 16 colour sprites to reproduce some strange behaviour that he was seeing. The mistake in the VHDL was simple, and quickly fixed. In the process, we also tweaked the transparent colour selection, so that it is now the foreground colour of the sprite.

Having fixed that and a few other little problems, I wanted to make a more interesting test of the 16-colour sprites.  So I decided it would be fun to animate a well-known 16-colour sprite assembly from the Amiga.

The first step was to produce the sprite data in the correct format. To do this, I added a new mode to the pngprepare program in the MEGA65 source tree, that takes a PNG with 16 or fewer colours, and produces a nice binary format that can be used natively on the MEGA65, including the palette entries, and information about how tall the sprites are, so that extended height mode can be used.

Then I modified Daniel's program to use that binary format, and to animate the resulting sprites. The result is quite nice:



This uses just two of the eight sprites, side-by-side to get the 32 pixel horizontal resolution.  The sprite itself is at 320x200.

So then I started thinking about using some multiplexing to get more owls on the screen.  I can easily use 3 pairs of sprites to get three owls on without multiplexing (the 4th pair could be used, but I would have to do a bit more palette fiddling, and it is already late). As the owl is $57 pixels tall, and I can have three overlapping owls, this means one every 29 (decimal) rasters.  I could also trim the sprites vertically (at the moment the blank space above and below the owl as it moves is part of the sprite, wasting about 50% of the height in any one frame), so it would be quite easy to get more than double this number of owls on the screen.  But, let's have a few more owls anyway, for good measure:


I gave in and did the palette fiddling I mentioned, so there are four pairs of sprites, which are then multiplexed 3x over (parts are concealed by the top and bottom borders).  So there are probably 8 or 9 whole 32x86 pixel 16-colour owls worth -- all using the 8 hardware sprites, not Amiga-style BOBs.  It would also be possible to horizontally multiplex the sprites (this is easier on the MEGA65 than on the C64, in part because the CPU is 50x faster, and there is DMA to splat the sprite X positions during a raster line), and the blank space above/below the owls could also be trimmed out to optimise things, making it fairly easy to have about 4x more owls on the screen if we wanted to.  But I'll leave that fun to someone else...
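For anyone wanting to play with the numbers, the spacing and slot arithmetic from this post fits in a couple of lines of Python. This is only a back-of-the-envelope sketch. The function names are invented for the illustration, and it says nothing about how many owls are actually visible between the borders.

```python
OWL_HEIGHT = 0x57  # 87 rasters, from the post


def raster_spacing(sprite_height, overlapping):
    """Rasters between successive owls when `overlapping` owls share the height."""
    return sprite_height // overlapping


def owl_slots(sprite_pairs, reuses_per_frame):
    """Upper bound on owls per frame: each sprite pair is reused several times."""
    return sprite_pairs * reuses_per_frame


if __name__ == "__main__":
    print(raster_spacing(OWL_HEIGHT, 3))  # 29
    print(owl_slots(4, 3))                # 12 slots, some hidden by the borders
```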