Encoding details for Vorpal later: fully documented

It’s been a while since I dissected Vorpal later, and I was planning to document its low-level technical details in the official documentation for CBM Flux Studio.
However, I might not be able to finish version 1.0 any time soon due to work-related commitments, so I thought I would share here the documentation I have written so far.

Worth noting: the encoding details below were gathered from G64 images that I got for research purposes, which were created with KryoFlux and DTC from original NTSC disks (sometimes still sealed in their original packaging at the time they were sampled). Disk images created from non-original disks (especially if they were initially written using patched images), or for different video standards, might present encoding discrepancies when compared with what I mention here. You’ve been warned.

Let’s start from the drive code that looks for the soft sync of a custom sector, as part of the code can be misleading.

; **************************************************************
; * Read next block on the current track (entry point: @S058C) *
; **************************************************************

T0585   .byte $F8 ; %11111000, i.e. starts with five 1-bits

B0586   BIT $1C00     
        CLV           
        BMI B05B3     ; Branch if the block-sync flag is not set, meaning 
                      ; that there aren't 10 one bits in a row or more

S058C   LDX #$21      ; Read 33 bytes interleaved 5 times = 165 ($A5) bytes in total

        LDY #$EA      ; Opcode for NOP that optionally disables ROR instructions below

        LDA $1C01     ; Reset the data latch
        CLV           ; Clear the byte ready flag

B0594   BVC B0594     

        LDA $1C01     ; Read one byte
        EOR #$FF      ; Invert all bits
        BEQ B0586     ; Branch if we originally read %11111111

        BIT T0585     ; Bitwise AND to %11111000 = $F8
        CLV           
        BEQ B05B3     ; Branch if we originally read %11111xxx

B05A3   AND #$0F      ; Select lower nibble
        BNE B0594     ; Loop unless we originally read %xxxx1111

        BVC *+0       ; Wait for another byte to be ready
        CLV           

        LDA $1C01     
        EOR #$FF      
        BMI B05A3     ; Branch if bit 7 was clear
        BEQ B0594     ; Branch if it's another $FF

B05B3   STA $92       ; Save byte for framing

        LSR           ; Test bit 0
        BCS B05BA     ; If clear in the original value, don't enable shifting with ROR

        LDY #$6A      ; Opcode for ROR @M05C3

B05BA   BVC B05BA     ; Wait for another byte to be ready
        CLV           

        STY M05C3     ; Self mod

The above code is definitely optimized for size and speed, both of which are requirements when disk reading code is running in a CBM disk drive. Unfortunately for us, it also means it is not very readable, so I am going to explain exactly what it does.
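To make the control flow easier to follow, here is a rough Python model of the listing above, from S058C up to the STA $92 at B05B3. It is only a sketch under a few assumptions: it ignores timing and the byte-ready handshake, it treats the disk as an iterable of already-latched bytes, and `hw_sync` is a hypothetical stand-in for testing bit 7 of $1C00 (it returns True when the hardware block-sync is active). The function and parameter names are mine, not Vorpal's.

```python
def find_framing_byte(latched_bytes, hw_sync=lambda: False):
    """Return the inverted byte the drive code would store at $92,
    or None if the input runs out first."""
    stream = iter(latched_bytes)
    try:
        while True:
            a = next(stream) ^ 0xFF           # B0594: read one byte, EOR #$FF
            if a == 0:                        # original byte was %11111111
                if not hw_sync():             # B0586: BMI taken, no hardware sync
                    return a
                continue                      # hardware sync seen: restart at S058C
            if (a & 0xF8) == 0:               # BIT T0585: original was %11111xxx
                return a
            while (a & 0x0F) == 0:            # B05A3: original was %xxxx1111
                a = next(stream) ^ 0xFF       # wait for and read the next byte
                if a & 0x80:                  # BMI B05A3: original bit 7 was clear
                    continue
                if a == 0:                    # BEQ B0594: another $FF
                    break
                return a                      # falls through to B05B3
    except StopIteration:
        return None
```

For example, `find_framing_byte([0x5F, 0xF5])` returns `0x0A`: the first byte ends in four one bits and the second starts with a one bit without being $FF. In the listing, bit 0 of the returned value then decides whether the ROR opcode is patched in at M05C3 (it is when that bit is clear).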

We are dealing with very short block syncs that don’t trigger the hardware block-sync signal that I discussed in a previous post, which requires at least 10 one bits in a row. In fact, Vorpal later blocks use sync patterns of only 8 one bits, and data is encoded according to a custom Group Coded Recording (GCR) scheme that ensures not only that no more than two zeros occur in a row, but also that no more than four ones do. I will get back to the custom GCR scheme later on. Let’s finish the description of the soft sync pattern first.

I mentioned sync patterns of 8 one bits, but framing is achieved by using an alternating pattern just after each soft sync. Furthermore, the pre-sync pattern for block 0 is different from that of the other blocks in the same track:

  • Block 0 : 00 11111111 0101010
  • Block n : 10 11111111 0101010
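As a minimal sketch of how a tool might locate these two pre-sync patterns, the snippet below searches a string of '0' and '1' characters (an assumed representation of an already-extracted track bitstream, not flux data). The names `PRE_SYNC` and `find_pre_syncs` are hypothetical, and the check deliberately ignores whatever precedes the pattern.

```python
import re

# One distinguishing bit, then 0, eight 1 bits and the alternating framing bits.
PRE_SYNC = re.compile(r"([01])0" + "1" * 8 + "0101010")

def find_pre_syncs(bits):
    """Return (bit_offset, is_block_0) for each pre-sync in a '0'/'1' string."""
    return [(m.start(), m.group(1) == "0") for m in PRE_SYNC.finditer(bits)]
```

A leading 0 bit marks block 0 and a leading 1 bit marks any other block, which is the only difference between the two patterns listed above.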

Each pre-sync follows a start mark, 00110011 (0x33), which comes after a lead-in pattern, 1010101001 (the GCR value for 0x80), that repeats about 80 times on track 1. When each track is mastered, that lead-in is partly overwritten by the trailing sequence, which is an 8-bit repeating pattern, 10110101 (0xB5), followed by a stop mark with four consecutive bits set, 10111101 (0xBD).

Each block of data contains a 128-byte payload that corresponds to 160 bytes once converted to GCR (128 / 8 * 10 = 160). Additionally, an XOR check-byte is included at the end of each block, followed by the block ID itself, which is not GCR encoded as it is not transferred back to the host computer. In fact, the GCR decoding in Vorpal later occurs during the storage in the disk drive RAM and the subsequent transmission process, using the framing information gathered at the moment the soft sync and repeating pattern were read in.
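The size arithmetic and the idea of an XOR check-byte can be sketched in a few lines of Python. Note the assumptions: which bytes the check-byte actually covers, and its starting value, are not stated here, so `xor_check` simply XORs whatever it is given starting from zero. Both function names are hypothetical.

```python
from functools import reduce
from operator import xor

PAYLOAD_BYTES = 128

def gcr_length(raw_bytes):
    """Bytes on disk for raw_bytes of data: every 4 bits become 5."""
    return raw_bytes * 10 // 8

def xor_check(data):
    """XOR of all bytes, starting from 0 (assumed convention)."""
    return reduce(xor, data, 0)
```

With these, `gcr_length(PAYLOAD_BYTES)` gives the 160 bytes mentioned above.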

I am not going to paste here the annotated assembly code for the GCR decoding part as it is quite specialized and not so easy to follow. I will include it in my book instead.

The way the block ID is encoded is quite clever. Again, bear in mind that the block ID is not transmitted back to the host computer but used by the drive code. Because part of the GCR decoding for data bytes happens during the transmission to the host computer, if the block ID were GCR encoded, the drive code would need to include an additional GCR decoder, and that would not fit in the available RAM.
So, if you are not using GCR because you don’t want to implement the slow decoding logic in the disk drive RAM or reuse the even slower one available in the disk drive ROM, how can you avoid having more than two zero bits in a row when you encode a 6-bit value (specifically a value between 0x00 and 0x2E for Vorpal later)? You can just group the original bits two by two and add a one bit between the groups: job done. That’s, essentially, what Vorpal later does.
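Here is a minimal sketch of that idea in Python. The exact on-disk bit layout is an assumption on my part: I keep the three 2-bit groups in their original order, most significant first, and put a single 1 bit between neighbouring groups, which gives an 8-bit result. The function names are hypothetical.

```python
def encode_block_id(block_id):
    """Spread a 6-bit value as aa1bb1cc so no more than two 0 bits are adjacent."""
    if not 0 <= block_id <= 0x3F:
        raise ValueError("block ID must fit in 6 bits")
    hi = (block_id >> 4) & 3
    mid = (block_id >> 2) & 3
    lo = block_id & 3
    return (hi << 6) | 0x20 | (mid << 3) | 0x04 | lo

def decode_block_id(encoded):
    """Drop the two separator bits and rejoin the three 2-bit groups."""
    if (encoded & 0x24) != 0x24:
        raise ValueError("separator bits are not set")
    return ((encoded >> 6) & 3) << 4 | ((encoded >> 3) & 3) << 2 | (encoded & 3)
```

Under this assumed layout, decoding needs only a few shifts and masks, which fits the point about avoiding a second GCR decoder in drive RAM.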

Finally, here’s the GCR table used by Vorpal later:

01001 = 0x09 -> 0x0
01010 = 0x0A -> 0x1
01011 = 0x0B -> 0x2
01101 = 0x0D -> 0x3
01110 = 0x0E -> 0x4
01111 = 0x0F -> 0x5
10010 = 0x12 -> 0x6
10011 = 0x13 -> 0x7
10101 = 0x15 -> 0x8
10110 = 0x16 -> 0x9
10111 = 0x17 -> 0xA
11001 = 0x19 -> 0xB
11010 = 0x1A -> 0xC
11011 = 0x1B -> 0xD
11101 = 0x1D -> 0xE
11110 = 0x1E -> 0xF
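For tool writers, the base table above translates directly into a lookup. The sketch below decodes a string of '0' and '1' characters (an assumed representation) ten bits at a time, high nybble first, which matches the lead-in pattern 1010101001 being the GCR value for 0x80. It covers only the base table, and the names are hypothetical.

```python
GCR_TO_NIBBLE = {
    "01001": 0x0, "01010": 0x1, "01011": 0x2, "01101": 0x3,
    "01110": 0x4, "01111": 0x5, "10010": 0x6, "10011": 0x7,
    "10101": 0x8, "10110": 0x9, "10111": 0xA, "11001": 0xB,
    "11010": 0xC, "11011": 0xD, "11101": 0xE, "11110": 0xF,
}

def decode_gcr(bits):
    """Decode a '0'/'1' string, 10 bits per byte, using the base table only."""
    if len(bits) % 10:
        raise ValueError("bit count must be a multiple of 10")
    out = bytearray()
    for i in range(0, len(bits), 10):
        hi = GCR_TO_NIBBLE[bits[i:i + 5]]       # KeyError on an unknown code
        lo = GCR_TO_NIBBLE[bits[i + 5:i + 10]]
        out.append((hi << 4) | lo)
    return bytes(out)
```

For instance, `decode_gcr("1010101001")` yields the single byte 0x80.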

I said that the encoding ensures no more than four 1 bits occur in a row in the encoded stream, but the above encoding seems to suggest that if we have the nybble 0x5 followed by the nybble 0xF then the resulting GCR string will have 8 one bits occurring in a row!
Not so fast. In fact, the following alternative GCR encoding also applies to deal with those cases:

01100 = 0x0C -> 0x5 (used if the code after 0x5 starts with 1)
10100 = 0x14 -> 0xA (used if the code after 0xA starts with 1)
00101 = 0x05 -> 0xE (used if the code after 0xE starts with 0)
00110 = 0x06 -> 0xF (used if the code before 0xF ends with 1)
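Since the alternative codes do not collide with any base code, a decoder can simply merge both tables. The sketch below does that and adds a small run-length check that can be used to validate a decoded bitstream against the "no more than two zeros, no more than four ones" rule. It works on strings of '0' and '1' characters, and all names are hypothetical. It does not attempt to reproduce the encoder's choice between base and alternative codes.

```python
import re

# Base codes in nybble order (index = decoded value), then the alternatives.
BASE_CODES = [0x09, 0x0A, 0x0B, 0x0D, 0x0E, 0x0F, 0x12, 0x13,
              0x15, 0x16, 0x17, 0x19, 0x1A, 0x1B, 0x1D, 0x1E]
ALT_CODES = {0x0C: 0x5, 0x14: 0xA, 0x05: 0xE, 0x06: 0xF}

DECODE = {code: nibble for nibble, code in enumerate(BASE_CODES)}
DECODE.update(ALT_CODES)

def decode_nibbles(bits):
    """Decode a '0'/'1' string five bits at a time into a list of nybbles."""
    if len(bits) % 5:
        raise ValueError("bit count must be a multiple of 5")
    return [DECODE[int(bits[i:i + 5], 2)] for i in range(0, len(bits), 5)]

def longest_run(bits, bit):
    """Length of the longest run of `bit` ('0' or '1') in the string."""
    return max((len(run) for run in re.findall(bit + "+", bits)), default=0)
```

Taking the 0x5 followed by 0xF case: the base codes give 0111111110, with a run of eight one bits, while the alternative pair 01100 11110 decodes to the same two nybbles with a longest run of four.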

As you can appreciate, in order to choose between the base encoding and the alternative one, the original data had to be pre-processed, or the writing of each code had to be deferred until its neighbours were known.

That’s all about this one. Now it’s your move, guys. Try and share the full details of another disk loader of your choice: V-MAX!, Rapidlok, Vorpal older, etc.
I don’t think an annotated disassembled listing is going to cut it: try to come up with a few paragraphs that give the bigger picture, possibly fleshing out details with snippets of code, if necessary, just the way I did here.

Remember this: we are not trying to understand the code itself, but rather what it is trying to do with limited tech available in the 80s. I could have provided 4 pages of annotated disassembled code for the GCR decoding process, rather than the above GCR tables, but that would have been pretty useless. What matters with Vorpal later are those GCR tables: we can wow at the actual implementation of the decoder all day long, but that doesn’t help writing tools to actually decode and validate the data. The GCR tables do.

About Luigi Di Fraia

I am a Senior DevOps Engineer so I get to work with the latest technologies and open-source software. However, in my private time I enjoy retro-computing.