Gameboy Development Forum

Discussion about software development for the old-school Gameboys, ranging from the "Gray brick" to Gameboy Color
(Launched in 2008)

You are not logged in.

Ads

#1 2019-05-02 19:07:49

bbbbbr
Member
Registered: 2019-03-04
Posts: 126

Decoding GBTD Compression - How does the String token work?

Hey All,

I'm working on a game (ZGB engine) and wanted to explore using the "GB Compression" available in GBTD export. (In theory, to save myself the time of writing my own compressor, but those time savings have probably been negated by now big_smile)

Using the source (GBCompress.Pas) for the GBTD common functions I was able to write a set_bkg_data() decompressor for all of the compression tokens (Byte, Word, "Trash") except for the String token. Does anyone have an understanding of how this works?

I looked around online, but the site linked regarding the compression scheme is not online anymore and I had trouble accessing it's entry in the Internet Archive.

The compression encoding code in GBTD for that particular token is not the easiest to follow.

As best as I can tell, it looks like the String token may encode an offset to a previous memory location where a matching run-length sequence can be found. But this understanding doesn't quite work out (or I'm not reading it right). It's also not clear why it only (on a superficial review) seems to encode for 0x00 and 0xFF.


Thanks!

Offline

 

#2 2019-05-03 02:29:05

Zalo
Member
From: Spain
Registered: 2016-07-26
Posts: 103
Website

Re: Decoding GBTD Compression - How does the String token work?

Not sure if this helps but source code for GBTD is available here

Offline

 

#3 2019-05-03 08:02:38

tmk
Member
Registered: 2017-05-01
Posts: 63
Website

Re: Decoding GBTD Compression - How does the String token work?

You can simply use existing solutions like PuCrunch or RNC. There's also LZ4 if you don't care about compression ratio that much.

Last edited by tmk (2019-05-03 08:03:37)

Offline

 

#4 2019-05-03 12:28:29

bbbbbr
Member
Registered: 2019-03-04
Posts: 126

Re: Decoding GBTD Compression - How does the String token work?

Thank you both!

I didn't see the replies until just now, so I have solved the problem before reading this.

It turns out the String token does encode a relative, negative offset (back reference) from the current memory location.

This is not particularly convenient if decoding only small chunks at a time, since the back reference can point as near back as 4 bytes earlier or as far back as the start of the Decoded output. So decoding directly into VRAM seems to be the way around that issue (which comes with it's own drawbacks).

Anyhow, here is an initial working version. It's very slow and not optimized, but hopefully easy for people to follow if they want to understand how to decompress the stream.

Code:

// THIS NEEDS TO BE NON-BANKED

#define GB_COMP_LENGTH_MASK  0x3F
#define GB_COMP_TOKEN_MASK   0xC0
#define GB_COMP_TOKEN_BYTE   0x00
#define GB_COMP_TOKEN_WORD   0x40
#define GB_COMP_TOKEN_STRING 0x80
#define GB_COMP_TOKEN_NOENCODE 0xC0 // AKA "Trash" in GBTD/GBMB compression source
#define GB_TILE_SIZE_BYTES   16

void set_bkg_data_gbtd_compressed(UINT8 first_tile, UINT8 n_tiles, UINT8 * p_tile_data) {

    UINT8 rle_val[2];
    UINT8 token;
    UINT8 rle_len;
    UINT8 rle_toggle;
    UINT8 byte_count;
    UINT8 * p_vram;
    UINT8 * p_rle_backref;
    UINT16 rle_offset;

    // Decode happens directly into VRAM because
    // the RLE String Token can reference an arbitary 
    // previous locationin read from DECOMPRESSED output

    // TODO: 0x9000u hardwired for VRAM Tile data, fix me
    p_vram = 0x9000u + ((UINT16)first_tile * GB_TILE_SIZE_BYTES);

    rle_len = 0;

    while (n_tiles) {

        // Read a tile worth of bytes
        for (byte_count = 0; byte_count < GB_TILE_SIZE_BYTES; byte_count++) {

            // Load the next RLE token if needed
            if (rle_len == 0) {
                // Read a RLE token
                // First byte should always be a token
                token   = (*p_tile_data) & GB_COMP_TOKEN_MASK;
                rle_len = ((*p_tile_data) & GB_COMP_LENGTH_MASK) + 1;

                p_tile_data++; // Move to next ENCODED byte

                rle_toggle = 0; // Reset RLE decoded byte flipflop

                // Reading a RLE Length zero = END of encoded data
                // Then break out of loop and quit
                if (rle_len == 0) {
                    n_tiles = 0;
                    break;
                }

                // Read how to handle ENCODED RLE Value
                switch (token) {
                    case GB_COMP_TOKEN_BYTE:
                        // Load only one byte
                        rle_val[0] = *p_tile_data;
                        p_tile_data++;
                        break;

                    case GB_COMP_TOKEN_WORD:
                        // If token is Word double the RLE decode length
                        rle_len *= 2;

                        // Load LS byte then MS Byte
                        rle_val[0] = *p_tile_data;
                        p_tile_data++;
                        rle_val[1] = *p_tile_data;
                        p_tile_data++;
                        break;

                    case GB_COMP_TOKEN_STRING:
                        // This token: copy N bytes from negative offset of current
                        // DECODED memory location (back reference)
                        rle_offset = (UINT16)(*p_tile_data);
                        p_tile_data++;
                        rle_offset |= (UINT16)((*p_tile_data)<<8);
                        p_tile_data++;
                        // Convert input from from Signed to Unsigned
                        rle_offset = (rle_offset ^ 0xFFFF) + 1;

                        // Assign the string back reference relative to the current buffer pointer
                        p_rle_backref = p_vram - (unsigned char *)rle_offset;
                        break;

                    case GB_COMP_TOKEN_NOENCODE: // AKA "Trash Bytes" in GBTD
                        // This token : copy next N bytes directly from ENCODED input
                        break;
                }
            }


            while((STAT_REG & 0x3) == 0x03); // VRAM (memory at 8000h-9FFFh) is accessible during Mode 0-2

            // Copy the decoded byte into VRAM
            switch (token) {
                case GB_COMP_TOKEN_BYTE:
                    // Copy from cached repeating RLE value
                    *p_vram = rle_val[0];
                    break;

                case GB_COMP_TOKEN_WORD:
                    // Copy from cached repeating RLE value
                    // Toggle between MS/LS bytes input values
                    *p_vram = rle_val[rle_toggle % 2];
                    rle_toggle++;
                    break;

                case GB_COMP_TOKEN_STRING:
                    // Copy byte from the backreferenced VRAM address
                    // Then increment backreference to next VRAM byte
                    *p_vram = *p_rle_backref;
                    p_rle_backref++;
                    break;

                case GB_COMP_TOKEN_NOENCODE:
                    // Copy directly from encoded input
                    *p_vram = *p_tile_data;
                    p_tile_data++;
                    break;
            }

            p_vram++; // Move to next VRAM byte

            rle_len--; // Decrement number of RLE bytes remaining
        }

        n_tiles--; // Decrement number of remaining tiles to decode
        // first_tile++; // Move to next tile in VRAM
    }
}


void ZInitScrollTilesColor_RLE(UINT8 first_tile, UINT8 n_tiles, UINT8* tile_data, UINT8 tile_bank, UINT8* palette_entries) {
    UINT8 i;

    PUSH_BANK(tile_bank);
    set_bkg_data_gbtd_compressed(first_tile, n_tiles, tile_data);
    for(i = first_tile; i < first_tile + n_tiles; ++i) {
        scroll_tile_info[i] = palette_entries ? palette_entries[i] : 0;
    }
    POP_BANK;
}

Last edited by bbbbbr (2019-05-03 12:30:33)

Offline

 

Board footer

Powered by PunBB
© Copyright 2002–2005 Rickard Andersson