Difference between revisions of "Quirks by difficulty"

From GbdevWiki
(let's build a test game)
 
(Unranked: di halt support)
 
(11 intermediate revisions by 3 users not shown)
Latest revision as of 00:07, 23 November 2020

I, PinoBatch, am planning a broad-spectrum test ROM for Game Boy disguised as a simple platform game. I intend for it to cover 100 different CPU, PPU, APU, and RAM issues.

Criteria

I'd like to rank each tested behavior by a combination of several factors:

Impact
Things affecting more popular games would rank earlier than things affecting only some obscure Japanese release. Things causing games to fail to boot would rank earlier than (say) noticeable scoring or RNG differences, in turn earlier than things causing only visual glitches. Cases where an emulated game might write out of bounds to host memory should also rank early. Areas where an emulator is too lenient, causing homebrew games to appear to work but fail on hardware, generally rank later. Things not used by any licensed game or notable homebrew game, such as the GBC PCM registers, might rank after the top 100.
Ease of fixing
Expected programmer time and impact on emulation speed from fixing an inaccuracy. For example, things requiring mid-scanline cycle accuracy would be later on the list, as they're less practical to achieve in emulators for MCUs or retro PCs or consoles. Things only BGB and SameBoy get right can rank near the end.

The target audience is maintainers and users of stable emulators, for whom the ROM serves as an example of a game that doesn't work. Some behaviors have so much impact that they're hard to test in anything resembling a game. Thus we can treat any behavior needed to progress through the boot ROM, menu, and early in-game scenes of iconic monochrome (DMG) games (such as Tetris, Super Mario Land, and some version of Pokémon) as a prerequisite. In other words, we're looking for things that widely used emulators get wrong, not things that block typical users from even considering using an emulator. There are other, more focused tests for early emulator development.

The minimum and maximum achievable scores should be the same on Game Boy Color (GBC) and DMG. Thus for each GBC-only test, there should be a corresponding test of a DMG-only behavior. If one platform variant has significantly more test-worthy quirks, particularly GBC DMA, it's safe to balance the count by repeating, on the other variant, a test confirming that it doesn't support the feature in the first place.
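
The score-parity rule above could be checked mechanically once a test list exists. The following is only a sketch: the table layout, the platform tags, and the sample entries are hypothetical and are not the planned test list.

```python
from collections import Counter

# Hypothetical test table: (name, platform), where platform is
# "both", "dmg" (DMG-only), or "gbc" (GBC-only). Entries are illustrative.
TESTS = [
    ("vram_write_mode3", "both"),
    ("stat_write_ff_glitch", "dmg"),
    ("hdma_start_in_hblank", "gbc"),
]


def max_score(tests, platform):
    """Count the tests that can award a coin on the given platform."""
    counts = Counter(p for _, p in tests)
    return counts["both"] + counts[platform]


def scores_balanced(tests):
    """True if DMG and GBC can reach the same maximum score."""
    return max_score(tests, "dmg") == max_score(tests, "gbc")
```

Adding a GBC-only entry without a DMG-only counterpart makes `scores_balanced` return `False`, which is the situation the paragraph above is meant to avoid.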

The ROM shows whether each test passed or failed by either collecting a coin or stopping the coin's spinning. For this reason, the CPU needs to be able to judge whether each test passed or failed, making some behaviors untestable. This includes video output (such as palettes and pixel response time), audio output (such as noise LFSR lockup, as DMG lacks PCM registers), and some of the PPU-LCD desyncs caused by mid-scanline window manipulation. Sometimes a CPU-visible behavior can be used as a proxy for a graphical quirk, such as measuring the effect of the 10-sprite limit on mode 3 time. In addition, I'd prefer tests that complete within one or two frames (at most about 35,000 M-cycles), not minutes-long exhaustive tests like those in ZEXALL. Slightly longer tests can be justified as "digging" for coins by crouching on an actuator, standing back up, and waiting for the coin to spawn.
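
The frame budget mentioned above works out as follows. This is a small sketch assuming single-speed timing of 114 M-cycles per scanline and 154 scanlines per frame (144 visible plus 10 in vertical blank); the function name `fits_budget` is made up for illustration.

```python
# Assumed constants: 114 M-cycles per scanline (single speed) and
# 154 scanlines per frame (144 visible plus 10 in vertical blank).
M_CYCLES_PER_SCANLINE = 114
SCANLINES_PER_FRAME = 154
M_CYCLES_PER_FRAME = M_CYCLES_PER_SCANLINE * SCANLINES_PER_FRAME


def fits_budget(test_m_cycles, max_frames=2):
    """True if a test finishes within max_frames frames of CPU time."""
    return test_m_cycles <= max_frames * M_CYCLES_PER_FRAME
```

Under these assumptions one frame is 17,556 M-cycles, so two frames give 35,112, which matches the "about 35,000" figure.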

The test is one ROM, as if it were a single released game. Testing differences between MBC1 and MBC5 mappers may not be practical unless the test is worked as a game and its sequel.

I'd like to pick 10 of the easiest and most impactful things that NO$GMB gets wrong and that don't differ between DMG and GBC. Failing them would keep the player from proceeding past the initial section of the map, which requires at least 5 out of 10 coins to complete.

Ranked

To be written

Unranked

Some of these tests are specified in broad strokes. I encourage emulator developers to explain specific problems they ran into implementing these instructions in order to narrow down what to watch out for.

  • Writes to VRAM are processed in modes 0, 1, and 2, and ignored in mode 3.
  • Writes to OAM are processed in modes 0 and 1, and something else happens in modes 2 and 3.
  • OAM DMA is processed even during mode 3.
  • OAM DMA blocks ROM access (Does it on GBC?)
  • Writing STAT acts as if $FF were written for one cycle, if and only if running on DMG.
  • 114 M-cycles per scanline
  • Each sprite extends mode 3 by about 8 cycles. (Lenient)
  • More strict mode 3 duration timing based on sprite X in relation to SCX.
  • Sprites beyond the tenth do not extend mode 3.
  • Sprites extend mode 3 even if disabled in LCDC on GBC and do not on DMG.
  • Mode 3 duration at tail end of OAM DMA
  • $D000 WRAM banking follows pattern 1, 1, 2, 3, 4, 5, 6, 7, ... (GBC) or 1, ... (DMG).
  • daa with select 0-9 inputs, including half carry
  • daa with select A-F inputs
  • daa with additional A-F inputs
  • Timing of serial interrupt with 8.2 kHz internal clock
  • N and H flags after various operations
  • APU length counters (NR11, NR21, NR31, NR41) expire at the correct time (NR52) (may require digging)
  • LCD off/on behavior: values of STAT, LY, etc. and no interrupts (May require extra care not to harm DMG/MGB screen)
  • Writing any value to DIV should reset it
  • Correct count of M-cycles to increase DIV
  • A DIV reset should restart the count toward the next DIV increase from that exact cycle
  • DIV reset at right time should still increase TIMA
  • DIV reset influences APU length counters
  • TIMA reload and interrupt are delayed by 1 cycle
  • Higher bits of IE/IF should have a fixed value
  • (GBC) 228 M-cycles per scanline in double-speed mode
  • (GBC) Timing of STOP instruction when switching to double speed
  • (GBC) Timing of STOP instruction when switching to single speed
  • (GBC) HDMA targeted to anything that is not VRAM
  • (GBC) HDMA ignores low address bits
  • (GBC) HDMA overflow stop
  • (GBC) HDMA start during HBlank
  • SB read back seeing shift happening
  • Enabling STAT IRQ for the current mode while in that mode causes immediate IRQ
  • halt with IME off waits for an interrupt
  • halt with IME off and a pending interrupt causes the next instruction byte to be read twice: twice as an opcode or as an opcode and its operand
  • OAM bug: Doing 16-bit inc or dec during mode 2 with register pairs in $FE00-$FEFF causes stray writes to OAM on DMG and doesn't on GBC
  • Echo RAM at $E000-$FDFF
  • di is immediate, and ei is delayed by one instruction
  • Joypad interrupt (timing thereof should probably be one of the last tests to avoid having to cut away to Telling LYs as a minigame)
  • Delay between switching the key matrix and the buttons showing up (requires digging or a ladder; may differ among DMG, SGB, GBC)
  • ld hl, sp+(-128) and other negative values
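
Two of the CPU arithmetic items in the list above (daa and ld hl, sp+e8 with negative offsets) can be modeled in a few lines. The sketch below follows the commonly described behavior of the Game Boy CPU and is meant as a reference model for comparing against an emulator, not as the test ROM's own code. The function names and the tuple return layout are assumptions made for illustration. For ld hl, sp+e8, the Z and N flags are cleared and are therefore not returned.

```python
def daa(a, n, h, c):
    """Return (a, z, n, h, c) after daa, given A and the N, H, C flags."""
    if not n:
        # After an addition: fix the high digit first, then the low digit.
        if c or a > 0x99:
            a += 0x60
            c = True
        if h or (a & 0x0F) > 0x09:
            a += 0x06
    else:
        # After a subtraction: only the incoming flags decide the fix.
        if c:
            a -= 0x60
        if h:
            a -= 0x06
    a &= 0xFF
    # N is unchanged, H is always cleared, Z reflects the new A.
    return a, a == 0, n, False, c


def ld_hl_sp_e8(sp, e8):
    """Return (hl, h, c) for ld hl, sp+e8, with e8 as a raw byte 0-255."""
    offset = e8 - 0x100 if e8 & 0x80 else e8
    hl = (sp + offset) & 0xFFFF
    # H and C come from adding the raw byte to the low byte of SP,
    # even when the offset is negative.
    h = (sp & 0x0F) + (e8 & 0x0F) > 0x0F
    c = (sp & 0xFF) + e8 > 0xFF
    return hl, h, c
```

For example, `daa(0x3C, False, False, False)` models the correction after adding BCD 15 and 27, and `ld_hl_sp_e8(0x0080, 0x80)` models sp+(-128) with SP at $0080, where the carry flag is set even though the 16-bit result is $0000.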

Probably out of the top 100:

  • Undocumented GBC registers (PCM, etc.)
  • Mode 1 of MBC1 (writing 0 to $2000, behavior of $6000, etc.) because of tradeoff with MBC5 testing
  • Behaviors most likely to differ among DMG revisions or among GBC revisions, such as GBC $FEA0-$FEFF