Difference between revisions of "Quirks by difficulty"
(Difficulty of making mapper tests) |
(Incorporate suggestions by nitro2k01 and others) |
Revision as of 09:45, 18 November 2020
I, PinoBatch, am planning a broad-spectrum test ROM for Game Boy disguised as a simple platform game. I intend for it to cover 100 different CPU, PPU, APU, and RAM issues.
Criteria
I'd like to rank each tested behavior by a combination of several factors:
- Impact
- Things affecting more popular games would rank earlier than things affecting only some obscure Japanese release. Things causing games to fail to boot would rank earlier than (say) noticeable scoring or RNG differences, in turn earlier than things causing only visual glitches. Cases where an emulated game might write out of bounds to host memory should also rank early. Areas where an emulator is too lenient, causing homebrew games to appear to work but fail on hardware, generally rank later. Things not used by any licensed game or notable homebrew game, such as the GBC PCM registers, might rank after the top 100.
- Ease of fixing
- Expected programmer time and impact on emulation speed from fixing an inaccuracy. For example, things requiring mid-scanline cycle accuracy would be later on the list, as they're less practical to achieve in emulators for MCUs or retro PCs or consoles. Things only BGB and SameBoy get right can rank near the end.
The target audience is maintainers and users of stable emulators, who can use the ROM as an example of a game that doesn't work. Some behaviors have so much impact that they're hard to test in anything resembling a game. We can therefore treat any behavior needed to progress through the boot ROM, menu, and early in-game scenes of iconic monochrome (DMG) games (such as Tetris, Super Mario Land, and some version of Pokémon) as a prerequisite. In other words, we're looking for things that widely used emulators get wrong, not things that block typical users from even considering using an emulator. There are other, more focused tests for early emulator development.
The minimum and maximum achievable scores should be the same on Game Boy Color (GBC) and DMG. Thus for each GBC-only test, there should be a corresponding test of a DMG-only behavior. If one platform variant has significantly more test-worthy quirks, particularly GBC DMA, it's safe to balance them on the other variant with repeated tests that it doesn't support the feature in the first place.
The CPU shows whether each test passed or failed by either collecting a coin or stopping its spinning. For this reason, the CPU needs to be able to judge whether each test passed or failed, making some behaviors untestable. This includes video output (such as palettes and pixel response time), audio output (such as noise LFSR lockup, as DMG lacks PCM registers), and some of the PPU-LCD desyncs caused by mid-scanline window manipulation. In addition, I'd prefer tests that complete within one or two frames (at most about 35,000 M-cycles), not minutes-long exhaustive tests like those in ZEXALL. Slightly longer tests can be justified as "digging" for coins by crouching on an actuator, standing back up, and waiting for the coin to spawn.
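The frame budget can be checked with quick arithmetic. This is only a sketch: it assumes 154 scanlines per frame (144 visible plus 10 of vertical blank), a figure not stated above, and uses the 114 M-cycles per scanline listed under Unranked. The names are made up for illustration.

```python
# Assumption: 154 scanlines per frame; this figure is not given in the text.
SCANLINES_PER_FRAME = 154
# M-cycles per scanline at normal speed; double speed on GBC doubles it.
M_CYCLES_PER_SCANLINE = 114


def frame_budget(frames, double_speed=False):
    """Return how many M-cycles the CPU gets during `frames` whole frames."""
    per_line = M_CYCLES_PER_SCANLINE * (2 if double_speed else 1)
    return frames * SCANLINES_PER_FRAME * per_line


if __name__ == "__main__":
    print(frame_budget(1))  # 17556
    print(frame_budget(2))  # 35112
```

Under these assumptions, a test that has to fit in two frames gets roughly 35,000 M-cycles at normal speed.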
The test is one ROM, as if one game was released. Testing differences between MBC1 and MBC5 mappers may not be practical unless the test is worked as a game and its sequel.
I'd like to pick 10 of the easiest and most impactful things that NO$GMB gets wrong and that don't differ between DMG and GBC. Failing them would keep the player from proceeding past the initial section of the map, which requires at least 5 out of 10 coins to complete.
Ranked
- To be written
Unranked
Some of these tests are specified in broad strokes. I encourage emulator developers to explain specific problems they ran into implementing these behaviors in order to narrow down what to watch out for.
- Writes to VRAM are processed in modes 0, 1, and 2, and ignored in mode 3.
- Writes to OAM are processed in modes 0 and 1, and something else happens in modes 2 and 3.
- OAM DMA is processed even during mode 3.
- OAM DMA blocks ROM access (Does it on GBC?)
- Writing STAT acts as if $FF were written for one cycle, if and only if running on DMG.
- 114 M-cycles per scanline
- Each sprite extends mode 3 by about 8 cycles. (Lenient)
- More strict mode 3 duration timing based on sprite X in relation to SCX.
- $D000 WRAM banking follows pattern 1, 1, 2, 3, 4, 5, 6, 7, ... (GBC) or 1, ... (DMG).
- daa with select 0-9 inputs, including half carry
- daa with select A-F inputs
- daa with additional A-F inputs
- Timing of serial interrupt with 8.2 kHz internal clock
- N and H flags after various operations
- APU length counters (NR11, NR21, NR31, NR41) expire at the correct time (NR52) (may require digging)
- LCD off/on behavior: values of STAT, LY, etc. and no interrupts (May require extra care not to harm DMG/MGB screen)
- DIV write any value should reset
- Correct count of M-cycles to increase DIV
- DIV reset should reset the div increase from that exact cycle
- DIV reset at right time should still increase TIMA
- DIV reset influences APU length counters
- TIMA reload and interrupt is delayed by 1 cycle
- Higher bits of IE/IF should be fixed value
- (GBC) 228 M-cycles per scanline
- (GBC) Timing of STOP instruction when switching to double speed
- (GBC) Timing of STOP instruction when switching to single speed
- (GBC) HDMA targeted to anything that is not VRAM
- (GBC) HDMA overflow stop
- (GBC) HDMA start during HBlank
- SB read back seeing shift happening
- Enabling STAT IRQ for the current mode while in that mode
- halt with IME off and a pending interrupt causes the next instruction byte to be read twice: twice as an opcode or as an opcode and its operand
- OAM bug: Doing 16-bit inc or dec during some mode with register pairs in $FE00-$FEFF causes stray writes to OAM on DMG and doesn't on GBC
- Echo RAM at $E000-$FDFF
- di is immediate, and ei is delayed by one instruction
- Joypad interrupt (timing thereof should probably be one of the last tests to avoid having to cut away to Telling LYs as a minigame)
- Delay between switching the key matrix and the buttons showing up (requires digging or a ladder; may differ among DMG, SGB, GBC)
- ld hl, sp+(-128) and other negative values
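To make the DIV and TIMA items in the list above more concrete, here is a rough sketch of the commonly documented timer model. It assumes DIV is the upper 8 bits of a 16-bit counter that advances every T-cycle, and that TIMA increments on a falling edge of a TAC-selected counter bit. None of this is spelled out in the list itself. The class and method names are hypothetical, and the delayed TIMA reload and interrupt are deliberately left out.

```python
class Timer:
    """Simplified timer model; TIMA overflow reload/interrupt is omitted."""

    # Assumed mapping from TAC bits 1-0 to the counter bit that clocks TIMA.
    TAC_BIT = {0: 9, 1: 3, 2: 5, 3: 7}

    def __init__(self):
        self.counter = 0  # 16-bit counter of T-cycles; DIV is its high byte
        self.tima = 0
        self.tac = 0      # bit 2 enables TIMA, bits 1-0 select the clock

    @property
    def div(self):
        return self.counter >> 8

    def _signal(self):
        # The selected counter bit, gated by the TAC enable bit.
        enabled = (self.tac >> 2) & 1
        bit = (self.counter >> self.TAC_BIT[self.tac & 3]) & 1
        return bit & enabled

    def _set_counter(self, value):
        before = self._signal()
        self.counter = value & 0xFFFF
        if before and not self._signal():
            # Falling edge: TIMA increments (it simply wraps in this sketch).
            self.tima = (self.tima + 1) & 0xFF

    def tick_m_cycle(self):
        for _ in range(4):  # 4 T-cycles per M-cycle
            self._set_counter(self.counter + 1)

    def write_div(self, value):
        # The written value is ignored; the whole counter resets to zero.
        self._set_counter(0)


if __name__ == "__main__":
    t = Timer()
    t.tac = 0b101            # enabled, clocked from counter bit 3
    t.tick_m_cycle()
    t.tick_m_cycle()         # counter is now 8, so bit 3 is high
    t.write_div(0x55)        # the reset produces a falling edge
    print(t.div, t.tima)     # 0 1
```

In this model, writing DIV while the selected bit is high bumps TIMA, and the next DIV increment arrives 64 M-cycles after the write rather than on the old schedule.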
Probably out of the top 100:
- Undocumented GBC registers (PCM, etc.)
- Mode1 of MBC1 (writing 0 to $2000, behavior of $6000, etc.) because of tradeoff with MBC5 testing
- Behaviors most likely to differ among DMG revisions or among GBC revisions, such as GBC $FEA0-$FFFF
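For the MBC1 item, the following sketch shows how mode 1 changes what is mapped, based on my understanding of the commonly documented MBC1 behavior rather than on anything stated above. The function name, the parameter names, and the `rom_banks` default are hypothetical; `bank1` stands for the 5-bit value written to $2000-$3FFF, `bank2` for the 2-bit value written to $4000-$5FFF, and `mode` for the bit written to $6000-$7FFF.

```python
def mbc1_mapping(bank1, bank2, mode, rom_banks=128):
    """Return (ROM bank at $0000-$3FFF, ROM bank at $4000-$7FFF, RAM bank).

    Assumes rom_banks is a power of two, so the modulo acts as a mask.
    """
    bank1 &= 0x1F
    if bank1 == 0:
        bank1 = 1  # a 5-bit value of 0 is treated as 1
    bank2 &= 0x03
    high = ((bank2 << 5) | bank1) % rom_banks
    if mode & 1:
        # Mode 1: the 2-bit register also affects $0000-$3FFF and RAM.
        low = (bank2 << 5) % rom_banks
        ram = bank2
    else:
        low = 0
        ram = 0
    return low, high, ram


if __name__ == "__main__":
    print(mbc1_mapping(0, 0, 0))  # (0, 1, 0)
    print(mbc1_mapping(0, 1, 1))  # (32, 33, 1)
```

Because MBC5 has no equivalent of this mode bit, a single ROM can't easily exercise both mappers, which is the tradeoff mentioned in the list.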