Gameboy Development Forum

l0k1 · 2015-09-16 19:03:37

Hey y'all,
100% noob question time.
I've been trying to wrap my head around DMA usage, and the pros/cons of using DMA vs. just writing to the OAM directly (specifically DMA on the non-color Game Boys, i.e. not the HDMA stuff). Have any of you written programs that use DMA, and is DMA the preferred method of doing things?
/end noob post
Thanks!

l0k1 · 2015-09-23 16:39:21

Replying to myself.
Did some math, and DMA is definitely faster and smaller than copying things into OAM without it. If my math is correct, a rolled DMA loop is preferable to a rolled non-DMA loop if you are handling more than 4 sprites, and preferable to an unrolled non-DMA loop if you are handling more than 20 (the unrolled loop would also be MUCH larger).

Xephyr · 2015-09-24 13:55:27

Is this for a classic DMG Game Boy, or are the results the same for a GBC game?
Anyway, that's always interesting to know!

l0k1 · 2015-09-24 23:24:37

This is for DMG only (I'm focusing on devving for DMG). From what I've read about the GBC, though, the general-purpose DMA is slower, while the HDMA is even faster. The DMA on the GBC is also a lot more dynamic/useful.

In re-writing the non-DMA code, I forgot about the OAM bug, so my code was wrong. Adjusting for that bug makes the non-DMA code even slower.

So, here's how I figured out timing (in assembly, sorry).

Preamble: this is only the actual copying, no wrapper routines or anything. Wrapper routines would add overhead, sure, but not enough to skew the results majorly. Also, these functions will copy to the ENTIRE OAM ram, not just 2 or 3 sprites.

First, a rolled DMA loop.

Code:

ld A,[dma_copy_location]   ;3 bytes, 16 cycles
ldh [rDMA],A               ;2 bytes, 12 cycles - note the "LDH", not LD
ld A,40                    ;2 bytes, 8 cycles
.loop
dec A                      ;1 byte, 4 cycles
jr nz,.loop                ;2 bytes, 12 cycles if it jumps, 8 if it doesn't
ret                        ;1 byte, 16 cycles.

Total byte count: 11
Total cycle count: 688

No unrolled DMA loop, as that wouldn't fit into HRAM (and would be kinda pointless, as the loop above only serves to wait until DMA is complete.)
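
As a sanity check on the 688 figure, the per-instruction costs from the listing can be added up in a few lines of Python. This is only a sketch: the function names are made up for illustration, and the 160-machine-cycle transfer length (4 clocks each) is an assumption taken from the usual hardware documentation, not something stated in this thread.

```python
# Clock-cycle costs copied from the comments in the DMA listing.
SETUP = 16 + 12 + 8      # ld A,[nn] + ldh [rDMA],A + ld A,40
LOOP_TAKEN = 4 + 12      # dec A + jr nz (branch taken)
LOOP_LAST = 4 + 8        # dec A + jr nz (falls through on the last pass)
RET = 16
ITERATIONS = 40

# Assumption: an OAM DMA transfer lasts 160 machine cycles of 4 clocks each.
DMA_TRANSFER_CYCLES = 160 * 4


def wait_loop_cycles():
    """Cycles spent in the dec/jr delay loop."""
    return (ITERATIONS - 1) * LOOP_TAKEN + LOOP_LAST


def dma_routine_cycles():
    """Total for the whole routine, including the ret."""
    return SETUP + wait_loop_cycles() + RET


def wait_after_dma_start():
    """Cycles between the write to rDMA and the ret (ld A,40 plus the loop)."""
    return 8 + wait_loop_cycles()


print(dma_routine_cycles())    # 688
print(wait_after_dma_start())  # 644
```

Under that assumption the routine waits 644 clocks after starting the transfer, which is just enough to cover the 640 clocks the transfer itself would take.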

Rolled non-DMA copy:

Code:

ld HL,sprite_copy_location    ;3 bytes, 12 cycles
ld BC,_OAMRAM                 ;3 bytes, 12 cycles
ld D,160                      ;2 bytes, 8 cycles
.loop
ld A,[HL+]                    ;1 byte, 8 cycles
ld [BC],A                     ;1 byte, 8 cycles
inc C                         ;1 byte, 4 cycles
jr nz,.skip_B                 ;2 bytes, 12/8 cycles - can't just "inc BC", or
inc B                         ;1 byte, 4 cycles - trash gets written to the OAM
.skip_B                       ;
dec D                         ;1 byte, 4 cycles
jr nz,.loop                   ;2 bytes, 12/8 cycles
ret                           ;1 byte, 16 cycles

Total bytes: 18
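Total cycles: 7724(!) (32 for setup, 159 passes at 48, a last pass at 44, and 16 for the ret; the jr to .skip_B is always taken, because C never wraps to 0 inside $FE00-$FE9F)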
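
One way to double-check the total for the rolled copy is to step through the loop in Python and let the branches decide their own cost. This is a rough sketch, not an emulator: it only tracks the B, C and D registers, and it assumes `_OAMRAM` is $FE00 (as in the usual hardware include file) and that the per-instruction costs in the listing are right.

```python
OAM_BASE = 0xFE00   # assumption: value of _OAMRAM
COUNT = 160         # bytes copied (40 sprites x 4 bytes)


def rolled_copy_cycles(base=OAM_BASE, count=COUNT):
    """Tally clock cycles for the rolled non-DMA copy loop."""
    cycles = 12 + 12 + 8                # ld HL,nn + ld BC,nn + ld D,n
    b, c, d = base >> 8, base & 0xFF, count
    while True:
        cycles += 8 + 8                 # ld A,[HL+] ; ld [BC],A
        c = (c + 1) & 0xFF
        cycles += 4                     # inc C
        if c != 0:
            cycles += 12                # jr nz,.skip_B taken
        else:
            cycles += 8 + 4             # falls through, then inc B
            b = (b + 1) & 0xFF
        d = (d - 1) & 0xFF
        cycles += 4                     # dec D
        if d != 0:
            cycles += 12                # jr nz,.loop taken
        else:
            cycles += 8                 # last pass falls through
            break
    return cycles + 16                  # ret


print(rolled_copy_cycles())  # 7724
```

With the destination starting at $FE00, C only runs from $01 to $A0, so the `inc B` path is never reached and every pass except the last costs 48 cycles.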

And, finally, a non-DMA unrolled copy routine

Code:

ld HL,sprite_copy_location   ;3 bytes, 12 cycles
ld BC,_OAMRAM                ;3 bytes, 12 cycles
ld A,[HL+]                   ;1 byte, 8 cycles
ld [BC],A                    ;1 byte, 8 cycles
inc C                        ;1 byte, 4 cycles - the OAM is $FE00-$FE9F, so C
                             ;never wraps and B never needs a bump
;... Repeat from "ld A,[HL+]" to "inc C" 158 more times.
ld A,[HL]                    ;1 byte, 8 cycles
ld [BC],A                    ;1 byte, 8 cycles
ret                          ;1 byte, 16 cycles

Total bytes: 486(!)
Total cycles: 3236

If there's a faster way to code those routines, my math was off, or you just have questions, lemme know.
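
For anyone who wants to play with the break-even point, here is a small Python sketch. It treats the DMA routine as a fixed 688 cycles no matter how many sprites change, and charges the manual copies per byte. The per-byte costs are assumptions: 48 cycles for the rolled loop (with the inner branch always taken) and 20 cycles for an unrolled `ld A,[HL+]` / `ld [BC],A` / `inc C` unit. The function names are invented for illustration.

```python
DMA_CYCLES = 688        # fixed cost of the DMA routine, from the thread
BYTES_PER_SPRITE = 4


def rolled_cycles(sprites):
    """Rolled copy: 32 setup, 48 per byte (last pass is 4 cheaper), 16 ret."""
    n = sprites * BYTES_PER_SPRITE
    return 32 + n * 48 - 4 + 16


def unrolled_cycles(sprites):
    """Unrolled copy: 24 setup, 20 per byte (last unit has no inc C), 16 ret."""
    n = sprites * BYTES_PER_SPRITE
    return 24 + n * 20 - 4 + 16


def break_even(cost):
    """Smallest sprite count (1-40) at which the manual copy is slower than DMA."""
    for sprites in range(1, 41):
        if cost(sprites) > DMA_CYCLES:
            return sprites
    return None


print(break_even(rolled_cycles))    # 4
print(break_even(unrolled_cycles))  # 9
```

The thresholds move with the per-byte cost, so different loop bodies will give different answers; treat these numbers as illustrative rather than definitive.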

EDIT: Aligning comments in the code tags.

Last edited by l0k1 (2015-09-24 23:26:29)

DMG/MGB DMA Usage