Gameboy Development Forum

Discussion about software development for the old-school Gameboys, ranging from the "Gray brick" to Gameboy Color
(Launched in 2008)

You are not logged in.

Ads

#1 2017-11-02 17:04:45

Mills
Member
Registered: 2012-12-21
Posts: 132

Fastest way to copy data to VRAM

I was playing around and trying to make a fake parallax effect.

The effect is made by copying 25 tiles to VRAM address (9000).
But it looks like it is not fast enough to avoid some artifacts on the top of the screen.

this is the asm function I made, and then I call it from c:


Code:

    .globl    _SET_BKG_DAT
    .STAT    = 0x41


_SET_BKG_DAT:
    PUSH        BC
    LDA        HL,5(SP)        ; tile data address from c function parameter
    LD        B,(HL)        ; BC = data
    DEC        HL
    LD        C,(HL)
    LD        H,B
    LD        L,C
    LD        DE,#0x9000      ; VRAM address of tile 0 to DE 
    LD        B,#256        ; bytes to copy
    LD        C,#144        ; bytes to copy (16 x Tile = 25)
    
    .loop:
    .wait:
    LDH        A,(.STAT)
    AND        #0x02
    JR        NZ,.wait        ; wait for stat mode to access VRAM
    
    LD        A,(HL)           ; copy tile data from HL address to A 
    LD        (DE),A        ; copy tile data to VRAM address stored in DE
    INC        HL            ; increase reading address
    INC        DE            ; increase writing address    
    DEC        B            ; B=B-1
    JR        NZ,.loop         ; if B > 0, loop    
    
    .loop1:
    .wait2:
    LDH        A,(.STAT)
    AND        #0x02
    JR        NZ,.wait2        ; wait for stat mode to access VRAM
    
     LD        A,(HL)           ; copy tile data from HL address to A 
     LD        (DE),A        ; copy tile data to VRAM address stored in DE
    INC        HL            ; increase reading address
    INC        DE            ; increase writing address
    DEC        C            ; C=C-1
     JR        NZ,.loop1        ; if C > 0, loop
    
    POP        BC
    RET

Is there a faster way to do this?
I saw the game toki tory copying 64 tiles to fake the parallax (it is a game boy color game, so it has the extra speed), and I wonder how the game do that.

Thanks.

Last edited by Mills (2017-11-02 17:08:37)

Offline

 

#2 2017-11-02 19:15:42

Jonas
Member
From: Frankfurt, Germany
Registered: 2016-06-06
Posts: 112
Website

Re: Fastest way to copy data to VRAM

Check out  this blog entry by Imanolea. According to the author, his routine allows you to copy 41 tiles per VBlank to VRAM. Obviously, you can save some time if you avoid using jump instructions and use REPT instead of the loops.

Offline

 

#3 2017-11-02 19:18:16

ssjason123
Member
Registered: 2017-03-21
Posts: 45

Re: Fastest way to copy data to VRAM

There are a couple quick items you could do:
1. If your tile counts are fixed you could unroll your loops but this comes at the cost of space and flexibility for variable counts.
2. You can change cases where you use LD A, (HL) and INC HL to a single LD A, (HLI) or LDI A, (HL) depending on your assembler to save a couple cycles.

With Gameboy Color I believe there is a generic DMA that you can use to copy a block of data while you can do other things.

Offline

 

#4 2017-11-02 19:26:08

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

Thanks!

My asm knowledge is ver limited, I didn't know I could replace the loops smile.

I was also looking for the dma function in GBC but I could not find any easy sample I could understand sad.

EDIT: It looks like the gbdk compiller does not have any  REPT, .rept , .rep... instruction, so I'm afraid I won't be able to use it.

Last edited by Mills (2017-11-02 20:00:49)

Offline

 

#5 2017-11-02 20:15:08

nitro2k01
Administrator
Registered: 2008-02-22
Posts: 242

Re: Fastest way to copy data to VRAM

First let's optimize the first part of the code, that loads the input paramters.

Code:

_SET_BKG_DAT:
    LDA        HL,5(SP)        ; tile data address from c function parameter
    LD        B,(HL)        ; BC = data
    DEC        HL
    LD        C,(HL)
    LD        H,B
    LD        L,C

Here it's useful to know about the load and increment/decrement instructions that will A to/from HL and the increment or decrement HL. In GBDK's assembler, the syntax is for example LD A,(HLI)

Code:

    LDA       HL,4(SP)       ; point to LOWER byte of address. (4 instead of 5.)
    LD        A,(HLI)        ; Load lower byte into A and increment HL. Now ready to load upper byte.
    LD        H,(HL)         ; Load upper byte into H. Since we no longer need HL, it's ok to destroy it by loading the new value directly into H
    LD        L,A            ; Load previously loaded lower byte into L.

But that's just an aside. Now let's optimize the copy routine.

So first, let me dampen your enthusiasm. (Unless you're making a GBC game.) Toki Tori is using the new DMA that's found only on GBC. You probably won't be able to copy 25 tiles within VBlank on monochrome GB. On GBC, the problem is very easily solved, however.

Code:

    .HDMA1 = 0x51    ; DMA source high byte
    .HDMA2 = 0x52    ; DMA source low byte
    .HDMA3 = 0x53    ; DMA destination high byte
    .HDMA4 = 0x54    ; DMA destination low byte
    .HDMA5 = 0x55    ; DMA length/mode/start
_SET_BKG_DAT:
    ; We needn't PUSH/POP BC since those registers are not touched in this routine.
    LDA       HL,5(SP)       ; point to UPPER byte of address this time.
    LD        A,(HLD)        ; Load upper byte into A and DEcrement HL. Now ready to load lower byte.
    LDH       (HDMA1),A      ; Load upper byte into source register
    LD        A,(HL)         ; Load lower byte into A.
    LDH       (HDMA2),A      ; Load lower byte into source register
    LD        A,#0x90
    LDH       (HDMA3),A      ; Load upper byte into destination register
    XOR       A,A            ; Load A with 0 using a trick
    LDH       (HDMA4),A      ; Load lower byte into destination register
    LD        A,#24          ; Specify length, in terms of chunks of 16 bytes, minus 1.
    LDH       (HDMA5),A      ; Start transfer
    RET

Keep in mind however that the buffer address that is specified needs to be aligned to 16 bytes, which probably can never be guaranteed by GBDK.

If we need to copy using the CPU, be warned that this probably can't be done within VBlank that many tiles. But let's look at some tricks. First, you should be using the load and increment instruction. The part in the inner loop now looks like this:

Code:

     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE

The next we thing we could do to accelerate the routine is to partially unroll it, to remove some of the overhead of the loop. Then you compensate by running the loop fewer times. This can give some significant savings for tight loops, although it quickly gives diminishing returns if unrolled many times. Unrolling also of course consumes code space.

Code:

     ; 1
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE
     ; 2
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE
     ; 3
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE
     ; 4
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE

Now observe something about INC DE. Since you're writing to tile data, we can assume that the destination is aligned to 16 bytes and might start at 0x9000, 0x9001 or 0x9440 but never 0x9001 or 0x9123 for example. Then observe that DE will increment E and that D is only affect if the increment carries over from E, for example 0x90FF to 0x9100. This means as long as we repeat the code a small power of two, (repeated 2, 4, 8, 16 etc times) we can make an assumption and replace INC DE with INC E in all but the last unrolled copy of the loop, since the last copy MAY increment DE over a 256 byte boundary.

Code:

     ; 1
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       E           ; Increment DE
     ; 2
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       E           ; Increment DE
     ; 3
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       E           ; Increment DE
     ; 4
     LD        A,(HLI)     ; Load source value into A increment pointer
     LD        (DE),A      ; Store at destination
     INC       DE          ; Increment DE

Now, the next issue is the STAT check. There are different approaches to that. One approach is to simply remove the check and always call the copy routine so early in VBlank that it is guaranteed to finish.

If the check is to be done, and allow copying in HBlank as well as VBlank, we need consider what happens if the check is done near the end of the period. This is sort of ok because VBL and HBL are both followed by mode 2, in which OAM is inaccessible, but VRAM is still accessible. This gives some let in the timing. At this point we need to do some instruction cycle calculations. Mode 2 lasts for a minimum of 77 clock cycles, which means a minimum of 19 instruction cycles. (1 instruction cycles=the time of for example one NOP.)

Code:

    .globl    _SET_BKG_DAT
    .STAT    = 0x41
    .HDMA1 = 0x51    ; DMA source high byte
    .HDMA2 = 0x52    ; DMA source low byte
    .HDMA3 = 0x53    ; DMA destination high byte
    .HDMA4 = 0x54    ; DMA destination low byte
    .HDMA5 = 0x55    ; DMA length/mode/start
_SET_BKG_DAT:
    LDA       HL,4(SP)       ; point to LOWER byte of address. (4 instead of 5.)
    LD        A,(HLI)        ; Load lower byte into A and increment HL. Now ready to load upper byte.
    LD        H,(HL)         ; Load upper byte into H. Since we no longer need HL, it's ok to destroy it by loading the new value directly into H
    LD        L,A            ; Load previously loaded lower byte into L.

    LD        DE,#0x9000     ; VRAM address of tile 0 to DE 
    LD        C,#100         ; Number of (4 bytes) to copy (16*25=400. 400/4=100)

    .loop:
    .wait:
    LDH       A,(.STAT)
    AND       #0x02       ; 2 instruction cycles
    JR        NZ,.wait    ; 2 cycles if jump not taken. Wait for stat mode to access VRAM.

    ; (Total: 4 cycles at this points)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       E           ; 1 Increment DE
    ; (Total: 9 cycles at this points)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       E           ; 1 Increment DE
    ; (Total: 14 cycles at this points)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       E           ; 1 Increment DE
    ; (Total: 19 cycles at this points. BZZZT! Safe limit exceeded. The last copy below may write to inaccessible VRAM.)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       DE          ; 1 Increment DE

    DEC       B           ; B=B-1
    JR        NZ,.loop    ; if B > 0, loop

So, the loop can really only be unrolled safely 3 times with this code. (Unless you build more complex logic to handle the access timing.) In practice though, you want powers of 2, so unrolling just 2 times is probably a good idea.

So the final code is suggested to look like:

Code:

    .globl    _SET_BKG_DAT
    .STAT    = 0x41
    .HDMA1 = 0x51    ; DMA source high byte
    .HDMA2 = 0x52    ; DMA source low byte
    .HDMA3 = 0x53    ; DMA destination high byte
    .HDMA4 = 0x54    ; DMA destination low byte
    .HDMA5 = 0x55    ; DMA length/mode/start
_SET_BKG_DAT:
    LDA       HL,4(SP)       ; point to LOWER byte of address. (4 instead of 5.)
    LD        A,(HLI)        ; Load lower byte into A and increment HL. Now ready to load upper byte.
    LD        H,(HL)         ; Load upper byte into H. Since we no longer need HL, it's ok to destroy it by loading the new value directly into H
    LD        L,A            ; Load previously loaded lower byte into L.

    LD        DE,#0x9000     ; VRAM address of tile 0 to DE 
    LD        C,#200         ; Number of (2 bytes) to copy (16*25=400. 400/2=200)

    .loop:
    .wait:
    LDH       A,(.STAT)
    AND       #0x02       ; 2 instruction cycles
    JR        NZ,.wait    ; 2 cycles if jump not taken. Wait for stat mode to access VRAM.

    ; (Total: 4 cycles at this points)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       E           ; 1 Increment DE
    ; (Total: 9 cycles at this points)
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC       DE          ; 1 Increment DE
    ; (Total: 14 cycles at this points)

    DEC       B           ; B=B-1
    JR        NZ,.loop    ; if B > 0, loop

But if you can guarantee that the code finishes within VBlank, you can remove the STAT check, and then unroll the loop practically as many times as you want for faster execution. (Remember to adjust the C value, as well change all unrolled copies to INC E except the last one.)

Not mentioned in this post is the hairy subject of stack copy, as well as better ways of timing the copy.


Blog: Gameboy Genius
"A journey of a thousand miles begins with one small step"
Old Chinese Proverb

Offline

 

#6 2017-11-02 20:17:50

nitro2k01
Administrator
Registered: 2008-02-22
Posts: 242

Re: Fastest way to copy data to VRAM

Jonas wrote:

Check out  this blog entry by Imanolea. According to the author, his routine allows you to copy 41 tiles per VBlank to VRAM. Obviously, you can save some time if you avoid using jump instructions and use REPT instead of the loops.

That's a different problem. Imanolea's code is copying 41 bytes to arbitrary places in VRAM, which can update 41 map entries. Mills wants to copy the tile data of 25 tiles, which is 25*16=400 bytes.


Blog: Gameboy Genius
"A journey of a thousand miles begins with one small step"
Old Chinese Proverb

Offline

 

#7 2017-11-03 01:46:35

Jonas
Member
From: Frankfurt, Germany
Registered: 2016-06-06
Posts: 112
Website

Re: Fastest way to copy data to VRAM

nitro2k01 wrote:

That's a different problem. Imanolea's code is copying 41 bytes to arbitrary places in VRAM, which can update 41 map entries. Mills wants to copy the tile data of 25 tiles, which is 25*16=400 bytes.

Ok, I see. I didn't realize, spoke too early and am very sorry.

@Mills: Then just one more remark to the original question: If you adjust your artwork, you should need far less than 25 tiles per VBlank to create a fake parallax effect with your method. The parallax effect in the Asterix game works this way and if I counted correctly, it only exchanges the tile data for 11 tiles.

Last edited by Jonas (2017-11-03 01:47:34)

Offline

 

#8 2017-11-03 06:23:53

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

Thanks a lot nitro2k01, I´ll take my time to read the info.

Jonas I wanted a really big parallax patern, so I ended with a 5x5 tile patern.

Here is the scene I made, I think i'll someday make a demo filled with cute scenes like this smile.

https://github.com/mills32/GAMEBOY-DEMOS/blob/master/MISC/parallax.bmp?raw=true

You see here some issues at the top of the screen.

These issues are solved using the non-DMA optimized function nitro2k01 made smile.
DMG could not copy the 25 tiles on time, so still has the issue.

The DMA function didn't work, I surelly made something wrong.

Last edited by Mills (2017-11-03 08:24:21)

Offline

 

#9 2017-11-03 13:04:01

ISSOtm
Member
From: Somewhere in Echo RAM
Registered: 2017-04-18
Posts: 160
Website

Re: Fastest way to copy data to VRAM

If the DMA function didn't work, the problem may come from the source pointer not being 16-byte aligned.
(To put it differently, the source AND destination pointers must be $XXX0. Ensure that is true before using DMA.)

The other problem that may exist is that this DMA is called GDMA (as opposed to HDMA), and attempts to copy all data at once, even if the CPU can't access VRAM. The DMA should be started during VBlank, preferably not too close to the end.
If you cannot start the DMA early enough during VBlank, you can also use its little brother HDMA, which copies tiles at a rate of 1 tile per scanline. All you have to do is use the exact same function, BUT set bit 7 of the value written to HDMA5.


The French Lord Of Laziness.
Legend of Zelda and Undertale fan, I also tend to break Pokémon R/B/Y a little too much.

Twitter | Me on GCL | Discord : ISSOtm#9015 | Skype : isso.tm (I don't login anymore)

Offline

 

#10 2017-11-03 15:47:20

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

ISSOtm wrote:

If the DMA function didn't work, the problem may come from the source pointer not being 16-byte aligned.
(To put it differently, the source AND destination pointers must be $XXX0. Ensure that is true before using DMA.)

The other problem that may exist is that this DMA is called GDMA (as opposed to HDMA), and attempts to copy all data at once, even if the CPU can't access VRAM. The DMA should be started during VBlank, preferably not too close to the end.
If you cannot start the DMA early enough during VBlank, you can also use its little brother HDMA, which copies tiles at a rate of 1 tile per scanline. All you have to do is use the exact same function, BUT set bit 7 of the value written to HDMA5.

Probably the source is not aligned so I think I'll have to include the tiles in asm at a specific address.

EDIT: placing the data at, for example, 0x4000, made the dma function work, in gbdk this is how it works.

Code:

.area    _Tiles (ABS)
.org    0x4000

.db 0x00... ;tile data

Is incredibly fast, I can't even see the cpu usage.

Last edited by Mills (2017-11-03 16:57:56)

Offline

 

#11 2017-11-07 15:03:50

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

So I read people arguing about toki tori, It surelly uses the DMA function, but nobody really knows how this game makes the parallax.

I was tempted to try it, and the first thing I discovered, is you can't simply scan the data array, because DMA only works with 16 bytes blocks.
So to make a bkg that moves 2 pixels at a time, I had to store 64 tiles * 32 positions in X * 32 in Y... that was huge and would not fit in a 4 MB rom.

Then I read about rotating and shifting data, but this can only work for a vertical parallax, you still have to store 32 copies of the 64 tile animation, and it is still huge.

And the last thing I realised is you don't need to store animation tiles for the positions that are aligned, so I think this could be a clue.

I'll keep trying.

Last edited by Mills (2017-11-07 15:46:08)

Offline

 

#12 2017-11-07 19:10:44

ISSOtm
Member
From: Somewhere in Echo RAM
Registered: 2017-04-18
Posts: 160
Website

Re: Fastest way to copy data to VRAM

You can copy the tiles into WRAM, edit them there (using bitshifts, for example), and then transfer using DMA. I do that in my game ; it's even easier combined with the GBC's WRAM banks, since you don't have to clutter the "main" WRAM bank, just dedicate a bank to tiles.


The French Lord Of Laziness.
Legend of Zelda and Undertale fan, I also tend to break Pokémon R/B/Y a little too much.

Twitter | Me on GCL | Discord : ISSOtm#9015 | Skype : isso.tm (I don't login anymore)

Offline

 

#13 2017-11-08 07:00:30

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

ISSOtm wrote:

You can copy the tiles into WRAM, edit them there (using bitshifts, for example), and then transfer using DMA. I do that in my game ; it's even easier combined with the GBC's WRAM banks, since you don't have to clutter the "main" WRAM bank, just dedicate a bank to tiles.

I thought DMA could not transfer data from ram to vram.

Using the generic slower function, I copied the tiles to 0xC000 the start of the RAM, then I could copy them to 0x9000 without any problems.

But the DMA did not work sad.

Last edited by Mills (2017-11-08 08:34:29)

Offline

 

#14 2017-11-08 08:49:37

ISSOtm
Member
From: Somewhere in Echo RAM
Registered: 2017-04-18
Posts: 160
Website

Re: Fastest way to copy data to VRAM

How did you try the DMA ? I can assure you it's possible to copy tiles from WRAM to VRAM, my game does it with no problem. (I have some experience with that, since I had to research DMA behavior on beware's request because BGB has some small accuracy issues regarding CGB DMA.)
If possible, paste the code you're using here.

In theory, I'd do this. (This is RGBASM syntax, but you can easily convert this to GBDK's assembler syntax.)

Code:

InitParallax:
ld hl, ParallaxTiles
ld de, wParallaxBuffer
ld bc, (size)
jp Copy ; Generic copy function

ScrollLeft:
ScrollRight:
ScrollUp:
ScrollDown:
; Edit wParallaxBuffer in place, but this depends on how exactly you arranged the tiles

RefreshParallaxTiles:
ld a, [rHDMA5]
rla ; Bit 7 is set if a transfer is active
jr nc, RefreshParallaxTiles ; Wait until the transfer has completed

ld c, LOW(rHDMA1)
ld a, HIGH(wParallaxBuffer)
ld [c], a
inc c
ld a, LOW(wParallaxBuffer)
ld [c], a
inc c
ld a, HIGH(vParallaxTiles)
ld [c], a
inc c
ld a, LOW(vParallaxTiles)
ld [c], a
inc c

ld a, (number of tiles minus 1)
ld [c], a
ret

Note that you can preserve c in that last function by using ldh's instead of ld [c], a's. Shouldn't matter with C, though.


The French Lord Of Laziness.
Legend of Zelda and Undertale fan, I also tend to break Pokémon R/B/Y a little too much.

Twitter | Me on GCL | Discord : ISSOtm#9015 | Skype : isso.tm (I don't login anymore)

Offline

 

#15 2017-11-08 09:30:34

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

ISSOtm wrote:

How did you try the DMA ? I can assure you it's possible to copy tiles from WRAM to VRAM, my game does it with no problem. (I have some experience with that, since I had to research DMA behavior on beware's request because BGB has some small accuracy issues regarding CGB DMA.)
If possible, paste the code you're using here.

As I said, my programming experience is limited, and also the asm, so once I understood and tested  the sample functions nitro2k01 posted, I tried This:

Code:

    .HDMA1 = 0xff51    ; DMA source high byte
    .HDMA2 = 0xff52    ; DMA source low byte
    .HDMA3 = 0xff53    ; DMA destination high byte
    .HDMA4 = 0xff54    ; DMA destination low byte
    .HDMA5 = 0xff55    ; DMA length/mode/start

_COPY_TO_RAM:

    LD        A,0x16        
    LDH        (.HDMA1),A        
    LD        A,0x00
    LDH        (.HDMA2),A        ; I stored tiles at 1600
    
    LD        A,#0xC0
    LDH        (.HDMA3),A      
    LD        A,#0x00         
    LDH        (.HDMA4),A      ; Copy to C000
    
    LD        A,#63            ; Copy 64 tiles
    LDH        (.HDMA5),A      ; Start transfer
    
    RET

_COPY_TO_VRAM:    

    LD        A,#0xC0        
    LDH        (.HDMA1),A        
    LD        A,#0x00
    LDH        (.HDMA2),A        ; Source = C000
    
    LD        A,#0x90
    LDH        (.HDMA3),A      
    LD        A,#0x00         
    LDH        (.HDMA4),A      ; Copy to 9000
    
    LD        A,#63            ; Copy 64 tiles
    LDH        (.HDMA5),A      ; Start transfer    
    
    RET

I used the functions inside if (pressing buttons) after a vblank and waiting a bit just in case...

Also, there is nothing else using that ram, because the generic copy function worked well and no garbage showed at all.

Last edited by Mills (2017-11-08 09:38:51)

Offline

 

#16 2017-12-15 07:12:13

ISSOtm
Member
From: Somewhere in Echo RAM
Registered: 2017-04-18
Posts: 160
Website

Re: Fastest way to copy data to VRAM

Been a while, but I didn't notice this thread had had an update (PunBB is full of errors, I don't get half of the new reply notifications).

Anyways ; the reason why this didn't work is that DMA can only copy to VRAM, so the high byte is effectively ([HDMA3] & 0x1F) | 0x80, ie. here (0xC0 & 0x1F) = 0x00, 0x00 | 0x80 = 0x80. That's why the function failed : you need to use a generic copy function to copy to WRAM, and DMA *only* to copy to VRAM.


The French Lord Of Laziness.
Legend of Zelda and Undertale fan, I also tend to break Pokémon R/B/Y a little too much.

Twitter | Me on GCL | Discord : ISSOtm#9015 | Skype : isso.tm (I don't login anymore)

Offline

 

#17 2017-12-15 07:52:44

nitro2k01
Administrator
Registered: 2008-02-22
Posts: 242

Re: Fastest way to copy data to VRAM

ISSOtm wrote:

Been a while, but I didn't notice this thread had had an update (PunBB is full of errors, I don't get half of the new reply notifications).

Hmm. It will only send out one notification until you visit the site again while logged in. Could that be what you re experiencing?


Blog: Gameboy Genius
"A journey of a thousand miles begins with one small step"
Old Chinese Proverb

Offline

 

#18 2017-12-15 11:33:13

ISSOtm
Member
From: Somewhere in Echo RAM
Registered: 2017-04-18
Posts: 160
Website

Re: Fastest way to copy data to VRAM

Nope, rather that sometimes new replies are posted, but the "New posts" icon change and link don't appear (and thus sometimes I miss posts)
Plus, when I read a topic, it's still marked as unread, so I have to click "Mark all topic as read". And sometimes this link sends me to an error page, thus I'm forced to type the home page's address in the address bar.

Last edited by ISSOtm (2017-12-15 11:34:42)


The French Lord Of Laziness.
Legend of Zelda and Undertale fan, I also tend to break Pokémon R/B/Y a little too much.

Twitter | Me on GCL | Discord : ISSOtm#9015 | Skype : isso.tm (I don't login anymore)

Offline

 

#19 2017-12-19 13:18:43

Mills
Member
Registered: 2012-12-21
Posts: 132

Re: Fastest way to copy data to VRAM

I think I know how to do a 8x8 tile parallax smile.
Of course I only had the idea and then tested it with the slow GBDK functions, and it can be done.
Now i'm trying to use the DMA and non-DMA transfer data functions, for the moment it didn´t work well...

So Using the toky tory maps as example this is what I got.

1- Create a map containing metatiles with 8x8 small tiles arranged like this:

        00    08    16   
        01    09    17
        02    10    18   
        03    11    19   
        04    12    etc...
        05    13   
        06    14   
        07    15   

Ultill you have the 63th tile at the bottom right corner. This metatile has to be seamless, like the toki tori ones.

2- Having this arrangement, vertical scrolling can be done if you use the non-DMA function to copy the tiles of the columns line by line, with different offsets, from ROM to VRAM.
you just have to tell the asm function, to go back to 0 offset when it reaches the end of the column (the 8th tile).

3- For the horizontal scrolling we need 64 predefined copies of the 8x8 metatile, that is a 64Kb file, so it is not so big smile. And we just use the DMA funcion to scroll horizontally.

4- When copying tiles with the non-DMA function, we have to use the ROM address of the horizontal copy we are in, to make the vertical scroll from there.

I know the horizontal scrolling will work, but I still didn't make 64 copies of a metatile.

So I tried to code the vertical scroll, and just got one column working, starting at 0x9000 address, and only scrolls down:

Code:

_CPU_TRANSFER:
    PUSH    BC
    PUSH    DE
    LDA        HL,2(SP)           ; point to LOWER byte of address
    LD        A,(HLI)               
    LD        H,(HL)
    LD        L,A                ; HL = data source address
    
    LD        DE,#0x9000         ; VRAM address of tile 0 to DE 
    LD        C,#64             ; 8 Tiles

    .loop:
    .wait:
    LDH        A,(.STAT)
    AND        #0x02       ;
    JR        NZ,.wait        ; 

    ;COPY DATA
    LD        A,(HLI)     ; 2 Load source value into A increment pointer
    LD        (DE),A      ; 2 Store at destination
    INC        DE          ; DE=DE+1

    LD        A,(HLI)     ; 
    LD        (DE),A      ; 
    INC        DE          ; DE=DE+1    
    
    LD        A,L
    CP        #128        ; if L = 128, we reached the end of a 8 tile column
    JR        Z,.set_L    ; 
    
    DEC        C           ; C=C-1
    JR        NZ,.loop    ; if C > 0, loop
    
    POP        DE
    POP        BC
    RET

    
.set_L:
    LD        L,#0         ; Go back to first line
    DEC        C           ; C=C-1
    JR        NZ,.loop    ; if C > 0, loop    
    
    POP        DE
    POP        BC
    RET

It's a start smile.

Last edited by Mills (2017-12-19 18:37:12)

Offline

 

Board footer

Powered by PunBB
© Copyright 2002–2005 Rickard Andersson