Discussion about software development for the old-school Gameboys, ranging from the "Gray brick" to Gameboy Color
(Launched in 2008)
Hi!
I'm working on implementing a simple "Dizzy" game engine.
Source code, Compiled ROM.
Already implemented:
1. Dizzy animation and "physics" within a room
2. World made of "rooms" (2x6 in my test example), room switching
3. Collision maps
4. Using window for "title" and "inventory".
But I have some issues with it.
1. Sometimes artefacts appear on the Dizzy sprites. I think that's because of my use of interrupts, or maybe something is fundamentally wrong with the timings; I don't know how to handle this.
2. The physics are not as good as in the original, so that still needs work.
And there is also a "fundamental" problem with items. In the original game you may put as many items in a location as you want, but on the GB this is a problem: you don't have that many sprites and background tiles, and the game itself has rich graphics. I'm thinking about it, but have no solution yet.
So, what do you think?
Screenshot:
Hi. I will tell you some things that need fixing.
Enable exceptions in BGB
The glitches are because data is written while VRAM is not writeable. I haven't debugged in detail why this happens but this is the issue. It is recommended that you turn on exceptions in BGB. (Right click, options.) Probably enable all exceptions except "cart troubleshooting mode" and "initialize GBC pal with 4 greys". Then when an exception triggers, it means something happened that you should probably fix.
Don't copy tile data
The biggest problem is that you are drawing Dizzy by updating the tile data. This means that you have to update 192 bytes for every animation frame of Dizzy. This takes up far too much frame time atm.
It is better if you can have all sprite tile data already in tile RAM and change OAM data which can be transferred quickly using OAM DMA in VBlank. You may need to change some details in the art. Example of changes you can make to save tiles on Dizzy's "standing still" animation:
Make Dizzy's body at most 16 pixels wide and 16 pixels high. Now you can mirror his body with X-flip. You can move the sprite up and down to animate it. 2 tiles and 2 sprites used. Maybe you can put them close together so the body is 15 pixels wide if you only want a 1 pixel gap between the eyes.
Make Dizzy's hands 2 sprites. For one hand you can use 1 sprite and two tiles. (In 8x16 mode you need an empty sprite under the hand for protection.) Maybe you can also use the same tile for both the up and down animation using Y flip. 2 (1 real + 1 empty) tiles and 2 sprites used.
Make Dizzy's feet 2 sprites. In 8x16 mode you need an empty sprite under each foot for protection again. 2 tiles and 2 sprites.
So with a budget of 6 tiles and 6 sprite entries in OAM you can potentially do both animation frames of the standing still animation. Maybe there are other things you should change to make this work better but this is just an example on how to think about the graphics.
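The X-flip idea for the body can be sketched in C as follows. This is only an illustration that runs on a PC, not code from the project: the names dizzy_body, oam_entry_t and TILE_BODY_HALF are hypothetical, and the array stands in for a shadow OAM buffer that would later be sent with OAM DMA. The flip bit and the +8/+16 coordinate offsets are the standard OAM layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One OAM entry in hardware byte order: Y, X, tile index, attributes. */
typedef struct {
    uint8_t y;
    uint8_t x;
    uint8_t tile;
    uint8_t attr;
} oam_entry_t;

#define OAM_XFLIP 0x20u       /* attribute bit 5: mirror horizontally */
#define TILE_BODY_HALF 0x00u  /* hypothetical tile index of the left body half */

/*
 * Fill two shadow-OAM entries for a 16-pixel-wide body whose top-left
 * corner is at screen position (px, py). `bob` shifts the body down by a
 * few pixels to animate it without touching any tile data.
 */
void dizzy_body(oam_entry_t *oam, uint8_t px, uint8_t py, uint8_t bob)
{
    uint8_t y;

    assert(oam != NULL);
    y = (uint8_t)(py + 16u + bob);      /* OAM Y is screen Y + 16 */

    oam[0].y = y;
    oam[0].x = (uint8_t)(px + 8u);      /* OAM X is screen X + 8 */
    oam[0].tile = TILE_BODY_HALF;
    oam[0].attr = 0u;

    oam[1].y = y;
    oam[1].x = (uint8_t)(px + 16u);     /* right half, 8 pixels further */
    oam[1].tile = TILE_BODY_HALF;       /* same tile data ... */
    oam[1].attr = OAM_XFLIP;            /* ... mirrored by the hardware */
}
```

Animating the body then costs 8 bytes of OAM changes per frame instead of a tile copy.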
Faster copy
Sometimes you actually need to copy data to VRAM though. Your copy is slow. For every byte copied, it checks both the number of bytes left to copy and then whether HBlank is active. I've described a really fast copy routine here. That's an advanced method, but you can still make use of partial unrolling. If you are interested in optimized copying, I can tell you more.
Style choices to use fewer tiles
Going from top to bottom, there are many variations of the tree trunk tiles. Maybe you can use fewer tiles and still get good visual variety. For example, if you currently have these tiles:
A
B
C
D
A
B
C
D
A
B
C
D
maybe you can use only 2 or 3 tiles and place them randomly to create variety instead.
A
A
B
A
B
B
A
B
A
A
B
B
Again, this is just an example, and it only matters if you run out of tiles.
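A minimal sketch of the "place a few tiles randomly" idea, written as host-side C such as a map conversion tool might use. The function name fill_trunk and the tile indices are hypothetical, and the generator is an ordinary linear congruential generator chosen only because it is deterministic for a given seed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small linear congruential generator; returns the next state. */
static uint32_t lcg_next(uint32_t state)
{
    return state * 1103515245u + 12345u;
}

/*
 * Fill `column` (one tile per row of the trunk) with a pseudo-random mix
 * of the `count` tile indices listed in `tiles`.
 */
void fill_trunk(uint8_t *column, size_t height,
                const uint8_t *tiles, size_t count, uint32_t seed)
{
    size_t i;

    assert(column != NULL && tiles != NULL && count > 0);
    for (i = 0; i < height; i++) {
        seed = lcg_next(seed);
        column[i] = tiles[(seed >> 16) % count];  /* use the higher bits */
    }
}
```

Using a fixed seed per room keeps the result identical every time the room is built.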
Hi! Thank you for such a detailed answer.
nitro2k01 wrote:
Make Dizzy's hands 2 sprites. For one hand you can use 1 sprite and two tiles. (In 8x16 mode you need an empty sprite under the hand for protection.) Maybe you can also use the same tile for both the up and down animation using Y flip. 2 (1 real + 1 empty) tiles and 2 sprites used.
Unfortunately, that won't help. Most of the time Dizzy goes left or right, jumping and rolling. There is no chance to fit him into 16x16 pixels + 2 sprites, and this exact animation is part of the "look and feel", so there is no point in simplifying it. Some phases may be transformed according to your advice, but it might be better to have one style of handling for all of the main character's animation than to switch it on the fly.
I think that the real point is to use nine 8x8 sprites instead of six 8x16. That saves 3 tiles. Also, the corner sprites are empty in some phases of the animation, so maybe hiding them is better than zeroing them? So I could, for example, arrange the tiles in memory by usage, then copy 5 to 9 tiles depending on the animation phase, set the 8x8 sprite positions according to the phase, and hide the remaining sprites... I suspect there might also be some side effects with "copy tiles + move sprites" instead of just copying.
nitro2k01 wrote:
Faster copy
Sometimes you actually need to copy data to VRAM though. Your copy is slow. For every byte copied, it checks both the number of bytes left to copy and then whether HBlank is active. I've described a really fast copy routine here. That's an advanced method, but you can still make use of partial unrolling. If you are interested in optimized copying, I can tell you more.
As I understand it, the method consists of two general ideas: 1) detecting the exact time of horizontal blank using an interrupt on each line plus HALT in the copy routine, instead of polling STAT; 2) using POP to read memory and unrolling the loop to make the copy faster.
As you can see, I already use the LY==LYC interrupt to switch the window on and off. Is there a way to combine these two methods to get the benefits of this fast copy? Wouldn't checking and switching between the two branches eat almost all of the benefit?
I also have a question: how much time do you have to safely copy data to vram when you use polling of the STAT? Is it safe to use constructions like:
    $unshr50:
        ldh A,(#_STAT_REG)
        and #0x02
        jr NZ, $unshr50
        ld A, (BC)
        inc BC
        ld (HL+), A
        ld A, (BC)
        ld (HL+), A
        inc BC
        jr $unshr0A
or this:
    $unshr40:
        ldh A,(#_STAT_REG)
        and #0x02
        jr NZ, $unshr40
        ld A, (BC)
        ld (HL+), A
        ld (HL+), A
        inc BC
        jr $unshr0A
Where is the edge?
nitro2k01 wrote:
Going from top to bottom, there are many variations of the tree trunk tiles. Maybe you can use fewer tiles and still get good visual variety. For example, if you currently have these tiles:
Yes, I know, but that is not the point right now. The rooms are auto-generated from a BMP image; I only made them fit in 80h tiles. There are, for example, boulders that are the same but shifted by one pixel, there are some unnecessary intersections, and so on. The locations should be completely redrawn by hand. I just needed something to debug on.
The question is not how to save some particular tiles, but how to handle this in general. The original game allows you to put all the items in the game in one location, and you will never save that many tiles. I was thinking about some kind of simple "tile allocator", or maybe generating tiles with multiple objects just before placing them... Since it is a quest, not an arcade game, you have plenty of time to generate everything when you "put" or "take" something from the inventory, or when entering a new room, but how to organize all this?
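One possible shape for such a "tile allocator" is a small reference-counted pool: a fixed range of VRAM tiles is reserved for items, and two objects of the same kind lying in a room share one tile. The sketch below is plain C that runs on a PC; every name and number in it (ITEM_SLOTS, FIRST_ITEM_TILE, pool_acquire and so on) is an assumption for illustration, not something from the game's sources. It does not remove the hard limit: when the pool is full it returns NO_TILE and the game has to react, for example by refusing the drop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ITEM_SLOTS      16u    /* hypothetical number of tiles reserved for items */
#define FIRST_ITEM_TILE 0x70u  /* hypothetical first reserved tile index */
#define NO_TILE         0xFFu  /* returned when the pool is exhausted */
#define SLOT_FREE       0xFFu  /* marks an unused slot; not a valid item id */

typedef struct {
    uint8_t item[ITEM_SLOTS];  /* item id stored in each slot, or SLOT_FREE */
    uint8_t refs[ITEM_SLOTS];  /* how many placed objects use the slot */
} tile_pool_t;

void pool_reset(tile_pool_t *p)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < ITEM_SLOTS; i++) {
        p->item[i] = SLOT_FREE;
        p->refs[i] = 0u;
    }
}

/* Return the tile index for `item_id`, sharing a slot if it is already loaded. */
uint8_t pool_acquire(tile_pool_t *p, uint8_t item_id)
{
    size_t i, free_slot = ITEM_SLOTS;

    assert(p != NULL && item_id != SLOT_FREE);
    for (i = 0; i < ITEM_SLOTS; i++) {
        if (p->item[i] == item_id) {
            p->refs[i]++;
            return (uint8_t)(FIRST_ITEM_TILE + i);
        }
        if (p->item[i] == SLOT_FREE && free_slot == ITEM_SLOTS)
            free_slot = i;          /* remember the first free slot */
    }
    if (free_slot == ITEM_SLOTS)
        return NO_TILE;             /* pool exhausted */
    p->item[free_slot] = item_id;
    p->refs[free_slot] = 1u;
    /* On hardware, the item's graphics would be copied to VRAM here. */
    return (uint8_t)(FIRST_ITEM_TILE + free_slot);
}

/* Drop one reference; the slot becomes free when nothing uses it anymore. */
void pool_release(tile_pool_t *p, uint8_t item_id)
{
    size_t i;

    assert(p != NULL);
    for (i = 0; i < ITEM_SLOTS; i++) {
        if (p->item[i] == item_id) {
            if (--p->refs[i] == 0u)
                p->item[i] = SLOT_FREE;
            return;
        }
    }
}
```

Calling pool_reset() on entering a room and pool_acquire() for each item lying there would match the "plenty of time when entering a new room" observation above.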
ps: I also updated the rom: https://www.dropbox.com/s/105b0p6gr8ziz … e.zip?dl=0 it's a bit better now.
Last edited by toxa (2020-03-23 07:07:10)
toxa wrote:
As you can see, I already use the LY==LYC interrupt to switch the window on and off. Is there a way to combine these two methods to get the benefits of this fast copy? Wouldn't checking and switching between the two branches eat almost all of the benefit?
You don't need to check which branch because the code is running with the interrupt master enable flag off (DI) and is using HALT as a synchronized wait, so the interrupt vector is not called inside the copy. Calling it would not work anyway, since the copy is a bit "evil": the stack pointer is used as the data source, so if the interrupt vector were called, the return address and saved registers would potentially overwrite the data source or try to write to ROM!
If you are using this method you should decompress the tiles to temporary space in RAM and then copy to VRAM. The top portion where the window is has no sprites (see below) and is 24 lines high. You could maybe do this:
Interrupt on LY==0.
Decompress sprite tiles. (Can you do this in 12 lines? You can now remove the LCD check from the decompress routine since it will only be used with either temporary RAM or VRAM with LCD off as destination.)
Change STAT to trigger on HBlank.
Use stack copy to copy the data. (Should ideally take 12 lines or 9 if you switch to 8x8 sprites.)
Restore STAT to trigger on LY==LYC.
Use a simple wait loop to wait until LY==24 and disable the window.
Other strategies are possible: Decompress data on odd frames and copy data on even frames...
toxa wrote:
I also have a question: how much time do you have to safely copy data to vram when you use polling of the STAT? Is it safe to use constructions like:
Code:
    $unshr50:
        ldh A,(#_STAT_REG)
        and #0x02
        jr NZ, $unshr50
        ld A, (BC)
        inc BC
        ld (HL+), A
        ld A, (BC)
        ld (HL+), A
        inc BC
        jr $unshr0A
or this:
Code:
    $unshr40:
        ldh A,(#_STAT_REG)
        and #0x02
        jr NZ, $unshr40
        ld A, (BC)
        ld (HL+), A
        ld (HL+), A
        inc BC
        jr $unshr0A
Where is the edge?
You can calculate the timings using the LCD timings in Pan Docs combined with the instruction timings. Mode 0 is HBlank. Mode 2 is sprite processing for the next line, and is some extra time when you can write to VRAM. But be careful that the timings vary depending on whether there are sprites on the current and next line.
Also be careful about what happens when HBlank is already active when the check is done. The check passes and you start copying, and then HBlank ends earlier than you expect! One solution is to first check for not in HBlank and then check for HBlank. Or use HALT for sync, but use normal copy instead of stack copy...
http://bgb.bircd.org/pandocs.htm#lcdstatusregister
http://bgb.bircd.org/pandocs.htm#cpuinstructionset
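The "first check for not in HBlank, then check for HBlank" suggestion can be sketched in C like this. To keep it runnable on a PC, the STAT register is read through a function pointer and a scripted stand-in replaces the hardware; wait_hblank_start, scripted_stat, script and script_pos are hypothetical names. The sketch tests for mode 0 only, which is stricter than the `and #0x02` test in the quoted code (that one also passes during VBlank).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STAT_MODE_MASK   0x03u  /* low two bits of STAT hold the LCD mode */
#define STAT_MODE_HBLANK 0x00u  /* mode 0 is HBlank */

typedef uint8_t (*stat_reader_t)(void);

/* Scripted stand-in for the STAT register, used for testing on a PC. */
const uint8_t *script;
size_t script_pos;

uint8_t scripted_stat(void)
{
    return script[script_pos++];
}

/*
 * Wait for the *start* of an HBlank period: first leave any HBlank that
 * is already in progress, then wait for the next one to begin.
 * Returns the number of polls that had to wait.
 */
unsigned wait_hblank_start(stat_reader_t read_stat)
{
    unsigned polls = 0;

    assert(read_stat != NULL);
    while ((read_stat() & STAT_MODE_MASK) == STAT_MODE_HBLANK)
        polls++;                    /* still inside an old HBlank */
    while ((read_stat() & STAT_MODE_MASK) != STAT_MODE_HBLANK)
        polls++;                    /* drawing; wait for a fresh HBlank */
    return polls;
}
```

The price is up to one extra scanline of waiting, in exchange for knowing that a full HBlank lies ahead when the copy starts.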
nitro2k01 wrote:
You don't need to check which branch because the code is running with the interrupt master enable flag off (DI) and is using HALT as a synchronized wait, so the interrupt vector is not called inside the copy. Calling it would not work anyway, since the copy is a bit "evil": the stack pointer is used as the data source, so if the interrupt vector were called, the return address and saved registers would potentially overwrite the data source or try to write to ROM!
Yes, of course, this is clear. The question is different.
If you have a look at the code, you will see that there is an lcd_interrupt() handler that uses a table of line numbers to control when the next interrupt happens. It switches the window on at line 0, then off at line 23, then, if the inventory is shown, back on at line 55 and off again at line 111. This takes some nonzero time. Your approach, when we need to copy, is to fire this interrupt on every line. So some kind of combined code is needed, since we cannot have two independent handlers for the same event, and we have too little time to emulate this.
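For readers without the sources at hand, a table-driven handler of the kind described could look roughly like the following. This is a reconstruction for illustration only, not the actual lcd_interrupt(): the struct and function names are hypothetical, and the LYC register and the window enable bit are modelled as plain struct fields so the sketch runs on a PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t line;       /* scanline at which the event fires */
    uint8_t window_on;  /* window state to apply at that line */
} lyc_event_t;

/* Line numbers from the post: title bar on 0..23, inventory on 55..111. */
static const lyc_event_t events_inventory[] = {
    {0, 1}, {23, 0}, {55, 1}, {111, 0},
};

typedef struct {
    const lyc_event_t *events;
    uint8_t count;
    uint8_t next;       /* index of the event that fires next */
    uint8_t lyc;        /* stand-in for the LYC register */
    uint8_t window_on;  /* stand-in for the window enable bit in LCDC */
} lcd_state_t;

void lcd_begin(lcd_state_t *s, const lyc_event_t *events, uint8_t count)
{
    assert(s != NULL && events != NULL && count > 0);
    s->events = events;
    s->count = count;
    s->next = 0;
    s->window_on = 0;
    s->lyc = events[0].line;
}

/* Body of the LY==LYC handler: apply the event, then arm the next line. */
void lcd_on_lyc(lcd_state_t *s)
{
    assert(s != NULL);
    s->window_on = s->events[s->next].window_on;
    s->next = (uint8_t)((s->next + 1u) % s->count);
    s->lyc = s->events[s->next].line;
}
```

Because the last entry wraps around to the first, the table re-arms itself for line 0 of the following frame.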
nitro2k01 wrote:
If you are using this method you should decompress the tiles to temporary space in RAM and then copy to VRAM.
The Dizzy animation is not compressed, so there is no need to decompress it; the tiles just need to be copied.
nitro2k01 wrote:
The top portion where the window is has no sprites (see below) and is 24 lines high. You could maybe do this:
Just to clarify. Somewhere in the code stack_copy() is called. We mask all other interrupts, then the processor stops in HALT and waits until the LY=LYC handler is called. We already have this handler enabled, because we need to control the visibility of the window. After we fall through the HALT we DI, modify the stack, and do all the stuff you call "evil", then we copy some bytes (say 8, because we don't have as much time as in your "clear" case). When done, we restore the stack, EI, and unmask all the other interrupts.
Is this correct?
nitro2k01 wrote:
Also be careful about what happens when HBlank is already active when the check is done. The check passes and you start copying, and then HBlank ends earlier than you expect! One solution is to first check for not in HBlank and then check for HBlank. Or use HALT for sync, but use normal copy instead of stack copy...
Yes I can, but I thought you had the answer already. In this worst case we have at least mode 2, when writing to VRAM is still possible. So, is it safe to read and write two bytes if the "polling" check passed and mode 2 starts right after it? Or is it so short that there is time for one byte only? Or is even one byte not safe (in fact never safe, because there might be some timer, serial or joystick interrupts)?
Last edited by toxa (2020-03-23 12:12:50)
toxa wrote:
Just to clarify. Somewhere in the code stack_copy() is called. We mask all other interrupts, then the processor stops in HALT and waits until the LY=LYC handler is called. We already have this handler enabled, because we need to control the visibility of the window. After we fall through the HALT we DI, modify the stack, and do all the stuff you call "evil", then we copy some bytes (say 8, because we don't have as much time as in your "clear" case). When done, we restore the stack, EI, and unmask all the other interrupts.
Is this correct?
Pretty much. Although maybe one thing you don't understand yet. Stack copy is using HBlank interrupt. (Enabled by STAT bit 3.) It triggers an interrupt every time HBlank period starts, instead of one particular line. So you set this up once, and in the loop all you need to do every cycle is clear IF ($FF0F). No need to change the line every time. When stack copy is done, STAT can be restored to LY=LYC mode for matching line 23 and turning off the window.
toxa wrote:
In this worst case we have at least mode 2, when writing to VRAM is still possible. So, is it safe to read and write two bytes if the "polling" check passed and mode 2 starts right after it? Or is it so short that there is time for one byte only? Or is even one byte not safe (in fact never safe, because there might be some timer, serial or joystick interrupts)?
You can calculate this. We need <77 cycles. So in this case it seems yes, two bytes are possible.
    $unshr50:
        ldh A,(#_STAT_REG)  ;
        and #0x02           ; 8   8
        jr NZ, $unshr50     ; 8  16
        ld A, (BC)          ; 8  24
        inc BC              ; 8  32
        ld (HL+), A         ; 8  40
        ld A, (BC)          ; 8  48
        ld (HL+), A         ; 8  56
        inc BC
        jr $unshr0A
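The same bookkeeping can be done mechanically. The C sketch below just adds up the per-instruction cycle counts from the listing (the two number columns read as cost and running total) and compares the sum with the window; the type and function names are hypothetical, and the 77-cycle figure is simply the bound quoted above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    const char *insn;   /* instruction text, for readability only */
    unsigned cycles;    /* cost in clock cycles, as annotated in the listing */
} step_t;

/* Everything executed between reading STAT and the second VRAM write. */
static const step_t two_byte_copy[] = {
    {"and #0x02",         8},
    {"jr NZ (not taken)", 8},
    {"ld A, (BC)",        8},
    {"inc BC",            8},
    {"ld (HL+), A",       8},
    {"ld A, (BC)",        8},
    {"ld (HL+), A",       8},
};
#define TWO_BYTE_COPY_LEN (sizeof two_byte_copy / sizeof two_byte_copy[0])

/* Sum the cycle cost of a sequence of instructions. */
unsigned total_cycles(const step_t *steps, size_t n)
{
    unsigned sum = 0;
    size_t i;

    assert(steps != NULL);
    for (i = 0; i < n; i++)
        sum += steps[i].cycles;
    return sum;
}

/* Nonzero if the whole sequence finishes strictly inside the window. */
int fits_window(const step_t *steps, size_t n, unsigned window_cycles)
{
    return total_cycles(steps, n) < window_cycles;
}
```

With these numbers the sequence totals 56 cycles, which is where the "two bytes are possible" conclusion comes from; a third byte would need its own entries added to the table and a re-check.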
nitro2k01 wrote:
Although maybe one thing you don't understand yet. Stack copy is using HBlank interrupt. (Enabled by STAT bit 3.) It triggers an interrupt every time HBlank period starts, instead of one particular line. So you set this up once, and in the loop all you need to do every cycle is clear IF ($FF0F). No need to change the line every time. When stack copy is done, STAT can be restored to LY=LYC mode for matching line 23 and turning off the window.
But we still need to be in sync with LY, because stack_copy may occur at any time, just before any line, maybe before 23 (in fact, I need to check multiple lines, not just 23). If we don't manipulate window and sprite visibility because our copy routine is executing at the wrong moment, there will be flickering. Is it possible to switch between the modes? What is the best way to do this?
Last edited by toxa (2020-03-23 17:39:00)
toxa wrote:
But we still need to be in sync with LY, because stack_copy may occur at any time, just before any line, maybe before 23 (in fact, I need to check multiple lines, not just 23). If we don't manipulate window and sprite visibility because our copy routine is executing at the wrong moment, there will be flickering. Is it possible to switch between the modes? What is the best way to do this?
We know a couple of things.
1) We trigger the first event on LY==0.
2) Copying takes a known amount of time, 1 line/tile. So the copy will take 12 lines in this case, maybe a little more with setup etc, so let's say 15 lines.
3) We want to trigger on line 23 to hide the HUD.
It's easy to see that we have maybe 8 lines of margin. It's a super long time so if we just do things right, there's nothing to worry about.
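The margin estimate is plain arithmetic and can be written down as a small C helper. The function name is hypothetical, and the one-line-per-tile rate is the estimate given in this post, not a measured value.

```c
#include <assert.h>

/*
 * Lines left between the end of the copy and the deadline line.
 * A negative result means the copy would overrun the deadline.
 */
int copy_margin_lines(int start_line, int tiles, int setup_lines,
                      int deadline_line)
{
    assert(tiles >= 0 && setup_lines >= 0);
    /* Estimate from the thread: one scanline per 16-byte tile. */
    return deadline_line - (start_line + tiles + setup_lines);
}
```

For example, copy_margin_lines(0, 12, 3, 23) gives the 8 lines mentioned above, and with nine 8x8 tiles the margin grows to 11 lines.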
Well, that means that we must copy only at the beginning of the screen update. So everything has to be in sync, and I think that's not that easy, because it's really hard to calculate game logic timings and so on. I'll try! Even without this, the game now runs better. Please look at the compiled ROM above.
While implementing "lives" and "energy" I ran into a "problem of resurrection". After death, the main character should appear in the "last safe place". This "last safe place" might be 5 rooms away: you jump from the cliff, fly through several empty (or not) rooms, fall into water and die. Or even harder: you jump from one cliff, fall on another narrow one but cannot stop on it, roll, and then fall into water. You should not appear on that unreachable small cliff. You also should not appear in parts of rooms where you have never been, because that is a way to cheat. Any ideas?
toxa wrote:
Any ideas?
I made it work somehow, but I'm not quite sure that it works completely correctly.
Couldn't you record the last time Dizzy was on "solid ground" (or "safe platforms" or whatever) and mark little narrow cliffs as unsafe? Then just record the position as a respawn position on every step.
Tauwasser wrote:
Then just record the position as a respawn position on every step.
I made something like this. The position is recorded if you stand or walk, are not in a falling state, your energy is not decreasing, and so on. Additionally, every collision handler may reset the "safe position" flag, so you cannot appear on moving platforms, etc. Now I'm trying to find any cases where my algorithm does not work.
ps: Actually, there is no "solid ground". Normally, you "soar" above solid ground, and on every calculation step you "begin to fall", but this fails, because falling would put you into the ground, so you "continue soaring". That's why safe position detection is a bit tricky.
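The combination of both posts (record the position on every step, but let collision handlers veto it) can be sketched in C as below. All names are hypothetical and this is not the code from the ROM; `standing` stands for "not falling, jumping or losing energy" and `vetoed` for "some collision handler reset the safe-position flag this step".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t room;
    uint8_t x;
    uint8_t y;
} position_t;

typedef struct {
    position_t respawn;  /* last position accepted as safe */
    uint8_t valid;       /* zero until a first safe position was seen */
} respawn_t;

/*
 * Call once per game step, after physics and collision handlers ran.
 * The respawn point only moves while the player stands somewhere that
 * no handler (moving platform, narrow ledge, ...) has marked unsafe.
 */
void respawn_update(respawn_t *r, position_t now, int standing, int vetoed)
{
    assert(r != NULL);
    if (standing && !vetoed) {
        r->respawn = now;
        r->valid = 1u;
    }
}
```

Since only positions the player actually stood on are ever stored, rooms that were merely fallen through, or never visited, cannot become respawn points.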
Last edited by toxa (2020-04-01 03:36:58)
One more question. I'm running out of memory. What is the simplest way to move rarely used data, such as room graphics and maps, to additional banks with gbdk-n? It does not support this out of the box. Unfortunately, I have more than 20KB of compressed graphics and tilemaps. ;(
Everything is almost working: walking and physics, enemies, items and inventory, dialogs... The only thing left is "using items" to solve the quest, but all the memory is gone!
Last edited by toxa (2020-04-02 18:20:14)
You can now complete the first task: find a key and run the elevator, then collect 3 coins. Unfortunately, there are no bytes left for the troll to let you pass.
As I understand it, gbdk-n is basically using a modern SDCC and providing the gbdk libraries in an updated format.
The -boN and -baN options, which set the ROM and RAM bank of the code respectively, should work; see the SDCC Manual.
You will need to compile each C file separately per bank and set the correct command-line setting. After that, you will need to link everything together and call makebin with the correct options:
makebin: convert a Intel IHX file to binary or GameBoy format binary.
Usage: makebin [options] [<in_file> [<out_file>]]
Options:
  -p             pack mode: the binary file size will be truncated to the last occupied byte
  -s romsize     size of the binary file (default: 32768)
  -Z             genarate GameBoy format binary file
GameBoy format options (applicable only with -Z option):
  -yo n          number of rom banks (default: 2)
  -ya n          number of ram banks (default: 0)
  -yt n          MBC type (default: no MBC)
  -yn name       cartridge name (default: none)
  -yc            GameBoy Color
Arguments:
  <in_file>      optional IHX input file, '-' means stdin. (default: stdin)
  <out_file>     optional output file, '-' means stdout. (default: stdout)
The numbers for MBCs are the same as the Game Boy header's. These numbers get copied directly into the ROM for the most part (there is some logic check for ROM/RAM bank numbers).
However, banking seems broken at the moment in SDCC: see GBDK-n Issue #5.
Tauwasser wrote:
However, banking seems broken at the moment in SDCC: see GBDK-n Issue #5.
I used the old gbdk linker together with modern sdcc, as described by Zalo in that discussion. It took a while to get it working, but it's not as difficult as I thought. You may have a look at the make.bat from my sources. I also used #pragma bank <N> to specify the bank, instead of -yo/-ya.
There are now two tasks to accomplish in the new version of the ROM.
Last edited by toxa (2020-04-07 16:40:32)
It seems that I can deal with everything except those f#%$*ng VRAM glitches! No idea.
Last edited by toxa (2020-04-10 16:25:54)