Discussion about software development for the old-school Gameboys, ranging from the "Gray brick" to Gameboy Color
(Launched in 2008)
Hi.
I just started out with assembly so please bear with me.
My goal is to do this:
add hl, Label1 + ( [Var1] * 11 )
This of course doesn't work for several reasons. One being that Var1 is an 8 bit variable and Label1 is a 16 bit address.
But I can't figure out how to do this. I know it takes some puzzling to get it to work.
Are there any libraries with arithmetic functions that could handle something like this?
Thanks,
t-man
Hey T-man! Having a table with items that are 11 bytes big is a really bad idea! If you care about fast access, pad the table up to 16 bytes per item. This lets you use a trick based on the swap opcode. Swap exchanges the lower and higher nibbles (nibble = 4 bits, half of a byte) of a byte. If the higher 4 bits are 0, this is equivalent to a left shift by 4, and a left shift by 4 is equivalent to a multiplication by 16.
So you could do something like:
; Assuming the index is in A and is in the range 0-15
ld HL,Label1 ; Load the table base
ld D,0       ; Load 0 into the higher byte of DE ($00xx)
ld E,A       ; Load the index into E ($000x)
swap E       ; "Multiply" the index by 16, assuming higher nibble = 0 ($00x0)
add HL,DE    ; Add DE to HL
This only allows you to have 16 values in your table, unless you add more code.
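If 16 entries are not enough, one way to add that extra code is to do the multiplication in 16 bits. This is only a minimal sketch, assuming the same Label1 table with 16-byte items and an index of 0-255 in A:

```asm
; Sketch only: HL = Label1 + 16 * index, index (0-255) in A
ld L,A
ld H,0       ; HL = index, zero-extended to 16 bits
add HL,HL    ; HL = 2 * index
add HL,HL    ; HL = 4 * index
add HL,HL    ; HL = 8 * index
add HL,HL    ; HL = 16 * index
ld DE,Label1 ; Load the table base
add HL,DE    ; HL = address of the item
```

It is slower than the swap trick, but the offset can no longer overflow a single byte.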
You could also pad the table up to 12 bytes per item. Again, 11 bytes is really bad for tricks.
With 12 bytes, you could do this...
You multiply the value by 2 using the fast add A,A operation. You can do this several times. When you're at 4*index, you save a copy of the number, and later add it to 8*index. This gives you 12*index.
; Assuming the index is in A and is in the range 0-20
ld HL,Label1 ; Load the table base
ld D,0       ; Load 0 into the higher byte of DE ($00xx)
add A,A      ; A = 2 * index
add A,A      ; A = 4 * index
ld E,A       ; E = 4 * index
add A,A      ; A = 8 * index
add A,E      ; A = 8 * index + 4 * index = 12 * index
ld E,A       ; E = 12 * index, so DE now holds the offset
add HL,DE    ; Add DE to HL
This relatively simple algorithm allows you to use up to 21 table items.
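If padding really isn't an option, the same shift-and-add idea can be stretched to the original 11-byte items, since 11 * x = 2 * (4 * x + x) + x. A rough sketch, assuming Var1 is an 8-bit index in RAM and Label1 is the table base (both names taken from the question), done in 16 bits so that any index 0-255 works:

```asm
; Sketch only: HL = Label1 + 11 * [Var1]
ld A,[Var1]  ; A = index
ld L,A
ld H,0       ; HL = index, zero-extended to 16 bits
ld D,H
ld E,L       ; DE = index (kept for the two "+ index" steps)
add HL,HL    ; HL = 2 * index
add HL,HL    ; HL = 4 * index
add HL,DE    ; HL = 5 * index
add HL,HL    ; HL = 10 * index
add HL,DE    ; HL = 11 * index
ld DE,Label1 ; Load the table base
add HL,DE    ; HL = address of the item
```

This takes noticeably more instructions than the padded variants, which is the price of the odd item size.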
But by now, any seasoned asm programmer will have noticed that I haven't talked about aligned tables, so here goes... The most expensive operations in both algorithms are the ld HL,$xxxx and add HL,DE operations. Can they be avoided? Yes. If you deliberately put the table base on a 256-byte boundary, you can just load the higher byte of the address into H, and then load the lower byte with the calculated table position.
Let's say you place your table at $3400 on purpose. (Just a made-up address) Then you can do ld H,$34 and all that will vary between table positions is L.
The algorithms above could then be written as:
org $3400 ; Fixed offset. Syntax is different between assemblers
Label1:   ; (table data goes here)

; Assuming the index is in A and is in the range 0-15
ld H,<Label1 ; Load the upper byte of the table base into H. Syntax differs between assemblers (e.g. HIGH(Label1) in RGBDS).
ld L,A       ; Load the index into L (L = $0x)
swap L       ; "Multiply" the index by 16, assuming higher nibble = 0 (L = $x0)
That's short and sweet! If you do multiple lookups and never destroy H, you only need to load H with its value once, since all you're changing is L.
org $3400 ; Fixed offset. Syntax is different between assemblers
Label1:   ; (table data goes here)

; Assuming the index is in A and is in the range 0-20
ld H,<Label1 ; Load the upper byte of the table base into H. Syntax differs between assemblers (e.g. HIGH(Label1) in RGBDS).
add A,A      ; A = 2 * index
add A,A      ; A = 4 * index
ld L,A       ; L = 4 * index
add A,A      ; A = 8 * index
add A,L      ; A = 8 * index + 4 * index = 12 * index
ld L,A       ; Load the final value into L
Notice that I'm using L instead of E as the scratch register. That's because L gets overwritten with the final value anyway, so it costs nothing to use it for the intermediate result.
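Once HL points at an item, the alignment also helps when reading its fields. A small sketch, assuming the 16-byte layout, the index (0-15) in A, and RGBDS-style HIGH() for the upper byte (other assemblers spell this differently):

```asm
; Sketch only: read the first two bytes of item number A
ld H,HIGH(Label1) ; H = upper byte of the table address
swap A            ; A = index * 16 (upper nibble was 0)
ld L,A            ; HL = address of the item
ld B,[HL]         ; B = first byte of the item
inc L             ; Next byte; L can't wrap within a 16-byte item
ld C,[HL]         ; C = second byte of the item
```

Using inc L instead of inc HL works here because an item never crosses the 256-byte page, and the 8-bit increment is the cheaper instruction.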
Good luck!
Thanks a lot nitro! This is really helpful information and just the kind I need to learn in order to produce neat and effective code.
The aligned table thing is definitely something I'll make use of.
Cheers!
Also, there are some cases where you actually do have to multiply by something that isn't a power of two... In that case you can get some decent speed gains over a naive addition loop by using Booth's multiplier. Look it up on Wikipedia for a basic explanation. Additionally, if you're multiplying numbers in a small domain (i.e. if you only have a few possible input values), you may want to consider precomputing all of those possible values in a table of some kind. Of course, for memory offsets, you really can't do that, and it's much better to just align on a power-of-two boundary.
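To give an idea of what beats the addition loop, here is a sketch of plain shift-and-add multiplication. Note that this is not Booth's algorithm itself, just the simpler binary scheme that it improves on. The routine name Mul8 and the register assignment are made up for this example, and the local label syntax (.loop, .skip) is RGBDS-style:

```asm
; Sketch only: HL = A * E (unsigned 8 x 8 -> 16 bits)
; Destroys A, B and D
Mul8:
    ld D,0      ; DE = multiplicand, zero-extended
    ld H,D
    ld L,D      ; HL = 0 (running product)
    ld B,8      ; 8 multiplier bits to process
.loop:
    add HL,HL   ; Shift the product left by one bit
    add A,A     ; Shift the next multiplier bit (MSB first) into carry
    jr nc,.skip
    add HL,DE   ; Bit was set: add the multiplicand
.skip:
    dec B
    jr nz,.loop
    ret
```

The loop always runs 8 times, no matter how large the operands are, whereas a naive addition loop runs once per unit of the multiplier.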