Difference between revisions of "Struct"

From GbdevWiki
Jump to: navigation, search
(rgbds-structs and how to access them)
(No difference)

Revision as of 19:40, 18 January 2019

A record or struct is a compound data type consisting of a sequence of fields. These fields define different properties of an entity, with different dimensions, units, etc. For an actor in a video game, these properties might be displacements, velocities, time counters, animation frame IDs, hit points, and the like.

To a CPU, an array of structs appears as a 2-dimensional array of bytes. Each row is one instance, and the fields of that instance appear as columns. Reading and writing structs or other 2D arrays on the 8080 family isn't quite as efficient as on some other architectures.

Defining structures

The rgbds-structs macro pack by ISSOtm for RGBDS can be used to make constants for field offsets within a struct. An example from its documentation:

    struct NPC
    words 1, YPos         ; 2 bytes
    words 1, XPos         ; 2 bytes
    bytes 1, YBox         ; 1 byte
    bytes 1, PlayStation  ; 1 byte
    bytes 2, GfxID        ; 2 bytes
    longs 2, MovementData ; 8 bytes
    end_struct

The macro pack transforms the above into the following constants:

NPC_YPos equ 0
NPC_XPos equ 2
NPC_YBox equ 4
NPC_PlayStation equ 5
NPC_GfxID equ 6
NPC_MovementData equ 8
sizeof_NPC equ 16

It also provides a dstruct macro to define a single named instance of a struct with a label for each of its fields, as described in its documentation.

Seeking to struct fields

Imagine that you have a bunch of actors (player, enemies, etc.) in a game. Each loaded actor's state is stored in a 32-byte struct aligned to a 32-byte boundary. Given a pointer to some field in the struct, there are several ways to make HL point to the offset of a different field. Each has its own cost in bytes, cycles, and register use. The best in a particular case depends on the circumstances and the struct's layout.

(All cycle counts below refer to memory cycles or mcycles, which are 1.05 MHz on a Game Boy or single-speed GBC.)

Offset differs by 1: 1 byte, 1 cycle.

  inc l

Sequential access to fields is always fastest. If the previous access was between A and [HL], seeking might not even cost that, as the Game Boy CPU supports post-increment and post-decrement. However, it may prove difficult to predict the order of accesses months in advance, particularly for branchy enemy AI code.

Offset differs by 1 bit, such as $05 to $0D or $17 to $07: 2 bytes, 2 cycles.

  set 4, l

From any known offset, clobbering A: 4 bytes, 4 cycles.

  ld a, new_offset - old_offset
  add l
  ld l, a

This and other methods that clobber A are more useful for loads (ld a, [hl]) than for stores (ld [hl], a).

If the low byte of a known offset is in B, C, D, or E, the same technique can be used. This might prove useful if you have a bunch of structs in the same 256-byte page.

  ld a, new_offset - old_offset
  add b
  ld l, a

If DE or BC points at a known offset (the this-in-DE pattern): 4 bytes, 5 cycles.

  ld hl, new_offset - old_offset
  add hl, de

From an unknown offset into the struct, clobbering A: 6 bytes, 6 cycles.

  ld a, %11100000
  and a, l
  or a, new_offset
  ld l, a

This might be used if a subroutine left HL at some arbitrary offset into the struct.

From an unknown offset into the struct, clobbering nothing: 10 bytes, 10 cycles.

  res 0, l
  res 1, l
  res 2, l
  res 3, l
  res 4, l
  ; Change these res to set depending on the bits of the offset

Not very efficient; included for completeness.

If the structs are stored with 224 bytes of padding between them: 2 bytes, 2 cycles.

  ld l, offset

This is the "2-dimensional RAM" pattern that many games for Z80-based computers, such as ZX Spectrum and MSX, use to save about one mcycle compared to keeping this in IX. The actor ID is in H, the field ID in L. To use this in practice, a game would interleave the actor structs with other data that fit in the padding, such as other structs or smaller 1-dimensional arrays. On RGBDS, interleaving this array with several other arrays may require fixing the arrays' addresses at particular locations in WRAM in order to work around RGBASM's arbitrary dichotomy between labels and constants. In any case, it does not work on 1K machines, such as ColecoVision or SG-1000.

A load or store after any of these seeks adds 2 cycles.

Comparison to other architectures

Architectures other than 8080 family have different strengths and weaknesses, which imposes tradeoffs on how structs are stored.

For example, the indexed addressing modes of the 6502 and 65816 are better suited toward a structure of arrays, which transposes the 2D array to be column-major: the first fields of all instances are stored together, the second fields of all instances are stored together, and so on. Even multi-byte fields get broken up into separate arrays for the low and high bytes. Indexed mode performs a seek and load in 4 cycles or a seek and store in 5. However, the 6502 is weaker cycle-for-cycle than the 8080 at reading through large arrays, as its pointer-based addressing mode is slower. (See also CPU speed comparison.)

Load and store instructions on the Z80, 68000, MIPS, and ARM architectures support offset addressing, which temporarily adds a small number to a pointer before the load or store. This allows leaving the pointer to the start of the struct in a register and specifying the offset for each read or write. On Z80 and 68000, this is slightly slower than sequential access.