Rust GBA Guide

Here's a book that'll help you program in Rust on the Game Boy Advance (GBA).

It's a work in progress of course, but so is most of everything in Rust.

I'm out to teach you how to program in Rust on the GBA, obviously. However, while there is a gba crate, and while I genuinely believe it to be a good and useful crate for GBA programming, we will not be using the gba crate within this book. In fact we won't be using any crates at all. We can call it the Handmade Hero approach, if you like.

I don't want to just teach you how to use the gba crate, I want to teach you what you'd need to know to write the crate from scratch if it wasn't there.

Each chapter of the book will focus on a few things you'll need to know about GBA programming and then present a fully self-contained example that puts those ideas into action. Just one file per example, no dependencies, no external assets, no fuss. The examples will be in the text of the book within code blocks, but also you can find them in the examples directory of the repo if you want to get them that way.

I will try not to ask too much of the reader ahead of time, but you are expected to have already read The Rust Book.

It's very difficult to know when you've said something that someone else won't already know about, or if you're presenting ideas out of order. If things aren't clear please file an issue and we'll try to address it.

If you want to contact us you should join the Rust Community Discord and ask in the #gamedev channel.

Ketsuban is the wizard who knows much more about how it all works.
Lokathor is the fool who decided to write a crate and book for it.

If it's not a GBA specific question then you can probably ask any of the other folks in the server as well (there's a few hundred folks).

If you want to read more about developing on the GBA there are some other good resources as well:

Tonc, a tutorial series written for C, but it's what I based the ordering of this book's sections on.
GBATEK, a homebrew tech manual for GBA/NDS/DSi. We will regularly link to parts of it when talking about various bits of the GBA.
CowBite is another tech specification that's more GBA specific. It's sometimes got more ASCII art diagrams and example C struct layouts than GBATEK does.

Before you can build a GBA game you'll have to follow some special steps to set up the development environment. Perhaps unfortunately, there's enough detail here to warrant a mini-chapter all on its own.

Once again, extra special thanks to Ketsuban, who first dove into how to make this all work with Rust and then shared it with the world.

Obviously you need your computer to have a working Rust installation. However, you'll also need to ensure that you're using a nightly toolchain (we will need it for inline assembly, among other potentially useful features). You can run rustup default nightly to set nightly as the system wide default toolchain, or you can use a toolchain file to use nightly just on a specific project, but either way we'll be assuming the use of nightly from now on. You'll also need the rust-src component so that cargo-xbuild will be able to compile the core crate for us in a bit, so run rustup component add rust-src.

Next, you need devkitpro. They've got a graphical installer for Windows that runs nicely, and I guess pacman support on Linux (I'm on Windows so I haven't tried the Linux install myself). We'll be using a few of their general binutils for the arm-none-eabi target, and we'll also be using some of their tools that are specific to GBA development, so even if you already have the right binutils for whatever reason, you'll still want devkitpro for the gbafix utility.

On Windows you'll want something like C:\devkitpro\devkitARM\bin and C:\devkitpro\tools\bin to be added to your PATH, depending on where you installed it to and such.
On Linux you'll also want it to be added to your path, but if you're using Linux I'll just assume you know how to do all that. I'm told that the default installation path is /opt/devkitpro/devkitARM/bin, so look there first if you didn't select some other place.

Finally, you'll need cargo-xbuild. Just run cargo install cargo-xbuild and cargo will figure it all out for you.

Once the system wide tools are ready, you'll need some particular files each time you want to start a new project. You can find them in the root of the rust-console/gba repo.

thumbv4-none-agb.json describes the overall GBA to cargo-xbuild (and LLVM) so it knows what to do. Technically the GBA is thumbv4-none-eabi, but we change the eabi to agb so that we can distinguish it from other eabi devices when using cfg flags.
crt0.s describes some ASM startup stuff. If you have more ASM to place here later on this is where you can put it. You also need to build it into a crt0.o file before it can actually be used, but we'll cover that below.
linker.ld tells the linker all the critical info about the layout expectations that the GBA has about our program, and that it should also include the crt0.o file with our compiled rust code.

The next steps only work once you've got some source code to build. If you need a quick test, copy the hello1.rs file from our examples directory in the repository.

Once you've got something to build, you perform the following steps:

arm-none-eabi-as crt0.s -o crt0.o
- This assembles your text format crt0.s file into the object format crt0.o. You only need to do it when crt0.s changes, but it's a practically instant operation, so you might as well do it every time so that you never forget.
cargo xbuild --target thumbv4-none-agb.json
- This builds your Rust source. It accepts most of the normal options that you'd expect cargo to accept, such as --release, --bin foo, or --examples.
- You cannot build and run tests this way, because they require std, which the GBA doesn't have. If you want you can still run some of your project's tests with cargo test --lib or similar, but that builds for your local machine, so anything specific to the GBA (such as reading and writing registers) won't be testable that way. If you want to isolate and try out some piece of code running on the GBA you'll unfortunately have to make a demo for it in your examples/ directory and then run the demo in an emulator and see if it does what you expect.
- The file extension is important. cargo xbuild takes it as a flag to compile dependencies with the same sysroot, so you can include crates normally. Well, crates that work in the GBA's limited environment, but you get the idea.

At this point you have an ELF binary that some emulators can execute directly. This is helpful because it'll have debug symbols and all that, assuming a debug build. Specifically, mgba 0.7 beta 1 can do it, and perhaps other emulators can also do it.

However, if you want a "real" ROM that works in all emulators and that you could transfer to a flash cart there's a little more to do.

arm-none-eabi-objcopy -O binary target/thumbv4-none-agb/MODE/BIN_NAME target/ROM_NAME.gba
- This will perform an objcopy on our program. Here I've named the program arm-none-eabi-objcopy, which is what devkitpro calls their version of objcopy for the ARM target in the Windows install. If the program isn't found under that name, have a look in your installation directory to see if it's under a slightly different name or something.
- As you can see from reading the man page, the -O binary option takes our lovely ELF file with symbols and all that and strips it down to basically a bare memory dump of the program.
- The next argument is the input file. You might not be familiar with how cargo arranges stuff in the target/ directory, and between RLS and cargo doc and stuff it gets kinda crowded, so it goes like this:
  - Since our program was built for a non-local target, first we've got a directory named for that target, thumbv4-none-agb/
  - Next, the "MODE" is either debug/ or release/, depending on if we had the --release flag included. You'll probably only be packing release mode programs all the way into GBA roms, but it works with either mode.
  - Finally, the name of the program. If your program is something out of the project's src/bin/ then it'll be that file's name, or whatever name you configured for the bin in the Cargo.toml file. If your program is something out of the project's examples/ directory there will be a similar examples/ sub-directory first, and then the example's name.
- The final argument is the output of the objcopy, which I suggest putting at just the top level of the target/ directory. Really it could go anywhere, but if you're using git then it's likely that your .gitignore file is already set up to exclude everything in target/, so this makes sure that your intermediate game builds don't get checked into your git.
gbafix target/ROM_NAME.gba
- The gbafix tool also comes from devkitpro. The GBA is very picky about a ROM's format, and gbafix patches the ROM's header and such so that it'll work right. Unlike objcopy, this tool is custom built for GBA development, so it works just perfectly without any arguments beyond the file name. The ROM is patched in place, so we don't even need to specify a new destination.

And you're finally done!

Of course, you probably want to make a script for all that, but it's up to you. On our own project we have it mostly set up within a Makefile.toml which runs using the cargo-make plugin. It's not really the best plugin, but it's what's available.

Traditionally a person writes a "hello, world" program so that they can test that their development environment is set up properly, get a feel for using the tools involved, and get an idea of what a small part of a source file will look like. All that stuff.

Normally, you write a program that prints "hello, world" to the terminal. The GBA has no terminal, but it does have a screen, so instead we're going to draw three dots to the screen.

Our first example will be a totally minimal, full magic number crazy town. Ready? Here goes:

hello1.rs

#![feature(start)]
#![no_std]

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
  loop {}
}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
  unsafe {
    (0x04000000 as *mut u16).write_volatile(0x0403);
    (0x06000000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
    (0x06000000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
    (0x06000000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);
    loop {}
  }
}

Throw that into your project skeleton, build the program (as described back in Chapter 0), and give it a run in your emulator. You should see a red, green, and blue dot close-ish to the middle of the screen. If you don't, something already went wrong. Double check things, phone a friend, write your senators, try asking Ketsuban on the Rust Community Discord, until you're able to get your three dots going.

So, what just happened? Even if you're used to Rust that might look pretty strange. We'll go over most of the little parts right here, and then bigger parts will get their own sections.


#![feature(start)]

This enables the start feature, which you would normally be able to read about in the unstable book, except that the book tells you nothing at all except to look at the tracking issue.

Basically, a GBA game is even more low-level than the normal amount of low-level that you get from Rust, so we have to tell the compiler to account for that by specifying a #[start], and we need this feature on to do that.


#![no_std]

There's no standard library available on the GBA, so we'll have to live a core only life.


#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
  loop {}
}

This sets our panic handler. Basically, if we somehow trigger a panic, this is where the program goes. However, right now we don't know how to get any sort of message out to the user so... we do nothing at all. We can't even return from here, so we just sit in an infinite loop. The player will have to reset the universe from the outside.

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {

This is our #[start]. We call it main, but it's not like a main that you'd see in a Rust program. It's more like the sort of main that you'd see in a C program, but it's still not that either. If you compile a #[start] program for a target with an OS, you can list the symbols in the result with a tool such as nm (arm-none-eabi-nm is devkitpro's copy of it for ARM) and see that it has the symbol for the C main alongside the symbol for the start main that we write here. Our start main is just its own unique thing, and the inputs and outputs have to be like that because that's how #[start] is specified to work in Rust.

If you think about it for a moment you'll probably realize that those inputs and outputs are totally useless to us on a GBA. There's no OS on the GBA to call our program, and there's no place for our program to "return to" when it's done.

Side note: if you want to learn more about stuff "before main gets called" you can watch a great CppCon talk by Matt Godbolt (yes, that Godbolt) where he delves into quite a bit of it. The talk doesn't really apply to the GBA, but it's pretty good.


  unsafe {

I hope you're all set for some unsafe, because there's a lot of it to be had.


    (0x04000000 as *mut u16).write_volatile(0x0403);

Sure!


    (0x06000000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
    (0x06000000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
    (0x06000000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);

Ah, of course.


    loop {}
  }
}

And, as mentioned above, there's no place for a GBA program to "return to", so we can't ever let main try to return there. Instead, we go into an infinite loop that does nothing. The fact that this doesn't ever return an isize value doesn't bother Rust: a loop with no break never finishes, so its type is the never type (!), and that is accepted anywhere a value of some other type is expected.

Fun fact: unlike in C++, an infinite loop with no side effects isn't Undefined Behavior for us rustaceans... semantically. In truth LLVM has a known bug in this area, so we won't actually be relying on empty loops in any future programs.

Alright, I cheated quite a bit in the middle there. The program works, but I didn't really tell you why because I didn't really tell you what any of those magic numbers mean or do.

0x04000000 is the address of an IO Register called the Display Control.
0x06000000 is the start of Video RAM.

So we write some magic to the display control register once, then we write some other magic to three magic locations in the Video RAM. Somehow that shows three dots. Gotta read on to find out why!

Before we focus on what the numbers mean, first let's ask ourselves: Why are we doing volatile writes? You've probably never used volatile operations before at all. What is volatile anyway?

Well, the optimizer is pretty aggressive, and so it'll skip reads and writes when it thinks it can. Like if you write to a pointer once, and then again a moment later, and it didn't see any other reads in between, it'll think that it can just skip doing that first write since it'll get overwritten anyway. Sometimes that's correct, but sometimes it's not.

Marking a read or write as volatile tells the compiler that it really must do that action, and in the exact order that we wrote it out. It says that there might even be special hardware side effects going on that the compiler isn't aware of. In this case, the write to the display control register sets a video mode, and the writes to the Video RAM set pixels that will show up on the screen.

Similar to "atomic" operations you might have heard about, all volatile operations are enforced to happen in the exact order that you specify them, but only relative to other volatile operations. So something like


c.write_volatile(5);
a += b;
d.write_volatile(7);

might end up changing a either before or after the change to c (since the value of a doesn't affect the write to c), but the write to d will always happen after the write to c, even though the compiler doesn't see any direct data dependency there.

If you ever go on to use volatile stuff on other platforms it's important to note that volatile doesn't make things thread-safe, you still need atomic for that. However, the GBA doesn't have threads, so we don't have to worry about those sorts of thread safety concerns (there's interrupts, but that's another matter).
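
If you'd like to poke at volatile operations without a GBA handy, here's a minimal sketch that runs on an ordinary computer. It points at a local variable instead of a hardware register (an assumption made purely so that it can run anywhere), and the function name is made up for this example.

```rust
use core::ptr::{read_volatile, write_volatile};

/// Performs two volatile writes and then one volatile read on a local variable.
/// (Hypothetical demo function; a real GBA program would target an IO
/// register address instead of a local.)
pub fn two_volatile_writes() -> u16 {
    let mut cell: u16 = 0;
    let p: *mut u16 = &mut cell;
    unsafe {
        // Neither write may be skipped or merged, even though the second
        // one overwrites the first with nothing in between.
        write_volatile(p, 5);
        write_volatile(p, 7);
        // The read must also really happen, and it comes after both writes.
        read_volatile(p)
    }
}
```

On plain memory like this the end result is the same as with normal writes. The difference only becomes visible when the address is hardware that reacts to each individual access.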

Of course, writing out write_volatile every time is more than we wanna do. There's clarity and then there's excessive. This is a chance to write our first newtype. A newtype is basically a type that's got the exact same binary representation as some other type, but new methods and trait implementations.

We want a *mut T that's volatile by default, and also when we offset it... well the verdict is slightly unclear on how offset vs wrapping_offset work when you're using pointers that you made up out of nowhere. I've asked the experts and they genuinely weren't sure, so we'll make an offset method that does a wrapping_offset just to be careful.


#[derive(Debug, Clone, Copy, Hash, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);
impl<T> VolatilePtr<T> {
  pub unsafe fn read(&self) -> T {
    core::ptr::read_volatile(self.0)
  }
  pub unsafe fn write(&self, data: T) {
    core::ptr::write_volatile(self.0, data);
  }
  pub unsafe fn offset(self, count: isize) -> Self {
    VolatilePtr(self.0.wrapping_offset(count))
  }
}
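
To get a feel for the wrapper before pointing it at real hardware, here's a rough sketch that aims it at a small array on an ordinary computer. The array standing in for VRAM and the function name are assumptions for this example only. The type is repeated (with a shorter derive list) so that the snippet compiles on its own.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);

impl<T> VolatilePtr<T> {
    pub unsafe fn read(&self) -> T {
        core::ptr::read_volatile(self.0)
    }
    pub unsafe fn write(&self, data: T) {
        core::ptr::write_volatile(self.0, data);
    }
    pub unsafe fn offset(self, count: isize) -> Self {
        VolatilePtr(self.0.wrapping_offset(count))
    }
}

/// Hypothetical host-side demo: treats a four element array as if it were
/// a tiny strip of video memory and writes two "pixels" into it.
pub fn write_through_wrapper() -> [u16; 4] {
    let mut fake_vram = [0u16; 4];
    let base = VolatilePtr(fake_vram.as_mut_ptr());
    unsafe {
        // Element 0, no offset needed.
        base.write(0x001F);
        // The offset counts in elements (u16 here), not in bytes.
        base.offset(2).write(0x7C00);
        // Reading back goes through read_volatile.
        assert_eq!(base.offset(2).read(), 0x7C00);
    }
    fake_vram
}
```

Since VolatilePtr is Copy, calling offset doesn't consume the base pointer, so you can keep one base around and offset from it as often as you like.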

The GBA has a large number of IO Registers (not to be confused with CPU registers). These are special memory locations from 0x04000000 to 0x040003FE. GBATEK has a full list, but we only need to learn about a few of them at a time as we go, so don't be worried.

The important facts to know about IO Registers are these:

Each has their own specific size. Most are u16, but some are u32.
All of them must be accessed in a volatile style.
Each register is specifically readable or writable or both. Actually, with some registers there are even individual bits that are read-only or write-only.
- If you write to a read-only position, those writes are simply ignored. This mostly matters if a writable register contains a read-only bit (such as the Display Control, next section).
- If you read from a write-only position, you get back values that are basically nonsense. There aren't really any registers that mix write-only bits with readable bits, so you're basically safe here. The only (mild) concern is that when you write a value into a write-only register you need to keep track of what you wrote somewhere else if you want to know what you wrote (such as to adjust an offset value by +1, or whatever).
- You can always check GBATEK to be sure, but if I don't mention it then a bit is probably both read and write.
Some registers have invalid bit patterns. For example, the lowest three bits of the Display Control register can't legally be set to the values 6 or 7.

When talking about bit positions, the numbers are zero indexed just like an array index is.
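
If bit numbering is new to you, a tiny helper makes it concrete. This is only a sketch, and the function name is made up for illustration.

```rust
/// Returns true if bit `n` of `value` is set, counting from 0 at the
/// lowest (rightmost) bit. `n` must be in the range 0 through 15.
pub fn bit_is_set(value: u16, n: u32) -> bool {
    (value >> n) & 1 == 1
}
```

So for the value 0b0101, bits 0 and 2 are set while bits 1 and 3 are clear, and the highest bit of a u16 is bit 15.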

The display control register is our first actual IO Register. GBATEK gives it the shorthand DISPCNT, so you might see it under that name if you read other guides.

Among IO Registers, it's one of the simpler ones, but it's got enough complexity that we can get a hint of what's to come.

Also it's the one that you basically always need to set at least once in every GBA game, so it's a good starting one to go over for that reason too.

The display control register holds a u16 value, and is located at 0x0400_0000.

Many of the bits here won't mean much to you right now. That is fine. You do NOT need to memorize them all or what they all do right away. We'll just skim over all the parts of this register to start, and then we'll go into more detail in later chapters when we need to come back and use more of the bits.

The lowest three bits (0-2) let you select from among the GBA's six video modes. You'll notice that 3 bits allows for eight modes, but the values 6 and 7 are prohibited.

Modes 0, 1, and 2 are "tiled" modes. These are actually the modes that you should eventually learn to use as much as possible. They let the GBA's limited video hardware do as much of the work as possible, leaving more of your CPU time for gameplay computations. However, they're also complex enough to deserve their own demos and chapters later on, so that's all we'll say about them for now.

Modes 3, 4, and 5 are "bitmap" modes. These let you write individual pixels to locations on the screen.

Mode 3 is full resolution (240w x 160h) RGB15 color. You might not be used to RGB15, since modern computers have 24 or 32 bit colors. In RGB15, there's 5 bits for each color channel stored within a u16 value, and the highest bit is simply ignored.
Mode 4 is full resolution paletted color. Instead of being a u16 color, each pixel value is a u8 palette index entry, and then the display uses the palette memory (which we'll talk about later) to look up the actual color data. Since each pixel is half sized, we can fit twice as many. This lets us have two "pages". At any given moment only one page is active, and you can draw to the other page without the user noticing. You set which page to show with another bit we'll get to in a moment.
Mode 5 is full color, but also with pages. This means that we must have a reduced resolution to compensate (video memory is only so big!). The screen is effectively only 160w x 128h in this mode.

Bit 3 is effectively read only. Technically it can be flipped using a BIOS call, but when you write to the display control register normally it won't write to this bit, so we'll call it effectively read only.

This bit is on if the CPU is in CGB mode.

Bit 4 lets you pick which page to use. This is only relevant in video modes 4 or 5, and is just ignored otherwise. It's very easy to remember: when the bit is 0 the 0th page is used, and when the bit is 1 the 1st page is used.

The second page always starts at 0x0600_A000.

Bit 5 lets you access OAM during HBlank if enabled. This is cool, but it reduces the maximum sprites per scanline, so it's not default.

Bit 6 lets you adjust if the GBA should treat Object Character VRAM as being 2d (off) or 1d (on). This particular control can be kinda tricky to wrap your head around, so we'll be sure to have some extra diagrams in the chapter that deals with it.

Bit 7 forces the screen to stay in VBlank as long as it's set. This allows the fastest use of the VRAM, Palette, and Object Attribute Memory. Obviously if you leave this on for too long the player will notice a blank screen, but it might be okay to use for a moment or two every once in a while.

Bits 8 through 11 control if Background layers 0 through 3 should be active.

Bit 12 affects the Object layer.

Note that not all background layers are available in all video modes:

Mode 0: all
Mode 1: 0/1/2
Mode 2: 2/3
Mode 3/4/5: 2
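
If you'd rather have that list as something you can check in code, here's one possible way to write it down. The function name and the choice to return an empty list for the prohibited modes are assumptions of this sketch.

```rust
/// Background layers that can be used in each video mode, following the
/// list above. (Hypothetical helper; the prohibited modes 6 and 7, and
/// anything larger, give back an empty slice.)
pub fn available_backgrounds(mode: u16) -> &'static [u8] {
    match mode {
        0 => &[0, 1, 2, 3],
        1 => &[0, 1, 2],
        2 => &[2, 3],
        3 | 4 | 5 => &[2],
        _ => &[],
    }
}
```

This is why the bitmap mode examples only ever bother turning on background 2.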

Bits 13 and 14 enable the display of Windows 0 and 1, and Bit 15 enables the object display window. We'll get into how windows work later on; they let you do some nifty graphical effects.

So what did we do to the display control register in hello1?


    (0x04000000 as *mut u16).write_volatile(0x0403);

First let's convert that to binary, and we get 0b100_0000_0011. So, that's setting Mode 3 with background 2 enabled and nothing else special.
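
You can also let the computer do the decoding. This is a small sketch that pulls the video mode and the background enable bits out of a display control value, using the bit positions described above. The function names are made up for this example.

```rust
/// Extracts the video mode from bits 0-2 of a display control value.
pub fn video_mode(dispcnt: u16) -> u16 {
    dispcnt & 0b111
}

/// Reports whether background `layer` (0 through 3) is enabled.
/// Backgrounds 0 through 3 live in bits 8 through 11.
pub fn background_enabled(dispcnt: u16, layer: u32) -> bool {
    (dispcnt & (1 << (8 + layer))) != 0
}
```

Feeding in 0x0403 gives mode 3 with background 2 enabled and backgrounds 0, 1, and 3 disabled, which matches what we read off the binary form by hand.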

The GBA's Video RAM is 96k stretching from 0x0600_0000 to 0x0601_7FFF.

The Video RAM can only be accessed totally freely during a Vertical Blank (aka "VBlank", though sometimes I forget and don't capitalize it properly). At other times, if the CPU tries to touch the same part of video memory as the display controller is accessing then the CPU gets bumped by a cycle to avoid a clash.

Annoyingly, VRAM can only be properly written to in 16 and 32 bit segments (same with PALRAM and OAM). If you try to write just an 8 bit segment, then both parts of the 16 bit segment get the same value written to them. In other words, if you write the byte 5 to 0x0600_0000, then both 0x0600_0000 and ALSO 0x0600_0001 will have the byte 5 in them. We have to be extra careful when trying to set an individual byte, and we also have to be careful if we use memcpy or memset as well, because they're byte oriented by default and don't know to follow the special rules.
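
To make the byte write quirk concrete, here's a tiny model of it. This is not hardware access, just a sketch of the rule described above, and the function name is made up for this example.

```rust
/// Models the u16 that ends up in VRAM when the CPU writes a single byte
/// there: the byte gets duplicated into both halves of the 16 bit segment.
pub fn vram_byte_write_result(byte: u8) -> u16 {
    let b = byte as u16;
    (b << 8) | b
}
```

Writing the byte 5 therefore leaves 0x0505 in the affected u16, not 0x0005 or 0x0500.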

As I said before, RGB15 stores a color within a u16 value using 5 bits for each color channel.


pub const RED:   u16 = 0b0_00000_00000_11111;
pub const GREEN: u16 = 0b0_00000_11111_00000;
pub const BLUE:  u16 = 0b0_11111_00000_00000;

In Mode 3 and Mode 5 we write direct color values into VRAM, and in Mode 4 we write palette index values, and then the color values go into the PALRAM.

Mode 3 is pretty easy. We have a full resolution grid of rgb15 pixels. There's 160 rows of 240 pixels each, with the base address being the top left corner. A particular pixel uses normal "2d indexing" math:


let row_five_col_seven = 7 + (5 * SCREEN_WIDTH);

To draw a pixel, we just write a value at the address for the row and col that we want to draw to.
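
Here's that indexing math as a sketch you can run on any machine. The constant and function names are just for illustration, and this version uses usize throughout to keep the arithmetic simple.

```rust
pub const SCREEN_WIDTH: usize = 240;
pub const VRAM_BASE: usize = 0x0600_0000;

/// Index (counted in u16 pixels, not bytes) of a Mode 3 pixel.
pub fn mode3_index(col: usize, row: usize) -> usize {
    col + row * SCREEN_WIDTH
}

/// Byte address of a Mode 3 pixel. Each pixel is a u16, so the index is
/// doubled to get a byte offset from the start of VRAM.
pub fn mode3_address(col: usize, row: usize) -> usize {
    VRAM_BASE + 2 * mode3_index(col, row)
}
```

For example, column 7 of row 5 is pixel index 7 + 5 * 240 = 1207, and the pixel at (120, 80) sits at index 19320, which is address 0x0600_96F0.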

Mode 4 introduces page flipping. Instead of one giant page at 0x0600_0000, there's Page 0 at 0x0600_0000 and then Page 1 at 0x0600_A000. The resolution for each page is the same as above, but instead of writing u16 values, the memory is treated as u8 indexes into PALRAM. The PALRAM starts at 0x0500_0000, and there's enough space for 256 palette entries (each a u16).

To set the color of a palette entry we just do a normal u16 write_volatile.


(0x0500_0000 as *mut u16).offset(target_index).write_volatile(new_color)

To draw a pixel we write the index of the palette entry that we want the pixel to use. However, we must remember the "minimum size" write limitation that applies to VRAM. So, if we want to change just a single pixel at a time we must:

Read the full u16 it's a part of.
Clear the half of the u16 we're going to replace
Write the half of the u16 we're going to replace with the new value
Write that result back to the address.

So, the math for finding a byte offset is the same as Mode 3 (since they're both a 2d grid). The GBA is little-endian, so if the byte offset is EVEN it'll be the low bits of the u16 at half the byte offset. If the offset is ODD it'll be the high bits of the u16 at half the byte offset rounded down.

Does that make sense?

If we want to write pixel (0,0) the byte offset is 0, so we change the low bits of u16 offset 0. Then we want to write to (1,0), so the byte offset is 1, so we change the high bits of u16 offset 0. The pixels are next to each other, and the target bytes are next to each other, good so far.
If we want to write to (5,6) that'd be byte 5 + 6 * 240 = 1445, so we'd target the high bits of u16 offset floor(1445/2) = 722.
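
The read, clear, replace, write back steps can be sketched as plain arithmetic on a u16, which means you can try it on any machine. The function names are made up for this example. It relies on the GBA being little-endian, so an even byte offset is the low half of the u16 and an odd byte offset is the high half.

```rust
pub const SCREEN_WIDTH: usize = 240;

/// Which u16 (counted from the start of the page) holds a Mode 4 pixel.
pub fn mode4_u16_offset(col: usize, row: usize) -> usize {
    (col + row * SCREEN_WIDTH) / 2
}

/// Given the u16 currently in VRAM, returns the u16 to write back so that
/// only the byte at `byte_offset` changes to `index`.
pub fn mode4_merge(old: u16, byte_offset: usize, index: u8) -> u16 {
    if byte_offset % 2 == 0 {
        // Even byte: keep the high half, replace the low half.
        (old & 0xFF00) | (index as u16)
    } else {
        // Odd byte: keep the low half, replace the high half.
        (old & 0x00FF) | ((index as u16) << 8)
    }
}
```

On real hardware you'd do a volatile read of the u16, pass it through something like mode4_merge, and then do a volatile write of the result back to the same address.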

As you can see, trying to write individual pixels in Mode 4 is mostly a bad time. Fret not! We don't have to write individual bytes. If our data is arranged correctly ahead of time we can just write u16 or u32 values directly. The video hardware doesn't care, it'll get along just fine.

Mode 5 is also a two page mode, but instead of compressing the size of a pixel's data to fit in two pages, we compress the resolution.

Mode 5 is full u16 color, but only 160w x 128h per page.
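
If you want to check the "video memory is only so big" claim yourself, the arithmetic is short. This sketch uses the sizes and addresses given above, and the names are made up for this example.

```rust
/// Total VRAM size: 96k.
pub const VRAM_BYTES: usize = 96 * 1024;
/// Page 1 starts this many bytes after the start of VRAM (0x0600_A000).
pub const PAGE1_OFFSET: usize = 0xA000;

/// Mode 3: 240x160 pixels, 2 bytes each.
pub const MODE3_FRAME_BYTES: usize = 240 * 160 * 2;
/// Mode 4: 240x160 pixels, 1 byte each.
pub const MODE4_PAGE_BYTES: usize = 240 * 160;
/// Mode 5: 160x128 pixels, 2 bytes each.
pub const MODE5_PAGE_BYTES: usize = 160 * 128 * 2;

/// True if a page of this size fits before Page 1 begins, and a second
/// copy starting at Page 1 still ends inside VRAM.
pub fn fits_two_pages(page_bytes: usize) -> bool {
    page_bytes <= PAGE1_OFFSET && PAGE1_OFFSET + page_bytes <= VRAM_BYTES
}
```

A Mode 3 frame is 76800 bytes, which is too big to have two of. A Mode 4 page is 38400 bytes, and a Mode 5 page is exactly 0xA000 bytes, so both of those can be doubled up.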

So what got written into VRAM in hello1?


    (0x06000000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
    (0x06000000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
    (0x06000000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);

So at pixels (120,80), (136,80), and (120,96) we write three values. Once again we probably need to convert them into binary to make sense of it.

0x001F: 0b0_00000_00000_11111
0x03E0: 0b0_00000_11111_00000
0x7C00: 0b0_11111_00000_00000

Ah, of course, a red pixel, a green pixel, and a blue pixel.
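
If you don't feel like converting to binary by hand, a few shifts and masks will split an RGB15 value into its channels. This is just a sketch, and the function name is made up for this example.

```rust
/// Splits an RGB15 color into (red, green, blue), each in the range 0-31.
/// Red is the lowest 5 bits, then green, then blue. The highest bit of
/// the u16 is ignored.
pub fn rgb15_channels(color: u16) -> (u16, u16, u16) {
    let red = color & 0b11111;
    let green = (color >> 5) & 0b11111;
    let blue = (color >> 10) & 0b11111;
    (red, green, blue)
}
```

The three values from hello1 come out as full red, full green, and full blue, with nothing in the other channels.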

Okay so let's have a look again:

hello1

#![feature(start)]
#![no_std]

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
  loop {}
}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
  unsafe {
    (0x04000000 as *mut u16).write_volatile(0x0403);
    (0x06000000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
    (0x06000000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
    (0x06000000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);
    loop {}
  }
}

Now let's clean this up so that it's clearer what's going on.

First we'll label that display control stuff, including using the VolatilePtr type from the volatile explanation:


pub const DISPCNT: VolatilePtr<u16> = VolatilePtr(0x04000000 as *mut u16);
pub const MODE3: u16 = 3;
pub const BG2: u16 = 0b100_0000_0000;

Next we make some const values for the actual pixel drawing


pub const VRAM: usize = 0x06000000;
pub const SCREEN_WIDTH: isize = 240;

Note that VRAM has to be interpreted in different ways depending on mode, so we just leave it as usize and we'll cast it into the right form closer to the actual use.

Next we want a small helper function for putting together a color value. Happily, this one can even be declared as a const function. At the time of writing, we've got the "minimal const fn" support in nightly. It really is quite limited, but I'm happy to let rustc and LLVM pre-compute as much as they can when it comes to the GBA's tiny CPU.


pub const fn rgb16(red: u16, green: u16, blue: u16) -> u16 {
  blue << 10 | green << 5 | red
}

Finally, we'll make a function for drawing a pixel in Mode 3. Even though it's just a one-liner, having the "important parts" be labeled as function arguments usually helps you think about it a lot better.


pub unsafe fn mode3_pixel(col: isize, row: isize, color: u16) {
  VolatilePtr(VRAM as *mut u16).offset(col + row * SCREEN_WIDTH).write(color);
}

So now we've got this:

hello2

#![feature(start)]
#![no_std]

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
  loop {}
}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
  unsafe {
    DISPCNT.write(MODE3 | BG2);
    mode3_pixel(120, 80, rgb16(31, 0, 0));
    mode3_pixel(136, 80, rgb16(0, 31, 0));
    mode3_pixel(120, 96, rgb16(0, 0, 31));
    loop {}
  }
}

#[derive(Debug, Clone, Copy, Hash, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);
impl<T> VolatilePtr<T> {
  pub unsafe fn read(&self) -> T {
    core::ptr::read_volatile(self.0)
  }
  pub unsafe fn write(&self, data: T) {
    core::ptr::write_volatile(self.0, data);
  }
  pub unsafe fn offset(self, count: isize) -> Self {
    VolatilePtr(self.0.wrapping_offset(count))
  }
}

pub const DISPCNT: VolatilePtr<u16> = VolatilePtr(0x04000000 as *mut u16);
pub const MODE3: u16 = 3;
pub const BG2: u16 = 0b100_0000_0000;

pub const VRAM: usize = 0x06000000;
pub const SCREEN_WIDTH: isize = 240;

pub const fn rgb16(red: u16, green: u16, blue: u16) -> u16 {
  blue << 10 | green << 5 | red
}

pub unsafe fn mode3_pixel(col: isize, row: isize, color: u16) {
  VolatilePtr(VRAM as *mut u16).offset(col + row * SCREEN_WIDTH).write(color);
}

Exact same program that we started with, but much easier to read.

Of course, in the full gba crate that this book is a part of we have these and other elements all labeled and sorted out for you (not identically, but similarly). Still, for educational purposes it's often best to do it yourself at least once.

It's all well and good to draw three pixels, but they don't do anything yet. We want them to do something, and for that we need to get some input from the user.

The GBA, as I'm sure you know, has an arrow pad, A and B, L and R, Start and Select. That's a little more than the NES/GB/CGB had, and a little less than the SNES had. As you can guess, we get key state info from an IO register.

Also, we will need a way to keep the program from running "too fast". On a modern computer or console you do this with vsync info from the GPU and Monitor, and on the GBA we'll be using vsync info from an IO register that tracks what the display hardware is doing.

As a way to apply our knowledge we'll make a simple "light cycle" game where your dot leaves a trail behind it and you die if you go off the screen or if you touch your own trail. We just make a copy of hello2.rs named light_cycle.rs and then fill it in as we go through the chapter. Normally you might not place the entire program into a single source file, particularly as it grows over time, but since these are small examples it's much better to have them be completely self contained than it is to have them be "properly organized" for the long term.

The Key Input Register is our next IO register. Its shorthand name is KEYINPUT and it's a u16 at 0x4000130. The entire register is read-only: you can't tell the GBA what buttons are pressed.

Each button is exactly one bit:

Bit	Button
0	A
1	B
2	Select
3	Start
4	Right
5	Left
6	Up
7	Down
8	R
9	L

The higher bits above are not used at all.

Similar to other old hardware devices, the convention here is that a button's bit is clear when pressed, active when released. In other words, when the user is not touching the device at all the KEYINPUT value will read 0b0000_0011_1111_1111. There are similar values for when the user is pressing as many buttons as possible, but since the left/right and up/down keys are on an arrow pad the value can never be 0, because you can't ever press every single key at once.

When dealing with key input, the register always shows the exact key values at any moment you read it. Obviously that's what it should do, but what it means to you as a programmer is that you should usually gather input once at the top of a game frame and then use that single input poll as the input values across the whole game frame.

Of course, you might want to know if a user's key state changed from frame to frame. That's fairly easy too: We just store the last frame keys as well as the current frame keys (it's only a u16) and then we can xor the two values. Anything that shows up in the xor result is a key that changed. If it's changed and it's now down, that means it was pushed this frame. If it's changed and it's now up, that means it was released this frame.

The other major thing you might frequently want is to know "which way" the arrow pad is pointing: Up/Down/None and Left/Right/None. Sounds like an enum to me. Except that oftentimes we'll have situations where the direction just needs to be multiplied by a speed and applied as a delta to a position. We want to support that as well as we can too.

Let's get down to some code. First we want to make a way to read the address as a u16 and then wrap that in our newtype which will implement methods for reading and writing the key bits.


pub const KEYINPUT: VolatilePtr<u16> = VolatilePtr(0x400_0130 as *mut u16);

/// A newtype over the key input state of the GBA.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
#[repr(transparent)]
pub struct KeyInputSetting(u16);

pub fn key_input() -> KeyInputSetting {
  unsafe { KeyInputSetting(KEYINPUT.read()) }
}

Now we want a way to check if a key is being pressed, since that's normally how we think of things as a game designer and even as a player. That is, usually you'd say "if you press A, then X happens" instead of "if you don't press A, then X does not happen".

Normally we'd pick a constant for the bit we want, & it with our value, and then check for val != 0. Since the bit we're looking for is 0 in the "true" state we still pick the same constant and we still do the &, but we test with == 0. Practically the same, right? Well, since I'm asking a rhetorical question like that you can probably already guess that it's not the same. I was shocked to learn this too.

All we have to do is ask our good friend Godbolt what's gonna happen when the code compiles. The link there has the page set for the stable 1.30 compiler just so that the link results stay consistent if you read this book in a year or something. Also, we've set the target to thumbv6m-none-eabi, which is a slightly later version of ARM than the actual GBA, but it's close enough for just checking. Of course, in a full program small functions like these will probably get inlined into the calling code and disappear entirely as they're folded and refolded by the compiler, but we can just check.

It turns out that the !=0 test is 4 instructions and the ==0 test is 6 instructions. Since we want to get savings where we can, and we'll probably check the keys of an input often enough, we'll just always use a !=0 test and then adjust how we initially read the register to compensate. By using xor with a mask for only the 10 used bits we can flip the "low when pressed" values so that the entire result has active bits in all positions where a key is pressed.


pub fn key_input() -> KeyInputSetting {
  // xor with the 10 used bits flips "low when pressed" into "high when pressed"
  unsafe { KeyInputSetting(KEYINPUT.read() ^ 0b0000_0011_1111_1111) }
}
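
To see what the xor actually does to the bits, here's a tiny host-side sketch that works on plain u16 values instead of reading the real register. The names KEY_MASK and normalize_keys are made up for this illustration.

```rust
/// Mask covering the 10 key bits of KEYINPUT (illustrative name).
pub const KEY_MASK: u16 = 0b0000_0011_1111_1111;

/// Flips a raw "0 means pressed" reading into "1 means pressed".
pub const fn normalize_keys(raw: u16) -> u16 {
  raw ^ KEY_MASK
}
```

With nothing pressed the raw reading is 0b0000_0011_1111_1111, which normalizes to 0. With only A held the raw reading is 0b0000_0011_1111_1110, which normalizes to 1, the same value as bit 0 on its own.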

Now we add a method for seeing if a key is pressed. In the full library there's a more advanced version of this that's built up via macro, but for this example we'll just name a bunch of const values and then have a method that takes a value and says if that bit is on.


pub const KEY_A: u16 = 1 << 0;
pub const KEY_B: u16 = 1 << 1;
pub const KEY_SELECT: u16 = 1 << 2;
pub const KEY_START: u16 = 1 << 3;
pub const KEY_RIGHT: u16 = 1 << 4;
pub const KEY_LEFT: u16 = 1 << 5;
pub const KEY_UP: u16 = 1 << 6;
pub const KEY_DOWN: u16 = 1 << 7;
pub const KEY_R: u16 = 1 << 8;
pub const KEY_L: u16 = 1 << 9;

impl KeyInputSetting {
  pub fn contains(&self, key: u16) -> bool {
    (self.0 & key) != 0
  }
}

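Because each key is a unique bit you can even check for more than one key at once by combining key values with | (adding them gives the same result, since the bits never overlap). Be aware that contains, as written, reports whether any of the combined keys is pressed. If you need all of them to be pressed, compare the masked value against the mask itself instead of against 0.


let input_contains_a_or_l = input.contains(KEY_A | KEY_L);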
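
Since the != 0 test in contains is satisfied by any one matching bit, a check for "every one of these keys" needs a slightly different comparison. Here's a minimal sketch; contains_all is a hypothetical method name, not something from the gba crate.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct KeyInputSetting(u16);

pub const KEY_A: u16 = 1 << 0;
pub const KEY_L: u16 = 1 << 9;

impl KeyInputSetting {
  /// True if at least one of the given key bits is set.
  pub fn contains(&self, key: u16) -> bool {
    (self.0 & key) != 0
  }

  /// True only if every one of the given key bits is set (hypothetical helper).
  pub fn contains_all(&self, keys: u16) -> bool {
    (self.0 & keys) == keys
  }
}
```

With only A held, contains(KEY_A | KEY_L) is true while contains_all(KEY_A | KEY_L) is false.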

And we wanted to save the state of an old frame and compare it to the current frame to see what was different:


  // (inside impl KeyInputSetting) bits that are set here changed between frames
  pub fn difference(&self, other: KeyInputSetting) -> KeyInputSetting {
    KeyInputSetting(self.0 ^ other.0)
  }

Anything that's "in" the difference output is a key that changed, and then if the key reads as pressed this frame that means it was just pressed. The exact mechanics of all the ways you might care to do something based on new key presses is obviously quite varied, but it might be something like this:


let this_frame_diff = this_frame_input.difference(last_frame_input);

if this_frame_diff.contains(KEY_B) && this_frame_input.contains(KEY_B) {
  // the user just pressed B, react in some way
}
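
The same idea covers releases as well as presses. As a sketch, here are two free functions over already-flipped key values (1 means pressed); just_pressed and just_released are names invented for this example.

```rust
pub const KEY_B: u16 = 1 << 1;

/// Changed since last frame and down now: pushed this frame.
pub fn just_pressed(this_frame: u16, last_frame: u16, key: u16) -> bool {
  let changed = this_frame ^ last_frame;
  (changed & this_frame & key) != 0
}

/// Changed since last frame and was down before: released this frame.
pub fn just_released(this_frame: u16, last_frame: u16, key: u16) -> bool {
  let changed = this_frame ^ last_frame;
  (changed & last_frame & key) != 0
}
```

A key that is simply held across both frames shows up in neither result, because the xor clears its bit.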

And for the arrow pad, we'll make an enum that easily casts into i32. Whenever we're working with stuff we can try to use i32 / isize as often as possible just because it's easier on the GBA's CPU if we stick to its native number size. Having it be an enum lets us use match and be sure that we've covered all our cases.


/// A "tribool" value helps us interpret the arrow pad.
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
#[repr(i32)]
pub enum TriBool {
  Minus = -1,
  // deriving Default on an enum needs a variant marked #[default] (Rust 1.62+)
  #[default]
  Neutral = 0,
  // Rust has no unary plus, so this is written as a bare 1
  Plus = 1,
}

Now, how do we determine which way is plus or minus? Well... I don't know. Really. I'm not sure what the best one is because the GBA really wants the origin at 0,0 with higher rows going down and higher cols going right. On the other hand, all the normal math you and I learned in school is oriented with increasing Y being upward on the page. So, at least for this demo, we're going to go with what the GBA wants us to do and give it a try. If we don't end up confusing ourselves then we can stick with that. Maybe we can cover it over somehow later on.


  // (inside impl KeyInputSetting) columns increase to the right
  pub fn column_direction(&self) -> TriBool {
    if self.contains(KEY_RIGHT) {
      TriBool::Plus
    } else if self.contains(KEY_LEFT) {
      TriBool::Minus
    } else {
      TriBool::Neutral
    }
  }

  // rows increase downward, so Down is Plus
  pub fn row_direction(&self) -> TriBool {
    if self.contains(KEY_DOWN) {
      TriBool::Plus
    } else if self.contains(KEY_UP) {
      TriBool::Minus
    } else {
      TriBool::Neutral
    }
  }

So then in our game, every frame we can check for column_direction and row_direction and then apply those to the player's current position to make them move around the screen.
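
Here's roughly what "apply as a delta" looks like, as a sketch: the enum casts to an integer, gets multiplied by a speed, and is added to the position. The step function is a made-up helper for this illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(i32)]
pub enum TriBool {
  Minus = -1,
  Neutral = 0,
  Plus = 1,
}

/// Moves a position along one axis by speed pixels in the given direction.
pub fn step(position: isize, direction: TriBool, speed: isize) -> isize {
  position + speed * direction as isize
}
```

Neutral multiplies the speed by 0, so the position stays put without needing a separate branch.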

With that settled I think we're all done with user input for now. There's some other things to eventually know about like key interrupts that you can set and stuff, but we'll cover that later on because it's not necessary right now.

There's an IO register called VCOUNT that shows you, what else, the Vertical (row) COUNT(er). It's a u16 at address 0x0400_0006, and it's how we'll be doing our very poor quality vertical sync code to start.

What makes it poor? Well, we're just going to read from the vcount value as often as possible every time we need to wait for a specific value to come up, and then proceed once it hits the point we're looking for.
Why is this bad? Because we're making the CPU do a lot of useless work, which uses a lot more power than necessary. Even if you're not on an actual GBA you might be running inside an emulator on a phone or other handheld. You wanna try to save battery if all you're doing with that power use is waiting instead of making a game actually do something.
Can we do better? We can, but not yet. The better way to do things is to use a BIOS call to put the CPU into low power mode until a VBlank interrupt happens. However, we don't know about interrupts yet, and we don't know about BIOS calls yet, so we'll do the basic thing for now and then upgrade later.

So the way that display hardware actually displays each frame is that it moves a tiny pointer left to right across each pixel row one pixel at a time. When it's within the actual screen width (240px) it's drawing out those pixels. Then it goes past the edge of the screen for 68px during a period known as the "horizontal blank" (HBlank). Then it starts on the next row and does that loop over again. This happens for the whole screen height (160px) and then once again it goes past the last row for another 68 rows into a "vertical blank" (VBlank) period.

One pixel is 4 CPU cycles
HDraw is 240 pixels, HBlank is 68 pixels (1,232 cycles per full scanline)
VDraw is 160 scanlines, VBlank is 68 scanlines (280,896 cycles per full refresh)
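
If you want to double check those totals, the arithmetic is easy to write down as constants. This is just a sketch with made-up constant names; it uses 160 visible scanlines, matching the screen height.

```rust
pub const CYCLES_PER_PIXEL: u32 = 4;
pub const HDRAW_PIXELS: u32 = 240;
pub const HBLANK_PIXELS: u32 = 68;
pub const VDRAW_LINES: u32 = 160;
pub const VBLANK_LINES: u32 = 68;

// (240 + 68) pixels * 4 cycles
pub const CYCLES_PER_SCANLINE: u32 = (HDRAW_PIXELS + HBLANK_PIXELS) * CYCLES_PER_PIXEL;
// (160 + 68) scanlines * cycles per scanline
pub const CYCLES_PER_FRAME: u32 = (VDRAW_LINES + VBLANK_LINES) * CYCLES_PER_SCANLINE;
// the part of each frame that is spent in VBlank
pub const CYCLES_IN_VBLANK: u32 = VBLANK_LINES * CYCLES_PER_SCANLINE;
```

That works out to 83,776 cycles of VBlank per frame, which is the budget for changes that want to avoid fighting the display hardware.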

Now you may remember some stuff from the display control register section where it was mentioned that some parts of memory are best accessed during VBlank, and also during HBlank with a setting applied. These blanking periods are what was being talked about. At other times if you attempt to access video or object memory you (the CPU) might try touching the same memory that the display device is trying to use, in which case you get bumped back a cycle so that the display can finish what it's doing. Also, if you really insist on doing video memory changes while the screen is being drawn then you might get some visual glitches. If you can, just prepare all your changes ahead of time and then assign them all quickly during the blank period.

So first we want a way to check the vcount value at all:


pub const VCOUNT: VolatilePtr<u16> = VolatilePtr(0x0400_0006 as *mut u16);

pub fn vcount() -> u16 {
  unsafe { VCOUNT.read() }
}

Then we want two little helper functions to wait until VBlank and vdraw.


pub const SCREEN_HEIGHT: isize = 160;

pub fn wait_until_vblank() {
  // rows 0..160 are being drawn, so spin until the counter passes them
  while vcount() < SCREEN_HEIGHT as u16 {}
}

pub fn wait_until_vdraw() {
  // rows 160 and up are VBlank, so spin until the counter wraps back to 0
  while vcount() >= SCREEN_HEIGHT as u16 {}
}

And... that's it. No special types to be made this time around, it's just a number we read out of memory.

Now let's make a game of "light_cycle" with our new knowledge.

light_cycle is pretty simple, and very obvious if you've ever seen Tron. The player moves around the screen with a trail left behind them. They die if they go off the screen or if they touch their own trail.

We need some better drawing operations this time around.


pub unsafe fn mode3_clear_screen(color: u16) {
  // pack two copies of the color into one u32 so each write covers two pixels
  let color = color as u32;
  let bulk_color = color << 16 | color;
  let mut ptr = VolatilePtr(VRAM as *mut u32);
  for _ in 0..SCREEN_HEIGHT {
    for _ in 0..(SCREEN_WIDTH / 2) {
      ptr.write(bulk_color);
      ptr = ptr.offset(1);
    }
  }
}

pub unsafe fn mode3_draw_pixel(col: isize, row: isize, color: u16) {
  VolatilePtr(VRAM as *mut u16).offset(col + row * SCREEN_WIDTH).write(color);
}

pub unsafe fn mode3_read_pixel(col: isize, row: isize) -> u16 {
  VolatilePtr(VRAM as *mut u16).offset(col + row * SCREEN_WIDTH).read()
}

The draw pixel and read pixel are both pretty obvious. What's new is the clear screen operation. It changes the u16 color into a u32 and then packs the value in twice. Then we write out u32 values the whole way through screen memory. This means we have to do half as many write operations overall, so the screen clear is a good deal faster.
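
The packing step on its own is easy to check away from the hardware. This sketch pulls it out into a hypothetical pack_two_pixels function and counts the writes each approach needs for the 240x160 Mode 3 bitmap.

```rust
/// Duplicates one u16 color into both halves of a u32 (illustrative helper).
pub const fn pack_two_pixels(color: u16) -> u32 {
  let color = color as u32;
  color << 16 | color
}

pub const MODE3_PIXELS: usize = 240 * 160;
// one write per pixel when writing u16 values
pub const U16_WRITES: usize = MODE3_PIXELS;
// one write per pair of pixels when writing u32 values
pub const U32_WRITES: usize = MODE3_PIXELS / 2;
```

So the clear loop above runs 19,200 times instead of 38,400.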

Now we just have to fill in the main function:

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
  unsafe {
    DISPCNT.write(MODE3 | BG2);
  }

  let mut px = SCREEN_WIDTH / 2;
  let mut py = SCREEN_HEIGHT / 2;
  let mut color = rgb16(31, 0, 0);

  loop {
    // read the input for this frame
    let this_frame_keys = key_input();

    // adjust game state and wait for vblank
    px += 2 * this_frame_keys.column_direction() as isize;
    py += 2 * this_frame_keys.row_direction() as isize;
    wait_until_vblank();

    // draw the new game and wait until the next frame starts.
    unsafe {
      if px < 0 || py < 0 || px == SCREEN_WIDTH || py == SCREEN_HEIGHT {
        // out of bounds, reset the screen and position.
        mode3_clear_screen(0);
        color = color.rotate_left(5);
        px = SCREEN_WIDTH / 2;
        py = SCREEN_HEIGHT / 2;
      } else {
        let color_here = mode3_read_pixel(px, py);
        if color_here != 0 {
          // crashed into our own line, reset the screen
          mode3_clear_screen(0);
          color = color.rotate_left(5);
        } else {
          // draw the new part of the line
          mode3_draw_pixel(px, py, color);
          mode3_draw_pixel(px, py + 1, color);
          mode3_draw_pixel(px + 1, py, color);
          mode3_draw_pixel(px + 1, py + 1, color);
        }
      }
    }
    wait_until_vdraw();
  }
}

Oh that's a lot more than before!

First we set Mode 3 and Background 2, we know about that.

Then we're going to store the player's x and y, along with a color value for their light cycle. Then we enter the core loop.

We read the keys for input, and then do as much as we can without touching video memory. Since we're using video memory as the place to store the player's light trail, we can't do much, we just update their position and wait for VBlank to start. The player will be a 2x2 square, so the arrows will move you 2 pixels per frame.

Once we're in VBlank we check to see what kind of drawing we're doing. If the player has gone out of bounds, we clear the screen, rotate their color, and then reset their position. Why rotate the color? Just because it's fun to have different colors.
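
If you're curious what the rotation does to the color, here's a small sketch. Each channel is 5 bits wide, so rotating left by 5 slides red into green and green into blue. The next_color name is made up for this example.

```rust
/// The same rotation the game applies after each reset.
pub fn next_color(color: u16) -> u16 {
  color.rotate_left(5)
}
```

Starting from pure red (0x001F) you get pure green (0x03E0), then pure blue (0x7C00). The rotation after that wraps the top bits around to the bottom and gives 0x800F, which is a dimmer red with the otherwise unused bit 15 set, so from there on the colors are mixes rather than pure channels.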

Next, if the player is in bounds we read the video memory for their position. If it's not black that means we've been here before and the player has crashed into their own line. In this case, we reset the game without moving them to a new location.

Finally, if the player is in bounds and they haven't crashed, we write their color into memory at this position.

Regardless of how it worked out, we hold here until vdraw starts before going to the next loop. That's all there is to it.

Once again, as with the hello1 and hello2 examples, the gba crate covers much of this same ground as our example here, but in slightly different ways.

Better organization and abstractions are usually only realized once you've used more of the whole thing you're trying to work with. If we want to have a crate where the whole thing is well integrated with itself, then the examples would also end up having to explain about things we haven't really touched on much yet. It becomes a lot harder to teach.

So, going forward, we will continue to teach concepts and build examples that don't directly depend on the gba crate. This allows the crate to freely grow without all the past examples becoming a great inertia upon it.

Alright so we can do some basic "movement", but we left a big trail in the video memory of everywhere we went. Most of the time that's not what we want at all. If we want more hardware support we're going to have to use a new video mode. So far we've only used Mode 3, but modes 4 and 5 are basically the same. Instead, we'll switch focus to using a tiled graphical mode.

First we will go over the complete GBA memory mapping. Part of this is the memory for tiled graphics, but also things like all those IO registers, where our RAM is for scratch space, all that stuff. Even if we can't put all of them to use at once, it's helpful to have an idea of what will be available in the long run.

Tiled modes bring us two big new concepts that each have their own complexity: backgrounds and objects. They share some concepts, but fundamentally the background is for creating a very large static space that you can scroll around the view within, and the objects are about having a few moving bits that appear over the background. Careful use of backgrounds and objects is key to having the best looking GBA game, so we won't even be able to cover it all in a single chapter.

And, of course, since most games are pretty boring if they're totally static we'll touch on the kinds of RNG implementations you might want to have on a GBA. Most general purpose RNGs that you find are rather big compared to the amount of memory we want to give them, and they often use a lot of u64 operations, so they end up much slower on a 32-bit machine like the GBA (you can lower 64-bit ops to combinations of 32-bit ops, but that's quite a bit more work). We'll cover a few RNG options that size down the RNG to a good size and a good speed without trading away too much in terms of quality.

To top it all off, we'll make a simple "memory game" sort of thing. There's some face down cards in a grid, you pick one to check, then you pick the other to check, and then if they match the pair disappears.

The GBA Memory Map has several memory portions to it, each with their own little differences. Most of the memory has pre-determined use according to the hardware, but there is also space for games to use as a scratch pad in whatever way the game sees fit.

The memory ranges listed here are inclusive, so they end with a lot of Fs and Es.

We've talked about volatile memory before, but just as a reminder I'll say that all of the memory we'll talk about here should be accessed with volatile with two exceptions:

Work RAM (both internal and external) can be used normally, and if the compiler is able to totally elide any reads and writes that's okay.
However, if you set aside any space in Work RAM where an interrupt will communicate with the main program then that specific location will have to keep using volatile access, since the compiler never knows when an interrupt will actually happen.

0x0 to 0x3FFF (16k)

This is special memory for the BIOS. It is "read-only", but even then it's only accessible when the program counter is pointing into the BIOS region. At all other times you get a garbage value back when you try to read out of the BIOS.

0x2000000 to 0x203FFFF (256k)

This is a big pile of space, the use of which is up to each game. However, the external work ram has only a 16-bit bus (if you read/write a 32-bit value it silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra CPU cycles that you have to expend per 16-bit bus use).

In other words, we should think of EWRAM as if it was "heap space" in a normal application. You can take the time to go store something within EWRAM, or to load it out of EWRAM, but you should always avoid doing a critical computation on values in EWRAM. It's a bit of a pain, but if you wanna be speedy and you have more than just one manipulation that you want to do, you should pull the value into a local variable, do all of your manipulations, and then push it back out at the end.

0x3000000 to 0x3007FFF (32k)

This is a smaller pile of space, but it has a 32-bit bus and no wait.

By default, 0x3007F00 to 0x3007FFF is reserved for interrupt and BIOS use. The rest of it is totally up to you. The user's stack space starts at 0x3007F00 and proceeds down from there. In other words, if you start your own customized IWRAM use at 0x3000000 and go up, eventually you might hit your stack. However, most reasonable uses won't actually cause a memory collision. It's just something you should know about if you're using a ton of stack or IWRAM and then get problems.

0x4000000 to 0x40003FE

We've touched upon a few of these so far, and we'll get to more later. At the moment it is enough to say that, as you might have guessed, all of them live in this region. Each individual register is a u16 or u32 and they control all sorts of things. We'll actually be talking about some more of them in this very chapter, because that's how we'll control some of the background and object stuff.

0x5000000 to 0x50003FF (1k)

Palette RAM has a 16-bit bus, which isn't really a problem because it conceptually just holds u16 values. There's no automatic wait state, but if you try to access the same location that the display controller is accessing you get bumped by 1 cycle. Since the display controller can use the palette ram any number of times per scanline it's basically impossible to predict if you'll have to do a wait or not during VDraw. During VBlank you won't have any wait of course.

PALRAM is among the memory where there's weirdness if you try to write just one byte: if you try to write just 1 byte, it writes that byte into both parts of the larger 16-bit location. This doesn't really affect us much with PALRAM, because palette values are all supposed to be u16 anyway.

The palette memory actually contains not one, but two sets of palettes. First there's 256 entries for the background palette data (starting at 0x5000000), and then there's 256 entries for object palette data (starting at 0x5000200).

The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and 4-bits-per-pixel (4bpp).

In 8bpp mode an (8-bit) palette index value within a background or sprite simply indexes directly into the 256 slots for that type of thing.
In 4bpp mode a (4-bit) palette index value within a background or sprite specifies an index within a particular "palbank" (16 palette entries each), and then a separate setting outside of the graphical data determines which palbank is to be used for that background or object (the screen entry data for backgrounds, and the object attributes for objects).
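
As a rough sketch of the addressing described above: every palette entry is a u16, so slot N lives 2 * N bytes past the start of its palette, and in 4bpp mode the palbank picks which group of 16 slots the 4-bit index counts from. All of the names below are invented for this illustration.

```rust
pub const BG_PALETTE_BASE: usize = 0x500_0000;
pub const OBJ_PALETTE_BASE: usize = 0x500_0200;

/// Address of a background palette slot (0..=255).
pub const fn bg_palette_addr(slot: usize) -> usize {
  BG_PALETTE_BASE + slot * 2
}

/// Address of an object palette slot (0..=255).
pub const fn obj_palette_addr(slot: usize) -> usize {
  OBJ_PALETTE_BASE + slot * 2
}

/// In 4bpp mode: palbank (0..=15) plus a 4-bit index gives the full slot number.
pub const fn palbank_slot(palbank: usize, index: usize) -> usize {
  palbank * 16 + index
}
```

For example, index 5 within palbank 3 is slot 53 of whichever palette is in use.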

0x6000000 to 0x6017FFF (96k)

We've used this before! VRAM has a 16-bit bus and no wait. However, the same as with PALRAM, the "you might have to wait if the display controller is looking at it" rule applies here.

Unfortunately there's not much more exact detail that can be given about VRAM. The use of the memory depends on the video mode that you're using.

One general detail of note is that you can't write individual bytes to any part of VRAM. Depending on mode and location, you'll either get your bytes doubled into both the upper and lower parts of the 16-bit location targeted, or you won't even affect the memory. This usually isn't a big deal, except in two situations:

In Mode 4, if you want to change just 1 pixel, you'll have to be very careful to read the old u16, overwrite just the byte you wanted to change, and then write that back.
In any display mode, avoid using memcpy to place things into VRAM. It's written to be byte oriented, and only does 32-bit transfers under select conditions. The rest of the time it'll copy one byte at a time and you'll get either garbage or nothing at all.
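
For the Mode 4 case, the "overwrite just the byte you wanted" step might look something like this sketch. It assumes the even column sits in the low byte of each u16 (the GBA is little-endian), and replace_pixel_byte is a made-up helper name. You'd still do the volatile u16 read before it and the volatile u16 write after it.

```rust
/// Returns old with one of its two 8-bit pixels replaced.
/// Even columns use the low byte, odd columns use the high byte.
pub fn replace_pixel_byte(old: u16, col: usize, palette_index: u8) -> u16 {
  if col % 2 == 0 {
    (old & 0xFF00) | palette_index as u16
  } else {
    (old & 0x00FF) | ((palette_index as u16) << 8)
  }
}
```

The neighboring pixel's byte passes through untouched either way.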

0x7000000 to 0x70003FF (1k)

The Object Attribute Memory has a 32-bit bus and no default wait, but suffers from the "you might have to wait if the display controller is looking at it" rule. You cannot write individual bytes to OAM at all, but that's not really a problem because all the fields of the data types within OAM are either i16 or u16 anyway.

Object attribute memory is the wildest yet: it conceptually contains two types of things, but they're interlaced with each other all the way through.

Now, GBATEK and CowByte don't quite give names to the two data types, though TONC calls them OBJ_ATTR and OBJ_AFFINE. We'll give them Rust names of course. In Rust terms their layout would look like this:


#[repr(C)]
pub struct ObjectAttribute {
  attr0: u16,
  attr1: u16,
  attr2: u16,
  filler: i16,
}

#[repr(C)]
pub struct AffineMatrix {
  filler0: [u16; 3],
  pa: i16,
  filler1: [u16; 3],
  pb: i16,
  filler2: [u16; 3],
  pc: i16,
  filler3: [u16; 3],
  pd: i16,
}

(Note: the #[repr(C)] part just means that Rust must lay out the data exactly in the order we specify, which otherwise it is not required to do).

So, we've got 1024 bytes in OAM and each ObjectAttribute value is 8 bytes, so naturally we can support up to 128 objects.

At the same time, we've got 1024 bytes in OAM and each AffineMatrix is 32 bytes, so we can have 32 of them.

But, as I said, these things are all interlaced with each other. See how there's "filler" fields in each struct? If we imagine the OAM as being just an array of one type or the other, indexes 0/1/2/3 of the ObjectAttribute array would line up with index 0 of the AffineMatrix array. It's kinda weird, but that's just how it works. When we set up functions to read and write these values we'll have to be careful with how we do it. We probably won't want to use those representations above, at least not with the AffineMatrix type, because they're quite wasteful if you want to store just object attributes or just affine matrices.
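
One way to picture the interlacing is to compute addresses directly instead of going through the padded structs. This is only a sketch with invented function names: each object attribute entry is 8 bytes, and each affine parameter sits in the last 2 bytes of one of those 8-byte entries.

```rust
pub const OAM_BASE: usize = 0x700_0000;

/// Address of object attribute entry index (0..=127).
pub const fn object_attr_addr(index: usize) -> usize {
  OAM_BASE + index * 8
}

/// Address of one parameter of affine matrix matrix (0..=31).
/// param is 0 for pa, 1 for pb, 2 for pc, 3 for pd.
pub const fn affine_param_addr(matrix: usize, param: usize) -> usize {
  OAM_BASE + matrix * 32 + param * 8 + 6
}
```

So pa of matrix 0 occupies the filler slot of object 0, pb the filler slot of object 1, and so on, while matrix 1 starts over at object 4.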

0x8000000 to 0x9FFFFFF (wait 0)
0xA000000 to 0xBFFFFFF (wait 1)
0xC000000 to 0xDFFFFFF (wait 2)
Max of 32MB

These portions of the memory are less fixed, because they depend on the precise details of the game pak you've inserted into the GBA. In general, they connect to the game pak ROM and/or Flash memory, using a 16-bit bus. The ROM is read-only, but the Flash memory (if any) allows writes.

The game pak ROM is listed as being in three sections, but it's actually the same memory being effectively mirrored into three different locations. The mirror that you choose to access the game pak through affects which wait state setting it uses (configured via IO register of course). Unfortunately, the details come down more to the game pak hardware that you load your game onto than anything else, so there's not much I can say right here. We'll talk about it more later.
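
Since the three ranges are the same ROM seen through different windows, turning a ROM offset into an address is just a matter of picking the window. A minimal sketch, with made-up names:

```rust
pub const ROM_WAIT0_BASE: usize = 0x800_0000;
// each mirror window is 0x2000000 bytes (32MB) wide
pub const ROM_MIRROR_STRIDE: usize = 0x200_0000;

/// Address of rom_offset when accessed through wait state 0, 1, or 2.
pub const fn rom_mirror_addr(rom_offset: usize, wait_state: usize) -> usize {
  ROM_WAIT0_BASE + wait_state * ROM_MIRROR_STRIDE + rom_offset
}
```

The same byte at offset 0x100 shows up at 0x8000100, 0xA000100, and 0xC000100.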

One thing of note is the way that the 16-bit bus affects us: the instructions to execute are coming through the same bus as the rest of the game data, so we want them to be as compact as possible. The ARM chip in the GBA supports two different instruction sets, "thumb" and "non-thumb". The thumb mode instructions are 16-bit, so they can each be loaded one at a time, and the non-thumb instructions are 32-bit, so we're at a penalty if we execute them directly out of the game pak. However, some things will demand that we use non-thumb code, so we'll have to deal with that eventually. It's possible to switch between modes, but it's a pain to keep track of what mode you're in because there's not currently support for it in Rust itself (perhaps some day). So we'll stick with thumb code as much as we possibly can, that's why our target profile for our builds starts with thumbv4.

0xE000000 to 0xE00FFFF (64k)

The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save? This is why. It also has some amount of wait, but as with the ROM, the details depend on your game pak hardware (and also as with ROM, you can adjust the settings with an IO register, should you need to).

One thing to note about the SRAM is that the GBA has a Direct Memory Access (DMA) feature that can be used for bulk memory movements in some cases, but the DMA cannot access the SRAM region. You really are stuck reading and writing one byte at a time when you're using the SRAM.
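
In practice "one byte at a time" means a plain loop of volatile u8 accesses. Here's a sketch of such a copy; copy_bytes_volatile is a hypothetical helper, and it works on any pair of pointers, so you can try it out on ordinary arrays before pointing it at 0xE000000.

```rust
/// Copies len bytes from src to dst using only single-byte volatile accesses.
///
/// Safety: both pointers must be valid for len bytes and must not overlap.
pub unsafe fn copy_bytes_volatile(src: *const u8, dst: *mut u8, len: usize) {
  for i in 0..len {
    unsafe {
      let byte = core::ptr::read_volatile(src.add(i));
      core::ptr::write_volatile(dst.add(i), byte);
    }
  }
}
```

Saving a struct this way means viewing it as bytes first, which is its own bit of care that we're skipping over here.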

TODO
