Introduction
This is the book for learning how to write GameBoy Advance (GBA) games in Rust.
I'm Lokathor, the main author of the book. There's also Ketsuban who provides the technical advisement, reviews the PRs, and keeps my crazy in check.
The book is a work in progress, as you can see if you actually try to open many of the pages listed in the Table Of Contents.
Feedback
It's very often hard to tell when you've explained something properly. In the same way that your brain will read over small misspellings and correct things into the right word, if an explanation for something you already understand accidentally skips over some small detail then your brain can fill in the gaps without you realizing it.
Please, if things don't make sense then file an issue about it so I know where things need to improve.
Reader Requirements
This book naturally assumes that you've already read Rust's core book, The Rust Programming Language.
Now, I know it sounds silly to say "if you wanna program Rust on this old video game system you should already know how to program Rust", but the more people I meet and chat with the more they tell me that they jumped into Rust without reading any or all of the book. You know who you are.
Please, read the whole book!
In addition to the core book, there's also an expansion book that I will declare to be required reading for this: The Rustonomicon.
The Rustonomicon is all about trying to demystify unsafe. We'll end up using a fair bit of unsafe code as a natural consequence of doing direct hardware manipulations. Using unsafe is like swinging a sword, you should start slowly, practice carefully, and always pay attention no matter how experienced you think you've become.
That said, it's sometimes a necessary tool to get the job done, so you have to break out of the borderline pathological fear of using it that most Rust programmers tend to have.
Book Goals and Style
So, what's this book actually gonna teach you?
I'm not gonna tell you how to use a crate that already exists.
Don't get me wrong, there is a gba crate, and it's on crates.io and all that jazz.
However, unlike most crates that come with a tutorial book, I don't want to just
teach you how to use the crate. What I want is to teach you what you need to
know so that you could build the crate yourself, from scratch, if it didn't
already exist for you. Let's call it the Handmade
Hero school of design. Much more than you might find
in other Rust crate books, I'll be attempting to show a lot of the why in
addition to just the how. Once you know how to do it all on your own, you can
decide for yourself if the gba
crate does it well, or if you think you can
come up with something that suits your needs better.
Overall the book is sorted for easy review once you're trying to program something, and the GBA has a few interconnected concepts, so some parts of the book end up having to refer you to portions that you haven't read yet. The chapters and sections are sorted so that minimal future references are required, but it's unavoidable.
The actual "tutorial order" of the book is the Examples chapter. Each section of that chapter breaks down one of the provided examples in the examples directory of the repository. We go over what sections of the book you'll need to have read for the example code to make sense, and also how we apply the general concepts described in the book to the specific example cases.
Development Setup
Before you can build a GBA game you'll have to follow some special steps to set up the development environment.
Once again, extra special thanks to Ketsuban, who first dove into how to make this all work with Rust and then shared it with the world.
Per System Setup
Obviously you need your computer to have a working Rust installation. However, you'll also need to ensure that you're using a nightly toolchain (we will need it for inline assembly, among other potentially useful features). You can run rustup default nightly to set nightly as the system wide default toolchain, or you can use a toolchain file to use nightly just on a specific project, but either way we'll be assuming the use of nightly from now on. You'll also need the rust-src component so that cargo-xbuild will be able to compile the core crate for us in a bit, so run rustup component add rust-src.
Next, you need devkitpro. They've got a graphical installer for Windows that runs nicely, and I guess pacman support on Linux (I'm on Windows so I haven't tried the Linux install myself). We'll be using a few of their general binutils for the arm-none-eabi target, and we'll also be using some of their tools that are specific to GBA development, so even if you already have the right binutils for whatever reason, you'll still want devkitpro for the gbafix utility.
- On Windows you'll want something like C:\devkitpro\devkitARM\bin and C:\devkitpro\tools\bin to be added to your PATH, depending on where you installed it to and such.
- On Linux you can use pacman to get it, and the default install puts the stuff in /opt/devkitpro/devkitARM/bin and /opt/devkitpro/tools/bin. If you need help you can look in our repository's .travis.yml file to see exactly what our CI does.
Finally, you'll need cargo-xbuild. Just run cargo install cargo-xbuild and cargo will figure it all out for you.
Per Project Setup
Once the system wide tools are ready, you'll need some particular files each time you want to start a new project. You can find them in the root of the rust-console/gba repo.
- thumbv4-none-agb.json describes the overall GBA to cargo-xbuild (and LLVM) so it knows what to do. Technically the GBA is thumbv4-none-eabi, but we change the eabi to agb so that we can distinguish it from other eabi devices when using cfg flags.
- crt0.s describes some ASM startup stuff. If you have more ASM to place here later on this is where you can put it. You also need to build it into a crt0.o file before it can actually be used, but we'll cover that below.
- linker.ld tells the linker all the critical info about the layout expectations that the GBA has about our program, and that it should also include the crt0.o file with our compiled Rust code.
Compiling
Once all the tools are in place, there are particular steps that you need to follow to compile the project. For these to work you'll need some source code to compile. Unlike with other things, an empty main file and/or an empty lib file will cause a total build failure, because we'll need a no_std build, and Rust defaults to builds that use the standard library. The next section has a minimal example file you can use (along with explanation), but we'll describe the build steps here.
- arm-none-eabi-as crt0.s -o target/crt0.o
  - This builds your text format crt0.s file into object format crt0.o that's placed in the target/ directory. Note that if the target/ directory doesn't exist yet it will fail, so you have to make the directory if it's not there. You don't need to rebuild crt0.s every single time, only when it changes, but you might as well throw a line to do it every time into your build script so that you never forget, because it's a practically instant operation anyway.
- cargo xbuild --target thumbv4-none-agb.json
  - This builds your Rust source. It accepts most of the normal options, such as --release, and options, such as --bin foo or --examples, that you'd expect cargo to accept.
  - You can not build and run tests this way, because they require std, which the GBA doesn't have. If you want you can still run some of your project's tests with cargo test --lib or similar, but that builds for your local machine, so anything specific to the GBA (such as reading and writing registers) won't be testable that way. If you want to isolate and try out some piece of code running on the GBA you'll unfortunately have to make a demo for it in your examples/ directory and then run the demo in an emulator and see if it does what you expect.
  - The file extension is important! It will work if you forget it, but cargo xbuild takes the inclusion of the extension as a flag to also compile dependencies with the same sysroot, so you can include other crates in your build. Well, crates that work in the GBA's limited environment, but you get the idea.
At this point you have an ELF binary that some emulators can execute directly (more on that later). However, if you want a "real" ROM that works in all emulators and that you could transfer to a flash cart to play on real hardware there's a little more to do.
- arm-none-eabi-objcopy -O binary target/thumbv4-none-agb/MODE/BIN_NAME target/ROM_NAME.gba
  - This will perform an objcopy on our program. Here I've named the program arm-none-eabi-objcopy, which is what devkitpro calls their version of objcopy that's specific to the GBA in the Windows install. If the program isn't found under that name, have a look in your installation directory to see if it's under a slightly different name or something.
  - As you can see from reading the man page, the -O binary option takes our lovely ELF file with symbols and all that and strips it down to basically a bare memory dump of the program.
  - The next argument is the input file. You might not be familiar with how cargo arranges stuff in the target/ directory, and between RLS and cargo doc and stuff it gets kinda crowded, so it goes like this:
    - Since our program was built for a non-local target, first we've got a directory named for that target, thumbv4-none-agb/
    - Next, the "MODE" is either debug/ or release/, depending on if we had the --release flag included. You'll probably only be packing release mode programs all the way into GBA ROMs, but it works with either mode.
    - Finally, the name of the program. If your program is something out of the project's src/bin/ then it'll be that file's name, or whatever name you configured for the bin in the Cargo.toml file. If your program is something out of the project's examples/ directory there will be a similar examples/ sub-directory first, and then the example's name.
  - The final argument is the output of the objcopy, which I suggest putting at just the top level of the target/ directory. Really it could go anywhere, but if you're using git then it's likely that your .gitignore file is already set up to exclude everything in target/, so this makes sure that your intermediate game builds don't get checked into your git.
- gbafix target/ROM_NAME.gba
  - The gbafix tool also comes from devkitpro. The GBA is very picky about a ROM's format, and gbafix patches the ROM's header and such so that it'll work right. Unlike objcopy, this tool is custom built for GBA development, so it works just perfectly without any arguments beyond the file name. The ROM is patched in place, so we don't even need to specify a new destination.
And you're finally done!
Of course, you probably want to make a script for all that, but it's up to you.
On our own project we have it mostly set up within a Makefile.toml which runs using the cargo-make plugin.
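If you'd rather see the whole sequence in one place, here is a minimal sketch in Rust that only assembles the four command lines from this section. The function name `build_commands` and its parameters are made up for this example (they aren't part of cargo-xbuild, devkitpro, or our repository), and actually running the commands, for instance with `std::process::Command`, is left out.

```rust
/// Hypothetical helper: returns the four build commands from this section,
/// each as a program name followed by its arguments.
///
/// `mode` is "debug" or "release", `bin_name` is the name of the compiled
/// binary, and `rom_name` is whatever you want the final ROM to be called.
/// Remember that the `target/` directory has to exist before the first step.
pub fn build_commands(mode: &str, bin_name: &str, rom_name: &str) -> Vec<Vec<String>> {
    let elf = format!("target/thumbv4-none-agb/{}/{}", mode, bin_name);
    let rom = format!("target/{}.gba", rom_name);

    // Only pass `--release` when a release build was requested.
    let mut xbuild = vec!["cargo", "xbuild", "--target", "thumbv4-none-agb.json"];
    if mode == "release" {
        xbuild.push("--release");
    }

    let steps: Vec<Vec<&str>> = vec![
        vec!["arm-none-eabi-as", "crt0.s", "-o", "target/crt0.o"],
        xbuild,
        vec!["arm-none-eabi-objcopy", "-O", "binary", elf.as_str(), rom.as_str()],
        vec!["gbafix", rom.as_str()],
    ];

    // Convert the borrowed strings into owned ones so they can be returned.
    steps
        .into_iter()
        .map(|step| step.into_iter().map(String::from).collect())
        .collect()
}
```

Keeping the steps as plain data like this makes it easy to print them for a dry run before wiring them up to whatever runner you prefer.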
Hello, Magic
So we know all the steps to build our source, we just need some source.
We're beginners, so we'll start small. With normal programming there's usually a console available, so the minimal program prints "Hello, world" to the terminal. On a GBA we don't have a terminal and standard out and all that, so the minimal program draws a red, green, and blue dot to the screen.
At the lowest level of device programming, it's all Magic Numbers. You write special values to special places and then the hardware does something. A clear API makes every magic number and magic location easy to understand. A clear and good API also prevents you from using the wrong magic number in the wrong place and causing problems for yourself.
This is the minimal example to just test that our build system is all set, so just this once we'll go full magic number crazy town, for fun. Ready? Here goes:
hello_magic.rs:

#![no_std]
#![feature(start)]

#[panic_handler]
fn panic(_info: &core::panic::PanicInfo) -> ! {
    loop {}
}

#[start]
fn main(_argc: isize, _argv: *const *const u8) -> isize {
    unsafe {
        (0x400_0000 as *mut u16).write_volatile(0x0403);
        (0x600_0000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
        (0x600_0000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
        (0x600_0000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);
        loop {}
    }
}
Throw that into your project skeleton, build the program, and give it a run. You should see a red, green, and blue dot close-ish to the middle of the screen. If you don't, something already went wrong. Double check things, phone a friend, write your senators, try asking Lokathor or Ketsuban on the Rust Community Discord, until you're eventually able to get your three dots going.
Of course, I'm sure you want to know why those numbers are the numbers to use. Well that's what the whole rest of the book is about!
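As a small preview, here is a sketch that gives names to the arithmetic hiding in those pixel writes. The helper names (`rgb15`, `pixel_index`, and the constants) are invented for this example and aren't taken from the gba crate. The sketch assumes a bitmap that's 240 pixels wide with 5 bits per color channel and red in the lowest bits, which is what the numbers in hello_magic imply.

```rust
/// Width of the bitmap in pixels (the `240` in the offsets above).
pub const SCREEN_WIDTH: isize = 240;

/// Packs 5-bit red, green, and blue channels into one 16-bit color value.
pub const fn rgb15(red: u16, green: u16, blue: u16) -> u16 {
    (red & 0x1F) | ((green & 0x1F) << 5) | ((blue & 0x1F) << 10)
}

/// Index (counted in `u16` pixels) of the pixel at column `x`, row `y`.
pub const fn pixel_index(x: isize, y: isize) -> isize {
    x + y * SCREEN_WIDTH
}

pub const RED: u16 = rgb15(31, 0, 0); // 0x001F
pub const GREEN: u16 = rgb15(0, 31, 0); // 0x03E0
pub const BLUE: u16 = rgb15(0, 0, 31); // 0x7C00
```

With these, the first dot would be written at `pixel_index(120, 80)` with the value `RED`, which is the same offset and value as in the magic number version.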
Help and Resources
Help
So you're stuck on a problem and the book doesn't say what to do. Where can you find out more?
The first place I would suggest is the Rust Community Discord. If it's a general Rust question then you can ask anyone in any channel you feel is appropriate. If it's GBA specific then you can try asking me (Lokathor) or Ketsuban in the #gamedev channel.
Emulators
You certainly might want to eventually write a game that you can put on a flash cart and play on real hardware, but for most of your development you'll probably want to be using an emulator for testing, because you don't have to fiddle with cables and all that.
In terms of emulators, you want to be using mGBA, and you want to be using the 0.7 Beta 1 or later. This update lets you run raw ELF files, which means that you can have full debug symbols available while you're debugging problems.
Information Resources
Ketsuban and I didn't magically learn this all from nowhere, we read various technical manuals and guides ourselves and then distilled the knowledge (usually oriented towards C and C++) into this book for Rust.
We have personally used some or all of the following:
- GBATEK: This is the resource. It covers not only the GBA, but also the DS and DSi, and also a run down of ARM assembly (32-bit and 16-bit opcodes). The link there is to the 2.9b version on problemkaputt.de (the official home of the document), but if you just google for gbatek the top result is for the 2.5 version on akkit.org, so make sure you're looking at the newest version. Sometimes problemkaputt.de is a little sluggish so I've also mirrored the 2.9b version on my own site. GBATEK is rather large, over 2MB of text, so if you're on a phone or similar you might want to save an offline copy to go easy on your data usage.
- TONC: While GBATEK is basically just a huge tech specification, TONC is an actual guide on how to make sense of the GBA's abilities and organize it into a game. It's written for C of course, but as a Rust programmer you should always be practicing your ability to read C code anyway. It's the programming equivalent of learning Latin because all the old academic books are written in Latin.
- CowBite: This is more like GBATEK, and it's less complete, but it mixes in a little more friendly explanation of things in between the hardware spec parts.
And though I haven't had time to look at it myself, The Audio Advance seems to be very good. It explains in depth how you can get audio working on the GBA. Note that the table of contents for each page goes along the top instead of down the side.
Non-Rust GBA Community
There's also the GBADev.org site, which has a forum and everything. They're coding in C and C++, but you can probably overcome that difference with a little work on your part.
I also found a place called GBATemp, which seems to have a more active forum but less of a focus on actual coding.
Quirks
The GBA supports a lot of totally normal Rust code exactly like you'd think.
However, it also is missing a lot of what you might expect, and sometimes we have to do things in slightly weird ways.
We start the book by covering the quirks our code will have, just to avoid too many surprises later.
No Std
First up, as you already saw in the hello_magic code, we have to use the #![no_std] inner attribute on our program when we target the GBA. You can find some info about no_std in two official sources: the unstable book and the Embedded Book.
The unstable book is borderline useless here because it's describing too many things in too many words. The embedded book is much better, but still fairly terse.
Bare Metal
The GBA falls under what the Embedded Book calls "Bare Metal Environments".
Basically, the machine powers on and immediately begins executing some ASM code.
Our ASM startup was provided by Ketsuban (check the crt0.s file). We'll go over how it works much later on, for now it's enough to know that it does work, and eventually control passes into Rust code.
On the Rust code side of things, we determine our starting point with the #[start] attribute on our main function. The main function also has a specific type signature that's different from the usual main that you'd see in Rust. I'd tell you to read the unstable-book entry on #[start] but they literally just tell you to look at the tracking issue for it instead, and that's not very helpful either. Basically it just has to be declared the way it is, even though there's nothing passing in the arguments and there's no place that the return value will go. The compiler won't accept it any other way.
No Standard Library
The Embedded Book tells us that we can't use the standard library, but we get access to something called "libcore", which sounds kinda funny. What they're talking about is just the core crate, which is called libcore within the Rust repository for historical reasons.
The core crate is actually still a really big portion of Rust. The standard library doesn't actually hold too much code (relatively speaking), instead it just takes code from other crates and then re-exports it in an organized way. So with just core instead of std, what are we missing?
In no particular order:
- Allocation
- Clock
- Network
- File System
The allocation system and all the types that you can use if you have a global allocator are neatly packaged up in the alloc crate. The rest isn't as nicely organized.
It's possible to implement a fair portion of the entire standard library within a GBA context and make the rest just panic if you try to use it. However, do you really need all that? Eh... probably not?
- We don't need a file system, because all of our data is just sitting there in the ROM for us to use. When programming we can organize our const data into modules and such to keep it organized, but once the game is compiled it's just one huge flat address space. TODO: Parasyte says that a FS can be handy even if it's all just ReadOnly, so we'll eventually talk about how you might set up such a thing I guess, since we'll already be talking about replacements for three of the other four things we "lost". Maybe we'll make Parasyte write that section.
- Networking, well, the GBA has a Link Cable you can use to communicate with another GBA, but it's not really like a unix socket with TCP, so the standard Rust networking isn't a very good match.
- Clock is actually two different things at once. One is the ability to store the time long term, which is a bit of hardware that some gamepaks have in them (e.g. Pokemon Ruby/Sapphire/Emerald). The GBA itself can't keep time while power is off. However, the second part is just tracking time moment to moment, which the GBA can totally do. We'll see how to access the timers soon enough.
Which just leaves us with allocation. Do we need an allocator? Depends on your game. For demos and small games you probably don't need one. For bigger games you'll maybe want to get an allocator going eventually. It's in some sense a crutch, but it's a very useful one.
So I promise that at some point we'll cover how to get an allocator going. Either a Rust Global Allocator (if practical), which would allow for a lot of the standard library types to be used "for free" once it was set up, or just a custom allocator that's GBA specific if Rust's global allocator style isn't a good fit for the GBA (I honestly haven't looked into it).
LLVM Intrinsics
TODO: explain that we'll occasionally have to provide some intrinsics.
Bare Metal Panic
TODO: expand this
- Write 0xC0DE to 0x4fff780 (u16) to enable mGBA logging. Write any other value to disable it.
- Read 0x4fff780 (u16) to check mGBA logging status.
  - You get 0x1DEA if debugging is active.
  - Otherwise you get standard open bus nonsense values.
- Write your message into the virtual [u8; 255] array starting at 0x4fff600. mGBA will interpret these bytes as a CString value.
- Write 0x100 PLUS the message level to 0x4fff700 (u16) when you're ready to send a message line:
  - 0: Fatal (halts execution with a popup)
  - 1: Error
  - 2: Warning
  - 3: Info
  - 4: Debug
  - Sending the message also automatically zeroes the output buffer.
- View the output within the "Tools" menu, "View Logs...". Note that the Fatal message, if any, doesn't get logged.
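The steps in that list could be wrapped up in a pair of helpers along these lines. This is only a sketch: the names `mgba_enable` and `mgba_log` are made up here, and the helpers take the start of the debug block as a `base` pointer (on a GBA running in mGBA you'd pass `0x4fff600 as *mut u8`) so that the same code can be tried against an ordinary buffer on your desktop. Capping the message at 254 bytes plus a terminating 0 is an assumption on my part to stay safely inside the 255 byte array.

```rust
/// Offsets of the mGBA debug registers relative to the message buffer at
/// 0x4fff600 (values taken from the list above).
const SEND_OFFSET: usize = 0x100; // 0x4fff700
const ENABLE_OFFSET: usize = 0x180; // 0x4fff780

/// Hypothetical helper: tries to enable logging and reports whether mGBA
/// answered with 0x1DEA.
///
/// Safety: `base` must point at the start of the debug block.
pub unsafe fn mgba_enable(base: *mut u8) -> bool {
    let enable = base.add(ENABLE_OFFSET) as *mut u16;
    enable.write_volatile(0xC0DE);
    enable.read_volatile() == 0x1DEA
}

/// Hypothetical helper: copies `message` into the buffer and sends it.
/// `level` is 0 (Fatal) through 4 (Debug). Returns how many message bytes
/// were actually written.
///
/// Safety: same requirement on `base` as `mgba_enable`.
pub unsafe fn mgba_log(base: *mut u8, level: u16, message: &[u8]) -> usize {
    // Assumption: leave room for a terminating 0 inside the 255 byte buffer.
    let len = message.len().min(254);
    for (i, byte) in message[..len].iter().enumerate() {
        base.add(i).write_volatile(*byte);
    }
    base.add(len).write_volatile(0);
    // 0x100 plus the level tells mGBA to emit the line.
    (base.add(SEND_OFFSET) as *mut u16).write_volatile(0x100 + level);
    len
}
```

Against a plain buffer `mgba_enable` reports `false`, since nothing is there to answer with 0x1DEA, which is also what you'd expect on real hardware or in another emulator.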
Fixed Only
In addition to not having the standard library available, we don't even have a floating point unit available! We can't do floating point math in hardware! We could still do floating point math as software computations if we wanted, but that's a slow, slow thing to do.
Instead let's learn about another way to have fractional values, called "Fixed Point".
Fixed Point
TODO: describe fixed point, make some types, do the impls, all that.
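Until this section is written, here's a minimal sketch of the idea, which is not necessarily the design this book will settle on. The `Fx8` name and the choice of 8 fractional bits are assumptions made just for the example. The value is stored as an integer that's been scaled up by 256, so all of the math is plain integer math.

```rust
/// Hypothetical fixed point number: an `i32` holding the value times 256,
/// so the low 8 bits are the fractional part.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Fx8(pub i32);

impl Fx8 {
    pub const FRACTION_BITS: u32 = 8;

    /// Converts a whole number into fixed point.
    pub const fn from_int(whole: i32) -> Self {
        Fx8(whole << Self::FRACTION_BITS)
    }

    /// Drops the fractional part (rounds toward negative infinity).
    pub const fn to_int(self) -> i32 {
        self.0 >> Self::FRACTION_BITS
    }

    /// Addition works directly on the scaled values.
    pub const fn add(self, other: Self) -> Self {
        Fx8(self.0 + other.0)
    }

    /// Multiplying two scaled values applies the scale twice, so one scale
    /// is shifted back out. Large inputs can overflow the `i32`.
    pub const fn mul(self, other: Self) -> Self {
        Fx8((self.0 * other.0) >> Self::FRACTION_BITS)
    }
}
```

For example, `Fx8(0x180)` is one and a half (0x100 is one, 0x80 is a half), and multiplying it by `Fx8::from_int(2)` gives `Fx8::from_int(3)`.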
Volatile Destination
TODO: replace all this once "the rant" is finalized
There's a reasonable chance that you've never heard of volatile
before, so
what's that? Well, it's a term that can be used in more than one context, but
basically it means "get your grubby mitts off my stuff you over-eager compiler".
Volatile Memory
The first, and most common, form of volatile thing is volatile memory. Volatile memory can change without your program changing it, usually because it's not a location in RAM, but instead some special location that represents an actual hardware device, or part of a hardware device perhaps. The compiler doesn't know what's going on in this situation, but when the program is actually run and the CPU gets an instruction to read or write from that location, instead of just accessing some place in RAM like with normal memory, it accesses whatever bit of hardware and does something. The details of that something depend on the hardware, but what's important is that we need to actually, definitely execute that read or write instruction.
This is not how normal memory works. Normally when the compiler sees us write values into variables and read values from variables, it's free to optimize those expressions and eliminate some of the reads and writes if it can, and generally try to save us time. Maybe it even knows some stuff about the data dependencies in our expressions and so it does some of the reads or writes out of order from what the source says, because the compiler knows that it won't actually make a difference to the operation of the program. A good and helpful friend, that compiler.
Volatile memory works almost the opposite way. With volatile memory we need the compiler to definitely emit an instruction to do a read or write and they need to happen exactly in the order that we say to do it. Each volatile read or write might have any sort of side effect that the compiler doesn't know about, and it shouldn't try to be clever about the optimization. Just do what we say, please.
In Rust, we don't mark volatile things as being a separate type of thing, instead we use normal raw pointers and then call the read_volatile and write_volatile functions (also available as methods, if you like), which then delegate to the LLVM volatile_load and volatile_store intrinsics. In C and C++ you can tag a pointer as being volatile and then any normal read and write with it becomes the volatile version, but in Rust we have to remember to use the correct alternate function instead.
I'm told by the experts that this makes for a cleaner and saner design from a language design perspective, but it really kinda screws us when doing low level code. References, both mutable and shared, aren't volatile, so they compile into normal reads and writes. This means we can't do anything we'd normally do in Rust that utilizes references of any kind. Volatile blocks of memory can't use normal .iter() or .iter_mut() based iteration (which give &T or &mut T), and they also can't use normal Index and IndexMut sugar like a + x[i] or x[i] = 7.
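To make the raw pointer style concrete, here's a tiny sketch showing both the function form and the method form of a volatile access. The helper name `write_then_read` is made up for the example. It takes the address as a parameter, so on the GBA you could hand it a hardware address while on your desktop you can point it at a normal variable.

```rust
use core::ptr;

/// Writes `value` to `address` and then reads it back, using volatile
/// operations for both so that neither access can be optimized away.
///
/// Safety: `address` must be valid for reads and writes of a `u16`.
pub unsafe fn write_then_read(address: *mut u16, value: u16) -> u16 {
    // Function form.
    ptr::write_volatile(address, value);
    // Method form, which performs the same kind of access.
    address.read_volatile()
}
```

Note that there's no reference anywhere in there: the address stays a raw pointer from start to finish.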
Unlike with normal raw pointers, this pain point never goes away. There's no way to abstract over the difference with Rust as it exists now, you'd need to actually adjust the core language by adding an additional pointer type (*vol T) and possibly a reference type to go with it (&vol T) to get the right semantics. And then you'd need an IndexVol trait, and you'd need .iter_vol(), and so on for every other little thing. It would be a lot of work, and the Rust developers just aren't interested in doing all that for such a limited portion of their user population. We'll just have to deal with not having any syntax sugar.
VolatilePtr
No syntax sugar doesn't mean we can't at least make things a little easier for ourselves. Enter the VolatilePtr<T> type, which is a newtype over a *mut T. One of those "manual" newtypes I mentioned where we can't use our nice macro.

#[derive(Debug, Clone, Copy, Hash, PartialEq, Eq, PartialOrd, Ord)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);
Obviously we want to be able to read and write:
impl<T> VolatilePtr<T> {
    /// Performs a `read_volatile`.
    pub unsafe fn read(self) -> T {
        self.0.read_volatile()
    }

    /// Performs a `write_volatile`.
    pub unsafe fn write(self, data: T) {
        self.0.write_volatile(data);
    }
}
And we want a way to jump around when we do have volatile memory that's in blocks. This is where we can get ourselves into some trouble if we're not careful. We have to decide between offset and wrapping_offset. The difference is that offset optimizes better, but also it can be Undefined Behavior if the result is not "in bounds or one byte past the end of the same allocated object". I asked ubsan (who is the expert that you should always listen to on matters like this) what that means exactly when memory mapped hardware is involved (since we never allocated anything), and the answer was that you can use an offset in statically memory mapped situations like this as long as you don't use it to jump to the address of something that Rust itself allocated at some point. Cool, we all like being able to use the one that optimizes better. Unfortunately, the downside to using offset instead of wrapping_offset is that with offset, it's Undefined Behavior simply to calculate the out of bounds result (with wrapping_offset it's not Undefined Behavior until you use the out of bounds result). We'll have to be quite careful when we're using offset.
impl<T> VolatilePtr<T> {
    /// Performs a normal `offset`.
    pub unsafe fn offset(self, count: isize) -> Self {
        VolatilePtr(self.0.offset(count))
    }
}
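Here's a small sketch of that difference on plain raw pointers. Both helper names are made up for the example, and both only calculate an address without ever reading or writing through it. Note which one gets to be a safe function.

```rust
/// Computes the address of the `index`-th `u16` after `base` without ever
/// touching memory. `wrapping_offset` is a safe method, because merely
/// calculating an out of bounds pointer with it is allowed.
pub fn wrapping_address(base: usize, index: isize) -> usize {
    (base as *mut u16).wrapping_offset(index) as usize
}

/// The same calculation using `offset`.
///
/// Safety: the result has to stay inside the block of memory that `base`
/// belongs to (or one past its end). Calculating anything else is already
/// Undefined Behavior, even if the result is never used.
pub unsafe fn offset_address(base: usize, index: isize) -> usize {
    (base as *mut u16).offset(index) as usize
}
```

For instance, `wrapping_address(0x600_0000, 120 + 80 * 240)` gives the address that the first pixel write in hello_magic ends up using, two bytes per `u16` step.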
Now, one thing of note is that doing the offset isn't const. The math for it is something that's possible to do in a const way of course, but Rust basically doesn't allow you to fiddle raw pointers much during const right now. Maybe in the future that will improve.
If we did want to have a const function for finding the correct address within a volatile block of memory we'd have to do all the math using usize values, and then cast that value into being a pointer once we were done. It'd look something like this:
// `core` rather than `std`, since we're in a `no_std` program.
const fn address_index<T>(address: usize, index: usize) -> usize {
    address + (index * core::mem::size_of::<T>())
}
But, back to methods for VolatilePtr, well we sometimes want to be able to cast a VolatilePtr between pointer types. Since we won't be able to do that with as, we'll have to write a method for it:

impl<T> VolatilePtr<T> {
    /// Performs a cast into some new pointer type.
    pub fn cast<Z>(self) -> VolatilePtr<Z> {
        VolatilePtr(self.0 as *mut Z)
    }
}
Volatile Iterating
How about that Iterator stuff I said we'd be missing? We can actually make an Iterator available, it's just not the normal "iterate by shared reference or unique reference" Iterator. Instead, it's more like a "throw out a series of VolatilePtr values" style Iterator. Other than that small difference it's totally normal, and we'll be able to use map and skip and take and all those neat methods.
So how do we make this thing we need? First we check out the Implementing Iterator section in the core documentation. It says we need a struct for holding the iterator state. Right-o, probably something like this:
#[derive(Debug, Clone, Hash, PartialEq, Eq)]
pub struct VolatilePtrIter<T> {
    vol_ptr: VolatilePtr<T>,
    slots: usize,
}
And then we just implement core::iter::Iterator on that struct. Wow, that's quite the trait though! Don't worry, we only need to implement two small things and then the rest of it comes free as a bunch of default methods.
So, the code that we want to write looks like this:
impl<T> Iterator for VolatilePtrIter<T> {
    type Item = VolatilePtr<T>;

    fn next(&mut self) -> Option<VolatilePtr<T>> {
        if self.slots > 0 {
            let out = Some(self.vol_ptr);
            self.slots -= 1;
            self.vol_ptr = unsafe { self.vol_ptr.offset(1) };
            out
        } else {
            None
        }
    }
}
Except we can't write that code. What? The problem is that we used derive(Clone, Copy) on VolatilePtr. Because of a quirk in how derive works, this means VolatilePtr<T> will only be Copy if the T is Copy, even though the pointer itself is always Copy regardless of what it points to.
Ugh, terrible. We've got three basic ways to handle this:
- Make the Iterator implementation be for <T: Clone>, and then hope that we always have types that are Clone.
- Hand implement every trait we want VolatilePtr (and VolatilePtrIter) to have so that we can override the fact that derive is basically broken in this case.
- Make VolatilePtr store a usize value instead of a pointer, and then cast it to *mut T when we actually need to read and write. This would require us to also store a PhantomData<T> so that the type of the address is tracked properly, which would make it a lot more verbose to construct a VolatilePtr value.
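To give a feel for what the second option costs, here's a minimal sketch of hand written Clone and Copy for the newtype. It covers only those two traits (the other derives would need the same treatment), and the `NotClone` type and `copy_twice` function are made up purely to demonstrate the result.

```rust
/// Same newtype as before, but without deriving `Clone` and `Copy`.
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);

// Written by hand so that there is no hidden `T: Clone` or `T: Copy` bound:
// copying the pointer never copies the `T` that it points to.
impl<T> Clone for VolatilePtr<T> {
    fn clone(&self) -> Self {
        *self
    }
}
impl<T> Copy for VolatilePtr<T> {}

/// A type that is deliberately neither `Clone` nor `Copy`.
pub struct NotClone(pub u16);

/// Only compiles because `VolatilePtr<NotClone>` is `Copy`: `ptr` is used
/// twice without being moved.
pub fn copy_twice(
    ptr: VolatilePtr<NotClone>,
) -> (VolatilePtr<NotClone>, VolatilePtr<NotClone>) {
    (ptr, ptr)
}
```

That's not much code for two traits, but multiply it by every trait in the derive list, for both types, and you can see why it isn't a tempting route.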
None of those options are particularly appealing. I guess we'll do the first one because it's the least amount of up front trouble, and I don't think we'll need to be iterating non-Clone values. All we do to pick that option is add the bound to the very start of the impl block, where we introduce the T:

impl<T: Clone> Iterator for VolatilePtrIter<T> {
    type Item = VolatilePtr<T>;

    fn next(&mut self) -> Option<VolatilePtr<T>> {
        if self.slots > 0 {
            let out = Some(self.vol_ptr.clone());
            self.slots -= 1;
            self.vol_ptr = unsafe { self.vol_ptr.clone().offset(1) };
            out
        } else {
            None
        }
    }
}
What's going on here? Okay so our iterator has a number of slots that it'll go over, and then when it's out of slots it starts producing None forever. That's actually pretty simple. We're also masking some unsafety too. In this case, we'll rely on the person who made the VolatilePtrIter to have selected the correct number of slots. This gives us a new method for VolatilePtr:

impl<T> VolatilePtr<T> {
    pub unsafe fn iter_slots(self, slots: usize) -> VolatilePtrIter<T> {
        VolatilePtrIter {
            vol_ptr: self,
            slots,
        }
    }
}
With this design, making the VolatilePtrIter at the start is unsafe (we have to trust the caller that the right number of slots exists), and then using it after that is totally safe (if the right number of slots was given we'll never screw up our end of it).
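Here's a small usage sketch of that split between the unsafe setup and the safe iteration. It repeats trimmed-down copies of the types so that it stands alone, and it uses an ordinary array as a stand-in for real hardware memory. The fill_buffer helper is a hypothetical name for this illustration only.

```rust
#[derive(Debug, Clone, Copy)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);

impl<T> VolatilePtr<T> {
    pub unsafe fn write(self, data: T) {
        self.0.write_volatile(data);
    }
    pub unsafe fn offset(self, count: isize) -> Self {
        VolatilePtr(self.0.offset(count))
    }
    pub unsafe fn iter_slots(self, slots: usize) -> VolatilePtrIter<T> {
        VolatilePtrIter { vol_ptr: self, slots }
    }
}

pub struct VolatilePtrIter<T> {
    vol_ptr: VolatilePtr<T>,
    slots: usize,
}

impl<T: Clone> Iterator for VolatilePtrIter<T> {
    type Item = VolatilePtr<T>;
    fn next(&mut self) -> Option<VolatilePtr<T>> {
        if self.slots > 0 {
            let out = Some(self.vol_ptr.clone());
            self.slots -= 1;
            self.vol_ptr = unsafe { self.vol_ptr.clone().offset(1) };
            out
        } else {
            None
        }
    }
}

/// Hypothetical helper: writes `index * 10` into each of four slots.
pub fn fill_buffer() -> [u16; 4] {
    let mut buffer = [0u16; 4];
    let base = VolatilePtr(buffer.as_mut_ptr());
    // The one unsafe promise: exactly 4 slots really exist behind `base`.
    let slots = unsafe { base.iter_slots(4) };
    // Stepping through the iterator itself needs no unsafe at all.
    for (index, slot) in slots.enumerate() {
        // Writing through each pointer is still an unsafe operation.
        unsafe { slot.write(index as u16 * 10) };
    }
    buffer
}
```

Note that only the slot count is covered by the promise made at construction. Each individual read or write through a yielded VolatilePtr is still its own unsafe call.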
VolatilePtr Formatting
Also, just as a little bonus that we probably won't use, we could enable our new pointer type to be formatted as a pointer value.
```rust
impl<T> core::fmt::Pointer for VolatilePtr<T> {
    /// Formats exactly like the inner `*mut T`.
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        write!(f, "{:p}", self.0)
    }
}
```
Neat!
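If you want to see it do something, here's a quick sketch. It assumes a hosted target where format! is available (on the GBA itself you'd need some other fmt::Write sink), and formats_like_raw_pointer is a hypothetical helper name.

```rust
use core::fmt;

#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);

impl<T> fmt::Pointer for VolatilePtr<T> {
    /// Formats exactly like the inner `*mut T`.
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "{:p}", self.0)
    }
}

/// Hypothetical check: the wrapper prints the same text as the raw pointer.
pub fn formats_like_raw_pointer() -> bool {
    let mut value = 0u16;
    let raw = &mut value as *mut u16;
    format!("{:p}", VolatilePtr(raw)) == format!("{:p}", raw)
}
```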
VolatilePtr Complete
That was a lot of small code blocks, let's look at it all put together:
```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord, Hash)]
#[repr(transparent)]
pub struct VolatilePtr<T>(pub *mut T);

impl<T> VolatilePtr<T> {
    pub unsafe fn read(self) -> T {
        self.0.read_volatile()
    }
    pub unsafe fn write(self, data: T) {
        self.0.write_volatile(data);
    }
    pub unsafe fn offset(self, count: isize) -> Self {
        VolatilePtr(self.0.offset(count))
    }
    pub fn cast<Z>(self) -> VolatilePtr<Z> {
        VolatilePtr(self.0 as *mut Z)
    }
    pub unsafe fn iter_slots(self, slots: usize) -> VolatilePtrIter<T> {
        VolatilePtrIter {
            vol_ptr: self,
            slots,
        }
    }
}

impl<T> core::fmt::Pointer for VolatilePtr<T> {
    fn fmt(&self, f: &mut core::fmt::Formatter) -> core::fmt::Result {
        write!(f, "{:p}", self.0)
    }
}

#[derive(Debug, Clone, Hash, PartialEq, Eq)]
pub struct VolatilePtrIter<T> {
    vol_ptr: VolatilePtr<T>,
    slots: usize,
}

impl<T: Clone> Iterator for VolatilePtrIter<T> {
    type Item = VolatilePtr<T>;

    fn next(&mut self) -> Option<VolatilePtr<T>> {
        if self.slots > 0 {
            let out = Some(self.vol_ptr.clone());
            self.slots -= 1;
            self.vol_ptr = unsafe { self.vol_ptr.clone().offset(1) };
            out
        } else {
            None
        }
    }
}
```
Volatile ASM
In addition to some memory locations being volatile, it's also possible for inline assembly to be declared volatile. This is basically the same idea, "hey just do what I'm telling you, don't get smart about it".
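Normally when you have some asm! it's basically treated like a function: there are inputs and outputs, and the compiler will try to optimize it so that if you don't actually use the outputs it won't bother with doing those instructions. However, asm! is basically a complete black box, so the compiler doesn't know what's happening inside at all, and it can't see if there are any important side effects going on. (That's how the older, LLVM-style inline assembly behaves. The asm! macro that was later stabilized in core::arch flips the default: a block is assumed to have side effects unless you opt out with options(pure).)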
An example of an important side effect that doesn't have output values would be putting the CPU into a low power state while we wait for the next VBlank. This lets us save quite a bit of battery power. It requires some setup to be done safely (otherwise the GBA won't ever actually wake back up from the low power state), but the asm! you use once you're ready is just a single instruction with no return value. The compiler can't tell what's going on, so you just have to say "do it anyway".
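As a rough sketch of the "no outputs, side effect only" shape, here's a tiny example. It assumes the stabilized core::arch::asm! macro rather than the older syntax with an explicit volatile marker, and it uses a plain nop as a stand-in so that it builds on common host CPUs. It is not the real low power instruction, and side_effect_only is a hypothetical name.

```rust
use core::arch::asm;

/// Hypothetical helper: runs one instruction purely for its side effect.
pub fn side_effect_only() {
    // No inputs and no outputs. Because we do NOT pass `options(pure)`,
    // the compiler has to assume the block has side effects and keep it,
    // even though nothing ever uses a result from it.
    //
    // Safety: `nop` does nothing, so it cannot break any invariants.
    unsafe {
        asm!("nop");
    }
}
```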
Newtype
There's a great Zero Cost abstraction that we'll be using a lot that you might not already be familiar with: we're talking about the "Newtype Pattern"!
Now, I told you to read the Rust Book before you read this book, and I'm sure you're all good students who wouldn't sneak into this book without doing the required reading, so I'm sure you all remember exactly what I'm talking about, because they touch on the newtype concept in the book twice, in two very long named sections:
- Using the Newtype Pattern to Implement External Traits on External Types
- Using the Newtype Pattern for Type Safety and Abstraction
...Yeah... The Rust Book doesn't know how to make a short sub-section name to save its life. Shame.
Newtype Basics
So, we have all these pieces of data, and we want to keep them separated, and we don't wanna pay the cost for it at runtime. Well, we're in luck, we can pay the cost at compile time.
```rust
pub struct PixelColor(u16);
```
Ah, except that, as I'm sure you remember from The Rustonomicon (and from the RFC too, of course), a single field struct is sometimes treated differently from just the bare value, so we should be using #[repr(transparent)] with our newtypes.
```rust
#[repr(transparent)]
pub struct PixelColor(u16);
```
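Here's a quick sketch of what we've bought so far: two wrappers that the compiler keeps apart, each with the same size and alignment as the bare u16. TileIndex, layout_matches_u16, and takes_color are hypothetical names added just for this illustration.

```rust
use core::mem::{align_of, size_of};

#[repr(transparent)]
pub struct PixelColor(u16);

/// Hypothetical second newtype over the same base type.
#[repr(transparent)]
pub struct TileIndex(u16);

/// Only accepts a `PixelColor`; passing a `TileIndex` or a bare `u16`
/// here is a compile error, which is the whole point.
pub fn takes_color(color: PixelColor) -> u16 {
    color.0
}

/// Checks that the wrappers cost nothing in size or alignment.
pub fn layout_matches_u16() -> bool {
    size_of::<PixelColor>() == size_of::<u16>()
        && align_of::<PixelColor>() == align_of::<u16>()
        && size_of::<TileIndex>() == size_of::<u16>()
}
```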
Ah, and of course we'll need to make it so you can unwrap the value:
```rust
#[repr(transparent)]
pub struct PixelColor(u16);

impl From<PixelColor> for u16 {
    fn from(color: PixelColor) -> u16 {
        color.0
    }
}
```
And then we'll need to do that same thing for every other newtype we want.
Except there's only two tiny parts that actually differ between newtype declarations: the new name and the base type. All the rest is just the same rote code over and over. Generating piles and piles of boilerplate code? Sounds like a job for a macro to me!
Making It A Macro
The most basic version of the macro we want goes like this:
```rust
#[macro_export]
macro_rules! newtype {
    ($new_name:ident, $old_name:ident) => {
        #[repr(transparent)]
        pub struct $new_name($old_name);
    };
}
```
Except we also want to be able to add attributes (which includes doc comments), so we upgrade our macro a bit:
```rust
#[macro_export]
macro_rules! newtype {
    ($(#[$attr:meta])* $new_name:ident, $old_name:ident) => {
        $(#[$attr])*
        #[repr(transparent)]
        pub struct $new_name($old_name);
    };
}
```
And we want to automatically add the ability to turn the wrapper type back into the wrapped type.
```rust
#[macro_export]
macro_rules! newtype {
    ($(#[$attr:meta])* $new_name:ident, $old_name:ident) => {
        $(#[$attr])*
        #[repr(transparent)]
        pub struct $new_name($old_name);

        impl From<$new_name> for $old_name {
            fn from(x: $new_name) -> $old_name {
                x.0
            }
        }
    };
}
```
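Here's a sketch of what calling that final version might look like, with a doc comment and a derive passed through as attributes. The macro is repeated (minus #[macro_export]) so the example stands alone. The doc text, the derive list, and the unwrap_demo helper are assumptions made for this illustration.

```rust
macro_rules! newtype {
    ($(#[$attr:meta])* $new_name:ident, $old_name:ident) => {
        $(#[$attr])*
        #[repr(transparent)]
        pub struct $new_name($old_name);

        impl From<$new_name> for $old_name {
            fn from(x: $new_name) -> $old_name {
                x.0
            }
        }
    };
}

newtype!(
    /// A color value (hypothetical doc comment, becomes a `#[doc]` attribute).
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    PixelColor,
    u16
);

/// Hypothetical helper: wraps a value, then unwraps it via `From`.
pub fn unwrap_demo() -> u16 {
    let color = PixelColor(0x7FFF);
    u16::from(color)
}
```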
That seems like enough for all of our examples, so we'll stop there. We could add more things:
- Making the From impl optional. We'd have to make the newtype invocation more complicated somehow: the user puts ", no-unwrap" after the inner type declaration, or something like that.
- Allowing for more precise visibility controls on the wrapping type and on the inner field. This would add a lot of line noise, so we'll just always have our newtypes be pub.
- Allowing for generic newtypes, which might sound silly but which we'll actually see an example of soon enough. To do this you might think that we can change the :ident declarations to :ty, but since we're declaring a fresh type, not using an existing type, we have to accept it as an :ident. The way you get around this is with a proc-macro, which is a lot more powerful but which also requires that you write the proc-macro in an entirely separate crate that gets compiled first. We don't need that much power, so for our examples we'll go with the macro_rules version and just do it by hand in the few cases where we need a generic newtype.
- Allowing for Deref and DerefMut, which usually defeats the point of doing the newtype, but maybe sometimes it's the right thing. If you were going for the full industrial strength version with a proc-macro and all, you might want to make that one of your optional add-ons as well, the same way you might want optional From. You'd probably want From to be "on by default" and Deref/DerefMut to be "off by default", but whatever.
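For the curious, the first idea in that list could be sketched with a second macro arm. This is only one possible shape, not something the rest of the book relies on. The literal ", no-unwrap" marker and the Sealed and Open types are hypothetical.

```rust
macro_rules! newtype {
    // Opt-out arm: a trailing `, no-unwrap` skips the `From` impl.
    ($(#[$attr:meta])* $new_name:ident, $old_name:ident, no-unwrap) => {
        $(#[$attr])*
        #[repr(transparent)]
        pub struct $new_name($old_name);
    };
    // Default arm: same struct, plus the unwrapping `From` impl.
    ($(#[$attr:meta])* $new_name:ident, $old_name:ident) => {
        $(#[$attr])*
        #[repr(transparent)]
        pub struct $new_name($old_name);

        impl From<$new_name> for $old_name {
            fn from(x: $new_name) -> $old_name {
                x.0
            }
        }
    };
}

newtype!(
    /// Hypothetical newtype with no `From` impl generated.
    Sealed, u32, no-unwrap
);
newtype!(
    /// Hypothetical newtype with the usual `From` impl.
    Open, u32
);

/// Unwraps the `Open` newtype through its generated `From` impl.
pub fn unwrap_open() -> u32 {
    u32::from(Open(5))
}

/// `Sealed` still has the layout of its inner `u32`.
pub fn sealed_size() -> usize {
    core::mem::size_of::<Sealed>()
}
```

Writing u32::from(Sealed(5)) would not compile here, since the opt-out arm never generates that impl.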
As a reminder: macro_rules macros have to appear before they're invoked in your source, so the newtype macro will always have to be at the very top of your file, or if you put it in a module within your project you'll need to declare the module before anything that uses it.