Merge pull request #64 from rust-console/lokathor

Lokathor
This commit is contained in:
Lokathor 2019-02-11 13:55:17 -07:00 committed by GitHub
commit 5113a5a1f2
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
9 changed files with 468 additions and 22 deletions

View file

@ -11,7 +11,6 @@ rust:
before_script:
- rustup component add rust-src
- rustup component add clippy --toolchain=nightly
- (test -x $HOME/.cargo/bin/cargo-install-update || cargo install cargo-update)
- (test -x $HOME/.cargo/bin/cargo-xbuild || cargo install cargo-xbuild)
- (test -x $HOME/.cargo/bin/cargo-make || cargo install cargo-make)
@ -31,15 +30,13 @@ script:
- export PATH="$PATH:/opt/devkitpro/tools/bin"
- cd ..
# Run all verificaions, both debug and release
#- cargo clippy
#- cargo clippy --release
- cargo test --no-fail-fast --lib
- cargo test --no-fail-fast --lib --release
- cargo test --no-fail-fast --tests
- cargo test --no-fail-fast --tests --release
- cargo test --lib
- cargo test --lib --release
- cargo test --tests
- cargo test --tests --release
# Let cargo make take over the rest
- cargo make justrelease
# Test build the book so that a failed book build kills this run
# Test a build of the book so that a failed book build kills this run
- cd book && mdbook build
deploy:

View file

@ -2,4 +2,8 @@
# Rust GBA Guide
* [Development Setup](development-setup.md)
* [Volatile](volatile.md)
* [The Hardware Memory Map](the-hardware-memory-map.md)
* [IO Registers](io-registers.md)
* [Bitmap Video](bitmap-video.md)
* [GBA Assembly](gba-asm.md)

1
book/src/bitmap-video.md Normal file
View file

@ -0,0 +1 @@
# Bitmap Video

View file

@ -171,12 +171,14 @@ fn main(_argc: isize, _argv: *const *const u8) -> isize {
}
```
Throw that into your project skeleton, build the program, and give it a run. You
should see a red, green, and blue dot close-ish to the middle of the screen. If
you don't, something _already_ went wrong. Double check things, phone a friend,
write your senators, try asking `Lokathor` or `Ketsuban` on the [Rust Community
Throw that into your project skeleton, build the program, and give it a run in
an emulator. I suggest [mgba](https://mgba.io/2019/01/26/mgba-0.7.0/), it has
some developer tools we'll use later on. You should see a red, green, and blue
dot close-ish to the middle of the screen. If you don't, something _already_
went wrong. Double check things, phone a friend, write your senators, try asking
`Lokathor` or `Ketsuban` on the [Rust Community
Discord](https://discordapp.com/invite/aVESxV8), until you're eventually able to
get your three dots going.
Of course, I'm sure you want to know why those numbers are the numbers to use.
Well that's what the whole rest of the book is about!
Of course, I'm sure you want to know why those particular numbers are the
numbers to use. Well that's what the whole rest of the book is about!

View file

@ -17,21 +17,32 @@ sometimes. Accordingly, you should know how assembly works on the GBA.
Version](http://infocenter.arm.com/help/topic/com.arm.doc.ddi0210c/DDI0210B.pdf)
of the documentation available, if you'd like that.
* In addition to the `ARM7TDMI` book, which is specific to the GBA's CPU, you'll
need to find a copy of the ARM Architecture Reference Manual if you want
general ARM knowledge. The ARM Infocenter has the
[ARMv5](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0100i/index.html)
version of said manual hosted on their site. Unfortunately, they don't seem to
host the `ARMv4T` version of the manual any more.
* The [GBATek: ARM CPU
Overview](https://problemkaputt.de/gbatek.htm#armcpuoverview) also has quite a
bit of info. Some of it is a duplication of what you'd find in the ARM
Infocenter reference manual. Some of it is specific to the GBA's chip. Some of
it is specific to the ARM chips within the DS and DSi. It's a bit of a jumbled
mess, and as with the rest of GBATEK, the explanations are in a "sparse" style
(to put it nicely), so I wouldn't take it as your only source.
Infocenter reference manuals. Some of it is information that's specific to the
GBA's layout and how the CPU interacts with other parts (such as how its
timings and the display adapter's timings line up). Some of it is specific to
the ARM chips _within the DS and DSi_, so be careful to make sure that you
don't wander into the wrong section. GBATEK is always a bit of a jumbled mess,
and the explanations are often "sparse" (to put it nicely), so I'd advise that
you also look at the official ARM manuals.
* The [Compiler Explorer](https://rust.godbolt.org/z/ndCnk3) can be used to
quickly look at assembly versions of your Rust code. That link there will load
up an essentially blank `no_std` file with `opt-level=3` set and targeting
`thumbv6m-none-eabi`. That's _not_ the same target as the GBA (it's two ISA
revisions later, ARMv6 instead of ARMv4), but it's the closest CPU target that
is bundled with rustc, so it's the closest you can get with the compiler
explorer website. If you're very dedicated I suppose you could setup a [local
revisions later, `ARMv6` instead of `ARMv4`), but it's the closest CPU target
that is bundled with `rustc`, so it's the closest you can get with the
compiler explorer website. If you're very dedicated I suppose you could setup
a [local
instance](https://github.com/mattgodbolt/compiler-explorer#running-a-local-instance)
of compiler explorer and then add the extra target definition and so on, but
that's _probably_ overkill.

1
book/src/io-registers.md Normal file
View file

@ -0,0 +1 @@
# IO Registers

View file

@ -0,0 +1,379 @@
# The Hardware Memory Map
So we saw `hello_magic.rs` and then we learned what `volatile` was all about,
but we've still got a few things that are a bit mysterious. You can't just cast
a number into a pointer and start writing to it! That's totally crazy! That's
writing to un-allocated memory! Against the rules!
Well, _kinda_. It's true that you're not allowed to write _anywhere at all_, but
those locations were carefully selected locations.
You see, on a modern computer if you need to check if a key is pressed you ask
the Operating System (OS) to please go check for you. If you need to play a
sound, you ask the OS to please play the sound on a default sound output. If you
need to show a picture you ask the OS to give you access to the video driver so
that you can ask the video driver to please put some pixels on the screen.
That's mostly fine, except how does the OS actually do it? It doesn't have an OS
to go ask, it has to stop somewhere.
Ultimately, every piece of hardware is mapped into somewhere in the address
space of the CPU. You can't actually tell that this is the case as a normal user
because your program runs inside a virtualized address space. That way you can't
go writing into another program's memory and crash what they're doing or steal
their data (well, hopefully, it's obviously not perfect). Outside of the
virtualization layer the OS is running directly in the "true" address space, and
it can access the hardware on behalf of a program whenever it's asked to.
How does directly accessing the hardware work, _precisely_? It's just the same
as accessing the RAM. Each address holds some bits, and the CPU picks an address
and loads in the bits. Then the program gets the bits and has to decide what
they mean. The "driver" of a hardware device is just the layer that translates
between raw bits in the outside world and more meaningful values inside of the
program.
Of course, memory mapped hardware can change its bits at any time. The user can
press and release a key and you can't stop them. This is where `volatile` comes
in. Whenever there's memory mapped hardware you want to access it with
`volatile` operations so that you can be sure that you're sending the data every
time, and that you're getting fresh data every time.
## GBA Specifics
That's enough about the general concept of memory mapped hardware, let's get to
some GBA specifics. The GBA has the following sections in its memory map.
* BIOS
* External Work RAM (EWRAM)
* Internal Work RAM (IWRAM)
* IO Registers
* Palette RAM (PALRAM)
* Video RAM (VRAM)
* Object Attribute Memory (OAM)
* Game Pak ROM (ROM)
* Save RAM (SRAM)
Each of these has a few key points of interest:
* **Bus Width:** Also just called "bus", this is how many little wires are
_physically_ connecting a part of the address space to the CPU. If you need to
transfer more data than fits in the bus you have to do repeated transfers
until it all gets through.
* **Read/Write Modes:** Most parts of the address space can be read from in 8,
16, or 32 bits at a time (there's a few exceptions we'll see). However, a
significant portion of the address space can't accept 8 bit writes. Usually
this isn't a big deal, but standard `memcopy` routine switches to doing a
byte-by-byte copy in some situations, so we'll have to be careful about using
it in combination with those regions of the memory.
* **Access Speed:** On top of the bus width issue, not all memory can be
accessed at the same speed. The "fast" parts of memory can do a read or write
in 1 cycle, but the slower parts of memory can take a few cycles per access.
These are called "wait cycles". The exact timings depend on what you configure
the system to use, which is also limited by what your cartridge physically
supports. You'll often see timings broken down into `N` cycles (non-sequential
memory access) and `S` cycles (sequential memory access, often faster). There
are also `I` cycles (internal cycles) which happen whenever the CPU does an
internal operation that's more than one cycle to complete (like a multiply).
Don't worry, you don't have to count exact cycle timings unless you're on the
razor's edge of the GBA's abilities. For more normal games you just have to be
mindful of what you're doing and it'll be fine.
Let's briefly go over the major talking points of each memory region. All of
this information is also available in GBATEK, mostly in their [memory
map](http://www.akkit.org/info/gbatek.htm#gbamemorymap) section (though somewhat
spread through the rest of the document too).
Though I'm going to list the location range of each memory space below, most of
the hardware locations are actually mirrored at several points throughout the
address space.
### BIOS
* **Location:** `0x0` to `0x3FFF`
* **Bus:** 32-bit
* **Access:** Memory protected read-only (see text).
* **Wait Cycles:** None
The "basic input output system". This contains a grab bag of utilities that do
various tasks. The code is optimized for small size rather than great speed, so
you can sometimes write faster versions of these routines. Also, calling a bios
function has more overhead than a normal function call. You can think of bios
calls as being similar to system calls to the OS on a desktop computer. Useful,
but costly.
As a side note, not only is BIOS memory read only, but it's memory protected so
that you can't even read from bios memory unless the system is currently
executing a function that's in bios memory. If you try then the system just
gives back a nonsensical value that's not really what you asked for. If you
really want to know what's inside, there's actually a bug in one bios call
(`MidiKey2Freq`) that lets you read the bios section one byte at a time.
Also, there's not just one bios! Of course there's the official bios from
Nintendo that's used on actual hardware, but since that's code instead of
hardware it's protected by copyright. Since a bios is needed to run a GBA
emulator properly, people have come up with their own open source versions or
they simply make the emulator special case the bios and act _as if_ the function
call had done the right thing.
* The [TempGBA](https://github.com/Nebuleon/TempGBA) repository has an easy to
look at version written in assembly. It's API and effects are close enough to
the Nintendo version that most games will run just fine.
* You can also check out the [mGBA
bios](https://github.com/mgba-emu/mgba/blob/master/src/gba/bios.c) if you want
to see the C version of what various bios functions are doing.
### External Work RAM (EWRAM)
* **Location:** `0x200_0000` to `0x203_FFFF` (256k)
* **Bus:** 16-bit
* **Access:** Read-write, any size.
* **Wait Cycles:** 2
The external work ram is a sizable amount of space, but the 2 wait cycles per
access and 16-bit bus mean that you should probably think of it as being a
"heap" to avoid putting things in if you don't have to.
The GBA itself doesn't use this for anything, so any use is totally up to you.
At the moment, the linker script and `crt0.s` files provided with the `gba`
crate also have no defined use for the EWRAM, so it's 100% on you to decide how
you wanna use them.
(Note: There is an undocumented control register that lets you adjust the wait
cycles on EWRAM. Using it, you can turn EWRAM from the default 2 wait cycles
down to 1. However, not all GBA-like things support it. The GBA and GBA SP do,
the GBA Micro and DS do not. Emulators might or might not depending on the
particular emulator. See the [GBATEK system
control](https://problemkaputt.de/gbatek.htm#gbasystemcontrol) page for a full
description of that register, though probably only once you've read more of this
tutorial book and know how to make sense of IO registers and such.)
### Internal Work RAM (IWRAM)
* **Location:** `0x300_0000` to `0x300_7FFF` (32k)
* **Bus:** 32-bit
* **Access:** Read-write, any size.
* **Wait Cycles:** 0
This is where the "fast" memory for general purposes lives. By default the
system uses the 256 _bytes_ starting at `0x300_7F00` _and up_ for system and
interrupt purposes, while Rust's program stack starts at that same address _and
goes down_ from there.
Even though your stack exists in this space, it's totally reasonable to use the
bottom parts of this memory space for whatever quick scratch purposes, same as
EWRAM. 32k is fairly huge, and the stack going down from the top and the scratch
data going up from the bottom are unlikely to hit each other. If they do you
were probably well on your way to a stack overflow anyway.
The linker script and `crt0.s` file provided with the `gba` crate use the bottom
of IWRAM to store the `.data` and `.bss` [data
segments](https://en.wikipedia.org/wiki/Data_segment). That's where your global
variables get placed (both `static` and `static mut`). The `.data` segment holds
any variable that's initialized to non-zero, and the `.bss` section is for any
variable initialized to zero. When the GBA is powered on, some code in the
`crt0.s` file runs and copies the initial `.data` values into place within IWRAM
(all of `.bss` starts at 0, so there's no copy for those variables).
If you have no global variables at all, then you don't need to worry about those
details, but if you do have some global variables then you can use the _address
of_ the `__bss_end` symbol defined in the top of the `gba` crate as a marker for
where it's safe for you to start using IWRAM without overwriting your globals.
### IO Registers
* **Location:** `0x400_0000` to `0x400_03FE`
* **Bus:** 32-bit
* **Access:** different for each IO register
* **Wait Cycles:** 0
The IO Registers are where most of the magic happens, and it's where most of the
variety happens too. Each IO register is a specific width, usually 16-bit but
sometimes 32-bit. Most of them are fully read/write, but some of them are read
only or write only. Some of them have individual bits that are read only even
when the rest of the register is writable. Some of them can be written to, but
the write doesn't change the value you read back, it sets something else.
Really.
The IO registers are how you control every bit of hardware besides the CPU
itself. Reading the buttons, setting display modes, enabling timers, all of that
goes through different IO registers. Actually, even a few parts of the CPU's
operation can be controlled via IO register.
We'll go over IO registers more in the next section, including a few specific
registers, and then we'll constantly encounter more IO registers as we explore
each new topic through the rest of the book.
### Palette RAM (PALRAM)
* **Location:** `0x500_0000` to `0x500_03FF` (1k)
* **Bus:** 16-bit
* **Access:** Read any, single bytes mirrored (see text).
* **Wait Cycles:** Video Memory Wait (see text)
This is where the GBA stores color palette data. There's 256 slots for
Background color, and then 256 slots for Object color.
GBA colors are 15 bits each, with five bits per channel and the highest bit
being totally ignored, so we store them as `u16` values:
* `X_BBBBB_GGGGG_RRRRR`
Of note is the fact that the 256 palette slots can be viewed in two different
ways. There's two different formats for images in video memory: "8 bit per
pixel" (8bpp) and "4 bit per pixel mode" (4bpp).
* **8bpp:** Each pixel in the image is 8 bits and indexes directly into the full
256 entry palette array. An index of 0 means that pixel should be transparent,
so there's 255 possible colors.
* **4bpp:** Each pixel in the image is 4 bits and indexes into a "palbank" of 16
colors within the palette data. Some exterior control selects the palbank to
be used. An index of 0 still means that the pixel should be transparent, so
there's 15 possible colors.
Different images can use different modes all at once, as long as you can fit all
the colors you want to use into your palette layout.
PALRAM can't be written to in individual bytes. This isn't normally a problem at
all, because you wouldn't really want to write half of a color entry anyway. If
you do try to write a single byte then it gets "mirrored" into both halves of
the `u16` that would be associated with that address. For example, if you tried
to write `0x01u8` to either `0x500_0000` or `0x500_0001` then you'd actually
_effectively_ be writing `0x0101u16` to `0x500_0000`.
PALRAM follows what we'll call the "Video Memory Wait" rule: If you to access
the memory during a vertical blank or horizontal blank period there's 0 wait
cycles, and if you try to access the memory while the display controller is
drawing there is a 1 cycle wait inserted _if_ the display controller was using
that memory at that moment.
### Video RAM (VRAM)
* **Location:** `0x600_0000` to `0x601_7FFF` (96k or 64k+32k depending on mode)
* **Bus:** 16-bit
* **Access:** Read any, single bytes _sometimes_ mirrored (see text).
* **Wait Cycles:** Video Memory Wait (see text)
Video RAM is the memory for what you want the display controller to be
displaying. The GBA actually has 6 different display modes (numbered 0 through
5), and depending on the mode you're using the layout that you should imagine
VRAM having changes. Because there's so much involved here, I'll leave more
precise details to the following sections which talk about how to use VRAM in
each mode.
VRAM can't be written to in individual bytes. If you try to write a single byte
to background VRAM the byte gets mirrored like with PALRAM, and if you try with
object VRAM the write gets ignored entirely. Exactly what address ranges those
memory types are depends on video mode, but just don't bother with individual
byte writes to VRAM. If you want to change a single byte of data (and you might)
then the correct style is to read the full `u16`, mask out the old data, mask in
your new value, and then write the whole `u16`.
VRAM follows the same "Video Memory Wait" rule that PALRAM has.
### Object Attribute Memory (OAM)
* **Location:** `0x700_0000` to `0x700_03FF` (1k)
* **Bus:** 32-bit
* **Access:** Read any, single bytes no effect (see text).
* **Wait Cycles:** Video Memory Wait (see text)
This part of memory controls the "Objects" (OBJ) on the screen. An object is
_similar to_ the concept of a "sprite". However, because of an object's size
limitations, a single sprite might require more than one object to be drawn
properly. In general, if you want to think in terms of sprites at all, you
should think of sprites as being a logical / programming concept, and objects as
being a hardware concept.
While VRAM has the _image_ data for each object, this part of memory has the
_control_ data for each object. An objects "attributes" describe what part of
the VRAM to use, where to place is on the screen, any special graphical effects
to use, all that stuff. Each object has 6 bytes of attribute data (arranged as
three `u16` values), and there's a total of 128 objects (indexed 0 through 127).
But 6 bytes each times 128 entries out of 1024 bytes leaves us with 256 bytes
left over. What's the other space used for? Well, it's a little weird, but after
every three `u16` object attribute fields there's one `i16` "affine parameter"
field mixed in. It takes four such fields to make a complete set of affine
parameters (a 2x2 matrix), so we get a total of 32 affine parameter entries
across all of OAM. "Affine" might sound fancy but it just means a transformation
where anything that started parallel stays parallel after the transform. The
affine parameters can be used to scale, rotate, and/or skew a background or
object as it's being displayed on the screen. It takes more computing power than
the non-affine display, so you can't display as many different things at once
when using the affine modes.
OAM can't ever be written to with individual bytes. The write just has no effect
at all.
OAM follows the same "Video Memory Wait" rule that PALRAM has, **and** you can
also only freely access OAM during a horizontal blank if you set a special
"HBlank Interval Free" bit in one of the IO registers (the "Display Control"
register, which we'll talk about next lesson). The reason that you might _not_
want to set that bit is because when it's enabled you can't draw as many objects
at once. You don't lose the use of an exact number of objects, you actually lose
the use of a number of display adapter drawing cycles. Since not all objects
take the same number of cycles to render, it depends on what you're drawing.
GBATEK [has the details](https://problemkaputt.de/gbatek.htm#lcdobjoverview) if
you want to know precisely.
### Game Pak ROM (ROM)
* **Location:** Special (max of 32MB)
* **Bus:** 16-bit
* **Access:** Special
* **Wait Cycles:** Special
This is where your actual game is located! As you might guess, since each
cartridge is different, the details here depend quite a bit on the cartridge
that you use for your game. Even a simple statement like "you can't write to the
ROM region" isn't true for some carts if they have FlashROM.
The _most important_ thing to concern yourself with when considering the ROM
portion of memory is the 32MB limit. That's compiled code, images, sound,
everything put together. The total has to stay under 32MB.
The next most important thing to consider is that 16-bit bus. It means that we
compile our programs using "Thumb state" code instead of "ARM state" code.
Details about this can be found in the GBA Assembly section of the book, but
just be aware that there's two different types of assembly on the GBA. You can
switch between them, but the default for us is always Thumb state.
Another detail which you actually _don't_ have to think about much, but that you
might care if you're doing precise optimization, is that the ROM address space
is actually mirrored across three different locations:
* `0x800_0000` to `0x9FF_FFFF`: Wait State 0
* `0xA00_0000` to `0xBFF_FFFF`: Wait State 1
* `0xC00_0000` to `0xDFF_FFFF`: Wait State 2
These _don't_ mean 0, 1, and 2 wait cycles, they mean the wait cycles associated
with ROM mirrors 0, 1, and 2. On some carts the game will store different parts
of the data into different chips that are wired to be accessible through
different parts of the mirroring. The actual wait cycles used are even
configurable via an IO register called the
[WAITCNT](https://problemkaputt.de/gbatek.htm#gbasystemcontrol) ("Wait Control",
I don't know why C programmers have to give everything the worst names it's not
1980 any more).
### Save RAM (SRAM)
* **Location:** Special (max of 64k)
* **Bus:** 8-bit
* **Access:** Special
* **Wait Cycles:** Special
The Save RAM is also part of the cart that you've got your game on, so it also
depends on your hardware.
SRAM _starts_ at `0xE00_0000` and you can save up to however much the hardware
supports, to a maximum of 64k. However, you can only read and write SRAM one
_byte_ at a time. What's worse, while you can _write_ to SRAM using code
executing anywhere, you can only _read_ with code that's executing out of either
Internal or External Work RAM, not from with code that's executing out of ROM.
This means that you need to copy the code for doing the read into some scratch
space (either at startup or on the fly, doesn't matter) and call that function
you've carefully placed. It's a bit annoying, but soon enough a routine for it
all will be provided in the `gba` crate and we won't have to worry too much
about it.
(TODO: Provide the routine that I just claimed we would provide.)

48
book/src/volatile.md Normal file
View file

@ -0,0 +1,48 @@
# Volatile
I know that you just got your first program running and you're probably excited
to learn more about GBA stuff, but first we have to cover a subject that's not
quite GBA specific.
In the `hello_magic.rs` file we had these lines
```rust
(0x600_0000 as *mut u16).offset(120 + 80 * 240).write_volatile(0x001F);
(0x600_0000 as *mut u16).offset(136 + 80 * 240).write_volatile(0x03E0);
(0x600_0000 as *mut u16).offset(120 + 96 * 240).write_volatile(0x7C00);
```
You've probably seen or heard of the
[write](https://doc.rust-lang.org/core/ptr/fn.write.html) function before, but
you'd be excused if you've never heard of its cousin function,
[write_volatile](https://doc.rust-lang.org/core/ptr/fn.write_volatile.html).
What's the difference? Well, when the compiler sees normal reads and writes, it
assumes that those go into plain old memory locations. CPU registers, RAM,
wherever it is that the value's being placed. The compiler assumes that it's
safe to optimize away some of the reads and writes, or maybe issue the reads and
writes in a different order from what you wrote. Normally this is okay, and it's
exactly what we want the compiler to be doing, quietly making things faster for us.
However, some of the time we access values from parts of memory where it's
important that each access happen, and in the exact order that we say. In our
`hello_magic.rs` example, we're writing directly into the video memory of the
display. The compiler sees that the rest of the Rust program never read out of
those locations, so it might think "oh, we can skip those writes, they're
pointless". It doesn't know that we're having a side effect besides just storing
some value at an address.
By declaring a particular read or write to be `volatile` then we can force the
compiler to issue that access. Further, we're guaranteed that all `volatile`
access will happen in exactly the order it appears in the program relative to
other `volatile` access. However, non-volatile access can still be re-ordered
relative to a volatile access. In other words, for parts of the memory that are
volatile, we must _always_ use a volatile read or write for our program to
perform properly.
For exactly this reason, we've got the [voladdress](https://docs.rs/voladdress/)
crate. It used to be part of the GBA crate, but it became big enough to break
out into a stand alone crate. It doesn't even do too much, it just makes it a
lot less error prone to accidentally forget to use volatile with our memory
mapped addresses. We just call `read` and `write` on any `VolAddress` that we
happen to see and the right thing will happen.

View file

@ -133,7 +133,10 @@ extern "C" {
/// Memory in IWRAM _before_ this location is not free to use, you'll trash
/// your globals and stuff. Memory here or after is freely available for use
/// (careful that you don't run into your own stack of course).
static __bss_end: u8;
///
/// The actual value is unimportant, you just want to use the _address of_
/// this location as the start of your IWRAM usage.
pub static __bss_end: u8;
}
newtype! {