moving old text to new locations, notes of where to go

This commit is contained in:
Lokathor 2018-12-16 22:01:23 -07:00
parent 779770a187
commit 046e80851f
10 changed files with 210 additions and 264 deletions

View file

@ -1,256 +0,0 @@
# GBA Memory Mapping
The [GBA Memory Map](http://problemkaputt.de/gbatek.htm#gbamemorymap) has
several memory portions to it, each with their own little differences. Most of
the memory has pre-determined use according to the hardware, but there is also
space for games to use as a scratch pad in whatever way the game sees fit.
The memory ranges listed here are _inclusive_, so they end with a lot of F's
and E's.
We've talked about volatile memory before, but just as a reminder I'll say that
all of the memory we'll talk about here should be accessed using volatile with
two exceptions:
1) Work RAM (both internal and external) can be used normally, and if the
compiler is able to totally elide some reads and writes that's okay.
2) However, if you set aside any space in Work RAM where an interrupt will
communicate with the main program then that specific location will have to
keep using volatile access, since the compiler never knows when an interrupt
will actually happen.
## BIOS / System ROM
* `0x0` to `0x3FFF` (16k)
This is special memory for the BIOS. It is "read-only", but even then it's only
accessible when the program counter is pointing into the BIOS region. At all
other times you get a [garbage
value](http://problemkaputt.de/gbatek.htm#gbaunpredictablethings) back when you
try to read out of the BIOS.
## External Work RAM / EWRAM
* `0x2000000` to `0x203FFFF` (256k)
This is a big pile of space, the use of which is up to each game. However, the
external work ram has only a 16-bit bus (if you read/write a 32-bit value it
silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra
CPU cycles that you have to expend _per 16-bit bus use_).
It's most helpful to think of EWRAM as slower, distant memory, similar to the
"heap" in a normal application. You can take the time to go store something
within EWRAM, or to load it out of EWRAM, but if you've got several operations
to do in a row and you're worried about time you should pull that value into
local memory, work on your local copy, and then push it back out to EWRAM.
## Internal Work RAM / IWRAM
* `0x3000000` to `0x3007FFF` (32k)
This is a smaller pile of space, but it has a 32-bit bus and no wait.
By default, `0x3007F00` to `0x3007FFF` is reserved for interrupt and BIOS use.
The rest of it is totally up to you. The user's stack space starts at
`0x3007F00` and proceeds _down_ from there. For best results you should probably
start at `0x3000000` and then go upwards. Under normal use it's unlikely that
the two memory regions will crash into each other.
## IO Registers
* `0x4000000` to `0x40003FE`
We've touched upon a few of these so far, and we'll get to more later. At the
moment it is enough to say that, as you might have guessed, all of them live in
this region. Each individual register is a `u16` or `u32` and they control all
sorts of things. We'll actually be talking about some more of them in this very
chapter, because that's how we'll control some of the background and object
stuff.
## Palette RAM / PALRAM
* `0x5000000` to `0x50003FF` (1k)
Palette RAM has a 16-bit bus, which isn't really a problem because it
conceptually just holds `u16` values. There's no automatic wait state, but if
you try to access the same location that the display controller is accessing you
get bumped by 1 cycle. Since the display controller can use the palette ram any
number of times per scanline it's basically impossible to predict if you'll have
to do a wait or not during VDraw. During VBlank you won't have any wait of
course.
PALRAM is among the memory where there's weirdness if you try to write just one
byte: if you try to write just 1 byte, it writes that byte into _both_ parts of
the larger 16-bit location. This doesn't really affect us much with PALRAM,
because palette values are all supposed to be `u16` anyway.
The palette memory actually contains not one, but _two_ sets of palettes. First
there's 256 entries for the background palette data (starting at `0x5000000`),
and then there's 256 entries for object palette data (starting at `0x5000200`).
The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and
4-bits-per-pixel (4bpp).
* In 8bpp mode an 8-bit palette index value within a background or sprite
simply indexes directly into the 256 slots for that type of thing.
* In 4bpp mode a 4-bit palette index value within a background or sprite
specifies an index within a particular "palbank" (16 palette entries each),
and then a _separate_ setting outside of the graphical data determines which
palbank is to be used for that background or object (the screen entry data for
backgrounds, and the object attributes for objects).
### Transparency
When a pixel within a background or object specifies index 0 as its palette
entry it is treated as a transparent pixel. This means that in 8bpp mode there's
only 255 actual color options (0 being transparent), and in 4bpp mode there's
only 15 actual color options available within each palbank (the 0th entry of
_each_ palbank is transparent).
Individual backgrounds, and individual objects, each determine if they're 4bpp
or 8bpp separately, so a given overall palette slot might map to a used color in
8bpp and an unused/transparent color in 4bpp. If you're a palette wizard.
Palette slot 0 of the overall background palette is used to determine the
"backdrop" color. That's the color you see if no background or object ends up
being rendered within a given pixel.
Since display mode 3 and display mode 5 don't use the palette, they cannot
benefit from transparency.
## Video RAM / VRAM
* `0x6000000` to `0x6017FFF` (96k)
We've used this before! VRAM has a 16-bit bus and no wait. However, the same as
with PALRAM, the "you might have to wait if the display controller is looking at
it" rule applies here.
Unfortunately there's not much more exact detail that can be given about VRAM.
The use of the memory depends on the video mode that you're using.
One general detail of note is that you can't write individual bytes to any part
of VRAM. Depending on mode and location, you'll either get your bytes doubled
into both the upper and lower parts of the 16-bit location targeted, or you
won't even affect the memory. This usually isn't a big deal, except in two
situations:
* In Mode 4, if you want to change just 1 pixel, you'll have to be very careful
to read the old `u16`, overwrite just the byte you wanted to change, and then
write that back.
* In any display mode, avoid using `memcopy` to place things into VRAM.
It's written to be byte oriented, and only does 32-bit transfers under select
conditions. The rest of the time it'll copy one byte at a time and you'll get
either garbage or nothing at all.
## Object Attribute Memory / OAM
* `0x7000000` to `0x70003FF` (1k)
The Object Attribute Memory has a 32-bit bus and no default wait, but suffers
from the "you might have to wait if the display controller is looking at it"
rule. You cannot write individual bytes to OAM at all, but that's not really a
problem because all the fields of the data types within OAM are either `i16` or
`u16` anyway.
Object attribute memory is the wildest yet: it conceptually contains two types
of things, but they're _interlaced_ with each other all the way through.
Now, [GBATEK](http://problemkaputt.de/gbatek.htm#lcdobjoamattributes) and
[CowByte](https://www.cs.rit.edu/~tjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites))
doesn't quite give names to the two data types here.
[TONC](https://www.coranac.com/tonc/text/regobj.htm#sec-oam) calls them
`OBJ_ATTR` and `OBJ_AFFINE`, but we'll be giving them names fitting with the
Rust naming convention. Just know that if you try to talk about it with others
they might not be using the same names. In Rust terms their layout would look
like this:
```rust
#[repr(C)]
pub struct ObjectAttributes {
attr0: u16,
attr1: u16,
attr2: u16,
filler: i16,
}
#[repr(C)]
pub struct AffineMatrix {
filler0: [u16; 3],
pa: i16,
filler1: [u16; 3],
pb: i16,
filler2: [u16; 3],
pc: i16,
filler3: [u16; 3],
pd: i16,
}
```
(Note: the `#[repr(C)]` part just means that Rust must lay out the data exactly
in the order we specify, which otherwise it is not required to do).
So, we've got 1024 bytes in OAM and each `ObjectAttributes` value is 8 bytes, so
naturally we can support up to 128 objects.
_At the same time_, we've got 1024 bytes in OAM and each `AffineMatrix` is 32
bytes, so we can have 32 of them.
But, as I said, these things are all _interlaced_ with each other. See how
there's "filler" fields in each struct? If we imagine the OAM as being just an
array of one type or the other, indexes 0/1/2/3 of the `ObjectAttributes` array
would line up with index 0 of the `AffineMatrix` array. It's kinda weird, but
that's just how it works. When we setup functions to read and write these values
we'll have to be careful with how we do it. We probably _won't_ want to use
those representations above, at least not with the `AffineMatrix` type, because
they're quite wasteful if you want to store just object attributes or just
affine matrices.
## Game Pak ROM / Flash ROM
* `0x8000000` to `0x9FFFFFF` (wait 0)
* `0xA000000` to `0xBFFFFFF` (wait 1)
* `0xC000000` to `0xDFFFFFF` (wait 2)
* Max of 32Mb
These portions of the memory are less fixed, because they depend on the precise
details of the game pak you've inserted into the GBA. In general, they connect
to the game pak ROM and/or Flash memory, using a 16-bit bus. The ROM is
read-only, but the Flash memory (if any) allows writes.
The game pak ROM is listed as being in three sections, but it's actually the
same memory being effectively mirrored into three different locations. The
mirror that you choose to access the game pak through affects which wait state
setting it uses (configured via IO register of course). Unfortunately, the
details come down more to the game pak hardware that you load your game onto
than anything else, so there's not much I can say right here. We'll eventually
talk about it more later when I'm forced to do the boring thing and just cover
all the IO registers that aren't covered anywhere else.
One thing of note is the way that the 16-bit bus affects us: the instructions to
execute are coming through the same bus as the rest of the game data, so we want
them to be as compact as possible. The ARM chip in the GBA supports two
different instruction sets, "thumb" and "non-thumb". The thumb mode instructions
are 16-bit, so they can each be loaded one at a time, and the non-thumb
instructions are 32-bit, so we're at a penalty if we execute them directly out
of the game pak. However, some things will demand that we use non-thumb code, so
we'll have to deal with that eventually. It's possible to switch between modes,
but it's a pain to keep track of what mode you're in because there's not
currently support for it in Rust itself (perhaps some day). So we'll stick with
thumb code as much as we possibly can, that's why our target profile for our
builds starts with `thumbv4`.
## Game Pak SRAM
* `0xE000000` to `0xE00FFFF` (64k)
The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save?
Saving the whole game one byte at a time is why. The SRAM also has some amount
of wait, but as with the ROM, the details depend on your game pak hardware (and
also as with ROM, you can adjust the settings with an IO register, should you
need to).
One thing to note about the SRAM is that the GBA has a Direct Memory Access
(DMA) feature that can be used for bulk memory movements in some cases, but the
DMA _cannot_ access the SRAM region. You really are stuck reading and writing
one byte at a time when you're using the SRAM.

View file

@ -24,3 +24,15 @@ everything is obviously too much for just one section of the book. Instead you
get an overview of general IO register rules and advice. Each particular
register is described in the appropriate sections of either the Video or
Non-Video chapters.
## Bus Size
TODO: describe this
## Minimum Write Size
TODO: talk about parts where you can't write one byte at a time
## Volatile or Not?
TODO: discuss what memory should be used volatile style and what can be used normal style.

View file

@ -228,12 +228,12 @@ pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) {
I _hope_ this all makes sense by now.
## All The BIOS Functions
## Specific BIOS Functions
As for a full list of all the specific BIOS functions and their use, you should
For a full list of all the specific BIOS functions and their use you should
check the `gba::bios` module within the `gba` crate. There's just so many of
them that enumerating them all here wouldn't serve much purpose.
Which is not to say that we'll never cover any BIOS functions in this book!
Instead, we'll simply mention them when whenever they're relevent to the task at
hand (such as sound or waiting for vblank).
hand (such as controlling sound or waiting for vblank).

View file

@ -1 +1,28 @@
# Work RAM
## External Work RAM (EWRAM)
* **Address Span:** `0x2000000` to `0x203FFFF` (256k)
This is a big pile of space, the use of which is up to each game. However, the
external work ram has only a 16-bit bus (if you read/write a 32-bit value it
silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra
CPU cycles that you have to expend _per 16-bit bus use_).
It's most helpful to think of EWRAM as slower, distant memory, similar to the
"heap" in a normal application. You can take the time to go store something
within EWRAM, or to load it out of EWRAM, but if you've got several operations
to do in a row and you're worried about time you should pull that value into
local memory, work on your local copy, and then push it back out to EWRAM.
## Internal Work RAM (IWRAM)
* **Address Span:** `0x3000000` to `0x3007FFF` (32k)
This is a smaller pile of space, but it has a 32-bit bus and no wait.
By default, `0x3007F00` to `0x3007FFF` is reserved for interrupt and BIOS use.
The rest of it is mostly up to you. The user's stack space starts at `0x3007F00`
and proceeds _down_ from there. For best results you should probably start at
`0x3000000` and then go upwards. Under normal use it's unlikely that the two
memory regions will crash into each other.

View file

@ -1 +1,3 @@
# IO Registers
* **Address Span:** `0x400_0000` to `0x400_03FE`

View file

@ -1 +1,50 @@
# Palette RAM
# Palette RAM (PALRAM)
* **Address Span:** `0x500_0000` to `0x500_03FF` (1k)
Palette RAM has a 16-bit bus, which isn't really a problem because it
conceptually just holds `u16` values. There's no automatic wait state, but if
you try to access the same location that the display controller is accessing you
get bumped by 1 cycle. Since the display controller can use the palette ram any
number of times per scanline it's basically impossible to predict if you'll have
to do a wait or not during VDraw. During VBlank you won't have any wait of
course.
PALRAM is among the memory where there's weirdness if you try to write just one
byte: if you try to write just 1 byte, it writes that byte into _both_ parts of
the larger 16-bit location. This doesn't really affect us much with PALRAM,
because palette values are all supposed to be `u16` anyway.
The palette memory actually contains not one, but _two_ sets of palettes. First
there's 256 entries for the background palette data (starting at `0x5000000`),
and then there's 256 entries for object palette data (starting at `0x5000200`).
The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and
4-bits-per-pixel (4bpp).
* In 8bpp mode an 8-bit palette index value within a background or sprite
simply indexes directly into the 256 slots for that type of thing.
* In 4bpp mode a 4-bit palette index value within a background or sprite
specifies an index within a particular "palbank" (16 palette entries each),
and then a _separate_ setting outside of the graphical data determines which
palbank is to be used for that background or object (the screen entry data for
backgrounds, and the object attributes for objects).
### Transparency
When a pixel within a background or object specifies index 0 as its palette
entry it is treated as a transparent pixel. This means that in 8bpp mode there's
only 255 actual color options (0 being transparent), and in 4bpp mode there's
only 15 actual color options available within each palbank (the 0th entry of
_each_ palbank is transparent).
Individual backgrounds, and individual objects, each determine if they're 4bpp
or 8bpp separately, so a given overall palette slot might map to a used color in
8bpp and an unused/transparent color in 4bpp. If you're a palette wizard.
Palette slot 0 of the overall background palette is used to determine the
"backdrop" color. That's the color you see if no background or object ends up
being rendered within a given pixel.
Since display mode 3 and display mode 5 don't use the palette, they cannot
benefit from transparency.

View file

@ -1 +1,24 @@
# Video RAM
# Video RAM (VRAM)
* **Address Span:** `0x600_0000` to `0x601_7FFF` (96k)
We've used this before! VRAM has a 16-bit bus and no wait. However, the same as
with PALRAM, the "you might have to wait if the display controller is looking at
it" rule applies here.
Unfortunately there's not much more exact detail that can be given about VRAM.
The use of the memory depends on the video mode that you're using.
One general detail of note is that you can't write individual bytes to any part
of VRAM. Depending on mode and location, you'll either get your bytes doubled
into both the upper and lower parts of the 16-bit location targeted, or you
won't even affect the memory. This usually isn't a big deal, except in two
situations:
* In Mode 4, if you want to change just 1 pixel, you'll have to be very careful
to read the old `u16`, overwrite just the byte you wanted to change, and then
write that back.
* In any display mode, avoid using `memcopy` to place things into VRAM.
It's written to be byte oriented, and only does 32-bit transfers under select
conditions. The rest of the time it'll copy one byte at a time and you'll get
either garbage or nothing at all.

View file

@ -1 +1,62 @@
# Object Attribute Memory
# Object Attribute Memory (OAM)
* **Address Span:** `0x700_0000` to `0x700_03FF` (1k)
The Object Attribute Memory has a 32-bit bus and no default wait, but suffers
from the "you might have to wait if the display controller is looking at it"
rule. You cannot write individual bytes to OAM at all, but that's not really a
problem because all the fields of the data types within OAM are either `i16` or
`u16` anyway.
Object attribute memory is the wildest yet: it conceptually contains two types
of things, but they're _interlaced_ with each other all the way through.
Now, [GBATEK](http://problemkaputt.de/gbatek.htm#lcdobjoamattributes) and
[CowByte](https://www.cs.rit.edu/~tjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites))
doesn't quite give names to the two data types here.
[TONC](https://www.coranac.com/tonc/text/regobj.htm#sec-oam) calls them
`OBJ_ATTR` and `OBJ_AFFINE`, but we'll be giving them names fitting with the
Rust naming convention. Just know that if you try to talk about it with others
they might not be using the same names. In Rust terms their layout would look
like this:
```rust
#[repr(C)]
pub struct ObjectAttributes {
attr0: u16,
attr1: u16,
attr2: u16,
filler: i16,
}
#[repr(C)]
pub struct AffineMatrix {
filler0: [u16; 3],
pa: i16,
filler1: [u16; 3],
pb: i16,
filler2: [u16; 3],
pc: i16,
filler3: [u16; 3],
pd: i16,
}
```
(Note: the `#[repr(C)]` part just means that Rust must lay out the data exactly
in the order we specify, which otherwise it is not required to do).
So, we've got 1024 bytes in OAM and each `ObjectAttributes` value is 8 bytes, so
naturally we can support up to 128 objects.
_At the same time_, we've got 1024 bytes in OAM and each `AffineMatrix` is 32
bytes, so we can have 32 of them.
But, as I said, these things are all _interlaced_ with each other. See how
there's "filler" fields in each struct? If we imagine the OAM as being just an
array of one type or the other, indexes 0/1/2/3 of the `ObjectAttributes` array
would line up with index 0 of the `AffineMatrix` array. It's kinda weird, but
that's just how it works. When we setup functions to read and write these values
we'll have to be careful with how we do it. We probably _won't_ want to use
those representations above, at least not with the `AffineMatrix` type, because
they're quite wasteful if you want to store just object attributes or just
affine matrices.

View file

@ -1 +1,14 @@
# Game Pak ROM / Flash ROM
# Game Pak ROM / Flash ROM (ROM)
* **Address Span (Wait State 0):** `0x800_0000` to `0x9FF_FFFF`
* **Address Span (Wait State 1):** `0xA00_0000` to `0xBFF_FFFF`
* **Address Span (Wait State 2):** `0xC00_0000` to `0xDFF_FFFF`
The game's ROM data is a single set of data that's up to 32 megabytes in size.
However, that data is mirrored to three different locations in the address
space. Depending on which part of the address space you use, it can affect the
memory timings involved.
TODO: describe `WAITCNT` here, we won't get a better chance at it.
TODO: discuss THUMB vs ARM code and why THUMB is so much faster (because ROM is a 16-bit bus)

View file

@ -1 +1,16 @@
# Save RAM
# Save RAM (SRAM)
* **Address Span:** `0xE00_0000` to `0xE00FFFF` (64k)
The actual amount of SRAM available depends on your game pak, and the 64k figure
is simply the maximum possible. A particular game pak might have less, and an
emulator will likely let you have all 64k if you want.
As with other portions of the address space, SRAM has some number of wait cycles
per use. As with ROM, you can change the wait cycle settings via the `WAITCNT`
register if the defaults don't work well for your game pak. See the ROM section
for full details of how the `WAITCNT` register works.
The game pak SRAM also has only an 8-bit bus, so have fun with that.
The GBA Direct Memory Access (DMA) unit cannot access SRAM.