Deploy rust-console/gba to github.com/rust-console/gba.git:master

DocsBot (from Travis CI) 2018-11-21 07:18:34 +00:00
parent 29cc0015d5
commit e86be64fb7
7 changed files with 296 additions and 66 deletions

BIN
devkitpro-pacman.deb Normal file

Binary file not shown.


@ -165,10 +165,11 @@ reason, you'll still want devkitpro for the <code>gbafix</code> utility.</p>
<code>C:\devkitpro\tools\bin</code> to be <a href="https://stackoverflow.com/q/44272416/455232">added to your <code>C:\devkitpro\tools\bin</code> to be <a href="https://stackoverflow.com/q/44272416/455232">added to your
PATH</a>, depending on where you PATH</a>, depending on where you
installed it to and such.</li> installed it to and such.</li>
<li>On Linux you'll also want it to be added to your path, but if you're using <li>On Linux you can use pacman to get it, and the default install puts the stuff
Linux I'll just assume you know how to do all that. I'm told that the default in <code>/opt/devkitpro/devkitARM/bin</code> and <code>/opt/devkitpro/tools/bin</code>. If you need
installation path is <code>/opt/devkitpro/devkitARM/bin</code>, so look there first if help you can look in our repository's
you didn't select some other place.</li> <a href="https://github.com/rust-console/gba/blob/master/.travis.yml">.travis.yml</a>
file to see exactly what our CI does.</li>
</ul> </ul>
<p>Finally, you'll need <code>cargo-xbuild</code>. Just run <code>cargo install cargo-xbuild</code> and <p>Finally, you'll need <code>cargo-xbuild</code>. Just run <code>cargo install cargo-xbuild</code> and
cargo will figure it all out for you.</p> cargo will figure it all out for you.</p>


@ -141,14 +141,14 @@
several memory portions to it, each with their own little differences. Most of several memory portions to it, each with their own little differences. Most of
the memory has pre-determined use according to the hardware, but there is also the memory has pre-determined use according to the hardware, but there is also
space for games to use as a scratch pad in whatever way the game sees fit.</p> space for games to use as a scratch pad in whatever way the game sees fit.</p>
<p>The memory ranges listed here are <em>inclusive</em>, so they end with a lot of <code>F</code>s <p>The memory ranges listed here are <em>inclusive</em>, so they end with a lot of F's
and <code>E</code>s.</p> and E's.</p>
<p>We've talked about volatile memory before, but just as a reminder I'll say that <p>We've talked about volatile memory before, but just as a reminder I'll say that
all of the memory we'll talk about here should be accessed with volatile with all of the memory we'll talk about here should be accessed using volatile with
two exceptions:</p> two exceptions:</p>
<ol> <ol>
<li>Work RAM (both internal and external) can be used normally, and if the <li>Work RAM (both internal and external) can be used normally, and if the
compiler is able to totally elide any reads and writes that's okay.</li> compiler is able to totally elide some reads and writes that's okay.</li>
<li>However, if you set aside any space in Work RAM where an interrupt will <li>However, if you set aside any space in Work RAM where an interrupt will
communicate with the main program then that specific location will have to communicate with the main program then that specific location will have to
keep using volatile access, since the compiler never knows when an interrupt keep using volatile access, since the compiler never knows when an interrupt
@ -171,13 +171,11 @@ try to read out of the BIOS.</p>
external work ram has only a 16-bit bus (if you read/write a 32-bit value it external work ram has only a 16-bit bus (if you read/write a 32-bit value it
silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra
CPU cycles that you have to expend <em>per 16-bit bus use</em>).</p> CPU cycles that you have to expend <em>per 16-bit bus use</em>).</p>
<p>In other words, we should think of EWRAM as if it was &quot;heap space&quot; in a normal <p>It's most helpful to think of EWRAM as slower, distant memory, similar to the
application. You can take the time to go store something within EWRAM, or to &quot;heap&quot; in a normal application. You can take the time to go store something
load it out of EWRAM, but you should always avoid doing a critical computation within EWRAM, or to load it out of EWRAM, but if you've got several operations
on values in EWRAM. It's a bit of a pain, but if you wanna be speedy and you to do in a row and you're worried about time you should pull that value into
have more than just one manipulation that you want to do, you should pull the local memory, work on your local copy, and then push it back out to EWRAM.</p>
value into a local variable, do all of your manipulations, and then push it back
out at the end.</p>
<a class="header" href="#internal-work-ram--iwram" id="internal-work-ram--iwram"><h2>Internal Work RAM / IWRAM</h2></a> <a class="header" href="#internal-work-ram--iwram" id="internal-work-ram--iwram"><h2>Internal Work RAM / IWRAM</h2></a>
<ul> <ul>
<li><code>0x3000000</code> to <code>0x3007FFF</code> (32k)</li> <li><code>0x3000000</code> to <code>0x3007FFF</code> (32k)</li>
@ -185,11 +183,9 @@ out at the end.</p>
<p>This is a smaller pile of space, but it has a 32-bit bus and no wait.</p> <p>This is a smaller pile of space, but it has a 32-bit bus and no wait.</p>
<p>By default, <code>0x3007F00</code> to <code>0x3007FFF</code> is reserved for interrupt and BIOS use. <p>By default, <code>0x3007F00</code> to <code>0x3007FFF</code> is reserved for interrupt and BIOS use.
The rest of it is totally up to you. The user's stack space starts at The rest of it is totally up to you. The user's stack space starts at
<code>0x3007F00</code> and proceeds <em>down</em> from there. In other words, if you start your <code>0x3007F00</code> and proceeds <em>down</em> from there. For best results you should probably
own customized IWRAM use at <code>0x3000000</code> and go up, eventually you might hit your start at <code>0x3000000</code> and then go upwards. Under normal use it's unlikely that
stack. However, most reasonable uses won't actually cause a memory collision. the two memory regions will crash into each other.</p>
It's just something you should know about if you're using a ton of stack or
IWRAM and then get problems.</p>
<a class="header" href="#io-registers" id="io-registers"><h2>IO Registers</h2></a> <a class="header" href="#io-registers" id="io-registers"><h2>IO Registers</h2></a>
<ul> <ul>
<li><code>0x4000000</code> to <code>0x40003FE</code></li> <li><code>0x4000000</code> to <code>0x40003FE</code></li>
@ -221,9 +217,9 @@ and then there's 256 entries for object palette data (starting at <code>0x500020
<p>The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and <p>The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and
4-bits-per-pixel (4bpp).</p> 4-bits-per-pixel (4bpp).</p>
<ul> <ul>
<li>In 8bpp mode an (8-bit) palette index value within a background or sprite <li>In 8bpp mode an 8-bit palette index value within a background or sprite
simply indexes directly into the 256 slots for that type of thing.</li> simply indexes directly into the 256 slots for that type of thing.</li>
<li>In 4bpp mode a (4-bit) palette index value within a background or sprite <li>In 4bpp mode a 4-bit palette index value within a background or sprite
specifies an index within a particular &quot;palbank&quot; (16 palette entries each), specifies an index within a particular &quot;palbank&quot; (16 palette entries each),
and then a <em>separate</em> setting outside of the graphical data determines which and then a <em>separate</em> setting outside of the graphical data determines which
palbank is to be used for that background or object (the screen entry data for palbank is to be used for that background or object (the screen entry data for
@ -265,15 +261,17 @@ problem because all the fields of the data types within OAM are either <code>i16
of things, but they're <em>interlaced</em> with each other all the way through.</p> of things, but they're <em>interlaced</em> with each other all the way through.</p>
<p>Now, <a href="http://problemkaputt.de/gbatek.htm#lcdobjoamattributes">GBATEK</a> and <p>Now, <a href="http://problemkaputt.de/gbatek.htm#lcdobjoamattributes">GBATEK</a> and
<a href="https://www.cs.rit.edu/%7Etjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites)">CowByte</a> <a href="https://www.cs.rit.edu/%7Etjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites)">CowByte</a>
doesn't quite give names to the two data types, though doesn't quite give names to the two data types here.
<a href="https://www.coranac.com/tonc/text/regobj.htm#sec-oam">TONC</a> calls them <a href="https://www.coranac.com/tonc/text/regobj.htm#sec-oam">TONC</a> calls them
<code>OBJ_ATTR</code> and <code>OBJ_AFFINE</code>. We'll give them Rust names of course. In Rust terms <code>OBJ_ATTR</code> and <code>OBJ_AFFINE</code>, but we'll be giving them names fitting with the
their layout would look like this:</p> Rust naming convention. Just know that if you try to talk about it with others
they might not be using the same names. In Rust terms their layout would look
like this:</p>
<pre><pre class="playpen"><code class="language-rust"> <pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)] # #![allow(unused_variables)]
#fn main() { #fn main() {
#[repr(C)] #[repr(C)]
pub struct ObjectAttribute { pub struct ObjectAttributes {
attr0: u16, attr0: u16,
attr1: u16, attr1: u16,
attr2: u16, attr2: u16,
@ -294,13 +292,13 @@ pub struct AffineMatrix {
#}</code></pre></pre> #}</code></pre></pre>
<p>(Note: the <code>#[repr(C)]</code> part just means that Rust must lay out the data exactly <p>(Note: the <code>#[repr(C)]</code> part just means that Rust must lay out the data exactly
in the order we specify, which otherwise it is not required to do).</p> in the order we specify, which otherwise it is not required to do).</p>
<p>So, we've got 1024 bytes in OAM and each <code>ObjectAttribute</code> value is 8 bytes, so <p>So, we've got 1024 bytes in OAM and each <code>ObjectAttributes</code> value is 8 bytes, so
naturally we can support up to 128 objects.</p> naturally we can support up to 128 objects.</p>
<p><em>At the same time</em>, we've got 1024 bytes in OAM and each <code>AffineMatrix</code> is 32 <p><em>At the same time</em>, we've got 1024 bytes in OAM and each <code>AffineMatrix</code> is 32
bytes, so we can have 32 of them.</p> bytes, so we can have 32 of them.</p>
<p>But, as I said, these things are all <em>interlaced</em> with each other. See how <p>But, as I said, these things are all <em>interlaced</em> with each other. See how
there's &quot;filler&quot; fields in each struct? If we imagine the OAM as being just an there's &quot;filler&quot; fields in each struct? If we imagine the OAM as being just an
array of one type or the other, indexes 0/1/2/3 of the <code>ObjectAttribute</code> array array of one type or the other, indexes 0/1/2/3 of the <code>ObjectAttributes</code> array
would line up with index 0 of the <code>AffineMatrix</code> array. It's kinda weird, but would line up with index 0 of the <code>AffineMatrix</code> array. It's kinda weird, but
that's just how it works. When we setup functions to read and write these values that's just how it works. When we setup functions to read and write these values
we'll have to be careful with how we do it. We probably <em>won't</em> want to use we'll have to be careful with how we do it. We probably <em>won't</em> want to use
@ -324,7 +322,8 @@ mirror that you choose to access the game pak through affects which wait state
setting it uses (configured via IO register of course). Unfortunately, the setting it uses (configured via IO register of course). Unfortunately, the
details come down more to the game pak hardware that you load your game onto details come down more to the game pak hardware that you load your game onto
than anything else, so there's not much I can say right here. We'll eventually than anything else, so there's not much I can say right here. We'll eventually
talk about it more later,</p> talk about it more later when I'm forced to do the boring thing and just cover
all the IO registers that aren't covered anywhere else.</p>
<p>One thing of note is the way that the 16-bit bus affects us: the instructions to <p>One thing of note is the way that the 16-bit bus affects us: the instructions to
execute are coming through the same bus as the rest of the game data, so we want execute are coming through the same bus as the rest of the game data, so we want
them to be as compact as possible. The ARM chip in the GBA supports two them to be as compact as possible. The ARM chip in the GBA supports two
@ -342,9 +341,10 @@ builds starts with <code>thumbv4</code>.</p>
<li><code>0xE000000</code> to <code>0xE00FFFF</code> (64k)</li> <li><code>0xE000000</code> to <code>0xE00FFFF</code> (64k)</li>
</ul> </ul>
<p>The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save? <p>The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save?
This is why. It also has some amount of wait, but as with the ROM, the details Saving the whole game one byte at a time is why. The SRAM also has some amount
depend on your game pak hardware (and also as with ROM, you can adjust the of wait, but as with the ROM, the details depend on your game pak hardware (and
settings with an IO register, should you need to).</p> also as with ROM, you can adjust the settings with an IO register, should you
need to).</p>
<p>One thing to note about the SRAM is that the GBA has a Direct Memory Access <p>One thing to note about the SRAM is that the GBA has a Direct Memory Access
(DMA) feature that can be used for bulk memory movements in some cases, but the (DMA) feature that can be used for bulk memory movements in some cases, but the
DMA <em>cannot</em> access the SRAM region. You really are stuck reading and writing DMA <em>cannot</em> access the SRAM region. You really are stuck reading and writing


@ -137,6 +137,120 @@
<div id="content" class="content"> <div id="content" class="content">
<main> <main>
<a class="header" href="#tile-data" id="tile-data"><h1>Tile Data</h1></a> <a class="header" href="#tile-data" id="tile-data"><h1>Tile Data</h1></a>
<p>When using the GBA's hardware graphics, if you want to let the hardware do most
of the work you have to use Modes 0, 1 or 2. However, to do that we first have
to learn about how tile data works inside of the GBA.</p>
<a class="header" href="#tiles" id="tiles"><h2>Tiles</h2></a>
<p>Fundamentally, a tile is an 8x8 image. If you want anything bigger than 8x8 you
need to arrange several tiles so that it looks like whatever you're trying to
draw.</p>
<p>As was already mentioned, the GBA supports two different color modes: 4 bits per
pixel and 8 bits per pixel. This means that we have two types of tile that we
need to model. The pixel bits always represent an index into the PALRAM.</p>
<ul>
<li>With 4 bits per pixel, the PALRAM is imagined to be 16 <strong>palbank</strong> sections of
16 palette entries each. The image data selects the index within the palbank,
and an external configuration selects which palbank is used.</li>
<li>With 8 bits per pixel, the PALRAM is imagined to be a single 256 entry array
and the index just directly picks which of the 256 colors is used.</li>
</ul>
<p>Knowing this, we can write the following definitions:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile4bpp {
data: [u32; 8]
}
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile8bpp {
data: [u32; 16]
}
#}</code></pre></pre>
<p>I hope this makes sense so far. At 4bpp, we have 4 bits per pixel, times 8
pixels per line, times 8 lines: 256 bits required. Similarly, at 8 bits per
pixel we'll need 512 bits. Why are we defining them as arrays of <code>u32</code> values?
Because when it comes time to do bulk copies the fastest way to do it will be to go
one whole machine word at a time. If we make the data inside the type be an
array of <code>u32</code> then it'll already be aligned for fast <code>u32</code> bulk copies.</p>
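<p>As a quick sanity check on that arithmetic, here is a minimal sketch that repeats the two tile definitions and compares their sizes against the bit counts worked out above. The helper names <code>tile_bits</code>, <code>tile4_bytes</code> and <code>tile8_bytes</code> are made up for this illustration and aren't part of any crate.</p>

```rust
use core::mem::size_of;

/// One 8x8 tile at 4 bits per pixel (same layout as the definition above).
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile4bpp {
  data: [u32; 8],
}

/// One 8x8 tile at 8 bits per pixel (same layout as the definition above).
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile8bpp {
  data: [u32; 16],
}

/// Hypothetical helper: bits needed for one tile at the given color depth,
/// which is bits per pixel, times 8 pixels per line, times 8 lines.
pub const fn tile_bits(bits_per_pixel: usize) -> usize {
  bits_per_pixel * 8 * 8
}

/// Hypothetical helper: size of a 4bpp tile in bytes.
pub const fn tile4_bytes() -> usize {
  size_of::<Tile4bpp>()
}

/// Hypothetical helper: size of an 8bpp tile in bytes.
pub const fn tile8_bytes() -> usize {
  size_of::<Tile8bpp>()
}
```

<p>Because both structs are <code>#[repr(transparent)]</code> wrappers around <code>u32</code> arrays, they should also inherit the alignment of <code>u32</code>, which is what makes the word-at-a-time copies possible.</p>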
<p>Keeping track of the current color depth is naturally the <em>programmer's</em>
problem. If you get it wrong you'll see a whole ton of garbage pixels all over
the screen, and you'll probably be able to guess why. You know, unless you did
one of the other things that can make a bunch of garbage pixels show up all over
the screen. Graphics programming is fun like that.</p>
<a class="header" href="#charblocks" id="charblocks"><h2>Charblocks</h2></a>
<p>Tiles don't just sit on their own, they get grouped into <strong>charblocks</strong>. Long
ago in the distant past, video games were built with hardware that was also used
to make text terminals. So tile image data was called &quot;character data&quot;. In fact
some guides will even call the regular mode for the background layers &quot;text
mode&quot;, despite the fact that you obviously don't have to show text at all.</p>
<p>A charblock is 16 KB long (<code>0x4000</code> bytes), which means that the number of tiles
that fit into a charblock depends on your color depth. With 4bpp you get 512
tiles, and with 8bpp you get 256 tiles. So they'd look something like this:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct Charblock4bpp {
data: [Tile4bpp; 512],
}
#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct Charblock8bpp {
data: [Tile8bpp; 256],
}
#}</code></pre></pre>
<p>You'll note that we can't even derive <code>Debug</code> or <code>Default</code> any more because the
arrays are so big. Rust supports <code>Clone</code> and <code>Copy</code> for arrays of any size, but the
other traits are still only implemented for arrays of 32 elements or fewer. We won't generally be making up an entire
Charblock on the fly though, so it's not a big deal. If we <em>absolutely</em> had to,
we could call <code>core::mem::zeroed()</code>, but we really don't want to be trying to
build a whole charblock at runtime. We'll usually want to define our tile data
as <code>const</code> charblock values (or even parts of charblock values) that we then
load out of the game pak ROM at runtime.</p>
<p>Anyway, with 16k per charblock and only 96k total in VRAM, it's easy math to see
that there are 6 different charblocks in VRAM when in a tiled mode. The first four
of these are for backgrounds, and the other two are for objects. There are rules
for how a tile ID on a background or object selects a tile within a charblock,
but since those rules differ between backgrounds and objects we'll cover them on
their respective pages.</p>
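<p>The charblock numbers above can be written down as a handful of constants. This is only a sketch of the arithmetic: every name here is invented for the example, and the offsets are relative to the start of VRAM rather than absolute addresses.</p>

```rust
/// Size of one charblock in bytes (16k).
pub const CHARBLOCK_BYTES: usize = 0x4000;
/// Total VRAM size in bytes (96k).
pub const VRAM_BYTES: usize = 96 * 1024;
/// Size of one 4bpp tile in bytes (256 bits).
pub const TILE4_BYTES: usize = 32;
/// Size of one 8bpp tile in bytes (512 bits).
pub const TILE8_BYTES: usize = 64;
/// Index of the first charblock used for objects (the first four are for backgrounds).
pub const OBJ_CHARBLOCK_START: usize = 4;

/// How many tiles of the given byte size fit in one charblock.
pub const fn tiles_per_charblock(tile_bytes: usize) -> usize {
  CHARBLOCK_BYTES / tile_bytes
}

/// How many whole charblocks fit in VRAM when in a tiled mode.
pub const fn charblock_count() -> usize {
  VRAM_BYTES / CHARBLOCK_BYTES
}

/// Byte offset of a charblock from the start of VRAM.
pub const fn charblock_offset(index: usize) -> usize {
  index * CHARBLOCK_BYTES
}
```

<p>For example, <code>charblock_offset(OBJ_CHARBLOCK_START)</code> gives the offset where the two object charblocks begin.</p>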
<a class="header" href="#image-editing" id="image-editing"><h2>Image Editing</h2></a>
<p>It's very important to note that if you use a normal image editor you'll get
very bad results if you translate that directly into GBA memory.</p>
<p>Imagine you have part of an image that's 16 by 16 pixels, aka 2 tiles by 2
tiles. The data for that bitmap is the 1st row of the 1st tile, then the 1st row
of the 2nd tile. However, when we translate that into the GBA, the first 8
pixels will indeed be the first 8 tile pixels, but then the next 8 pixels in
memory will be used as the <em>2nd row of the first tile</em>, not the 1st row of the
2nd tile.</p>
<p>So, how do we fix this?</p>
<p>Well, the simple but annoying way is to edit your tile image as being an 8 pixel
wide image and then have the image get super tall as you add more and more
tiles. It can work, but it's really impractical if you have any multi-tile
things that you're trying to do.</p>
<p>Instead, there are some image conversion tools that devkitpro provides in their
gba-dev section. They let you take normal images, repackage them, and
export them in various formats that you can then compile into your project.</p>
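<p>To make the reordering concrete, here is a rough sketch of the kind of shuffle involved. It is not how grit or any other tool is actually implemented. It assumes an 8bpp image stored one palette index per byte in ordinary row-major order, and it uses <code>Vec</code>, so it's meant to run on your development machine (in a build script, say) rather than on the GBA itself. The function name <code>linear_to_tiled</code> is made up for this example.</p>

```rust
/// Hypothetical helper: reorders a row-major 8bpp bitmap (one palette index
/// per byte) so that each 8x8 tile's 64 pixels are contiguous, with tiles
/// emitted left to right and then top to bottom.
pub fn linear_to_tiled(pixels: &[u8], width: usize, height: usize) -> Vec<u8> {
  // Both dimensions must be whole numbers of tiles.
  assert!(width % 8 == 0 && height % 8 == 0);
  assert_eq!(pixels.len(), width * height);

  let mut out = Vec::with_capacity(pixels.len());
  for tile_y in 0..height / 8 {
    for tile_x in 0..width / 8 {
      // Copy the 8 rows of this tile, 8 pixels at a time.
      for row in 0..8 {
        let start = (tile_y * 8 + row) * width + tile_x * 8;
        out.extend_from_slice(&pixels[start..start + 8]);
      }
    }
  }
  out
}
```

<p>With a 16 by 16 input, the first 8 output bytes are the 1st row of the 1st tile, the next 8 are the 2nd row of the 1st tile, and the 1st row of the 2nd tile doesn't show up until byte 64.</p>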
<p>Ketsuban uses the <a href="http://www.coranac.com/projects/grit/">grit</a> tool, with the
following suggestions:</p>
<ol>
<li>Include an actual resource file and a file describing it somewhere in your
project (see <a href="http://www.coranac.com/man/grit/html/index.htm">the grit
manual</a> for all details
involved here).</li>
<li>In a <code>build.rs</code> you run <code>grit</code> on each resource+description pair, such as in
this <a href="https://gist.github.com/ketsuban/526fa55fbef0a3ccd4c7cd6204f29f94">old gist
example</a>.</li>
<li>Then within your Rust code you use the
<a href="https://doc.rust-lang.org/core/macro.include_bytes.html">include_bytes!</a>
macro to have the formatted resource be available as a const value you can
load at runtime.</li>
</ol>
</main> </main>


@ -214,10 +214,11 @@ reason, you'll still want devkitpro for the <code>gbafix</code> utility.</p>
<code>C:\devkitpro\tools\bin</code> to be <a href="https://stackoverflow.com/q/44272416/455232">added to your <code>C:\devkitpro\tools\bin</code> to be <a href="https://stackoverflow.com/q/44272416/455232">added to your
PATH</a>, depending on where you PATH</a>, depending on where you
installed it to and such.</li> installed it to and such.</li>
<li>On Linux you'll also want it to be added to your path, but if you're using <li>On Linux you can use pacman to get it, and the default install puts the stuff
Linux I'll just assume you know how to do all that. I'm told that the default in <code>/opt/devkitpro/devkitARM/bin</code> and <code>/opt/devkitpro/tools/bin</code>. If you need
installation path is <code>/opt/devkitpro/devkitARM/bin</code>, so look there first if help you can look in our repository's
you didn't select some other place.</li> <a href="https://github.com/rust-console/gba/blob/master/.travis.yml">.travis.yml</a>
file to see exactly what our CI does.</li>
</ul> </ul>
<p>Finally, you'll need <code>cargo-xbuild</code>. Just run <code>cargo install cargo-xbuild</code> and <p>Finally, you'll need <code>cargo-xbuild</code>. Just run <code>cargo install cargo-xbuild</code> and
cargo will figure it all out for you.</p> cargo will figure it all out for you.</p>
@ -1306,14 +1307,14 @@ check, and then if they match the pair disappears.</p>
several memory portions to it, each with their own little differences. Most of several memory portions to it, each with their own little differences. Most of
the memory has pre-determined use according to the hardware, but there is also the memory has pre-determined use according to the hardware, but there is also
space for games to use as a scratch pad in whatever way the game sees fit.</p> space for games to use as a scratch pad in whatever way the game sees fit.</p>
<p>The memory ranges listed here are <em>inclusive</em>, so they end with a lot of <code>F</code>s <p>The memory ranges listed here are <em>inclusive</em>, so they end with a lot of F's
and <code>E</code>s.</p> and E's.</p>
<p>We've talked about volatile memory before, but just as a reminder I'll say that <p>We've talked about volatile memory before, but just as a reminder I'll say that
all of the memory we'll talk about here should be accessed with volatile with all of the memory we'll talk about here should be accessed using volatile with
two exceptions:</p> two exceptions:</p>
<ol> <ol>
<li>Work RAM (both internal and external) can be used normally, and if the <li>Work RAM (both internal and external) can be used normally, and if the
compiler is able to totally elide any reads and writes that's okay.</li> compiler is able to totally elide some reads and writes that's okay.</li>
<li>However, if you set aside any space in Work RAM where an interrupt will <li>However, if you set aside any space in Work RAM where an interrupt will
communicate with the main program then that specific location will have to communicate with the main program then that specific location will have to
keep using volatile access, since the compiler never knows when an interrupt keep using volatile access, since the compiler never knows when an interrupt
@ -1336,13 +1337,11 @@ try to read out of the BIOS.</p>
external work ram has only a 16-bit bus (if you read/write a 32-bit value it external work ram has only a 16-bit bus (if you read/write a 32-bit value it
silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra silently breaks it up into two 16-bit operations) and also 2 wait cycles (extra
CPU cycles that you have to expend <em>per 16-bit bus use</em>).</p> CPU cycles that you have to expend <em>per 16-bit bus use</em>).</p>
<p>In other words, we should think of EWRAM as if it was &quot;heap space&quot; in a normal <p>It's most helpful to think of EWRAM as slower, distant memory, similar to the
application. You can take the time to go store something within EWRAM, or to &quot;heap&quot; in a normal application. You can take the time to go store something
load it out of EWRAM, but you should always avoid doing a critical computation within EWRAM, or to load it out of EWRAM, but if you've got several operations
on values in EWRAM. It's a bit of a pain, but if you wanna be speedy and you to do in a row and you're worried about time you should pull that value into
have more than just one manipulation that you want to do, you should pull the local memory, work on your local copy, and then push it back out to EWRAM.</p>
value into a local variable, do all of your manipulations, and then push it back
out at the end.</p>
<a class="header" href="#internal-work-ram--iwram" id="internal-work-ram--iwram"><h2>Internal Work RAM / IWRAM</h2></a> <a class="header" href="#internal-work-ram--iwram" id="internal-work-ram--iwram"><h2>Internal Work RAM / IWRAM</h2></a>
<ul> <ul>
<li><code>0x3000000</code> to <code>0x3007FFF</code> (32k)</li> <li><code>0x3000000</code> to <code>0x3007FFF</code> (32k)</li>
@ -1350,11 +1349,9 @@ out at the end.</p>
<p>This is a smaller pile of space, but it has a 32-bit bus and no wait.</p> <p>This is a smaller pile of space, but it has a 32-bit bus and no wait.</p>
<p>By default, <code>0x3007F00</code> to <code>0x3007FFF</code> is reserved for interrupt and BIOS use. <p>By default, <code>0x3007F00</code> to <code>0x3007FFF</code> is reserved for interrupt and BIOS use.
The rest of it is totally up to you. The user's stack space starts at The rest of it is totally up to you. The user's stack space starts at
<code>0x3007F00</code> and proceeds <em>down</em> from there. In other words, if you start your <code>0x3007F00</code> and proceeds <em>down</em> from there. For best results you should probably
own customized IWRAM use at <code>0x3000000</code> and go up, eventually you might hit your start at <code>0x3000000</code> and then go upwards. Under normal use it's unlikely that
stack. However, most reasonable uses won't actually cause a memory collision. the two memory regions will crash into each other.</p>
It's just something you should know about if you're using a ton of stack or
IWRAM and then get problems.</p>
<a class="header" href="#io-registers-1" id="io-registers-1"><h2>IO Registers</h2></a> <a class="header" href="#io-registers-1" id="io-registers-1"><h2>IO Registers</h2></a>
<ul> <ul>
<li><code>0x4000000</code> to <code>0x40003FE</code></li> <li><code>0x4000000</code> to <code>0x40003FE</code></li>
@ -1386,9 +1383,9 @@ and then there's 256 entries for object palette data (starting at <code>0x500020
<p>The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and <p>The GBA also has two modes for palette access: 8-bits-per-pixel (8bpp) and
4-bits-per-pixel (4bpp).</p> 4-bits-per-pixel (4bpp).</p>
<ul> <ul>
<li>In 8bpp mode an (8-bit) palette index value within a background or sprite <li>In 8bpp mode an 8-bit palette index value within a background or sprite
simply indexes directly into the 256 slots for that type of thing.</li> simply indexes directly into the 256 slots for that type of thing.</li>
<li>In 4bpp mode a (4-bit) palette index value within a background or sprite <li>In 4bpp mode a 4-bit palette index value within a background or sprite
specifies an index within a particular &quot;palbank&quot; (16 palette entries each), specifies an index within a particular &quot;palbank&quot; (16 palette entries each),
and then a <em>separate</em> setting outside of the graphical data determines which and then a <em>separate</em> setting outside of the graphical data determines which
palbank is to be used for that background or object (the screen entry data for palbank is to be used for that background or object (the screen entry data for
@ -1430,15 +1427,17 @@ problem because all the fields of the data types within OAM are either <code>i16
of things, but they're <em>interlaced</em> with each other all the way through.</p> of things, but they're <em>interlaced</em> with each other all the way through.</p>
<p>Now, <a href="http://problemkaputt.de/gbatek.htm#lcdobjoamattributes">GBATEK</a> and <p>Now, <a href="http://problemkaputt.de/gbatek.htm#lcdobjoamattributes">GBATEK</a> and
<a href="https://www.cs.rit.edu/%7Etjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites)">CowByte</a> <a href="https://www.cs.rit.edu/%7Etjh8300/CowBite/CowBiteSpec.htm#OAM%20(sprites)">CowByte</a>
doesn't quite give names to the two data types, though doesn't quite give names to the two data types here.
<a href="https://www.coranac.com/tonc/text/regobj.htm#sec-oam">TONC</a> calls them <a href="https://www.coranac.com/tonc/text/regobj.htm#sec-oam">TONC</a> calls them
<code>OBJ_ATTR</code> and <code>OBJ_AFFINE</code>. We'll give them Rust names of course. In Rust terms <code>OBJ_ATTR</code> and <code>OBJ_AFFINE</code>, but we'll be giving them names fitting with the
their layout would look like this:</p> Rust naming convention. Just know that if you try to talk about it with others
they might not be using the same names. In Rust terms their layout would look
like this:</p>
<pre><pre class="playpen"><code class="language-rust"> <pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)] # #![allow(unused_variables)]
#fn main() { #fn main() {
#[repr(C)] #[repr(C)]
pub struct ObjectAttribute { pub struct ObjectAttributes {
attr0: u16, attr0: u16,
attr1: u16, attr1: u16,
attr2: u16, attr2: u16,
@ -1459,13 +1458,13 @@ pub struct AffineMatrix {
#}</code></pre></pre> #}</code></pre></pre>
<p>(Note: the <code>#[repr(C)]</code> part just means that Rust must lay out the data exactly <p>(Note: the <code>#[repr(C)]</code> part just means that Rust must lay out the data exactly
in the order we specify, which otherwise it is not required to do).</p> in the order we specify, which otherwise it is not required to do).</p>
<p>So, we've got 1024 bytes in OAM and each <code>ObjectAttribute</code> value is 8 bytes, so <p>So, we've got 1024 bytes in OAM and each <code>ObjectAttributes</code> value is 8 bytes, so
naturally we can support up to 128 objects.</p> naturally we can support up to 128 objects.</p>
<p><em>At the same time</em>, we've got 1024 bytes in OAM and each <code>AffineMatrix</code> is 32 <p><em>At the same time</em>, we've got 1024 bytes in OAM and each <code>AffineMatrix</code> is 32
bytes, so we can have 32 of them.</p> bytes, so we can have 32 of them.</p>
<p>But, as I said, these things are all <em>interlaced</em> with each other. See how <p>But, as I said, these things are all <em>interlaced</em> with each other. See how
there's &quot;filler&quot; fields in each struct? If we imagine the OAM as being just an there's &quot;filler&quot; fields in each struct? If we imagine the OAM as being just an
array of one type or the other, indexes 0/1/2/3 of the <code>ObjectAttribute</code> array array of one type or the other, indexes 0/1/2/3 of the <code>ObjectAttributes</code> array
would line up with index 0 of the <code>AffineMatrix</code> array. It's kinda weird, but would line up with index 0 of the <code>AffineMatrix</code> array. It's kinda weird, but
that's just how it works. When we set up functions to read and write these values that's just how it works. When we set up functions to read and write these values
we'll have to be careful with how we do it. We probably <em>won't</em> want to use we'll have to be careful with how we do it. We probably <em>won't</em> want to use
@ -1489,7 +1488,8 @@ mirror that you choose to access the game pak through affects which wait state
setting it uses (configured via IO register of course). Unfortunately, the setting it uses (configured via IO register of course). Unfortunately, the
details come down more to the game pak hardware that you load your game onto details come down more to the game pak hardware that you load your game onto
than anything else, so there's not much I can say right here. We'll eventually than anything else, so there's not much I can say right here. We'll eventually
talk about it more later,</p> talk about it more later when I'm forced to do the boring thing and just cover
all the IO registers that aren't covered anywhere else.</p>
<p>One thing of note is the way that the 16-bit bus affects us: the instructions to <p>One thing of note is the way that the 16-bit bus affects us: the instructions to
execute are coming through the same bus as the rest of the game data, so we want execute are coming through the same bus as the rest of the game data, so we want
them to be as compact as possible. The ARM chip in the GBA supports two them to be as compact as possible. The ARM chip in the GBA supports two
@ -1507,14 +1507,129 @@ builds starts with <code>thumbv4</code>.</p>
<li><code>0xE000000</code> to <code>0xE00FFFF</code> (64k)</li> <li><code>0xE000000</code> to <code>0xE00FFFF</code> (64k)</li>
</ul> </ul>
<p>The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save? <p>The game pak SRAM has an 8-bit bus. Why did Pokémon always take so long to save?
This is why. It also has some amount of wait, but as with the ROM, the details Saving the whole game one byte at a time is why. The SRAM also has some amount
depend on your game pak hardware (and also as with ROM, you can adjust the of wait, but as with the ROM, the details depend on your game pak hardware (and
settings with an IO register, should you need to).</p> also as with ROM, you can adjust the settings with an IO register, should you
need to).</p>
<p>One thing to note about the SRAM is that the GBA has a Direct Memory Access <p>One thing to note about the SRAM is that the GBA has a Direct Memory Access
(DMA) feature that can be used for bulk memory movements in some cases, but the (DMA) feature that can be used for bulk memory movements in some cases, but the
DMA <em>cannot</em> access the SRAM region. You really are stuck reading and writing DMA <em>cannot</em> access the SRAM region. You really are stuck reading and writing
one byte at a time when you're using the SRAM.</p> one byte at a time when you're using the SRAM.</p>
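<p>A byte-at-a-time copy might look something like the sketch below. The names
<code>SRAM_BASE</code> and <code>write_bytes_volatile</code> are hypothetical, and the function takes
the destination as a pointer so that it can be tried against an ordinary buffer
as well as the real SRAM region.</p>

```rust
use core::ptr::write_volatile;

/// Start of the game pak SRAM region, taken from the address list above.
pub const SRAM_BASE: usize = 0xE00_0000;

/// Copies `src` to `dest` one byte at a time using volatile writes.
///
/// # Safety
/// `dest` must be valid for writes of `src.len()` bytes.
pub unsafe fn write_bytes_volatile(dest: *mut u8, src: &[u8]) {
    for (i, &byte) in src.iter().enumerate() {
        // Each write is exactly one byte wide, which is all the 8-bit bus allows.
        unsafe { write_volatile(dest.add(i), byte) };
    }
}
```

<p>On the GBA you would pass <code>SRAM_BASE as *mut u8</code> (plus whatever offset your save
layout uses) as the destination.</p>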
<a class="header" href="#tile-data" id="tile-data"><h1>Tile Data</h1></a> <a class="header" href="#tile-data" id="tile-data"><h1>Tile Data</h1></a>
<p>When using the GBA's hardware graphics, if you want to let the hardware do most
of the work you have to use Modes 0, 1 or 2. However, to do that we first have
to learn about how tile data works inside of the GBA.</p>
<a class="header" href="#tiles" id="tiles"><h2>Tiles</h2></a>
<p>Fundamentally, a tile is an 8x8 image. If you want anything bigger than 8x8 you
need to arrange several tiles so that it looks like whatever you're trying to
draw.</p>
<p>As was already mentioned, the GBA supports two different color modes: 4 bits per
pixel and 8 bits per pixel. This means that we have two types of tile that we
need to model. The pixel bits always represent an index into the PALRAM.</p>
<ul>
<li>With 4 bits per pixel, the PALRAM is imagined to be 16 <strong>palbank</strong> sections of
16 palette entries each. The image data selects the index within the palbank,
and an external configuration selects which palbank is used.</li>
<li>With 8 bits per pixel, the PALRAM is imagined to be a single 256 entry array
and the index just directly picks which of the 256 colors is used.</li>
</ul>
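<p>As a rough sketch of those two lookups, treating PALRAM as a flat array of 256
entries (the function and constant names here are made up for illustration):</p>

```rust
/// Number of color entries in one palbank.
pub const PALBANK_SIZE: usize = 16;

/// PALRAM entry picked by a 4bpp pixel. The palbank comes from external
/// configuration; only the low 4 bits of `pixel` are used.
pub const fn palram_index_4bpp(palbank: usize, pixel: u8) -> usize {
    (palbank % 16) * PALBANK_SIZE + (pixel & 0xF) as usize
}

/// PALRAM entry picked by an 8bpp pixel: the value is the index.
pub const fn palram_index_8bpp(pixel: u8) -> usize {
    pixel as usize
}
```

<p>For example, pixel value 3 in palbank 2 selects entry 35 of the 256.</p>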
<p>Knowing this, we can write the following definitions:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile4bpp {
data: [u32; 8]
}
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile8bpp {
data: [u32; 16]
}
#}</code></pre></pre>
<p>I hope this makes sense so far. At 4bpp, we have 4 bits per pixel, times 8
pixels per line, times 8 lines: 256 bits (32 bytes) required. Similarly, at 8 bits
per pixel we'll need 512 bits (64 bytes). Why are we defining them as arrays of
<code>u32</code> values? Because when it comes time to do bulk copies the fastest way to do
it will be to go one whole machine word at a time. If we make the data inside the
type be an array of <code>u32</code> then it'll already be aligned for fast <code>u32</code> bulk copies.</p>
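<p>If you'd like the compiler to double check that arithmetic, a small sketch like
this works. The constant names are hypothetical; the struct definitions are
repeated so the snippet stands on its own.</p>

```rust
use core::mem::{align_of, size_of};

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile4bpp {
    data: [u32; 8],
}

#[allow(dead_code)]
#[derive(Debug, Clone, Copy, Default)]
#[repr(transparent)]
pub struct Tile8bpp {
    data: [u32; 16],
}

/// Bits needed per tile: bits per pixel * 8 pixels per line * 8 lines.
pub const TILE4_BITS: usize = 4 * 8 * 8;
pub const TILE8_BITS: usize = 8 * 8 * 8;

/// Size of each type in bits, as the compiler sees it.
pub const TILE4_TYPE_BITS: usize = size_of::<Tile4bpp>() * 8;
pub const TILE8_TYPE_BITS: usize = size_of::<Tile8bpp>() * 8;

/// Alignment inherited from the inner `u32` array.
pub const TILE_ALIGN: usize = align_of::<Tile4bpp>();
```

<p>The type sizes should agree with the bit counts, and the alignment should be 4
bytes, which is what makes word-at-a-time copies possible.</p>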
<p>Keeping track of the current color depth is naturally the <em>programmer's</em>
problem. If you get it wrong you'll see a whole ton of garbage pixels all over
the screen, and you'll probably be able to guess why. You know, unless you did
one of the other things that can make a bunch of garbage pixels show up all over
the screen. Graphics programming is fun like that.</p>
<a class="header" href="#charblocks" id="charblocks"><h2>Charblocks</h2></a>
<p>Tiles don't just sit on their own, they get grouped into <strong>charblocks</strong>. Long
ago in the distant past, video games were built with hardware that was also used
to make text terminals. So tile image data was called &quot;character data&quot;. In fact
some guides will even call the regular mode for the background layers &quot;text
mode&quot;, despite the fact that you obviously don't have to show text at all.</p>
<p>A charblock is 16k long (<code>0x4000</code> bytes), which means that the number of tiles
that fit into a charblock depends on your color depth. With 4bpp you get 512
tiles, and with 8bpp you get 256 tiles. So they'd be something like this:</p>
<pre><pre class="playpen"><code class="language-rust">
# #![allow(unused_variables)]
#fn main() {
#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct Charblock4bpp {
data: [Tile4bpp; 512],
}
#[derive(Clone, Copy)]
#[repr(transparent)]
pub struct Charblock8bpp {
data: [Tile8bpp; 256],
}
#}</code></pre></pre>
<p>You'll note that we can't even derive <code>Debug</code> or <code>Default</code> any more because the
arrays are so big. Rust implements <code>Clone</code> and <code>Copy</code> for arrays of any size, but
at the time of writing the other standard traits are only implemented for arrays
of 32 elements or fewer. We won't generally be making up an entire charblock on
the fly though, so it's not a big deal. If we <em>absolutely</em> had to, we could call
<code>core::mem::zeroed()</code>, but we really don't want to be trying to build a whole
charblock at runtime. We'll usually want to define our tile data as <code>const</code>
charblock values (or even parts of charblock values) that we then load out of
the game pak ROM at runtime.</p>
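<p>To give a feel for the "const data, loaded at runtime" idea, here's a minimal
sketch. <code>SOLID_TILE</code> and <code>load_tiles</code> are hypothetical names, the tile contents
are made up, and the <code>data</code> field is made public just for this example. The
destination is a plain pointer, so the same function would work whether it
points into VRAM or into a test buffer.</p>

```rust
#[derive(Debug, Clone, Copy, Default, PartialEq, Eq)]
#[repr(transparent)]
pub struct Tile4bpp {
    pub data: [u32; 8],
}

/// A made-up tile where every pixel uses index 1 of its palbank.
pub const SOLID_TILE: Tile4bpp = Tile4bpp { data: [0x1111_1111; 8] };

/// Copies `tiles` to `dest` one `u32` at a time with volatile writes.
///
/// # Safety
/// `dest` must be valid for writes of `tiles.len()` tiles.
pub unsafe fn load_tiles(dest: *mut Tile4bpp, tiles: &[Tile4bpp]) {
    // The tile is a transparent wrapper over `[u32; 8]`, so it can be
    // written word by word.
    let mut word_ptr = dest as *mut u32;
    for tile in tiles {
        for &word in tile.data.iter() {
            unsafe {
                core::ptr::write_volatile(word_ptr, word);
                word_ptr = word_ptr.add(1);
            }
        }
    }
}
```

<p>A real project would more likely get its tile data from a conversion tool than
write it out by hand like this.</p>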
<p>Anyway, with 16k per charblock and only 96k total in VRAM, it's easy math to see
that there are 6 different charblocks in VRAM when in a tiled mode. The first four
of these are for backgrounds, and the other two are for objects. There are rules
for how a tile ID on a background or object selects a tile within a charblock,
but since the rules differ between backgrounds and objects we'll cover them on
their own pages.</p>
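<p>The charblock math can be sketched as below. The constant and function names
are hypothetical, and the VRAM base address of <code>0x6000000</code> is assumed here
rather than taken from this page.</p>

```rust
/// Assumed start of VRAM on the GBA.
pub const VRAM_BASE: usize = 0x600_0000;
/// 96k of VRAM in total.
pub const VRAM_BYTES: usize = 96 * 1024;
/// One charblock is 16k.
pub const CHARBLOCK_BYTES: usize = 0x4000;
/// Number of whole charblocks that fit in VRAM.
pub const CHARBLOCK_COUNT: usize = VRAM_BYTES / CHARBLOCK_BYTES;

/// Address of the start of a charblock, or `None` if the index is out of range.
pub const fn charblock_address(index: usize) -> Option<usize> {
    if index < CHARBLOCK_COUNT {
        Some(VRAM_BASE + index * CHARBLOCK_BYTES)
    } else {
        None
    }
}

/// Tiles per charblock for a given color depth (4 or 8 bits per pixel).
pub const fn tiles_per_charblock(bits_per_pixel: usize) -> usize {
    // An 8x8 tile needs bits_per_pixel * 64 bits, i.e. bits_per_pixel * 8 bytes.
    CHARBLOCK_BYTES / (bits_per_pixel * 8)
}
```

<p>Under these assumptions charblocks 0 through 3 are the background ones and
charblocks 4 and 5 are the object ones.</p>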
<a class="header" href="#image-editing" id="image-editing"><h2>Image Editing</h2></a>
<p>It's very important to note that if you use a normal image editor you'll get
very bad results if you translate that directly into GBA memory.</p>
<p>Imagine you have part of an image that's 16 by 16 pixels, aka 2 tiles by 2
tiles. A normal bitmap stores its pixels one full image row at a time, so the
data starts with the 1st row of the 1st tile, then the 1st row of the 2nd tile.
However, when the GBA reads that same data as tiles, the first 8 pixels will
indeed be the 1st row of the 1st tile, but the next 8 pixels in memory will be
used as the <em>2nd row of the first tile</em>, not the 1st row of the 2nd tile.</p>
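<p>To make the mismatch concrete, here's a sketch that computes where a pixel
lands in each layout, counted in pixels from the start of the data. The function
names are hypothetical, and it assumes the image width is a multiple of 8 and
that tiles are numbered left to right, then top to bottom.</p>

```rust
/// Width and height of a tile, in pixels.
pub const TILE_SIDE: usize = 8;

/// Offset of pixel (x, y) in a normal row-major bitmap `width` pixels wide.
pub const fn linear_offset(x: usize, y: usize, width: usize) -> usize {
    y * width + x
}

/// Offset of the same pixel when the image is stored as consecutive 8x8 tiles.
pub const fn tiled_offset(x: usize, y: usize, width: usize) -> usize {
    let tiles_per_row = width / TILE_SIDE;
    // Which tile the pixel falls in.
    let tile_index = (y / TILE_SIDE) * tiles_per_row + (x / TILE_SIDE);
    // Skip all earlier tiles, then index within this tile row by row.
    tile_index * TILE_SIDE * TILE_SIDE + (y % TILE_SIDE) * TILE_SIDE + (x % TILE_SIDE)
}
```

<p>In a 16 pixel wide image, offset 8 of the linear data is pixel (8, 0), the start
of the 2nd tile, while offset 8 of the tiled data is pixel (0, 1), the 2nd row
of the 1st tile.</p>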
<p>So, how do we fix this?</p>
<p>Well, the simple but annoying way is to edit your tile image as being an 8 pixel
wide image and then have the image get super tall as you add more and more
tiles. It can work, but it's really impractical if you have any multi-tile
things that you're trying to do.</p>
<p>Instead, there are some image conversion tools that devkitpro provides in their
gba-dev section. They let you take normal images, repackage them, and export
them in various formats that you can then compile into your project.</p>
<p>Ketsuban uses the <a href="http://www.coranac.com/projects/grit/">grit</a> tool, with the
following suggestions:</p>
<ol>
<li>Include an actual resource file and a file describing it somewhere in your
project (see <a href="http://www.coranac.com/man/grit/html/index.htm">the grit
manual</a> for all details
involved here).</li>
<li>In a <code>build.rs</code> you run <code>grit</code> on each resource+description pair, such as in
this <a href="https://gist.github.com/ketsuban/526fa55fbef0a3ccd4c7cd6204f29f94">old gist
example</a>.</li>
<li>Then within your Rust code you use the
<a href="https://doc.rust-lang.org/core/macro.include_bytes.html">include_bytes!</a>
macro to have the formatted resource be available as a const value you can
load at runtime.</li>
</ol>
<a class="header" href="#tiled-backgrounds" id="tiled-backgrounds"><h1>Tiled Backgrounds</h1></a> <a class="header" href="#tiled-backgrounds" id="tiled-backgrounds"><h1>Tiled Backgrounds</h1></a>
<p>TODO</p> <p>TODO</p>
<a class="header" href="#object-basics" id="object-basics"><h1>Object Basics</h1></a> <a class="header" href="#object-basics" id="object-basics"><h1>Object Basics</h1></a>
