mirror of
https://github.com/italicsjenga/gba.git
synced 2025-01-22 23:56:32 +11:00
newtype_enum, and other updates
parent
09b4c8804c
commit
dfca52a079
6 changed files with 96 additions and 77 deletions
|
@ -9,96 +9,102 @@ sometimes. Accordingly, you should know how assembly works on the GBA.
|
|||
`ARMv4` ISA, the `ARMv4T` variant, and specifically the `ARM7TDMI`
|
||||
microarchitecture. Someone at ARM decided that having both `ARM#` and `ARMv#`
|
||||
was a good way to [version things](https://en.wikichip.org/wiki/arm/versions),
|
||||
even when the numbers don't match, and the rest of us have been sad ever
|
||||
since. The link there will take you to the correct book within the big pile of
|
||||
ARM books available within the ARM Infocenter. Note that there is also a [PDF
|
||||
even when the numbers don't match. The rest of us have been sad ever since.
|
||||
The link there will take you to the correct book specific to the GBA's
|
||||
microarchitecture. There's a whole big pile of ARM books available within the
|
||||
ARM Infocenter, so if you just google it or whatever, make sure you end up
|
||||
looking at the correct one. Note that there is also a [PDF
|
||||
Version](http://infocenter.arm.com/help/topic/com.arm.doc.ddi0210c/DDI0210B.pdf)
|
||||
of the documentation available, if you'd like that.
|
||||
|
||||
* The [GBATek: ARM CPU
|
||||
Overview](https://problemkaputt.de/gbatek.htm#armcpuoverview) also has quite a
|
||||
bit of info. Most of it is somewhat a duplication of what you'd find in the
|
||||
ARM Infocenter reference manual, but it's also somewhat specialized towards
|
||||
the GBA's specifics. It's in the usual, uh, "sparse" style that GBATEK is
|
||||
written in, so I wouldn't suggest that read it first.
|
||||
bit of info. Some of it is a duplication of what you'd find in the ARM
|
||||
Infocenter reference manual. Some of it is specific to the GBA's chip. Some of
|
||||
it is specific to the ARM chips within the DS and DSi. It's a bit of a jumbled
|
||||
mess, and as with the rest of GBATEK, the explanations are in a "sparse" style
|
||||
(to put it nicely), so I wouldn't take it as your only source.
|
||||
|
||||
* The [Compiler Explorer](https://rust.godbolt.org/z/ndCnk3) can be used to
|
||||
quickly look at assembly output of your Rust code. That link there will load
|
||||
quickly look at assembly versions of your Rust code. That link there will load
|
||||
up an essentially blank `no_std` file with `opt-level=3` set and targeting
|
||||
`thumbv6m-none-eabi`. That's _not_ the same as the GBA (it's two ISA revisions
|
||||
later, ARMv6 instead of ARMv4), but it's the closest CPU target that ships
|
||||
with rustc, so it's the closest you can get with the compiler explorer
|
||||
website. If you're very dedicated I suppose you could setup a [local
|
||||
`thumbv6m-none-eabi`. That's _not_ the same target as the GBA (it's two ISA
|
||||
revisions later, ARMv6 instead of ARMv4), but it's the closest CPU target that
|
||||
is bundled with rustc, so it's the closest you can get with the compiler
|
||||
explorer website. If you're very dedicated I suppose you could set up a [local
|
||||
instance](https://github.com/mattgodbolt/compiler-explorer#running-a-local-instance)
|
||||
of compiler explorer and then add the extra target definition and so on, but
|
||||
that's _probably_ overkill.
|
||||
|
||||
## ARM and THUMB
|
||||
## ARM and Thumb
|
||||
|
||||
The "T" part in `ARMv4T` and `ARM7TDMI` means "Thumb". An ARM chip that supports
|
||||
Thumb mode has two different instruction sets instead of just one. The chip can
|
||||
run in ARM mode with 32-bit instructions, or it can run in THUMB mode with
|
||||
16-bit instructions. Apparently these modes are sometimes called `a32` and `t32`
|
||||
in a more modern context, but I will stick with ARM and THUMB because that's
|
||||
what other GBA references use (particularly GBATEK), and it's probably best to
|
||||
be more in agreement with them than with stuff for Raspberry Pi programming or
|
||||
whatever other modern ARM thing.
|
||||
Thumb has two different instruction sets instead of just one. The chip can run
|
||||
in ARM state with 32-bit instructions, or it can run in Thumb state with 16-bit
|
||||
instructions. Note that the CPU _state_ (ARM or Thumb) is distinct from the
|
||||
_mode_ (User, FIQ, IRQ, etc). Apparently these states are sometimes called
|
||||
`a32` and `t32` in a more modern context, but I will stick with ARM and Thumb
|
||||
because that's what the official ARM7TDMI manual and GBATEK both use.
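
To make the state/mode distinction concrete, here's a minimal sketch of pulling both out of a CPSR value. The bit positions (bit 5 for the Thumb flag, the low five bits for the mode) come from the ARM7TDMI manual rather than from this chapter, and the type and function names are made up for illustration; none of this is part of the `gba` crate.

```rust
/// Which instruction set the CPU is currently decoding.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CpuState {
    Arm,
    Thumb,
}

/// Which privilege / banked-register mode the CPU is in.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CpuMode {
    User,
    Fiq,
    Irq,
    Supervisor,
    Abort,
    Undefined,
    System,
}

/// The state is a single flag: the T bit, which is bit 5 of the CPSR.
pub fn cpu_state(cpsr: u32) -> CpuState {
    if cpsr & (1 << 5) != 0 {
        CpuState::Thumb
    } else {
        CpuState::Arm
    }
}

/// The mode lives in the low five bits of the CPSR.
/// Bit patterns that aren't a defined mode give `None`.
pub fn cpu_mode(cpsr: u32) -> Option<CpuMode> {
    match cpsr & 0b1_1111 {
        0b10000 => Some(CpuMode::User),
        0b10001 => Some(CpuMode::Fiq),
        0b10010 => Some(CpuMode::Irq),
        0b10011 => Some(CpuMode::Supervisor),
        0b10111 => Some(CpuMode::Abort),
        0b11011 => Some(CpuMode::Undefined),
        0b11111 => Some(CpuMode::System),
        _ => None,
    }
}
```

The two functions look at non-overlapping bits, which is the whole point: any mode can be combined with either state.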
|
||||
|
||||
On the GBA, the memory bus that physically transfers data from the game pak into
|
||||
On the GBA, the memory bus that physically transfers data from the cartridge into
|
||||
the device is a 16-bit memory bus. This means that if you need to transfer more
|
||||
than 16 bits at a time you have to do more than one transfer. Since we'd like
|
||||
our instructions to get to the CPU as fast as possible, we compile the majority
|
||||
of our program with the THUMB instruction set. The ARM reference says that with
|
||||
THUMB instructions on a 16-bit memory bus system you get about 160% performance
|
||||
of our program with the Thumb instruction set. The ARM reference says that with
|
||||
Thumb instructions on a 16-bit memory bus system you get about 160% performance
|
||||
compared to using ARM instructions. That's absolutely something we want to take
|
||||
advantage of. Also, your THUMB compiled code is about 65% of the same code
|
||||
advantage of. Also, your Thumb compiled code is about 65% the size of the same code
|
||||
compiled with ARM. Since a game ROM can only be 32MB total, and we're trying to
|
||||
fit in images and sound too, we want to get space savings where we can.
|
||||
|
||||
You may wonder, why is the THUMB code 65% as large if the instructions
|
||||
themselves are 50% as large, and why have ARM mode at all if there's such a
|
||||
benefit to be had with THUMB? Well, THUMB mode doesn't support as many different
|
||||
instructions as ARM mode does. Some lines of source code that can compile to a
|
||||
single ARM instruction might need to compile into more than one THUMB
|
||||
instruction. THUMB still has most of the really good instructions available, so
|
||||
You may wonder, why is the Thumb code 65% as large if the instructions
|
||||
themselves are 50% as large, and why have ARM state at all if there's such a
|
||||
benefit to be had with Thumb? Well, Thumb state doesn't support as many different
|
||||
instructions as ARM state does. Some lines of source code that can compile to a
|
||||
single ARM instruction might need to compile into more than one Thumb
|
||||
instruction. Thumb still has most of the really good instructions available, so
|
||||
it all averages out to about 65%.
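
The arithmetic behind those two numbers can be sketched in a few lines. This is a rough model only: it counts bus transfers per instruction fetch and ignores wait states and everything else that affects real timing, and the function names are hypothetical. The 65% figure is just the average quoted above, not something the code derives.

```rust
/// How many bus transfers it takes to fetch one instruction,
/// rounding up when the instruction is wider than the bus.
pub const fn transfers_per_instruction(instruction_bits: u32, bus_bits: u32) -> u32 {
    (instruction_bits + bus_bits - 1) / bus_bits
}

/// Rough size estimate for Thumb code, given the size of the same
/// code compiled as ARM, using the ~65% average quoted in the text.
pub const fn estimated_thumb_bytes(arm_bytes: u32) -> u32 {
    arm_bytes * 65 / 100
}
```

On the 16-bit cartridge bus an ARM instruction needs `transfers_per_instruction(32, 16)` transfers (two), while a Thumb instruction needs only one. On a 32-bit bus the ARM instruction also arrives in a single transfer, which is why running ARM code from the right part of RAM avoids the penalty.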
|
||||
|
||||
That said, some parts of a GBA program _must_ be written in ARM mode. Also, ARM
|
||||
mode does allow that increased instruction flexibility. So we _need_ to use ARM
|
||||
some of the time, and we might just _want_ to use ARM even when we don't need
|
||||
to. It is possible to switch modes on the fly, there's extremely minimal
|
||||
overhead, even less than doing some function calls. The only problem is the
|
||||
16-bit memory bus of the game pak giving us a needless speed penalty with our
|
||||
ARM code. The CPU _executes_ the ARM instructions at full speed, but then it has
|
||||
to wait while more instructions get sent in. What do we do? Well, code is
|
||||
ultimately just a different kind of data. We can copy parts of our code off the
|
||||
game pak ROM and place it into a part of the RAM that has a 32-bit memory bus.
|
||||
Then the CPU can execute the code from there, going at full speed. Of course,
|
||||
there's only a very small amount of RAM compared to the size of a game pak, so
|
||||
we'll only do this with a few select functions. Exactly which functions will
|
||||
probably depend on your game.
|
||||
That said, some parts of a GBA program _must_ be written for ARM state. Also,
|
||||
ARM state does allow that increased instruction flexibility. So we _need_ to use
|
||||
ARM some of the time, and we might just _want_ to use ARM even when we don't
|
||||
need to at other times. It is possible to switch states on the fly, and there's
|
||||
extremely minimal overhead, even less than doing some function calls. The only
|
||||
problem is the 16-bit memory bus of the cartridge giving us a needless speed
|
||||
penalty with our ARM code. The CPU _executes_ the ARM instructions at full
|
||||
speed, but then it has to wait while more instructions get sent in. What do we
|
||||
do? Well, code is ultimately just a different kind of data. We can copy parts of
|
||||
our code off the cartridge ROM and place it into a part of the RAM that has a
|
||||
32-bit memory bus. Then the CPU can execute the code from there, going at full
|
||||
speed. Of course, there's only a very small amount of RAM compared to the size
|
||||
of a cartridge, so we'll only do this with a few select functions. Exactly which
|
||||
functions will probably depend on your game.
|
||||
|
||||
One problem with this process is that Rust doesn't currently offer a way to mark
|
||||
individual functions for being ARM or THUMB. The whole program is compiled in a
|
||||
single mode. That's not an automatic killer, since we can use the `asm!` macro
|
||||
to write some inline assembly, then within our inline assembly we switch from
|
||||
THUMB to ARM, do some ARM stuff, and switch back to THUMB mode before the inline
|
||||
assembly is over. Rust is none the wiser to what happened. Yeah, it's clunky,
|
||||
that's why [it's on the 2019
|
||||
wishlist](https://github.com/rust-embedded/wg/issues/256#issuecomment-439677804)
|
||||
to fix it (then LLVM can manage it automatically for you).
|
||||
There are two problems that we face as Rust programmers:
|
||||
|
||||
The bigger problem is that when we do that all of our functions still start off
|
||||
in THUMB mode, even if they temporarily use ARM mode. For the few bits of code
|
||||
that must start _already in_ ARM mode, we're stuck. Those parts have to be
|
||||
written in external assembly files and then included with the linker. We were
|
||||
already going to write some assembly, and we already use more than one file in
|
||||
our project all the time, those parts aren't a big problem. The big problem is
|
||||
that using custom linker scripts isn't transitive between crates.
|
||||
1) Rust offers no way to specify individual functions as being ARM or Thumb. The
|
||||
whole program is compiled for one state or the other. Obviously this is no
|
||||
good, so it's on the [2019 embedded
|
||||
wishlist](https://github.com/rust-embedded/wg/issues/256#issuecomment-439677804),
|
||||
and perhaps a fix will come.
|
||||
|
||||
2) Rust offers no way to get a pointer to a function as well as the length of
|
||||
the compiled function, so we can't copy a function from the ROM to some other
|
||||
location because we can't even express statements about the function's data.
|
||||
I also put this [on the
|
||||
wishlist](https://github.com/rust-embedded/wg/issues/256#issuecomment-450539836),
|
||||
but honestly I have much less hope that this becomes a part of Rust.
|
||||
|
||||
What this ultimately means is that some parts of our program have to be written
|
||||
in external assembly files and then added to the program with the linker. We
|
||||
were already going to write some assembly, and we already use more than one file
|
||||
in our project all the time, so those parts aren't a big problem. The big problem
|
||||
is that using custom linker scripts to get assembly code into our final program
|
||||
isn't transitive between crates.
|
||||
|
||||
What I mean is that once we have a file full of custom assembly that we're
|
||||
linking in by hand, that's not "part of" the crate any more. At least not as
|
||||
`cargo` see it. So we can't just upload it to `crates.io` and then depend on it
|
||||
`cargo` sees it. So we can't just upload it to `crates.io` and then depend on it
|
||||
in other projects and have `cargo` download the right version and include it
|
||||
all automatically. We're back to fully manually copying files from the old
|
||||
project into the new one, adding more lines to the linker script each time we
|
||||
|
|
1
src/ewram.rs
Normal file
|
@ -0,0 +1 @@
|
|||
//! Module for External Work RAM (`EWRAM`).
|
|
@ -49,10 +49,9 @@ impl DisplayControlSetting {
|
|||
}
|
||||
}
|
||||
|
||||
/// The six display modes available on the GBA.
|
||||
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
|
||||
#[repr(u16)]
|
||||
pub enum DisplayMode {
|
||||
newtype_enum! {
|
||||
/// The six display modes available on the GBA.
|
||||
DisplayMode = u16,
|
||||
/// * Affine: No
|
||||
/// * Layers: 0/1/2/3
|
||||
/// * Size(px): 256x256 to 512x512
|
||||
|
|
1
src/iwram.rs
Normal file
|
@ -0,0 +1 @@
|
|||
//! Module for Internal Work RAM (`IWRAM`).
|
35
src/lib.rs
|
@ -42,7 +42,7 @@ pub(crate) use gba_proc_macro::phantom_fields;
|
|||
/// }
|
||||
/// newtype! {
|
||||
/// /// You can't derive most stuff above array size 32, so we add
|
||||
/// /// the `, no frills` modifier.
|
||||
/// /// the `, no frills` modifier to this one.
|
||||
/// BigArray, [u8; 200], no frills
|
||||
/// }
|
||||
/// ```
|
||||
|
@ -67,6 +67,26 @@ macro_rules! newtype {
|
|||
};
|
||||
}
|
||||
|
||||
/// Assists in defining a newtype that's an enum.
|
||||
///
|
||||
/// First give `NewType = OldType,`, then define the tags and their explicit
|
||||
/// values with zero or more entries of `TagName = base_value,`. In both cases
|
||||
/// you can place doc comments or other attributes directly on to the type
|
||||
/// declaration or the tag declaration.
|
||||
///
|
||||
/// The generated enum will get an appropriate `repr` attribute as well as Debug, Clone, Copy,
/// PartialEq, and Eq.
|
||||
///
|
||||
/// Example:
|
||||
/// ```
|
||||
/// newtype_enum! {
|
||||
/// /// The Foo
|
||||
/// Foo = u16,
|
||||
/// /// The Bar
|
||||
/// Bar = 0,
|
||||
/// /// The Zap
|
||||
/// Zap = 1,
|
||||
/// }
|
||||
/// ```
|
||||
#[macro_export]
|
||||
macro_rules! newtype_enum {
|
||||
(
|
||||
|
@ -86,21 +106,14 @@ macro_rules! newtype_enum {
|
|||
};
|
||||
}
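
The macro body is elided from this hunk, so here is a minimal sketch of how a macro with the documented `NewType = OldType,` then `TagName = base_value,` syntax could be written. It is an assumption, not the crate's actual implementation: the name `newtype_enum_sketch` is hypothetical, and the derive list is taken from the attributes that `DisplayMode` carried before it was converted.

```rust
/// Hypothetical stand-in for `newtype_enum!`, for illustration only.
macro_rules! newtype_enum_sketch {
    (
        $(#[$struct_attr:meta])*
        $new_name:ident = $old_name:ident,
        $(
            $(#[$tag_attr:meta])*
            $tag_name:ident = $base_value:expr,
        )*
    ) => {
        // Doc comments and other attributes on the type pass straight through.
        $(#[$struct_attr])*
        #[derive(Debug, Clone, Copy, PartialEq, Eq)]
        // The "old type" becomes the enum's integer representation.
        #[repr($old_name)]
        pub enum $new_name {
            $(
                $(#[$tag_attr])*
                $tag_name = $base_value,
            )*
        }
    };
}

newtype_enum_sketch! {
    /// The Foo
    Foo = u16,
    /// The Bar
    Bar = 0,
    /// The Zap
    Zap = 1,
}
```

With this sketch, `Foo` is an ordinary `#[repr(u16)]` enum, so `Foo::Zap as u16` gives back the explicit value and the type is two bytes wide.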
|
||||
|
||||
newtype_enum! {
|
||||
/// the Foo
|
||||
Foo = u16,
|
||||
/// the Bar
|
||||
Bar = 0,
|
||||
/// The Zap
|
||||
Zap = 1,
|
||||
}
|
||||
|
||||
pub mod base;
|
||||
pub(crate) use self::base::*;
|
||||
|
||||
pub mod bios;
|
||||
|
||||
pub mod wram;
|
||||
pub mod iwram;
|
||||
|
||||
pub mod ewram;
|
||||
|
||||
pub mod io;
|
||||
|
||||
|
|
|
@ -1 +0,0 @@
|
|||
//! Module for things related to WRAM.
|