# BIOS
* **Address Span:** `0x0` to `0x3FFF` (16k)

The [BIOS](https://en.wikipedia.org/wiki/BIOS) of the GBA is a small read-only
portion of memory at the very base of the address space. However, it is also
hardware protected against reading, so if you try to read from BIOS memory when
the program counter isn't pointed into the BIOS (e.g. any time code _you_ write
is executing) then you get [basically garbage
data](https://problemkaputt.de/gbatek.htm#gbaunpredictablethings) back.

So we're not going to spend time here talking about what bits to read or write
within BIOS memory like we do with the other sections. Instead we're going to
spend time talking about [inline
assembly](https://doc.rust-lang.org/unstable-book/language-features/asm.html)
([tracking issue](https://github.com/rust-lang/rust/issues/29722)) and then use
it to call the [GBA BIOS
Functions](https://problemkaputt.de/gbatek.htm#biosfunctions).

Note that BIOS calls have _more overhead than normal function calls_, so don't
go using them all over the place if you don't have to. They're also usually
written to be compact in terms of code rather than for raw speed. Between the
increased overhead and not being as speed optimized, you can sometimes do a
faster job without calling the BIOS at all. (TODO: investigate more about what
parts of the BIOS we could potentially offer faster alternatives for.)

I'd like to take a moment to thank [Marc Brinkmann](https://github.com/mbr)
(with contributions from [Oliver Schneider](https://github.com/oli-obk) and
[Philipp Oppermann](https://github.com/phil-opp)) for writing [this blog
post](http://embed.rs/articles/2016/arm-inline-assembly-rust/). It's at least
ten times the tutorial that the `asm` entry in the Unstable Book is. In
fairness to the Unstable Book, the actual spec of how inline ASM works in Rust
is "basically what clang does", and that's specified as "basically what GCC
does", and that's basically/shockingly not specified much at all despite GCC
being like 30 years old.

So let's be slow and pedantic about this process.
## Inline ASM
**Fair Warning:** Inline asm is one of the least stable parts of Rust overall,
and if you write bad things you can trigger internal compiler errors and panics
and crashes and make LLVM choke and die without explanation. If you write some
inline asm and then your program suddenly stops compiling without explanation,
try commenting out that whole inline asm use and see if it's causing the
problem. Double check that you've written every single part of the asm call
absolutely correctly, etc, etc.

**Bonus Warning:** The general information that follows regarding the asm macro
is consistent from system to system, but specific information about register
names, register quantities, asm instruction argument ordering, and so on is
specific to ARM on the GBA. If you're programming for any other device you'll
need to carefully investigate that before you begin.

Now then, with those out of the way, the inline asm docs describe an asm call as
looking like this:

```rust
asm!(assembly template
   : output operands
   : input operands
   : clobbers
   : options
   );
```

And once you stick a lot of stuff in there it can be _really_ hard to remember
the ordering of the elements. So we'll start with a code block that has some
comments thrown in on each line:

```rust
asm!(/* ASM */ TODO
    :/* OUT */ TODO
    :/* INP */ TODO
    :/* CLO */ TODO
    :/* OPT */
);
```

Now we have to decide what we're gonna write. Obviously we're going to do some
instructions, but those instructions use registers, and how are we gonna talk
about them? We've got two choices.

1) We can pick each and every register used by specifying exact register names.
   In THUMB mode we have 8 registers available, named `r0` through `r7`. If you
   switch into 32-bit mode there are additional registers available.
2) We can specify slots for registers we need and let LLVM decide. In this style
   you name your slots `$0`, `$1` and so on. Slot numbers are assigned first to
   all specified outputs, then to all specified inputs, in the order that you
   list them.

In the case of the GBA BIOS, each BIOS function has pre-designated input and
output registers, so we will use the first style. If you use inline ASM in other
parts of your code you're free to use the second style.
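
To make the slot numbering rule concrete, here's a minimal sketch in plain Rust
(no `asm!` involved, so it runs anywhere). The `assign_slots` helper is
hypothetical and exists only for illustration: it models the rule stated above,
where outputs take the first slot numbers and inputs take the rest, each in the
order listed.

```rust
/// Hypothetical helper (not part of any crate): pairs each operand name with
/// the `$N` slot it would get. Outputs are numbered first, then inputs, both
/// in the order they are listed.
pub fn assign_slots(outputs: &[&str], inputs: &[&str]) -> Vec<(String, String)> {
    outputs
        .iter()
        .chain(inputs.iter())
        .enumerate()
        .map(|(index, name)| (format!("${}", index), name.to_string()))
        .collect()
}
```

With two outputs and two inputs, the outputs would land in `$0` and `$1` and
the inputs in `$2` and `$3`.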
### Assembly
This is just one big string literal. You write out one instruction per line, and
excess whitespace is ignored. You can also do comments within your assembly
using `@` to start a comment that goes until the end of the line (in ARM
assembly a `;` separates statements, it doesn't start a comment).

Assembly convention doesn't consider it unreasonable to comment potentially as
much as _every single line_ of asm that you write when you're getting used to
things. Or even if you are used to things. This is cryptic stuff, there's a
reason we avoid writing in it as much as possible.

Remember that our Rust code is in 16-bit mode. You _can_ switch to 32-bit mode
within your asm as long as you switch back by the time the block ends. Otherwise
you'll have a bad time.

### Outputs
A comma separated list. Each entry looks like

* `"constraint" (binding)`

An output constraint starts with a symbol:

* `=` for write only
* `+` for reads and writes
* `&` for "early clobber" (written together with `=` or `+`, as in `=&r`),
  meaning that you'll write to this at some point before all input values have
  been read. It prevents this register from being assigned to an input
  register.

Followed by _either_ the letter `r` (if you want LLVM to pick the register to
use) or curly braces around a specific register (if you want to pick).

* The binding can be any single 32-bit or smaller value.
* If your binding has bit pattern requirements ("must be non-zero", etc) you are
  responsible for upholding that.
* If your binding type will try to `Drop` later then you are responsible for it
  being in a fit state to do that.
* The binding must be either a mutable binding or a binding that was
  pre-declared but not yet assigned.

Anything else is UB.

### Inputs
This is a similar comma separated list.

* `"constraint" (binding)`

An input constraint doesn't have the symbol prefix, you just pick either `r` or
a named register with curly braces around it.

* An input binding must be a single 32-bit or smaller value.
* An input binding _should_ be a type that is `Copy` but this is not an absolute
  requirement. Having the input be read is semantically similar to using
  `core::ptr::read(&binding)` and forgetting the value when you're done.

### Clobbers
Sometimes your asm will touch registers other than the ones declared for input
and output.

Clobbers are declared as a comma separated list of string literals naming
specific registers. You don't use curly braces with clobbers.

LLVM _needs_ to know this information. It can move things around to keep your
data safe, but only if you tell it what's about to happen.

Failure to define all of your clobbers can cause UB.

### Options
There's only one option we'd care to specify. That option is "volatile".

Just like with a function call, LLVM will skip a block of asm if it doesn't see
that any outputs from the asm were used later on. Nearly every single BIOS call
(other than the math operations) will need to be marked as "volatile".

### BIOS ASM
* Inputs are always `r0`, `r1`, `r2`, and/or `r3`, depending on function.
* Outputs are always zero or more of `r0`, `r1`, and `r3`.
* Any of the output registers that aren't actually used should be marked as
  clobbered.
* All other registers are unaffected.

All of the GBA BIOS calls are performed using the
[swi](http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0068b/BABFCEEG.html)
instruction, combined with a value depending on what BIOS function you're trying
to invoke. If you're in 16-bit code you use the value directly, and if you're in
32-bit mode you shift the value up by 16 bits first.
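
As a small illustration of that last rule, here's a sketch that computes the
value you'd write after `swi` for each mode. The `CpuState` enum and the
`swi_immediate` function are made-up names for this example, not something the
crate provides.

```rust
/// Which instruction set the calling code is using (hypothetical type).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub enum CpuState {
    /// 16-bit instructions, which is what our Rust code is compiled as.
    Thumb,
    /// 32-bit instructions.
    Arm,
}

/// Hypothetical helper: the value to write after `swi` for a given BIOS
/// function number. 16-bit code uses the number directly, 32-bit code
/// shifts it up by 16 bits.
pub const fn swi_immediate(function: u8, state: CpuState) -> u32 {
    match state {
        CpuState::Thumb => function as u32,
        CpuState::Arm => (function as u32) << 16,
    }
}
```

So the Div function (`0x06`) would be `swi 0x06` from 16-bit code and
`swi 0x060000` from 32-bit code.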
### Example BIOS Function: Division
For our example we'll use the division function, because GBATEK gives very clear
instructions on how each register is used with that one:

```txt
Signed Division, r0/r1.
r0 signed 32bit Number
r1 signed 32bit Denom
Return:
r0 Number DIV Denom ;signed
r1 Number MOD Denom ;signed
r3 ABS (Number DIV Denom) ;unsigned
For example, incoming -1234, 10 should return -123, -4, +123.
The function usually gets caught in an endless loop upon division by zero.
```

The math folks tell me that the `r1` value should be properly called the
"remainder" not the "modulus". We'll go with that for our function, since it
doesn't hurt to use the correct names. Our Rust function has an assert against
dividing by `0`, then we name some bindings _without_ giving them a value, we
make the asm call, and then return what we got.

```rust
pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) {
    assert!(denominator != 0);
    let div_out: i32;
    let rem_out: i32;
    unsafe {
        asm!(/* ASM */ "swi 0x06"
            :/* OUT */ "={r0}"(div_out), "={r1}"(rem_out)
            :/* INP */ "{r0}"(numerator), "{r1}"(denominator)
            :/* CLO */ "r3"
            :/* OPT */
        );
    }
    (div_out, rem_out)
}
```
I _hope_ this all makes sense by now.
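
If you'd like something to compare results against on your desktop, here's a
host-side sketch of the three values GBATEK lists for the division call. It's
ordinary Rust arithmetic and _not_ the BIOS routine itself, and `div_model` is
a name made up for this example. It leans on the fact that Rust's `/` and `%`
on signed integers truncate toward zero, which matches the GBATEK example of
`-1234, 10` giving `-123, -4, +123`.

```rust
/// Host-side model (not the real BIOS call) of the `swi 0x06` results.
/// Returns `(r0, r1, r3)`: quotient, remainder, and absolute quotient.
pub fn div_model(numerator: i32, denominator: i32) -> (i32, i32, u32) {
    assert!(denominator != 0);
    // The wrapping versions avoid a panic on `i32::MIN / -1`.
    let quotient = numerator.wrapping_div(denominator);
    let remainder = numerator.wrapping_rem(denominator);
    (quotient, remainder, quotient.unsigned_abs())
}
```

Note that the remainder keeps the sign of the numerator (`-4` here). A true
modulus, such as Rust's `rem_euclid`, would give `6` for the same inputs,
which is why "remainder" is the better name.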
# BIOS Function Definitions

What follows is one entry for every BIOS call function, sorted by `swi` value
(which also _kinda_ sorts them into themed groups too).

All functions here are marked with `#[inline(always)]`, which I wouldn't
normally bother with, but the compiler can't see that the ASM we use is
immediately a second function call, so we want to be very sure that it gets
inlined as much as possible. You should probably be using Link Time Optimization
in your release mode GBA games just to get that extra boost, but
`#[inline(always)]` will help keep debug builds going at a good speed too.

The entries here in the book are basically just copy pasting the source for each
function from the `gba::bios` module of the crate. The actual asm invocation
itself is uninteresting, but I've attempted to make the documentation for each
function clear and complete.

## CPU Control / Reset
### Soft Reset (0x00)
```rust
/// (`swi 0x00`) SoftReset the device.
///
/// This function does not ever return.
///
/// Instead, it clears the top `0x200` bytes of IWRAM (containing stacks, and
/// BIOS IRQ vector/flags), re-initializes the system, supervisor, and irq stack
/// pointers (new values listed below), sets `r0` through `r12`, `LR_svc`,
/// `SPSR_svc`, `LR_irq`, and `SPSR_irq` to zero, and enters system mode. The
/// return address is loaded into `r14` and then the function jumps there with
/// `bx r14`.
///
/// * sp_svc: `0x300_7FE0`
/// * sp_irq: `0x300_7FA0`
/// * sp_sys: `0x300_7F00`
/// * Zero-filled Area: `0x300_7E00` to `0x300_7FFF`
/// * Return Address: Depends on the 8-bit flag value at `0x300_7FFA`. In either
///   case execution proceeds in ARM mode.
///   * zero flag: `0x800_0000` (ROM), which for our builds means that the
///     `crt0` program executes (just like with a fresh boot), and then
///     control passes into `main` and so on.
///   * non-zero flag: `0x200_0000` (RAM). This is where a multiboot image would
///     go if you were doing a multiboot thing. However, this project doesn't
///     support multiboot at the moment. You'd need an entirely different build
///     pipeline because there are differences in header format and things like
///     that. Perhaps someday, but probably not even then. Submit the PR for it
///     if you like!
///
/// ## Safety
///
/// This function isn't ever unsafe to the current iteration of the program.
/// However, because not all memory is fully cleared you theoretically could
/// threaten the _next_ iteration of the program that runs. I'm _fairly_
/// convinced that you can't actually use this to force purely safe code to
/// perform UB, but such a scenario might exist.
#[inline(always)]
pub unsafe fn soft_reset() -> ! {
    asm!(/* ASM */ "swi 0x00"
        :/* OUT */ // none
        :/* INP */ // none
        :/* CLO */ // none
        :/* OPT */ "volatile"
    );
    core::hint::unreachable_unchecked()
}
```
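The return address rule from the docs above is simple enough to write down as
a tiny host-side sketch. The names `RESET_FLAG_ADDR` and `soft_reset_target`
are invented for this example; only the addresses come from the doc comment.

```rust
/// Address of the 8-bit flag that SoftReset checks (from the docs above).
pub const RESET_FLAG_ADDR: usize = 0x0300_7FFA;

/// Hypothetical helper: where execution would resume after a SoftReset,
/// given the flag byte value. It only models the rule, it performs no reset.
pub const fn soft_reset_target(flag: u8) -> u32 {
    if flag == 0 {
        0x0800_0000 // start of ROM
    } else {
        0x0200_0000 // start of EWRAM, where a multiboot image would live
    }
}
```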
### Register / RAM Reset (0x01)
```rust
/// (`swi 0x01`) RegisterRamReset.
///
/// Clears the portions of memory given by the `flags` value, sets the Display
/// Control Register to `0x80` (forced blank and nothing else), then returns.
///
/// * Flag bits:
///   0) Clears the 256k of EWRAM (don't use if this is where your function call
///      will return to!)
///   1) Clears the 32k of IWRAM _excluding_ the last `0x200` bytes (see also:
///      the `soft_reset` function).
///   2) Clears all Palette data.
///   3) Clears all VRAM.
///   4) Clears all OAM (reminder: a zeroed obj isn't disabled!)
///   5) Reset SIO registers (resets them to general purpose mode)
///   6) Reset Sound registers
///   7) Reset all IO registers _other than_ SIO and Sound
///
/// **Bug:** The least significant byte of `SIODATA32` is always zeroed, even if
/// bit 5 was not enabled. This is sadly a bug in the design of the GBA itself.
///
/// ## Safety
///
/// It is generally a safe operation to suddenly clear any part of the GBA's
/// memory, except in the case that you were executing out of IWRAM and clear
/// that. If you do that you return to nothing and have a bad time.
#[inline(always)]
pub unsafe fn register_ram_reset(flags: u8) {
    asm!(/* ASM */ "swi 0x01"
        :/* OUT */ // none
        :/* INP */ "{r0}"(flags)
        :/* CLO */ // none
        :/* OPT */ "volatile"
    );
}
//TODO(lokathor): newtype this flag business.
```
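As for the TODO about a newtype, one possible shape is sketched below. This is
only an assumption about how it _could_ look: `RamResetFlags` and its methods
are hypothetical names and are not in the crate. The bit positions follow the
flag list in the docs above.

```rust
/// Hypothetical newtype over the `flags` byte of `register_ram_reset`.
#[derive(Clone, Copy, PartialEq, Eq, Debug, Default)]
pub struct RamResetFlags(u8);

impl RamResetFlags {
    pub const EWRAM: Self = Self(1 << 0);
    pub const IWRAM: Self = Self(1 << 1);
    pub const PALETTE: Self = Self(1 << 2);
    pub const VRAM: Self = Self(1 << 3);
    pub const OAM: Self = Self(1 << 4);
    pub const SIO: Self = Self(1 << 5);
    pub const SOUND: Self = Self(1 << 6);
    pub const OTHER_IO: Self = Self(1 << 7);

    /// Returns a copy of `self` with the bits of `other` also set.
    pub const fn with(self, other: Self) -> Self {
        Self(self.0 | other.0)
    }

    /// True if every bit of `other` is set in `self`.
    pub const fn contains(self, other: Self) -> bool {
        self.0 & other.0 == other.0
    }

    /// The raw byte that would go into `r0`.
    pub const fn bits(self) -> u8 {
        self.0
    }
}
```

A caller could then build a value like
`RamResetFlags::default().with(RamResetFlags::VRAM).with(RamResetFlags::OAM)`
and pass its `bits()` along, instead of remembering which bit is which.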
### Halt (0x02)
### Stop / Sleep (0x03)
### Interrupt Wait (0x04)
### VBlank Interrupt Wait (0x05)
## Math
For the math functions to make sense you'll want to be familiar with the fixed
point math concepts from the [Fixed Only](../01-quirks/02-fixed_only.md) section
of the Quirks chapter.
### Div (0x06)
```rust
/// (`swi 0x06`) Software Division and Remainder.
///
/// ## Panics
///
/// If the denominator is 0.
#[inline(always)]
pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) {
    assert!(denominator != 0);
    let div_out: i32;
    let rem_out: i32;
    unsafe {
        asm!(/* ASM */ "swi 0x06"
            :/* OUT */ "={r0}"(div_out), "={r1}"(rem_out)
            :/* INP */ "{r0}"(numerator), "{r1}"(denominator)
            :/* CLO */ "r3"
            :/* OPT */
        );
    }
    (div_out, rem_out)
}

/// As `div_rem`, but keeping only the `div` part.
#[inline(always)]
pub fn div(numerator: i32, denominator: i32) -> i32 {
    div_rem(numerator, denominator).0
}

/// As `div_rem`, but keeping only the `rem` part.
#[inline(always)]
pub fn rem(numerator: i32, denominator: i32) -> i32 {
    div_rem(numerator, denominator).1
}
```
### DivArm (0x07)

This is exactly like Div, but with the input arguments swapped. It ends up being
exactly 3 cycles slower than normal Div because it swaps the input arguments to
the positions that Div is expecting ("mov r0 -> r3, mov r1 -> r0, mov r3 ->
r1") and then goes to the normal Div function.

You can basically forget about this function. It's for compatibility with other
ARM software conventions, which we don't need. Just use normal Div.

### Sqrt (0x08)
```rust
/// (`swi 0x08`) Integer square root.
///
/// If you want more fractional precision, you can shift your input to the left
/// by `2n` bits to get `n` more bits of fractional precision in your output.
#[inline(always)]
pub fn sqrt(val: u32) -> u16 {
    let out: u16;
    unsafe {
        asm!(/* ASM */ "swi 0x08"
            :/* OUT */ "={r0}"(out)
            :/* INP */ "{r0}"(val)
            :/* CLO */ "r1", "r3"
            :/* OPT */
        );
    }
    out
}
```
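The shifting trick in that doc comment is easier to see with numbers. Below is
a host-side sketch using a plain bit-by-bit integer square root. It's a
stand-in for experimenting on a desktop, not the BIOS routine, and
`isqrt_model` is a name made up for this example.

```rust
/// Host-side integer square root (floor), as a stand-in for `swi 0x08`.
/// Uses the classic bit-by-bit method, so there's no floating point.
pub fn isqrt_model(val: u32) -> u16 {
    let mut remaining = val;
    let mut root: u32 = 0;
    // Start from the highest power of four that is not above the input.
    let mut bit: u32 = 1 << 30;
    while bit > remaining {
        bit >>= 2;
    }
    while bit != 0 {
        if remaining >= root + bit {
            remaining -= root + bit;
            root = (root >> 1) + bit;
        } else {
            root >>= 1;
        }
        bit >>= 2;
    }
    root as u16
}
```

Here `isqrt_model(2)` is just `1`, but `isqrt_model(2 << 16)` is `362`. The
input was shifted left by `2n = 16` bits, so the output carries `n = 8`
fractional bits, and `362 / 256` is about `1.414`.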
### ArcTan (0x09)
```rust
/// (`swi 0x09`) Gives the arctangent of `theta`.
///
/// The input format is 1 bit for sign, 1 bit for integral part, 14 bits for
/// fractional part.
///
/// Accuracy suffers if `theta` is less than `-pi/4` or greater than `pi/4`.
#[inline(always)]
pub fn atan(theta: i16) -> i16 {
    let out: i16;
    unsafe {
        asm!(/* ASM */ "swi 0x09"
            :/* OUT */ "={r0}"(out)
            :/* INP */ "{r0}"(theta)
            :/* CLO */ "r1", "r3"
            :/* OPT */
        );
    }
    out
}
```
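To get a feel for the input format (1 sign bit, 1 integral bit, 14 fractional
bits), here's a host-side sketch that converts to and from `f32`. The helper
names are invented for this example, and floats are used only because this is
meant for poking at values on a desktop, not for running on the GBA.

```rust
/// Number of fractional bits in the format described above.
const FRAC_BITS: u32 = 14;

/// Hypothetical helper: `f32` to the 16-bit fixed point format.
/// Values outside of roughly `-2.0 .. 2.0` saturate.
pub fn to_fx14(x: f32) -> i16 {
    (x * (1 << FRAC_BITS) as f32) as i16
}

/// Hypothetical helper: the 16-bit fixed point format back to `f32`.
pub fn from_fx14(x: i16) -> f32 {
    x as f32 / (1 << FRAC_BITS) as f32
}
```

In this format `1.0` is `0x4000`, `0.5` is `0x2000`, and `-1.0` is `-0x4000`
(the bit pattern `0xC000`).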
### ArcTan2 (0x0A)
```rust
/// (`swi 0x0A`) Gives the atan2 of `y` over `x`.
///
/// The output `theta` value maps into the range `[0, 2pi)`, or `0 .. 2pi` if
/// you prefer Rust's range notation.
///
/// `y` and `x` use the same format as with `atan`: 1 bit for sign, 1 bit for
/// integral, 14 bits for fractional.
#[inline(always)]
pub fn atan2(y: i16, x: i16) -> u16 {
    let out: u16;
    unsafe {
        asm!(/* ASM */ "swi 0x0A"
            :/* OUT */ "={r0}"(out)
            :/* INP */ "{r0}"(x), "{r1}"(y)
            :/* CLO */ "r3"
            :/* OPT */
        );
    }
    out
}
```
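If you want to sanity check an `atan2` result on a desktop, a sketch like the
following converts the `u16` output into radians. It assumes the mapping is
linear, with the full `u16` range covering one whole turn, and
`angle_to_radians` is a name made up for this example.

```rust
/// Hypothetical helper: converts a `u16` angle into radians, assuming that
/// `0x0000 ..= 0xFFFF` maps linearly onto `[0, 2pi)`.
pub fn angle_to_radians(angle: u16) -> f32 {
    angle as f32 * (2.0 * std::f32::consts::PI) / 65536.0
}
```

Under that assumption `0x4000` is a quarter turn (`pi/2`) and `0x8000` is a
half turn (`pi`).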
## Memory Modification
### CPU Set (0x0B)
### CPU Fast Set (0x0C)
### Get BIOS Checksum (0x0D)
### BG Affine Set (0x0E)
### Obj Affine Set (0x0F)
## Decompression
### BitUnPack (0x10)
### LZ77UnCompReadNormalWrite8bit (0x11)
### LZ77UnCompReadNormalWrite16bit (0x12)
### HuffUnCompReadNormal (0x13)
### RLUnCompReadNormalWrite8bit (0x14)
### RLUnCompReadNormalWrite16bit (0x15)
### Diff8bitUnFilterWrite8bit (0x16)
### Diff8bitUnFilterWrite16bit (0x17)
### Diff16bitUnFilter (0x18)
## Sound
### SoundBias (0x19)
### SoundDriverInit (0x1A)
### SoundDriverMode (0x1B)
### SoundDriverMain (0x1C)
### SoundDriverVSync (0x1D)
### SoundChannelClear (0x1E)
### MidiKey2Freq (0x1F)
### SoundWhatever0 (0x20)
### SoundWhatever1 (0x21)
### SoundWhatever2 (0x22)
### SoundWhatever3 (0x23)
### SoundWhatever4 (0x24)
### MultiBoot (0x25)
### HardReset (0x26)
### CustomHalt (0x27)
### SoundDriverVSyncOff (0x28)
### SoundDriverVSyncOn (0x29)
### SoundGetJumpList (0x2A)