Use RFC 2873 asm syntax (#93)

The new syntax is way safer and more ergonomic. In fact, it renders
obsolete some of the warnings in the docs related to the use of `asm!`.
Nicola Papale 2020-06-14 09:22:59 +02:00 committed by GitHub
parent af98b63aa1
commit 1696c66b1b
6 changed files with 154 additions and 236 deletions

View file

@ -1,7 +1,7 @@
[package]
name = "gba"
description = "A crate (and book) for making GBA games with Rust."
version = "0.4.0-pre"
version = "0.4.0-pre1"
authors = ["Lokathor <zefria@gmail.com>", "Thomas Winwood <twwinwood@gmail.com>"]
repository = "https://github.com/rust-console/gba"
readme = "README.md"

View file

@ -26,7 +26,7 @@ at all. (TODO: investigate more about what parts of the BIOS we could
potentially offer faster alternatives for.)
I'd like to take a moment to thank [Marc Brinkmann](https://github.com/mbr)
(with contributions from [Oliver Schneider](https://github.com/oli-obk) and
(with contributions from [Oliver Scherer](https://github.com/oli-obk) and
[Philipp Oppermann](https://github.com/phil-opp)) for writing [this blog
post](http://embed.rs/articles/2016/arm-inline-assembly-rust/). It's at least
ten times the tutorial quality of the `asm` entry in the Unstable Book. In
@ -39,15 +39,7 @@ So let's be slow and pedantic about this process.
## Inline ASM
**Fair Warning:** Inline asm is one of the least stable parts of Rust overall,
and if you write bad things you can trigger internal compiler errors and panics
and crashes and make LLVM choke and die without explanation. If you write some
inline asm and then your program suddenly stops compiling without
explanation, try commenting out that whole inline asm use and see if it's
causing the problem. Double check that you've written every single part of the
asm call absolutely correctly, etc, etc.
**Bonus Warning:** The general information that follows regarding the asm macro
**Fair Warning:** The general information that follows regarding the asm macro
is consistent from system to system, but specific information about register
names, register quantities, asm instruction argument ordering, and so on is
specific to ARM on the GBA. If you're programming for any other device you'll
@ -57,39 +49,44 @@ Now then, with those out of the way, the inline asm docs describe an asm call as
looking like this:
```rust
asm!(assembly template
: output operands
: input operands
: clobbers
: options
);
```
And once you stick a lot of stuff in there it can _absolutely_ be hard to
remember the ordering of the elements. So we'll start with a code block that
has some comments thrown in on each line:
```rust
asm!(/* ASM */ TODO
:/* OUT */ TODO
:/* INP */ TODO
:/* CLO */ TODO
:/* OPT */
let x = 10u32;
let y = 34u32;
let result: u32;
asm!(
// assembly template
"add {lhs}, {rhs}",
lhs = inout(reg_thumb) x => result,
rhs = in(reg_thumb) y,
options(nostack, nomem),
);
// result == 44
```
The `asm` macro follows the [RFC
2873](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md)
syntax. The following is just a summary of the RFC.
Now we have to decide what we're gonna write. Obviously we're going to do some
instructions, but those instructions use registers, and how are we gonna talk
about them? We've got two choices.
1) We can pick each and every register used by specifying exact register names.
In THUMB mode we have 8 registers available, named `r0` through `r7`. If you
switch into 32-bit mode there's additional registers that are also available.
In THUMB mode we have 8 registers available, named `r0` through `r7`. To use
those registers you would write `in("r0") y` instead of
`rhs = in(reg_thumb) y`, and directly refer to `r0` in the assembly template.
2) We can specify slots for registers we need and let LLVM decide. In this style
you name your slots `$0`, `$1` and so on. Slot numbers are assigned first to
all specified outputs, then to all specified inputs, in the order that you
list them.
2) We can specify slots for registers we need and let LLVM decide. This is what
we do when we write `rhs = in(reg_thumb) y` and use `{rhs}` in the assembly
template.
The `reg_thumb` stands for the register class we are using. Since we are
in THUMB mode, the set of registers we can use is limited. `reg_thumb` tells
LLVM: "use only registers available in THUMB mode". In 32-bit mode, you have
access to more registers, and you should use a different register class.
The register classes [are described in the
RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#register-operands).
Look for "ARM" register classes.
In the case of the GBA BIOS, each BIOS function has pre-designated input and
output registers, so we will use the first style. If you use inline ASM in other
@ -110,22 +107,22 @@ Remember that our Rust code is in 16-bit mode. You _can_ switch to 32-bit mode
within your asm as long as you switch back by the time the block ends. Otherwise
you'll have a bad time.
### Outputs
### Register bindings
A comma separated list. Each entry looks like
After the assembly string literal, you need to define your bindings (which
Rust variables go into your registers, and which ones will receive the register
values afterward).
* `"constraint" (binding)`
There are many operand types [as per the
RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#operand-type),
but you will most often use:
An output constraint starts with a symbol:
* `=` for write only
* `+` for reads and writes
* `&` for "early clobber", meaning that you'll write to this at some point
before all input values have been read. It prevents this register from being
assigned to an input register.
Followed by _either_ the letter `r` (if you want LLVM to pick the register to
use) or curly braces around a specific register (if you want to pick).
```
[alias =] in(<reg>) <binding> // input
[alias =] out(<reg>) <binding> // output
[alias =] inout(<reg>) <in binding> => <out binding> // both
out(<reg>) _ // Clobber
```
* The binding can be any single 32-bit or smaller value.
* If your binding has bit pattern requirements ("must be non-zero", etc) you are
@ -134,23 +131,13 @@ use) or curly braces around a specific register (if you want to pick).
being in a fit state to do that.
* The binding must be either a mutable binding or a binding that was
pre-declared but not yet assigned.
Anything else is UB.
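As an illustration of the bit pattern point above, here is a minimal host-side
sketch in plain Rust (no `asm!` involved). It assumes one conservative way of
working: bind only plain integers in the asm block, then convert to or from
the stricter type in ordinary Rust. The helper names `checked_nonzero` and
`bool_input` are hypothetical and not part of the crate.

```rust
use core::num::NonZeroU32;

/// Hypothetical helper: `raw` stands in for a value that an asm block wrote
/// into a plain `u32` output binding. Validating it afterwards means the
/// "must be non-zero" requirement is checked by us, not assumed.
pub fn checked_nonzero(raw: u32) -> Option<NonZeroU32> {
    NonZeroU32::new(raw)
}

/// Hypothetical helper: turn a `bool` into a plain integer before it is used
/// as an input binding, so the register holds an ordinary 0 or 1.
pub fn bool_input(flag: bool) -> u32 {
    u32::from(flag)
}
```

With this approach a zero coming back from the asm shows up as `None` instead
of an invalid `NonZeroU32`.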
### Inputs
This is a similar comma separated list.
* `"constraint" (binding)`
An input constraint doesn't have the symbol prefix, you just pick either `r` or
a named register with curly braces around it.
* An input binding must be a single 32-bit or smaller value.
* An input binding _should_ be a type that is `Copy` but this is not an absolute
requirement. Having the input be read is semantically similar to using
`core::ptr::read(&binding)` and forgetting the value when you're done.
Anything else is UB.
### Clobbers
Sometimes your asm will touch registers other than the ones declared for input
@ -166,11 +153,21 @@ Failure to define all of your clobbers can cause UB.
### Options
There's only one option we'd care to specify. That option is "volatile".
By default the compiler won't optimize the code you wrote in an `asm` block. You
will need to specify with the `options(..)` parameter that your code can be
optimized. The available options [are specified in the
RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#options-1).
Just like with a function call, LLVM will skip a block of asm if it doesn't see
that any outputs from the asm were used later on. Nearly every single BIOS call
(other than the math operations) will need to be marked as "volatile".
An optimization might duplicate or remove your instructions from the final
code.
Typically when executing a BIOS call (such as `swi 0x01`, the register and RAM
reset call), it's important that the instruction is executed and not optimized
away, even though it has no output that the compiler can observe.
However, some BIOS calls, such as _some_ math functions, have no observable
effects outside of the registers we specified. In this case, we tell the
compiler that it may optimize them.
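To make "no observable effects" a bit more concrete, here is a host-side
sketch in plain Rust (no `asm!`) of what a division BIOS call is expected to
compute. The name `div_rem_model` is hypothetical, and the truncating
semantics and the wrapping behavior on overflow are assumptions of this model
rather than documented BIOS behavior. The only point is that the result
depends on nothing but the two inputs.

```rust
/// Host-side reference model of a "divide" call (hypothetical helper, not
/// part of the crate). Returns `(quotient, remainder)`.
///
/// Assumptions: the quotient truncates toward zero and the remainder takes
/// the sign of the numerator, like Rust's `/` and `%`. The wrapping variants
/// are used only so that `i32::MIN / -1` does not panic in this model.
pub fn div_rem_model(numerator: i32, denominator: i32) -> (i32, i32) {
    assert!(denominator != 0, "the model does not define division by zero");
    (
        numerator.wrapping_div(denominator),
        numerator.wrapping_rem(denominator),
    )
}
```

Calling the model twice with the same arguments always gives the same tuple
and touches no memory, which is the kind of behavior that lets the compiler
safely remove or merge such calls.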
### BIOS ASM
@ -215,11 +212,12 @@ pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) {
let div_out: i32;
let rem_out: i32;
unsafe {
asm!(/* ASM */ "swi 0x06"
:/* OUT */ "={r0}"(div_out), "={r1}"(rem_out)
:/* INP */ "{r0}"(numerator), "{r1}"(denominator)
:/* CLO */ "r3"
:/* OPT */
asm!(
"swi 0x06",
inout("r0") numerator => div_out,
inout("r1") denominator => rem_out,
out("r3") _,
options(nostack, nomem),
);
}
(div_out, rem_out)

View file

@ -315,29 +315,3 @@ OTHER_MAGIC.index(120 + 96 * 240).write_volatile(0x7C00);
If you wanna see these types and methods with a full docs write up you should
check the GBA crate's source.
## Volatile ASM
In addition to some memory locations being volatile, it's also possible for
inline assembly to be declared volatile. This is basically the same idea, "hey
just do what I'm telling you, don't get smart about it".
Normally when you have some `asm!` it's basically treated like a function,
there's inputs and outputs and the compiler will try to optimize it so that if
you don't actually use the outputs it won't bother with doing those
instructions. However, `asm!` is basically a pure black box, so the compiler
doesn't know what's happening inside at all, and it can't see if there's any
important side effects going on.
An example of an important side effect that doesn't have output values would be
putting the CPU into a low power state while we wait for the next VBlank. This
lets us save quite a bit of battery power. It requires some setup to be done
safely (otherwise the GBA won't ever actually wake back up from the low power
state), but the `asm!` you use once you're ready is just a single instruction
with no return value. The compiler can't tell what's going on, so you just have
to say "do it anyway".
Note that if you use a linker script to include any ASM with your Rust program
(eg: the `crt0.s` file that we setup in the "Development Setup" section), all of
that ASM is "volatile" for these purposes. Volatile isn't actually a _hardware_
concept, it's just an LLVM concept, and the linker script runs after LLVM has
done its work.

View file

@ -786,15 +786,12 @@ overhead I mentioned), the BIOS does its thing, and then eventually control
returns to us.
The precise details of what the BIOS call does depends on the function number
that we call. We'd even have to potentially mark it as volatile asm if there's
no clear outputs, otherwise the compiler would "helpfully" eliminate it for us
during optimization. In our case there _are_ clear outputs. The numerator goes
into register 0, and the denominator goes into register 1, the divmod happens,
and then the division output is left in register 0 and the modulus output is
left in register 1. I keep calling it "divmod" because div and modulus are two
sides of the same coin. There's no way to do one of them faster by not doing the
other or anything like that, so we'll first define it as a unified function that
returns a tuple:
that we call. The numerator goes into register 0, and the denominator goes into
register 1, the divmod happens, and then the division output is left in register
0 and the modulus output is left in register 1. I keep calling it "divmod"
because div and modulus are two sides of the same coin. There's no way to do one
of them faster by not doing the other or anything like that, so we'll first
define it as a unified function that returns a tuple:
```rust
#![feature(asm)]
@ -806,12 +803,18 @@ pub fn div_modulus(numerator: i32, denominator: i32) -> (i32, i32) {
let div_out: i32;
let mod_out: i32;
unsafe {
asm!(/* assembly template */ "swi 0x06"
:/* output operands */ "={r0}"(div_out), "={r1}"(mod_out)
:/* input operands */ "{r0}"(numerator), "{r1}"(denominator)
:/* clobbers */ "r3"
:/* options */
);
asm!(
// Assembly template
"swi 0x06",
// in+output registers
inout("r0") numerator => div_out,
inout("r1") denominator => mod_out,
// Clobber (not part of in/output but used by the operation)
out("r3") _,
// Additional compiler optimization options. See for details:
// https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#options-1
options(nostack, nomem),
);
}
(div_out, mod_out)
}

View file

@ -10,6 +10,7 @@
#![cfg_attr(not(all(target_vendor = "nintendo", target_env = "agb")), allow(unused_variables))]
use core::mem;
use super::*;
use io::irq::IrqFlags;
@ -60,13 +61,7 @@ pub unsafe fn soft_reset() -> ! {
}
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
asm!(/* ASM */ "swi 0x00"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
core::hint::unreachable_unchecked()
asm!("swi 0x00", options(noreturn))
}
}
@ -103,12 +98,7 @@ pub unsafe fn register_ram_reset(flags: RegisterRAMResetFlags) {
}
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
asm!(/* ASM */ "swi 0x01"
:/* OUT */ // none
:/* INP */ "{r0}"(flags.0)
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x01", in("r0") flags.0);
}
}
newtype! {
@ -143,12 +133,7 @@ pub fn halt() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x02"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x02");
}
}
}
@ -170,12 +155,7 @@ pub fn stop() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x03"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x03");
}
}
}
@ -202,11 +182,10 @@ pub fn interrupt_wait(ignore_current_flags: bool, target_flags: IrqFlags) {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x04"
:/* OUT */ // none
:/* INP */ "{r0}"(ignore_current_flags), "{r1}"(target_flags)
:/* CLO */ // none
:/* OPT */ "volatile"
asm!(
"swi 0x04",
in("r0") mem::transmute::<bool, u8>(ignore_current_flags),
in("r1") mem::transmute::<IrqFlags, u16>(target_flags),
);
}
}
@ -226,11 +205,10 @@ pub fn vblank_interrupt_wait() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x05"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ "r0", "r1" // both set to 1 by the routine
:/* OPT */ "volatile"
asm!(
"swi 0x05",
out("r0") _,
out("r1") _,
);
}
}
@ -253,11 +231,12 @@ pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) {
let div_out: i32;
let rem_out: i32;
unsafe {
asm!(/* ASM */ "swi 0x06"
:/* OUT */ "={r0}"(div_out), "={r1}"(rem_out)
:/* INP */ "{r0}"(numerator), "{r1}"(denominator)
:/* CLO */ "r3"
:/* OPT */
asm!(
"swi 0x06",
inout("r0") numerator => div_out,
inout("r1") denominator => rem_out,
out("r3") _,
options(nostack, nomem),
);
}
(div_out, rem_out)
@ -292,16 +271,17 @@ pub fn sqrt(val: u32) -> u16 {
}
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
let out: u16;
let out: u32;
unsafe {
asm!(/* ASM */ "swi 0x08"
:/* OUT */ "={r0}"(out)
:/* INP */ "{r0}"(val)
:/* CLO */ "r1", "r3"
:/* OPT */
asm!(
"swi 0x08",
inout("r0") val => out,
out("r1") _,
out("r3") _,
options(pure, nomem),
);
}
out
out as u16
}
}
@ -321,11 +301,12 @@ pub fn atan(theta: i16) -> i16 {
{
let out: i16;
unsafe {
asm!(/* ASM */ "swi 0x09"
:/* OUT */ "={r0}"(out)
:/* INP */ "{r0}"(theta)
:/* CLO */ "r1", "r3"
:/* OPT */
asm!(
"swi 0x09",
inout("r0") theta => out,
out("r1") _,
out("r3") _,
options(pure, nomem),
);
}
out
@ -349,11 +330,12 @@ pub fn atan2(y: i16, x: i16) -> u16 {
{
let out: u16;
unsafe {
asm!(/* ASM */ "swi 0x0A"
:/* OUT */ "={r0}"(out)
:/* INP */ "{r0}"(x), "{r1}"(y)
:/* CLO */ "r3"
:/* OPT */
asm!(
"swi 0x0A",
inout("r0") x => out,
in("r1") y,
out("r3") _,
options(pure, nomem),
);
}
out
@ -378,11 +360,11 @@ pub unsafe fn cpu_set16(src: *const u16, dest: *mut u16, count: u32, fixed_sourc
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
let control = count + ((fixed_source as u32) << 24);
asm!(/* ASM */ "swi 0x0B"
:/* OUT */ // none
:/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control)
:/* CLO */ // none
:/* OPT */ "volatile"
asm!(
"swi 0x0B",
in("r0") src,
in("r1") dest,
in("r2") control,
);
}
}
@ -405,11 +387,11 @@ pub unsafe fn cpu_set32(src: *const u32, dest: *mut u32, count: u32, fixed_sourc
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
let control = count + ((fixed_source as u32) << 24) + (1 << 26);
asm!(/* ASM */ "swi 0x0B"
:/* OUT */ // none
:/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control)
:/* CLO */ // none
:/* OPT */ "volatile"
asm!(
"swi 0x0B",
in("r0") src,
in("r1") dest,
in("r2") control,
);
}
}
@ -433,11 +415,11 @@ pub unsafe fn cpu_fast_set(src: *const u32, dest: *mut u32, count: u32, fixed_so
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
let control = count + ((fixed_source as u32) << 24);
asm!(/* ASM */ "swi 0x0C"
:/* OUT */ // none
:/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control)
:/* CLO */ // none
:/* OPT */ "volatile"
asm!(
"swi 0x0C",
in("r0") src,
in("r1") dest,
in("r2") control,
);
}
}
@ -460,11 +442,10 @@ pub fn get_bios_checksum() -> u32 {
{
let out: u32;
unsafe {
asm!(/* ASM */ "swi 0x0D"
:/* OUT */ "={r0}"(out)
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ // none
asm!(
"swi 0x0D",
out("r0") out,
options(pure, readonly),
);
}
out
@ -499,12 +480,7 @@ pub fn sound_bias(level: u32) {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x19"
:/* OUT */ // none
:/* INP */ "{r0}"(level)
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x19", in("r0") level);
}
}
}
@ -544,12 +520,7 @@ pub fn sound_driver_mode(mode: u32) {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x1B"
:/* OUT */ // none
:/* INP */ "{r0}"(mode)
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x1B", in("r0") mode);
}
}
}
@ -571,12 +542,7 @@ pub fn sound_driver_main() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x1C"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x1C");
}
}
}
@ -594,12 +560,7 @@ pub fn sound_driver_vsync() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x1D"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x1D");
}
}
}
@ -619,12 +580,7 @@ pub fn sound_channel_clear() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x1E"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x1E");
}
}
}
@ -647,12 +603,7 @@ pub fn sound_driver_vsync_off() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x28"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x28");
}
}
}
@ -671,12 +622,7 @@ pub fn sound_driver_vsync_on() {
#[cfg(all(target_vendor = "nintendo", target_env = "agb"))]
{
unsafe {
asm!(/* ASM */ "swi 0x29"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("swi 0x29");
}
}
}

View file

@ -389,12 +389,9 @@ impl DMA3 {
// it's only two cycles we just insert two NOP instructions to ensure that
// successive calls to `fill32` or other DMA methods don't interfere with
// each other.
asm!(/* ASM */ "NOP
NOP"
:/* OUT */ // none
:/* INP */ // none
:/* CLO */ // none
:/* OPT */ "volatile"
);
asm!("
NOP
NOP
", options(nomem, nostack));
}
}