diff --git a/Cargo.toml b/Cargo.toml index 0483081..56ca58c 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -1,7 +1,7 @@ [package] name = "gba" description = "A crate (and book) for making GBA games with Rust." -version = "0.4.0-pre" +version = "0.4.0-pre1" authors = ["Lokathor ", "Thomas Winwood "] repository = "https://github.com/rust-console/gba" readme = "README.md" diff --git a/book/src-bak/02-bios.md b/book/src-bak/02-bios.md index 4ab245d..d76af96 100644 --- a/book/src-bak/02-bios.md +++ b/book/src-bak/02-bios.md @@ -26,7 +26,7 @@ at all. (TODO: investigate more about what parts of the BIOS we could potentially offer faster alternatives for.) I'd like to take a moment to thank [Marc Brinkmann](https://github.com/mbr) -(with contributions from [Oliver Schneider](https://github.com/oli-obk) and +(with contributions from [Oliver Scherer](https://github.com/oli-obk) and [Philipp Oppermann](https://github.com/phil-opp)) for writing [this blog post](http://embed.rs/articles/2016/arm-inline-assembly-rust/). It's at least ten times the tutorial quality as the `asm` entry in the Unstable Book has. In @@ -39,15 +39,7 @@ So let's be slow and pedantic about this process. ## Inline ASM -**Fair Warning:** Inline asm is one of the least stable parts of Rust overall, -and if you write bad things you can trigger internal compiler errors and panics -and crashes and make LLVM choke and die without explanation. If you write some -inline asm and then suddenly your program suddenly stops compiling without -explanation, try commenting out that whole inline asm use and see if it's -causing the problem. Double check that you've written every single part of the -asm call absolutely correctly, etc, etc. 
- -**Bonus Warning:** The general information that follows regarding the asm macro +**Fair Warning:** The general information that follows regarding the asm macro is consistent from system to system, but specific information about register names, register quantities, asm instruction argument ordering, and so on is specific to ARM on the GBA. If you're programming for any other device you'll @@ -57,39 +49,44 @@ Now then, with those out of the way, the inline asm docs describe an asm call as looking like this: ```rust -asm!(assembly template - : output operands - : input operands - : clobbers - : options - ); -``` - -And once you stick a lot of stuff in there it can _absolutely_ be hard to -remember the ordering of the elements. So we'll start with a code block that -has some comments thrown in on each line: - -```rust -asm!(/* ASM */ TODO - :/* OUT */ TODO - :/* INP */ TODO - :/* CLO */ TODO - :/* OPT */ +let x = 10u32; +let y = 34u32; +let result: u32; +asm!( + // assembly template + "add {lhs}, {rhs}", + lhs = inout(reg_thumb) x => result, + rhs = in(reg_thumb) y, + options(nostack, nomem), ); +// result == 44 ``` +The `asm` macro follows the [RFC +2873](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md) +syntax. The following is just a summary of the RFC. + Now we have to decide what we're gonna write. Obviously we're going to do some instructions, but those instructions use registers, and how are we gonna talk about them? We've got two choices. 1) We can pick each and every register used by specifying exact register names. - In THUMB mode we have 8 registers available, named `r0` through `r7`. If you - switch into 32-bit mode there's additional registers that are also available. + In THUMB mode we have 8 registers available, named `r0` through `r7`. To use + those registers you would write `in("r0") x` instead of + `rhs = in(reg_thumb) x`, and directly refer to `r0` in the assembly template. 
-2) We can specify slots for registers we need and let LLVM decide. In this style - you name your slots `$0`, `$1` and so on. Slot numbers are assigned first to - all specified outputs, then to all specified inputs, in the order that you - list them. +2) We can specify slots for registers we need and let LLVM decide. This is what + we do when we write `rhs = in(reg_thumb) y` and use `{rhs}` in the assembly + template. + + The `reg_thumb` stands for the register class we are using. Since we are + in THUMB mode, the set of registers we can use is limited. `reg_thumb` tells + LLVM: "use only registers available in THUMB mode". In 32-bit mode, you have + access to more registers and you should use a different register class. + + The register classes [are described in the + RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#register-operands). + Look for "ARM" register classes. In the case of the GBA BIOS, each BIOS function has pre-designated input and output registers, so we will use the first style. If you use inline ASM in other @@ -110,22 +107,22 @@ Remember that our Rust code is in 16-bit mode. You _can_ switch to 32-bit mode within your asm as long as you switch back by the time the block ends. Otherwise you'll have a bad time. -### Outputs +### Register bindings -A comma separated list. Each entry looks like +After the assembly string literal, you need to define your bindings (which +Rust variables are getting into your registers and which ones are going to refer +to their value afterward). -* `"constraint" (binding)` +There are many operand types [as per the +RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#operand-type), +but you will most often use: -An output constraint starts with a symbol: - -* `=` for write only -* `+` for reads and writes -* `&` for for "early clobber", meaning that you'll write to this at some point - before all input values have been read. 
It prevents this register from being - assigned to an input register. - -Followed by _either_ the letter `r` (if you want LLVM to pick the register to -use) or curly braces around a specific register (if you want to pick). +``` +[alias =] in(<reg>) <binding> // input +[alias =] out(<reg>) <binding> // output +[alias =] inout(<reg>) <in binding> => <out binding> // both +out(<reg>) _ // Clobber +``` * The binding can be any single 32-bit or smaller value. * If your binding has bit pattern requirements ("must be non-zero", etc) you are @@ -134,23 +131,13 @@ use) or curly braces around a specific register (if you want to pick). being in a fit state to do that. * The binding must be either a mutable binding or a binding that was pre-declared but not yet assigned. - -Anything else is UB. - -### Inputs - -This is a similar comma separated list. - -* `"constraint" (binding)` - -An input constraint doesn't have the symbol prefix, you just pick either `r` or -a named register with curly braces around it. - * An input binding must be a single 32-bit or smaller value. * An input binding _should_ be a type that is `Copy` but this is not an absolute requirement. Having the input be read is semantically similar to using `core::ptr::read(&binding)` and forgetting the value when you're done. +Anything else is UB. + ### Clobbers Sometimes your asm will touch registers other than the ones declared for input @@ -166,11 +153,21 @@ Failure to define all of your clobbers can cause UB. ### Options -There's only one option we'd care to specify. That option is "volatile". +By default the compiler won't optimize the code you wrote in an `asm` block. You +will need to specify with the `options(..)` parameter that your code can be +optimized. The available options [are specified in the +RFC](https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#options-1). -Just like with a function call, LLVM will skip a block of asm if it doesn't see -that any outputs from the asm were used later on. 
Nearly every single BIOS call -(other than the math operations) will need to be marked as "volatile". +An optimization might duplicate or remove your instructions from the final +code. + +Typically when executing a BIOS call (such as `swi 0x00`, which resets the +console), it's important that the instruction is executed, and not optimized +away, even though it has no observable input and output to the compiler. + +However, some BIOS calls, such as _some_ math functions, have no observable +effects outside of the registers we specified. In this case, we tell the +compiler that it may optimize them. ### BIOS ASM @@ -215,11 +212,12 @@ pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) { let div_out: i32; let rem_out: i32; unsafe { - asm!(/* ASM */ "swi 0x06" - :/* OUT */ "={r0}"(div_out), "={r1}"(rem_out) - :/* INP */ "{r0}"(numerator), "{r1}"(denominator) - :/* CLO */ "r3" - :/* OPT */ + asm!( + "swi 0x06", + inout("r0") numerator => div_out, + inout("r1") denominator => rem_out, + out("r3") _, + options(nostack, nomem), ); } (div_out, rem_out) diff --git a/book/src-bak/03-volatile_destination.md b/book/src-bak/03-volatile_destination.md index 18f8fdb..dcc1978 100644 --- a/book/src-bak/03-volatile_destination.md +++ b/book/src-bak/03-volatile_destination.md @@ -315,29 +315,3 @@ OTHER_MAGIC.index(120 + 96 * 240).write_volatile(0x7C00); If you wanna see these types and methods with a full docs write up you should check the GBA crate's source. -## Volatile ASM - -In addition to some memory locations being volatile, it's also possible for -inline assembly to be declared volatile. This is basically the same idea, "hey -just do what I'm telling you, don't get smart about it". - -Normally when you have some `asm!` it's basically treated like a function, -there's inputs and outputs and the compiler will try to optimize it so that if -you don't actually use the outputs it won't bother with doing those -instructions. 
However, `asm!` is basically a pure black box, so the compiler -doesn't know what's happening inside at all, and it can't see if there's any -important side effects going on. - -An example of an important side effect that doesn't have output values would be -putting the CPU into a low power state while we want for the next VBlank. This -lets us save quite a bit of battery power. It requires some setup to be done -safely (otherwise the GBA won't ever actually wake back up from the low power -state), but the `asm!` you use once you're ready is just a single instruction -with no return value. The compiler can't tell what's going on, so you just have -to say "do it anyway". - -Note that if you use a linker script to include any ASM with your Rust program -(eg: the `crt0.s` file that we setup in the "Development Setup" section), all of -that ASM is "volatile" for these purposes. Volatile isn't actually a _hardware_ -concept, it's just an LLVM concept, and the linker script runs after LLVM has -done its work. diff --git a/book/src-bak/gba_prng.md b/book/src-bak/gba_prng.md index 6745517..1ce9581 100644 --- a/book/src-bak/gba_prng.md +++ b/book/src-bak/gba_prng.md @@ -786,15 +786,12 @@ overhead I mentioned), the BIOS does its thing, and then eventually control returns to us. The precise details of what the BIOS call does depends on the function number -that we call. We'd even have to potentially mark it as volatile asm if there's -no clear outputs, otherwise the compiler would "helpfully" eliminate it for us -during optimization. In our case there _are_ clear outputs. The numerator goes -into register 0, and the denominator goes into register 1, the divmod happens, -and then the division output is left in register 0 and the modulus output is -left in register 1. I keep calling it "divmod" because div and modulus are two -sides of the same coin. 
There's no way to do one of them faster by not doing the -other or anything like that, so we'll first define it as a unified function that -returns a tuple: +that we call. The numerator goes into register 0, and the denominator goes into +register 1, the divmod happens, and then the division output is left in register +0 and the modulus output is left in register 1. I keep calling it "divmod" +because div and modulus are two sides of the same coin. There's no way to do one +of them faster by not doing the other or anything like that, so we'll first +define it as a unified function that returns a tuple: ```rust #![feature(asm)] @@ -806,12 +803,18 @@ pub fn div_modulus(numerator: i32, denominator: i32) -> (i32, i32) { let div_out: i32; let mod_out: i32; unsafe { - asm!(/* assembly template */ "swi 0x06" - :/* output operands */ "={r0}"(div_out), "={r1}"(mod_out) - :/* input operands */ "{r0}"(numerator), "{r1}"(denominator) - :/* clobbers */ "r3" - :/* options */ - ); + asm!( + // Assembly template + "swi 0x06", + // in+output registers + inout("r0") numerator => div_out, + inout("r1") denominator => mod_out, + // Clobber (not part of in/output but used by the operation) + out("r3") _, + // Additional compiler optimization options. See for details: + // https://github.com/Amanieu/rfcs/blob/inline-asm/text/0000-inline-asm.md#options-1 + options(nostack, nomem), + ); } (div_out, mod_out) } diff --git a/src/bios.rs b/src/bios.rs index a5f3b3b..fb5ffc6 100644 --- a/src/bios.rs +++ b/src/bios.rs @@ -10,6 +10,7 @@ #![cfg_attr(not(all(target_vendor = "nintendo", target_env = "agb")), allow(unused_variables))] +use core::mem; use super::*; use io::irq::IrqFlags; @@ -60,13 +61,7 @@ pub unsafe fn soft_reset() -> ! 
{ } #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { - asm!(/* ASM */ "swi 0x00" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); - core::hint::unreachable_unchecked() + asm!("swi 0x00", options(noreturn)) } } @@ -103,12 +98,7 @@ pub unsafe fn register_ram_reset(flags: RegisterRAMResetFlags) { } #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { - asm!(/* ASM */ "swi 0x01" - :/* OUT */ // none - :/* INP */ "{r0}"(flags.0) - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x01", in("r0") flags.0); } } newtype! { @@ -143,12 +133,7 @@ pub fn halt() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x02" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x02"); } } } @@ -170,12 +155,7 @@ pub fn stop() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x03" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x03"); } } } @@ -202,11 +182,10 @@ pub fn interrupt_wait(ignore_current_flags: bool, target_flags: IrqFlags) { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x04" - :/* OUT */ // none - :/* INP */ "{r0}"(ignore_current_flags), "{r1}"(target_flags) - :/* CLO */ // none - :/* OPT */ "volatile" + asm!( + "swi 0x04", + in("r0") mem::transmute::(ignore_current_flags), + in("r1") mem::transmute::(target_flags), ); } } @@ -226,11 +205,10 @@ pub fn vblank_interrupt_wait() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x05" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ "r0", "r1" // both set to 1 by the routine - :/* OPT */ "volatile" + asm!( + "swi 0x05", + out("r0") _, + out("r1") _, ); } } @@ -253,11 +231,12 @@ pub fn div_rem(numerator: i32, denominator: i32) -> (i32, i32) { let div_out: i32; let 
rem_out: i32; unsafe { - asm!(/* ASM */ "swi 0x06" - :/* OUT */ "={r0}"(div_out), "={r1}"(rem_out) - :/* INP */ "{r0}"(numerator), "{r1}"(denominator) - :/* CLO */ "r3" - :/* OPT */ + asm!( + "swi 0x06", + inout("r0") numerator => div_out, + inout("r1") denominator => rem_out, + out("r3") _, + options(nostack, nomem), ); } (div_out, rem_out) @@ -292,16 +271,17 @@ pub fn sqrt(val: u32) -> u16 { } #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { - let out: u16; + let out: u32; unsafe { - asm!(/* ASM */ "swi 0x08" - :/* OUT */ "={r0}"(out) - :/* INP */ "{r0}"(val) - :/* CLO */ "r1", "r3" - :/* OPT */ + asm!( + "swi 0x08", + inout("r0") val => out, + out("r1") _, + out("r3") _, + options(pure, nomem), ); } - out + out as u16 } } @@ -321,11 +301,12 @@ pub fn atan(theta: i16) -> i16 { { let out: i16; unsafe { - asm!(/* ASM */ "swi 0x09" - :/* OUT */ "={r0}"(out) - :/* INP */ "{r0}"(theta) - :/* CLO */ "r1", "r3" - :/* OPT */ + asm!( + "swi 0x09", + inout("r0") theta => out, + out("r1") _, + out("r3") _, + options(pure, nomem), ); } out @@ -349,11 +330,12 @@ pub fn atan2(y: i16, x: i16) -> u16 { { let out: u16; unsafe { - asm!(/* ASM */ "swi 0x0A" - :/* OUT */ "={r0}"(out) - :/* INP */ "{r0}"(x), "{r1}"(y) - :/* CLO */ "r3" - :/* OPT */ + asm!( + "swi 0x0A", + inout("r0") x => out, + in("r1") y, + out("r3") _, + options(pure, nomem), ); } out @@ -378,11 +360,11 @@ pub unsafe fn cpu_set16(src: *const u16, dest: *mut u16, count: u32, fixed_sourc #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { let control = count + ((fixed_source as u32) << 24); - asm!(/* ASM */ "swi 0x0B" - :/* OUT */ // none - :/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control) - :/* CLO */ // none - :/* OPT */ "volatile" + asm!( + "swi 0x0B", + in("r0") src, + in("r1") dest, + in("r2") control, ); } } @@ -405,11 +387,11 @@ pub unsafe fn cpu_set32(src: *const u32, dest: *mut u32, count: u32, fixed_sourc #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { let control = 
count + ((fixed_source as u32) << 24) + (1 << 26); - asm!(/* ASM */ "swi 0x0B" - :/* OUT */ // none - :/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control) - :/* CLO */ // none - :/* OPT */ "volatile" + asm!( + "swi 0x0B", + in("r0") src, + in("r1") dest, + in("r2") control, ); } } @@ -433,11 +415,11 @@ pub unsafe fn cpu_fast_set(src: *const u32, dest: *mut u32, count: u32, fixed_so #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { let control = count + ((fixed_source as u32) << 24); - asm!(/* ASM */ "swi 0x0C" - :/* OUT */ // none - :/* INP */ "{r0}"(src), "{r1}"(dest), "{r2}"(control) - :/* CLO */ // none - :/* OPT */ "volatile" + asm!( + "swi 0x0C", + in("r0") src, + in("r1") dest, + in("r2") control, ); } } @@ -460,11 +442,10 @@ pub fn get_bios_checksum() -> u32 { { let out: u32; unsafe { - asm!(/* ASM */ "swi 0x0D" - :/* OUT */ "={r0}"(out) - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ // none + asm!( + "swi 0x0D", + out("r0") out, + options(pure, readonly), ); } out @@ -499,12 +480,7 @@ pub fn sound_bias(level: u32) { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x19" - :/* OUT */ // none - :/* INP */ "{r0}"(level) - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x19", in("r0") level); } } } @@ -544,12 +520,7 @@ pub fn sound_driver_mode(mode: u32) { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x1B" - :/* OUT */ // none - :/* INP */ "{r0}"(mode) - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x1B", in("r0") mode); } } } @@ -571,12 +542,7 @@ pub fn sound_driver_main() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x1C" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x1C"); } } } @@ -594,12 +560,7 @@ pub fn sound_driver_vsync() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM 
*/ "swi 0x1D" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x1D"); } } } @@ -619,12 +580,7 @@ pub fn sound_channel_clear() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x1E" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x1E"); } } } @@ -647,12 +603,7 @@ pub fn sound_driver_vsync_off() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x28" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x28"); } } } @@ -671,12 +622,7 @@ pub fn sound_driver_vsync_on() { #[cfg(all(target_vendor = "nintendo", target_env = "agb"))] { unsafe { - asm!(/* ASM */ "swi 0x29" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!("swi 0x29"); } } } diff --git a/src/io/dma.rs b/src/io/dma.rs index 96d9d14..0da6c8e 100644 --- a/src/io/dma.rs +++ b/src/io/dma.rs @@ -389,12 +389,9 @@ impl DMA3 { // it's only two cycles we just insert two NOP instructions to ensure that // successive calls to `fill32` or other DMA methods don't interfere with // each other. - asm!(/* ASM */ "NOP - NOP" - :/* OUT */ // none - :/* INP */ // none - :/* CLO */ // none - :/* OPT */ "volatile" - ); + asm!(" + NOP + NOP + ", options(nomem, nostack)); } }