mirror of https://github.com/rust-lang/book
<!-- DO NOT EDIT THIS FILE.

This file is periodically generated from the content in the `/src/`
directory, so all fixes need to be made in `/src/`.
-->

[TOC]

# Advanced Features

By now, you’ve learned the most commonly used parts of the Rust programming
language. Before we do one more project, in Chapter 20, we’ll look at a few
aspects of the language you might run into every once in a while, but may not
use every day. You can use this chapter as a reference for when you encounter
any unknowns. The features covered here are useful in very specific situations.
Although you might not reach for them often, we want to make sure you have a
grasp of all the features Rust has to offer.

In this chapter, we’ll cover:

* Unsafe Rust: how to opt out of some of Rust’s guarantees and take
  responsibility for manually upholding those guarantees
* Advanced traits: associated types, default type parameters, fully qualified
  syntax, supertraits, and the newtype pattern in relation to traits
* Advanced types: more about the newtype pattern, type aliases, the never type,
  and dynamically sized types
* Advanced functions and closures: function pointers and returning closures
* Macros: ways to define code that defines more code at compile time

It’s a panoply of Rust features with something for everyone! Let’s dive in!

## Unsafe Rust

All the code we’ve discussed so far has had Rust’s memory safety guarantees
enforced at compile time. However, Rust has a second language hidden inside it
that doesn’t enforce these memory safety guarantees: it’s called *unsafe Rust*
and works just like regular Rust, but gives us extra superpowers.

Unsafe Rust exists because, by nature, static analysis is conservative. When
the compiler tries to determine whether or not code upholds the guarantees,
it’s better for it to reject some valid programs than to accept some invalid
programs. Although the code *might* be okay, if the Rust compiler doesn’t have
enough information to be confident, it will reject the code. In these cases,
you can use unsafe code to tell the compiler, “Trust me, I know what I’m
doing.” Be warned, however, that you use unsafe Rust at your own risk: if you
use unsafe code incorrectly, problems can occur due to memory unsafety, such as
null pointer dereferencing.

Another reason Rust has an unsafe alter ego is that the underlying computer
hardware is inherently unsafe. If Rust didn’t let you do unsafe operations, you
couldn’t do certain tasks. Rust needs to allow you to do low-level systems
programming, such as directly interacting with the operating system or even
writing your own operating system. Working with low-level systems programming
is one of the goals of the language. Let’s explore what we can do with unsafe
Rust and how to do it.

### Unsafe Superpowers

To switch to unsafe Rust, use the `unsafe` keyword and then start a new block
that holds the unsafe code. You can take five actions in unsafe Rust that you
can’t in safe Rust, which we call *unsafe superpowers*. Those superpowers
include the ability to:

1. Dereference a raw pointer
1. Call an unsafe function or method
1. Access or modify a mutable static variable
1. Implement an unsafe trait
1. Access fields of `union`s

It’s important to understand that `unsafe` doesn’t turn off the borrow checker
or disable any of Rust’s other safety checks: if you use a reference in unsafe
code, it will still be checked. The `unsafe` keyword only gives you access to
these five features that are then not checked by the compiler for memory
safety. You’ll still get some degree of safety inside an unsafe block.

In addition, `unsafe` does not mean the code inside the block is necessarily
dangerous or that it will definitely have memory safety problems: the intent is
that as the programmer, you’ll ensure the code inside an `unsafe` block will
access memory in a valid way.

People are fallible and mistakes will happen, but by requiring these five
unsafe operations to be inside blocks annotated with `unsafe`, you’ll know that
any errors related to memory safety must be within an `unsafe` block. Keep
`unsafe` blocks small; you’ll be thankful later when you investigate memory
bugs.

To isolate unsafe code as much as possible, it’s best to enclose such code
within a safe abstraction and provide a safe API, which we’ll discuss later in
the chapter when we examine unsafe functions and methods. Parts of the standard
library are implemented as safe abstractions over unsafe code that has been
audited. Wrapping unsafe code in a safe abstraction prevents uses of `unsafe`
from leaking out into all the places that you or your users might want to use
the functionality implemented with `unsafe` code, because using a safe
abstraction is safe.

Let’s look at each of the five unsafe superpowers in turn. We’ll also look at
some abstractions that provide a safe interface to unsafe code.

### Dereferencing a Raw Pointer

In “Dangling References” on page XX, we mentioned that the compiler ensures
references are always valid. Unsafe Rust has two new types called *raw
pointers* that are similar to references. As with references, raw pointers can
be immutable or mutable and are written as `*const T` and `*mut T`,
respectively. The asterisk isn’t the dereference operator; it’s part of the
type name. In the context of raw pointers, *immutable* means that the pointer
can’t be directly assigned to after being dereferenced.

Different from references and smart pointers, raw pointers:

* Are allowed to ignore the borrowing rules by having both immutable and
  mutable pointers or multiple mutable pointers to the same location
* Aren’t guaranteed to point to valid memory
* Are allowed to be null
* Don’t implement any automatic cleanup

By opting out of having Rust enforce these guarantees, you can give up
guaranteed safety in exchange for greater performance or the ability to
interface with another language or hardware where Rust’s guarantees don’t apply.

Listing 19-1 shows how to create an immutable and a mutable raw pointer from
references.

```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;
```

Listing 19-1: Creating raw pointers from references

Notice that we don’t include the `unsafe` keyword in this code. We can create
raw pointers in safe code; we just can’t dereference raw pointers outside an
unsafe block, as you’ll see in a bit.

We’ve created raw pointers by using `as` to cast an immutable and a mutable
reference into their corresponding raw pointer types. Because we created them
directly from references guaranteed to be valid, we know these particular raw
pointers are valid, but we can’t make that assumption about just any raw
pointer.

To demonstrate this, next we’ll create a raw pointer whose validity we can’t be
so certain of. Listing 19-2 shows how to create a raw pointer to an arbitrary
location in memory. Trying to use arbitrary memory is undefined: there might be
data at that address or there might not, the compiler might optimize the code
so there is no memory access, or the program might terminate with a
segmentation fault. Usually, there is no good reason to write code like this,
but it is possible.

```
let address = 0x012345usize;
let r = address as *const i32;
```

Listing 19-2: Creating a raw pointer to an arbitrary memory address

Recall that we can create raw pointers in safe code, but we can’t *dereference*
raw pointers and read the data being pointed to. In Listing 19-3, we use the
dereference operator `*` on a raw pointer, which requires an `unsafe` block.

```
let mut num = 5;

let r1 = &num as *const i32;
let r2 = &mut num as *mut i32;

unsafe {
    println!("r1 is: {}", *r1);
    println!("r2 is: {}", *r2);
}
```

Listing 19-3: Dereferencing raw pointers within an `unsafe` block

Creating a pointer does no harm; it’s only when we try to access the value that
it points at that we might end up dealing with an invalid value.

Note also that in Listings 19-1 and 19-3, we created `*const i32` and `*mut
i32` raw pointers that both pointed to the same memory location, where `num` is
stored. If we instead tried to create an immutable and a mutable reference to
`num`, the code would not have compiled because Rust’s ownership rules don’t
allow a mutable reference at the same time as any immutable references. With
raw pointers, we can create a mutable pointer and an immutable pointer to the
same location and change data through the mutable pointer, potentially creating
a data race. Be careful!

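As a minimal sketch of the points above, the following example writes through
a `*mut i32` and reads the same location back through a `*const i32`, and it
checks a pointer for null without dereferencing it. The helper names
`bump_through_raw` and `describe` are hypothetical and not part of the book’s
listings. Deriving both raw pointers from a single mutable borrow is a choice
made for this sketch to keep the aliasing easy to reason about.

```rust
use std::ptr;

// Hypothetical helper: writes through a `*mut i32`, then reads the same
// location through a `*const i32` that aliases it.
fn bump_through_raw(start: i32) -> i32 {
    let mut num = start;

    // Both raw pointers come from one mutable borrow of `num`.
    let r2 = &mut num as *mut i32;
    let r1 = r2 as *const i32;

    unsafe {
        *r2 += 1; // modify through the mutable raw pointer
        *r1 // observe the change through the immutable raw pointer
    }
}

// Hypothetical helper: raw pointers may be null, and checking for null is
// possible in safe code because nothing is dereferenced.
fn describe(p: *const i32) -> &'static str {
    if p.is_null() {
        "null"
    } else {
        "non-null"
    }
}

fn main() {
    assert_eq!(bump_through_raw(5), 6);

    let x = 7;
    assert_eq!(describe(&x), "non-null");
    assert_eq!(describe(ptr::null()), "null");
}
```

Only the two dereferences need the `unsafe` block; creating, casting, and
null-checking the pointers are all allowed in safe code.
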
With all of these dangers, why would you ever use raw pointers? One major use
case is when interfacing with C code, as you’ll see in “Calling an Unsafe
Function or Method” on page XX. Another case is when building up safe
abstractions that the borrow checker doesn’t understand. We’ll introduce unsafe
functions and then look at an example of a safe abstraction that uses unsafe
code.

### Calling an Unsafe Function or Method

The second type of operation you can perform in an unsafe block is calling
unsafe functions. Unsafe functions and methods look exactly like regular
functions and methods, but they have an extra `unsafe` before the rest of the
definition. The `unsafe` keyword in this context indicates the function has
requirements we need to uphold when we call this function, because Rust can’t
guarantee we’ve met these requirements. By calling an unsafe function within an
`unsafe` block, we’re saying that we’ve read this function’s documentation and
we take responsibility for upholding the function’s contracts.

Here is an unsafe function named `dangerous` that doesn’t do anything in its
body:

```
unsafe fn dangerous() {}

unsafe {
    dangerous();
}
```

We must call the `dangerous` function within a separate `unsafe` block. If we
try to call `dangerous` without the `unsafe` block, we’ll get an error:

```
error[E0133]: call to unsafe function is unsafe and requires
unsafe function or block
 --> src/main.rs:4:5
  |
4 |     dangerous();
  |     ^^^^^^^^^^^ call to unsafe function
  |
  = note: consult the function's documentation for information on
how to avoid undefined behavior
```

With the `unsafe` block, we’re asserting to Rust that we’ve read the function’s
documentation, we understand how to use it properly, and we’ve verified that
we’re fulfilling the contract of the function.

Bodies of unsafe functions are effectively `unsafe` blocks, so to perform other
unsafe operations within an unsafe function, we don’t need to add another
`unsafe` block.

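To make the idea of a function “contract” concrete, here is a small sketch of
an unsafe function that documents its requirements in a `# Safety` section,
together with a caller that states why the requirements hold. The function
`read_value` is hypothetical. The inner `unsafe` block is not required by the
rule just described; it is kept here only because recent Rust editions lint on
unsafe operations written bare inside an unsafe function.

```rust
/// Reads the `i32` that `ptr` points to.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and point to an initialized
/// `i32` that stays valid for reads for the duration of the call.
unsafe fn read_value(ptr: *const i32) -> i32 {
    // Dereferencing a raw pointer is one of the unsafe superpowers.
    unsafe { *ptr }
}

fn main() {
    let x = 42;

    // SAFETY: `&x` coerces to a valid, aligned pointer to a live `i32`.
    let value = unsafe { read_value(&x) };

    println!("value is: {value}");
}
```

The `unsafe` block at the call site is where the caller takes on the
responsibility described in the function’s documentation.
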
#### Creating a Safe Abstraction over Unsafe Code

Just because a function contains unsafe code doesn’t mean we need to mark the
entire function as unsafe. In fact, wrapping unsafe code in a safe function is
a common abstraction. As an example, let’s study the `split_at_mut` function
from the standard library, which requires some unsafe code. We’ll explore how
we might implement it. This safe method is defined on mutable slices: it takes
one slice and makes it two by splitting the slice at the index given as an
argument. Listing 19-4 shows how to use `split_at_mut`.

```
let mut v = vec![1, 2, 3, 4, 5, 6];

let r = &mut v[..];

let (a, b) = r.split_at_mut(3);

assert_eq!(a, &mut [1, 2, 3]);
assert_eq!(b, &mut [4, 5, 6]);
```

Listing 19-4: Using the safe `split_at_mut` function

We can’t implement this function using only safe Rust. An attempt might look
something like Listing 19-5, which won’t compile. For simplicity, we’ll
implement `split_at_mut` as a function rather than a method and only for slices
of `i32` values rather than for a generic type `T`.

```
fn split_at_mut(
    values: &mut [i32],
    mid: usize,
) -> (&mut [i32], &mut [i32]) {
    let len = values.len();

    assert!(mid <= len);

    (&mut values[..mid], &mut values[mid..])
}
```

Listing 19-5: An attempted implementation of `split_at_mut` using only safe Rust

This function first gets the total length of the slice. Then it asserts that
the index given as a parameter is within the slice by checking whether it’s
less than or equal to the length. The assertion means that if we pass an index
that is greater than the length to split the slice at, the function will panic
before it attempts to use that index.

Then we return two mutable slices in a tuple: one from the start of the
original slice to the `mid` index and another from `mid` to the end of the
slice.

When we try to compile the code in Listing 19-5, we’ll get an error:

```
error[E0499]: cannot borrow `*values` as mutable more than once at a time
 --> src/main.rs:9:31
  |
2 |     values: &mut [i32],
  |             - let's call the lifetime of this reference `'1`
...
9 |     (&mut values[..mid], &mut values[mid..])
  |     --------------------------^^^^^^--------
  |     |     |                   |
  |     |     |                   second mutable borrow occurs here
  |     |     first mutable borrow occurs here
  |     returning this value requires that `*values` is borrowed for `'1`
```

Rust’s borrow checker can’t understand that we’re borrowing different parts of
the slice; it only knows that we’re borrowing from the same slice twice.
Borrowing different parts of a slice is fundamentally okay because the two
slices aren’t overlapping, but Rust isn’t smart enough to know this. When we
know code is okay, but Rust doesn’t, it’s time to reach for unsafe code.

Listing 19-6 shows how to use an `unsafe` block, a raw pointer, and some calls
to unsafe functions to make the implementation of `split_at_mut` work.

```
use std::slice;

fn split_at_mut(
    values: &mut [i32],
    mid: usize,
) -> (&mut [i32], &mut [i32]) {
    let len = values.len(); // [1]
    let ptr = values.as_mut_ptr(); // [2]

    assert!(mid <= len); // [3]

    unsafe { // [4]
        (
            slice::from_raw_parts_mut(ptr, mid), // [5]
            slice::from_raw_parts_mut(ptr.add(mid), len - mid), // [6]
        )
    }
}
```

Listing 19-6: Using unsafe code in the implementation of the `split_at_mut`
function

Recall from “The Slice Type” on page XX that a slice is a pointer to some data
and the length of the slice. We use the `len` method to get the length of a
slice [1] and the `as_mut_ptr` method to access the raw pointer of a slice [2].
In this case, because we have a mutable slice to `i32` values, `as_mut_ptr`
returns a raw pointer with the type `*mut i32`, which we’ve stored in the
variable `ptr`.

We keep the assertion that the `mid` index is within the slice [3]. Then we get
to the unsafe code [4]: the `slice::from_raw_parts_mut` function takes a raw
pointer and a length, and it creates a slice. We use it to create a slice that
starts from `ptr` and is `mid` items long [5]. Then we call the `add` method on
`ptr` with `mid` as an argument to get a raw pointer that starts at `mid`, and
we create a slice using that pointer and the remaining number of items after
`mid` as the length [6].

The function `slice::from_raw_parts_mut` is unsafe because it takes a raw
pointer and must trust that this pointer is valid. The `add` method on raw
pointers is also unsafe because it must trust that the offset location is also
a valid pointer. Therefore, we had to put an `unsafe` block around our calls to
`slice::from_raw_parts_mut` and `add` so we could call them. By looking at the
code and by adding the assertion that `mid` must be less than or equal to
`len`, we can tell that all the raw pointers used within the `unsafe` block
will be valid pointers to data within the slice. This is an acceptable and
appropriate use of `unsafe`.

Note that we don’t need to mark the resultant `split_at_mut` function as
`unsafe`, and we can call this function from safe Rust. We’ve created a safe
abstraction to the unsafe code with an implementation of the function that uses
`unsafe` code in a safe way, because it creates only valid pointers from the
data this function has access to.

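The same reasoning carries over to a generic element type. The sketch below is
a hypothetical generic variant, named `split_at_mut_generic` here to avoid
confusion with the standard library method, along with a caller that uses it
entirely from safe code. It is an illustration of the technique under the same
`mid <= len` assumption, not the standard library’s actual implementation.

```rust
use std::slice;

// Hypothetical generic variant of the `i32`-only function discussed above.
fn split_at_mut_generic<T>(values: &mut [T], mid: usize) -> (&mut [T], &mut [T]) {
    let len = values.len();
    let ptr = values.as_mut_ptr();

    // Panic before touching any raw pointer if `mid` is out of range.
    assert!(mid <= len);

    // SAFETY: `mid <= len`, so `[0, mid)` and `[mid, len)` both lie inside
    // the original slice and do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6];

    // The caller needs no `unsafe` block.
    let (a, b) = split_at_mut_generic(&mut v, 3);
    a[0] = 10;
    b[0] = 40;

    assert_eq!(v, [10, 2, 3, 40, 5, 6]);
}
```

Both halves can be mutated at the same time, which is exactly what the safe
attempt could not express to the borrow checker.
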
In contrast, the use of `slice::from_raw_parts_mut` in Listing 19-7 would
likely crash when the slice is used. This code takes an arbitrary memory
location and creates a slice 10,000 items long.

```
use std::slice;

let address = 0x01234usize;
let r = address as *mut i32;

let values: &[i32] = unsafe {
    slice::from_raw_parts_mut(r, 10000)
};
```

Listing 19-7: Creating a slice from an arbitrary memory location

We don’t own the memory at this arbitrary location, and there is no guarantee
that the slice this code creates contains valid `i32` values. Attempting to use
`values` as though it’s a valid slice results in undefined behavior.

#### Using extern Functions to Call External Code

Sometimes your Rust code might need to interact with code written in another
language. For this, Rust has the keyword `extern` that facilitates the creation
and use of a *Foreign Function Interface (FFI)*, which is a way for a
programming language to define functions and enable a different (foreign)
programming language to call those functions.

Listing 19-8 demonstrates how to set up an integration with the `abs` function
from the C standard library. Functions declared within `extern` blocks are
always unsafe to call from Rust code. The reason is that other languages don’t
enforce Rust’s rules and guarantees, and Rust can’t check them, so
responsibility falls on the programmer to ensure safety.

Filename: src/main.rs

```
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!(
            "Absolute value of -3 according to C: {}",
            abs(-3)
        );
    }
}
```

Listing 19-8: Declaring and calling an `extern` function defined in another
language

Within the `extern "C"` block, we list the names and signatures of external
functions from another language we want to call. The `"C"` part defines which
*application binary interface (ABI)* the external function uses: the ABI
defines how to call the function at the assembly level. The `"C"` ABI is the
most common and follows the C programming language’s ABI.

> ### Calling Rust Functions from Other Languages
>
> We can also use `extern` to create an interface that allows other languages
> to call Rust functions. Instead of creating a whole `extern` block, we add the
> `extern` keyword and specify the ABI to use just before the `fn` keyword for
> the relevant function. We also need to add a `#[no_mangle]` annotation to tell
> the Rust compiler not to mangle the name of this function. *Mangling* is when a
> compiler changes the name we’ve given a function to a different name that
> contains more information for other parts of the compilation process to consume
> but is less human readable. Every programming language compiler mangles names
> slightly differently, so for a Rust function to be nameable by other languages,
> we must disable the Rust compiler’s name mangling.
>
> In the following example, we make the `call_from_c` function accessible from
> C code, after it’s compiled to a shared library and linked from C:
>
> ```
> #[no_mangle]
> pub extern "C" fn call_from_c() {
>     println!("Just called a Rust function from C!");
> }
> ```
>
> This usage of `extern` does not require `unsafe`.

### Accessing or Modifying a Mutable Static Variable

In this book, we’ve not yet talked about global variables, which Rust does
support but which can be problematic with Rust’s ownership rules. If two
threads are accessing the same mutable global variable, it can cause a data
race.

In Rust, global variables are called *static* variables. Listing 19-9 shows an
example declaration and use of a static variable with a string slice as a value.

Filename: src/main.rs

```
static HELLO_WORLD: &str = "Hello, world!";

fn main() {
    println!("value is: {HELLO_WORLD}");
}
```

Listing 19-9: Defining and using an immutable static variable

Static variables are similar to constants, which we discussed in “Constants” on
page XX. The names of static variables are in `SCREAMING_SNAKE_CASE` by
convention. Static variables can only store references with the `'static`
lifetime, which means the Rust compiler can figure out the lifetime and we
aren’t required to annotate it explicitly. Accessing an immutable static
variable is safe.

A subtle difference between constants and immutable static variables is that
values in a static variable have a fixed address in memory. Using the value
will always access the same data. Constants, on the other hand, are allowed to
duplicate their data whenever they’re used. Another difference is that static
variables can be mutable. Accessing and modifying mutable static variables is
*unsafe*. Listing 19-10 shows how to declare, access, and modify a mutable
static variable named `COUNTER`.

Filename: src/main.rs

```
static mut COUNTER: u32 = 0;

fn add_to_count(inc: u32) {
    unsafe {
        COUNTER += inc;
    }
}

fn main() {
    add_to_count(3);

    unsafe {
        println!("COUNTER: {COUNTER}");
    }
}
```

Listing 19-10: Reading from or writing to a mutable static variable is unsafe.

As with regular variables, we specify mutability using the `mut` keyword. Any
code that reads from or writes to `COUNTER` must be within an `unsafe` block.
This code compiles and prints `COUNTER: 3` as we would expect because it’s
single threaded. Having multiple threads access `COUNTER` would likely result
in data races.

With mutable data that is globally accessible, it’s difficult to ensure there
are no data races, which is why Rust considers mutable static variables to be
unsafe. Where possible, it’s preferable to use the concurrency techniques and
thread-safe smart pointers we discussed in Chapter 16 so the compiler checks
that data access from different threads is done safely.

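As one possible illustration of that advice, the sketch below replaces the
`static mut` counter with an immutable static holding an `AtomicU32` from the
standard library, so no `unsafe` block is needed even when several threads
update it. The function `count_from_threads` is hypothetical, and the choice
of `Ordering::SeqCst` is simply the most conservative option rather than a
recommendation.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// An immutable static whose contents can still change through atomic
// operations; no `static mut` and no `unsafe` required.
static COUNTER: AtomicU32 = AtomicU32::new(0);

fn add_to_count(inc: u32) {
    COUNTER.fetch_add(inc, Ordering::SeqCst);
}

// Hypothetical helper: spawns `threads` threads that each add `inc`, then
// returns how much the counter grew during this call.
fn count_from_threads(threads: u32, inc: u32) -> u32 {
    let before = COUNTER.load(Ordering::SeqCst);

    let handles: Vec<_> = (0..threads)
        .map(|_| thread::spawn(move || add_to_count(inc)))
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    COUNTER.load(Ordering::SeqCst) - before
}

fn main() {
    let added = count_from_threads(4, 3);
    println!("COUNTER grew by: {added}");
}
```

Because every update goes through an atomic operation, the compiler can accept
shared access from multiple threads without any promise from the programmer.
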
### Implementing an Unsafe Trait

We can use `unsafe` to implement an unsafe trait. A trait is unsafe when at
least one of its methods has some invariant that the compiler can’t verify. We
declare that a trait is `unsafe` by adding the `unsafe` keyword before `trait`
and marking the implementation of the trait as `unsafe` too, as shown in
Listing 19-11.

```
unsafe trait Foo {
    // methods go here
}

unsafe impl Foo for i32 {
    // method implementations go here
}
```

Listing 19-11: Defining and implementing an unsafe trait

By using `unsafe impl`, we’re promising that we’ll uphold the invariants that
the compiler can’t verify.

As an example, recall the `Send` and `Sync` marker traits we discussed in
“Extensible Concurrency with the Send and Sync Traits” on page XX: the compiler
implements these traits automatically if our types are composed entirely of
`Send` and `Sync` types. If we implement a type that contains a type that is
not `Send` or `Sync`, such as raw pointers, and we want to mark that type as
`Send` or `Sync`, we must use `unsafe`. Rust can’t verify that our type upholds
the guarantees that it can be safely sent across threads or accessed from
multiple threads; therefore, we need to do those checks manually and indicate
as such with `unsafe`.

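Here is a small sketch of that situation. The wrapper type `RawHandle` and the
function `increment_on_other_thread` are hypothetical; the `unsafe impl Send`
is justified only by the assumptions spelled out in the comments, namely that
the pointer uniquely owns a heap allocation and is used by one thread at a
time.

```rust
use std::thread;

// Hypothetical wrapper around a raw pointer. Because raw pointers are not
// `Send`, the compiler does not implement `Send` for this type automatically.
struct RawHandle(*mut i32);

// SAFETY (assumption of this sketch): the pointer always comes from
// `Box::into_raw`, is uniquely owned by the handle, and is only ever used by
// one thread at a time.
unsafe impl Send for RawHandle {}

fn increment_on_other_thread(start: i32) -> i32 {
    let handle = RawHandle(Box::into_raw(Box::new(start)));

    // Moving `handle` into the closure compiles only because of the
    // `unsafe impl Send` above.
    let handle = thread::spawn(move || {
        // SAFETY: no other thread touches the allocation while we write.
        unsafe { *handle.0 += 1 };
        handle
    })
    .join()
    .unwrap();

    // SAFETY: the pointer came from `Box::into_raw` and has not been freed.
    let boxed = unsafe { Box::from_raw(handle.0) };
    *boxed
}

fn main() {
    assert_eq!(increment_on_other_thread(41), 42);
}
```

If the `unsafe impl Send` line is removed, the call to `thread::spawn` is
rejected, which shows that the promise is ours to make rather than something
the compiler verified.
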
### Accessing Fields of a Union

The final action that works only with `unsafe` is accessing fields of a union.
A `union` is similar to a `struct`, but only one declared field is used in a
particular instance at one time. Unions are primarily used to interface with
unions in C code. Accessing union fields is unsafe because Rust can’t guarantee
the type of the data currently being stored in the union instance. You can
learn more about unions in the Rust Reference at
*https://doc.rust-lang.org/reference/items/unions.html*.

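For a concrete picture, the following sketch declares a hypothetical union
`IntOrFloat` with two four-byte fields and reads back the bits of an `f32`
through the `u32` field. The `#[repr(C)]` attribute is included on the
assumption that the union is meant to match a C layout.

```rust
// Hypothetical union for illustration; both fields share the same storage.
#[repr(C)]
union IntOrFloat {
    i: u32,
    f: f32,
}

fn float_bits(value: f32) -> u32 {
    // Creating a union value is safe: we pick exactly one field to initialize.
    let u = IntOrFloat { f: value };

    // Reading a field is unsafe because Rust can't know which field was
    // written last.
    // SAFETY: both fields are four bytes, and every bit pattern is a valid
    // `u32`.
    unsafe { u.i }
}

fn main() {
    assert_eq!(float_bits(1.0), 0x3f80_0000);
    assert_eq!(float_bits(1.0), 1.0f32.to_bits());
}
```
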
### When to Use Unsafe Code

Using `unsafe` to use one of the five superpowers just discussed isn’t wrong or
even frowned upon, but it is trickier to get `unsafe` code correct because the
compiler can’t help uphold memory safety. When you have a reason to use
`unsafe` code, you can do so, and having the explicit `unsafe` annotation makes
it easier to track down the source of problems when they occur.

## Advanced Traits

We first covered traits in “Traits: Defining Shared Behavior” on page XX, but
we didn’t discuss the more advanced details. Now that you know more about Rust,
we can get into the nitty-gritty.

### Associated Types

*Associated types* connect a type placeholder with a trait such that the trait
method definitions can use these placeholder types in their signatures. The
implementor of a trait will specify the concrete type to be used instead of the
placeholder type for the particular implementation. That way, we can define a
trait that uses some types without needing to know exactly what those types are
until the trait is implemented.

We’ve described most of the advanced features in this chapter as being rarely
needed. Associated types are somewhere in the middle: they’re used more rarely
than features explained in the rest of the book but more commonly than many of
the other features discussed in this chapter.

One example of a trait with an associated type is the `Iterator` trait that the
standard library provides. The associated type is named `Item` and stands in
for the type of the values the type implementing the `Iterator` trait is
iterating over. The definition of the `Iterator` trait is as shown in Listing
19-12.

```
pub trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
}
```

Listing 19-12: The definition of the `Iterator` trait that has an associated
type `Item`

The type `Item` is a placeholder, and the `next` method’s definition shows that
it will return values of type `Option<Self::Item>`. Implementors of the
`Iterator` trait will specify the concrete type for `Item`, and the `next`
method will return an `Option` containing a value of that concrete type.

Associated types might seem like a similar concept to generics, in that the
latter allow us to define a function without specifying what types it can
handle. To examine the difference between the two concepts, we’ll look at an
implementation of the `Iterator` trait on a type named `Counter` that specifies
the `Item` type is `u32`:

Filename: src/lib.rs

```
impl Iterator for Counter {
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        --snip--
```

This syntax seems comparable to that of generics. So why not just define the
`Iterator` trait with generics, as shown in Listing 19-13?

```
pub trait Iterator<T> {
    fn next(&mut self) -> Option<T>;
}
```

Listing 19-13: A hypothetical definition of the `Iterator` trait using generics

The difference is that when using generics, as in Listing 19-13, we must
annotate the types in each implementation; because we can also implement
`Iterator<String> for Counter` or any other type, we could have multiple
implementations of `Iterator` for `Counter`. In other words, when a trait has a
generic parameter, it can be implemented for a type multiple times, changing
the concrete types of the generic type parameters each time. When we use the
`next` method on `Counter`, we would have to provide type annotations to
indicate which implementation of `Iterator` we want to use.

With associated types, we don’t need to annotate types because we can’t
implement a trait on a type multiple times. In Listing 19-12 with the
definition that uses associated types, we can choose what the type of `Item`
will be only once because there can be only one `impl Iterator for Counter`. We
don’t have to specify that we want an iterator of `u32` values everywhere we
call `next` on `Counter`.

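A complete, runnable version helps show that no annotation is needed at the
call site. The `Counter` below is a sketch under the assumption that it counts
from 1 to 5; the struct’s field and the `new` constructor are filled in here
for illustration and are not taken from the elided listing above.

```rust
// Assumed definition for illustration: a counter that yields 1 through 5.
struct Counter {
    count: u32,
}

impl Counter {
    fn new() -> Counter {
        Counter { count: 0 }
    }
}

impl Iterator for Counter {
    // The one and only choice of `Item` for `Counter`.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.count < 5 {
            self.count += 1;
            Some(self.count)
        } else {
            None
        }
    }
}

fn main() {
    let mut counter = Counter::new();

    // No type annotation is required: `next` can only return `Option<u32>`.
    let first = counter.next();
    assert_eq!(first, Some(1));

    let rest: Vec<u32> = counter.collect();
    assert_eq!(rest, vec![2, 3, 4, 5]);
}
```

With the generic definition from Listing 19-13, each call to `next` could
refer to any of several implementations, so the caller would have to say which
one it means.
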
Associated types also become part of the trait’s contract: implementors of the
trait must provide a type to stand in for the associated type placeholder.
Associated types often have a name that describes how the type will be used,
and documenting the associated type in the API documentation is a good practice.

### Default Generic Type Parameters and Operator Overloading

When we use generic type parameters, we can specify a default concrete type for
the generic type. This eliminates the need for implementors of the trait to
specify a concrete type if the default type works. You specify a default type
when declaring a generic type with the `<PlaceholderType=ConcreteType>`
syntax.

A great example of a situation where this technique is useful is with *operator
overloading*, in which you customize the behavior of an operator (such as `+`)
in particular situations.

Rust doesn’t allow you to create your own operators or overload arbitrary
operators. But you can overload the operations and corresponding traits listed
in `std::ops` by implementing the traits associated with the operator. For
example, in Listing 19-14 we overload the `+` operator to add two `Point`
instances together. We do this by implementing the `Add` trait on a `Point`
struct.

Filename: src/main.rs

```
use std::ops::Add;

#[derive(Debug, Copy, Clone, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

impl Add for Point {
    type Output = Point;

    fn add(self, other: Point) -> Point {
        Point {
            x: self.x + other.x,
            y: self.y + other.y,
        }
    }
}

fn main() {
    assert_eq!(
        Point { x: 1, y: 0 } + Point { x: 2, y: 3 },
        Point { x: 3, y: 3 }
    );
}
```

Listing 19-14: Implementing the `Add` trait to overload the `+` operator for
`Point` instances

The `add` method adds the `x` values of two `Point` instances and the `y`
|
||
values of two `Point` instances to create a new `Point`. The `Add` trait has an
|
||
associated type named `Output` that determines the type returned from the `add`
|
||
method.
|
||
|
||
The default generic type in this code is within the `Add` trait. Here is its
|
||
definition:
|
||
|
||
```
trait Add<Rhs=Self> {
    type Output;

    fn add(self, rhs: Rhs) -> Self::Output;
}
```
|
||
|
||
This code should look generally familiar: a trait with one method and an
|
||
associated type. The new part is `Rhs=Self`: this syntax is called *default
|
||
type parameters*. The `Rhs` generic type parameter (short for “right-hand
|
||
side”) defines the type of the `rhs` parameter in the `add` method. If we don’t
|
||
specify a concrete type for `Rhs` when we implement the `Add` trait, the type
|
||
of `Rhs` will default to `Self`, which will be the type we’re implementing
|
||
`Add` on.
|
||
|
||
When we implemented `Add` for `Point`, we used the default for `Rhs` because we
|
||
wanted to add two `Point` instances. Let’s look at an example of implementing
|
||
the `Add` trait where we want to customize the `Rhs` type rather than using the
|
||
default.
|
||
|
||
We have two structs, `Millimeters` and `Meters`, holding values in different
|
||
units. This thin wrapping of an existing type in another struct is known as the
|
||
*newtype pattern*, which we describe in more detail in “Using the Newtype
Pattern to Implement External Traits” on page XX. We want to
|
||
add values in millimeters to values in meters and have the implementation of
|
||
`Add` do the conversion correctly. We can implement `Add` for `Millimeters`
|
||
with `Meters` as the `Rhs`, as shown in Listing 19-15.
|
||
|
||
Filename: src/lib.rs
|
||
|
||
```
use std::ops::Add;

struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}
```
|
||
|
||
Listing 19-15: Implementing the `Add` trait on `Millimeters` to add
|
||
`Millimeters` and `Meters`
|
||
|
||
To add `Millimeters` and `Meters`, we specify `impl Add<Meters>` to set the
|
||
value of the `Rhs` type parameter instead of using the default of `Self`.
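As a usage sketch for Listing 19-15, the following adds a `main` function that exercises the implementation. The derived `Debug` and `PartialEq` implementations are our additions, included only so the result can be checked with `assert_eq!`.

```rust
use std::ops::Add;

#[derive(Debug, PartialEq)]
struct Millimeters(u32);
struct Meters(u32);

impl Add<Meters> for Millimeters {
    type Output = Millimeters;

    fn add(self, other: Meters) -> Millimeters {
        Millimeters(self.0 + (other.0 * 1000))
    }
}

fn main() {
    // `Rhs` is `Meters` here, so the two operands have different types.
    let total = Millimeters(500) + Meters(2);
    assert_eq!(total, Millimeters(2500));

    // The reverse order, `Meters(2) + Millimeters(500)`, would not compile
    // because there is no `impl Add<Millimeters> for Meters`.
}
```

Note that the implementation is not symmetric: supporting both operand orders would require a second `impl` block.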
|
||
|
||
You’ll use default type parameters in two main ways:
|
||
|
||
1. To extend a type without breaking existing code
|
||
2. To allow customization in specific cases most users won’t need
|
||
|
||
The standard library’s `Add` trait is an example of the second purpose:
|
||
usually, you’ll add two like types, but the `Add` trait provides the ability to
|
||
customize beyond that. Using a default type parameter in the `Add` trait
|
||
definition means you don’t have to specify the extra parameter most of the
|
||
time. In other words, a bit of implementation boilerplate isn’t needed, making
|
||
it easier to use the trait.
|
||
|
||
The first purpose is similar to the second but in reverse: if you want to add a
|
||
type parameter to an existing trait, you can give it a default to allow
|
||
extension of the functionality of the trait without breaking the existing
|
||
implementation code.
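To make the first purpose concrete, here is a minimal sketch. The `Scale` trait and the `Length` and `Ratio` types are hypothetical names invented for this illustration. We assume an earlier version of `Scale` had no type parameter and always took a `u32` factor.

```rust
// Version 2 of a hypothetical trait. Version 1 was declared as
// `trait Scale` with `fn scale(&self, factor: u32) -> Self;`.
trait Scale<Factor = u32> {
    fn scale(&self, factor: Factor) -> Self;
}

#[derive(Debug, PartialEq)]
struct Length(u32);

// Written against version 1. It still compiles unchanged because
// `Factor` defaults to `u32`.
impl Scale for Length {
    fn scale(&self, factor: u32) -> Self {
        Length(self.0 * factor)
    }
}

#[derive(Debug, PartialEq)]
struct Ratio(f64);

// Newer code can opt in to a different factor type.
impl Scale<f64> for Ratio {
    fn scale(&self, factor: f64) -> Self {
        Ratio(self.0 * factor)
    }
}

fn main() {
    assert_eq!(Length(3).scale(4), Length(12));
    assert_eq!(Ratio(1.5).scale(2.0), Ratio(3.0));
}
```

Without the default, adding the `Factor` parameter would force every existing `impl Scale for ...` block to be rewritten as `impl Scale<u32> for ...`.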
|
||
|
||
### Disambiguating Between Methods with the Same Name
|
||
|
||
Nothing in Rust prevents a trait from having a method with the same name as
|
||
another trait’s method, nor does Rust prevent you from implementing both traits
|
||
on one type. It’s also possible to implement a method directly on the type with
|
||
the same name as methods from traits.
|
||
|
||
When calling methods with the same name, you’ll need to tell Rust which one you
|
||
want to use. Consider the code in Listing 19-16 where we’ve defined two traits,
|
||
`Pilot` and `Wizard`, that both have a method called `fly`. We then implement
|
||
both traits on a type `Human` that already has a method named `fly` implemented
|
||
on it. Each `fly` method does something different.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
trait Pilot {
    fn fly(&self);
}

trait Wizard {
    fn fly(&self);
}

struct Human;

impl Pilot for Human {
    fn fly(&self) {
        println!("This is your captain speaking.");
    }
}

impl Wizard for Human {
    fn fly(&self) {
        println!("Up!");
    }
}

impl Human {
    fn fly(&self) {
        println!("*waving arms furiously*");
    }
}
```
|
||
|
||
Listing 19-16: Two traits are defined to have a `fly` method and are
|
||
implemented on the `Human` type, and a `fly` method is implemented on `Human`
|
||
directly.
|
||
|
||
When we call `fly` on an instance of `Human`, the compiler defaults to calling
|
||
the method that is directly implemented on the type, as shown in Listing 19-17.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
fn main() {
    let person = Human;
    person.fly();
}
```
|
||
|
||
Listing 19-17: Calling `fly` on an instance of `Human`
|
||
|
||
Running this code will print `*waving arms furiously*`, showing that Rust
|
||
called the `fly` method implemented on `Human` directly.
|
||
|
||
To call the `fly` methods from either the `Pilot` trait or the `Wizard` trait,
|
||
we need to use more explicit syntax to specify which `fly` method we mean.
|
||
Listing 19-18 demonstrates this syntax.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
fn main() {
    let person = Human;
    Pilot::fly(&person);
    Wizard::fly(&person);
    person.fly();
}
```
|
||
|
||
Listing 19-18: Specifying which trait’s `fly` method we want to call
|
||
|
||
Specifying the trait name before the method name clarifies to Rust which
|
||
implementation of `fly` we want to call. We could also write
|
||
`Human::fly(&person)`, which is equivalent to the `person.fly()` that we used
|
||
in Listing 19-18, but this is a bit longer to write if we don’t need to
|
||
disambiguate.
|
||
|
||
Running this code prints the following:
|
||
|
||
```
This is your captain speaking.
Up!
*waving arms furiously*
```
|
||
|
||
Because the `fly` method takes a `self` parameter, if we had two *types* that
|
||
both implement one *trait*, Rust could figure out which implementation of a
|
||
trait to use based on the type of `self`.
|
||
|
||
However, associated functions that are not methods don’t have a `self`
|
||
parameter. When there are multiple types or traits that define non-method
|
||
functions with the same function name, Rust doesn’t always know which type you
|
||
mean unless you use fully qualified syntax. For example, in Listing 19-19 we
|
||
create a trait for an animal shelter that wants to name all baby dogs Spot. We
|
||
make an `Animal` trait with an associated non-method function `baby_name`. The
|
||
`Animal` trait is implemented for the struct `Dog`, on which we also provide an
|
||
associated non-method function `baby_name` directly.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
trait Animal {
    fn baby_name() -> String;
}

struct Dog;

impl Dog {
    fn baby_name() -> String {
        String::from("Spot")
    }
}

impl Animal for Dog {
    fn baby_name() -> String {
        String::from("puppy")
    }
}

fn main() {
    println!("A baby dog is called a {}", Dog::baby_name());
}
```
|
||
|
||
Listing 19-19: A trait with an associated function and a type with an
|
||
associated function of the same name that also implements the trait
|
||
|
||
We implement the code for naming all puppies Spot in the `baby_name` associated
|
||
function that is defined on `Dog`. The `Dog` type also implements the trait
|
||
`Animal`, which describes characteristics that all animals have. Baby dogs are
|
||
called puppies, and that is expressed in the implementation of the `Animal`
|
||
trait on `Dog` in the `baby_name` function associated with the `Animal` trait.
|
||
|
||
In `main`, we call the `Dog::baby_name` function, which calls the associated
|
||
function defined on `Dog` directly. This code prints the following:
|
||
|
||
```
A baby dog is called a Spot
```
|
||
|
||
This output isn’t what we wanted. We want to call the `baby_name` function that
|
||
is part of the `Animal` trait that we implemented on `Dog` so the code prints
|
||
`A baby dog is called a puppy`. The technique of specifying the trait name that
|
||
we used in Listing 19-18 doesn’t help here; if we change `main` to the code in
|
||
Listing 19-20, we’ll get a compilation error.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
fn main() {
    println!("A baby dog is called a {}", Animal::baby_name());
}
```
|
||
|
||
Listing 19-20: Attempting to call the `baby_name` function from the `Animal`
|
||
trait, but Rust doesn’t know which implementation to use
|
||
|
||
Because `Animal::baby_name` doesn’t have a `self` parameter, and there could be
|
||
other types that implement the `Animal` trait, Rust can’t figure out which
|
||
implementation of `Animal::baby_name` we want. We’ll get this compiler error:
|
||
|
||
```
error[E0283]: type annotations needed
  --> src/main.rs:20:43
   |
20 |     println!("A baby dog is called a {}", Animal::baby_name());
   |                                           ^^^^^^^^^^^^^^^^^ cannot infer type
   |
   = note: cannot satisfy `_: Animal`
```
|
||
|
||
To disambiguate and tell Rust that we want to use the implementation of
|
||
`Animal` for `Dog` as opposed to the implementation of `Animal` for some other
|
||
type, we need to use fully qualified syntax. Listing 19-21 demonstrates how to
|
||
use fully qualified syntax.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
fn main() {
    println!(
        "A baby dog is called a {}",
        <Dog as Animal>::baby_name()
    );
}
```
|
||
|
||
Listing 19-21: Using fully qualified syntax to specify that we want to call the
|
||
`baby_name` function from the `Animal` trait as implemented on `Dog`
|
||
|
||
We’re providing Rust with a type annotation within the angle brackets, which
|
||
indicates we want to call the `baby_name` function from the `Animal` trait as
|
||
implemented on `Dog` by saying that we want to treat the `Dog` type as an
|
||
`Animal` for this function call. This code will now print what we want:
|
||
|
||
```
A baby dog is called a puppy
```
|
||
|
||
In general, fully qualified syntax is defined as follows:
|
||
|
||
```
<Type as Trait>::function(receiver_if_method, next_arg, ...);
```
|
||
|
||
For associated functions that aren’t methods, there would not be a `receiver`:
|
||
there would only be the list of other arguments. You could use fully qualified
|
||
syntax everywhere that you call functions or methods. However, you’re allowed
|
||
to omit any part of this syntax that Rust can figure out from other information
|
||
in the program. You only need to use this more verbose syntax in cases where
|
||
there are multiple implementations that use the same name and Rust needs help
|
||
to identify which implementation you want to call.
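The general form also applies to methods. The sketch below revisits the `Human` example and shows the fully qualified calls next to their shorter equivalents. As an assumption made only for this illustration, each `fly` method returns a `String` instead of printing, so the calls can be compared with `assert_eq!`.

```rust
trait Pilot {
    fn fly(&self) -> String;
}

trait Wizard {
    fn fly(&self) -> String;
}

struct Human;

impl Pilot for Human {
    fn fly(&self) -> String {
        String::from("This is your captain speaking.")
    }
}

impl Wizard for Human {
    fn fly(&self) -> String {
        String::from("Up!")
    }
}

impl Human {
    fn fly(&self) -> String {
        String::from("*waving arms furiously*")
    }
}

fn main() {
    let person = Human;

    // Fully qualified form: both the type and the trait are spelled out.
    assert_eq!(<Human as Pilot>::fly(&person), "This is your captain speaking.");
    // The type can be omitted because Rust infers it from `&person`.
    assert_eq!(Pilot::fly(&person), "This is your captain speaking.");

    assert_eq!(<Human as Wizard>::fly(&person), "Up!");

    // The inherent method is what plain method-call syntax resolves to.
    assert_eq!(Human::fly(&person), person.fly());
}
```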
|
||
|
||
### Using Supertraits
|
||
|
||
Sometimes you might write a trait definition that depends on another trait: for
|
||
a type to implement the first trait, you want to require that type to also
|
||
implement the second trait. You would do this so that your trait definition can
|
||
make use of the associated items of the second trait. The trait your trait
|
||
definition is relying on is called a *supertrait* of your trait.
|
||
|
||
For example, let’s say we want to make an `OutlinePrint` trait with an
|
||
`outline_print` method that will print a given value formatted so that it’s
|
||
framed in asterisks. That is, given a `Point` struct that implements the
|
||
standard library trait `Display` to result in `(x, y)`, when we call
|
||
`outline_print` on a `Point` instance that has `1` for `x` and `3` for `y`, it
|
||
should print the following:
|
||
|
||
```
**********
*        *
* (1, 3) *
*        *
**********
```
|
||
|
||
In the implementation of the `outline_print` method, we want to use the
|
||
`Display` trait’s functionality. Therefore, we need to specify that the
|
||
`OutlinePrint` trait will work only for types that also implement `Display` and
|
||
provide the functionality that `OutlinePrint` needs. We can do that in the
|
||
trait definition by specifying `OutlinePrint: Display`. This technique is
|
||
similar to adding a trait bound to the trait. Listing 19-22 shows an
|
||
implementation of the `OutlinePrint` trait.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
use std::fmt;

trait OutlinePrint: fmt::Display {
    fn outline_print(&self) {
        let output = self.to_string();
        let len = output.len();
        println!("{}", "*".repeat(len + 4));
        println!("*{}*", " ".repeat(len + 2));
        println!("* {} *", output);
        println!("*{}*", " ".repeat(len + 2));
        println!("{}", "*".repeat(len + 4));
    }
}
```
|
||
|
||
Listing 19-22: Implementing the `OutlinePrint` trait that requires the
|
||
functionality from `Display`
|
||
|
||
Because we’ve specified that `OutlinePrint` requires the `Display` trait, we
|
||
can use the `to_string` function that is automatically implemented for any type
|
||
that implements `Display`. If we tried to use `to_string` without adding a
|
||
colon and specifying the `Display` trait after the trait name, we’d get an
|
||
error saying that no method named `to_string` was found for the type `&Self` in
|
||
the current scope.
|
||
|
||
Let’s see what happens when we try to implement `OutlinePrint` on a type that
|
||
doesn’t implement `Display`, such as the `Point` struct:
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
struct Point {
    x: i32,
    y: i32,
}

impl OutlinePrint for Point {}
```
|
||
|
||
We get an error saying that `Display` is required but not implemented:
|
||
|
||
```
error[E0277]: `Point` doesn't implement `std::fmt::Display`
  --> src/main.rs:20:6
   |
20 | impl OutlinePrint for Point {}
   |      ^^^^^^^^^^^^ `Point` cannot be formatted with the default formatter
   |
   = help: the trait `std::fmt::Display` is not implemented for `Point`
   = note: in format strings you may be able to use `{:?}` (or {:#?} for pretty-print) instead
note: required by a bound in `OutlinePrint`
  --> src/main.rs:3:21
   |
3  | trait OutlinePrint: fmt::Display {
   |                     ^^^^^^^^^^^^ required by this bound in `OutlinePrint`
```
|
||
|
||
To fix this, we implement `Display` on `Point` and satisfy the constraint that
|
||
`OutlinePrint` requires, like so:
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
use std::fmt;

impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}
```
|
||
|
||
Then, implementing the `OutlinePrint` trait on `Point` will compile
|
||
successfully, and we can call `outline_print` on a `Point` instance to display
|
||
it within an outline of asterisks.
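Putting the pieces together, the following self-contained sketch shows a supertrait in use from end to end. It is a variation rather than the listing itself: the hypothetical `OutlineString` trait builds the outline as a `String` instead of printing it, which makes the result easy to check.

```rust
use std::fmt;

// Hypothetical variation on `OutlinePrint`: `fmt::Display` is the supertrait,
// so the default method may call `to_string` on `self`.
trait OutlineString: fmt::Display {
    fn outline_string(&self) -> String {
        let output = self.to_string();
        let len = output.len();
        let border = "*".repeat(len + 4);
        let padding = format!("*{}*", " ".repeat(len + 2));
        format!("{border}\n{padding}\n* {output} *\n{padding}\n{border}")
    }
}

struct Point {
    x: i32,
    y: i32,
}

// Satisfies the supertrait requirement.
impl fmt::Display for Point {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "({}, {})", self.x, self.y)
    }
}

// An empty `impl` is enough because the method has a default body.
impl OutlineString for Point {}

fn main() {
    let p = Point { x: 1, y: 3 };
    println!("{}", p.outline_string());
}
```

Removing the `impl fmt::Display for Point` block from this sketch should reproduce the kind of E0277 error described earlier in this section.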
|
||
|
||
### Using the Newtype Pattern to Implement External Traits
|
||
|
||
In “Implementing a Trait on a Type” on page XX, we mentioned the orphan rule
|
||
that states we’re only allowed to implement a trait on a type if either the
|
||
trait or the type, or both, are local to our crate. It’s possible to get around
|
||
this restriction using the *newtype pattern*, which involves creating a new
|
||
type in a tuple struct. (We covered tuple structs in “Using Tuple Structs
|
||
Without Named Fields to Create Different Types” on page XX.) The tuple struct
|
||
will have one field and be a thin wrapper around the type for which we want to
|
||
implement a trait. Then the wrapper type is local to our crate, and we can
|
||
implement the trait on the wrapper. *Newtype* is a term that originates from
|
||
the Haskell programming language. There is no runtime performance penalty for
|
||
using this pattern, and the wrapper type is elided at compile time.
|
||
|
||
As an example, let’s say we want to implement `Display` on `Vec<T>`, which the
|
||
orphan rule prevents us from doing directly because the `Display` trait and the
|
||
`Vec<T>` type are defined outside our crate. We can make a `Wrapper` struct
|
||
that holds an instance of `Vec<T>`; then we can implement `Display` on
|
||
`Wrapper` and use the `Vec<T>` value, as shown in Listing 19-23.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
use std::fmt;

struct Wrapper(Vec<String>);

impl fmt::Display for Wrapper {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        write!(f, "[{}]", self.0.join(", "))
    }
}

fn main() {
    let w = Wrapper(vec![
        String::from("hello"),
        String::from("world"),
    ]);
    println!("w = {w}");
}
```
|
||
|
||
Listing 19-23: Creating a `Wrapper` type around `Vec<String>` to implement
|
||
`Display`
|
||
|
||
The implementation of `Display` uses `self.0` to access the inner `Vec<T>`
|
||
because `Wrapper` is a tuple struct and `Vec<T>` is the item at index 0 in the
|
||
tuple. Then we can use the functionality of the `Display` trait on `Wrapper`.
|
||
|
||
The downside of using this technique is that `Wrapper` is a new type, so it
|
||
doesn’t have the methods of the value it’s holding. We would have to implement
|
||
all the methods of `Vec<T>` directly on `Wrapper` such that the methods
|
||
delegate to `self.0`, which would allow us to treat `Wrapper` exactly like a
|
||
`Vec<T>`. If we wanted the new type to have every method the inner type has,
|
||
implementing the `Deref` trait on the `Wrapper` to return the inner type would
|
||
be a solution (we discussed implementing the `Deref` trait in “Treating Smart
|
||
Pointers Like Regular References with Deref” on page XX). If we didn’t want the
|
||
`Wrapper` type to have all the methods of the inner type—for example, to
|
||
restrict the `Wrapper` type’s behavior—we would have to implement just the
|
||
methods we do want manually.
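The `Deref` approach can be sketched as follows. This is a minimal illustration, assuming the same `Wrapper` around `Vec<String>` as in Listing 19-23; only read-only access is covered here.

```rust
use std::ops::Deref;

struct Wrapper(Vec<String>);

impl Deref for Wrapper {
    type Target = Vec<String>;

    // Hand out a reference to the inner vector.
    fn deref(&self) -> &Self::Target {
        &self.0
    }
}

fn main() {
    let w = Wrapper(vec![String::from("hello"), String::from("world")]);

    // None of these methods are defined on `Wrapper`; deref coercion
    // finds them on `Vec<String>` and on the slice it derefs to.
    assert_eq!(w.len(), 2);
    assert_eq!(w.join(", "), "hello, world");
    assert!(w.iter().any(|s| s == "world"));
}
```

Methods that take `&mut self`, such as `push`, would additionally need a `DerefMut` implementation.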
|
||
|
||
This newtype pattern is also useful even when traits are not involved. Let’s
|
||
switch focus and look at some advanced ways to interact with Rust’s type system.
|
||
|
||
## Advanced Types
|
||
|
||
The Rust type system has some features that we’ve so far mentioned but haven’t
|
||
yet discussed. We’ll start by discussing newtypes in general as we examine why
|
||
newtypes are useful as types. Then we’ll move on to type aliases, a feature
|
||
similar to newtypes but with slightly different semantics. We’ll also discuss
|
||
the `!` type and dynamically sized types.
|
||
|
||
### Using the Newtype Pattern for Type Safety and Abstraction
|
||
|
||
> Note: This section assumes you’ve read the earlier section “Using the Newtype
|
||
Pattern to Implement External Traits” on page XX.
|
||
|
||
The newtype pattern is also useful for tasks beyond those we’ve discussed so
|
||
far, including statically enforcing that values are never confused and
|
||
indicating the units of a value. You saw an example of using newtypes to
|
||
indicate units in Listing 19-15: recall that the `Millimeters` and `Meters`
|
||
structs wrapped `u32` values in a newtype. If we wrote a function with a
|
||
parameter of type `Millimeters`, we wouldn’t be able to compile a program that
|
||
accidentally tried to call that function with a value of type `Meters` or a
|
||
plain `u32`.
|
||
|
||
We can also use the newtype pattern to abstract away some implementation
|
||
details of a type: the new type can expose a public API that is different from
|
||
the API of the private inner type.
|
||
|
||
Newtypes can also hide internal implementation. For example, we could provide a
|
||
`People` type to wrap a `HashMap<i32, String>` that stores a person’s ID
|
||
associated with their name. Code using `People` would only interact with the
|
||
public API we provide, such as a method to add a name string to the `People`
|
||
collection; that code wouldn’t need to know that we assign an `i32` ID to names
|
||
internally. The newtype pattern is a lightweight way to achieve encapsulation
|
||
to hide implementation details, which we discussed in “Encapsulation That Hides
|
||
Implementation Details” on page XX.
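A minimal sketch of such a `People` type might look like the following. The method names `new`, `add`, `contains`, and `len` are assumptions chosen for this illustration, as is the scheme of using the current length as the next ID.

```rust
use std::collections::HashMap;

// The inner map is private, so callers never see the `i32` IDs.
pub struct People(HashMap<i32, String>);

impl People {
    pub fn new() -> Self {
        People(HashMap::new())
    }

    // Assigns the next ID internally. Nothing is ever removed in this
    // sketch, so the current length is always an unused ID.
    pub fn add(&mut self, name: &str) {
        let id = self.0.len() as i32;
        self.0.insert(id, name.to_string());
    }

    pub fn contains(&self, name: &str) -> bool {
        self.0.values().any(|n| n == name)
    }

    pub fn len(&self) -> usize {
        self.0.len()
    }
}

fn main() {
    let mut people = People::new();
    people.add("Ferris");
    people.add("Corro");

    assert_eq!(people.len(), 2);
    assert!(people.contains("Ferris"));
    assert!(!people.contains("Nobody"));
}
```

Because callers depend only on these methods, the inner `HashMap<i32, String>` could later be swapped for another collection without changing code that uses `People`.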
|
||
|
||
### Creating Type Synonyms with Type Aliases
|
||
|
||
Rust provides the ability to declare a *type alias* to give an existing type
|
||
another name. For this we use the `type` keyword. For example, we can create
|
||
the alias `Kilometers` to `i32` like so:
|
||
|
||
```
type Kilometers = i32;
```
|
||
|
||
Now the alias `Kilometers` is a *synonym* for `i32`; unlike the `Millimeters`
|
||
and `Meters` types we created in Listing 19-15, `Kilometers` is not a separate,
|
||
new type. Values that have the type `Kilometers` will be treated the same as
|
||
values of type `i32`:
|
||
|
||
```
type Kilometers = i32;

let x: i32 = 5;
let y: Kilometers = 5;

println!("x + y = {}", x + y);
```
|
||
|
||
Because `Kilometers` and `i32` are the same type, we can add values of both
|
||
types and we can pass `Kilometers` values to functions that take `i32`
|
||
parameters. However, using this method, we don’t get the type-checking benefits
|
||
that we get from the newtype pattern discussed earlier. In other words, if we
|
||
mix up `Kilometers` and `i32` values somewhere, the compiler will not give us
|
||
an error.
|
||
|
||
The main use case for type synonyms is to reduce repetition. For example, we
|
||
might have a lengthy type like this:
|
||
|
||
```
Box<dyn Fn() + Send + 'static>
```
|
||
|
||
Writing this lengthy type in function signatures and as type annotations all
|
||
over the code can be tiresome and error prone. Imagine having a project full of
|
||
code like that in Listing 19-24.
|
||
|
||
```
let f: Box<dyn Fn() + Send + 'static> = Box::new(|| {
    println!("hi");
});

fn takes_long_type(f: Box<dyn Fn() + Send + 'static>) {
    --snip--
}

fn returns_long_type() -> Box<dyn Fn() + Send + 'static> {
    --snip--
}
```
|
||
|
||
Listing 19-24: Using a long type in many places
|
||
|
||
A type alias makes this code more manageable by reducing the repetition. In
|
||
Listing 19-25, we’ve introduced an alias named `Thunk` for the verbose type and
|
||
can replace all uses of the type with the shorter alias `Thunk`.
|
||
|
||
```
type Thunk = Box<dyn Fn() + Send + 'static>;

let f: Thunk = Box::new(|| println!("hi"));

fn takes_long_type(f: Thunk) {
    --snip--
}

fn returns_long_type() -> Thunk {
    --snip--
}
```
|
||
|
||
Listing 19-25: Introducing a type alias `Thunk` to reduce repetition
|
||
|
||
This code is much easier to read and write! Choosing a meaningful name for a
|
||
type alias can help communicate your intent as well (*thunk* is a word for code
|
||
to be evaluated at a later time, so it’s an appropriate name for a closure that
|
||
gets stored).
|
||
|
||
Type aliases are also commonly used with the `Result<T, E>` type for reducing
|
||
repetition. Consider the `std::io` module in the standard library. I/O
|
||
operations often return a `Result<T, E>` to handle situations when operations
|
||
fail to work. This library has a `std::io::Error` struct that represents all
|
||
possible I/O errors. Many of the functions in `std::io` will be returning
|
||
`Result<T, E>` where the `E` is `std::io::Error`, such as these functions in
|
||
the `Write` trait:
|
||
|
||
```
use std::fmt;
use std::io::Error;

pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize, Error>;
    fn flush(&mut self) -> Result<(), Error>;

    fn write_all(&mut self, buf: &[u8]) -> Result<(), Error>;
    fn write_fmt(
        &mut self,
        fmt: fmt::Arguments,
    ) -> Result<(), Error>;
}
```
|
||
|
||
The `Result<..., Error>` is repeated a lot. As such, `std::io` has this type
|
||
alias declaration:
|
||
|
||
```
type Result<T> = std::result::Result<T, std::io::Error>;
```
|
||
|
||
Because this declaration is in the `std::io` module, we can use the fully
|
||
qualified alias `std::io::Result<T>`; that is, a `Result<T, E>` with the `E`
|
||
filled in as `std::io::Error`. The `Write` trait function signatures end up
|
||
looking like this:
|
||
|
||
```
pub trait Write {
    fn write(&mut self, buf: &[u8]) -> Result<usize>;
    fn flush(&mut self) -> Result<()>;

    fn write_all(&mut self, buf: &[u8]) -> Result<()>;
    fn write_fmt(&mut self, fmt: fmt::Arguments) -> Result<()>;
}
```
|
||
|
||
The type alias helps in two ways: it makes code easier to write *and* it gives
|
||
us a consistent interface across all of `std::io`. Because it’s an alias, it’s
|
||
just another `Result<T, E>`, which means we can use any methods that work on
|
||
`Result<T, E>` with it, as well as special syntax like the `?` operator.
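The same technique works in your own code. The sketch below defines a hypothetical alias, `ParseResult<T>`, modeled on `std::io::Result<T>` but with `std::num::ParseIntError` as the fixed error type. The alias name and the `parse_sum` function are assumptions made for this example.

```rust
use std::num::ParseIntError;

// Hypothetical alias: a `Result<T, E>` with `E` fixed to `ParseIntError`.
type ParseResult<T> = Result<T, ParseIntError>;

fn parse_sum(a: &str, b: &str) -> ParseResult<i32> {
    // `?` works on the alias exactly as it does on `Result<T, E>`.
    let x: i32 = a.trim().parse()?;
    let y: i32 = b.trim().parse()?;
    Ok(x + y)
}

fn main() {
    assert_eq!(parse_sum("2", " 40 "), Ok(42));
    assert!(parse_sum("2", "forty").is_err());

    // Methods defined on `Result<T, E>` are available unchanged.
    assert_eq!(parse_sum("x", "1").unwrap_or(0), 0);
}
```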
|
||
|
||
### The Never Type That Never Returns
|
||
|
||
Rust has a special type named `!` that’s known in type theory lingo as the
|
||
*empty type* because it has no values. We prefer to call it the *never type*
|
||
because it stands in the place of the return type when a function will never
|
||
return. Here is an example:
|
||
|
||
```
fn bar() -> ! {
    --snip--
}
```
|
||
|
||
This code is read as “the function `bar` returns never.” Functions that return
|
||
never are called *diverging functions*. We can’t create values of the type `!`,
|
||
so `bar` can never possibly return.
|
||
|
||
But what use is a type you can never create values for? Recall the code from
|
||
Listing 2-5, part of the number-guessing game; we’ve reproduced a bit of it
|
||
here in Listing 19-26.
|
||
|
||
```
let guess: u32 = match guess.trim().parse() {
    Ok(num) => num,
    Err(_) => continue,
};
```
|
||
|
||
Listing 19-26: A `match` with an arm that ends in `continue`
|
||
|
||
At the time, we skipped over some details in this code. In “The match Control
|
||
Flow Construct” on page XX, we discussed that `match` arms must all return the
|
||
same type. So, for example, the following code doesn’t work:
|
||
|
||
```
let guess = match guess.trim().parse() {
    Ok(_) => 5,
    Err(_) => "hello",
};
```
|
||
|
||
The type of `guess` in this code would have to be an integer *and* a string,
|
||
and Rust requires that `guess` have only one type. So what does `continue`
|
||
return? How were we allowed to return a `u32` from one arm and have another arm
|
||
that ends with `continue` in Listing 19-26?
|
||
|
||
As you might have guessed, `continue` has the type `!`. That is, when Rust
computes the type of `guess`, it looks at both match arms, the former with a
value of type `u32` and the latter with the type `!`. Because `!` can never
have a value, Rust decides that the type of `guess` is `u32`.
|
||
|
||
The formal way of describing this behavior is that expressions of type `!` can
|
||
be coerced into any other type. We’re allowed to end this `match` arm with
|
||
`continue` because `continue` doesn’t return a value; instead, it moves control
|
||
back to the top of the loop, so in the `Err` case, we never assign a value to
|
||
`guess`.
|
||
|
||
The never type is useful with the `panic!` macro as well. Recall the `unwrap`
|
||
function that we call on `Option<T>` values to produce a value or panic with
|
||
this definition:
|
||
|
||
```
impl<T> Option<T> {
    pub fn unwrap(self) -> T {
        match self {
            Some(val) => val,
            None => panic!(
                "called `Option::unwrap()` on a `None` value"
            ),
        }
    }
}
```
|
||
|
||
In this code, the same thing happens as in the `match` in Listing 19-26: Rust
|
||
sees that `val` has the type `T` and `panic!` has the type `!`, so the result
|
||
of the overall `match` expression is `T`. This code works because `panic!`
|
||
doesn’t produce a value; it ends the program. In the `None` case, we won’t be
|
||
returning a value from `unwrap`, so this code is valid.
|
||
|
||
One final expression that has the type `!` is a `loop`:
|
||
|
||
```
print!("forever ");

loop {
    print!("and ever ");
}
```
|
||
|
||
Here, the loop never ends, so `!` is the type of the expression. However, this
|
||
wouldn’t be true if we included a `break`, because the loop would terminate
|
||
when it got to the `break`.
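The following sketch gathers these cases in one runnable program. The names `bail`, `parse_or_bail`, and `first_power_of_two_above` are hypothetical helpers written for this illustration.

```rust
// A diverging function: it always panics, so its return type is `!`.
fn bail(msg: &str) -> ! {
    panic!("{}", msg);
}

fn parse_or_bail(input: &str) -> u32 {
    match input.trim().parse::<u32>() {
        Ok(value) => value,
        // The call to `bail` has type `!`, which coerces to `u32`,
        // so both arms agree on the type of the `match`.
        Err(_) => bail("input was not a number"),
    }
}

fn first_power_of_two_above(limit: u32) -> u32 {
    let mut candidate = 1;
    // Because of `break candidate`, this `loop` expression has the type
    // of `candidate` rather than `!`.
    loop {
        if candidate > limit {
            break candidate;
        }
        candidate *= 2;
    }
}

fn main() {
    assert_eq!(parse_or_bail(" 7 "), 7);
    assert_eq!(first_power_of_two_above(100), 128);
}
```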
|
||
|
||
### Dynamically Sized Types and the Sized Trait
|
||
|
||
Rust needs to know certain details about its types, such as how much space to
|
||
allocate for a value of a particular type. This leaves one corner of its type
|
||
system a little confusing at first: the concept of *dynamically sized types*.
|
||
Sometimes referred to as *DSTs* or *unsized types*, these types let us write
|
||
code using values whose size we can know only at runtime.
|
||
|
||
Let’s dig into the details of a dynamically sized type called `str`, which
|
||
we’ve been using throughout the book. That’s right, not `&str`, but `str` on
|
||
its own, is a DST. We can’t know how long the string is until runtime, meaning
|
||
we can’t create a variable of type `str`, nor can we take an argument of type
|
||
`str`. Consider the following code, which does not work:
|
||
|
||
```
let s1: str = "Hello there!";
let s2: str = "How's it going?";
```
|
||
|
||
Rust needs to know how much memory to allocate for any value of a particular
|
||
type, and all values of a type must use the same amount of memory. If Rust
|
||
allowed us to write this code, these two `str` values would need to take up the
|
||
same amount of space. But they have different lengths: `s1` needs 12 bytes of
|
||
storage and `s2` needs 15. This is why it’s not possible to create a variable
|
||
holding a dynamically sized type.
|
||
|
||
So what do we do? In this case, you already know the answer: we make the types
|
||
of `s1` and `s2` a `&str` rather than a `str`. Recall from “String Slices” on
|
||
page XX that the slice data structure just stores the starting position and the
|
||
length of the slice. So, although a `&T` is a single value that stores the
|
||
memory address of where the `T` is located, a `&str` is *two* values: the
|
||
address of the `str` and its length. As such, we can know the size of a `&str`
|
||
value at compile time: it’s twice the size of a `usize`. That is, we always
|
||
know the size of a `&str`, no matter how long the string it refers to is. In
|
||
general, this is the way in which dynamically sized types are used in Rust:
|
||
they have an extra bit of metadata that stores the size of the dynamic
|
||
information. The golden rule of dynamically sized types is that we must always
|
||
put values of dynamically sized types behind a pointer of some kind.
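You can observe the extra metadata with `std::mem::size_of`. The checks below are a small sketch of what current compilers do for pointers to `str` and slices; they describe observed layout, not a promise about every pointer type.

```rust
use std::mem::size_of;

fn main() {
    // A reference to a sized type is a single address.
    assert_eq!(size_of::<&u8>(), size_of::<usize>());

    // A reference to `str` stores the address and the length.
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // The same holds for other pointers to dynamically sized types.
    assert_eq!(size_of::<Box<str>>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&[u64]>(), 2 * size_of::<usize>());
}
```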
|
||
|
||
We can combine `str` with all kinds of pointers: for example, `Box<str>` or
|
||
`Rc<str>`. In fact, you’ve seen this before but with a different dynamically
|
||
sized type: traits. Every trait is a dynamically sized type we can refer to by
|
||
using the name of the trait. In “Using Trait Objects That Allow for Values of
|
||
Different Types” on page XX, we mentioned that to use traits as trait objects,
|
||
we must put them behind a pointer, such as `&dyn Trait` or `Box<dyn Trait>`
|
||
(`Rc<dyn Trait>` would work too).
|
||
|
||
To work with DSTs, Rust provides the `Sized` trait to determine whether or not
|
||
a type’s size is known at compile time. This trait is automatically implemented
|
||
for everything whose size is known at compile time. In addition, Rust
|
||
implicitly adds a bound on `Sized` to every generic function. That is, a
|
||
generic function definition like this:
|
||
|
||
```
|
||
fn generic<T>(t: T) {
|
||
    // --snip--
|
||
}
|
||
```
|
||
|
||
is actually treated as though we had written this:
|
||
|
||
```
|
||
fn generic<T: Sized>(t: T) {
|
||
    // --snip--
|
||
}
|
||
```
|
||
|
||
By default, generic functions will work only on types that have a known size at
|
||
compile time. However, you can use the following special syntax to relax this
|
||
restriction:
|
||
|
||
```
|
||
fn generic<T: ?Sized>(t: &T) {
|
||
    // --snip--
|
||
}
|
||
```
|
||
|
||
A trait bound on `?Sized` means “`T` may or may not be `Sized`” and this
|
||
notation overrides the default that generic types must have a known size at
|
||
compile time. The `?Trait` syntax with this meaning is only available for
|
||
`Sized`, not any other traits.
|
||
|
||
Also note that we switched the type of the `t` parameter from `T` to `&T`.
|
||
Because the type might not be `Sized`, we need to use it behind some kind of
|
||
pointer. In this case, we’ve chosen a reference.
|
||
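
To make the `?Sized` bound concrete, here is a short sketch of a generic function that accepts both sized and dynamically sized types. The function name `describe` and the extra `Debug` bound are assumptions added for illustration; they are not part of the original text.

```rust
use std::fmt::Debug;

// `T: ?Sized` lets `T` be an unsized type such as `str` or `[i32]`,
// so the parameter has to be taken by reference.
fn describe<T: ?Sized + Debug>(t: &T) -> String {
    format!("{t:?}")
}
```

With this definition, `describe("hi")` instantiates `T` as `str` and `describe(&[1, 2, 3][..])` instantiates it as `[i32]`. Without `?Sized`, both calls would be rejected because neither type has a size known at compile time.
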
|
||
Next, we’ll talk about functions and closures!
|
||
|
||
## Advanced Functions and Closures
|
||
|
||
This section explores some advanced features related to functions and closures,
|
||
including function pointers and returning closures.
|
||
|
||
### Function Pointers
|
||
|
||
We’ve talked about how to pass closures to functions; you can also pass regular
|
||
functions to functions! This technique is useful when you want to pass a
|
||
function you’ve already defined rather than defining a new closure. Functions
|
||
coerce to the type `fn` (with a lowercase *f*), not to be confused with the
|
||
`Fn` closure trait. The `fn` type is called a *function pointer*. Passing
|
||
functions with function pointers will allow you to use functions as arguments
|
||
to other functions.
|
||
|
||
The syntax for specifying that a parameter is a function pointer is similar to
|
||
that of closures, as shown in Listing 19-27, where we’ve defined a function
|
||
`add_one` that adds 1 to its parameter. The function `do_twice` takes two
|
||
parameters: a function pointer to any function that takes an `i32` parameter
|
||
and returns an `i32`, and one `i32` value. The `do_twice` function calls the
|
||
function `f` twice, passing it the `arg` value, then adds the two function call
|
||
results together. The `main` function calls `do_twice` with the arguments
|
||
`add_one` and `5`.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
fn add_one(x: i32) -> i32 {
    x + 1
}

fn do_twice(f: fn(i32) -> i32, arg: i32) -> i32 {
    f(arg) + f(arg)
}

fn main() {
    let answer = do_twice(add_one, 5);

    println!("The answer is: {answer}");
}
```
|
||
|
||
Listing 19-27: Using the `fn` type to accept a function pointer as an argument
|
||
|
||
This code prints `The answer is: 12`. We specify that the parameter `f` in
|
||
`do_twice` is an `fn` that takes one parameter of type `i32` and returns an
|
||
`i32`. We can then call `f` in the body of `do_twice`. In `main`, we can pass
|
||
the function name `add_one` as the first argument to `do_twice`.
|
||
|
||
Unlike closures, `fn` is a type rather than a trait, so we specify `fn` as the
|
||
parameter type directly rather than declaring a generic type parameter with one
|
||
of the `Fn` traits as a trait bound.
|
||
|
||
Function pointers implement all three of the closure traits (`Fn`, `FnMut`, and
|
||
`FnOnce`), meaning you can always pass a function pointer as an argument for a
|
||
function that expects a closure. It’s best to write functions using a generic
|
||
type and one of the closure traits so your functions can accept either
|
||
functions or closures.
|
||
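
A sketch of the generic alternative recommended above might look like the following. The name `do_twice_generic` is hypothetical; it mirrors `do_twice` from Listing 19-27 but takes a type parameter bounded by `Fn` instead of a function pointer.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

// Generic over any callable that implements `Fn(i32) -> i32`,
// so both named functions and closures are accepted.
fn do_twice_generic<F>(f: F, arg: i32) -> i32
where
    F: Fn(i32) -> i32,
{
    f(arg) + f(arg)
}
```

This version accepts `add_one` as before, and it also accepts a closure that captures its environment, such as `|x| x + offset`, which a plain `fn(i32) -> i32` parameter would not.
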
|
||
That said, one example of where you would want to only accept `fn` and not
|
||
closures is when interfacing with external code that doesn’t have closures: C
|
||
functions can accept functions as arguments, but C doesn’t have closures.
|
||
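
The sketch below illustrates the shape such an interface could take, without calling any real C library. The `Callback` alias, `triple`, and `call_callback` are all made up for this example, and `call_callback` merely stands in for a foreign function; using `i32` for C's `int` is an assumption that holds on common platforms.

```rust
// Hypothetical callback signature, as a C header might declare it:
// int (*callback)(int)
type Callback = extern "C" fn(i32) -> i32;

// A Rust function that uses the C calling convention.
extern "C" fn triple(x: i32) -> i32 {
    x * 3
}

// Stand-in for a foreign function that invokes the callback it is given.
fn call_callback(cb: Callback, arg: i32) -> i32 {
    cb(arg)
}
```

Because the parameter is a function pointer type rather than a generic bounded by `Fn`, a closure that captures variables cannot be passed here.
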
|
||
As an example of where you could use either a closure defined inline or a named
|
||
function, let’s look at a use of the `map` method provided by the `Iterator`
|
||
trait in the standard library. To use the `map` function to turn a vector of
|
||
numbers into a vector of strings, we could use a closure, like this:
|
||
|
||
```
|
||
let list_of_numbers = vec![1, 2, 3];
|
||
let list_of_strings: Vec<String> = list_of_numbers
|
||
.iter()
|
||
.map(|i| i.to_string())
|
||
.collect();
|
||
```
|
||
|
||
Or we could name a function as the argument to `map` instead of the closure,
|
||
like this:
|
||
|
||
```
|
||
let list_of_numbers = vec![1, 2, 3];
|
||
let list_of_strings: Vec<String> = list_of_numbers
|
||
.iter()
|
||
.map(ToString::to_string)
|
||
.collect();
|
||
```
|
||
|
||
Note that we must use the fully qualified syntax that we talked about in
|
||
“Advanced Traits” on page XX because there are multiple functions available
|
||
named `to_string`.
|
||
|
||
Here, we’re using the `to_string` function defined in the `ToString` trait,
|
||
which the standard library has implemented for any type that implements
|
||
`Display`.
|
||
|
||
Recall from “Enum Values” on page XX that the name of each enum variant that we
|
||
define also becomes an initializer function. We can use these initializer
|
||
functions as function pointers that implement the closure traits, which means
|
||
we can specify the initializer functions as arguments for methods that take
|
||
closures, like so:
|
||
|
||
```
|
||
enum Status {
|
||
Value(u32),
|
||
Stop,
|
||
}
|
||
|
||
let list_of_statuses: Vec<Status> = (0u32..20)
|
||
.map(Status::Value)
|
||
.collect();
|
||
```
|
||
|
||
Here, we create `Status::Value` instances using each `u32` value in the range
|
||
that `map` is called on by using the initializer function of `Status::Value`.
|
||
Some people prefer this style and some people prefer to use closures. They
|
||
compile to the same code, so use whichever style is clearer to you.
|
||
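
The same technique works with enums from the standard library. As a brief sketch, `Some` is the initializer function of the `Option::Some` variant, so it can be passed to `map` directly; the function names here are hypothetical.

```rust
// Passes the variant's initializer function to `map`.
fn wrap_all(values: &[u32]) -> Vec<Option<u32>> {
    values.iter().copied().map(Some).collect()
}

// The equivalent version written with a closure.
fn wrap_all_with_closure(values: &[u32]) -> Vec<Option<u32>> {
    values.iter().copied().map(|v| Some(v)).collect()
}
```
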
|
||
### Returning Closures
|
||
|
||
Closures are represented by traits, which means you can’t return closures
directly. In most cases where you might want to return a trait, you can instead
use the concrete type that implements the trait as the return value of the
function. However, you can’t do that with closures because each closure has its
own anonymous type that you can’t write out in a signature. The function
pointer type `fn` is not a general substitute either: only closures that
capture nothing from their environment coerce to `fn`.
|
||
|
||
The following code tries to return a closure directly, but it won’t compile:
|
||
|
||
```
|
||
fn returns_closure() -> dyn Fn(i32) -> i32 {
|
||
|x| x + 1
|
||
}
|
||
```
|
||
|
||
The compiler error is as follows:
|
||
|
||
```
error[E0746]: return type cannot have an unboxed trait object
 --> src/lib.rs:1:25
  |
1 | fn returns_closure() -> dyn Fn(i32) -> i32 {
  |                         ^^^^^^^^^^^^^^^^^^ doesn't have a size known at compile-time
  |
  = note: for information on `impl Trait`, see
<https://doc.rust-lang.org/book/ch10-02-traits.html#returning-types-that-implement-traits>
help: use `impl Fn(i32) -> i32` as the return type, as all return paths are of
type `[closure@src/lib.rs:2:5: 2:14]`, which implements `Fn(i32) -> i32`
  |
1 | fn returns_closure() -> impl Fn(i32) -> i32 {
  |                         ~~~~~~~~~~~~~~~~~~~
```
|
||
|
||
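The error is about size again: Rust doesn’t know how much space it will need to
store the closure. The compiler’s help text suggests `impl Fn(i32) -> i32`,
which works here because every return path yields the same closure type.
Another solution, which we saw earlier, is to use a trait object: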
|
||
|
||
```
|
||
fn returns_closure() -> Box<dyn Fn(i32) -> i32> {
|
||
Box::new(|x| x + 1)
|
||
}
|
||
```
|
||
|
||
This code will compile just fine. For more about trait objects, refer to “Using
|
||
Trait Objects That Allow for Values of Different Types” on page XX.
|
||
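
For comparison, here is a sketch of the `impl Trait` form that the compiler's help message suggests, next to a case where a trait object is still required. The function `returns_either` is a hypothetical addition for illustration.

```rust
// The compiler's suggestion: return an opaque type that implements `Fn`.
fn returns_closure() -> impl Fn(i32) -> i32 {
    |x| x + 1
}

// Each closure has its own distinct type, so when different branches
// return different closures, a boxed trait object is needed instead.
fn returns_either(add: bool) -> Box<dyn Fn(i32) -> i32> {
    if add {
        Box::new(|x: i32| x + 1)
    } else {
        Box::new(|x: i32| x * 2)
    }
}
```

The `impl Fn` form avoids a heap allocation and dynamic dispatch, whereas `Box<dyn Fn>` is the more flexible choice when the concrete closure type varies at runtime.
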
|
||
Next, let’s look at macros!
|
||
|
||
## Macros
|
||
|
||
We’ve used macros like `println!` throughout this book, but we haven’t fully
|
||
explored what a macro is and how it works. The term *macro* refers to a family
|
||
of features in Rust: *declarative* macros with `macro_rules!` and three kinds
|
||
of *procedural* macros:
|
||
|
||
* Custom `#[derive]` macros that specify code added with the `derive` attribute
|
||
used on structs and enums
|
||
* Attribute-like macros that define custom attributes usable on any item
|
||
* Function-like macros that look like function calls but operate on the tokens
|
||
specified as their argument
|
||
|
||
We’ll talk about each of these in turn, but first, let’s look at why we even
|
||
need macros when we already have functions.
|
||
|
||
### The Difference Between Macros and Functions
|
||
|
||
Fundamentally, macros are a way of writing code that writes other code, which
|
||
is known as *metaprogramming*. In Appendix C, we discuss the `derive`
|
||
attribute, which generates an implementation of various traits for you. We’ve
|
||
also used the `println!` and `vec!` macros throughout the book. All of these
|
||
macros *expand* to produce more code than the code you’ve written manually.
|
||
|
||
Metaprogramming is useful for reducing the amount of code you have to write and
|
||
maintain, which is also one of the roles of functions. However, macros have
|
||
some additional powers that functions don’t have.
|
||
|
||
A function signature must declare the number and type of parameters the
|
||
function has. Macros, on the other hand, can take a variable number of
|
||
parameters: we can call `println!("hello")` with one argument or
|
||
`println!("hello {}", name)` with two arguments. Also, macros are expanded
|
||
before the compiler interprets the meaning of the code, so a macro can, for
|
||
example, implement a trait on a given type. A function can’t, because it gets
|
||
called at runtime and a trait needs to be implemented at compile time.
|
||
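
As an illustration of a macro implementing a trait, the sketch below uses a declarative macro to generate one `impl` block per listed type. The trait `TypeName` and the macro `impl_type_name!` are hypothetical names invented for this example.

```rust
trait TypeName {
    fn type_name() -> &'static str;
}

// Hypothetical helper macro: generates one `impl` block per listed type.
macro_rules! impl_type_name {
    ( $( $t:ty ),* ) => {
        $(
            impl TypeName for $t {
                fn type_name() -> &'static str {
                    stringify!($t)
                }
            }
        )*
    };
}

// Accepts any number of types, which a function signature could not express.
impl_type_name!(i32, bool, String);
```

After the macro call expands, `<i32 as TypeName>::type_name()` is available just as if the three `impl` blocks had been written by hand.
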
|
||
The downside to implementing a macro instead of a function is that macro
|
||
definitions are more complex than function definitions because you’re writing
|
||
Rust code that writes Rust code. Due to this indirection, macro definitions are
|
||
generally more difficult to read, understand, and maintain than function
|
||
definitions.
|
||
|
||
Another important difference between macros and functions is that you must
|
||
define macros or bring them into scope *before* you call them in a file, as
|
||
opposed to functions you can define anywhere and call anywhere.
|
||
|
||
### Declarative Macros with macro_rules! for General Metaprogramming
|
||
|
||
The most widely used form of macros in Rust is the *declarative macro*. These
|
||
are also sometimes referred to as “macros by example,” “`macro_rules!` macros,”
|
||
or just plain “macros.” At their core, declarative macros allow you to write
|
||
something similar to a Rust `match` expression. As discussed in Chapter 6,
|
||
`match` expressions are control structures that take an expression, compare the
|
||
resultant value of the expression to patterns, and then run the code associated
|
||
with the matching pattern. Macros also compare a value to patterns that are
|
||
associated with particular code: in this situation, the value is the literal
|
||
Rust source code passed to the macro; the patterns are compared with the
|
||
structure of that source code; and the code associated with each pattern, when
|
||
matched, replaces the code passed to the macro. This all happens during
|
||
compilation.
|
||
|
||
To define a macro, you use the `macro_rules!` construct. Let’s explore how to
|
||
use `macro_rules!` by looking at how the `vec!` macro is defined. Chapter 8
|
||
covered how we can use the `vec!` macro to create a new vector with particular
|
||
values. For example, the following macro creates a new vector containing three
|
||
integers:
|
||
|
||
```
|
||
let v: Vec<u32> = vec![1, 2, 3];
|
||
```
|
||
|
||
We could also use the `vec!` macro to make a vector of two integers or a vector
|
||
of five string slices. We wouldn’t be able to use a function to do the same
|
||
because we wouldn’t know the number or type of values up front.
|
||
|
||
Listing 19-28 shows a slightly simplified definition of the `vec!` macro.
|
||
|
||
Filename: src/lib.rs
|
||
|
||
```
#[macro_export] // [1]
macro_rules! vec { // [2]
    ( $( $x:expr ),* ) => { // [3]
        {
            let mut temp_vec = Vec::new();
            $( // [4]
                temp_vec.push($x); // [5] the `push` call, [6] `$x`
            )* // [7]
            temp_vec
        }
    };
}
```
|
||
|
||
Listing 19-28: A simplified version of the `vec!` macro definition
|
||
|
||
> Note: The actual definition of the `vec!` macro in the standard library
|
||
includes code to pre-allocate the correct amount of memory up front. That code
|
||
is an optimization that we don’t include here, to make the example simpler.
|
||
|
||
The `#[macro_export]` annotation [1] indicates that this macro should be made
|
||
available whenever the crate in which the macro is defined is brought into
|
||
scope. Without this annotation, the macro can’t be brought into scope.
|
||
|
||
We then start the macro definition with `macro_rules!` and the name of the
|
||
macro we’re defining *without* the exclamation mark [2]. The name, in this case
|
||
`vec`, is followed by curly brackets denoting the body of the macro definition.
|
||
|
||
The structure in the `vec!` body is similar to the structure of a `match`
|
||
expression. Here we have one arm with the pattern `( $( $x:expr ),* )`,
|
||
followed by `=>` and the block of code associated with this pattern [3]. If the
|
||
pattern matches, the associated block of code will be emitted. Given that this
|
||
is the only pattern in this macro, there is only one valid way to match; any
|
||
other pattern will result in an error. More complex macros will have more than
|
||
one arm.
|
||
|
||
Valid pattern syntax in macro definitions is different from the pattern syntax
|
||
covered in Chapter 18 because macro patterns are matched against Rust code
|
||
structure rather than values. Let’s walk through what the pattern pieces in
|
||
Listing 19-28 mean; for the full macro pattern syntax, see the Rust Reference
|
||
at *https://doc.rust-lang.org/reference/macros-by-example.html*.
|
||
|
||
First we use a set of parentheses to encompass the whole pattern. We use a
|
||
dollar sign (`$`) to declare a variable in the macro system that will contain
|
||
the Rust code matching the pattern. The dollar sign makes it clear this is a
|
||
macro variable as opposed to a regular Rust variable. Next comes a set of
|
||
parentheses that captures values that match the pattern within the parentheses
|
||
for use in the replacement code. Within `$()` is `$x:expr`, which matches any
|
||
Rust expression and gives the expression the name `$x`.
|
||
|
||
The comma following `$()` indicates that a literal comma separator character
must appear between each instance of the code that matches the code within
`$()`. The `*` specifies that the pattern matches zero or more of whatever
precedes the `*`. As written, the pattern does not accept a trailing comma
after the last expression.
|
||
|
||
When we call this macro with `vec![1, 2, 3];`, the `$x` pattern matches three
|
||
times with the three expressions `1`, `2`, and `3`.
|
||
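
To see the repetition syntax on a smaller example, consider the following sketch of a `sum!` macro. The macro and the `totals` function are hypothetical and not part of the standard library; the extra `$(,)?` fragment is a common idiom that also accepts one optional trailing comma.

```rust
// Hypothetical `sum!` macro: adds up zero or more comma-separated expressions.
macro_rules! sum {
    ( $( $x:expr ),* $(,)? ) => {
        // Emits `+ $x` once for every expression matched by the pattern.
        0 $( + $x )*
    };
}

// Exercises the macro with zero, three, and two (trailing comma) arguments.
fn totals() -> (i32, i32, i32) {
    (sum!(), sum!(1, 2, 3), sum!(4, 5,))
}
```

Here `sum!(1, 2, 3)` expands to `0 + 1 + 2 + 3`, following the same match-and-repeat mechanism that `vec!` uses to emit one `push` call per element.
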
|
||
Now let’s look at the pattern in the body of the code associated with this arm:
|
||
`temp_vec.push()` [5] within `$()*` at [4] and [7] is generated for each part
|
||
that matches `$()` in the pattern zero or more times depending on how many
|
||
times the pattern matches. The `$x` [6] is replaced with each expression
|
||
matched. When we call this macro with `vec![1, 2, 3];`, the code generated that
|
||
replaces this macro call will be the following:
|
||
|
||
```
|
||
{
|
||
let mut temp_vec = Vec::new();
|
||
temp_vec.push(1);
|
||
temp_vec.push(2);
|
||
temp_vec.push(3);
|
||
temp_vec
|
||
}
|
||
```
|
||
|
||
We’ve defined a macro that can take any number of arguments of any type and can
|
||
generate code to create a vector containing the specified elements.
|
||
|
||
To learn more about how to write macros, consult the online documentation or
|
||
other resources, such as “The Little Book of Rust Macros” at
|
||
*https://veykril.github.io/tlborm* started by Daniel Keep and continued by
|
||
Lukas Wirth.
|
||
|
||
### Procedural Macros for Generating Code from Attributes
|
||
|
||
The second form of macros is the procedural macro, which acts more like a
|
||
function (and is a type of procedure). *Procedural macros* accept some code as
|
||
an input, operate on that code, and produce some code as an output rather than
|
||
matching against patterns and replacing the code with other code as declarative
|
||
macros do. The three kinds of procedural macros are custom `derive`,
|
||
attribute-like, and function-like, and all work in a similar fashion.
|
||
|
||
When creating procedural macros, the definitions must reside in their own crate
|
||
with a special crate type. This is for complex technical reasons that we hope
|
||
to eliminate in the future. In Listing 19-29, we show how to define a
|
||
procedural macro, where `some_attribute` is a placeholder for using a specific
|
||
macro variety.
|
||
|
||
Filename: src/lib.rs
|
||
|
||
```
|
||
use proc_macro::TokenStream;
|
||
|
||
#[some_attribute]
|
||
pub fn some_name(input: TokenStream) -> TokenStream {
|
||
}
|
||
```
|
||
|
||
Listing 19-29: An example of defining a procedural macro
|
||
|
||
The function that defines a procedural macro takes a `TokenStream` as an input
|
||
and produces a `TokenStream` as an output. The `TokenStream` type is defined by
|
||
the `proc_macro` crate that is included with Rust and represents a sequence of
|
||
tokens. This is the core of the macro: the source code that the macro is
|
||
operating on makes up the input `TokenStream`, and the code the macro produces
|
||
is the output `TokenStream`. The function also has an attribute attached to it
|
||
that specifies which kind of procedural macro we’re creating. We can have
|
||
multiple kinds of procedural macros in the same crate.
|
||
|
||
Let’s look at the different kinds of procedural macros. We’ll start with a
|
||
custom `derive` macro and then explain the small dissimilarities that make the
|
||
other forms different.
|
||
|
||
### How to Write a Custom derive Macro
|
||
|
||
Let’s create a crate named `hello_macro` that defines a trait named
|
||
`HelloMacro` with one associated function named `hello_macro`. Rather than
|
||
making our users implement the `HelloMacro` trait for each of their types,
|
||
we’ll provide a procedural macro so users can annotate their type with
|
||
`#[derive(HelloMacro)]` to get a default implementation of the `hello_macro`
|
||
function. The default implementation will print `Hello, Macro! My name is
TypeName!` where `TypeName` is the name of the type on which this trait has been
|
||
defined. In other words, we’ll write a crate that enables another programmer to
|
||
write code like Listing 19-30 using our crate.
|
||
|
||
Filename: src/main.rs
|
||
|
||
```
|
||
use hello_macro::HelloMacro;
|
||
use hello_macro_derive::HelloMacro;
|
||
|
||
#[derive(HelloMacro)]
|
||
struct Pancakes;
|
||
|
||
fn main() {
|
||
Pancakes::hello_macro();
|
||
}
|
||
```
|
||
|
||
Listing 19-30: The code a user of our crate will be able to write when using
|
||
our procedural macro
|
||
|
||
This code will print `Hello, Macro! My name is Pancakes!` when we’re done. The
|
||
first step is to make a new library crate, like this:
|
||
|
||
```
|
||
$ cargo new hello_macro --lib
|
||
```
|
||
|
||
Next, we’ll define the `HelloMacro` trait and its associated function:
|
||
|
||
Filename: src/lib.rs
|
||
|
||
```
|
||
pub trait HelloMacro {
|
||
fn hello_macro();
|
||
}
|
||
```
|
||
|
||
We have a trait and its function. At this point, our crate user could implement
|
||
the trait to achieve the desired functionality, like so:
|
||
|
||
```
|
||
use hello_macro::HelloMacro;
|
||
|
||
struct Pancakes;
|
||
|
||
impl HelloMacro for Pancakes {
|
||
fn hello_macro() {
|
||
println!("Hello, Macro! My name is Pancakes!");
|
||
}
|
||
}
|
||
|
||
fn main() {
|
||
Pancakes::hello_macro();
|
||
}
|
||
```
|
||
|
||
However, they would need to write the implementation block for each type they
|
||
wanted to use with `hello_macro`; we want to spare them from having to do this
|
||
work.
|
||
|
||
Additionally, we can’t yet provide the `hello_macro` function with a default
|
||
implementation that will print the name of the type the trait is implemented
|
||
on: Rust doesn’t have reflection capabilities, so it can’t look up the type’s
|
||
name at runtime. We need a macro to generate code at compile time.
|
||
|
||
The next step is to define the procedural macro. At the time of this writing,
|
||
procedural macros need to be in their own crate. Eventually, this restriction
|
||
might be lifted. The convention for structuring crates and macro crates is as
|
||
follows: for a crate named `foo`, a custom `derive` procedural macro crate is
called `foo_derive`. Let’s start a new crate called `hello_macro_derive` inside
|
||
our `hello_macro` project:
|
||
|
||
```
|
||
$ cargo new hello_macro_derive --lib
|
||
```
|
||
|
||
Our two crates are tightly related, so we create the procedural macro crate
|
||
within the directory of our `hello_macro` crate. If we change the trait
|
||
definition in `hello_macro`, we’ll have to change the implementation of the
|
||
procedural macro in `hello_macro_derive` as well. The two crates will need to
|
||
be published separately, and programmers using these crates will need to add
|
||
both as dependencies and bring them both into scope. We could instead have the
|
||
`hello_macro` crate use `hello_macro_derive` as a dependency and re-export the
|
||
procedural macro code. However, the way we’ve structured the project makes it
|
||
possible for programmers to use `hello_macro` even if they don’t want the
|
||
`derive` functionality.
|
||
|
||
We need to declare the `hello_macro_derive` crate as a procedural macro crate.
|
||
We’ll also need functionality from the `syn` and `quote` crates, as you’ll see
|
||
in a moment, so we need to add them as dependencies. Add the following to the
|
||
*Cargo.toml* file for `hello_macro_derive`:
|
||
|
||
Filename: hello_macro_derive/Cargo.toml
|
||
|
||
```
|
||
[lib]
|
||
proc-macro = true
|
||
|
||
[dependencies]
|
||
syn = "1.0"
|
||
quote = "1.0"
|
||
```
|
||
|
||
To start defining the procedural macro, place the code in Listing 19-31 into
|
||
your *src/lib.rs* file for the `hello_macro_derive` crate. Note that this code
|
||
won’t compile until we add a definition for the `impl_hello_macro` function.
|
||
|
||
Filename: hello_macro_derive/src/lib.rs
|
||
|
||
```
use proc_macro::TokenStream;
use quote::quote;
use syn;

#[proc_macro_derive(HelloMacro)]
pub fn hello_macro_derive(input: TokenStream) -> TokenStream {
    // Construct a representation of Rust code as a syntax tree
    // that we can manipulate
    let ast = syn::parse(input).unwrap();

    // Build the trait implementation
    impl_hello_macro(&ast)
}
```
|
||
|
||
Listing 19-31: Code that most procedural macro crates will require in order to
|
||
process Rust code
|
||
|
||
Notice that we’ve split the code into the `hello_macro_derive` function, which
|
||
is responsible for parsing the `TokenStream`, and the `impl_hello_macro`
|
||
function, which is responsible for transforming the syntax tree: this makes
|
||
writing a procedural macro more convenient. The code in the outer function
|
||
(`hello_macro_derive` in this case) will be the same for almost every
|
||
procedural macro crate you see or create. The code you specify in the body of
|
||
the inner function (`impl_hello_macro` in this case) will be different
|
||
depending on your procedural macro’s purpose.
|
||
|
||
We’ve introduced three new crates: `proc_macro`, `syn` (available from
|
||
*https://crates.io/crates/syn*), and `quote` (available from
|
||
*https://crates.io/crates/quote*). The `proc_macro` crate comes with Rust, so
|
||
we didn’t need to add that to the dependencies in *Cargo.toml*. The
|
||
`proc_macro` crate is the compiler’s API that allows us to read and manipulate
|
||
Rust code from our code.
|
||
|
||
The `syn` crate parses Rust code from a string into a data structure that we
|
||
can perform operations on. The `quote` crate turns `syn` data structures back
|
||
into Rust code. These crates make it much simpler to parse any sort of Rust
|
||
code we might want to handle: writing a full parser for Rust code is no simple
|
||
task.
|
||
|
||
The `hello_macro_derive` function will be called when a user of our library
|
||
specifies `#[derive(HelloMacro)]` on a type. This is possible because we’ve
|
||
annotated the `hello_macro_derive` function here with `proc_macro_derive` and
|
||
specified the name `HelloMacro`, which matches our trait name; this is the
|
||
convention most procedural macros follow.
|
||
|
||
The `hello_macro_derive` function first converts the `input` from a
|
||
`TokenStream` to a data structure that we can then interpret and perform
|
||
operations on. This is where `syn` comes into play. The `parse` function in
|
||
`syn` takes a `TokenStream` and returns a `DeriveInput` struct representing the
|
||
parsed Rust code. Listing 19-32 shows the relevant parts of the `DeriveInput`
|
||
struct we get from parsing the `struct Pancakes;` string.
|
||
|
||
```
DeriveInput {
    // --snip--

    ident: Ident {
        ident: "Pancakes",
        span: #0 bytes(95..103)
    },
    data: Struct(
        DataStruct {
            struct_token: Struct,
            fields: Unit,
            semi_token: Some(
                Semi
            )
        }
    )
}
```
|
||
|
||
Listing 19-32: The `DeriveInput` instance we get when parsing the code that has
|
||
the macro’s attribute in Listing 19-30
|
||
|
||
The fields of this struct show that the Rust code we’ve parsed is a unit struct
|
||
with the `ident` (*identifier*, meaning the name) of `Pancakes`. There are more
|
||
fields on this struct for describing all sorts of Rust code; check the `syn`
|
||
documentation for `DeriveInput` at
|
||
*https://docs.rs/syn/1.0/syn/struct.DeriveInput.html* for more information.
|
||
|
||
Soon we’ll define the `impl_hello_macro` function, which is where we’ll build
|
||
the new Rust code we want to include. But before we do, note that the output
|
||
for our `derive` macro is also a `TokenStream`. The returned `TokenStream` is
|
||
added to the code that our crate users write, so when they compile their crate,
|
||
they’ll get the extra functionality that we provide in the modified
|
||
`TokenStream`.
|
||
|
||
You might have noticed that we’re calling `unwrap` to cause the
|
||
`hello_macro_derive` function to panic if the call to the `syn::parse` function
|
||
fails here. It’s necessary for our procedural macro to panic on errors because
|
||
`proc_macro_derive` functions must return `TokenStream` rather than `Result` to
|
||
conform to the procedural macro API. We’ve simplified this example by using
|
||
`unwrap`; in production code, you should provide more specific error messages
|
||
about what went wrong by using `panic!` or `expect`.
|
||
|
||
Now that we have the code to turn the annotated Rust code from a `TokenStream`
|
||
into a `DeriveInput` instance, let’s generate the code that implements the
|
||
`HelloMacro` trait on the annotated type, as shown in Listing 19-33.
|
||
|
||
Filename: hello_macro_derive/src/lib.rs
|
||
|
||
```
fn impl_hello_macro(ast: &syn::DeriveInput) -> TokenStream {
    let name = &ast.ident;
    // Named `generated` rather than `gen`, which is a reserved keyword
    // in the Rust 2024 edition
    let generated = quote! {
        impl HelloMacro for #name {
            fn hello_macro() {
                println!(
                    "Hello, Macro! My name is {}!",
                    stringify!(#name)
                );
            }
        }
    };
    generated.into()
}
```
|
||
|
||
Listing 19-33: Implementing the `HelloMacro` trait using the parsed Rust code
|
||
|
||
We get an `Ident` struct instance containing the name (identifier) of the
|
||
annotated type using `ast.ident`. The struct in Listing 19-32 shows that when
|
||
we run the `impl_hello_macro` function on the code in Listing 19-30, the
|
||
`ident` we get will have the `ident` field with a value of `"Pancakes"`. Thus
|
||
the `name` variable in Listing 19-33 will contain an `Ident` struct instance
|
||
that, when printed, will be the string `"Pancakes"`, the name of the struct in
|
||
Listing 19-30.
|
||
|
||
The `quote!` macro lets us define the Rust code that we want to return. The
|
||
compiler expects something different to the direct result of the `quote!`
|
||
macro’s execution, so we need to convert it to a `TokenStream`. We do this by
|
||
calling the `into` method, which consumes this intermediate representation and
|
||
returns a value of the required `TokenStream` type.
|
||
|
||
The `quote!` macro also provides some very cool templating mechanics: we can
|
||
enter `#name`, and `quote!` will replace it with the value in the variable
|
||
`name`. You can even do some repetition similar to the way regular macros work.
|
||
Check out the `quote` crate’s docs at *https://docs.rs/quote* for a thorough
|
||
introduction.
|
||
|
||
We want our procedural macro to generate an implementation of our `HelloMacro`
|
||
trait for the type the user annotated, which we can get by using `#name`. The
|
||
trait implementation has the one function `hello_macro`, whose body contains
|
||
the functionality we want to provide: printing `Hello, Macro! My name is` and
|
||
then the name of the annotated type.
|
||
|
||
The `stringify!` macro used here is built into Rust. It takes a Rust
|
||
expression, such as `1 + 2`, and at compile time turns the expression into a
|
||
string literal, such as `"1 + 2"`. This is different from `format!` or
|
||
`println!`, macros which evaluate the expression and then turn the result into
|
||
a `String`. There is a possibility that the `#name` input might be an
|
||
expression to print literally, so we use `stringify!`. Using `stringify!` also
|
||
saves an allocation by converting `#name` to a string literal at compile time.
|
||
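
The difference between `stringify!` and `format!` can be seen in a short sketch. The function names `as_written` and `as_evaluated` are hypothetical and used only for this comparison.

```rust
// `stringify!` captures the tokens as written, without evaluating them,
// and produces a string literal at compile time.
fn as_written() -> &'static str {
    stringify!(1 + 2)
}

// `format!` evaluates the expression first and allocates a `String`
// for the result at runtime.
fn as_evaluated() -> String {
    format!("{}", 1 + 2)
}
```

The first function yields the text of the expression, `1 + 2`, while the second yields the computed value, `3`.
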
|
||
At this point, `cargo build` should complete successfully in both `hello_macro`
|
||
and `hello_macro_derive`. Let’s hook up these crates to the code in Listing
|
||
19-30 to see the procedural macro in action! Create a new binary project in
|
||
your *projects* directory using `cargo new pancakes`. We need to add
|
||
`hello_macro` and `hello_macro_derive` as dependencies in the `pancakes`
|
||
crate’s *Cargo.toml*. If you’re publishing your versions of `hello_macro` and
|
||
`hello_macro_derive` to *https://crates.io*, they would be regular
|
||
dependencies; if not, you can specify them as `path` dependencies as follows:
|
||
|
||
```
|
||
[dependencies]
|
||
hello_macro = { path = "../hello_macro" }
|
||
hello_macro_derive = { path = "../hello_macro/hello_macro_derive" }
|
||
```
|
||
|
||
Put the code in Listing 19-30 into *src/main.rs*, and run `cargo run`: it
|
||
should print `Hello, Macro! My name is Pancakes!` The implementation of the
|
||
`HelloMacro` trait from the procedural macro was included without the
|
||
`pancakes` crate needing to implement it; the `#[derive(HelloMacro)]` added the
|
||
trait implementation.
|
||
|
||
Next, let’s explore how the other kinds of procedural macros differ from custom
|
||
`derive` macros.
|
||
|
||
### Attribute-like Macros

Attribute-like macros are similar to custom `derive` macros, but instead of
generating code for the `derive` attribute, they allow you to create new
attributes. They’re also more flexible: `derive` only works for structs,
enums, and unions; attributes can be applied to other items as well, such as
functions. Here’s an example of using an attribute-like macro. Say you have an
attribute named `route` that annotates functions when using a web application
framework:

```
#[route(GET, "/")]
fn index() {
```

This `#[route]` attribute would be defined by the framework as a procedural
macro. The signature of the macro definition function would look like this:

```
#[proc_macro_attribute]
pub fn route(
    attr: TokenStream,
    item: TokenStream
) -> TokenStream {
```

Here, we have two parameters of type `TokenStream`. The first is for the
contents of the attribute: the `GET, "/"` part. The second is the body of the
item the attribute is attached to: in this case, `fn index() {}` and the rest
of the function’s body.

Other than that, attribute-like macros work the same way as custom `derive`
macros: you create a crate with the `proc-macro` crate type and implement a
function that generates the code you want!

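To make the two inputs described above a little more concrete, here’s a
plain-Rust model of the first one. This is not the `proc_macro` API: a real
attribute macro receives `TokenStream` values and must live in a `proc-macro`
crate, so strings stand in for token streams here purely so the sketch can run
as an ordinary program. The `RouteArgs` type and the `parse_route_args`
function are hypothetical names, not part of any framework:

```rust
/// Hypothetical parsed form of the attribute contents `GET, "/"`.
#[derive(Debug, PartialEq)]
struct RouteArgs {
    method: String,
    path: String,
}

/// Splits attribute contents such as `GET, "/"` into a method and a path.
/// Returns `None` if the comma, the method, or the quotes are missing.
fn parse_route_args(attr: &str) -> Option<RouteArgs> {
    let (method, path) = attr.split_once(',')?;
    let method = method.trim();
    // The path is written as a string literal, so strip the quotes.
    let path = path.trim().strip_prefix('"')?.strip_suffix('"')?;
    if method.is_empty() {
        return None;
    }
    Some(RouteArgs {
        method: method.to_string(),
        path: path.to_string(),
    })
}

fn main() {
    // String stand-ins for the two inputs of `#[route(GET, "/")] fn index() {}`.
    let attr = r#"GET, "/""#;
    let item = "fn index() {}";
    match parse_route_args(attr) {
        Some(args) => println!("{} {} handled by `{}`", args.method, args.path, item),
        None => println!("could not parse attribute contents: {}", attr),
    }
}
```

A real implementation would typically parse the tokens rather than split text,
and would return the annotated item along with any generated code.
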
### Function-like Macros

Function-like macros define macros that look like function calls. Similarly to
`macro_rules!` macros, they’re more flexible than functions; for example, they
can take an unknown number of arguments. However, `macro_rules!` macros can
only be defined using the match-like syntax we discussed in the “Declarative
Macros with `macro_rules!` for General Metaprogramming” section. Function-like
macros take a `TokenStream` parameter, and their definition manipulates that
`TokenStream` using Rust code as the other two types of procedural macros do.
An example of a function-like macro is an `sql!` macro that might be called
like so:

```
let sql = sql!(SELECT * FROM posts WHERE id=1);
```

This macro would parse the SQL statement inside it and check that it’s
syntactically correct, which is much more complex processing than a
`macro_rules!` macro can do. The `sql!` macro would be defined like this:

```
#[proc_macro]
pub fn sql(input: TokenStream) -> TokenStream {
```

This definition is similar to the custom `derive` macro’s signature: we receive
the tokens that are inside the parentheses and return the code we wanted to
generate.

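For comparison with the `macro_rules!` restriction mentioned in this section,
here’s a small declarative macro that accepts any number of arguments using
the match-like syntax. The name `sum_all!` is hypothetical and used only for
this illustration:

```rust
macro_rules! sum_all {
    // No arguments: the sum of nothing is zero.
    () => { 0 };
    // One or more comma-separated expressions: add the first to the sum of
    // the rest by invoking the macro recursively.
    ( $first:expr $(, $rest:expr )* ) => {
        $first + sum_all!( $( $rest ),* )
    };
}

fn main() {
    println!("{}", sum_all!());
    println!("{}", sum_all!(1, 2, 3));
}
```

A function-like procedural macro could accept the same call syntax, but it
would inspect the incoming `TokenStream` with ordinary Rust code instead of
matching it against patterns, which is what makes checks like the `sql!`
example feasible.
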
## Summary

Whew! Now you have some Rust features in your toolbox that you likely won’t use
often, but you’ll know they’re available when particular circumstances call for
them. We’ve introduced several complex topics so that when you encounter them
in error message suggestions or in other people’s code, you’ll be able to
recognize these concepts and syntax. Use this chapter as a reference to guide
you to solutions.

Next, we’ll put everything we’ve discussed throughout the book into practice
and do one more project!