# Learning Rust

Welcome to my collection of resources for those learning the Rust programming language. The advice in these pages is typically suitable for those with at least a beginning familiarity with Rust -- for example, those who have worked through The Book -- but who are still experiencing the growing pains of learning Rust.
## Practical suggestions for building intuition around borrow errors

In this section we outline the basics of understanding lifetimes and borrowing for those who are struggling to understand borrow check errors.
## Miscellanea

This section collects relatively small and self-contained tutorials and other tidbits.
## A tour of `dyn Trait`

In this section we explore what `dyn Trait` is and is not, go over its limitations and strengths, take a deep dive into how lifetimes work with `dyn Trait`, provide some common `dyn Trait` recipes, and more.
# Practical suggestions for building intuition around borrow errors

Ownership, borrowing, and lifetimes cover a lot of ground in Rust, and thus this section is somewhat long.
It is also hard to create a succinct overview of the topic, as the aspects a newcomer first encounters depend on what projects they take on in the process of learning Rust. The genesis of this guide was advice for someone who picked a zero-copy regular expression crate as their learning project, and as such they ran into a lot of lifetime issues right off the bat. Someone who chose a web framework would be more likely to run into issues around `Arc` and `Mutex`, those who start with `async` projects will likely run into many `async`-specific errors, and so on.
Despite the length of this section, there are entire areas it doesn't yet touch on, such as destructor mechanics, shared ownership and shared mutability, and so on.
Even so, it's not expected that anyone will absorb everything in this guide all at once. Instead, it's intended to be a broad introduction to the topic. Skim it on your first pass and take time on the parts that seem relevant, or use them as pointers to look up more in-depth documentation on your own. Hopefully you will get a feel for what kinds of errors can arise even if you haven't run into them yourself yet, so that you're not completely baffled when they do arise.
In general, your mental model of ownership and borrowing will go through a few stages of evolution. It will pay off to return to any given area of the topic with fresh eyes every once in a while.
## Keep at it and participate in the community

Experience plays a very big role in understanding borrow errors and in being able to figure out why some code is problematic in a reasonable amount of time. Running into borrow check errors and solving them, either on your own or by asking for help, is part of the process of building up that intuition.
I personally learned a great deal by participating on the Rust users' forum, URLO. Another way you can build experience and broaden your intuition is by reading others' lifetime questions on that forum, reading the replies to those questions, and trying to solve their problems -- or even just experimenting with them! -- yourself.

By being an active participant, you can not only learn more, but will eventually be able to help out other users.
- When you read a question and are lost: read the replies and see if you can understand them and learn more
- When you have played with a question and got it to compile but aren't sure why: reply with something like "I don't know if this fixes your use case, but this compiles for me: Playground"
- After you've got the hang of certain common problems: "It's because of XYZ. You can do this instead: ..."
Even after I got my sea-legs, some of the more advanced lifetime and borrow-checker skills I've developed came from solving other people's problems on this forum. Once you can answer their questions, you may still learn things if someone else comes along and provides a different or better solution.
## Prefer ownership over long-lived references

A common newcomer mistake is to overuse references or other lifetime-carrying data structures by storing them in their own long-lived structures. This is often a mistake because the primary use case of non-`'static` references is short-term borrows.

If you're trying to create a lifetime-carrying struct, stop and consider whether you could use a struct with no lifetime instead, for example by replacing `&str` with `String`, or `&[T]` with `Vec<T>`.
## Don't hide lifetimes

When you use lifetime-carrying structs (whether your own or someone else's), the Rust compiler currently lets you elide the lifetime parameter when mentioning the struct:

```rust
#![allow(unused)]
fn main() {
    struct Foo<'a>(&'a str);
    impl<'a> Foo<'a> {
        // No lifetime syntax needed :-(
        //                 vvv
        fn new(s: &str) -> Foo {
            Foo(s)
        }
    }
}
```

This can make it non-obvious that borrowing is going on, and harder to figure out where errors are coming from. To save yourself some headaches, I recommend using the `#![deny(elided_lifetimes_in_paths)]` lint:

```rust
#![allow(unused)]
#![deny(elided_lifetimes_in_paths)]
fn main() {
    struct Foo<'a>(&'a str);
    impl<'a> Foo<'a> {
        // Now this is an error
        //                 vvv
        fn new(s: &str) -> Foo {
            Foo(s)
        }
    }
}
```
The first thing I do when taking on a borrow check error in someone else's code is to turn on this lint. If you have not enabled the lint and are getting errors in your own code, try enabling the lint. For every place that errors, take a moment to pause and consider what is going on with the lifetimes. Sometimes there's only one possibility and you will just need to make a trivial change:
```diff
- fn new(s: &str) -> Foo {
+ fn new(s: &str) -> Foo<'_> {
```
But often, in my experience, one of the error sites will be part of the problem you're dealing with.
## Understand elision and get a feel for when to name lifetimes

Read the function lifetime elision rules. They're intuitive for the most part: they're built around what you need in the common case, but they're not always the solution. For example, they assume an input of `&'s self` means you intend to return a borrow of `self` or its contents with lifetime `'s`, but this is often not correct for lifetime-carrying data structures.
If you get a lifetime error involving elided lifetimes, try giving all the lifetimes names. This can improve the compiler errors; if nothing else, you won't have to work so hard to mentally track the numbers used in errors for illustration, or what "anonymous lifetime" is being talked about.
Take care to refer to the elision rules when naming all the lifetimes, or you may inadvertently change the meaning of the signature. If the error changes drastically, you've probably changed the meaning of the signature.
Once you have a good feel for the function lifetime elision rules, you'll start developing intuition for when you need to name your lifetimes instead.
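Here's a sketch (the struct and method are my own invention) of the case called out above, where elision would infer the wrong lifetime and naming one is the fix:

```rust
struct Parser<'a> {
    input: &'a str,
}

impl<'a> Parser<'a> {
    // With elision, `fn next_word(&mut self) -> Option<&str>` would tie
    // the return to the `&mut self` borrow, keeping the parser
    // exclusively borrowed. Naming `'a` says the return borrows from
    // the original input instead:
    fn next_word(&mut self) -> Option<&'a str> {
        let trimmed = self.input.trim_start();
        if trimmed.is_empty() {
            return None;
        }
        let end = trimmed.find(' ').unwrap_or(trimmed.len());
        let (word, rest) = trimmed.split_at(end);
        self.input = rest;
        Some(word)
    }
}

fn main() {
    let mut parser = Parser { input: "hello world" };
    let first = parser.next_word();
    // Fine: `first` borrows the input, not the parser itself
    let second = parser.next_word();
    assert_eq!(first, Some("hello"));
    assert_eq!(second, Some("world"));
}
```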
## Get a feel for variance, references, and reborrows

Here are some official docs on the topic of variance, but reading them may make you go cross-eyed. As an alternative, in this section I attempt to introduce some basic rules about how references work with regard to lifetimes, over the course of a few layers.
If it still makes you cross-eyed, just skim or skip ahead.
### The seed of a mental model

Some find it helpful to think of shared (`&T`) and exclusive (`&mut T`) references like so:

- `&T` is a compiler-checked `RwLockReadGuard`
  - You can have as many of these at one time as you want
- `&mut T` is a compiler-checked `RwLockWriteGuard`
  - You can only have one of these at one time, and when you do, you can have no `RwLockReadGuard`s
The exclusivity is key.
`&mut T` are often called "mutable references" for obvious reasons. And following from that, `&T` are often called "immutable references". However, I find it more accurate and consistent to call `&mut T` an exclusive reference and to call `&T` a shared reference.

This guide doesn't yet cover shared mutability, more commonly called interior mutability, but you'll run into the concept sometime in your Rust journey. The one thing I will mention here is that it enables mutation behind a `&T`, and thus "immutable reference" is a misleading name.
There are other situations where the important quality of a `&mut T` is the exclusivity that it guarantees, and not the ability to mutate through it. If you find yourself annoyed at getting borrow errors about `&mut T` when you performed no actual mutation, it may help to think of `&mut T` as a directive to the compiler to ensure exclusive access rather than mutable access.
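For instance (a hypothetical function of my own), a `&mut` parameter can simply express "nothing else may observe this while I work":

```rust
// `checked_sum` never mutates `v`, but taking `&mut` guarantees
// nothing else can observe `v` mid-computation.
fn checked_sum(v: &mut Vec<i32>) -> i32 {
    v.iter().sum()
}

fn main() {
    let mut v = vec![1, 2, 3];
    assert_eq!(checked_sum(&mut v), 6);

    // If a shared borrow were still live, the call would be rejected
    // even though no mutation occurs:
    // let first = &v[0];
    // checked_sum(&mut v); // error[E0502]: cannot borrow `v` as mutable
    // println!("{first}");
}
```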
### Reference types

Let's open with a question: is `&str` a type?

When not being pedantic or formal, pretty much everyone will say yes, `&str` is a type. However, it is technically a type constructor which is parameterized with a generic lifetime parameter. So `&str` isn't technically a type; `&'a str` for some concrete lifetime is a type. `&'a str` and `&'b str` are not the same type, unless `'a == 'b`.

Similarly, `Vec<T>` for a generic `T` is a type constructor, but `Vec<i32>` is a type. `Vec<T>` and `Vec<U>` are not the same type, unless `T == U`.
By "concrete lifetime", I mean some compile-time determined lifetime. The exact definition of "lifetime" is surprisingly complicated and beyond the scope of this guide, but here are a few examples of `&str`s and their concrete types.
```rust
#![allow(unused)]
fn main() {
    // The exact lifetime of `'a` is determined at each call site. We'll explore
    // what this means in more depth later.
    //
    // The lifetime of `b` works the same, we just didn't give it a name.
    fn example<'a>(a: &'a str, b: &str) {
        // Literal strings are `&'static str`
        let s = "literal";

        // The lifetimes of local borrows are determined by compiler analysis
        // and have no names (but each is still a single lifetime).
        let local = String::new();
        let borrow = local.as_str();

        // These are the same and they just tell the compiler to infer the
        // lifetime. In this small example that means the same thing as not
        // having a type annotation at all.
        let borrow: &str = local.as_str();
        let borrow: &'_ str = local.as_str();
    }
}
```
### Lifetime bounds

Here's a brief introduction to the lifetime bounds you may see on `fn` declarations and `impl` blocks.
#### Bounds between lifetimes

A `'a: 'b` bound means, roughly speaking, `'long: 'short`. It's often read as "`'a` outlives `'b`" and it is sometimes called an "outlives bound" or "outlives relation". I personally also like to read it as "`'a` is valid for (at least) `'b`".

Note that `'a` may be the same as `'b`; it does not have to be strictly longer despite the "outlives" terminology. It is analogous to `>=` in this respect. Therefore, in this example:

```rust
#![allow(unused)]
fn main() {
    fn example<'a: 'b, 'b: 'a>(a: &'a str, b: &'b str) {}
}
```

`'a` and `'b` must actually be the same lifetime.
When you have a function argument with a nested reference such as `&'b Foo<'a>`, a `'a: 'b` bound is inferred.
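A minimal sketch of an explicit outlives bound in action (the function is my own example):

```rust
// The `'long: 'short` bound is what lets the body return `a` (a
// `&'long str`) where a `&'short str` is expected; without the bound,
// this fails with "lifetime may not live long enough".
fn pick_first<'long: 'short, 'short>(a: &'long str, _b: &'short str) -> &'short str {
    a
}

fn main() {
    let long_lived = "static data";
    let short_lived = String::from("shorter");
    assert_eq!(pick_first(long_lived, &short_lived), "static data");
}
```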
#### Bounds between (generic) types and lifetimes

A `T: 'a` bound means that a `&'a T` would not be instantly undefined behavior. In other words, it means that if the type `T` contains any references or other lifetimes, they must be at least as long as `'a`. You can also read these as "(the type) `T` is valid for `'a`".

Note that this has nothing to do with the liveness scope or drop scope of a value of type `T`!

In particular, the most common bound of this form is `T: 'static`. This does not mean the value of type `T` must last for your entire program! It just means that the type `T` has no non-`'static` lifetimes. `String: 'static` holds, for example, but this doesn't mean that you don't drop `String`s.
### Liveness scopes of values

For the above reasons, I prefer to never refer to the liveness or drop scope of a value as the value's "lifetime". Although there is a connection between the liveness scope of a value and the lifetimes of references you take to it, conflating the two concepts can lead to confusion.

That said, not everyone follows this convention, so you may see the liveness scope of a value referred to as the value's "lifetime". The distinction is something to just generally be aware of.
### Reference lifetimes

Here's something you'll utilize in Rust all the time without thinking about it:

- A `&'long T` coerces to a `&'short T`
- A `&'long mut T` coerces to a `&'short mut T`

The technical term is "covariant (in the lifetime)", but a practical mental model is "the (outer) lifetime of references can shrink".

The property holds for values whose type is a reference, but it doesn't always hold for other types. For example, we'll soon see that this property doesn't always hold for the lifetime of a reference nested within another reference. When the property doesn't hold, it's usually due to invariance.

Even if you never really think about covariance, you'll grow an intuition for it -- so much so that you'll eventually be surprised when you encounter invariance and the property doesn't hold. We'll look at some cases soon.
### Copy and reborrows

Shared references (`&T`) implement `Copy`, which makes them very flexible. Once you have one, you can have as many as you want; once you've exposed one, you can't keep track of how many there are.

Exclusive references (`&mut T`) do not implement `Copy`. Instead, you can use them ergonomically through a mechanism called reborrowing. For example here:
```rust
#![allow(unused)]
fn main() {
    fn foo<'v>(v: &'v mut Vec<i32>) {
        v.push(0);          // line 1
        println!("{v:?}");  // line 2
    }
}
```
You're not moving `v: &mut Vec<i32>` when you pass it to `push` on line 1, or you couldn't print it on line 2. But you're not copying it either, because `&mut _` does not implement `Copy`. Instead, `*v` is reborrowed for some shorter lifetime than `'v`, which ends on line 1.
An explicit reborrow would look like this:

```rust
#![allow(unused)]
fn main() {
    Vec::push(&mut *v, 0);
}
```

`v` can't be used while the reborrow `&mut *v` exists, but after it "expires", you can use `v` again. In this way, both `&mut`s are still exclusive borrows.

Though tragically underdocumented, reborrowing is what makes `&mut` usable; there's a lot of implicit reborrowing in Rust.
Reborrowing makes `&mut T` act like the `Copy`-able `&T` in some ways. But the necessity that `&mut T` be exclusive while it exists leads to it being much less flexible.

Reborrowing is a large topic on its own, but you should at least understand that it exists, and that it is what enables Rust to be usable and ergonomic while still enforcing memory safety.
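A small sketch (my own example) of why implicit reborrowing matters: without it, passing a non-`Copy` `&mut` to a function would move it, and you could only call the function once:

```rust
fn push_twice(v: &mut Vec<i32>) {
    // Each `push` call implicitly reborrows `*v` for just the duration
    // of the call.
    v.push(1);
    v.push(2);
}

fn main() {
    let mut v = Vec::new();
    let r = &mut v;
    push_twice(r); // implicit reborrow, like `push_twice(&mut *r)`
    push_twice(r); // `r` is usable again after the reborrow expires
    assert_eq!(v, [1, 2, 1, 2]);
}
```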
### Nested borrows and invariance

Now let's consider nested references:

- A `&'medium &'long U` coerces to a `&'short &'short U`
- A `&'medium mut &'long mut U` coerces to a `&'short mut &'long mut U`...
  - ...but not to a `&'short mut &'short mut U`
We say that `&mut T` is invariant in `T`, which means any lifetimes in `T` cannot change (grow or shrink) at all. In the example, `T` is `&'long mut U`, and the `'long` cannot be changed.
Why not? Consider this:
```rust
#![allow(unused)]
fn main() {
    fn bar(v: &mut Vec<&'static str>) {
        let w: &mut Vec<&'_ str> = v; // call the lifetime 'w
        let local = "Gottem".to_string();
        w.push(&*local);
    } // `local` drops
}
```
If `'w` were allowed to be shorter than `'static`, we'd end up with a dangling reference in `*v` after `bar` returns.
You will inevitably end up with a feel for covariance from using references with their flexible outer lifetimes, but eventually hit a use case where invariance matters and causes some borrow check errors, because it's (necessarily) so much less flexible. It's just part of the Rust learning experience.
Let's look at one more property of nested references you may run into:

- You can get a `&'long U` from a `&'short &'long U`:
  - Just copy it out!
- But you cannot get a `&'long mut U` through dereferencing a `&'short mut &'long mut U`.
  - You can only reborrow a `&'short mut U`.
  - Obtaining a `&'long mut U` is sometimes possible by swapping or temporarily moving out of the inner reference, for example with `mem::swap`, `mem::replace`, or `replace_with`. See the mutable slice iterator example.
The reason is again to prevent memory unsafety.
Additionally,

- You cannot get a `&'long U` or any `&mut U` from a `&'short &'long mut U`
  - You can only reborrow a `&'short U`

Recall that once a shared reference exists, any number of copies of it could simultaneously exist. Therefore, so long as the outer shared reference exists (and could be used to observe `U`), the inner `&mut` must not be usable in a mutable or otherwise exclusive fashion.
And once the outer reference expires, the inner `&mut` is active and must again be exclusive, so it must not be possible to obtain a `&'long U` either.
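Here's a compilable sketch (my own example) of the shared case, with the rejected `&mut` analogue left in comments:

```rust
// Copying the inner shared reference out of a `&'short &'long str` is
// fine: the copy lives for the inner lifetime, independent of the
// outer reference.
fn copy_out<'short, 'long>(nested: &'short &'long str) -> &'long str {
    *nested
}

// The `&mut` analogue is rejected: dereferencing only yields a
// reborrow limited to the outer lifetime.
// fn take_inner<'s, 'l>(nested: &'s mut &'l mut String) -> &'l mut String {
//     &mut **nested // error: lifetime may not live long enough
// }

fn main() {
    let inner: &'static str = "lasts a long time";
    let copied;
    {
        let outer = &inner; // a short-lived outer reference
        copied = copy_out(outer);
    } // `outer` is gone here...
    // ...but the copy is still valid:
    assert_eq!(copied, "lasts a long time");
}
```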
### Invariance elsewhere

While behind a `&mut` is the most common place to first encounter invariance, it's present elsewhere as well.

`Cell<T>` and `RefCell<T>` are also invariant in `T`.
Trait parameters are invariant too. As a result, lifetime-parameterized traits can be onerous to work with.

Additionally, if you have a bound like `T: Trait<U>`, `U` becomes invariant because it's a type parameter of the trait. If your `U` resolves to `&'x V`, the lifetime `'x` will be invariant too.
### Non-references

The variance of the lifetime and type parameters of your own `struct`s is automatically inferred from how you use those parameters in your fields. For example, if you have

```rust
#![allow(unused)]
fn main() {
    struct AllMine<'a, T>(&'a mut T);
}
```

then `AllMine` is covariant in `'a` and invariant in `T`, just like `&'a mut T` is.
## Get a feel for borrow-returning methods

Here we look at how borrow-returning methods work. Our examples will consider a typical pattern:

```rust
#![allow(unused)]
fn main() {
    fn method(&mut self) -> &SomeReturnType {}
}
```
### When not to name lifetimes

Sometimes newcomers try to solve borrow check errors by making things more generic, which often involves adding lifetimes and naming previously-elided lifetimes:

```rust
#![allow(unused)]
fn main() {
    struct S;
    impl S {
        fn quz<'a: 'b, 'b>(&'a mut self) -> &'b str {
            todo!()
        }
    }
}
```

But this doesn't actually permit more lifetimes than this:

```rust
#![allow(unused)]
fn main() {
    struct S;
    impl S {
        fn quz<'b>(&'b mut self) -> &'b str {
            todo!()
        }
    }
}
```

Because in the first example, `&'a mut self` can coerce to `&'b mut self`. And, in fact, you want it to -- because you generally don't want to exclusively borrow `self` any longer than necessary. And at this point you can instead utilize lifetime elision and stick with:

```rust
#![allow(unused)]
fn main() {
    struct S;
    impl S {
        fn quz(&mut self) -> &str {
            todo!()
        }
    }
}
```
As covariance and function lifetime elision become more intuitive, you'll build a feel for when it's pointless to name lifetimes. Adding superfluous lifetimes like in the first example tends to make understanding borrow errors harder, not easier.
### Bound-related lifetimes "infect" each other

Separating `'a` and `'b` in the last section didn't make things any more flexible in terms of `self` being borrowed. Once you declare a bound like `'a: 'b`, the two lifetimes "infect" each other. Even though the return type had a different lifetime than the input, it was still effectively a reborrow of the input.

This can actually happen between two input parameters too: if you've stated a lifetime relationship between two borrows, the compiler assumes they can observe each other in some sense. It's probably not anything you'll run into soon, but if you do, the compiler errors tend to be drop errors ("borrow might be used here, when `x` is dropped"), or sometimes read like "data flows from X into Y".
### `&mut` inputs don't "downgrade" to `&`

Still talking about this signature:

```rust
#![allow(unused)]
fn main() {
    fn quz(&mut self) -> &str {
        todo!()
    }
}
```

Newcomers often expect `self` to only be shared-borrowed after `quz` returns, because the return is a shared reference. But that's not how things work; `self` remains exclusively borrowed for as long as the returned `&str` is valid.
I find looking at the exact return type a trap when trying to build a mental model for this pattern. The fact that the lifetimes are connected is crucial, but beyond that, focus instead on the input parameter: you cannot call the method until you have created a `&mut self` with a lifetime as long as the return type has. Once that exclusive borrow (or reborrow) is created, the exclusiveness lasts for the entirety of the lifetime. Moreover, you give the `&mut self` away by calling the method (it is not `Copy`), so you can't create any other reborrows of `self` other than through whatever the method returns to you (in this case, the `&str`).
### `async` and returning `impl Trait`

The return type of an `async` function captures all of its generic parameters, including any lifetimes. So here:

```rust
#![allow(unused)]
fn main() {
    async fn example(v: &mut Vec<String>) -> String {
        "Hi :-)".to_string()
    }
}
```

The future returned by the `async fn` implicitly reborrows the `v` input and "carries" the same lifetime, just like the other examples we saw.
The same is true when you use return-position `impl Trait` (RPIT) in traits (RPITIT):

```rust
#![allow(unused)]
fn main() {
    struct MyStruct {}

    trait StringIter {
        fn iter(&self) -> impl Iterator<Item = String>;
    }

    impl StringIter for MyStruct {
        fn iter(&self) -> impl Iterator<Item = String> {
            ["Hi :-)"].into_iter().map(ToString::to_string)
        }
    }

    // Fails to compile as `iter` borrows `my`
    fn example(my: MyStruct) {
        let iter = my.iter();
        let _move_my = my;
        for _ in iter {}
    }
}
```
`self` will remain borrowed here for as long as the iterator is alive. Note that this is true even if it's not required by the body! The implicit lifetime capture is considered part of the API contract.

For both of these cases, all generic types are also captured.
Note that RPIT outside of traits does not implicitly capture lifetimes! At least, not as of this writing -- the plan is that RPIT outside of traits will act like RPITIT and implicitly capture all lifetimes in edition 2024 and beyond. But for now, they only implicitly capture type parameters.
```rust
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    // Requires edition 2021 or before
    fn no_capture(s: &str) -> impl Display {
        s.to_string()
    }

    // This wouldn't compile if `no_capture` reborrowed `*s`
    fn check() {
        let mut s = "before".to_string();
        let d = no_capture(&s);
        s = "after".to_string();
        println!("{d}");
    }
}
```
```rust
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    // This fails on edition 2021 or before, because it tries to
    // return a reborrow of `*s`, but that requires capturing the lifetime
    fn no_capture(s: &str) -> impl Display {
        s
    }
}
```
```rust
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    // This allows it to work again (but `+ '_` is too restrictive
    // for every situation, which is part of why edition 2024 will
    // change the behavior of RPIT outside of traits)
    //
    //                                      vvvv
    fn no_capture(s: &str) -> impl Display + '_ {
        s
    }
}
```
A lot can be said about `async` and RPITs; more than can be covered here. But their implicit capturing nature is something to be aware of, given how invisible it is.
## Understand function lifetime parameters

First, note that elided lifetimes in function signatures are invisible lifetime parameters on the function.

```rust
#![allow(unused)]
fn main() {
    fn zed(s: &str) {}
}
```

```rust
#![allow(unused)]
fn main() {
    // same thing
    fn zed<'s>(s: &'s str) {}
}
```
When you have a lifetime parameter like this, the caller chooses the lifetime. But the body of your function is opaque to the caller, so the only lifetimes they can choose are lifetimes longer than your function body.
So when you have a lifetime parameter on your function (without any further bounds), the only things you know are:

- It's longer than your function body
- You don't get to pick it, and it could be arbitrarily long (even `'static`)
- But it could also be just barely longer than your function body; you have to support both cases
And the main corollaries are:

- You can't borrow locals for a caller-chosen lifetime
- You can't extend a caller-chosen lifetime to some other named lifetime in scope
  - Unless there's some other outlives bound that makes it possible
Here are a couple of error examples related to function lifetime parameters:

```rust
#![allow(unused)]
fn main() {
    fn long_borrowing_local<'a>(name: &'a str) {
        let local = String::new();
        let borrow: &'a str = &local;
    }

    fn borrowing_zed(name: &str) -> &str {
        match name.len() {
            0 => "Hello, stranger!",
            _ => &format!("Hello, {name}!"),
        }
    }
}
```
## Understand borrows within a function

The analysis that the compiler does to determine lifetimes and borrow check within a function body is quite complicated. A full exploration is beyond the scope of this guide, but we'll give a brief introduction here.

Your best bet if you run into an error you can't understand is to ask for help on the forum or elsewhere.

### Borrow errors within a function

Here are some simple causes of borrow check errors within a function.
#### Recalling the Basics

The most basic mechanism to keep in mind is that `&mut` references are exclusive, while `&` references are shared and implement `Copy`. You can't intermix using a shared reference and an exclusive reference to the same value, or two exclusive references to the same value.
```rust
fn main() {
    let mut local = "Hello".to_string();

    // Creating and using a shared reference
    let x = &local;
    println!("{x}");

    // Creating and using an exclusive reference
    let y = &mut local;
    y.push_str(", world!");

    // Trying to use the shared reference again
    println!("{x}");
}
```
This doesn't compile because as soon as you created the exclusive reference, any other existing references must cease to be valid.
#### Borrows are often implicit

Here's the example again, only slightly rewritten.

```rust
fn main() {
    let mut local = "Hello".to_string();

    // Creating and using a shared reference
    let x = &local;
    println!("{x}");

    // Implicitly creating and using an exclusive reference
    local.push_str(", world!");

    // Trying to use the shared reference again
    println!("{x}");
}
```
Here, `push_str` takes `&mut self`, so an implicit `&mut local` exists as part of the method call, and thus the example still does not compile.
#### Creating a `&mut` is not the only exclusive use

The borrow checker looks at every use of a value to see if it's compatible with the lifetimes of borrows to that value, not just uses that involve references or lifetimes. For example, moving a value invalidates any references to the value, as otherwise those references would dangle.

```rust
fn main() {
    let local = "Hello".to_string();

    // Creating and using a shared reference
    let x = &local;
    println!("{x}");

    // Moving the value
    let _local = local;

    // Trying to use the shared reference again
    println!("{x}");
}
```
#### Referenced values must remain in scope

The effects of a value going out of scope are similar to moving the value: all references are invalidated.

```rust
fn main() {
    let x;
    {
        let local = "Hello".to_string();
        x = &local;
    } // `local` goes out of scope here

    // Trying to use the shared reference after `local` goes out of scope
    println!("{x}");
}
```
#### Using `&mut self` or `&self` counts as a use of all fields

In the example below, `left` becomes invalid when we create `&self` to call `bar`. Because you can get a `&self.left` out of a `&self`, this is similar to trying to intermix `&mut self.left` and `&self.left`.

```rust
#![allow(unused)]
fn main() {
    #[derive(Debug)]
    struct Pair {
        left: String,
        right: String,
    }

    impl Pair {
        fn foo(&mut self) {
            let left = &mut self.left;
            left.push_str("hi");
            self.bar();
            println!("{left}");
        }

        fn bar(&self) {
            println!("{self:?}");
        }
    }
}
```
More generally, creating a `&mut x` or `&x` counts as a use of everything reachable from `x`.
### Some things that compile successfully

Once you've started to get the hang of borrow errors, you might start to wonder why certain programs are allowed to compile. Here we introduce some of the ways that Rust allows non-trivial borrowing while still being sound.
#### Independently borrowing fields

Rust tracks borrows of struct fields individually, so the borrows of `left` and `right` below do not conflict.

```rust
#![allow(unused)]
fn main() {
    #[derive(Debug)]
    struct Pair {
        left: String,
        right: String,
    }

    impl Pair {
        fn foo(&mut self) {
            let left = &mut self.left;
            let right = &mut self.right;
            left.push_str("hi");
            right.push_str("there");
            println!("{left} {right}");
        }
    }
}
```
This capability is also called splitting borrows.
Note that data you access through indexing is not considered a field per se; instead, indexing is an operation that generally borrows all of `&self` or `&mut self`.

```rust
fn main() {
    let mut v = vec![0, 1, 2];

    // These two do not overlap, but...
    let left = &mut v[..1];
    let right = &mut v[1..];

    // ...the borrow checker cannot recognize that
    println!("{left:?} {right:?}");
}
```
Usually in this case, one uses methods like `split_at_mut` in order to split the borrows instead.
Similarly to indexing, when you access something through "deref coercion", you're exercising the `Deref` trait (or `DerefMut`), which borrows all of `self`.
There are also some niche cases where the borrow checker is smarter, however.
```rust
fn main() {
    // Pattern matching does understand non-overlapping slices (slices are special)
    let mut v = vec![String::new(), String::new()];
    let slice = &mut v[..];
    if let [_left, right] = slice {
        if let [left, ..] = slice {
            left.push_str("left");
        }
        // Still usable!
        right.push_str("right");
    }
}
```

```rust
fn main() {
    // You can split borrows through a `Box` dereference (`Box` is special)
    let mut bx = Box::new((0, 1));
    let left = &mut bx.0;
    let right = &mut bx.1;
    *left += 1;
    *right += 1;
}
```
The examples are non-exhaustive 🙂.
#### Reborrowing

As mentioned before, reborrows are what make `&mut` reasonable to use. In fact, they have other special properties you can't emulate with a custom struct and trait implementations. Consider this example:

```rust
#![allow(unused)]
fn main() {
    fn foo(s: &mut String) -> &str {
        &**s
    }
}
```
Actually, that's too fast. Let's change this a little bit and go step by step.
```rust
#![allow(unused)]
fn main() {
    fn foo(s: &mut String) -> &str {
        let ms: &mut str = &mut **s;
        let rs: &str = &*s;
        rs
    }
}
```
Here, both `s` and `ms` go out of scope at the end of `foo`, but this doesn't invalidate `rs`. That is, reborrowing through references can impose lifetime constraints on the reborrow, but the reborrow is not dependent on the references staying in scope! It is only dependent on the borrowed data.
This demonstrates that reborrowing is more powerful than nesting references.
#### Shared reborrowing

When it comes to detecting conflicts, the borrow checker distinguishes between shared reborrows and exclusive ones. In particular, creating a shared reborrow will invalidate any exclusive reborrows of the same value (as they are no longer exclusive). But it will not invalidate other shared reborrows:

```rust
#![allow(unused)]
fn main() {
    struct Pair {
        left: String,
        right: String,
    }

    impl Pair {
        fn foo(&mut self) {
            // A split borrow: exclusive reborrow, shared reborrow
            let left = &mut self.left;
            let right = &self.right;
            left.push('x');

            // Shared reborrow of all of `self`, which "covers" all fields
            let this = &*self;

            // It invalidates any exclusive reborrows, so this would fail...
            // println!("{left}");

            // But it does not invalidate shared reborrows!
            println!("{right}");
        }
    }
}
```
#### Two-phase borrows

The following code compiles:

```rust
fn main() {
    let mut v = vec![0];
    let shared = &v;
    v.push(shared.len());
}
```

However, if you're aware of the order of evaluation here, it probably seems like it shouldn't. The implicit `&mut v` should have invalidated `shared` before `shared.len()` was evaluated. What gives?
This is the result of a feature called two-phase borrows, which is intended to make nested method calls more ergonomic:

```rust
fn main() {
    let mut v = vec![0];
    v.push(v.len());
}
```

In the olden days, you would have had to write it like so:

```rust
fn main() {
    let mut v = vec![0];
    let len = v.len();
    v.push(len);
}
```
The implementation slipped, which is why the first example compiles too. How far it slipped is hard to say, as not only is there no specification, the feature doesn't even seem to be documented 🤷.
## Learn some pitfalls and anti-patterns

Here we cover some pitfalls to recognize and anti-patterns to avoid.

First, read this list of common lifetime misconceptions by @pretzelhammer. Skip or skim the parts that don't make sense to you yet, and return to it for a re-read occasionally.
### `dyn Trait` lifetimes and `Box<dyn Trait>`

Every trait object (`dyn Trait`) has an elidable lifetime with its own defaults when completely elided. This default is stronger than the normal function signature elision rules. The most common way to run into a lifetime error about this is with `Box<dyn Trait>` in your function signatures, structs, and type aliases, where it means `Box<dyn Trait + 'static>`.

Often the error indicates that non-`'static` references/types aren't allowed in that context, but sometimes it means that you should add an explicit lifetime, like `Box<dyn Trait + 'a>` or `Box<dyn Trait + '_>`. The latter will act like "normal" lifetime elision; for example, it will introduce a new anonymous lifetime parameter as a function input parameter, or use the `&self` lifetime in return position.

The reason the lifetime exists is that coercing values to `dyn Trait` erases their base type, including any lifetimes it may contain. But those lifetimes have to be tracked by the compiler somehow to ensure memory safety. The `dyn Trait` lifetime represents the maximum lifetime the erased type is valid for.
Some short examples:
```rust
trait Trait {}

// The return is `Box<dyn Trait + 'static>` and this errors as there
// needs to be a bound requiring `T` to be `'static`, or the return
// type needs to be more flexible
fn one<T: Trait>(t: T) -> Box<dyn Trait> {
    Box::new(t)
}

// This works as we've added the bound
fn two<T: Trait + 'static>(t: T) -> Box<dyn Trait> {
    Box::new(t)
}

// This works as we've made the return type more flexible. We still
// have to add a lifetime bound.
fn three<'a, T: Trait + 'a>(t: T) -> Box<dyn Trait + 'a> {
    Box::new(t)
}
```
For a more in-depth exploration,
see this section of the dyn Trait
tour.
Conditional return of a borrow
The compiler isn't perfect, and there are some things it doesn't yet accept which are in fact sound and could be accepted. Perhaps the most common one to trip on is conditional return of a borrow, aka NLL Problem Case #3. There are some examples and workarounds in the issue and related issues.
The plan is still to accept that pattern some day.
More generally, if you run into something and don't understand why it's an error or think it should be allowed, try asking in a forum post.
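To give the pattern a concrete shape (this sketch is illustrative; the canonical examples live in the linked issue), here's a map lookup that conditionally returns a borrow, with one common workaround:

```rust
use std::collections::HashMap;

// The problematic shape (sound, but rejected today) looks like:
//
//     fn get_default(map: &mut HashMap<u32, String>, key: u32) -> &String {
//         match map.get(&key) {
//             Some(v) => v, // returning extends the shared borrow of `map`...
//             None => {
//                 map.insert(key, String::new()); // ...so this is rejected
//                 &map[&key]
//             }
//         }
//     }
//
// For maps specifically, the entry API side-steps the conditional return:
fn get_default(map: &mut HashMap<u32, String>, key: u32) -> &String {
    map.entry(key).or_default()
}

fn main() {
    let mut map = HashMap::new();
    map.insert(1, "one".to_string());
    assert_eq!(get_default(&mut map, 1), "one");
    assert_eq!(get_default(&mut map, 2), "");
}
```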
Borrowing something forever
An anti-pattern you may run into is to create a &'a mut Thing<'a>
. It's an anti-pattern because it translates
into "take an exclusive reference of Thing<'a>
for the entire rest of its validity ('a
)". Once you create
the exclusive borrow, you cannot use the Thing<'a>
ever again, except via that borrow.
You can't call methods on it, you can't take another reference to it, you can't move it, you can't print it, you
can't use it at all. You can't even call a non-trivial destructor on it; if you have a non-trivial destructor,
your code won't compile in the presence of &'a mut Thing<'a>
.
So avoid &'a mut Thing<'a>
.
Examples:
```rust
#[derive(Debug)]
struct Node<'a>(&'a str);

fn example_1<'a>(node: &'a mut Node<'a>) {}

struct DroppingNode<'a>(&'a str);

impl Drop for DroppingNode<'_> {
    fn drop(&mut self) {}
}

fn example_2<'a>(node: &'a mut DroppingNode<'a>) {}

fn main() {
    let local = String::new();

    let mut node_a = Node(&local);
    // You can do this once and it's ok...
    example_1(&mut node_a);

    let mut node_b = Node(&local);
    // ...but then you can't use the node directly ever again
    example_1(&mut node_b);
    println!("{node_b:?}");

    let mut node_c = DroppingNode(&local);
    // And this doesn't work at all
    example_2(&mut node_c);
}
```
We look at the shared version of this pattern (&'a Thing<'a>
) a little later.
&'a mut self
and Self
aliasing more generally
fn foo(&'a mut self)
is a red flag -- because if that 'a
wasn't declared on the function,
it's probably part of the Self
struct. And if that's the case, this is a &'a mut Thing<'a>
in disguise. As discussed in the previous section, this will make
self
unusable afterwards, and thus is an anti-pattern.
More generally, self
types and the Self
alias include any parameters on the type constructor post-resolution.
Which means here:
```rust
struct Node<'a>(&'a str);

impl<'a> Node<'a> {
    fn new(s: &str) -> Self {
        Node(s)
    }
}
```
Self
is an alias for Node<'a>
. It is not an alias for Node<'_>
. So it means:
```rust
fn new<'s>(s: &'s str) -> Node<'a> {
```
And not:
```rust
fn new<'s>(s: &'s str) -> Node<'s> {
```
And you really meant to code one of these:
```rust
fn new(s: &'a str) -> Self {
fn new(s: &str) -> Node<'_> {
```
Similarly, using Self
as a constructor will use the resolved type parameters. So this won't work:
```rust
fn new(s: &str) -> Node<'_> {
    Self(s)
}
```
You need
```rust
fn new(s: &str) -> Node<'_> {
    Node(s)
}
```
Avoid self-referential structs
By self-referential, I mean you have one field that is a reference, and that reference points to another field (or contents of a field) in the same struct.
```rust
struct Snek<'a> {
    owned: String,
    // Like if you want this to point to the `owned` field
    borrowed: &'a str,
}
```
The only safe way to construct this to be self-referential is to take a &'a mut Snek<'a>
, get a &'a str
to the owned
field, and assign it to the borrowed
field.
```rust
struct Snek<'a> {
    owned: String,
    // Like if you want this to point to the `owned` field
    borrowed: &'a str,
}

impl<'a> Snek<'a> {
    fn bite(&'a mut self) {
        self.borrowed = &self.owned;
    }
}
```
And as was covered before, that's an anti-pattern because you cannot use the self-referential struct directly ever again.
The only way to use it at all is via a (reborrowed) return value from the method call that required &'a mut self
.
So it's technically possible, but so restrictive it's pretty much always useless.
Trying to create self-referential structs is a common newcomer misstep, and you may see the response to questions about them in the approximated form of "you can't do that in safe Rust".
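A common safe alternative, sketched here under the assumption that the "reference" points into your own `String` field: store an index or range instead of a reference, and reconstruct the borrow on demand.

```rust
struct Snek {
    owned: String,
    // Instead of `borrowed: &'a str` pointing into `owned`,
    // remember *where* in `owned` the interesting part is:
    borrowed: std::ops::Range<usize>,
}

impl Snek {
    fn view(&self) -> &str {
        // Reconstruct the borrow on demand; it only lasts as long
        // as this call's `&self`, so nothing is borrowed forever
        &self.owned[self.borrowed.clone()]
    }
}

fn main() {
    let snek = Snek { owned: "hissss".to_string(), borrowed: 1..4 };
    assert_eq!(snek.view(), "iss");

    // Unlike the `&'a mut Snek<'a>` version, the value stays fully
    // usable: we can move it, print fields, and call methods again.
    let moved = snek;
    assert_eq!(moved.view(), "iss");
}
```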
&'a Struct<'a>
and covariance
Here's a situation that looks similar to borrowing something forever, but is actually somewhat different.
```rust
struct Person<'a> {
    given: &'a str,
    sur: &'a str,
}

impl<'a> Person<'a> {
    //       vvvvvvvv `&'a Person<'a>`
    fn given(&'a self) -> &'a str {
        self.given
    }
}

fn example(person: Person<'_>) {
    // Unlike when we borrowed something forever, this compiles
    let _one = person.given();
    let _two = person.given();
}
```
The difference is that &U
is covariant in U
, so
lifetimes can "shrink" behind the reference
(unlike &mut U
, which is invariant in U
). Person<'a>
is also
covariant in 'a
, because all of our uses of 'a
in the definition
are in covariant position.
What all this means is that &'long Person<'long>
can coerce to
a &'short Person<'short>
. As a result, calling Person::given
doesn't have to borrow the person
forever -- it only has to borrow
person
for as long as the return value is used.
Note that the covariance is required! A shared nested borrow where
the inner lifetime is invariant is still almost as bad as the
"borrowed forever" &mut
case. Most of this page talks about the
covariant case; we'll consider the invariant case at the end.
This is still a yellow flag
Even though it's not as problematic as the &mut
case, there is
still something non-ideal about that signature: it forces the
borrow of person
to be longer than it needs to be. For example,
this fails:
```rust
struct Person<'a> { given: &'a str }

impl<'a> Person<'a> {
    fn given(&'a self) -> &'a str { self.given }
}

struct Stork(String);

impl Stork {
    fn deliver(&self, _: usize) -> Person<'_> {
        Person { given: &self.0 }
    }
}

fn example(stork: Stork) {
    let mut last = "";
    for i in 0..10 {
        let person = stork.deliver(i);
        last = person.given();
        // ...
    }
    println!("Last: {last}");
}
```
person
has to remain borrowed for as long the return value is around,
because we said &self
and the returned &str
have to have the same
lifetime.
If we instead allow the lifetimes to be different:
```rust
struct Person<'a> { given: &'a str }

struct Stork(String);

impl Stork {
    fn deliver(&self, _: usize) -> Person<'_> {
        Person { given: &self.0 }
    }
}

impl<'a> Person<'a> {
    //       vvvvv We removed `'a` from `&self`
    fn given(&self) -> &'a str { self.given }
}

fn example(stork: Stork) {
    let mut last = "";
    for i in 0..10 {
        let person = stork.deliver(i);
        last = person.given();
        // ...
    }
    println!("Last: {last}");
}
```
Then the borrow of person
can end immediately after the call, even
while the return value remains usable. This is possible because we're
just copying the reference out. Or if you prefer to think of it another
way, we're handing out a reborrow of an existing borrow we were holding
on to, and not borrowing something we owned ourselves.
So now the stork
still has to be around for as long as last
is used,
but the person
can go away at the end of the loop.
Allowing the lifetimes to be different is normally what you want to do
when you have a struct that's just managing a borrowed resource in some
way -- when you hand out pieces of the borrowed resource, you want them
to be tied to the lifetime of the original borrow and not the lifetime of
&self
or &mut self
on the method call. It's how borrowing iterators
work,
for example.
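Here's a small sketch of that principle (the `Words` type is invented for illustration): a struct managing a borrowed `&'a str` that hands out pieces tied to `'a`, not to the `&mut self` of each call.

```rust
struct Words<'a> {
    rest: &'a str,
}

impl<'a> Words<'a> {
    // Note the return type uses `'a`, not the (elided) `&mut self` lifetime
    fn next_word(&mut self) -> Option<&'a str> {
        let trimmed = self.rest.trim_start();
        if trimmed.is_empty() {
            return None;
        }
        let end = trimmed.find(' ').unwrap_or(trimmed.len());
        let (word, rest) = trimmed.split_at(end);
        self.rest = rest;
        Some(word)
    }
}

fn main() {
    let text = "hello borrowed world".to_string();
    let mut words = Words { rest: &text };
    let first = words.next_word();
    // The words borrow from `text`, not from `words`, so `words` can
    // go away while the returned pieces remain usable
    drop(words);
    assert_eq!(first, Some("hello"));
}
```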
A variation on the theme
Consider this version of the method:
```rust
struct Person<'a> { given: &'a str }

impl<'a> Person<'a> {
    fn given(&self) -> &str { self.given }
}
```
It has the same downside as given(&'a self) -> &'a str
: The return
value is tied to self
and not 'a
. It's easy to make this mistake
when developing borrowing structs, because the lifetime elision rules
nudge you in this direction. It's also harder to spot because there's
no &'a self
to clue you in.
But sometimes it's perfectly okay
On the flip side, because of the covariance we discussed at the top of this page, there's no practical difference between these two methods:
```rust
struct Person<'a> { given: &'a str }

impl<'a> Person<'a> {
    fn foo(&self) {}
    fn bar(&'a self) {}
}
```
There's no return value to force the lifetime to be longer, so these
methods are going to act the same. There's no reason for the 'a
on &'a self
, but it's not hurting anything either.
Similarly, within a struct there's rarely a benefit to keeping the nested lifetimes separated, so you might as well use this:
```rust
struct Cradle<'a> { person: &'a Person<'a> }
```
Instead of something with two lifetimes.
(That said, an even better approach is to not have complicated nested-borrow-holding data structures at all.)
The invariant case
Finally, let's look at a case where it's generally not okay: A shared nested borrow where the inner borrow is invariant.
Perhaps the most likely reason this comes up is due to shared mutability: the ability
to mutate things that are behind a shared reference (&
). Some examples from the
standard library include Cell<T>
,
RefCell<T>
, and
Mutex<T>
. These
shared mutability types have to be invariant over their generic parameter T
,
just like how &mut T
is invariant over T
.
Let's see an example, similar to one we've seen before:
```rust
use std::cell::Cell;

#[derive(Debug)]
struct ShareableSnek<'a> {
    owned: String,
    borrowed: Cell<&'a str>,
}

impl<'a> ShareableSnek<'a> {
    fn bite(&'a self) {
        self.borrowed.set(&self.owned);
    }
}

fn main() {
    let snek = ShareableSnek {
        owned: "🐍".to_string(),
        borrowed: Cell::new(""),
    };
    snek.bite();

    // Unlike the `&mut` case, we can still use `snek`! It's borrowed forever,
    // but it's "only" *shared*-borrowed forever.
    println!("{snek:?}");
}
```
That doesn't seem so bad though, right? Well, it's not quite as bad as the &mut
case, but it's still usually too restrictive to be useful.
```rust
use std::cell::Cell;

#[derive(Debug)]
struct ShareableSnek<'a> {
    owned: String,
    borrowed: Cell<&'a str>,
}

impl<'a> ShareableSnek<'a> {
    fn bite(&'a self) {
        self.borrowed.set(&self.owned);
    }
}

fn main() {
    let snek = ShareableSnek {
        owned: "🐍".to_string(),
        borrowed: Cell::new(""),
    };
    snek.bite();

    let _mutable_stuff = &mut snek;
    let _move = snek;
    // Having a non-trivial destructor would also cause a failure
}
```
Once it's borrowed forever, the snek
can only be used in a "shared" way.
It can only be mutated using shared mutability, and it can't be moved --
it's pinned in place forever.
Scrutinize compiler advice
The compiler gives better errors than pretty much any other language I've used, but it still does give some poor suggestions in some cases. It's hard to turn a borrow check error into an accurate "what did the programmer mean" suggestion. So suggested bounds are an area where it can be better to take a moment to try and understand what's going on with the lifetimes, instead of just blindly applying compiler advice.
I'll cover a few scenarios here.
Advice to change function signature when aliases are involved
Here's a scenario from earlier in this guide. The compiler advice is:
```text
error[E0621]: explicit lifetime required in the type of `s`
 --> src/lib.rs:5:9
  |
4 | fn new(s: &str) -> Node<'_> {
  |           ---- help: add explicit lifetime `'a` to the type of `s`: `&'a str`
5 |     Self(s)
  |     ^^^^^^^ lifetime `'a` required
```
```diff
- Self(s)
+ Node(s)
```
And you may get this advice when implementing a trait, where you usually can't change the signature.
Advice to add bound which implies lifetime equality
The example for this one is very contrived, but consider the output here:
```rust
fn f<'a, 'b>(s: &'a mut &'b mut str) -> &'b str {
    *s
}
```
```text
= help: consider adding the following bound: `'a: 'b`
```
With the nested lifetime in the argument, there's already an implied 'b: 'a
bound.
If you follow the advice and add a 'a: 'b
bound, then the two bounds together imply that 'a
and 'b
are in fact the same lifetime.
More clear advice would be to use a single lifetime. Even better advice for this particular example would be to return &'a str
instead.
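To sketch that better advice (assuming the goal is simply to hand the inner `str` back out): returning `&'a str` reborrows through the outer reference, and compiles with no added bounds.

```rust
// Return the *outer* lifetime `'a` instead of trying to extract `'b`;
// dereferencing `s` and reborrowing shared for `'a` is always allowed
fn f<'a, 'b>(s: &'a mut &'b mut str) -> &'a str {
    *s
}

fn main() {
    let mut owned = String::from("hi");
    let mut inner: &mut str = owned.as_mut_str();
    assert_eq!(f(&mut inner), "hi");
}
```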
Another possible pitfall of blindly following this advice is ending up with something like this:
```rust
impl<'a> Node<'a> {
    fn g<'s: 'a>(&'s mut self) { /* ... */ }
}
```
That's the &'a mut Node<'a>
anti-pattern in disguise! This will probably be unusable and hints at a deeper problem that needs to be solved.
Advice to add a static bound
The compiler is gradually getting better about this, but when it suggests to use a &'static
or that a lifetime needs to outlive 'static
, it usually actually means either
- You're in a context where non-`'static` references and other non-`'static` types aren't allowed
- You should add a lifetime parameter somewhere
Rather than try to cook up my own example, I'll just link to this issue. Although it's closed, there's still room for improvement in some of the examples within.
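That said, a minimal sketch of the "add a lifetime parameter" fix (the trait and names here are invented for illustration) looks like this:

```rust
trait Greet {
    fn hi(&self) -> String;
}

impl Greet for &str {
    fn hi(&self) -> String {
        format!("hi, {self}")
    }
}

// `Box<dyn Greet>` would mean `Box<dyn Greet + 'static>` and reject
// borrowed data; a lifetime parameter is the usual fix:
fn first_greeting<'a>(items: Vec<Box<dyn Greet + 'a>>) -> Option<String> {
    items.first().map(|g| g.hi())
}

fn main() {
    let local = String::from("world"); // not 'static
    let items: Vec<Box<dyn Greet + '_>> = vec![Box::new(local.as_str())];
    assert_eq!(first_greeting(items), Some("hi, world".to_string()));
}
```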
Illustrative examples
Here we provide some examples and "recipes" to illustrate how lifetimes work (or don't work) in more practical settings.
Mutable slice iterator
The standard library has an iterator over &mut [T]
which is implemented (as of this writing) in terms of pointer arithmetic, presumably for the sake of optimization.
In this example, we'll show how one can implement their own mutable slice iterator with entirely safe code.
Here's the starting place for our implementation:
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
    // ...maybe other fields for your needs...
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        todo!()
    }
}
```
Below are a few starting attempts at it. Spoilers, they don't compile.
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        // Eh, we'll worry about iterative logic later!
        self.slice.get_mut(0)
    }
}
```
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        // Actually, this method looks perfect for our iteration logic
        let (first, rest) = self.slice.split_first_mut()?;
        self.slice = rest;
        Some(first)
    }
}
```
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        // 🤔 Pattern matching??
        match &mut self.slice {
            [] => None,
            [first, rest @ ..] => Some(first),
        }
    }
}
```
Yeah, the compiler really doesn't like any of that. Let's take a minute to write out all the elided lifetimes. Some of them are in aliases, which we're also going to expand:
- `Item` is `&'a mut T`
- `&mut self` is short for `self: &mut Self`, and `Self` is `MyIterMut<'a, T>`
Here's what it looks like with everything being explicit:
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next<'s>(self: &'s mut MyIterMut<'a, T>) -> Option<&'a mut T> {
        todo!()
    }
}
```
And remember that in MyIterMut<'a, T>
, slice
is a &'a mut [T]
.
Ah, yes. We have a nested exclusive borrow here.
- You cannot get a `&'long mut U` through dereferencing a `&'short mut &'long mut U`.
- You can only reborrow a `&'short mut U`.
There is no safe way to go through the &'s mut self
and pull out a &'a mut T
.
Are we stuck then? No, there is actually a way forward! As it turns out, slices are special. In particular, the compiler understands that an empty slice covers no actual data, so there can't be any memory aliasing concerns or data races, et cetera. So the compiler understands it's perfectly sound to pull an empty slice reference out of nowhere, with any lifetime at all. Even if it's an exclusive slice reference!
```rust
fn magic<T>() -> &'static mut [T] {
    &mut []
}
```
For our purposes, we don't even need the magic: the standard library
has a Default
implementation
for &mut [T]
.
Why does this unstick us? With that implementation, we can conjure an empty &mut [T]
out of nowhere and move our slice
field out from behind &mut self
:
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let mut slice = std::mem::take(&mut self.slice);
        // Eh, we'll worry about iterative logic later!
        slice.get_mut(0)
    }
}
```
std::mem::take
and swap
and replace
are
very useful and safe functions; don't be thrown off by them being in std::mem
alongside the
dangerous transmute
and other low-level functions. Note how we passed &mut self.slice
--
that's a &mut &mut [T]
. take
replaces everything inside of the outer &mut
, which can have
an arbitrarily short lifetime -- just long enough to move the memory around.
So we're done aside from iterative logic, right? This should just give us the first element forever?
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let mut slice = std::mem::take(&mut self.slice);
        slice.get_mut(0)
    }
}

fn main() {
    let mut arr = [0, 1, 2, 3];
    let iter = MyIterMut { slice: &mut arr };
    for x in iter.take(10) {
        println!("{x}");
    }
}
```
Uh, it only gave us one item. Oh right -- when we're done with the slice, we need to move it
back into our slice
field. We only want to temporarily replace that field with an empty slice.
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let mut slice = std::mem::take(&mut self.slice);
        // Eh, we'll worry about iterative logic later!
        let first = slice.get_mut(0);
        self.slice = slice;
        first
    }
}
```
Uh oh, now what.
```text
error[E0499]: cannot borrow `*slice` as mutable more than once at a time
   |
9  |         let first = slice.get_mut(0);
   |                     ----- first mutable borrow occurs here
10 |         self.slice = slice;
   |                      ^^^^^ second mutable borrow occurs here
11 |         first
   |         ----- returning this value requires that `*slice` is borrowed for `'a`
```
Oh, right! These are exclusive references. We can't return the same item multiple
times -- that would mean someone could get multiple &mut
to the same element if they
collected the
iterator, for example. Come to think of it, we can't punt on our iteration logic
either -- if we try to hold on to the entire &mut [T]
while handing out &mut T
to the elements, that's also multiple &mut
to the same memory!
This is what the error is telling us: We can't hold onto the entire slice
and
return first
.
(There's a pattern called "lending iterators" where you can hand out borrows of data you own in an iterator-like fashion, but it's not possible with the current `Iterator` trait; it is also a topic for another day.)
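As a rough sketch of the lending shape (an inherent method rather than the `Iterator` trait; the names here are invented for illustration):

```rust
struct Lender<'a, T> {
    slice: &'a mut [T],
    pos: usize,
}

impl<'a, T> Lender<'a, T> {
    // The elided return lifetime is the `&mut self` lifetime, so the
    // caller must let each item go before asking for the next one.
    // That coupling is exactly what `Iterator::next` cannot express.
    fn next(&mut self) -> Option<&mut T> {
        let pos = self.pos;
        self.pos += 1;
        self.slice.get_mut(pos)
    }
}

fn main() {
    let mut arr = [1, 2, 3];
    let mut lender = Lender { slice: &mut arr, pos: 0 };
    // Each `x` must be dropped before the next `lender.next()` call
    while let Some(x) = lender.next() {
        *x *= 10;
    }
    assert_eq!(arr, [10, 20, 30]);
}
```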
Alright, let's try split_first_mut
again instead, that really did seem like a perfect fit for our iteration logic.
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        let mut slice = std::mem::take(&mut self.slice);
        let (first, rest) = slice.split_first_mut()?;
        self.slice = rest;
        Some(first)
    }
}

// ...

fn main() {
    let mut arr = [0, 1, 2, 3];
    let iter = MyIterMut { slice: &mut arr };
    for x in iter {
        println!("{x}");
    }
}
```
Finally, a working version! split_first_mut
is a form of borrow splitting,
which we briefly mentioned before.
And for the sake of completion, here's the pattern based approach to borrow splitting:
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        // ....these are all that changed....
        // vvvvvvvvvvvvvvvvvv            v
        match std::mem::take(&mut self.slice) {
            [] => None,
            [first, rest @ ..] => Some(first),
        }
    }
}

// ...

fn main() {
    let mut arr = [0, 1, 2, 3];
    let iter = MyIterMut { slice: &mut arr };
    for x in iter {
        println!("{x}");
    }
}
```
Oh whoops, just one element again. Right. We need to put the rest
back in self.slice
:
```rust
struct MyIterMut<'a, T> {
    slice: &'a mut [T],
}

impl<'a, T> Iterator for MyIterMut<'a, T> {
    type Item = &'a mut T;

    fn next(&mut self) -> Option<Self::Item> {
        match std::mem::take(&mut self.slice) {
            [] => None,
            [first, rest @ ..] => {
                self.slice = rest;
                Some(first)
            }
        }
    }
}

// ...

fn main() {
    let mut arr = [0, 1, 2, 3];
    let iter = MyIterMut { slice: &mut arr };
    for x in iter {
        println!("{x}");
    }
}
```
👍
Circle back
Ownership, borrowing, and lifetimes is a huge topic. There's way too much in this "intro" guide alone for you to absorb everything in it at once. So occasionally circle back and revisit the common misconceptions, or the documentation on variance, or take another crack at some complicated problem you saw. Your mental model will expand over time; it's enough in the beginning to know some things exist and revisit them when you run into a wall.
Moreover, Rust is practicality oriented, and the abilities of the compiler have developed organically to allow common patterns soundly and ergonomically. Which is to say that the borrow checker has a fractal surface; there's an exception to any mental model of the borrow checker. So there's always something new to learn, forget, and relearn, if you're into that.
Miscellanea
This section collects relatively small and self-contained tutorials and other tidbits.
Slice layout
It's not uncommon for people on the forum
to ask why it's conventional to have &[T]
as an argument instead of
&Vec<T>
, or to ask about the layout of slices more generally. Or to
ask analogous questions about &str
and String
, et cetera.
This page exists to be a useful citation for such questions.
If you want, you can skip ahead to the graphical layout.
What is a slice anyway?
The terminology around slices tends to be pretty loose. I'll try to keep
it more formal on this page, but when you read something about a "slice"
elsewhere, keep in mind that it may be referring to any of [T]
, &[T]
,
&mut [T]
, or even other types of references to [T]
(Box<[T]>
, Arc<[T]>
, ...).
This is the case not just for casual material, but for official documentation and other technical material. You just have to figure out which one or ones they are specifically talking about from context.
With that out of the way, let me introduce some terminology for this page:
- A slice, `[T]`, is a series of `T` in contiguous memory (laid out one after another, with proper alignment). The length is only known at run time; we say it is a dynamically sized type (DST), or an unsized type, or a type that does not implement `Sized`.
- A shared slice, `&[T]`, is a shared reference to a slice. It's a wide reference consisting of a pointer to the memory of the slice, and the number of elements in the slice.
- An exclusive slice, `&mut [T]`, is like a shared slice, but the borrow is exclusive (so you can e.g. overwrite elements through it).
- There are other wide pointer variations like boxed slices (`Box<[T]>`) and so on; we'll mention a few more momentarily.
Note that while slices are unsized, the wide pointers to slices (like &[T]
)
are sized.
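We can check this with `std::mem::size_of` (the exact layout of wide pointers isn't formally guaranteed, but these sizes hold on current platforms):

```rust
use std::mem::size_of;

fn main() {
    // A thin reference is one pointer wide...
    assert_eq!(size_of::<&u8>(), size_of::<usize>());

    // ...while slice references carry a pointer plus a length:
    assert_eq!(size_of::<&[u8]>(), 2 * size_of::<usize>());
    assert_eq!(size_of::<&str>(), 2 * size_of::<usize>());

    // The unsized types themselves have no compile-time size;
    // `size_of::<[u8]>()` doesn't even compile.
}
```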
Where is a slice?
Slices can be on the heap, but also on the stack, or in static memory, or anywhere else. The type doesn't "care" where it is. Therefore, you can't be sure where a pointer to a slice points unless the pointer itself has further guarantees.
For example, if you have a Box<[T]>
, then any T
within are on the
heap, because that's a guarantee of
Box<_>
.
(N.b. if T
is zero sized, they are not actually stored anywhere.)
So in that particular case, we could say the slice [T]
is on the heap.
What is a Vec<_>
anyway?
A Vec<T>
is a growable buffer
that owns and stores T
s in contiguous memory, on the heap. You can conceptually
think of this as something that owns a slice [T]
(or more accurately,
[MaybeUninit<T>]
). You can index into a Vec<T>
with a range and get back a
shared or exclusive slice.
A Vec<_>
consists of a pointer, capacity, and length.
Other types
A String
is, under the hood,
like a Vec<u8>
which has additional guarantees -- namely, that the bytes are valid UTF-8.
A &str
is like a &[u8]
that has the
same guarantee. You can index into a String
with a range and get back a &str
(or &mut str
). Like [u8]
, a str
is unsized, which is why you're almost always working
with a &str
or other pointer instead.
So the relationship between str
and String
is the same as between [T]
and Vec<T>
.
There are other pairs of types with the same relationship, such as `Path` and `PathBuf`, `OsStr` and `OsString`, and `CStr` and `CString`.
These std
types generally have a ToOwned
relationship and a Borrow
relationship.
Even more data structures that can be considered a form of owned slices include:
- `[T; N]` is an array with a compile-time known length (i.e. it's a fixed-size array). It is like a slice (`[T]`), but it is `Sized`, as the length is known at compile time. The length is also part of the type. It's not growable.
- `Box<[T]>`, a "boxed slice"; this is similar to a `Vec<T>` in that it owns the `T`s and stores them contiguously on the heap. Unlike a `Vec<T>`, the buffer is not growable (or shrinkable) through a `&mut Box<[T]>`; you would have to allocate new storage and move the elements over. The length of a boxed slice is stored at runtime, and isn't known at compile time. Therefore, like a shared slice, it consists of a pointer and a length.
- `Arc<[T]>` and `Rc<[T]>` are shared ownership variations on `Box<[T]>`.
- There are similar variations for string-like types (`Box<str>`, `Arc<Path>`, `Rc<OsStr>`, ...)
- And other combinations too (`Box<[T; N]>`, etc.)
You can create shared slices to these other types of owned slices as well.
Technically, just a single T
is like a [T; 1]
(it has the same layout in memory).
So if you squint just right, every owned T
is also a form of owned slice, but with
a compile-time known length of 1. And indeed, you can create
&[T]
and &mut [T]
(and array versions too)
from &T
and &mut T
.
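For instance, using the conversion functions in the standard library:

```rust
fn main() {
    let x = 5;
    // View a single value as a shared slice of length 1
    let slice: &[i32] = std::slice::from_ref(&x);
    assert_eq!(slice, &[5]);

    // The exclusive version
    let mut y = 0;
    let slice_mut: &mut [i32] = std::slice::from_mut(&mut y);
    slice_mut[0] = 7;
    assert_eq!(y, 7);

    // And the array version
    let arr: &[i32; 1] = std::array::from_ref(&x);
    assert_eq!(arr, &[5]);
}
```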
Graphical layout
Here's a graphical representation of the layout of slices, shared slices,
Vec<T>
, and &Vec<T>
.
+---+---+---+---+---+---+---+---+
| Pointer | Length | &[T] (or &str, &Path, Box<[T]>, ...)
+---+---+---+---+---+---+---+---+
|
V
+---+---+---+---+---+---+---+---+
| D | A | T | A | . | . | . | ...... [T] (or str, Path, ...)
+---+---+---+---+---+---+---+---+
^
|
+---+---+---+---+---+---+---+---+---+---+---+---+
| Pointer | Length | Capacity | Vec<T> (or String, PathBuf, ...)
+---+---+---+---+---+---+---+---+---+---+---+---+
^
|
+---+---+---+---+
| Pointer | &Vec<T> (or &String, &PathBuf, ...)
+---+---+---+---+
One advantage of taking &[T]
instead of &Vec<T>
as an argument should be
immediately apparent from the diagram: a &[T]
has less indirection.
However, there are other reasons:
- Everything useful for `&Vec<T>` is actually a method on `&[T]`
  - You can't check the capacity with a `&[T]`, but you can't change the capacity with a `&Vec<T>` anyway
- If you take `&[T]` as an argument, you can take shared slices that point to data which isn't owned by a `Vec<T>` (such as static data, part of an array, into a `Box<[T]>`, et cetera)
  - So it is strictly and significantly more general to take `&[T]`
Similar advantages apply to taking a &str
instead of a &String
, et cetera.
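For example, a function taking `&str` accepts every common kind of string data, while `&String` would only accept a borrowed `String`:

```rust
fn describe(s: &str) -> usize {
    s.len()
}

fn main() {
    let owned = String::from("hello");
    assert_eq!(describe(&owned), 5);       // &String coerces to &str
    assert_eq!(describe("literal"), 7);    // string literal (static data)
    assert_eq!(describe(&owned[1..3]), 2); // a sub-slice
}
```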
In contrast, there are many things you can do with a &mut Vec<T>
that you can't
do with a &mut [T]
, so which you choose depends much more on what you need to
do with the borrowed data.
Graphical layout for arrays
The layout of an array is the same as a slice, except the length is known.
+---+---+---+---+---+---+---+---+
| Pointer | Length | &[T] (or &str, &Path, Box<[T]>, ...)
+---+---+---+---+---+---+---+---+
|
V
+---+---+---+---+---+---+---+
| D | A | T | A | . | . | . | [T; N] (or str, Path, ...)
+---+---+---+---+---+---+---+
^
|
+---+---+---+---+
| Pointer | &[T; N] (or `Box<[T; N]>`, ...)
+---+---+---+---+
^
|
+---+---+---+---+
| Pointer | &Box<[T; N]> (or &&[T; N], ...)
+---+---+---+---+
Because [T; N]
is sized, and because the length is part of the type,
pointers to it (like &[T; N]
) are normal "thin" pointers, not "wide"
pointers. But you can also create a &[T]
that points to the array
(or to part of the array), as in the diagram.
Should you take a &[T]
or a &[T; N]
as a function argument? If
you don't need a specific length, and aren't trying to generate code
that's optimized based on the specific length of the array, you probably
want &[T]
.
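A small sketch of the trade-off: `&[T; N]` pins the length at compile time, while `&[T]` accepts arrays of any length as well as `Vec` contents:

```rust
fn sum_any(xs: &[i32]) -> i32 {
    xs.iter().sum()
}

fn sum_pair(xs: &[i32; 2]) -> i32 {
    // The length is part of the type, so this indexing can't be out of bounds
    xs[0] + xs[1]
}

fn main() {
    let arr = [1, 2];
    assert_eq!(sum_any(&arr), 3); // `&[i32; 2]` coerces to `&[i32]`
    assert_eq!(sum_pair(&arr), 3);

    let v = vec![1, 2, 3];
    assert_eq!(sum_any(&v), 6);   // works for `Vec` data too
    // sum_pair(&v) would not compile: the length must match exactly
}
```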
Default parameter mechanics
Default parameters in Rust are not as convenient as one might wish. The RFC for default type parameters was never fully completed; in particular, the "inference falls back to defaults" parts have been delayed indefinitely. As a result, there are times where default parameters don't kick in, and you have to either be explicit or use other workarounds. It can also be unclear why the workarounds act differently.
Default parameters are also not in the reference yet.
This page exists to explain the mechanics behind default parameters as they exist today, and to clear up exactly what the workarounds mean. For an exploration on how the interaction between inference and default parameters could be defined in the future, I recommend this wonderful blog post by Gankra.
Motivation
The most likely reason you'll run into default parameters not "working" is because
some expression desugars to replacing all type (and const
) parameters with inference
variables, in combination with the fact that inference variables do not fall back to the
defaults.
What are inference variables? For types, an inference variable is the same as the
"wildcard type" _
, which tells the compiler to infer the type for you.
_
cannot be used for const
parameters as of yet,
but they can still be inferred implicitly.
(For most of this guide, we'll be focused on types;
there's a subsection about const
parameters specifically later.)
Let's see some examples of compilation failures involving defaulted parameters:
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// `HashSet` will be our running example for a type with both required
// (non-defaulted, non-lifetime) and defaulted parameters
//
// struct HashSet<Key, S = RandomState> { .. }

// The `insert` is enough for the compiler to infer the `Key` parameter, but
// not the `S` parameter
let mut hs = HashSet::default();
hs.insert(String::new());

// This means the same thing: *all* type (and const) parameters became
// inference variables
let mut hs = HashSet::<_, _>::default();
hs.insert(String::new());
}
This can be confusing because similar code just works:
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// This compiles, but the compiler can figure `Key` out on its own, so why?
let mut hs = HashSet::<String>::default();
hs.insert(String::new());

// And in fact... this compiles too!
let mut hs = HashSet::<_>::default();
hs.insert(String::new());

// `new` doesn't have this problem, which may also be confusing
let mut hs = HashSet::new();
hs.insert(String::new());
}
The errors can also arise when the type has defaults for all of the type (and const) parameters:
#![allow(unused)]
fn main() {
// This will be our running example for a type where all non-lifetime
// parameters have defaults
pub enum Foo<T = String> {
    Bar(T),
    Baz,
}

// This fails because the elided parameter desugars to an inference variable
let foo = Foo::Baz;

// So this means the exact same thing
let foo = Foo::<_>::Baz;
}
And some of the workarounds may be even more confusing:
#![allow(unused)]
fn main() {
pub enum Foo<T = String> { Bar(T), Baz }

// This works!
let foo = <Foo>::Baz;
}
We want to explain exactly which expressions end up being problematic, and why the workarounds solve the problem.
The explanations in brief
First let's tackle why just wrapping the type in <>
worked for that last example.
#![allow(unused)] fn main() { pub enum Foo<T = String> { Bar(T), Baz } let foo = <Foo>::Baz; }
The leading <Foo>::
notation is called a
"qualified path type".
And the short answer to why it works is that, with respect to elided default
parameters, types in <>
s act the same as type ascription:
#![allow(unused)]
fn main() {
pub enum Foo<T = String> { Bar(T), Baz }

// Also works
let foo: Foo = Foo::Baz;
}
Type ascription uses default parameters in a way that's probably closer to your intuition.
(We explore the details below.)
Note that types act like type ascription in <>
elsewhere too, such as
within a turbofish, not just as a qualified path type.
As for the difference here:
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// This fails if we change `HashSet::new()` to `HashSet::default()`
let mut hs = HashSet::new();
hs.insert(String::new());
}
The example only works because HashSet::new
(and a number of other methods) is only defined for HashSet<_, RandomState>
.
In contrast, Default
is implemented for all possible HashSet<_, _>
. So
in a sense, this is a workaround on the side of the HashSet
implementation!
If inference and default parameters worked together,
new
would presumably be defined for all possible hashers, too.
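To make that distinction concrete, here is a sketch that mimics the standard library's pattern with a hypothetical MySet type: new is only defined when the hasher is RandomState, while the Default implementation is generic over the hasher.

```rust
use std::hash::RandomState;

// A simplified stand-in for `HashSet`, to mimic the std pattern.
struct MySet<K, S = RandomState> {
    keys: Vec<K>,
    _hasher: S,
}

// Like `HashSet::new`: only defined when the hasher is `RandomState`,
// so `S` never needs to be inferred here.
impl<K> MySet<K, RandomState> {
    fn new() -> Self {
        MySet { keys: Vec::new(), _hasher: RandomState::new() }
    }
}

// Like `HashSet`'s `Default`: generic over every hasher `S`.
impl<K, S: Default> Default for MySet<K, S> {
    fn default() -> Self {
        MySet { keys: Vec::new(), _hasher: S::default() }
    }
}

impl<K, S> MySet<K, S> {
    fn len(&self) -> usize {
        self.keys.len()
    }
}

fn main() {
    // `S` is pinned to `RandomState` by the `new` impl, so nothing is ambiguous.
    let set: MySet<String> = MySet::new();
    assert_eq!(set.len(), 0);

    // `MySet::default()` alone would leave `S` ambiguous; ascription resolves it.
    let set: MySet<String> = MySet::default();
    assert_eq!(set.len(), 0);
}
```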
Finally, let's look at this workaround:
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// Remember, `HashSet::default()` fails
let mut hs = HashSet::<_>::default();
hs.insert(String::new());
}
The key difference here is that if no required parameters are specified,
then all the type (and const
) parameters -- including defaulted parameters
-- are filled in with inference variables. But if one or more non-lifetime
parameter is specified, it desugars to a qualified type path -- where default
parameters act the same as they do in type ascription.
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// These are all the same and fail
// let mut hs = HashSet::default();
// let mut hs = HashSet::<>::default();
// let mut hs = HashSet::<_, _>::default();
let mut hs = <HashSet::<_, _>>::default();
hs.insert(String::new());
}
#![allow(unused)]
fn main() {
use std::collections::HashSet;

// These are the same and succeed.
// let mut hs = HashSet::<_>::default();
let mut hs = <HashSet<_>>::default();
hs.insert(String::new());
}
As is clear from the example, using _
explicitly counts as specifying a type
parameter. Also note that the desugaring to "all parameters are inference
variables" only happens when the type is not inside <>
s.
Type position mechanics in more detail
By "type position", we mean contexts where the language expects a type specifically. This includes variable type ascription, implementation headers, type parameter fields themselves, and qualified path types.
In type position, you can only elide default parameters. Elided default parameters are
replaced by their default types (or const
values) specifically (i.e. not inference variables).
Let's see some examples:
#![allow(unused)]
fn main() {
use std::collections::HashSet;
use std::hash::RandomState;

enum Foo<T = String> {
    Bar(T),
    Baz,
}

// These ascriptions mean the same thing
//     vvvvvvvvvvv
let e: Foo         = Foo::Baz;
let e: Foo<>       = Foo::Baz;
let e: Foo<String> = Foo::Baz;

// These ascriptions mean the same thing
//      vvvvvvvvvvvvvvvvvvvvvvvvvvvv
let hs: HashSet<String>              = Default::default();
let hs: HashSet<String, RandomState> = Default::default();
}
The following errors demonstrate that elided parameters aren't inference variables, and that inference variables don't fall back to the defaults.
#![allow(unused)]
fn main() {
enum Foo<T = String> {
    Bar(T),
    Baz,
}

// Fails due to ambiguity
let e: Foo<_> = Foo::Baz;
}
#![allow(unused)]
fn main() {
use std::collections::HashSet;
use std::hash::RandomState;

// Fails due to ambiguity
let hs: HashSet<String, _> = Default::default();
}
#![allow(unused)]
fn main() {
enum Foo<T = String> {
    Bar(T),
    Baz,
}

// Fails because the elided type is exactly the default type (`String`)
let e: Foo = Foo::Bar(0);
}
The final example is the opposite situation from most of the examples we've seen:
it's a case where you want inference to override defaults. If you made the ascription
Foo<_>
it will compile (but a more trivial fix for this particular example is to
just remove the redundant ascription).
More about qualified path expressions
Types inside of <>
s are in type position, and that includes qualified path expressions.
A qualified path expression is one where some path expression starts with a segment contained in <>. They were defined in RFC 0132 and come in two different forms:
#![allow(unused)]
fn main() {
//      vvvvvvvv `<T>` where `T` is a type
let s = <String>::default();

//      vvvvvvvvvvvvvvvvvvv `<T as Tr>` where `T` is a type and `Tr` is a trait
let s = <String as Default>::default();
}
The first form can resolve to inherent functions or trait methods, whereas
the second form can only resolve to the named trait's methods. Rust doesn't
have "trait inference variables", so the trait must be named; you can't use
_
in place of the trait, for example. (You can still use it in place of
the trait's type parameters.)
Traits
Default parameters for traits work the same as default parameters for types, both inside and outside of "type position". When thinking of traits in paths as sugar for qualified paths, the desugaring is like so:
#![allow(unused)]
fn main() {
trait Trait<One, Two = String>: Sized {
    fn foo(self) -> (Self, One, Two)
    where
        One: Default,
        Two: Default,
    {
        (self, One::default(), Two::default())
    }
}

impl<T, U> Trait<T, U> for i32 {}
impl<T, U> Trait<T, U> for f64 {}

// Failing versions
//let _: (i32, (), _) = Trait::foo(0);
//let _: (i32, (), _) = Trait::<_, _>::foo(0);
let _: (i32, (), _) = <_ as Trait<_, _>>::foo(0);
//                    ^^^^^^^^^^^^^^^^^^
}
#![allow(unused)]
fn main() {
trait Trait<One, Two = String>: Sized {
    fn foo(self) -> (Self, One, Two)
    where
        One: Default,
        Two: Default,
    {
        (self, One::default(), Two::default())
    }
}

impl<T, U> Trait<T, U> for i32 {}
impl<T, U> Trait<T, U> for f64 {}

// Working versions
let _: (i32, (), _) = Trait::<_>::foo(0);
let _: (i32, (), _) = <_ as Trait<_>>::foo(0);
//                    ^^^^^^^^^^^^^^^
}
The only new thing of note is that the implementing type is an inference variable in this case.
Mostly historical side note
Before edition 2021, it's possible to leave the dyn
off of dyn Trait
types
(although it does fire a lint). This means that the same name can refer to either
a trait, or a type (the trait object type). Which one is used depends on the
context.
For example:
let _: i32 = Trait::name(0.0);
// If `Trait` has a method called `name`, this is what it is
let _: i32 = <_ as Trait>::name(0.0);
// But if it does not, and `dyn Trait` has a method called `name`, this is
let _: i32 = <dyn Trait>::name(0.0);
// And the following line is always referring to `dyn Trait`
let _: i32 = <Trait>::name(0.0);
More about types in expressions
In this section, "types in expressions" refers to types which are in expressions
but not within <>
(e.g. not a qualified path type or a type parameter). These
are the positions where it is required to use turbofish (e.g. Vec::<String>
)
instead of just appending the parameter list (e.g. Vec<String>
).
In these positions, it is always allowed to elide all the type and const
parameters, even if there are required (i.e. non-defaulted, non-lifetime)
parameters. When you do so -- even if all the type and const
parameters
have defaults -- the behavior is the same as using type inference variables
(_
) for all the parameters.
If you do not elide all non-lifetime parameters -- that is, if you specify one or
more type parameter or const
parameter -- then you must specify all required
parameters. Or in other words: if you specify at least one type or const
parameter, you can only elide defaulted parameters (and lifetimes).
The behavior of elided defaulted parameters is as follows:
- If you specify zero non-lifetime parameters, inference variables are used for all type and const parameters
- If you specify one or more non-lifetime parameters, defaults are used for elided type and const parameters
Above, we phrased the different default parameter behavior for types in expressions in
terms of desugaring to qualified type paths.
However, the behavior applies in other contexts too, such as struct
expression syntax:
#![allow(unused)]
fn main() {
struct Two<T, U = String> {
    t: T,
    u: U,
}

// This is ambiguous
let _ = Two { t: (), u: Default::default() };
}
#![allow(unused)]
fn main() {
struct Two<T, U = String> {
    t: T,
    u: U,
}

// But this works
let _ = Two::<_> { t: (), u: Default::default() };
}
Qualified path types are not allowed in this position, so not all of the workarounds we discussed for paths are applicable.
#![allow(unused)]
fn main() {
struct One<T = String> {
    t: T,
}

// Ambiguous
let _ = One { t: Default::default() };
}
#![allow(unused)]
fn main() {
struct One<T = String> {
    t: T,
}

// Not accepted grammatically
let _ = <One> { t: Default::default() };
}
Finally, there is no way to syntactically represent inferred but
non-defaulted const
parameters in qualified path types (or any
other type-annotation-like position).
#![allow(unused)]
fn main() {
struct Pixel<const N: usize>([u8; N]);

impl<const N: usize> Default for Pixel<N> {
    fn default() -> Self {
        Self([0; N])
    }
}

// Works
let pixel = Pixel::default();

// These fail because `_` cannot be used for const parameters yet
// let pixel = <Pixel<_>>::default();
// let pixel: Pixel<_> = Default::default();

drive_inference(pixel);
fn drive_inference(_: Pixel<3>) {}
}
Non-type generic parameters
This guide has mostly concentrated on type parameters. We've tried to be careful with our wording throughout the guide, but let's take a moment to look specifically at how non-type parameters work with regards to defaults.
Lifetime parameters
Lifetime parameters cannot be given defaults, and do not change any of the default parameter behavior we've discussed.
I've tried to take care to use phrases like "specify one or more non-lifetime parameters" instead of phrases like "empty parameter list". But just to make things more explicit: the inclusion or elision of lifetime parameters doesn't change how parameter defaults work.
For example, the below are still cases of specifying no required parameters, and thus use inference variables (which then fail as ambiguous).
#![allow(unused)] fn main() { pub enum Foo2<'a, T = String> { Bar(&'a T), Baz, } let foo = Foo2::<'_>::Bar; let foo = Foo2::<'static>::Bar; }
const parameters
const parameter defaults were stabilized in 1.59, along with the ability to intermix const and type parameters in the parameter list (as otherwise the presence of a defaulted type parameter would force all const parameters to also have defaults, for example).
Generally speaking, defaulted const
parameters act just like defaulted
type parameters. However, one important difference is that
_
cannot be used for const
parameters as of yet.
This does mean that some of the workarounds we've seen cannot be applied:
#![allow(unused)]
fn main() {
// We'll use this analogously to our `HashSet` examples
struct MyArray<const N: usize, T = String>([T; N]);

impl<T: Default, const N: usize> Default for MyArray<N, T> {
    fn default() -> Self {
        Self(std::array::from_fn(|_| T::default()))
    }
}

// Ambiguous for the usual reasons
// let arr = MyArray::default();

// Here's what we did when `HashSet` had this problem.
// But it fails because we can't use `_` for the `const` parameter!
let arr = MyArray::<_>::default();

// Explicitness it is then
// let arr = MyArray::<16>::default();

// (These parts are just here to make everything above work like
// our `HashSet` examples worked.)
drive_inference_of_length(&arr);
fn drive_inference_of_length<T>(_: &MyArray<16, T>) {}
}
A warning about implementations and function arguments
Default type parameters work the same in implementation headers and function argument lists as they do in other "type positions". This may be surprising compared to elided lifetime parameters.
In implementations and function argument lists, eliding a lifetime parameter introduces a new, independent generic lifetime parameter. But eliding a type parameter never means "introduce a new generic". Elided type parameters always resolve to a single type (or error), whether that type comes from inference or a default type.
#![allow(unused)]
fn main() {
pub enum Foo<T = String> { Bar(T), Baz }

// This is an implementation for `Foo<String>` only
impl Foo {
    fn papers_please(&self) {}
}

// This is an implementation for all (`Sized`) `T`
impl<T> Foo<T> {
    fn welcome(&self) {}
}

let foo = Foo::Bar(0);

// Works
foo.welcome();

// Fails
foo.papers_please();
}
(Eliding lifetimes in other positions sometimes means 'static
and
sometimes means "infer this for me", but that's a topic for another
day. Lifetime parameters cannot have defaults.)
Default type parameters elsewhere
Declaring default parameters that are not on types, traits, or trait aliases either results in an error, or fires a deny-by-default lint stating that support will be removed.
Despite the lint, default parameters on functions work the same as default parameters on type declarations. However, every function has a unique type (a "function item type") which cannot be named. Because the function item type cannot be named, most of the workarounds we've talked about cannot be applied.
That being said, the case where you use a turbofish with one or more non-elided type parameters still works:
#![allow(unused)] fn main() { #[allow(invalid_type_param_default)] fn example<X, Y: Default = String>() -> Y { Y::default() } let s = example::<()>(); println!("{}", std::any::type_name_of_val(&s)); }
Default parameters on impl
headers do not serve any purpose as far as I'm
aware. Implementations don't have names at all (which is why the parameters
are on the impl
keyword).
#![allow(unused)] fn main() { struct MyStruct; trait WhyThough<T, U> { } #[allow(invalid_type_param_default)] impl<T = String> WhyThough<i32, T> for MyStruct {} }
Default parameters on GATs are currently just denied, even if the lint is allowed.
#![allow(unused)] #![allow(invalid_type_param_default)] fn main() { trait MyTrait { type Gat<T = String>; } }
A tour of dyn Trait
Rust's type-erasing dyn Trait
offers a way to treat different implementors
of a trait in a homogeneous fashion while remaining strictly and statically
(i.e. compile-time) typed. For example: if you want a Vec
of values which
implement your trait, but they might not all be the same base type, you need
type erasure so that you can create a Vec<Box<dyn Trait>>
or similar.
dyn Trait
is also useful in some situations where generics are undesirable,
or to type erase unnameable types such as closures into something you need to
name (such as a field type, an associated type, or a trait method return type).
There is a lot to know about when and how dyn Trait
works or does not, and
how this ties together with generics, lifetimes, and Rust's type system more
generally. It is therefore not uncommon to get somewhat confused about
dyn Trait
when learning Rust.
In this section we take a look at what dyn Trait
is and is not, the limitations
around using it, how it relates to generics and opaque types, and more.
dyn Trait
Overview
What is dyn Trait
?
dyn Trait
is a compiler-provided type which implements Trait
. Any Sized
implementor
of Trait
can be coerced to be a dyn Trait
, erasing the original base type in the process.
Different implementations of Trait
may have different sizes, and as a result, dyn Trait
has no statically known size. That means it does not implement Sized
, and we call such
types "unsized", or "dynamically sized types (DSTs)".
Every dyn Trait
value is the result of type erasing some other existing value.
You cannot create a dyn Trait
from a trait definition alone; there must be an
implementing base type that you can coerce.
Rust currently does not support passing unsized parameters, returning unsized values, or
having unsized locals. Therefore, when interacting with dyn Trait
, you will generally
be working with some sort of indirection: a Box<dyn Trait>
, &dyn Trait
, Arc<dyn Trait>
,
etc.
And in fact, the indirection is necessary for another reason. These indirections are or
contain wide pointers to the erased type, which consist of a pointer to the value, and
a second pointer to a static vtable. The vtable in turn contains data such as the size
of the value, a pointer to the value's destructor, pointers to methods of the Trait
,
and so on. The vtable enables dynamic dispatch, by which
different dyn Trait
values can dispatch method calls to the different erased base type
implementations of Trait
.
dyn Trait
is also called a "trait object".
You can also have objects such as dyn Trait + Send + Sync
. Send
and Sync
are
auto-traits,
and a trait object can include any number of these auto traits as additional bounds.
Every distinct set of Trait + AutoTraits
is a distinct type.
However, you can only have one non-auto trait in a trait object, so this will not work:
#![allow(unused)]
fn main() {
trait Trait1 {}
trait Trait2 {}
struct S(Box<dyn Trait1 + Trait2>);
}
That being noted, one can usually use a subtrait/supertrait pattern to work around this restriction.
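As a sketch of that pattern (with hypothetical trait names), a combining subtrait plus a blanket implementation lets a single trait object expose the methods of both traits:

```rust
trait Trait1 {
    fn one(&self) -> i32;
}
trait Trait2 {
    fn two(&self) -> i32;
}

// `Box<dyn Trait1 + Trait2>` is not allowed, but a combining subtrait is.
trait Both: Trait1 + Trait2 {}

// Blanket impl: anything implementing both traits also implements `Both`.
impl<T: Trait1 + Trait2 + ?Sized> Both for T {}

struct S;
impl Trait1 for S {
    fn one(&self) -> i32 { 1 }
}
impl Trait2 for S {
    fn two(&self) -> i32 { 2 }
}

fn main() {
    // One trait object with access to the methods of both traits.
    let b: Box<dyn Both> = Box::new(S);
    assert_eq!(b.one() + b.two(), 3);
}
```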
The trait object lifetime
Confession: we were being imprecise when we said dyn Trait
is a type. dyn Trait
is a
type constructor: it is parameterized with a lifetime,
similar to how references are. So dyn Trait on its own isn't a type; dyn Trait + 'a for some concrete lifetime 'a is a type.
The lifetime can usually be elided, which we will explore later. But it is always part of the type, just like a lifetime is part of every reference type, even when elided.
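For illustration, here is a hypothetical sketch of where the trait object lifetime shows up, elided or not:

```rust
trait Trait {
    fn value(&self) -> i32;
}

impl Trait for i32 {
    fn value(&self) -> i32 {
        *self
    }
}

impl Trait for &i32 {
    fn value(&self) -> i32 {
        **self
    }
}

// The elided lifetime here means the returned trait object's lifetime
// is tied to the borrow of `x`: `Box<dyn Trait + '_>`.
fn erase(x: &i32) -> Box<dyn Trait + '_> {
    Box::new(x)
}

fn main() {
    // On its own, `Box<dyn Trait>` elides to `Box<dyn Trait + 'static>`.
    let owned: Box<dyn Trait + 'static> = Box::new(7);
    assert_eq!(owned.value(), 7);

    let x = 5;
    // This trait object borrows `x`, so its lifetime cannot be `'static`.
    let borrowed = erase(&x);
    assert_eq!(borrowed.value(), 5);
}
```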
Associated Types
If a trait has non-generic associated types, those associated types usually become
named parameters of dyn Trait
:
#![allow(unused)] fn main() { let _: Box<dyn Iterator<Item = i32>> = Box::new([1, 2, 3].into_iter()); }
We explore associated types in dyn Trait
more
in a later section.
What dyn Trait
is not
dyn Trait
is not Sized
We mentioned the fact that dyn Trait
is not Sized
already.
However, let us take a moment to note that generic parameters
have an implicit Sized
bound.
Therefore you may need to remove the implicit bound by using
: ?Sized
in order to use dyn Trait
in generic contexts.
#![allow(unused)]
fn main() {
trait Trait {}

// This function only takes `T: Sized`. It cannot accept a
// `&dyn Trait`, for example, as `dyn Trait` is not `Sized`.
fn foo<T: Trait>(_: &T) {}

// This function takes any `T: Trait`, even if `T` is not
// `Sized`.
fn bar<T: Trait + ?Sized>(t: &T) {
    // Demonstration that `foo` cannot accept non-`Sized`
    // types:
    foo(t);
}
}
dyn Trait
is neither a generic nor dynamically typed
Given a concrete lifetime 'a
, dyn Trait + 'a
is a statically known type.
The erased base type is not statically known, but don't let this confuse
you: the dyn Trait
itself is its own distinct type and that type is known
at compile time.
For example, consider these two function signatures:
#![allow(unused)] fn main() { trait Trait {} fn generic<T: Trait>(_rt: &T) {} fn not_generic(_dt: &dyn Trait) {} }
In the generic case, a distinct version of the function will exist for every
type T
which is passed to the function. This compile-time generation of
new functions for every type is known as monomorphization. (Side note,
lifetimes are erased during compilation, and not monomorphized.)
You can even create function pointers to the different versions like so:
trait Trait {} impl Trait for String {} fn generic<T: Trait>(_rt: &T) {} fn main() { let fp = generic::<String>; }
That is, the function item type is parameterized by some T: Trait
.
In contrast, there will only ever be one non_generic function in the resulting library. The base implementors of Trait must be type-erased into dyn Trait + '_ before being passed to the function. The function type is not parameterized by a generic type.
Similarly, here:
#![allow(unused)] fn main() { trait Trait {} fn generic<T: Trait>(bx: Box<T>) {} }
bx: Box<T>
is not a Box<dyn Trait>
. It is a thin owning pointer to a
heap allocated T
specifically. Because T
has an implicit Sized
bound
here, we could coerce bx
to a Box<dyn Trait + '_>
. But that would be a
transformation to a different type of Box
: a wide owning pointer which has
erased T
and included the corresponding vtable pointer.
We'll explore more details on the interaction of generics and dyn Trait
in a later section.
You may wonder why you can use the methods of Trait
on a &dyn Trait
or
Box<dyn Trait>
, etc., despite not declaring any such bound. The reason is
analogous to why you can use Display
methods on a String
without declaring
that bound, say: the type is statically known, and the compiler recognizes that
dyn Trait
implements Trait
, just like it recognizes that String
implements Display
. Trait bounds are needed for generics, not concrete types.
(In fact, Box<dyn Trait>
doesn't implement Trait
automatically,
but deref coercion usually takes care of that case. For many std
traits,
the trait is explicitly implemented for Box<dyn Trait>
as well;
we'll also explore what that can look like.)
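Here is a minimal sketch (hypothetical trait and function) of calling trait methods through &dyn Trait without any declared bounds:

```rust
trait Trait {
    fn describe(&self) -> String;
}

impl Trait for i32 {
    fn describe(&self) -> String {
        format!("i32: {}", self)
    }
}

// No `T: Trait` bound is declared here: `&dyn Trait` is a concrete type,
// and the compiler knows `dyn Trait` implements `Trait`.
fn use_it(dt: &dyn Trait) -> String {
    dt.describe()
}

fn main() {
    assert_eq!(use_it(&5), "i32: 5");
}
```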
As a concrete type, you can also implement methods on dyn Trait
(provided Trait
is local to your crate), and even implement other
traits for dyn Trait
(as we will see in some of the examples).
dyn Trait
is not a supertype
Because you can coerce base types into a dyn Trait
, it is not uncommon for
people to think that dyn Trait
is some sort of supertype over all the
coercible implementors of Trait
. The confusion is likely exacerbated by
trait bounds and lifetime bounds sharing the same syntax.
But the coercion from a base type to a dyn Trait
is an unsizing coercion,
and not a sub-to-supertype conversion; the coercion happens at statically
known locations in your code, and may change the layout of the types
involved (e.g. changing a thin pointer into a wide pointer) as well.
Relatedly, trait Trait
is not a class. You cannot create a dyn Trait
without an implementing type (they do not have built-in constructors),
and a given type can implement a great many traits. Due to the confusion it
can cause, I recommend not referring to base types as "instances" of the trait.
It is just a type that implements Trait
, which exists independently of the
trait. When I create a String
, I'm creating a String
, not "an instance
of Display
(and Debug
and Write
and ToString
and ...)".
When I read "an instance of Trait
", I assume the variable in question is
some form of dyn Trait
, and not some unerased base type that implements Trait
.
Implementing something for dyn Trait
does not implement it for all other
T: Trait
. In fact it implements it for nothing but dyn Trait
itself.
Implementing something for dyn Trait + Send
doesn't implement anything
for dyn Trait
or vice-versa either; those are also separate, distinct types.
There are ways to emulate dynamic typing in Rust, which we will explore later. We'll also explore the role of supertraits (which, despite the name, still do not define a sub/supertype relationship).
The only subtypes in Rust involve lifetimes and types which are higher-ranked over lifetimes.
(Pedantic self-correction: trait objects have lifetimes and thus are supertypes in that sense. However that's not the same concept that most Rust learners get confused about; there is no supertype relationship with the implementing types.)
dyn Trait
is not universally applicable
We'll look at the details in their own sections, but in short, you cannot
always coerce an implementor of Trait
into dyn Trait
. Both
the trait and the implementor
must meet certain conditions.
In summary
dyn Trait + 'a
is
- a concrete, statically known type
- created by type erasing implementors of Trait
- used behind wide pointers to the type-erased value and to a static vtable
- dynamically sized (unsized, does not implement Sized)
- an implementor of Trait via dynamic dispatch
- not a supertype of all implementors
- not dynamically typed
- not a generic
- not creatable from all values
- not available for all traits
dyn Trait
implementations
In order for dyn Trait
to be useful for abstracting over the base
types which implement Trait
, dyn Trait
itself needs to implement
Trait
. The compiler always supplies that implementation. Here we
look at how this notionally works, and also touch on how this leads
to some related limitations around dyn Trait
.
We also cover a few surprising corner-cases related to how the
implementation of Trait
for dyn Trait
works... or doesn't.
How dyn Trait
implements Trait
Let us note upfront: this is a rough sketch, and not normative. What the
compiler actually does is an implementation detail. But by providing a
sketch of how it could be implemented, we hope to provide some intuition
for dyn Trait
being a concrete type, and some explanation of the
limitations that dyn Trait
has.
With that disclaimer out of the way, let's look at what the compiler implementation might look like for this trait:
#![allow(unused)] fn main() { trait Trait { fn look(&self); fn add(&mut self, s: String) -> i32; } }
Recall that when dealing with dyn Trait
, you'll be dealing with
a pointer to the erased base type, and with a vtable. For example,
we could imagine a &dyn Trait
looks something like this:
#[repr(C)]
struct DynTraitRef<'a> {
_lifetime: PhantomData<&'a ()>,
base_type: *const (),
vtable: &'static DynTraitVtable,
}
// Pseudo-code
type &'a dyn Trait = DynTraitRef<'a>;
Here we're using a thin *const ()
to point to the erased base type.
Similarly, you can imagine a DynTraitMut<'a>
for &'a mut dyn Trait
that uses *mut ()
.
And the vtable might look something like this:
#![allow(unused)] fn main() { #[repr(C)] struct DynTraitVtable { fn_drop: fn(*mut ()), type_size: usize, type_alignment: usize, fn_look: fn(*const ()), fn_add: fn(*mut (), s: String) -> i32, } }
And the implementation itself could look something like this:
impl Trait for dyn Trait + '_ {
fn look(&self) {
(self.vtable.fn_look)(self.base_type)
}
fn add(&mut self, s: String) -> i32 {
(self.vtable.fn_add)(self.base_type, s)
}
}
In summary, we've erased the base type by replacing references to the
base type with the appropriate type of pointer to the same data, both
in the wide references (&dyn Trait
, &mut dyn Trait
), and also in
the vtable function pointers. The compiler guarantees there's no ABI
mismatch.
Reminder: This is just a rough sketch of how dyn Trait can be implemented, to aid high-level understanding and discussion, and not necessarily exactly how it is implemented.
Here's another blog post on the topic.
Note that it was written in 2015, and some things in Rust have changed
since that time. For example, trait objects used to be "spelled" just
Trait
instead of dyn Trait
.
You'll have to figure out if they're talking about the trait or the
dyn Trait
type from context.
Other receivers
Let's look at one other function signature:
#![allow(unused)] fn main() { trait Trait { fn eat_box(self: Box<Self>); } }
How does this work? Internally, a Box<BaseType /* : Sized */>
is
a thin pointer, while a Box<dyn Trait>
is wide pointer, very similar
to &mut dyn Trait
for example (although the Box
pointer implies ownership and
not just exclusivity). The implementation for this method would be
similar to that of &mut dyn Trait
as well:
// Still just for illustrative purpose
impl Trait for dyn Trait + '_ {
fn eat_box(self: Box<Self>) {
let BoxRepresentation { base_type, vtable } = self;
let boxed_type = Box::from_raw(base_type);
(vtable.fn_eat_box)(boxed_type);
}
}
In short, the compiler knows how to go from the type-erased form
(like Box<Self>
) into something ABI compatible for the base type
(Box<BaseType>
) for every supported receiver type.
It's an implementation detail, but currently the way the compiler
knows how to do the conversion is via the
DispatchFromDyn
trait. The documentation lists the current limitations of supported
types (some of which are only available under the unstable
arbitrary_self_types
feature).
Supertraits are also implemented
We'll look at supertraits in more detail later, but here we'll briefly note that when you have a supertrait:
trait SuperTrait { /* ... */ }
trait Trait: SuperTrait { /* ... */ }
The vtable for dyn Trait
includes the methods of SuperTrait
and the compiler
supplies an implementation of SuperTrait
for dyn Trait
, just as it supplies
an implementation of Trait
.
Box<dyn Trait>
and &dyn Trait
do not automatically implement Trait
It may come as a surprise that neither Box<dyn Trait>
nor
&dyn Trait
automatically implement Trait
. Why not?
In short, because it's not always possible.
As we'll cover later, a trait
may have methods which are not dispatchable by dyn Trait
, but must
be implemented for any Sized
type. One example is associated
functions that have no receiver:
#![allow(unused)] fn main() { trait Trait { fn no_receiver() -> String where Self: Sized; } }
There's no way for the compiler to generate the body of such an associated
function, and it can't provide a complete Trait
implementation without
one.
Additionally, the receivers of dispatchable methods don't always make sense:
#![allow(unused)] fn main() { trait Trait { fn takes_mut(&mut self); } }
A &dyn Trait
can produce a &BaseType
, but not a &mut BaseType
, so
there is no way to implement Trait::takes_mut
for &dyn Trait
when
the only pre-existing implementation is for BaseType
.
Similarly, an Arc<dyn Trait>
has no way to call a Box<dyn Trait>
or vice-versa, and so on.
Implementing these yourself
If Trait
is a local trait, you can implement it for Box<dyn Trait + '_>
and so on just like you would for any other type. Take care though, as it
can be easy to accidentally write a recursive definition!
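As a sketch of the non-recursive pattern (with a hypothetical trait), the key is to call the method through the dereferenced dyn Trait rather than through the box itself:

```rust
trait Trait {
    fn describe(&self) -> String;
}

struct S;

impl Trait for S {
    fn describe(&self) -> String {
        "S".to_string()
    }
}

// Forward the implementation to the boxed `dyn Trait`.
impl Trait for Box<dyn Trait + '_> {
    fn describe(&self) -> String {
        // `(**self)` is the `dyn Trait` inside the box; calling
        // `self.describe()` here would recurse forever instead.
        (**self).describe()
    }
}

fn generic<T: Trait>(t: &T) -> String {
    t.describe()
}

fn main() {
    let boxed: Box<dyn Trait> = Box::new(S);
    // The box now satisfies `T: Trait` bounds directly.
    assert_eq!(generic(&boxed), "S");
}
```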
We walk through an example of this later on.
Moreover, &T
, &mut T
, and Box<T>
are
fundamental,
which means that when it comes to the orphan rules (which gate which trait
implementations you can write), they act the same as T
. Additionally,
if Trait
is a local trait, then dyn Trait + '_
is a local type.
Together that means that you can even implement other traits for
Box<dyn Trait + '_>
(and other fundamental wrappers)!
We also have an example of this later on.
Unfortunately, Rc
, Arc
, and so on are not fundamental, so this doesn't
cover every possible use case.
The implementation cannot be directly overridden
The compiler-provided implementation of Trait
for dyn Trait
cannot be
overridden by an implementation in your code. If you attempt to write your
own implementation directly, you'll get a compiler error:
#![allow(unused)] fn main() { trait Trait {} impl Trait for dyn Trait + '_ {} }
And if you have a blanket implementation of Trait
and dyn Trait
happens to meet the bounds on the implementation, it will be ignored and the
compiler-defined implementation will still be used:
```rust
use std::any::type_name;

trait Trait {
    fn hi(&self) {
        println!("Hi from {}!", type_name::<Self>());
    }
}

// The simplest example is an implementation for absolutely everything
impl<T: ?Sized> Trait for T {}

fn main() {
    let dt: &dyn Trait = &();
    // Prints "Hi from ()!" and not "Hi from dyn Trait!"
    dt.hi();
    // Same thing
    <dyn Trait as Trait>::hi(dt);
}
```
This even applies with more complicated implementations,
and applies to the supertrait implementations for dyn Trait
as well.
We'll see that this can be useful later. But
unfortunately, there are some compiler bugs around the compiler
implementation taking precedence over your blanket implementations.
How those bugs are dealt with is yet to be determined; it's possible
that certain blanket implementations will be disallowed, or that
some traits will no longer be dyn
-safe. (The general pattern,
such as the simple example above, is almost surely too widespread
to be deprecated.)
The implementation cannot be indirectly bypassed
You may be aware that when a concrete type has an inherent method with the same name and receiver as a trait method, the inherent method takes precedence when performing method lookup:
```rust
trait Trait {
    fn method(&self) {
        println!("In trait Trait");
    }
}

struct S;
impl Trait for S {}

impl S {
    fn method(&self) {
        println!("In impl S");
    }
}

fn main() {
    let s = S;
    s.method();
    // If you wanted to use the trait, you can do this
    <S as Trait>::method(&s);
}
```
Unfortunately, this functionality is not available for dyn Trait
.
You can write the implementations, but unlike the example above, they
will be considered ambiguous with the trait methods:
```rust
trait Trait {
    fn method(&self) {}
    fn non_dyn_dispatchable(&self) where Self: Sized {}
}

impl dyn Trait + '_ {
    fn method(&self) {}
    fn non_dyn_dispatchable(&self) {}
}

fn foo(d: &dyn Trait) {
    d.method();
    d.non_dyn_dispatchable();
}
```
Moreover, there is no syntax to call the inherent methods specifically
like there is for normal struct
s.
Even if you try to hide the trait,
the inherent methods are unreachable, dead code.
Apparently the idea is that the trait methods "are" the inherent methods of
dyn Trait
, but this is rather unfortunate as it prevents directly providing
something like the non_dyn_dispatchable
override attempted above.
See issue 51402 for more
information.
Implementing methods on dyn Trait
that don't attempt to shadow the
methods of Trait
does work, however.
#![allow(unused)] fn main() { trait Trait {} impl dyn Trait + '_ { fn some_other_method(&self) {} } fn bar(d: &dyn Trait) { d.some_other_method(); } }
A niche exception to dyn Trait: Trait
Some bounds on traits aren't checked until you try to utilize the trait,
even when the trait is considered object safe. As a result, it is
actually sometimes possible to create a dyn Trait
that does not implement
Trait
!
```rust
trait Iterable
where
    for<'a> &'a Self: IntoIterator<Item = &'a <Self as Iterable>::Borrow>,
{
    type Borrow;
    fn iter(&self) -> Box<dyn Iterator<Item = &Self::Borrow> + '_> {
        Box::new(self.into_iter())
    }
}

impl<I: ?Sized, Borrow> Iterable for I
where
    for<'a> &'a Self: IntoIterator<Item = &'a Borrow>,
{
    type Borrow = Borrow;
}

fn example(v: Vec<String>) {
    // This compiles, demonstrating that we can create `dyn Iterable`
    // (i.e. the trait is object safe and `v` can be coerced)
    let dt: &dyn Iterable<Borrow = String> = &v;

    // But this gives an error as `&dyn Iterable` doesn't meet the trait
    // bound, and thus `dyn Iterable` does not implement `Iterable`!
    for item in dt.iter() {
        println!("{item}");
    }
}
```
With this particular example, it's possible to provide an implementation such that
dyn Iterable
meets the bounds.
If that's not possible, you probably need to drop the bound or give up
on the trait being dyn
-safe.
dyn Trait
coercions
Some dyn Trait
coercions which are typical (in terms of what is being coerced) look like so:
#![allow(unused)] fn main() { use std::sync::Arc; trait Trait {} fn coerce_ref<'a, T: Trait + Sized + 'a>(t: &T ) -> &( dyn Trait + 'a) { t } fn coerce_box<'a, T: Trait + Sized + 'a>(t: Box<T>) -> Box<dyn Trait + 'a> { t } fn coerce_arc<'a, T: Trait + Sized + 'a>(t: Arc<T>) -> Arc<dyn Trait + 'a> { t } // etc }
These are more syntactically noisy than you will typically see in practice, as
I have included some explicit lifetimes and bounds which are normally implied
or not used. For example the Sized
bound on generic type parameters
is usually implied,
but I've made it explicit to emphasize that we're talking about Sized
base types.
The key point is that given an object safe Trait
, and when
T: 'a + Trait + Sized
, you can coerce a Ptr<T>
to a Ptr<dyn Trait + 'a>
for the supported Ptr
pointer types such as &_
and Box<_>
.
If we had wanted a dyn Trait + Send + 'a
, naturally we would need T: Send
as well, and similarly for any other auto trait.
In the rest of this section, we look at cases beyond these typical examples, as well as some limitations of coercions.
Associated types
When a trait has one or more non-generic associated types, every concrete implementor of the trait chooses a single, statically-known type for each associated type. For base types, this means the associated types are "outputs" of the implementing type and the implemented trait: if you know the latter two, you can statically determine the associated types as well.
So what should the associated types be in the implementation of Trait
for
dyn Trait
?
There is no single answer; they would need to vary based on the erased base types.
However, dyn Trait
for traits with associated types is just too useful to
make traits with associated types ineligible for dyn Trait
. Instead, associated
types in the trait become, in essence, named type parameters of the dyn Trait
type constructor. (Recall it's already a type constructor due to the trait object lifetime.)
So given
#![allow(unused)] fn main() { trait Iterator { type Item; fn next(&mut self) -> Option<Self::Item>; } }
We have
dyn Iterator<Item = String> + '_
dyn Iterator<Item = i32> + '_
dyn Iterator<Item = f64> + '_
and so on. The associated types in dyn Trait<...>
must be resolved to
concrete types in order for the dyn Trait<...>
to be a concrete type.
Naturally, you can only coerce to dyn Iterator<Item = String>
if you
both implement Iterator
, and in your implementation, type Item = String
.
The syntax mirrors that of associated type trait bounds:
#![allow(unused)] fn main() { fn takes_string_iter<Iter>(i: Iter) where Iter: Iterator<Item = String>, { // ... } }
The parameters being named has a number of benefits. For one, it's
usually quite relevant, such as what Item
an Iterator
returns
(especially if the associated types are well named). It also removes
the need to order the associated types in a well-defined way, such as
lexicographically or especially declaration order (which would be too fragile).
The named parameters must be specified after all ordered parameters, however.
```rust
trait AssocAndParams<T, U> {
    type Assoc1;
    type Assoc2;
}

// The trait's ordered type parameters must be in declaration order
// (here, `String` then `usize`).  After that come the named associated
// type parameters, which can be reordered arbitrarily amongst themselves.
fn foo(d: Box<dyn AssocAndParams<String, usize, Assoc1 = i32, Assoc2 = u32>>)
-> Box<dyn AssocAndParams<String, usize, Assoc2 = u32, Assoc1 = i32>>
{
    d
}
```
Opting out of dyn
-usability
As of Rust 1.72, if you add
a where Self: Sized
bound to an associated type, it is considered
non-dyn
-usable. The associated type
becomes unusable by dyn Trait
, and you no longer need to constrain the
associated type with the named parameter.
```rust
trait Trait {
    type Foo where Self: Sized;
    fn foo(&self) -> Self::Foo where Self: Sized;
    fn bar(&self) {}
}

impl Trait for i32 {
    type Foo = ();
    fn foo(&self) -> Self::Foo {}
}

impl Trait for u64 {
    type Foo = f32;
    fn foo(&self) -> Self::Foo { 0.0 }
}

fn main() {
    // No need for `dyn Trait<Foo = ()>`!
    let mut a: &dyn Trait = &0_i32;
    // No need for associated type equality between base types!
    a = &0_u64;

    // This fails because the type is not defined (`dyn Trait` is not `Sized`)
    // let _: <dyn Trait as Trait>::Foo = todo!();
}
```
Although it produces a warning, you can still optionally specify the
associated type, even though it's not usable by the dyn Trait
itself.
Note also that this does result in incompatible types and limits the
possible coercions:
```rust
trait Trait {
    type Foo where Self: Sized;
    fn foo(&self) -> Self::Foo where Self: Sized;
    fn bar(&self) {}
}

impl Trait for i32 {
    type Foo = ();
    fn foo(&self) -> Self::Foo {}
}

impl Trait for u64 {
    type Foo = f32;
    fn foo(&self) -> Self::Foo { 0.0 }
}

fn main() {
    let mut a: &dyn Trait<Foo = ()> = &0_i32;
    // Fails!
    a = &0_u64;
}
```
This introduces some interesting possibilities around
implementing trait
for Box<dyn Trait>
:
```rust
trait Trait {
    type Foo where Self: Sized;
    fn foo(&self) -> Self::Foo where Self: Sized;
    fn bar(&self) {}
}

impl Trait for i32 {
    type Foo = ();
    fn foo(&self) -> Self::Foo {}
}

impl Trait for u64 {
    type Foo = f32;
    fn foo(&self) -> Self::Foo { 0.0 }
}

impl<T: Default> Trait for Box<dyn Trait<Foo = T>> {
    type Foo = T;
    fn foo(&self) -> Self::Foo {
        T::default()
    }
}
```
The warning currently says "while the associated type can be specified, it cannot be used in any way," but this example shows that is not technically true. I think this sort of usage was just not anticipated.
The reason it's not an error to specify non-dyn
-usable associated types in this
manner is that there was a period where you could add Self: Sized
bounds to
associated types, but were still required to name the associated type in
dyn Trait<..>
. Thus it would be a breaking change to make the warning an error.
Given the potential utility, I would argue that the warning should at a minimum be reworded, and perhaps renamed.
No nested coercions
An unsizing coercion needs to happen behind a layer of indirection (such as a
reference or in a Box
) in order to accommodate the wide pointer to the erased
type's vtable (and because moving unsized types is not supported).
However, the unsizing coercion can only happen behind a single layer of
indirection. For example, you can't coerce a Vec<Box<T>>
to a Vec<Box<dyn Trait>>
.
Why not? Box<T>
and Box<dyn Trait>
have different layouts! The former
is the size of one pointer, while the second is the size of two pointers.
The entire Vec
would need to be reallocated to accommodate such a change:
#![allow(unused)] fn main() { trait Trait {} fn convert_vec<'a, T: Trait + 'a>(v: Vec<Box<T>>) -> Vec<Box<dyn Trait + 'a>> { v.into_iter().map(|bx| bx as _).collect() } }
In general, unsizing coercions consume the original pointer (reference, Box
,
etc) and produce a new one, and this cannot happen in a nested context.
Internally, which coercions are possible are determined by the
CoerceUnsized
trait, and the (compiler-implemented) Unsize
trait, as discussed in the
documentation.
Except when you can
There are some material and some apparent exceptions where unsizing coercion can occur in a nested context.
If you follow the link above, you'll see that some types such as Cell
implement CoerceUnsized
in a recursive manner.
The idea is that Cell
and the others have the same layout as their
generic type parameter. As a result, outer layers of Cell
don't count
as "nesting".
```rust
use std::cell::Cell;

trait Trait {}

// Fails :-(
//fn coerce_vec<'a, T: Trait + 'a>(v: Vec<Box<T>>) -> Vec<Box<dyn Trait + 'a>> {
//    v
//}

// Works! :-)
fn coerce_cell<'a, T: Trait + 'a>(c: Cell<Box<T>>) -> Cell<Box<dyn Trait + 'a>> {
    c
}
```
We'll cover the apparent exceptions (which are actually just supertype coercions) in an upcoming section.
The Sized
limitation
Base types must meet a Sized
bound in order to be able to be coerced to
dyn Trait
. For example, &str
cannot be coerced to &dyn Display
even though str
implements Display
, because str
is unsized.
Why is this limitation in place? &str
is also a wide pointer; it consists
of a pointer to the UTF-8 bytes, and a usize
which is the number of bytes.
Similarly a slice reference &[T]
is a pointer to the contiguous data, and
a count of the number of items.
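These layouts can be observed directly with size_of; the assertions below hold on any target:

```rust
use std::fmt::Display;
use std::mem::size_of;

fn main() {
    let word = size_of::<usize>();
    // Thin pointer: one word
    assert_eq!(size_of::<&u8>(), word);
    // `&str` and `&[u8]` are wide pointers: data pointer + length
    assert_eq!(size_of::<&str>(), 2 * word);
    assert_eq!(size_of::<&[u8]>(), 2 * word);
    // `&dyn Trait` is also wide: data pointer + vtable pointer
    assert_eq!(size_of::<&dyn Display>(), 2 * word);
}
```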
A &dyn Trait
created from a &str
or &[T]
would thus naively need to be
a "super-wide pointer", with a pointer to the data, the element count, and
the vtable pointer. But &dyn Trait
is a concrete type with a static layout
-- two pointers -- so this naive approach can't work. Moreover, what if I
wanted to coerce a super-wide pointer? Each recursive coercion requires
another pointer, making the size unbounded.
A non-naive approach would require special-casing how dynamic dispatch
works for erased non-Sized
base types. For example, once you've type
erased str
, you've lost the information that &str
is also a wide pointer,
and how to create that wide pointer. However, the code would need to recreate
a wide pointer in order to perform dynamic dispatch.
So for dyn Trait
to non-naively support unsized types, it would need
to examine at run-time how to construct a pointer to the erased base type:
one possibility for thin pointers, and an additional possibility for each type
of wide pointer supported. Not only that, but the metadata required (such as
the length of the str
) has to be stored somewhere, and that can't be in
static memory like the vtable is.
Instead, unsized base types are simply not supported.
Sometimes you can work around the limitation by, for example, implementing
the trait for &str
instead of str
, and then coercing a &'_ str
to
dyn Trait + '_
(since references are always Sized
).
#![allow(unused)] fn main() { use std::fmt::Display; // This fails as we cannot coerce `str` to `dyn Display`, so we cannot coerce // `&str` to `&dyn Display`. // let _: &dyn Display = "hi"; // However, `&str` also implements `Display`. (If `T: Display`, then `&T: Display`.) // Because `&str` is `Sized`, we can instead coerce `&&str` to `&dyn Display`: let _: &dyn Display = &"hi"; }
Sized
is also used as a sort of "not-dyn
" marker,
which we explore later.
There is one broad exception to the Sized
limitation: coercing between
forms of dyn Trait
itself, which we look at immediately below.
Discarding auto traits
You can coerce a dyn Trait + Send
to a dyn Trait
, and similarly discard
any other auto trait.
Although
dyn Trait
isn't a supertype of dyn Trait + Send
,
this is nonetheless referred to as upcasting dyn Trait + Send
to dyn Trait
.
Note that auto traits have no methods, and thus no change to the vtable is
required for these coercions. They allow one to call a less restricted
function (that takes dyn Trait
) from a more restrictive one (e.g. one that
requires dyn Trait + Send
). The coercion is necessary as, again, these are
(distinct) concrete types, and not generics nor subtypes nor dynamic types.
Although no change to the vtable is required, this coercion can still not happen in a nested context.
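For example, a minimal sketch of discarding Send (the trait here is illustrative):

```rust
trait Trait {}
impl Trait for u8 {}

// No vtable change is needed to discard an auto trait like `Send`,
// but it is still an unsizing coercion between distinct concrete types
fn discard_send(bx: Box<dyn Trait + Send>) -> Box<dyn Trait> {
    bx
}

fn main() {
    let bx: Box<dyn Trait + Send> = Box::new(0_u8);
    let bx: Box<dyn Trait> = discard_send(bx);
    // Still a wide pointer: data pointer + vtable pointer
    assert_eq!(std::mem::size_of_val(&bx), 2 * std::mem::size_of::<usize>());
}
```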
The reflexive case
You can cast dyn Trait
to dyn Trait
.
Sorry, we're being too imprecise again. You can cast a dyn Trait + 'a
to a dyn Trait + 'b
,
where 'a: 'b
. This is important for
how borrowing works with dyn Trait + '_
.
As lifetimes are erased during compilation, the vtable is the same regardless of the lifetime. Despite that, this unsizing coercion can still not happen in a nested context.
However, in a future section we'll see how variance can allow shortening the trait object lifetime even in nested context, provided that context is also covariant. The section after that about higher-ranked types explores another lifetime-related coercion which could also be considered reflexive.
Supertrait upcasting
Though not supported on stable yet,
the ability to upcast from dyn SubTrait
to dyn SuperTrait
is a feature expected to be available some day.
It is, once again, explicitly a coercion and not a sub/super type relationship (despite the terminology). Although this is an implementation detail, the conversion will probably involve replacing the vtable pointer (in contrast with the last couple of examples).
Until the feature is stable, you can write your own "manual" supertrait upcasts.
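One common shape for such a manual upcast (the names here are illustrative) is an upcasting method on the subtrait, which each implementor can supply trivially:

```rust
trait Super {
    fn hello(&self) -> &'static str {
        "hello"
    }
}

trait Sub: Super {
    // A "manual upcast" from `&dyn Sub` to `&dyn Super`
    fn as_super(&self) -> &dyn Super;
}

struct S;
impl Super for S {}
impl Sub for S {
    fn as_super(&self) -> &dyn Super {
        self
    }
}

fn upcast(d: &dyn Sub) -> &dyn Super {
    d.as_super()
}

fn main() {
    let s = S;
    let sub: &dyn Sub = &s;
    assert_eq!(upcast(sub).hello(), "hello");
}
```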
Object-safe traits only
There are other restrictions on the trait which we have not discussed here, such as not (yet) supporting traits with generic associated types (GATs). We cover those in the next section.
dyn
safety (object safety)
There exist traits for which you cannot create a dyn Trait
:
#![allow(unused)] fn main() { let s = String::new(); let d: &dyn Clone = &s; }
Instead of repeating all the rules here, I'll just link to the reference. You should go read that first.
Note that as of this writing, the reference hasn't been updated to document that you
can opt to make associated types and GATs
unavailable to trait objects by adding a where Self: Sized
bound. For now I'll
refer to this as opting the GAT (or associated type) out of being "dyn
-usable".
What may not be immediately apparent is why these limitations exist. The rest of this page explores some of the reasons.
The Sized
constraints
Before we get into the restrictions, let's have an aside about how the
Sized
constraints work with dyn Trait
and dyn
safety.
Rust uses Sized
to indicate that
- A trait is not dyn safe
- An associated type or GAT is not dyn-usable
- A method is not dyn-dispatchable
- An associated function is not callable for dyn Trait
  - Even though it never can be (so far), you have to declare this for the sake of being explicit and for potential forwards compatibility
This makes some sense, as dyn Trait
is not Sized
. So a dyn Trait
cannot implement a trait with Sized
as a supertrait, and a dyn Trait
can't call methods (or associated functions) that require Sized
either.
However, it's still a hack as there are types which are not Sized
but also
not dyn Trait
, and we might want to implement our trait for those, including
some methods which are not dyn
-dispatchable (such as generic methods).
Currently that's just not possible in Rust (the non-dyn
-dispatchable methods
will also not be available for other unsized types).
The next few paragraphs demonstrate (or perhaps rant about) how this can be an annoying limitation. If you'd rather get on with learning practical Rust, you may want to skip ahead 🙂.
Consider this example, where we've added a Sized
bound in order to remain a dyn
-safe trait:
#![allow(unused)] fn main() { trait Bound<T: ?Sized> {} trait Trait { // Opt-out of `dyn`-dispatchability for this method because it's generic fn method<T: Bound<Self>>(&self) where Self: Sized; } }
If you try to implement this trait for str
, you won't have method
available, even if it would logically make sense to have it available.
Moreover, if you write the implementation like so:
#![allow(unused)] fn main() { trait Bound<T: ?Sized> {} trait Trait { fn method<T: Bound<Self>>(&self) where Self: Sized; } impl Trait for str { // `Self: Sized` isn't true, so don't bother with `method` } }
You get an error saying you must provide method
, even though the
bounds cannot be satisfied. So then you can provide a perfectly
functional implementation:
#![allow(unused)] fn main() { trait Bound<T: ?Sized> {} trait Trait { fn method<T: Bound<Self>>(&self) where Self: Sized; } impl Trait for str { fn method<T: Bound<Self>>(&self) where Self: Sized { // do logical `method` things } } }
Whoops, it doesn't accept that either! 😠 We have to implement it without the bound, like so:
#![allow(unused)] fn main() { trait Bound<T: ?Sized> {} trait Trait { fn method<T: Bound<Self>>(&self) where Self: Sized; } impl Trait for str { fn method<T: Bound<Self>>(&self) { // do logical `method` things } } }
And that compiles... but we can never actually call it.
trait Bound<T: ?Sized> {} trait Trait { fn method<T: Bound<Self>>(&self) where Self: Sized; } impl Trait for str { fn method<T: Bound<Self>>(&self) { } } fn main() { "".method(); }
Alternatively, we can exploit the fact that higher-ranked bounds
are checked at the call site and not the definition site to sneak
in the unsatisfiable Self: Sized
bound in a way that compiles:
#![allow(unused)] fn main() { trait Bound<T: ?Sized> {} trait Trait { fn method<T: Bound<Self>>(&self) where Self: Sized; } impl Trait for str { // Still not callable, but compiles: vvvvvvv due to this binder fn method<T: Bound<Self>>(&self) where for<'a> Self: Sized { unreachable!() } } }
But naturally the method still cannot be called, as the bound is not satisfiable.
This is a pretty sad state of affairs. Ideally, there would be a
distinct trait for opting out of dyn
safety and dispatchability
instead of using Sized
for this purpose; let's call it NotDyn
.
Then we could have Sized: NotDyn
for backwards compatibility,
change the bound above to be NotDyn
, and have our implementation
for str
be functional.
There are also some other future possibilities that may improve the situation:
- Some resolution of RFC issue 2829 or the duplicates linked within would allow omitting the method altogether (but it would still not be callable)
- RFC 2056 will allow defining the method with the trivially unsatisfiable bound without exploiting the higher-ranked trick (but it will still not be callable)
- RFC 3245 will allow calling <str as Trait>::method, and refined implementations more generally
But I feel removing the conflation between dyn
safety and Sized
would
be more clear and correct regardless of any future workarounds that may exist.
Receiver limitations
The requirement for some sort of Self
-based receiver on dyn
-dispatchable
methods is to ensure the vtable is available. Some wide pointer to Self
needs to be present in order to
find the vtable and perform dynamic dispatch.
Arguably this could be expanded to methods that take a single,
non-receiver &Self
and so on.
As for the other limitation on receiver types, the compiler has to know
how to go backwards from type erased version to original
version in order to
implement Trait
. This may be generalized some day, but for
now it's a restricted set.
Generic method limitations
In order to support type-generic methods, there would need to be a function pointer in the vtable for every possible type that the generic could take on. Not only would this create vtables of unwieldy size, it would also require some sort of global analysis. After all, every crate which uses your trait might define new types that meet the trait bounds in question, and they (or you) might also want to call the method using those types.
You can sometimes work around this limitation by type erasing the generic type parameter in question (in the main method, as an alternative method, or in a different "erased" trait). We'll see an example of this later.
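A minimal sketch of that workaround, assuming a made-up Collect trait: instead of a generic iterator parameter, the dyn-dispatchable method takes a &mut dyn Iterator:

```rust
trait Collect {
    // A generic method (`fn collect_all<I: Iterator<Item = String>>`) would
    // not be dyn-dispatchable, so we erase the generic parameter instead
    fn collect_all(&self, iter: &mut dyn Iterator<Item = String>) -> Vec<String>;
}

struct Collector;

impl Collect for Collector {
    fn collect_all(&self, iter: &mut dyn Iterator<Item = String>) -> Vec<String> {
        iter.collect()
    }
}

fn main() {
    let c: &dyn Collect = &Collector;
    // Callers can still pass any concrete iterator by `&mut` reference
    let mut iter = ["a", "b"].into_iter().map(String::from);
    assert_eq!(c.collect_all(&mut iter), vec!["a".to_string(), "b".to_string()]);
}
```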
Use of Self
limitations
Methods which take some form of Self
other than as a receiver
can depend on the parameter being exactly the same as the
implementing type. But this can't be relied upon once the base
types have been erased.
For example, consider PartialEq<Self>
:
#![allow(unused)] fn main() { // Simplified pub trait PartialEq { fn partial_eq(&self, rhs: &Self); } }
If this were implemented for dyn PartialEq
, the rhs
parameter
would be a &dyn PartialEq
like self
is. But there is no
guarantee that the base types are the same! Both u8
and String
implement PartialEq
for example, but there's no facility to
compare them for equality (and Rust has no interest in handling
this in an arbitrary way).
You can sometimes work around this by supplying your own implementations
for some other dyn Trait
, perhaps utilizing the Any
trait
to emulate dynamic typing and reflection.
We give an example of this approach later.
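As a rough sketch of the Any-based approach (the DynEq trait here is made up, and real implementations have more subtleties): equality between erased types can be defined as "downcast, and compare if the base types match":

```rust
use std::any::Any;

trait DynEq: Any {
    fn dyn_eq(&self, other: &dyn DynEq) -> bool;
    fn as_any(&self) -> &dyn Any;
}

impl<T: Any + PartialEq> DynEq for T {
    fn dyn_eq(&self, other: &dyn DynEq) -> bool {
        // If the erased base types differ, they're simply not equal
        match other.as_any().downcast_ref::<T>() {
            Some(other) => self == other,
            None => false,
        }
    }
    fn as_any(&self) -> &dyn Any {
        self
    }
}

fn main() {
    let a: &dyn DynEq = &1_u8;
    let b: &dyn DynEq = &1_u8;
    let c: &dyn DynEq = &String::from("1");
    assert!(a.dyn_eq(b));
    // Different base types compare unequal rather than erroring
    assert!(!a.dyn_eq(c));
}
```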
The impl Clone for Box<dyn Trait>
example
demonstrates handling a case where Self
is the return value.
GAT limitations
GATs are too new to support type erasure as of yet. We'll need
some way to embed the GAT into the dyn Trait
as a parameter,
similar to how it is done for non-generic associated types.
As of Rust 1.72,
you can opt out of GATs being dyn
-usable, and thus out of the
necessity of naming the GAT as a parameter, by adding a
Self: Sized
bound.
This is similar to the same ability on non-generic associated types. Interestingly, it allows specifying not only specific GAT equalities...
#![allow(unused)] fn main() { trait Trait { type Gat<'a> where Self: Sized; } impl Trait for () { type Gat<'a> = &'a str; } let _: &dyn Trait<Gat<'static> = &'static str> = &(); }
...but also higher-ranked GAT equality:
#![allow(unused)] fn main() { trait Trait { type Gat<'a> where Self: Sized; } impl Trait for () { type Gat<'a> = &'a str; } // This syntax is still not supported // let _: &dyn Trait<for<'a> Gat<'a> = &'a str> = &(); // However, with `dyn Trait`, you can move the binder to outside the `Trait`: let _: &dyn for<'a> Trait<Gat<'a> = &'a str> = &(); }
However, as with the non-generic associated type case, making any use of the
equality would have to be done indirectly, as the dyn Trait
itself cannot
define a GAT in its own implementation.
Associated constant limitations
Similarly, supporting associated constants will require at least support for associated constant equality.
Return position impl Trait
limitations
Trait methods utilizing RPITs are, notionally at least, sugar for declaring an opaque associated type or generic associated type. Additionally, even if the RPIT captures no generic parameters and thus corresponds to returning an associated type, there is currently no way to name that associated type.
Similar to generic methods, you can sometimes work around this limitation by type erasing the return type. (Note that there are some trade-offs, but a discussion of such is more suited to a dedicated guide about RPITs.)
History
Object safety was introduced in RFC 0255,
and RFC 0546 removed the
implied Sized
bound on traits and added the rule that traits with (explicit) Sized
bounds
are not object safe.
Both RFCs were implemented before Rust 1.0.
dyn Trait
lifetimes
As mentioned before, every dyn Trait
has a "trait object lifetime". Even though
it is often elided, the lifetime is always present.
The lifetime is necessary as types which implement Trait
may not be valid everywhere.
For example, &'s String
implements Display
for any lifetime 's
. If you type
erase a &'s String
into a dyn Display
, Rust needs to keep track of that lifetime
so you don't try to print the value after the reference becomes invalid.
So you can coerce &'s String
to dyn Display + 's
, but not dyn Display + 'static
.
Let's look at a couple examples:
```rust
use core::fmt::Display;

fn fails() -> Box<dyn Display + 'static> {
    let local = String::new();

    // This reference cannot be longer than the function body
    let borrow = &local;

    // We can coerce it to `dyn Display`...
    let bx: Box<dyn Display + '_> = Box::new(borrow);

    // But the lifetime cannot be `'static`, so this is an error
    bx
}
```
```rust
use core::fmt::Display;

// This is fine as per the function lifetime elision rules, the lifetime of the
// `dyn Display + '_` is the same as the lifetime of the `&String`, and we know
// the reference is valid for that long or it wouldn't be possible to call the
// function.
fn works(s: &String) -> Box<dyn Display + '_> {
    Box::new(s)
}
```
When multiple lifetimes are involved
Let's try another example, with a struct
that has more complicated lifetimes.
#![allow(unused)] fn main() { trait Trait {} // We're using `*mut` to make the lifetimes invariant struct MultiRef<'a, 'b>(*mut &'a str, *mut &'b str); impl Trait for MultiRef<'_, '_> {} fn foo<'a, 'b>(mr: MultiRef<'a, 'b>) { let _: Box<dyn Trait + '_> = Box::new(mr); } }
This compiles, but there's nothing preventing either 'a
from being longer than 'b
,
or 'b
from being longer than 'a
. So what's the lifetime of the dyn Trait
? It
can't be either 'a
or 'b
:
#![allow(unused)] fn main() { trait Trait {} #[derive(Copy, Clone)] struct MultiRef<'a, 'b>(*mut &'a str, *mut &'b str); impl Trait for MultiRef<'_, '_> {} // These both fail fn foo<'a, 'b>(mr: MultiRef<'a, 'b>) { let _: Box<dyn Trait + 'a> = Box::new(mr); let _: Box<dyn Trait + 'b> = Box::new(mr); } }
In this case, the compiler computes some lifetime, let's call it 'c
,
such that 'a
and 'b
are both valid for the entirety of 'c
.
That is, 'c
is contained in an intersection of 'a
and 'b
.
Any lifetime for which both 'a
and 'b
are valid will do:
#![allow(unused)] fn main() { trait Trait {} struct MultiRef<'a, 'b>(*mut &'a str, *mut &'b str); impl Trait for MultiRef<'_, '_> {} // `'c` must be within the intersection of `'a` and `'b` fn foo<'a: 'c, 'b: 'c, 'c>(mr: MultiRef<'a, 'b>) { let _: Box<dyn Trait + 'c> = Box::new(mr); } }
Note that this is not the same as 'a + 'b
-- that is the union
of 'a
and 'b
. Unfortunately, there is no compact syntax
for the intersection of 'a
and 'b
.
Variance
The dyn Trait
lifetime is covariant, like the outer lifetime of a
reference. This means that whenever it is in a covariant type position,
longer lifetimes can be coerced into shorter lifetimes.
#![allow(unused)] fn main() { trait Trait {} fn why_be_static<'a>(bx: Box<dyn Trait + 'static>) -> Box<dyn Trait + 'a> { bx } }
The trait object with the longer lifetime is a subtype of the trait object with the shorter lifetime, so this is a form of supertype coercion. In the next section, we'll look at another form of trait object subtyping.
The idea behind why trait object lifetimes are covariant is that the lifetime represents the region where it is still valid to call methods on the trait object. Since it's valid to call methods anywhere in that region, it's also valid to restrict the region to some subset of itself -- i.e. to coerce the lifetime to be shorter.
However, it turns out that the dyn Trait
lifetime is even more flexible than
your typical covariant lifetime.
Unsizing coercions in invariant context
Earlier we noted that
you can cast a dyn Trait + 'a
to a dyn Trait + 'b
, where 'a: 'b
.
Well, isn't that just covariance? Not quite -- when we noted this before,
we were talking about an unsizing coercion between two dyn Trait + '_
.
And that coercion can take place even in invariant position. That means
that the dyn Trait
lifetime can act in a covariant-like fashion even in
invariant contexts!
For example, this compiles, even though the dyn Trait
is behind a &mut
:
#![allow(unused)] fn main() { trait Trait {} fn invariant_coercion<'m, 'long: 'short, 'short>( arg: &'m mut (dyn Trait + 'long) ) -> &'m mut (dyn Trait + 'short) { arg } }
But as there are no nested unsizing coercions, this version does not compile:
#![allow(unused)] fn main() { use std::cell::Cell; trait Trait {} // Fails: `Cell<T>` is invariant in `T` and the `dyn Trait` is nested fn foo<'l: 's, 's>(v: Cell<Box<Box<dyn Trait + 'l>>>) -> Cell<Box<Box<dyn Trait + 's>>> { v } }
Because this is an unsizing coercion and not a subtyping coercion, there may be situations where you must make the coercion explicitly, for example with a cast.
#![allow(unused)] fn main() { trait Trait {} // This fails without the `as _` cast. fn foo<'a>(arg: &'a mut Box<dyn Trait + 'static>) -> Option<&'a mut (dyn Trait + 'a)> { true.then(move || arg.as_mut() as _) } }
Why this is actually a critical feature
We'll examine elided lifetimes in depth soon, but let us note here how this "ultra-covariance" is very important for making common patterns usably ergonomic.
The signatures of `foo` and `bar` are effectively the same in the following example:
```rust
trait Trait {}

fn foo(d: &mut dyn Trait) {}

fn bar<'a>(d: &'a mut (dyn Trait + 'a)) {
    foo(d);
    foo(d);
}
```
We can call `foo` multiple times from `bar` by reborrowing the `&'a mut dyn Trait` for shorter than `'a`. But because the trait object lifetime must match the outer `&mut` lifetime in this case, we also have to coerce `dyn Trait + 'a` to that shorter lifetime.
Similar considerations come into play when going between a `&mut Box<dyn Trait>` and a `&mut dyn Trait`:
```rust
trait Trait {}

fn foo(d: &mut dyn Trait) {}

fn bar<'a>(d: &'a mut (dyn Trait + 'a)) {
    foo(d);
    foo(d);
}

fn baz(bx: &mut Box<dyn Trait /* + 'static */>) {
    // If the trait object lifetime could not "shrink" inside the `&mut`,
    // we could not make these calls at all
    foo(&mut **bx);
    bar(&mut **bx);
}
```
Here we reborrow `**bx` as `&'a mut (dyn Trait + 'static)` for some short-lived `'a`, and then coerce that to a `&'a mut (dyn Trait + 'a)`.
Variance in nested context
The supertype coercion of going from `dyn Trait + 'a` to `dyn Trait + 'b` when `'a: 'b` can happen in deeply nested contexts, provided the trait object is still in a covariant context. So unlike the `Cell` version above, this version compiles:
```rust
trait Trait {}

fn foo<'l: 's, 's>(v: Vec<Box<Box<dyn Trait + 'l>>>) -> Vec<Box<Box<dyn Trait + 's>>> {
    v
}
```
Higher-ranked types
Another feature of trait objects is that they can be higher-ranked over lifetime parameters of the trait:
```rust
// A trait with a lifetime parameter
trait Look<'s> {
    fn method(&self, s: &'s str);
}

// An implementation that works for any lifetime
impl<'s> Look<'s> for () {
    fn method(&self, s: &'s str) {
        println!("Hi there, {s}!");
    }
}

fn main() {
    // A higher-ranked trait object
    //           vvvvvvvvvvvvvvvvvvvvvvvv
    let _bx: Box<dyn for<'any> Look<'any>> = Box::new(());
}
```
The `for<'x>` part is a lifetime binder that introduces higher-ranked lifetimes. There can be more than one lifetime, and you can give them arbitrary names, just like lifetime parameters on functions, structs, and so on.
You can only coerce to a higher-ranked trait object if you implement the trait in question for all lifetimes. For example, this doesn't work:
```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

impl<'s> Look<'s> for &'s i32 {
    fn method(&self, s: &'s str) {
        println!("Hi there, {s}!");
    }
}

fn main() {
    let _bx: Box<dyn for<'any> Look<'any>> = Box::new(&0);
}
```
`&'s i32` only implements `Look<'s>`, not `Look<'a>` for all lifetimes `'a`.
Similarly, this won't work either:
```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

impl Look<'static> for i32 {
    fn method(&self, s: &'static str) {
        println!("Hi there, {s}!");
    }
}

fn main() {
    let _bx: Box<dyn for<'any> Look<'any>> = Box::new(0);
}
```
Implementing the trait with `'static` as the lifetime parameter is not the same thing as implementing the trait for any lifetime as the parameter. Traits and trait implementations don't have anything like variance; the parameters of traits are always invariant, and thus implementations are always for the explicit lifetime(s) only.
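Continuing with the hypothetical `Look` trait, the fix for the failing examples above is to supply an implementation that is itself generic over the lifetime:

```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

// Implemented for *all* lifetimes `'any`, not just one
impl<'any> Look<'any> for i32 {
    fn method(&self, s: &'any str) {
        println!("Hi there, {s}!");
    }
}

fn main() {
    // Now the coercion to a higher-ranked trait object is accepted
    let bx: Box<dyn for<'any> Look<'any>> = Box::new(0);

    // And we can call `method` with a non-`'static` string, too
    let local = String::from("world");
    bx.method(&local);
}
```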
Subtyping
There's a relationship between higher-ranked types like `dyn for<'any> Look<'any>` and non-higher-ranked types like `dyn Look<'x>` (for a single lifetime `'x`): the higher-ranked type is a subtype of the non-higher-ranked types. Thus you can coerce a higher-ranked type to a non-higher-ranked type with any concrete lifetime:
```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

fn as_static(bx: Box<dyn for<'any> Look<'any>>) -> Box<dyn Look<'static>> {
    bx
}

fn as_whatever<'w>(bx: Box<dyn for<'any> Look<'any>>) -> Box<dyn Look<'w>> {
    bx
}
```
Note that this still isn't a form of variance for the lifetime parameter of the trait. This fails, for example, because you can't coerce from `dyn Look<'static>` to `dyn Look<'w>`:
```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

fn as_static(bx: Box<dyn for<'any> Look<'any>>) -> Box<dyn Look<'static>> {
    bx
}

fn as_whatever<'w>(bx: Box<dyn for<'any> Look<'any>>) -> Box<dyn Look<'w>> {
    as_static(bx)
}
```
As a supertype coercion, going from higher-ranked to non-higher-ranked can apply even in a covariant nested context, just like non-higher-ranked supertype coercions:
```rust
trait Look<'s> {}

fn foo<'l: 's, 's, 'p>(
    v: Vec<Box<dyn for<'any> Look<'any> + 'l>>
) -> Vec<Box<dyn Look<'p> + 's>> {
    v
}
```
`Fn` traits and `fn` pointers
The `Fn` traits (`FnOnce`, `FnMut`, and `Fn`) have special-cased syntax. For one, you write them out to look more like a function, using `(TypeOne, TypeTwo)` to list the input parameters and `-> ResultType` to list the associated type. But for another, elided input lifetimes are sugar that introduces higher-ranked bindings.

For example, these two trait object types are the same:
```rust
fn identity(bx: Box<dyn Fn(&str)>) -> Box<dyn for<'any> Fn(&'any str)> {
    bx
}
```
This is similar to how elided lifetimes work for function declarations as well, and indeed, the same output lifetime elision rules also apply:
```rust
// The elided input lifetime becomes a higher-ranked lifetime
// The elided output lifetime is the same as the single input lifetime
// (underneath the binder)
fn identity(bx: Box<dyn Fn(&str) -> &str>) -> Box<dyn for<'any> Fn(&'any str) -> &'any str> {
    bx
}
```
```rust
// Doesn't compile as what the output lifetime should be is
// considered ambiguous
fn ambiguous(bx: Box<dyn Fn(&str, &str) -> &str>) {}

// Here's a possible fix, which is also an example of
// multiple lifetimes in the binder
fn first(bx: Box<dyn for<'a, 'b> Fn(&'a str, &'b str) -> &'a str>) {}
```
Function pointers are another example of types which can be higher-ranked in Rust. They have analogous syntax and sugar to function declarations and the `Fn` traits.
```rust
fn identity(fp: fn(&str) -> &str) -> for<'any> fn(&'any str) -> &'any str {
    fp
}
```
Syntactic inconsistencies
There are some inconsistencies around the syntax for function declarations, function pointer types, and the `Fn` traits involving the "names" of the input arguments.
First of all, only function (method) declarations can make use of the shorthand `self` syntaxes for receivers, like `&self`:
```rust
struct S;

impl S {
    fn foo(&self) {}
    //     ^^^^^
}
```
This exception is pretty unsurprising, as the `Self` alias only exists within those implementation blocks.
Each non-`self` argument in a function declaration is an irrefutable pattern followed by a type annotation. It is an error to leave out the pattern; if you don't use the argument (and thus don't need to name it), you still need to use at least the wildcard pattern.
```rust
fn this_works(_: i32) {}
fn this_fails(i32) {}
```
There is an accidental exception to this rule, but it was removed in Edition 2018 and thus is only available on Edition 2015.
In contrast, each argument in a function pointer can be:
- An identifier followed by a type annotation (`i: i32`)
- `_` followed by a type annotation (`_: i32`)
- Just a type name (`i32`)
So these all work:
```rust
fn main() {
    let _: fn(i32) = |_| {};
    let _: fn(i: i32) = |_| {};
    let _: fn(_: i32) = |_| {};
}
```
But actual patterns are not allowed:
```rust
fn main() {
    let _: fn(&i: &i32) = |_| {};
}
```
The idiomatic form is to just use the type name.
It's also allowed to have colliding names in function pointer arguments, but this is a property of having no function body -- so it's also possible in a trait method declaration, for example. It is also related to the Edition 2015 exception for anonymous function arguments mentioned above, and may be deprecated eventually.
```rust
trait Trait {
    fn silly(a: u32, a: i32);
}

fn main() {
    let _: fn(a: u32, a: i32) = |_, _| {};
}
```
Finally, each argument in the `Fn` traits can only be a type name: no identifiers, `_`, or patterns allowed.
```rust
// None of these compile
fn main() {
    let _: Box<dyn Fn(i: i32)> = Box::new(|_| {});
    let _: Box<dyn Fn(_: i32)> = Box::new(|_| {});
    let _: Box<dyn Fn(&_: &i32)> = Box::new(|_| {});
}
```
Why the differences? One reason is that patterns are grammatically incompatible with anonymous arguments, apparently.

I'm uncertain as to why identifiers are accepted on function pointers, however, or more generally why the `Fn` sugar is inconsistent with function pointer types. But the simplest explanation is that function pointers existed first with nameable parameters for whatever reason, whereas the `Fn` sugar is for trait input type parameters, which also do not have names.
Higher-ranked trait bounds
You can also apply higher-ranked trait bounds (HRTBs) to generic type parameters, using the same syntax:
```rust
trait Look<'s> {
    fn method(&self, s: &'s str);
}

fn box_it_up<'t, T>(t: T) -> Box<dyn for<'any> Look<'any> + 't>
where
    T: for<'any> Look<'any> + 't,
{
    Box::new(t)
}
```
The sugar for `Fn`-like traits applies here as well. You've probably already seen bounds like this on methods that take closures:
```rust
struct S;

impl S {
    fn map<'s, F, R>(&'s self, mut f: F) -> impl Iterator<Item = R> + 's
    where
        F: FnMut(&[i32]) -> R + 's,
    {
        // This part isn't the point ;-)
        [].into_iter().map(f)
    }
}
```
That bound is actually `F: for<'x> FnMut(&'x [i32]) -> R + 's`.
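To make the sugar concrete, here's a sketch (with made-up `call_sugared` and `call_desugared` functions) showing that the elided bound and the explicitly higher-ranked bound accept the same closures:

```rust
// The elided form...
fn call_sugared<F: FnMut(&[i32]) -> usize>(mut f: F) -> usize {
    f(&[1, 2, 3])
}

// ...is sugar for this explicitly higher-ranked form
fn call_desugared<F>(mut f: F) -> usize
where
    F: for<'x> FnMut(&'x [i32]) -> usize,
{
    f(&[1, 2, 3])
}

fn main() {
    // A non-capturing closure implements both spellings of the bound
    let count = |s: &[i32]| s.len();
    assert_eq!(call_sugared(count), 3);
    assert_eq!(call_desugared(count), 3);
}
```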
That's all about higher-ranked types for now
Hopefully this has given you a decent overview of higher-ranked types, HRTBs, and how they relate to the `Fn` traits. There are a lot more details and nuances to those topics and related concepts such as closures, as you might imagine. However, an exploration of those topics deserves its own dedicated guide, so we won't see too much more about higher-ranked types in this tour of `dyn Trait`.
Elision rules
The `dyn Trait` lifetime elision rules are an instance of fractal complexity in Rust. Some general guidelines will get you 95% of the way there, some advanced guidelines will get you another 4% of the way, but the deeper you go, the more niche circumstances you may run into. And unfortunately, there is no proper specification to refer to.
The good news is that you can override the lifetime elision behavior by being explicit about the lifetime, which provides an escape hatch from most of the complexity. So when in doubt, be explicit!
In the following subsections, we present the current behavior of the compiler in layers, to the extent we have explored them.
We occasionally refer to the reference's documentation on trait object lifetime elision. However, our layered approach differs somewhat from the reference's approach, as the reference is not completely accurate.
Basic guidelines and subtleties
As a reminder, `dyn Trait` is a type constructor which is parameterized with a lifetime; a fully resolved type includes the lifetime, such as `dyn Trait + 'static`. The lifetime can be elided in many situations, in which case the actual lifetime used may take on some default lifetime, or may be inferred.
When talking about default trait object (`dyn Trait`) lifetimes, we're talking about situations where the lifetime has been completely elided. If the wildcard lifetime is used (`dyn Trait + '_`), then the normal lifetime elision rules usually apply instead. (The exceptions are rare, and you can usually be explicit instead if you need to.)
For a completely elided `dyn Trait` lifetime, you can start with these general guidelines for traits with no lifetime bounds (which are the vast majority):
- In function bodies, the trait object lifetime is inferred (i.e. ignore the following bullets)
- For references like `&'a dyn Trait`, the default is the same as the reference lifetime (`'a`)
- For `dyn`-supporting `std` types with lifetime parameters, such as `Ref<'a, T>`, it is also `'a`
- For non-lifetime-parameter types like `Box<dyn Trait>`, and for bare `dyn Trait`, it's `'static`
And for the (rare) trait with lifetime bounds:
- If the trait has a `'static` bound, the trait object lifetime is always `'static`
- If the trait has only non-`'static` lifetime bounds, you're better off being explicit
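As a sketch of the first set of guidelines (using a hypothetical `Trait` with no lifetime bounds, and made-up function names), these signatures just spell out the elided defaults explicitly:

```rust
trait Trait {}
impl Trait for () {}

// `&'a dyn Trait` defaults to `&'a (dyn Trait + 'a)`
fn behind_ref<'a>(r: &'a dyn Trait) -> &'a (dyn Trait + 'a) {
    r
}

// `Box<dyn Trait>` defaults to `Box<dyn Trait + 'static>`
fn boxed(b: Box<dyn Trait>) -> Box<dyn Trait + 'static> {
    b
}

fn main() {
    let unit = ();
    let r: &dyn Trait = &unit;
    let _same = behind_ref(r);
    let _owned = boxed(Box::new(()));
}
```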
This is a close enough approximation to let you understand `dyn Trait` lifetime elision most of the time, but there are exceptions to these guidelines (which are explored on the next couple of pages).
There are also a few subtleties worth pointing out within these guidelines, which are covered immediately below.
Default `'static` bound gotchas
The most likely scenario to run into an error about the `dyn Trait` lifetime is when `Box` or similar is involved, resulting in an implicit `'static` constraint. Those errors can often be addressed by either adding an explicit `'static` bound, or by overriding the implicit `'static` lifetime. In particular, using `'_` will usually result in the "normal" (non-`dyn Trait`) lifetime elision for the given context.
```rust
trait Trait {}
impl<T: Trait> Trait for &T {}

// Remove `+ 'static` to see an error
fn with_explicit_bound<'a, T: Trait + 'static>(t: T) -> Box<dyn Trait> {
    Box::new(t)
}

// Remove `+ 'a` (in either position) to see an error
fn with_nonstatic_box<'a, T: Trait + 'a>(t: T) -> Box<dyn Trait + 'a> {
    Box::new(t)
}

// Remove `+ '_` to see an error
fn with_fn_lifetime_elision(t: &impl Trait) -> Box<dyn Trait + '_> {
    Box::new(t)
}
```
This can be particularly confusing within a function body, where a `Box<dyn Trait>` variable annotation acts differently from a `Box<dyn Trait>` function input parameter annotation:
```rust
trait Trait {}
impl Trait for &i32 {}

// In this context, the elided lifetime is `'static`
fn requires_static(_: Box<dyn Trait>) {}

fn example() {
    let local = 0;

    // In this context, the annotation means `Box<dyn Trait + '_>`!
    // That is why it can compile on its own, with the local reference.
    let bx: Box<dyn Trait> = Box::new(&local);

    // So despite using the same syntax, this call cannot compile.
    // Uncomment it to see the compilation error.
    // requires_static(bx);
}
```
`impl` headers
The `dyn Trait` lifetime elision applies in `impl` headers, which can lead to implementations being less general than possible or desired:
```rust
trait Trait {}
trait Two {}

impl Two for Box<dyn Trait> {}
impl Two for &dyn Trait {}
```
`Two` is implemented for:
- `Box<dyn Trait + 'static>`
- `&'a (dyn Trait + 'a)` for any `'a` (the lifetimes must match)
Consider using implementations like the following if possible, as they are more general:
```rust
trait Trait {}
trait Two {}

// Implemented for all lifetimes
impl Two for Box<dyn Trait + '_> {}

// Implemented for all lifetimes such that the inner lifetime is
// at least as long as the outer lifetime
impl Two for &(dyn Trait + '_) {}
```
Alias gotchas
Similar to `impl` headers, elision will apply when defining a type alias:
```rust
trait MyTraitICouldNotThinkOfAShortNameFor {}

// This is an alias to `dyn ... + 'static`!
type MyDyn = dyn MyTraitICouldNotThinkOfAShortNameFor;

// The default does not "override" the type alias and thus
// requires the trait object lifetime to be `'static`
fn foo(_: &MyDyn) {}

// As per the `dyn` elision rules, this requires the trait
// object lifetime to be the same as the reference...
fn bar(d: &dyn MyTraitICouldNotThinkOfAShortNameFor) {
    // ...and thus this fails as the lifetime cannot be extended
    foo(d);
}
```
More generally, elision does not "penetrate" or alter type aliases. This includes the `Self` alias within implementation blocks.
```rust
trait Trait {}

impl dyn Trait {
    // Error: requires `T: 'static`
    fn f<T: Trait>(t: &T) -> &Self {
        t
    }
}

impl<'a> dyn Trait + 'a {
    // Error: requires `T: 'a`
    fn g<T: Trait>(t: &T) -> &Self {
        t
    }
}
```
See also how type aliases with parameters behave.
`'static` traits
When the trait itself is `'static`, the trait object lifetime has an implied `'static` bound. Therefore, if you name the trait object lifetime explicitly, the name you give it will also have an implied `'static` bound. So here:
```rust
use core::any::Any;

// n.b. trait `Any` has a `'static` bound
fn example<'a>(_: &'a (dyn Any + 'a)) {}

fn main() {
    let local = ();
    example(&local);
}
```
We get an error that the borrow of `local` must be `'static`. The problem is that `'a` in `example` has inherited the `'static` bound (`'a: 'static`), and we also gave the outer reference the lifetime of `'a`. This is a case where we don't actually want them to be the same.
The most ergonomic solution is to always completely elide the trait object lifetime when the trait itself has a `'static` bound. Unlike other cases, the trait object lifetime is independent of the outer reference lifetime when the trait itself has a `'static` bound, so this compiles:
```rust
use core::any::Any;

// This is `&'a (dyn Any + 'static)` and `'a` doesn't have to be `'static`
fn example(_: &dyn Any) {}

fn main() {
    let local = ();
    example(&local);
}
```
`Any` is the most common trait with a `'static` bound, i.e. the most likely reason for you to encounter this scenario.
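For example, this sketch (with a made-up `type_name_of` helper) relies on the elided trait object lifetime defaulting to `'static` while the reference lifetime stays free:

```rust
use core::any::Any;

// `&dyn Any` is `&'a (dyn Any + 'static)` for a free `'a`,
// so short-lived references can still be passed in
fn type_name_of(value: &dyn Any) -> &'static str {
    if value.is::<i32>() {
        "i32"
    } else if value.is::<String>() {
        "String"
    } else {
        "something else"
    }
}

fn main() {
    let local = 5_i32;
    assert_eq!(type_name_of(&local), "i32");
    assert_eq!(type_name_of(&String::new()), "String");
}
```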
`static` contexts
In some contexts, like when declaring a `static`, it's possible to elide the lifetime of types like references; doing so will result in `'static` being used for the elided lifetime:
```rust
// The elided lifetime is `'static`
static S: &str = "";
const C: &str = "";
```
As a result, elided `dyn Trait` lifetimes will by default also be `'static`, matching the inferred lifetime of the reference. In contrast, this fails due to the outer lifetime being `'static`:
```rust
trait Trait {}
impl Trait for () {}

struct S<'a: 'b, 'b>(&'b &'a str);

impl<'a: 'b, 'b> S<'a, 'b> {
    const T: &(dyn Trait + 'a) = &();
}
```
In this context, eliding all the lifetimes is again usually what you want.
Advanced guidelines
In this section, we cover how to guide elision behavior for your own generic data types, and point out some exceptions to the basic guidelines presented in the previous section.
Guiding behavior of your own types
When you declare a custom type with a lifetime parameter `'a` and a trait parameter `T: ?Sized`, including an explicit `T: 'a` bound will result in elision behaving the same as it does for references `&'a T` and for `std` types like `Ref<'a, T>`:
```rust
// When `T` is replaced by `dyn Trait` with an elided lifetime, the elided
// lifetime will default to `'a` outside of function bodies
struct ExplicitOutlives<'a, T: 'a + ?Sized>(&'a T);
```
If your type has no lifetime parameter, or if there is no bound between the type parameter and the lifetime parameter, the default for elided `dyn Trait` lifetimes will be `'static`, like it is for `Box<T>`. This is true even if there is an implied `T: 'a` bound. For example:
```rust
trait Trait {}
impl Trait for () {}

// There's an *implied* `T: 'a` bound due to the `&'a T` field (RFC 2093)
struct InferredOutlivesOnly<'a, T: ?Sized>(&'a T);

// Yet this function expects an `InferredOutlivesOnly<'a, dyn Trait + 'static>`
fn example<'a>(ioo: InferredOutlivesOnly<'a, dyn Trait>) {}

// Thus this fails to compile
fn attempt<'a>(ioo: InferredOutlivesOnly<'a, dyn Trait + 'a>) {
    example(ioo);
}
```
If you make `T: 'a` explicit in the definition of the `struct`, the example will compile.

If `T: 'a` is an inferred bound of your type, and `T: ?Sized`, I recommend including the explicit `T: 'a` bound.
Ambiguous bounds
If you have more than one lifetime bound in your type definition, the bound is considered ambiguous, even if one of the lifetimes is `'static` (or more generally, even if one lifetime is known to outlive the other). Such structs are rare, but if you have one, you usually must be explicit about the `dyn Trait` lifetime:
```rust
trait Trait {}
impl Trait for () {}

struct S<'a, 'b: 'a, T: 'a + 'b + ?Sized>(&'a &'b T);

// error[E0228]: the lifetime bound for this object type cannot be deduced
// from context; please supply an explicit bound
const C: S<dyn Trait> = S(&&());
```
However, in function bodies, the lifetime is still inferred; moreover, it is inferred independently of any annotation of the other lifetimes:
```rust
trait Trait {}
impl Trait for () {}

struct Weird<'a, 'b, T: 'a + 'b + ?Sized>(&'a T, &'b T);

fn example<'a, 'b>() {
    // Either of `dyn Trait + 'a` or `dyn Trait + 'b` is an error,
    // so the `dyn Trait` lifetime must be inferred independently
    // from `'a` and `'b`
    let _: Weird<'a, 'b, dyn Trait> = Weird(&(), &());
}
```
(This is contrary to the documentation in the reference, and ironically more flexible than non-ambiguous types. In this particular example, the lifetime will be inferred analogously to the lifetime intersection mentioned previously.)
Interaction with type aliases
When you use a type alias, the bounds between lifetime parameters and type parameters on the `type` alias determine how `dyn Trait` lifetime elision behaves, overriding the bounds on the aliased type (be they stronger or weaker).
```rust
trait Trait {}

// Without the `T: 'a` bound, the default trait object lifetime
// for this alias is `'static`
type MyRef<'a, T> = &'a T;

// So this compiles
fn foo(mr: MyRef<'_, dyn Trait>) -> &(dyn Trait + 'static) {
    mr
}

// With the `T: 'a` bound, the default trait object lifetime for
// this alias is the lifetime parameter
type MyOtherRef<'a, T: 'a> = MyRef<'a, T>;

// So this does not compile
fn bar(mr: MyOtherRef<'_, dyn Trait>) -> &(dyn Trait + 'static) {
    mr
}
```
See issue 100270. This is undocumented.
Associated types and GATs
`dyn Trait` lifetime elision applies in this context. There are some things of note, however:
- Bounds on associated types and GATs don't seem to have any effect
- Eliding non-`dyn Trait` lifetimes is not allowed
For example:
```rust
trait Trait {}

trait Assoc {
    type T: ?Sized;
}

impl Assoc for () {
    // dyn Trait + 'static
    type T = dyn Trait;
}

impl<'a> Assoc for &'a str {
    // &'a (dyn Trait + 'a)
    type T = &'a dyn Trait;

    // This is a compilation error as the reference lifetime is elided
    // type T = &dyn Trait;
}
```
```rust
trait Trait {}
impl Trait for () {}

trait BoundedAssoc<'x> {
    type BA: 'x + ?Sized;
}

// Still `dyn Trait + 'static`
impl<'x> BoundedAssoc<'x> for () {
    type BA = dyn Trait;
}

// Fails as `'a` might not be `'static`
fn bib<'a>(obj: Box<dyn Trait + 'a>) {
    let obj: Box<<() as BoundedAssoc<'a>>::BA> = obj;
}
```
An exception to inference in function bodies
There is also an exception to the elided `dyn Trait` lifetime being inferred in function bodies. If you have a reference-like type, and you annotate the non-`dyn Trait` lifetime with a named lifetime, then the elided `dyn Trait` lifetime will be the same as the annotated lifetime (similar to how things behave outside of a function body):
```rust
trait Trait {}
impl Trait for () {}

fn example<'a>(arg: &'a ()) {
    let dt: &'a dyn Trait = arg;

    // fails
    let _: &(dyn Trait + 'static) = dt;
}
```
According to the reference, `&dyn Trait` should always behave like this. However, if the outer lifetime is elided or if `'_` is used for the outer lifetime, the `dyn Trait` lifetime is inferred independently of the reference lifetime:
```rust
trait Trait {}
impl Trait for () {}

fn example() {
    let local = ();

    // The outer reference lifetime cannot be `'static`...
    let obj: &dyn Trait = &local;

    // Yet the `dyn Trait` lifetime is!
    let _: &(dyn Trait + 'static) = obj;
}
```
This is not documented anywhere and is in conflict with the reference. It was implemented here, with no team input or FCP. 🤷
However, the chances that you will run into a problem due to this behavior are low, as it's rare to annotate lifetimes within a function body.
What we are not yet covering
To the best of our knowledge, this covers the behavior of `dyn Trait` lifetime elision when there are no lifetime bounds on the trait itself. Non-`'static` lifetime bounds on the trait itself lead to some more nuanced behavior; we'll cover some of them in the next section.
Advanced guidelines summary
So all in all, we have three common categories of `dyn Trait` lifetime elision when ignoring lifetime bounds on traits:
- `'static` for type parameters with no (explicit) lifetime bound
  - E.g. `Box<dyn Trait>` (`Box<dyn Trait + 'static>`)
  - E.g. `struct Unbounded<'a, T: ?Sized>(&'a T)`
- Another lifetime parameter for type parameters with a single (explicit) lifetime bound
  - E.g. `&'a dyn Trait` (`&'a (dyn Trait + 'a)`)
  - E.g. `Ref<'a, dyn Trait>` (`Ref<'a, dyn Trait + 'a>`)
  - E.g. `struct Bounded<'a, T: 'a + ?Sized>(&'a T)`
- Ambiguous due to multiple bounds (rare)
  - E.g. `struct Weird<'a, T: 'a + 'static>(&'a T);`
And the behavior in various contexts is:
| | `static` | `impl` | [G]AT | fn in | fn out | fn body |
|---|---|---|---|---|---|---|
| `Box<dyn Trait>` | `'static` | `'static` | `'static` | `'static` | `'static` | Inferred |
| `&dyn Trait` | Ref | Ref | E0637 | Ref | Ref | Inferred |
| `&'a dyn Trait` | Ref | Ref | Ref | Ref | Ref | Ref |
| Ambig. | E0228 | E0228 | E0228 | E0228 | E0228 | Inferred |
With the following notes:
- `type` alias bounds take precedence over the aliased type bounds
- Associated type and GAT bounds do not affect the default
For contrast, the "normal" elision rules work like so:
| | `static` | `impl` | [G]AT | fn in | fn out | fn body |
|---|---|---|---|---|---|---|
| `Box<dyn Tr + '_>` | `'static` | Fresh | E0637 | Fresh | Elision | Inferred |
| `&(dyn Trait + '_)` | `'static` | Fresh | E0637 | Fresh | Elision | Inferred |
| `&'a (dyn Trait + '_)` | `'static` | Fresh | E0637 | Fresh | Elision | Inferred |
| Ambig. with `'_` | `'static` | Fresh | E0637 | Fresh | Elision | Inferred |
Influences from trait lifetime bounds
When the trait itself has lifetime bounds, those bounds may influence the behavior of `dyn Trait` lifetime elision. Where and how the influence does or does not take place is not properly documented, but we'll cover some cases here.
The way trait object lifetime defaults behave in these scenarios is not intuitive, and perhaps even arbitrary. But to be clear, you will probably never need to actually know the exact rules. Traits with exotic lifetime bounds are rare, and should you actually encounter one, you can usually choose to be explicit instead of trying to figure out what lifetime is the default when elided.
Which is to say, this subsection is more of an exploration of the compiler's current behavior than something useful to learn. If you're trying to learn practical Rust, you should probably just skip it.
A very high level summary is:
- Trait bounds introduce implied bounds on the trait object lifetimes
- Elision in the presence of non-`'static` trait lifetime bounds is arbitrary, so prefer to be explicit
- Prefer not to add non-`'static` lifetime bounds to your own object safe traits
  - Avoid multiple lifetime bounds in particular
This section is also non-exhaustive. Given how many exceptions I have run across, take my assertive statements in this section with a grain of salt.
Trait lifetime bounds create an implied bound
The trait bound creates an implied bound on the `dyn Trait` lifetime:
```rust
pub trait LifetimeTrait<'a, 'b>: 'a {}

pub fn f<'b>(_: Box<dyn LifetimeTrait<'_, 'b> + 'b>) {}

fn fp<'a, 'b, 'c>(t: Box<dyn LifetimeTrait<'a, 'b> + 'c>) {
    // This compiles, which indicates an implied `'c: 'a` bound
    let c: &'c [()] = &[];
    let _: &'a [()] = c;

    // This does not, demonstrating that `'c: 'b` is not implied
    // (i.e. the implied bound is on the trait object lifetime only, and
    // not on the other parameters)
    //let _: &'b [()] = c;

    // This does not as it requires `'c: 'b` and `'b: 'a`
    //f(t);
}
```
This is similar to how `&'b &'a _` creates an implied `'a: 'b` bound. It only applies to the trait object lifetime, and not the entirety of the `dyn Trait` (e.g. it does not apply to trait parameters).
The `'static` case
We've already summarized the behavior of trait object lifetime elision when the trait itself has a `'static` bound as part of our basic guidelines: the lifetime in this case is always `'static`.
This applies even to
- types with ambiguous (more than one) lifetime bounds
- types with a single lifetime bound like `&_`
  - i.e. the trait object lifetime (which is `'static`) becomes independent of the outer lifetime
- situations where a non-`'static` bound does not override the `&_` trait object lifetime default, as in some of the examples further below
This case applies even if there are multiple bounds and only one of them is `'static`, in contrast with bounds considered ambiguous from the struct definition.
A single trait lifetime bound does not always apply
According to the reference, the default trait object lifetime for a trait with a single lifetime bound in the context of a generic struct with no lifetime bounds is always the lifetime in the trait's bound.
That's a mouthful, but the implication is that here:
```rust
trait Single<'a>: 'a {}
```
The elided lifetime of `Box<dyn Single<'a>>` is always `'a`.
However, this is not actually the case:
```rust
trait Single<'a>: 'a {}

// The elided lifetime was `'static`, not `'a`, so this compiles
fn foo<'a>(s: Box<dyn Single<'a>>) {
    let s: Box<dyn Single<'a> + 'static> = s;
}
```
```rust
trait Single<'a>: 'a {}

// In this case it *is* `'a`, so compilation fails
fn bar<'a: 'a>(s: Box<dyn Single<'a>>) {
    let s: Box<dyn Single<'a> + 'static> = s;
}
```
When they apply, trait lifetime bounds override struct bounds
According to the reference, bounds on the trait never override bounds on the struct. But based on my testing, the opposite is true: when bounds on the trait apply, they always override the bounds on the struct.
The complicated part is figuring out when they apply.
For example, the following compiles, but according to the reference it should be ambiguous due to the multiple lifetime bounds on the struct. It does not compile without the lifetime bound on the trait; the bound on the trait is overriding the ambiguous bounds on the struct.
```rust
use core::marker::PhantomData;

// Remove `: 'a` to see the compile error
pub trait LifetimeTrait<'a>: 'a {}

pub struct Over<'a, T: 'a + 'static + ?Sized>(&'a T);
pub struct Invariant<T: ?Sized>(*mut PhantomData<T>);

unsafe impl<T: ?Sized> Sync for Invariant<T> {}

pub static OS: Invariant<Over<'_, dyn LifetimeTrait>> = Invariant(std::ptr::null_mut());
```
Further below are some examples where the trait bound overrides the `&_` bounds as well, so it is not just ambiguous struct bounds which can be overridden by trait bounds.
Multiple trait bounds can be ambiguous or can apply
The following is considered ambiguous due to the multiple lifetime bounds on the trait.
```rust
trait Double<'a, 'b>: 'a + 'b {}

fn f<'a, 'b, T: Double<'a, 'b> + 'static>(t: T) {
    let bx: Box<dyn Double<'a, 'b>> = Box::new(t);

    // This version works:
    let bx: Box<dyn Double<'a, 'b> + 'static> = Box::new(t);
}
```
The current documentation is silent on this point, but a multiple-bound trait can still apply in such a way that it provides the default trait object lifetime.
```rust
pub trait Double<'a, 'b>: 'a + 'b {}

fn x1<'a: 'a, 'b>(bx: Box<dyn Double<'a, 'b>>) {
    // This fails (the lifetime is not `'static`)
    //let bx: Box<dyn Double<'a, 'b> + 'static> = bx;

    // This also fails (the lifetime is not `'b` nor `'a + 'b`)
    //let bx: Box<dyn Double<'a, 'b> + 'b> = bx;

    // But this succeeds and we can conclude the lifetime is `'a`
    let bx: Box<dyn Double<'a, 'b> + 'a> = bx;
}
```
There's a subtle point here: the elided trait object lifetime is `'a`, but there's an implied `: 'a + 'b` bound on the trait object lifetime due to the trait bounds. Therefore the function signature has an implied `'a: 'b` bound, similar to when you have a `&'b &'a _` argument.
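That implied bound can be observed with a sketch along the same lines as the earlier examples (the `witness` function is made up for illustration; if the implied `'a: 'b` bound did not hold, the final coercion would be rejected):

```rust
pub trait Double<'a, 'b>: 'a + 'b {}
impl Double<'static, 'static> for () {}

fn witness<'a: 'a, 'b>(_bx: Box<dyn Double<'a, 'b>>) {
    // The elided trait object lifetime is `'a`, and the implied
    // `'a: 'b` bound is what lets this coercion compile
    let a: &'a [()] = &[];
    let _: &'b [()] = a;
}

fn main() {
    witness(Box::new(()));
}
```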
Trait bounds always apply in function bodies
Based on my testing, the default trait object lifetime for annotations of `dyn Trait` in function bodies is always the trait bound. And in fact, this bound even overrides the wildcard `'_` lifetime annotation. This is a surprising exception to the `'_` annotation restoring "normal" lifetime elision behavior.
#![allow(unused)]
fn main() {
    trait Single<'a>: 'a {}

    fn baz<'long: 'a, 'a, T: 'long + Single<'a>>(s: T) {
        // This compiles with the assignment at the end:
        //let s: Box<dyn Single<'a> + 'long> = Box::new(s);

        // But none of these compile because `'a: 'long` does not hold:
        //let s: Box<dyn Single<'a>> = Box::new(s);
        //let s: Box<dyn Single<'_>> = Box::new(s);
        //let s: Box<dyn Single<'a> + '_> = Box::new(s);
        //let s: Box<dyn Single<'_> + '_> = Box::new(s);
        //let s: Box<dyn Single + '_> = Box::new(s);

        let s: Box<dyn Single> = Box::new(s);
        let s: Box<dyn Single<'_> + 'long> = s;
    }
}
When and how do trait lifetime bounds apply?
Now that we've seen a number of examples, we can theorize when and how trait lifetime bounds apply. As the examples have already illustrated, there are very different rules for different contexts.
In function signatures
This appears to be the most complex and arbitrary context for trait object lifetime elision.
If you were paying close attention, you may have noticed that we occasionally had trivial bounds like `'a: 'a` in the examples above, and that affected whether the trait bounds applied or not. A lifetime parameter of a function with no explicit bounds is known as a late-bound parameter, and whether or not a lifetime is late-bound influences when the trait bounds apply in function signatures. Parameters which are not late-bound are early-bound.
Let us call a lifetime parameter of a trait which is also a bound of the trait a "bounding parameter". My hypothesis on the behavior is as follows:
- if any trait bound is `'static`, the default lifetime is `'static`
- if any bounding parameter is explicitly `'static`, the default lifetime is `'static`
- if exactly one bounding parameter is early-bound, the default lifetime is that lifetime
  - including if it is in multiple positions, such as `dyn Double<'a, 'a>`
- if more than one bounding parameter is early-bound, the default lifetime is ambiguous
- if no bounding parameters are early-bound, the default lifetime depends on the struct bounds (the same as they do for a trait without bounds)
Note that in any case, the implied bounds on the trait object lifetime that exist due to the trait bounds are still in effect.
The requirement that exactly one of the bounding parameters is early-bound, or that any of them is `'static`, is a syntactical requirement rather than a semantic one. For example:
#![allow(unused)]
fn main() {
    pub trait Double<'a, 'b>: 'a + 'b {}

    // Semantically, `'a` and `'b` must be `'static`. However the
    // parameters were not explicitly `'static` and thus this
    // trait object lifetime is considered ambiguous (even though,
    // due to the implied bounds, it must be `'static` too).
    fn foo<'a: 'static, 'b: 'static>(d: Box<dyn Double<'a, 'b>>) {}

    // Semantically, `'a` and `'b` must be the same. They are also
    // early-bound parameters due to the bounds. However the parameters
    // are not syntactically the same lifetime and thus this trait
    // object lifetime is considered ambiguous.
    fn bar<'a: 'b, 'b: 'a>(d: &dyn Double<'a, 'b>) {}
}
But if you change either example to `Double<'a, 'a>`, then exactly one of the bounding parameters is early-bound, and they will compile:
#![allow(unused)]
fn main() {
    pub trait Double<'a, 'b>: 'a + 'b {}
    fn foo<'a: 'static, 'b: 'static>(d: Box<dyn Double<'a, 'a>>) {}
    fn bar<'a: 'b, 'b: 'a>(d: &dyn Double<'a, 'a>) {}
}
Implicit bounds do not negate being late-bound
Note that when considering `&dyn Trait`, there is always an implied bound between the outer reference's lifetime and the `dyn Trait` (in addition to the implied bound from the trait itself). However, these implied bounds are not enough to make the trait bound apply on their own. A lifetime can be late-bound even when there are implied bounds.
#![allow(unused)]
fn main() {
    pub trait LifetimeTrait<'a>: 'a {}
    impl LifetimeTrait<'_> for () {}

    // All of these compile with the `fp` function below, indicating that
    // the trait bound does in fact apply and results in a trait object
    // lifetime independent of the reference lifetime
    pub fn f<'a: 'a>(_: &dyn LifetimeTrait<'a>) {}
    //pub fn f<'a: 'a>(_: &'_ dyn LifetimeTrait<'a>) {}
    //pub fn f<'r, 'a: 'a>(_: &'r dyn LifetimeTrait<'a>) {}
    //pub fn f<'r: 'r, 'a: 'a>(_: &'r dyn LifetimeTrait<'a>) {}
    //pub fn f<'r, 'a: 'r + 'a>(_: &'r dyn LifetimeTrait<'a>) {}
    //pub fn f<'r: 'r, 'a: 'r>(_: &'r dyn LifetimeTrait<'a>) {}

    // However none of these compile with `fp`, indicating that the elided trait
    // object lifetime is defaulting to the reference lifetime "per normal".
    //pub fn f(_: &dyn LifetimeTrait) {}
    //pub fn f(_: &'_ dyn LifetimeTrait) {}
    //pub fn f<'r>(_: &'r dyn LifetimeTrait) {}
    //pub fn f<'r: 'r>(_: &'r dyn LifetimeTrait) {}
    //pub fn f(_: &dyn LifetimeTrait<'_>) {}
    //pub fn f(_: &'_ dyn LifetimeTrait<'_>) {}
    //pub fn f<'r>(_: &'r dyn LifetimeTrait<'_>) {}
    //pub fn f<'r: 'r>(_: &'r dyn LifetimeTrait<'_>) {}
    //pub fn f<'a>(_: &dyn LifetimeTrait<'a>) {}
    //pub fn f<'a>(_: &'_ dyn LifetimeTrait<'a>) {}
    //pub fn f<'r, 'a>(_: &'r dyn LifetimeTrait<'a>) {}
    //pub fn f<'r: 'r, 'a>(_: &'r dyn LifetimeTrait<'a>) {}

    // n.b. `'a` is invariant due to being a trait parameter
    fn fp<'a>(t: &(dyn LifetimeTrait<'a> + 'a)) {
        f(t);
    }
}
The above examples also demonstrate that when trait bounds apply, they do override non-ambiguous struct bounds (such as those of `&_`).
Implied bounds and default object bounds interact
The default object lifetime for a given signature and the implied bounds can interact in potentially surprising ways. Consider this example:
#![allow(unused)]
fn main() {
    pub trait LifetimeTrait<'a>: 'a {}

    // The implied bounds in `&'outer (dyn Lifetime<'param> + 'trait)` are:
    // - `'param: 'outer` (validity of the reference)
    // - `'trait: 'outer` (validity of the reference)
    // - `'trait: 'param` (from the trait bound)
    //
    // And as the trait bound does not apply to the elided parameter in this
    // case, we also have `'outer = 'trait` due to the "normal" default
    // lifetime behavior of `&_`. Adding that equality to the above bounds
    // results in a requirement that *all three lifetimes are the same*.
    //
    // And thus this compiles:
    pub fn g<'r, 'a>(d: &'r dyn LifetimeTrait<'a>) {
        let r: [&'r (); 1] = [&()];
        let a: [&'a (); 1] = [&()];
        let _: [&'a (); 1] = r;
        let _: [&'r (); 1] = a;
        let _: &'r (dyn LifetimeTrait<'r> + 'r) = d;
        let _: &'a (dyn LifetimeTrait<'a> + 'a) = d;
    }
}
The results can be even more surprising with more complex bounds:
#![allow(unused)]
fn main() {
    trait Double<'a, 'b>: 'a + 'b {}

    fn h<'a, 'b, T>(bx: Box<dyn Double<'a, 'b>>, t: &'a T)
    where
        &'a T: Send, // this makes `'a` early-bound
    {
        // `bx` is `Box<dyn Double<'a, 'b> + 'a>` as per the rules above,
        // so this does not compile:
        //let _: Box<dyn Double<'a, 'b> + 'static> = bx;

        // However, the implied bounds still apply, which means:
        // - `'a: 'a + 'b`
        // - So `'a: 'b`
        //
        // Which is why this can compile even though that bound
        // is not declared anywhere!
        let t: &'b T = t;

        // The lifetimes are still not the same, so this fails
        let _: &'a T = t;
    }
}
The only reason that `'a: 'b` is an implied bound in the above example is the interaction between
- the implied `: 'a + 'b` bound on the trait object lifetime
- the default trait object lifetime being `'a`
  - due to `'a` being early-bound and `'b` being late-bound

If `'b` was also early-bound, the default trait object lifetime would be ambiguous. If `'a` wasn't early-bound, the default trait object lifetime would be `'static` and there would be no implied `'a: 'b` bound.
The wildcard lifetime still introduces a fresh inference lifetime
Based on my testing, using `'_` will behave like typical lifetime elision, introducing a fresh inference lifetime in input position, and following the function signature elision rules in output position.
Higher-ranked lifetimes are late-bound
Based on my testing, `for<'a> dyn Trait...` lifetimes act the same as late-bound lifetimes.
Function bodies
As mentioned above, trait object bounds always apply in function bodies, similar to function signatures where every lifetime is early-bound. This is true regardless of whether the lifetimes are early or late bound in the function signature.
#![allow(unused)]
fn main() {
    trait Single<'a>: 'a {}

    fn foo<'r, 'a>(bx: Box<dyn Single<'a> + 'static>, rf: &'r (dyn Single<'a> + 'static)) {
        // Here it is `'a`, and not `'static` nor inferred
        let bx: Box<dyn Single<'a>> = bx;
        // So this fails
        //let _: Box<dyn Single<'a> + 'static> = bx;

        // Here it is `'a`, and not the same as the reference lifetime nor inferred
        let a: &dyn Single<'a> = rf;
        // So this succeeds
        let _: &(dyn Single<'a> + 'a) = a;
        // And this fails
        //let _: &(dyn Single<'a> + 'static) = a;

        // Same behavior when the reference lifetime is explicit
        let a: &'r dyn Single<'a> = rf;
        let _: &'r (dyn Single<'a> + 'a) = a;
        //let _: &'r (dyn Single<'a> + 'static) = a;

        // This also fails, demonstrating that `'r` is not `'a`
        //let _: &'a &'r () = &&();
    }
}
And unlike elsewhere, using `'_` in place of complete trait object lifetime elision in the function body does not restore the normal lifetime elision behavior (which would be inferring the lifetime). All three of the examples above behave identically if `'_` is used.
#![allow(unused)]
fn main() {
    trait Single<'a>: 'a {}

    fn foo<'r, 'a>(bx: Box<dyn Single<'a> + 'static>, rf: &'r (dyn Single<'a> + 'static)) {
        let bx: Box<dyn Single<'a> + '_> = bx;
        // Fails
        //let _: Box<dyn Single<'a> + 'static> = bx;

        let a: &(dyn Single<'a> + '_) = rf;
        let _: &(dyn Single<'a> + 'a) = a;
        // Fails
        //let _: &(dyn Single<'a> + 'static) = a;

        let a: &'r (dyn Single<'a> + '_) = rf;
        let _: &'r (dyn Single<'a> + 'a) = a;
        // Fails
        //let _: &'r (dyn Single<'a> + 'static) = a;
    }
}
In combination with the behavior of function signatures, this can lead to some awkward situations.
#![allow(unused)]
fn main() {
    trait Double<'a, 'b>: 'a + 'b {}

    // Here in the signature, `'_` acts like "normal" and creates an
    // independent lifetime for the trait object lifetime; let us call
    // it `'c`. Though independent, it is related due to the implied
    // bounds: `'c: 'a + 'b`
    fn foo<'a, 'b>(bx: Box<dyn Double<'a, 'b> + '_>) {
        // Here in the body, the default trait object lifetime is
        // considered ambiguous, and `'_` does not override this.
        //
        // Moreover, there is no way to name `'c` since it was
        // elided in the signature. We could annotate this as
        // either `'a` or `'b`, but cannot "preserve" the full
        // lifetime unless we change the function signature to
        // give the lifetime a name.
        let bx: Box<dyn Double<'a, 'b> + '_> = bx;
    }
}
Static contexts
In most static contexts, any elided lifetimes (not just trait object lifetimes) default to the `'static` lifetime.
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Single<'a>: 'a + Send + Sync {}
    trait Halfie<'a, 'b>: 'a + Send + Sync {}
    trait Double<'a, 'b>: 'a + 'b + Send + Sync {}

    static BS: PhantomData<Box<dyn Single<'_>>> = PhantomData;
    static BH: PhantomData<Box<dyn Halfie<'_, '_>>> = PhantomData;
    static BD: PhantomData<Box<dyn Double<'_, '_>>> = PhantomData;
    static S_BS: PhantomData<Box<dyn Single<'static> + 'static>> = BS;
    static S_BH: PhantomData<Box<dyn Halfie<'static, 'static> + 'static>> = BH;
    static S_BD: PhantomData<Box<dyn Double<'static, 'static> + 'static>> = BD;

    const CS: PhantomData<Box<dyn Single<'_>>> = PhantomData;
    const CH: PhantomData<Box<dyn Halfie<'_, '_>>> = PhantomData;
    const CD: PhantomData<Box<dyn Double<'_, '_>>> = PhantomData;
    const S_CS: PhantomData<Box<dyn Single<'static> + 'static>> = CS;
    const S_CH: PhantomData<Box<dyn Halfie<'static, 'static> + 'static>> = CH;
    const S_CD: PhantomData<Box<dyn Double<'static, 'static> + 'static>> = CD;
}
However, from Rust 1.64 forward, associated `const`s were allowed to use general elided lifetimes and the wildcard lifetime (as opposed to only elided trait object lifetimes). This was an accidental stabilization which will probably be removed or modified. In the meantime, elided lifetimes act like independent lifetime variables on the `impl` block. Those in turn act like early-bound lifetimes in function signatures.
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Single<'a>: 'a + Send + Sync {}
    struct L<'l, 'm>(&'l str, &'m str);

    impl<'a, 'b> L<'a, 'b> {
        const CS: PhantomData<Box<dyn Single<'a>>> = PhantomData;
        // Fails
        //const S_CS: PhantomData<Box<dyn Single<'a> + 'static>> = Self::CS;
        const S_CS: PhantomData<Box<dyn Single<'a> + 'a>> = Self::CS;
    }
}
Elided lifetimes can be inferred to be `'static` elsewhere...
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Single<'a>: 'a + Send + Sync {}
    struct L<'l, 'm>(&'l str, &'m str);

    impl<'a, 'b> L<'a, 'b> {
        const ECS: PhantomData<Box<dyn Single<'_>>> = PhantomData;
        const SCS: PhantomData<Box<dyn Single<'static>>> = PhantomData;
        const S_ECS: PhantomData<Box<dyn Single<'static> + 'static>> = Self::ECS;
        const S_SCS: PhantomData<Box<dyn Single<'static> + 'static>> = Self::SCS;
    }
}
...however, it's really a free variable. Therefore, cases such as this are considered ambiguous:
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Double<'a, 'b>: 'a + 'b + Send + Sync {}
    struct L<'l, 'm>(&'l str, &'m str);

    impl<'a, 'b> L<'a, 'b> {
        const EBCD: PhantomData<Box<dyn Double<'a, '_>>> = PhantomData;
    }
}
...and cases such as this are considered to be a borrow check violation, as there are no outlives relationships between the anonymously introduced lifetime parameters:
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Single<'a>: 'a + Send + Sync {}
    struct R<'l, 'm, 'r>(&'l str, &'m str, &'r ());

    impl<'a, 'b, 'r> R<'a, 'b, 'r>
    where
        'a: 'r,
        'b: 'r,
    {
        const ECS: PhantomData<&dyn Single<'_>> = PhantomData;
        const RECS: PhantomData<&'r dyn Single<'_>> = PhantomData;
    }
}
(There is no implicit bound due to nesting the lifetimes because the nesting occurs in the body of the `impl` block and not the header.)
`impl` headers

Trait bounds always apply in `impl` headers.
#![allow(unused)]
fn main() {
    trait Single<'a>: 'a {}
    trait Halfie<'a, 'b>: 'a {}
    trait Double<'a, 'b>: 'a + 'b {}
    struct S<T>(T);

    // The trait bounds apply
    impl<'a> S<Box<dyn Single<'a>>> { fn f01() {} }            // 'a (not 'static)
    impl<'a, 'r> S<&'r dyn Single<'a>> { fn f02() {} }         // 'a (not 'r)
    impl<'a, 'b> S<Box<dyn Halfie<'a, 'b>>> { fn f03() {} }    // 'a (not 'static)
    impl<'a, 'b, 'r> S<&'r dyn Halfie<'a, 'b>> { fn f04() {} } // 'a (not 'r)

    // Ambiguous (uncomment for error)
    // impl<'a, 'b> S<Box<dyn Double<'a, 'b>>> { fn f05() {} }
    // impl<'a, 'b, 'r> S<&'r dyn Double<'a, 'b>> { fn f05() {} }

    // Try `+ 'static` or `+ 'r` for errors
    fn f<'a, 'b, 'r>(_: &'r &'a str, _: &'r &'b str) {
        S::<Box<dyn Single<'a> + 'a>>::f01();
        S::<&'r (dyn Single<'a> + 'a)>::f02();
        S::<Box<dyn Halfie<'a, 'b> + 'a>>::f03();
        S::<&'r (dyn Halfie<'a, 'b> + 'a)>::f04();
    }
}
As in function signatures, but unlike function bodies, the wildcard lifetime `'_` acts like normal elision (introducing a new anonymous lifetime variable).
#![allow(unused)]
fn main() {
    trait Single<'a>: 'a {}
    trait Halfie<'a, 'b>: 'a {}
    trait Double<'a, 'b>: 'a + 'b {}
    struct S<T>(T);

    // The wildcard lifetime `'_` introduces an independent lifetime
    // (covering all cases including `'static`) as per normal
    impl<'a> S<Box<dyn Single<'a> + '_>> { fn f26() {} }
    impl<'a, 'r> S<&'r (dyn Single<'a> + '_)> { fn f27() {} }
    impl<'a, 'b> S<Box<dyn Halfie<'a, 'b> + '_>> { fn f28() {} }
    impl<'a, 'b, 'r> S<&'r (dyn Halfie<'a, 'b> + '_)> { fn f29() {} }
    impl<'a, 'b> S<Box<dyn Double<'a, 'b> + '_>> { fn f30() {} }
    impl<'a, 'b, 'r> S<&'r (dyn Double<'a, 'b> + '_)> { fn f31() {} }

    fn f<'a, 'b, 'r>(_: &'r &'a str, _: &'r &'b str) {
        S::<Box<dyn Single<'a> + 'static>>::f26();
        S::<&'r (dyn Single<'a> + 'static)>::f27();
        S::<Box<dyn Halfie<'a, 'b> + 'static>>::f28();
        S::<&'r (dyn Halfie<'a, 'b> + 'static)>::f29();
        S::<Box<dyn Double<'a, 'b> + 'static>>::f30();
        S::<&'r (dyn Double<'a, 'b> + 'static)>::f31();
    }
}
Associated types
Similar to `impl` headers, trait bounds always apply to associated types.
#![allow(unused)]
fn main() {
    use core::marker::PhantomData;

    trait Single<'a>: 'a {}
    trait Halfie<'a, 'b>: 'a {}
    trait Double<'a, 'b>: 'a + 'b {}

    trait Assoc {
        type A01: ?Sized + Default;
        type A02: ?Sized + Default;
        //type A03: ?Sized + Default;
        type A04: ?Sized + Default;
        type A05: ?Sized + Default;
        //type A06: ?Sized + Default;
    }

    impl<'r, 'a, 'b> Assoc for (&'r &'a (), &'r &'b ()) {
        // '_ is not allowed here
        // & /* elided */ is not allowed here
        type A01 = PhantomData<Box<dyn Single<'a>>>;
        type A02 = PhantomData<Box<dyn Halfie<'a, 'b>>>;
        // ambiguous
        // type A03 = PhantomData<Box<dyn Double<'a, 'b>>>;
        type A04 = PhantomData<&'r dyn Single<'a>>;
        type A05 = PhantomData<&'r dyn Halfie<'a, 'b>>;
        // ambiguous
        // type A06 = PhantomData<&'r dyn Double<'a, 'b>>;
    }

    fn f<'r, 'a: 'r, 'b: 'r>() {
        // 'a (not `'static`, `'r`, `'b`)
        let _: PhantomData<Box<dyn Single<'a> + 'a>> =
            <(&'r &'a (), &'r &'b ()) as Assoc>::A01::default();
        let _: PhantomData<Box<dyn Halfie<'a, 'b> + 'a>> =
            <(&'r &'a (), &'r &'b ()) as Assoc>::A02::default();
        let _: PhantomData<&'r (dyn Single<'a> + 'a)> =
            <(&'r &'a (), &'r &'b ()) as Assoc>::A04::default();
        let _: PhantomData<&'r (dyn Halfie<'a, 'b> + 'a)> =
            <(&'r &'a (), &'r &'b ()) as Assoc>::A05::default();
    }
}
Note: I have not performed extensive tests with GATs or associated types which themselves have lifetime bounds in combination with bounded traits.
Citations
Finally we compare what we've covered about `dyn Trait` lifetime elision to the current reference material, and supply some citations to the elision's storied history.
Summary of differences from the reference
The official documentation on trait object lifetime elision can be found here.
In summary, it states that `dyn Trait` lifetimes have a default object lifetime bound which varies based on context. It states that the default bound only takes effect when the lifetime is entirely omitted. When you write out `dyn Trait + '_`, the normal lifetime elision rules apply instead.
In particular, as of this writing, the official documentation states that
If the trait object is used as a type argument of a generic type then the containing type is first used to try to infer a bound.
- If there is a unique bound from the containing type then that is the default
- If there is more than one bound from the containing type then an explicit bound must be specified
If neither of those rules apply, then the bounds on the trait are used:
- If the trait is defined with a single lifetime bound then that bound is used.
- If `'static` is used for any lifetime bound then `'static` is used.
- If the trait has no lifetime bounds, then the lifetime is inferred in expressions and is `'static` outside of expressions.
Some differences from the reference which we have covered are that
- inferring bounds in expressions applies to `&T` types unless annotated with a named lifetime
- inferring bounds in expressions applies to ambiguous types
- when trait bounds apply, they override struct bounds, not the other way around
- a `'static` trait bound always applies
- otherwise, whether trait bounds apply or not depends on complicated contextual rules
  - they always apply in `impl` headers, associated types, and function bodies
  - and technically in `static` contexts, with some odd caveats
  - whether they apply in function signatures depends on the bounding parameters being late or early bound
    - a single parameter can apply to a trait with multiple bounds in this context, introducing new implied lifetime bounds
- trait bounds override `'_` in function bodies
And some other under- or undocumented behaviors are that
- aliases override struct definitions
- trait bounds create implied bounds on the trait object lifetime
- associated type and GAT bounds do not affect the default trait object lifetime
RFCs, Issues, and PRs
Trait objects, and trait object lifetime elision in particular, have undergone a lot of evolution over time. Here we summarize some of the major developments and issues.
Reminder: a lot of these citations predate the `dyn Trait` syntax. Trait objects used to be spelled just `Trait` in type position, instead of `dyn Trait`.
- RFC 0192 first introduced the trait object lifetime
- RFC 0599 first introduced default trait object lifetimes (`dyn Trait` lifetime elision)
- RFC 1156 superseded RFC 0599 (`dyn Trait` lifetime elision)
- PR 39305 modified RFC 1156 (unofficially) to allow more inference in function bodies
- RFC 2093 defined how `struct` bounds interact with `dyn Trait` lifetime elision
- Issue 100270 notes that type aliases take precedence in terms of RFC 2093 `dyn Trait` lifetime elision rules
- Issue 47078 notes that being late-bound influences `dyn Trait` lifetime elision
`dyn Trait` vs. alternatives

When getting familiar with Rust, it can be hard at first to recognize when you should use `dyn Trait` versus some other type mechanism, such as `impl Trait` or generics. In this section we look at some tradeoffs, depending on the use case.
Generic functions and argument position `impl Trait`

Preliminaries: What is argument position `impl Trait`?

When we talk about argument position `impl Trait`, aka APIT, we're talking about functions such as this:
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    fn foo(d: impl Display) { println!("{d}"); }
    // APIT:    ^^^^^^^^^^^^
}
That is, `impl Trait` as an argument type of a function.

APIT is, so far at least, mostly the same as a generic parameter:
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    fn foo<D: Display>(d: D) { println!("{d}"); }
}
The main difference is that generics allow
- the function writer to refer to `D`
  - e.g. `D::to_string(&d)`
- other utilizers to turbofish the function
  - e.g. `let function_pointer = foo::<String>;`
Whereas the `impl Display` parameter is not nameable inside nor outside the function.
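As a small illustration of that difference (the `describe` function here is my own, not from the text above), the named parameter can be used both inside the body and by callers:

```rust
use std::fmt::Display;

// With a named parameter `D`, the body can refer to `D` directly...
fn describe<D: Display>(d: D) -> String {
    D::to_string(&d)
}

// ...and callers can turbofish to pick a concrete type explicitly,
// e.g. to take a function pointer-like value.
fn caller() -> String {
    let f = describe::<i32>;
    f(5)
}
```

Neither of these is possible with the `impl Display` form of the same function.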
There may be more differences in the future, but for now at least, generics are the more flexible and thus superior form -- unless you have a burning hatred against the `<...>` syntax, anyway.
At any rate, comparing `dyn Trait` against APIT is essentially the same as comparing `dyn Trait` against a function with a generic type parameter.
Tradeoffs between generic functions and dyn Trait
Here, we're talking about choosing between signatures like so:
#![allow(unused)]
fn main() {
    trait Trait {}

    // Owned or borrowed generics
    fn foo1<T: Trait>(t: T) {}
    fn bar1<T: Trait + ?Sized>(t: &T) {}

    // Owned or borrowed `dyn Trait`
    fn foo2(t: Box<dyn Trait + '_>) {}
    fn bar2(t: &dyn Trait) {}
}
When a function has a generic parameter, the parameter is monomorphized for every concrete type which is used to call the function (after lifetime erasure). That is, every type the parameter takes on results in a distinct function in the compiled code. (Some of the resulting functions may be eliminated or combined by optimization if possible.) There could be many copies of `foo1` and `bar1`, depending on how it's called.
But (after lifetime erasure), `dyn Trait` is a singular concrete type. There will only be one copy of `foo2` and `bar2`.
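To make the single-copy point concrete, here's a sketch with a toy trait of my own; both calls below go through the same compiled function:

```rust
trait Trait {
    fn name(&self) -> &'static str;
}

impl Trait for i32 {
    fn name(&self) -> &'static str { "i32" }
}

impl Trait for String {
    fn name(&self) -> &'static str { "String" }
}

// `&dyn Trait` is one concrete type (after lifetime erasure), so this
// function is compiled exactly once, no matter how many distinct
// implementors are coerced to `&dyn Trait` at the call sites.
fn bar2(t: &dyn Trait) -> &'static str {
    t.name()
}
```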
Yet in your typical Rust program, generic arguments are preferred over `dyn Trait` arguments. Why is that? There are a number of reasons:
- Each monomorphized function can typically be optimized better
- Trait bounds are more general than `dyn Trait`
  - No `dyn` safety concerns (`T: Clone` is possible)
  - No single trait restriction (`T: Trait1 + Trait2` is allowed)
- Less indirection through dynamic dispatch
- No need for boxing in the owned case
  - `Box` isn't even available in `#![no_std]` programs
The `dyn Trait` versions do have the following advantages:
- Smaller code size
- Faster code generation
- Do not make traits `dyn`-unsafe
In general, you should prefer generics unless you have a specific reason to opt for `dyn Trait` in argument position.
Return position `impl Trait` and TAIT

Preliminaries: What are return position `impl Trait` and TAIT?

When we talk about return position `impl Trait`, aka RPIT, we're talking about functions such as this:
#![allow(unused)]
fn main() {
    // RPIT:               vvvvvvvvvvvvvvvvvvvvvvv
    fn foo<T>(v: Vec<T>) -> impl Iterator<Item = T> {
        v.into_iter().inspect(|t| println!("{t:p}"))
    }
}
Unlike APIT, RPITs are not the same as a generic type parameter. They are instead opaque type aliases or opaque type alias constructors. In the above example, the RPIT is an opaque type alias constructor which depends on the input type parameter of the function (`T`). For every concrete `T`, the RPIT is also an alias of a singular concrete type.
The function body and the compiler still know what the concrete type is, but that is opaque to the caller and other code. Instead, the only ways you can use the type are those which are compatible with the trait or traits in the `impl Trait`, plus any auto traits which the concrete type happens to implement. (Or things provable from such properties, such as blanket trait implementations.)
The singular part is key: the following code does not compile because it is trying to return two distinct types. Rust is strictly and statically typed, so this is not possible -- the opacity of the RPIT does not and cannot change that.
#![allow(unused)]
fn main() {
    use std::fmt::Display;

    fn foo(b: bool) -> impl Display {
        if b { 0 } else { "hi!" }
    }
}
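If you genuinely need to return a different type per branch, type erasure is the usual fix; a sketch of the boxed version of the same function:

```rust
use std::fmt::Display;

// Unlike the RPIT version, both branches coerce to the *same* concrete
// type, `Box<dyn Display>`, so the function type-checks. Boxing and
// dynamic dispatch are the price paid for this flexibility.
fn foo(b: bool) -> Box<dyn Display> {
    if b {
        Box::new(0)
    } else {
        Box::new("hi!")
    }
}
```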
`type` alias `impl Trait`, or TAIT, is a generalization of RPIT which is not yet stable, but will probably become stable before too much longer. TAIT allows one to define aliases for opaque types, which allows them to be named and to be used in more than one location.
#![allow(unused)]
#![feature(type_alias_impl_trait)]
fn main() {
    type MyDisplay = impl std::fmt::Display;

    fn foo() -> MyDisplay { "hello," }
    fn bar() -> MyDisplay { " world" }
}
Notionally (and hopefully literally), RPIT desugars to a TAIT in a manner similar to this:
#![allow(unused)]
#![feature(type_alias_impl_trait)]
fn main() {
    use std::fmt::Display;

    fn foo1() -> impl Display { "hi" }

    // Same thing... or so
    type __Unnameable_Tait = impl Display;
    fn foo2() -> __Unnameable_Tait { "hi" }
}
TAITs must still be an alias of a singular, concrete type.
Other downsides of opaque types generally from the perspective of the caller include
- Opaque types are invariant on all parameters, whereas nominal structs need not be
- No traits, including local traits, can be implemented on opaque types
- All of a nominal struct's trait implementations, public fields, and public inherent methods are available, while opaque types reveal much less
Tradeoffs between RPIT and dyn Trait
RPITs and `dyn Trait` returns share some benefits for the function writer:
- So long as the bounds don't change, you can change the concrete or base type
- You can return unnameable types, such as closures
- It simplifies complicated types, such as long iterator combinator chains
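For instance (a sketch with made-up function names), both mechanisms can return an otherwise-unnameable closure:

```rust
// RPIT: no allocation, but the opaque return type is unnameable
// by callers.
fn make_adder_rpit(n: i32) -> impl Fn(i32) -> i32 {
    move |x| x + n
}

// `dyn Trait`: the return type is nameable (and could hold closures of
// different shapes), at the cost of a `Box` and dynamic dispatch.
fn make_adder_dyn(n: i32) -> Box<dyn Fn(i32) -> i32> {
    Box::new(move |x| x + n)
}
```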
`dyn Trait` does have some limitations and downsides:
- Only one non-auto-trait is supportable without subtrait boilerplate
  - In contrast, with RPIT you can return `impl Trait1 + Trait2`
- Only `dyn`-safe traits are supportable
  - In contrast, with RPIT you can return `impl Clone`
- Boxing in some form is required to return owned types
- You pay the typical optimization penalties of not knowing the base type and performing dynamic dispatch
However, RPITs also have their downsides:
- As an opaque alias, you can only return one actual, concrete type
- For now, the return type is unnameable, which can be awkward for consumers
  - e.g. you can't store the result as a non-generic field in your struct
  - ...unless the opaque type bounds are `dyn`-safe and you can type erase it yourself
- Auto-traits are leaky, so it's easy for the function writer to accidentally break SemVer
  - Whereas auto-traits are explicit with `dyn Trait`
- RPIT methods in traits (stabilized in Rust 1.75) are not `dyn`-dispatchable
- Every RPIT is a distinct opaque type (note that TAIT works around this restriction)
RPITs also have some rather tricky behavior around type parameter and lifetime capture.
The planned `impl Trait` functionalities deserve their own exploration independent of `dyn Trait`, so I'll only mention them in brief:
- RPIT captures all type parameters (and their implied lifetimes)
- And also lifetime parameters, in traits and in edition 2024
- RPIT captures specific lifetimes and not the intersection of all lifetimes
- And thus it is tedious to capture an intersection of input lifetimes instead of a union
- (The situation will be improved by precise capturing)
Despite all these downsides, I would say that RPIT has a slight edge over `dyn Trait` in return position when applicable, especially for owned types. The advantage of a (named) TAIT over `dyn Trait` will be even greater, once that is available:
- You can give the return type a name and reuse it in multiple places
- TAIT inherently has precise capturing
  - i.e. you have control over, and are explicit about, which lifetime and type parameters are captured
But `dyn Trait` will still sometimes be the better option, e.g.:
- when you need to type erase and return distinct types
- when you need trait object safety
However, there is often a third possibility available, which we explore below: return a generic struct.
An alternative to both: nominal generic structs
Here we can take inspiration from the standard library. One of the more popular
situations to use RPIT or return dyn Trait
is when dealing with iterators
(as iterator chains have long types and often involve unnameable types such as
closures as well).
So let's look at the Iterator methods.
You may notice a pattern with the combinators:
fn chain<U>(self, other: U) -> Chain<Self, <U as IntoIterator>::IntoIter>
where
Self: Sized,
U: IntoIterator<Item = Self::Item>,
{ todo!() }
fn filter<P>(self, predicate: P) -> Filter<Self, P>
where
Self: Sized,
P: FnMut(&Self::Item) -> bool,
{ todo!() }
fn map<B, F>(self, f: F) -> Map<Self, F>
where
Self: Sized,
F: FnMut(Self::Item) -> B,
{ todo!() }
The pattern is to have a function which is parameterized by a generic type return a concrete (nominal) struct, also parameterized by the generic type. This is possible even if the parameter itself is unnameable -- for example, in the case of `map`, the `F: FnMut(Self::Item) -> B` parameter might well be an unnameable closure.
The downside is much more boilerplate if you opt to follow this pattern yourself: you have to define the struct, and (for examples like these) implement the `Iterator` trait for it, and perhaps other traits such as `DoubleEndedIterator` as desired. This will probably involve storing the original iterator and calling `next().map(|item| ...)` on it, or some such.
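Here's a minimal sketch of the pattern with my own names (std's real `Map` is considerably more featureful):

```rust
// A nominal adapter struct, parameterized like std's `Map`: it stores
// the original iterator and the (possibly unnameable) closure.
pub struct MyMap<I, F> {
    iter: I,
    f: F,
}

// The constructor returns a nameable type even when `F` is a closure.
pub fn my_map<I, B, F>(iter: I, f: F) -> MyMap<I, F>
where
    I: Iterator,
    F: FnMut(I::Item) -> B,
{
    MyMap { iter, f }
}

impl<I, B, F> Iterator for MyMap<I, F>
where
    I: Iterator,
    F: FnMut(I::Item) -> B,
{
    type Item = B;

    fn next(&mut self) -> Option<B> {
        // Delegate to the stored iterator and apply the stored closure.
        self.iter.next().map(&mut self.f)
    }
}
```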
The upside is that you (and the consumers of your method) get many of the upsides of both RPIT and `dyn Trait`:
- No dynamic dispatch penalty
- No boxing penalty
- No concrete-type specific optimization loss
- No single trait limitation
- No `dyn`-safe limitation
- Applicable in traits
- Ability to be specific about captures
- Ability to change your implementation within the API bounds
- Nameable return type
You do retain some of the downsides:
- Auto-traits are leaky and still a semver hazard, as with RPIT
- Multiple concrete types aren't possible (without also utilizing type erasure), as with RPIT
And incur some unique ones as well:
- Variance of data types is leaky too
- Unnameable types that aren't input type parameters can't be supported (without also utilizing type erasure)
On the whole, when using a nominal type is possible, it is the best option for consumers of the function. But it's also the most amount of work for the function implementor.
I recommend nominal types for general libraries (i.e. intended for wide consumption) when possible, following the lead of the standard library.
Generic structs
In the last section, we covered how generic structs can often be used as an
alternative to RPIT or returning dyn Trait
in some form. A related question
is, when should you use type erasure within your data types?
The main reason to use type erasure in your data types is when you want to treat implementors of a trait as if they were the same type, for instance when storing a collection of callbacks. In this case, the decision to use type erasure is a question of functionality, and not really much of a choice.
However, you may also want to use type erasure in your data types in order to make your own struct non-generic. When your data type is generic, after all, those who use your data type in such a way that the parameter takes on more than one type will have to propagate the use of generics themselves, or face the decision of type erasing your data type themselves.
This can not only be a question of ergonomics, but also of compile time and even run time performance. Compiling strictly more code by having all your methods monomorphized will naturally tend to result in longer compile times, and the increase in actual code size can sometimes be slower at runtime than a touch of dynamic dispatch in the right areas.
Unfortunately, there is no silver bullet when it comes to choosing between being generic and using type erasure. However, a general principle is that your optimization-sensitive, call-heavy code areas should not be type erased; instead, push type erasure to a boundary outside of your heavy computations.
For example, the failure to devirtualize and inline a call to `<dyn Iterator>::next` in a tight loop may have a relatively large impact, whereas a dynamic callback that only fires occasionally (and then dispatches to the optimized, non-type-erased implementation) is not likely to be noticeable at all.
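Here's a minimal sketch of that principle: the type-erased callback sits at the boundary and fires once per batch, while the per-element hot loop stays concrete and fully optimizable. All names here are illustrative assumptions:

```rust
// Hot loop: fully concrete, monomorphized, no dynamic dispatch
fn sum_of_squares(data: &[u64]) -> u64 {
    data.iter().map(|x| x * x).sum()
}

struct Pipeline {
    // Type-erased callback: one virtual call per batch,
    // not one per element
    on_batch: Box<dyn Fn(&[u64]) -> u64>,
}

fn main() {
    let pipeline = Pipeline { on_batch: Box::new(sum_of_squares) };
    let batch = [1, 2, 3, 4];
    // One dynamic dispatch here; the thousands of per-element
    // operations inside remain devirtualized and inlinable
    let total = (pipeline.on_batch)(&batch);
    assert_eq!(total, 30);
    println!("{total}");
}
```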
`enum`s
Finally we'll mention one other alternative to type erasure: just put all of the implementing types in an `enum`!

This clearly only applies when you have a fixed set of types that you expect to implement your trait. The downside of using an `enum` is that it can involve a lot of boilerplate, since you're frequently having to check which variant you are instead of relying on dynamic dispatch to perform that function for you.
The upside is avoiding practically all of the downsides of type erasure and the other alternatives such as opaque types.
Macros can help ease the pain of such boilerplate, and there are also crates in the ecosystem aimed at reducing the boilerplate.
In fact, there are also crates for this pattern as a whole.
In particular, if you've chosen to use `dyn Any` but find yourself with a bunch of attempted downcasts against a known set of types, you should strongly consider just using an `enum`. It won't be much less ergonomic (if at all) and will be more efficient.
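A minimal sketch of the `enum` alternative -- the `Shape` example is an assumption for illustration:

```rust
// A closed set of types as an `enum`, instead of `dyn Any`
// plus a series of attempted downcasts.
enum Shape {
    Circle { radius: f64 },
    Square { side: f64 },
}

impl Shape {
    fn area(&self) -> f64 {
        // The match replaces dynamic dispatch / downcasting
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Square { side } => side * side,
        }
    }
}

fn main() {
    let shapes = vec![
        Shape::Circle { radius: 1.0 },
        Shape::Square { side: 2.0 },
    ];
    let total: f64 = shapes.iter().map(Shape::area).sum();
    assert!((total - (std::f64::consts::PI + 4.0)).abs() < 1e-9);
    println!("{total}");
}
```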
`dyn Trait` examples
Here we provide some "recipes" for common `dyn Trait` implementation patterns.

In the examples, we'll typically be working with `dyn Trait`, `Box<dyn Trait>`, and so on for the sake of brevity. But note that in more practical code, there is a good chance you would also need to provide implementations for `Box<dyn Trait + Send + Sync>` or other variations across auto-traits. This may be in place of implementations for `dyn Trait` (if you always need the auto-trait bounds) or in addition to implementations for `dyn Trait` (to provide maximum flexibility).
Combining traits
Rust has no support for directly combining multiple non-auto traits into one `dyn Trait1 + Trait2`:
```rust
trait Foo { fn foo(&self) {} }
trait Bar { fn bar(&self) {} }

// Fails to compile
let _: Box<dyn Foo + Bar> = todo!();
```
However, the methods of a supertrait are available to the subtrait.
What's a supertrait? A supertrait is a trait bound on `Self` in the definition of the subtrait, like so:
```rust
trait Foo { fn foo(&self) {} }
trait Bar { fn bar(&self) {} }

trait Subtrait: Foo
//              ^^^ A supertrait bound
where
    Self: Bar,
//  ^^^^^^^^^ Another one
{}
```
The supertrait bound is implied everywhere the subtrait bound is present, and the methods of the supertrait are always available on implementors of the subtrait.
Using these relationships, you can support something analogous to `dyn Foo + Bar` by using `dyn Subtrait`.
```rust
trait Foo { fn foo(&self) {} }
trait Bar { fn bar(&self) {} }

impl Foo for () {}
impl Bar for () {}

trait Subtrait: Foo + Bar {}

// Blanket implement for everything that meets the bounds...
// ...including non-`Sized` types
impl<T: ?Sized> Subtrait for T where T: Foo + Bar {}

fn main() {
    let quz: &dyn Subtrait = &();
    quz.foo();
    quz.bar();
}
```
Note that despite the terminology, there is no sub/super type relationship between sub/super traits, between `dyn SubTrait` and `dyn SuperTrait`, between implementors of said traits, et cetera. Traits are not about sub/super typing.
Manual supertrait upcasting
Supertrait upcasting is planned, but not yet stable. Until stabilized, if you need to cast something like `dyn Subtrait` to `dyn Foo`, you have to supply the implementation yourself.
For a start, we could build it into our traits like so:
```rust
trait Foo {
    fn foo(&self) {}
    fn as_dyn_foo(&self) -> &dyn Foo;
}
```
But we can't supply a default function body, as `Self: Sized` is required to perform the type-erasing cast to `dyn Foo`. We don't want that restriction, or the method won't be available on `dyn Supertrait`, which is not `Sized`.
Instead we can separate out the method and supply an implementation for all `Sized` types, via another supertrait:
```rust
trait AsDynFoo {
    fn as_dyn_foo(&self) -> &dyn Foo;
}

trait Foo: AsDynFoo {
    fn foo(&self) {}
}
```
And then supply the implementation for all `Sized + Foo` types:
```rust
trait AsDynFoo {
    fn as_dyn_foo(&self) -> &dyn Foo;
}

trait Foo: AsDynFoo {
    fn foo(&self) {}
}

impl<T: /* Sized + */ Foo> AsDynFoo for T {
    fn as_dyn_foo(&self) -> &dyn Foo {
        self
    }
}
```
The compiler will supply the implementation for both `dyn AsDynFoo` and `dyn Foo`. When we put this all together with the `Subtrait` from above, we can now utilize an explicit version of supertrait upcasting:
```rust
trait Foo: AsDynFoo { fn foo(&self) {} }
trait Bar: AsDynBar { fn bar(&self) {} }

impl Foo for () {}
impl Bar for () {}

trait AsDynFoo { fn as_dyn_foo(&self) -> &dyn Foo; }
trait AsDynBar { fn as_dyn_bar(&self) -> &dyn Bar; }

impl<T: Foo> AsDynFoo for T {
    fn as_dyn_foo(&self) -> &dyn Foo { self }
}
impl<T: Bar> AsDynBar for T {
    fn as_dyn_bar(&self) -> &dyn Bar { self }
}

trait Subtrait: Foo + Bar {}
impl<T: ?Sized> Subtrait for T where T: Foo + Bar {}

fn main() {
    let quz: &dyn Subtrait = &();
    quz.foo();
    quz.bar();
    let _: &dyn Foo = quz.as_dyn_foo();
    let _: &dyn Bar = quz.as_dyn_bar();
}
```
impl Trait for Box<dyn Trait>
Let's look at how one implements `Trait` for `Box<dyn Trait + '_>`. One thing to note off the bat is that most methods are going to involve calling a method of the `dyn Trait` inside of our box, but if we just use `self.method()` we would instantly recurse with the very method we're writing (`<Box<dyn Trait>>::method`)!

We need to take care to call `<dyn Trait>::method` and not `<Box<dyn Trait>>::method` in those cases to avoid infinite recursion.

Now that we've highlighted that consideration, let's dive right in:
```rust
trait Trait {
    fn look(&self);
    fn boop(&mut self);
    fn bye(self) where Self: Sized;
}

impl Trait for Box<dyn Trait + '_> {
    fn look(&self) {
        // We do NOT want to do this!
        // self.look()
        // That would recursively call *this* function!

        // We need to call `<dyn Trait as Trait>::look`.
        // Any of the below forms work, it depends on
        // how explicit you want to be.

        // Very explicit
        // <dyn Trait as Trait>::look(&**self)

        // Yay auto-deref for function parameters
        // <dyn Trait>::look(self)

        // Very succinct and a "makes sense once you've
        // seen it enough times" form. The first deref
        // is for the reference (`&Self`) and the second
        // deref is for the `Box<_>`.
        (**self).look()
    }

    fn boop(&mut self) {
        // This is similar to the `&self` case
        (**self).boop()
    }

    fn bye(self) {
        // Uh... see below
    }
}
```
Oh yeah, that last one. Remember what we said before? `dyn Trait` doesn't have this method, but `Box<dyn Trait + '_>` does.

The compiler isn't going to just guess what to do here (and couldn't if, say, we needed a return value). We can't move the `dyn Trait` out of the `Box` because it's unsized. And we can't downcast from `dyn Trait` either; even if we could, it would rarely help here, as we'd have to both impose a `'static` constraint and also know every type that implements our trait to attempt downcasting on each one (or have some other clever scheme for more efficient downcasting).

Ugh, no wonder `Box<dyn Trait>` doesn't implement `Trait` automatically.
Assuming we want to call `Trait::bye` on the erased type, are we out of luck? No, there are ways to work around this:
```rust
// Supertrait bound
trait Trait: BoxedBye {
    fn bye(self);
}

trait BoxedBye {
    // Unlike `self: Self`, this does *not* imply `Self: Sized` and
    // thus *will* be available for `dyn BoxedBye + '_`... and for
    // `dyn Trait + '_` too, automatically.
    fn boxed_bye(self: Box<Self>);
}

// We implement it for all `Sized` implementors of `Trait` by
// unboxing and calling `Trait::bye`
impl<T: Trait> BoxedBye for T {
    fn boxed_bye(self: Box<Self>) {
        <Self as Trait>::bye(*self)
    }
}

impl Trait for Box<dyn Trait + '_> {
    fn bye(self) {
        // This time we pass `self`, not `*self`
        <dyn Trait as BoxedBye>::boxed_bye(self);
    }
}
```
By adding the supertrait bound, the compiler will supply an implementation of `BoxedBye` for `dyn Trait + '_`. That implementation will call the implementation of `BoxedBye` for `Box<Erased>`, where `Erased` is the erased base type. That is our blanket implementation, which unboxes `Erased` and calls `Erased`'s `Trait::bye`.
The signature of `<dyn Trait as BoxedBye>::boxed_bye` has a receiver with the type `Box<dyn Trait + '_>`, which is exactly the same signature as `<Box<dyn Trait + '_> as Trait>::bye`. And that's how we were able to complete the implementation of `Trait` for `Box<dyn Trait + '_>`.
Here's how things flow when calling `Trait::bye` on `Box<dyn Trait + '_>`:
```text
<Box<dyn Trait>>::bye      (_: Box<dyn Trait>) -- just passed -->
<  dyn Trait  >::boxed_bye (_: Box<dyn Trait>) -- via vtable  -->
<   Erased    >::boxed_bye (_: Box< Erased >)  -- via unbox   -->
<   Erased    >::bye       (_:      Erased  )  :)
```
There's rarely a reason to implement `BoxedBye` for `Box<dyn Trait + '_>`, since that takes a nested `Box<Box<dyn Trait + '_>>` receiver.
Any `Sized` implementor of `Trait` will get our blanket implementation of the `BoxedBye` supertrait "for free", so they don't have to do anything special.
The last thing I'll point out is how we did:

```rust
trait Trait {}

// Note the `'_`:
impl Trait for Box<dyn Trait + '_> {}
```

We didn't need to require `'static`, so this is more flexible. It's also very easy to forget.
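Here is the whole `BoxedBye` pattern as a runnable sketch. As an assumption for demonstration purposes only, `bye` returns a `String` here (and carries a `where Self: Sized` bound) so we can observe which implementation actually ran:

```rust
trait Trait: BoxedBye {
    // `String` return is an assumed variation for demonstration
    fn bye(self) -> String where Self: Sized;
}

trait BoxedBye {
    // `self: Box<Self>` does not imply `Self: Sized`, so this
    // method is available on `dyn Trait` via the supertrait
    fn boxed_bye(self: Box<Self>) -> String;
}

// Every `Sized` implementor of `Trait` gets this for free:
// unbox and forward to `Trait::bye`
impl<T: Trait> BoxedBye for T {
    fn boxed_bye(self: Box<Self>) -> String {
        <Self as Trait>::bye(*self)
    }
}

impl Trait for Box<dyn Trait + '_> {
    fn bye(self) -> String {
        // Pass `self` (the box) along, not `*self`
        <dyn Trait as BoxedBye>::boxed_bye(self)
    }
}

struct Gus;
impl Trait for Gus {
    fn bye(self) -> String { "bye from Gus".to_string() }
}

fn main() {
    let boxed: Box<dyn Trait> = Box::new(Gus);
    // `<Box<dyn Trait>>::bye` dispatches through the vtable to
    // `Gus::boxed_bye`, which unboxes and calls `Gus::bye`
    assert_eq!(boxed.bye(), "bye from Gus");
    // Direct calls on the unerased type still work too
    assert_eq!(Gus.bye(), "bye from Gus");
}
```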
Clonable Box<dyn Trait>
What can you do if you want a `Box<dyn Trait>` that you can clone? You can't have `Clone` as a supertrait, because `Clone` requires `Sized` and that would make `Trait` non-object-safe.
You might be tempted to do this:
```rust
trait Trait {
    fn dyn_clone(&self) -> Self
    where
        Self: Sized;
}
```
But then `dyn Trait` won't have the method available, and that will be a barrier to implementing `Trait` for `Box<dyn Trait>`.
But hey, you know what? Since this only really makes sense for base types that implement `Clone`, we don't need a method that returns `Self`. The base types already have that; it's called `clone`. What we ultimately want is to get a `Box<dyn Trait>` instead, like so:
```rust
trait Trait {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's;
}

// Example implementor
impl Trait for String {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's,
    {
        Box::new(self.clone())
    }
}
```
If we omit all the lifetime stuff, it only works with `Self: 'static`, due to the default trait object lifetime. And sometimes that's perfectly ok! But we'll stick with the more general version for this example.
The example implementation will make `dyn Trait` do the right thing (clone the underlying base type via its implementation). We can't have a default body, though, because the implementation requires `Clone` and `Sized`, which, again, we don't want as bounds.
But this is exactly the situation we had when we looked at manual supertrait upcasting and the `self` receiver helper in previous examples. The same pattern will work here: move the method to a helper supertrait and supply a blanket implementation for those cases where it makes sense.
```rust
trait DynClone {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's;
}

impl<T: Clone + Trait> DynClone for T {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's,
    {
        Box::new(self.clone())
    }
}

trait Trait: DynClone {}
```
Now we're ready for `Box<dyn Trait + '_>`:
```rust
trait Trait: DynClone {}

trait DynClone {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's;
}

impl<T: Clone + Trait> DynClone for T {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's,
    {
        Box::new(self.clone())
    }
}

impl Trait for Box<dyn Trait + '_> {}

impl Clone for Box<dyn Trait + '_> {
    fn clone(&self) -> Self {
        // Important! "recursive trait implementation" style
        (**self).dyn_clone()
    }
}
```
It's important that we called `<dyn Trait as DynClone>::dyn_clone`! Our blanket implementation of `DynClone` was bounded on `Clone + Trait`, but now we have implemented both of those for `Box<dyn Trait + '_>`. If we had just called `self.dyn_clone()`, the call graph would go like so:
```text
<Box<dyn Trait> as Clone   >::clone()
<Box<dyn Trait> as DynClone>::dyn_clone()
<Box<dyn Trait> as Clone   >::clone()
<Box<dyn Trait> as DynClone>::dyn_clone()
<Box<dyn Trait> as Clone   >::clone()
<Box<dyn Trait> as DynClone>::dyn_clone()
...
```
Yep, infinite recursion. Just like when implementing `Trait` for `Box<dyn Trait>`, we need to call the `dyn Trait` method directly to avoid this.
There is also a crate for this use case: the dyn-clone crate. A comparison with that crate is beyond the scope of this guide for now.
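As a self-contained, runnable sketch of the pattern -- the `value` method is an assumed addition so the clone can be checked:

```rust
trait Trait: DynClone {
    // Assumed method for demonstration, so we can inspect the clone
    fn value(&self) -> i32;
}

trait DynClone {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's;
}

// Blanket implementation for `Sized + Clone` implementors
impl<T: Clone + Trait> DynClone for T {
    fn dyn_clone<'s>(&self) -> Box<dyn Trait + 's>
    where
        Self: 's,
    {
        Box::new(self.clone())
    }
}

impl Trait for Box<dyn Trait + '_> {
    fn value(&self) -> i32 {
        (**self).value()
    }
}

impl Clone for Box<dyn Trait + '_> {
    fn clone(&self) -> Self {
        // Call the `dyn Trait` method directly to avoid
        // infinite recursion
        (**self).dyn_clone()
    }
}

#[derive(Clone)]
struct Gus(i32);
impl Trait for Gus {
    fn value(&self) -> i32 { self.0 }
}

fn main() {
    let a: Box<dyn Trait> = Box::new(Gus(42));
    let b = a.clone(); // clones the underlying `Gus`
    assert_eq!(b.value(), 42);
    println!("cloned: {}", b.value());
}
```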
Downcasting `Self` parameters
Now let's move on to something a little more complicated. We mentioned before that `Self` is not accepted outside of the receiver, such as when it's another parameter, as there is no guarantee that the other parameter has the same base type as the receiver (and if they are not the same base type, there is no actual implementation to call).

Let's see how we can work around this to implement `PartialOrd` for `dyn Trait`, despite the `&Self` parameter. The trait is a good fit in the face of type erasure, as we can just return `None` when the types don't match, indicating that comparison is not possible.

`PartialOrd` requires `PartialEq`, so we'll tackle that as well.
Downcasting with `dyn Any` to emulate dynamic typing
We haven't had to use `dyn Any` in the previous examples, because we've been able to maneuver our implementations in such a way that dynamic dispatch implicitly "downcasted" our erased types to their concrete base types for us. It's able to do this because the pointer to the base type is coupled with a vtable that only accepts said base type, and there is no need for actual dynamic typing or comparing types at runtime. The conversion is infallible for those cases.
However, now we have two wide pointers which may point to different base types. In this particular application, we only really need to know if they have the same base type or not... though it would be nice to have some safe way to recover the erased type of the non-receiver parameter too, instead of whatever casting shenanigans might otherwise be necessary.
You might think you could somehow use the vtable pointers to see if the base types are the same. But unfortunately, we can't rely on the vtable to compare types at runtime. To quote the standard library documentation:

> When comparing wide pointers, both the address and the metadata are tested for equality. However, note that comparing trait object pointers (`*const dyn Trait`) is unreliable: pointers to values of the same underlying type can compare unequal (because vtables are duplicated in multiple codegen units), and pointers to values of different underlying type can compare equal (since identical vtables can be deduplicated within a codegen unit).
That's right, false negatives and false positives. Fun!
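By contrast, comparing `TypeId`s is reliable, and that is exactly the mechanism `Any` is built on. A minimal sketch:

```rust
use std::any::Any;

fn same_type(a: &dyn Any, b: &dyn Any) -> bool {
    // `type_id` on `dyn Any` dispatches through the vtable and
    // returns the `TypeId` of the erased base type, which is a
    // reliable comparison (unlike comparing vtable pointers)
    a.type_id() == b.type_id()
}

fn main() {
    let x = 1_i32;
    let y = 2_i32;
    let z = true;
    assert!(same_type(&x, &y));
    assert!(!same_type(&x, &z));
    println!("type checks ok");
}
```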
So we need a different mechanism to compare types and know when we have two wide pointers to the same base type, and that's where `dyn Any` comes in. `Any` is the trait to emulate dynamic typing, and many fallible downcasting methods are supplied for the type-erased forms of `dyn Any`, `Box<dyn Any + Send>`, et cetera. This will allow us to not just compare for base type equality, but also to safely recover the erased base type ("downcast").
The `Any` trait comes with a `'static` constraint for soundness reasons, so note that our base types are going to be more limited for this example.
Additionally, the lack of supertrait upcasting is going to make things less ergonomic than they will be once that feature is available.
One last side note: we look at `dyn Any` in a bit more detail later. Well, enough meta -- let's dive in!
PartialEq
The general idea is that we're going to have a comparison trait, `DynCompare`, and then implement `PartialEq` for `dyn DynCompare` in a universal manner. Then our actual trait (`Trait`) can have `DynCompare` as a supertrait, and implement `PartialEq` for `dyn Trait` by upcasting to `dyn DynCompare`.
In the implementation for `dyn DynCompare`, we're going to have to (attempt to) downcast to the erased base type. For that to be available we will need to first be able to upcast from `dyn DynCompare` to `dyn Any`.
As the first step, we're going to use the "supertrait we can blanket implement" pattern yet again to make a trait that can handle all of our supertrait upcasting needs.
Here it is, similar to how we've done it before:
```rust
use std::any::Any;

trait AsDynCompare: Any {
    fn as_any(&self) -> &dyn Any;
    fn as_dyn_compare(&self) -> &dyn DynCompare;
}

// Sized types only
impl<T: Any + DynCompare> AsDynCompare for T {
    fn as_any(&self) -> &dyn Any { self }
    fn as_dyn_compare(&self) -> &dyn DynCompare { self }
}

trait DynCompare: AsDynCompare {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool;
}
```
There's an `Any: 'static` bound which applies to `dyn Any + '_`, so all of those `&dyn Any` are actually `&(dyn Any + 'static)`.
I have also included an `Any` supertrait on `AsDynCompare`, so the "always `'static`" property holds for `&dyn DynCompare` as well, even though it isn't strictly necessary. This way, we don't have to worry about being flexible with the trait object lifetime at all -- it is just always `'static`.
The downside is that only base types that satisfy the `'static` bound can be supported, so there may be niche circumstances where you don't want to include the supertrait bound. However, given that we need to upcast to `dyn Any`, this must mean you're pretending to be another type, which seems quite niche indeed. If you do try the non-`'static` route for your own use case, note that some of the implementations in this example could be made more general.
Anyway, let's move on to performing cross-type equality checking:
```rust
use std::any::Any;

trait AsDynCompare: Any {
    fn as_any(&self) -> &dyn Any;
    fn as_dyn_compare(&self) -> &dyn DynCompare;
}

impl<T: Any + DynCompare> AsDynCompare for T {
    fn as_any(&self) -> &dyn Any { self }
    fn as_dyn_compare(&self) -> &dyn DynCompare { self }
}

trait DynCompare: AsDynCompare {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool;
}

impl<T: Any + PartialEq> DynCompare for T {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool {
        if let Some(other) = other.as_any().downcast_ref::<Self>() {
            self == other
        } else {
            false
        }
    }
}

// n.b. this could be implemented in a more general way when
// the trait object lifetime is not constrained to `'static`
impl PartialEq<dyn DynCompare> for dyn DynCompare {
    fn eq(&self, other: &dyn DynCompare) -> bool {
        self.dyn_eq(other)
    }
}
```
Here we've utilized our `dyn Any` upcasting to try and recover a parameter of our own base type, and if successful, do the actual (partial) comparison. Otherwise we say they're not equal. This allows us to implement `PartialEq` for `dyn DynCompare`.
Then we want to wire this functionality up to our actual trait:
```rust
// (`AsDynCompare`, `DynCompare`, and their implementations as above, plus:)

trait Trait: DynCompare {}

impl Trait for i32 {}
impl Trait for bool {}

impl PartialEq<dyn Trait> for dyn Trait {
    fn eq(&self, other: &dyn Trait) -> bool {
        self.as_dyn_compare() == other.as_dyn_compare()
    }
}
```
The supertrait bound does most of the work, and we just use upcasting again -- to `dyn DynCompare` this time -- to be able to perform `PartialEq` on our `dyn Trait`.
A blanket implementation in `std` gives us `PartialEq` for `Box<dyn Trait>` automatically.
Now let's try it out:
```rust
// (all of the code from above, plus:)

fn main() {
    let bx1a: Box<dyn Trait> = Box::new(1);
    let bx1b: Box<dyn Trait> = Box::new(1);
    let bx2: Box<dyn Trait> = Box::new(2);
    let bx3: Box<dyn Trait> = Box::new(true);

    println!("{}", bx1a == bx1a);
    println!("{}", bx1a == bx1b);
    println!("{}", bx1a == bx2);
    println!("{}", bx1a == bx3);
}
```
Uh... it didn't work, but for weird reasons. Why is it trying to move out of the `Box` for a comparison? As it turns out, this is a longstanding bug in the language. Fortunately that issue also offers a workaround that's ergonomic at the use site: implement `PartialEq<&Self>` too.
```rust
// (all of the code from above, plus:)

// New
impl PartialEq<&Self> for Box<dyn Trait> {
    fn eq(&self, other: &&Self) -> bool {
        <Self as PartialEq>::eq(self, *other)
    }
}

fn main() {
    let bx1a: Box<dyn Trait> = Box::new(1);
    let bx1b: Box<dyn Trait> = Box::new(1);
    let bx2: Box<dyn Trait> = Box::new(2);
    let bx3: Box<dyn Trait> = Box::new(true);

    println!("{}", bx1a == bx1a);
    println!("{}", bx1a == bx1b);
    println!("{}", bx1a == bx2);
    println!("{}", bx1a == bx3);
}
```
Ok, now it works. Phew!
PartialOrd
From here it's mostly mechanical to add `PartialOrd` support:
```diff
+use core::cmp::Ordering;
+
 trait DynCompare: AsDynCompare {
     fn dyn_eq(&self, other: &dyn DynCompare) -> bool;
+    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering>;
 }

-impl<T: Any + PartialEq> DynCompare for T {
+impl<T: Any + PartialOrd> DynCompare for T {
     fn dyn_eq(&self, other: &dyn DynCompare) -> bool {
         if let Some(other) = other.as_any().downcast_ref::<Self>() {
             self == other
         } else {
             false
         }
     }
+
+    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering> {
+        other
+            .as_any()
+            .downcast_ref::<Self>()
+            .and_then(|other| self.partial_cmp(other))
+    }
 }

+impl PartialOrd<dyn DynCompare> for dyn DynCompare {
+    fn partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering> {
+        self.dyn_partial_cmp(other)
+    }
+}
+
+impl PartialOrd<dyn Trait> for dyn Trait {
+    fn partial_cmp(&self, other: &dyn Trait) -> Option<Ordering> {
+        self.as_dyn_compare().partial_cmp(other.as_dyn_compare())
+    }
+}
+
+impl PartialOrd<&Self> for Box<dyn Trait> {
+    fn partial_cmp(&self, other: &&Self) -> Option<Ordering> {
+        <Self as PartialOrd>::partial_cmp(self, *other)
+    }
+}
```
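Putting the pieces together, here's a condensed, runnable sketch of the scheme. It compares through `&dyn Trait` references directly, which sidesteps the `Box` workaround from above:

```rust
use core::cmp::Ordering;
use std::any::Any;

trait AsDynCompare: Any {
    fn as_any(&self) -> &dyn Any;
    fn as_dyn_compare(&self) -> &dyn DynCompare;
}

// Blanket implementation for all `Sized` implementors
impl<T: Any + DynCompare> AsDynCompare for T {
    fn as_any(&self) -> &dyn Any { self }
    fn as_dyn_compare(&self) -> &dyn DynCompare { self }
}

trait DynCompare: AsDynCompare {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool;
    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering>;
}

impl<T: Any + PartialOrd> DynCompare for T {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool {
        // Attempt to downcast the other parameter to our own base type
        other.as_any().downcast_ref::<Self>().map_or(false, |o| self == o)
    }
    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering> {
        // `None` when the base types don't match
        other.as_any().downcast_ref::<Self>().and_then(|o| self.partial_cmp(o))
    }
}

trait Trait: DynCompare {}
impl Trait for i32 {}
impl Trait for bool {}

impl PartialEq<dyn Trait> for dyn Trait {
    fn eq(&self, other: &dyn Trait) -> bool {
        self.as_dyn_compare().dyn_eq(other.as_dyn_compare())
    }
}
impl PartialOrd<dyn Trait> for dyn Trait {
    fn partial_cmp(&self, other: &dyn Trait) -> Option<Ordering> {
        self.as_dyn_compare().dyn_partial_cmp(other.as_dyn_compare())
    }
}

fn main() {
    let one: &dyn Trait = &1;
    let two: &dyn Trait = &2;
    let yes: &dyn Trait = &true;
    assert!(one < two);
    assert_eq!(one.partial_cmp(two), Some(Ordering::Less));
    assert_eq!(one.partial_cmp(yes), None); // mismatched base types
    println!("comparisons ok");
}
```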
Generalizing borrows
A lot of core traits are built around some sort of field projection, where the implementing type contains some other type `T` and you can convert a `&self` to a `&T` or a `&mut self` to a `&mut T`.
```rust
pub trait Deref {
    type Target: ?Sized;
    fn deref(&self) -> &Self::Target;
}

pub trait Index<Idx: ?Sized> {
    type Output: ?Sized;
    fn index(&self, index: Idx) -> &Self::Output;
}

pub trait AsRef<T: ?Sized> {
    fn as_ref(&self) -> &T;
}

pub trait Borrow<Borrowed: ?Sized> {
    fn borrow(&self) -> &Borrowed;
}

// `DerefMut`, `IndexMut`, `AsMut`, ...
```
There's generally no way to implement these traits if the type you want to return is not contained within `Self` (except for returning a reference to some static value or similar, which is rarely what you want).
However, sometimes you have a custom borrowing type which is not actually contained within your owning type:
```rust
// We wish we could implement `Borrow<DataRef<'?>>`, but we can't
pub struct Data {
    first: usize,
    others: Vec<usize>,
}

pub struct DataRef<'a> {
    first: usize,
    others: &'a [usize],
}

pub struct DataMut<'a> {
    first: usize,
    others: &'a mut Vec<usize>,
}
```
This can be problematic when interacting with libraries and data structures such as `std::collections::HashSet`, which rely on the `Borrow` trait to be able to look up entries without taking ownership.
One way around this problem is to use a different library or type which is more flexible. However, it's also possible to tackle the problem with a bit of indirection and type erasure.
Your types contain a borrower
Here we present a solution to the problem by Eric Michael Sumner, who has graciously blessed its inclusion in this guide. I've rewritten the original for the sake of presentation, and any errors are my own.
The main idea behind the approach is to utilize the following trait, which encapsulates the ability to borrow `self` in the form of your custom borrowed type:
```rust
pub trait Lend {
    fn lend(&self) -> DataRef<'_>;
}

impl Lend for Data {
    fn lend(&self) -> DataRef<'_> {
        DataRef {
            first: self.first,
            others: &self.others,
        }
    }
}

impl Lend for DataRef<'_> {
    fn lend(&self) -> DataRef<'_> {
        DataRef {
            first: self.first,
            others: self.others,
        }
    }
}

// impl Lend for DataMut<'_> ...
```
And the key insight is that any implementor can also coerce from `&self` to `&dyn Lend`. We can therefore implement traits like `Borrow`, because every implementor "contains" a `dyn Lend` -- themselves!
```rust
use std::borrow::Borrow;

impl<'a> Borrow<dyn Lend + 'a> for Data {
    fn borrow(&self) -> &(dyn Lend + 'a) {
        self
    }
}

impl<'a, 'b: 'a> Borrow<dyn Lend + 'a> for DataRef<'b> {
    fn borrow(&self) -> &(dyn Lend + 'a) {
        self
    }
}

// impl<'a, 'b: 'a> Borrow<dyn Lend + 'a> for DataMut<'b> ...
```
This gives us a common `Borrow` type for both our owning and custom borrowing data structures. To look up borrowed entries in a `HashSet`, for example, we can cast a `&DataRef<'_>` to a `&dyn Lend` and pass that to `set.contains`; the `HashSet` can hash the `dyn Lend` and then borrow the owned `Data` entries as `dyn Lend` as well, in order to do the necessary lookup comparisons.
That means we need to implement the requisite functionality such as `PartialEq` and `Hash` for `dyn Lend`. But this is a different use case than our general solution in the previous section.
In that case we wanted `PartialEq` for our already-type-erased `dyn Trait`, so we could compare values across any arbitrary implementing types. Here we don't care about arbitrary types, and we also have the ability to produce a concrete type that references our actual data. We can use that to implement the functionality; there's no need for downcasting or any of that in order to implement the requisite traits for `dyn Lend`.
We don't really care that `dyn Lend` will implement `PartialEq` and `Hash` per se, as that is just a means to an end: giving `HashSet` and friends a way to compare our custom concrete borrowing types despite the `Borrow` trait bound.
First things first, though: we need our concrete types to implement the requisite traits themselves. The main thing to be mindful of is that we maintain the invariants expected by `Borrow`. For this example, we're lucky enough that our borrowing type can just derive all of the requisite functionality:
```rust
#[derive(Debug, Copy, Clone, Eq, PartialEq, Hash)]
pub struct DataRef<'a> {
    first: usize,
    others: &'a [usize],
}

#[derive(Debug, Clone)]
pub struct Data {
    first: usize,
    others: Vec<usize>,
}

#[derive(Debug)]
pub struct DataMut<'a> {
    first: usize,
    others: &'a mut Vec<usize>,
}
```
However, we haven't derived the traits that are semantically important to `Borrow` for our other types. We technically could have in this case, because

- our fields are in the same order as they are in the borrowed type
- every field is present
- every field has a `Borrow` relationship when comparing with the borrowed type's field
- we understand how the `derive` works
But all those things might not be true for your use case, and even when they are, relying on them creates a very fragile arrangement. It's just too easy to accidentally break things by adding a field or even just rearranging the field order.
Instead, we implement the traits directly by deferring to the borrowed type:
```rust
// Exercise for the reader: `PartialEq` across all of our
// owned and borrowed types :-)
impl std::cmp::PartialEq for Data {
    fn eq(&self, other: &Self) -> bool {
        self.lend() == other.lend()
    }
}

impl std::cmp::Eq for Data {}

impl std::hash::Hash for Data {
    fn hash<H: std::hash::Hasher>(&self, hasher: &mut H) {
        self.lend().hash(hasher)
    }
}

// Similarly for `DataMut<'_>`
```
And in fact, this is exactly the approach we want to take for `dyn Lend` as well:
```rust
impl std::cmp::PartialEq<dyn Lend + '_> for dyn Lend + '_ {
    fn eq(&self, other: &(dyn Lend + '_)) -> bool {
        self.lend() == other.lend()
    }
}

impl std::cmp::Eq for dyn Lend + '_ {}

impl std::hash::Hash for dyn Lend + '_ {
    fn hash<H: std::hash::Hasher>(&self, hasher: &mut H) {
        self.lend().hash(hasher)
    }
}
```
Whew, that was a lot of boilerplate. But we're finally at a place where we can store `Data` in a `HashSet` and look up entries when we only have a `DataRef`:
```rust
use std::collections::HashSet;

let set = [
    Data { first: 3, others: vec![5, 7] },
].into_iter().collect::<HashSet<_>>();

assert!(set.contains::<dyn Lend>(&DataRef { first: 3, others: &[5, 7] }));

// Alternative to turbofishing
let data_ref = DataRef { first: 3, others: &[5, 7] };
assert!(set.contains(&data_ref as &dyn Lend));
```
Here's a playground with the complete example.
Another alternative to casting or turbofishing is to add an `as_lend(&self) -> &(dyn Lend + '_)` method, similar to many of the previous examples.
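Such a method might look like the following sketch (simplified so it's self-contained: here `lend` just returns a number, and all names beyond `Lend`/`lend` are illustrative):

```rust
pub trait Lend {
    fn lend(&self) -> i32;

    // Defaulted for `Sized` implementors; the method body performs the
    // unsizing coercion so callers need neither a cast nor a turbofish.
    fn as_lend(&self) -> &(dyn Lend + '_)
    where
        Self: Sized,
    {
        self
    }
}

struct Data(i32);

impl Lend for Data {
    fn lend(&self) -> i32 {
        self.0
    }
}

fn main() {
    let data = Data(3);
    let erased: &dyn Lend = data.as_lend();
    assert_eq!(erased.lend(), 3);
}
```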
Erased traits
Let's say you have an existing trait which works well for the most part:
```rust
pub mod useful {
    pub trait Iterable {
        type Item;
        type Iter<'a>: Iterator<Item = &'a Self::Item>
        where
            Self: 'a;

        fn iter(&self) -> Self::Iter<'_>;

        fn visit<F: FnMut(&Self::Item)>(&self, mut f: F) {
            for item in self.iter() {
                f(item);
            }
        }
    }

    impl<I: Iterable + ?Sized> Iterable for &I {
        type Item = <I as Iterable>::Item;
        type Iter<'a> = <I as Iterable>::Iter<'a> where Self: 'a;
        fn iter(&self) -> Self::Iter<'_> {
            <I as Iterable>::iter(*self)
        }
    }

    impl<I: Iterable + ?Sized> Iterable for Box<I> {
        type Item = <I as Iterable>::Item;
        type Iter<'a> = <I as Iterable>::Iter<'a> where Self: 'a;
        fn iter(&self) -> Self::Iter<'_> {
            <I as Iterable>::iter(&**self)
        }
    }

    impl<T> Iterable for Vec<T> {
        type Item = T;
        type Iter<'a> = std::slice::Iter<'a, T> where Self: 'a;
        fn iter(&self) -> Self::Iter<'_> {
            <[T]>::iter(self)
        }
    }
}
```
However, it's not `dyn` safe and you wish it was. Even if we get support for GATs in `dyn Trait` some day, there are no plans to support functions with generic type parameters like `Iterable::visit`. Besides, you want the functionality now, not "some day".
Perhaps you also have a lot of code utilizing this useful trait, and you don't want to redo everything. Maybe it's not even your own trait.
This may be a case where you want to provide an "erased" version of the trait to make it `dyn` safe. The general idea is to use `dyn` (type erasure) to replace all the non-`dyn`-safe uses such as GATs and type-parameterized methods.
```rust
pub mod erased {
    // This trait is `dyn` safe
    pub trait Iterable {
        type Item;
        // No more GAT
        fn iter(&self) -> Box<dyn Iterator<Item = &Self::Item> + '_>;
        // No more type parameter
        fn visit(&self, f: &mut dyn FnMut(&Self::Item));
    }
}
```
We want to be able to create a `dyn erased::Iterable` from anything that is `useful::Iterable`, so we need a blanket implementation to connect the two:
```rust
pub mod erased {
    use crate::useful;

    pub trait Iterable {
        type Item;
        fn iter(&self) -> Box<dyn Iterator<Item = &Self::Item> + '_>;
        fn visit(&self, f: &mut dyn FnMut(&Self::Item)) {
            for item in self.iter() {
                f(item);
            }
        }
    }

    impl<I: useful::Iterable + ?Sized> Iterable for I {
        type Item = <I as useful::Iterable>::Item;

        fn iter(&self) -> Box<dyn Iterator<Item = &Self::Item> + '_> {
            Box::new(useful::Iterable::iter(self))
        }

        // By not using a default function body, we can avoid
        // boxing up the iterator
        fn visit(&self, f: &mut dyn FnMut(&Self::Item)) {
            for item in <Self as useful::Iterable>::iter(self) {
                f(item)
            }
        }
    }
}
```
We're also going to want to pass our `erased::Iterable`s to functions that have a `useful::Iterable` trait bound. However, we can't add that as a supertrait, because that would remove the `dyn` safety. The purpose of our `erased::Iterable` is to be able to type-erase to `dyn erased::Iterable` anyway though, so instead we just implement `useful::Iterable` directly on `dyn erased::Iterable`:
```rust
pub mod erased {
    use crate::useful;

    // ...the `erased::Iterable` trait and blanket implementation from above...

    impl<Item> useful::Iterable for dyn Iterable<Item = Item> + '_ {
        type Item = Item;
        type Iter<'a> = Box<dyn Iterator<Item = &'a Item> + 'a> where Self: 'a;

        fn iter(&self) -> Self::Iter<'_> {
            Iterable::iter(self)
        }

        // Here we can choose to override the default function body to avoid
        // boxing up the iterator, or we can use the default function body
        // to avoid dynamic dispatch of `F`. I've opted for the former.
        fn visit<F: FnMut(&Self::Item)>(&self, mut f: F) {
            <Self as Iterable>::visit(self, &mut f)
        }
    }
}
```
Technically our blanket implementation of `erased::Iterable` now applies to `dyn erased::Iterable` too, but things still work out due to some language magic.

The blanket implementations of `useful::Iterable` in the `useful` module give us implementations for `&dyn erased::Iterable` and `Box<dyn erased::Iterable>`, so now we're good to go!
Mindful implementations and their limitations
You may have noticed how we took care to avoid boxing the iterator when possible by being mindful of how we implemented some of the methods, for example not having a default body for `erased::Iterable::visit`, and then overriding the default body of `useful::Iterable::visit`. This can lead to better performance but isn't necessarily critical, so long as you avoid things like accidental infinite recursion.
How well did we do on this front? Let's take a look in the playground.
Hmm, perhaps not as well as we hoped! `<dyn erased::Iterable as useful::Iterable>::visit` avoids the boxing as designed, but `Box<dyn erased::Iterable>`'s `visit` still boxes the iterator.
Why is that? It is because the implementation for the `Box` is supplied by the `useful` module, and that implementation uses the default body. In order to avoid the boxing, it would need to recurse to the underlying implementation instead. That way, the call to `visit` would "drill down" until it reached the implementation for `dyn erased::Iterable`, which takes care to avoid the boxed iterator. Or phrased another way, a recursive implementation "respects" any overrides of the default function body by other implementors of `useful::Iterable`.
Since the original trait might not even be in your crate, this might be out of your control. Oh well, so it goes; maybe submit a PR 🙂. In this particular case you could take care to pass `&dyn erased::Iterable` by coding `&*boxed_erased_iterable`. Or maybe it doesn't really matter enough to bother in practice for your use case.
Real world examples
Perhaps the most popular crate to use this pattern is the `erased-serde` crate.

Another use case is working with `async`-esque traits, which tends to involve a lot of type erasure and unnameable types.
Hashable `Box<dyn Trait>`

Let's say we have a `dyn Trait` that implements `Eq`, but we also want it to implement `Hash` so that we can use `Box<dyn Trait>` in a `HashSet` or as the key of a `HashMap`, and so on.
Here's our starting point. The only update from before is to require `Eq`:
```diff
-impl<T: Any + PartialOrd> DynCompare for T {
+impl<T: Any + PartialOrd + Eq> DynCompare for T {

+impl Eq for dyn DynCompare {}

+impl Eq for dyn Trait {}
```
The complete code:
```rust
use core::cmp::Ordering;
use std::any::Any;

trait AsDynCompare: Any {
    fn as_any(&self) -> &dyn Any;
    fn as_dyn_compare(&self) -> &dyn DynCompare;
}

// Sized types only
impl<T: Any + DynCompare> AsDynCompare for T {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn as_dyn_compare(&self) -> &dyn DynCompare {
        self
    }
}

trait DynCompare: AsDynCompare {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool;
    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering>;
}

impl<T: Any + PartialOrd + Eq> DynCompare for T {
    fn dyn_eq(&self, other: &dyn DynCompare) -> bool {
        if let Some(other) = other.as_any().downcast_ref::<Self>() {
            self == other
        } else {
            false
        }
    }

    fn dyn_partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering> {
        other
            .as_any()
            .downcast_ref::<Self>()
            .and_then(|other| self.partial_cmp(other))
    }
}

impl Eq for dyn DynCompare {}

impl PartialEq<dyn DynCompare> for dyn DynCompare {
    fn eq(&self, other: &dyn DynCompare) -> bool {
        self.dyn_eq(other)
    }
}

impl PartialOrd<dyn DynCompare> for dyn DynCompare {
    fn partial_cmp(&self, other: &dyn DynCompare) -> Option<Ordering> {
        self.dyn_partial_cmp(other)
    }
}

trait Trait: DynCompare {}
impl Trait for i32 {}
impl Trait for bool {}

impl Eq for dyn Trait {}

impl PartialEq<dyn Trait> for dyn Trait {
    fn eq(&self, other: &dyn Trait) -> bool {
        self.as_dyn_compare() == other.as_dyn_compare()
    }
}

impl PartialOrd<dyn Trait> for dyn Trait {
    fn partial_cmp(&self, other: &dyn Trait) -> Option<Ordering> {
        self.as_dyn_compare().partial_cmp(other.as_dyn_compare())
    }
}

impl PartialEq<&Self> for Box<dyn Trait> {
    fn eq(&self, other: &&Self) -> bool {
        <Self as PartialEq>::eq(self, *other)
    }
}

impl PartialOrd<&Self> for Box<dyn Trait> {
    fn partial_cmp(&self, other: &&Self) -> Option<Ordering> {
        <Self as PartialOrd>::partial_cmp(self, *other)
    }
}
```
Similarly to when we looked at `Clone`, `Hash` is not an object-safe trait, so we can't just add `Hash` as a supertrait bound. This time it's not because it requires `Sized`, though; it's because it has a generic method:
```rust
use core::hash::Hasher;

pub trait Hash {
    fn hash<H>(&self, state: &mut H)
    where
        H: Hasher;
}
```
Fortunately, we can use an erased trait approach:
```rust
use core::hash::{Hash, Hasher};

trait DynHash {
    fn dyn_hash(&self, state: &mut dyn Hasher);
}

// impl<T: ?Sized + Hash> DynHash for T {
impl<T: Hash> DynHash for T {
    fn dyn_hash(&self, mut state: &mut dyn Hasher) {
        self.hash(&mut state)
    }
}

impl Hash for dyn DynHash + '_ {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.dyn_hash(state)
    }
}
```
Now is a good time to point out a couple of things we're relying on:

- `Hasher` is object safe.

  If this wasn't the case, we couldn't take a `&mut dyn Hasher` in our `dyn_hash` method.

- The generic `H` in `Hash::hash<H: Hasher>` has an implicit `Sized` bound.

  If this wasn't the case, we couldn't coerce the `&mut H` to a `&mut dyn Hasher` in our implementation of `Hash` for `dyn DynHash`.
This demonstrates that relaxing a `Sized` bound can be a breaking change!
Moving on, we still need to wire up this new functionality to our own trait.
```diff
-trait Trait: DynCompare {}
+trait Trait: DynCompare + DynHash {}
```
```rust
// Same as what we did for `dyn DynHash`
impl Hash for dyn Trait {
    fn hash<H: Hasher>(&self, state: &mut H) {
        self.dyn_hash(state)
    }
}
```
```rust
let bx1a: Box<dyn Trait> = Box::new(1);
let bx1b: Box<dyn Trait> = Box::new(1);
let bx2: Box<dyn Trait> = Box::new(2);
let bx3: Box<dyn Trait> = Box::new(true);

let hm: HashSet<_> = HashSet::from_iter([bx1a, bx1b, bx2, bx3].into_iter());
assert_eq!(hm.len(), 3);
```
Closing remarks
Borrowing a concrete type is probably a better approach if it applies to your use case, since it doesn't require `Any + 'static`.
Although terribly inefficient, an implementation of `Hash` that returns the same hash for everything is a correct implementation. So are other "rough approximations", like only hashing the `TypeId`. All that's required for logical behavior is that two equal values must also have equal hashes. So arguably we didn't need to go to such lengths to get the exact hashes of the values.
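For illustration, here's a sketch of such a "rough approximation" (this helper is made up for the example, not part of the original code). Equal values necessarily have the same type, so equal values get equal hashes; the `Eq`/`Hash` contract holds, at the cost of every value of a given type colliding:

```rust
use std::any::Any;
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash only the `TypeId` of the erased value.
fn rough_hash(value: &dyn Any) -> u64 {
    let mut state = DefaultHasher::new();
    value.type_id().hash(&mut state);
    state.finish()
}

fn main() {
    // Collisions by design: same type, same hash.
    assert_eq!(rough_hash(&1_i32), rough_hash(&2_i32));
}
```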
You may have noticed this commented out line:
```rust
// impl<T: ?Sized + Hash> DynHash for T {
impl<T: Hash> DynHash for T {
```
The reason I went with the less general implementation is two-fold:

- It wasn't needed for the example
- Prudence: because we `impl Hash for dyn DynHash`, it technically overlaps with the compiler's implementation of `DynHash for dyn DynHash`. See the final paragraph of this subsection.
`dyn Any` examples

There are some additional examples in the `dyn Any` section, since they make more sense there.
More about `dyn Any`
We've taken a lot of care to emphasize that `dyn Trait` isn't a supertype nor a form of dynamic typing, but in one of the examples we saw that `dyn Any` can "downcast" back to the erased base type. Or to quote the official documentation, `Any` is

> A trait to emulate dynamic typing.

Here we take a closer look at `dyn Any` specifically.
The general idea
The `Any` trait is implemented for all types which satisfy a `'static` bound. It supplies a method `type_id`, which returns an opaque but unique identifier of the implementing type. We also have `TypeId::of::<T>`, which lets us look up the `TypeId` of any `'static` type.
Together, this allows fallible downcasting by doing things along these lines:
```rust
pub fn downcast_ref<T: Any>(&self) -> Option<&T> {
    if self.is::<T>() {
        // SAFETY: just checked whether we are pointing to the correct type, and we can rely on
        // that check for memory safety because we have implemented Any for all types; no other
        // impls can exist as they would conflict with our impl.
        unsafe { Some(self.downcast_ref_unchecked()) }
    } else {
        None
    }
}
```
And `is` simply compares `TypeId::of::<T>()` to `self.type_id()`.
Details of specific downcasts aside, that's pretty much it! All that was needed is the global identifier (`TypeId`). The standard library provides the various downcasting methods in order to encapsulate the required `unsafe`ty.
As a reminder from before, the vtable pointers themselves are not suitable to use as a global identifier of erased types. The same trait can have multiple vtables due to codegen units and linker implementations, and different traits can have the same vtable due to deduplication optimizations.
Where exactly the language goes with respect to comparing vtable pointers is an open question. It's not unimaginable that all vtables will gain some lifetime-erased version of `TypeId`, but related to some discussion below, this may not be as straightforward as it sounds.
Downcasting methods are not trait methods
Note that the only method available in the `Any` trait is `type_id`. All of the downcasting methods are implemented on the erased `dyn Any` directly, or on `Box<dyn Any>`, or on `dyn Any + Send`, etc. Downcasting doesn't generally make sense for a non-erased base type -- you already know what it is!
Another good reason for this is that types like `Box<dyn Any>` implement `Any`, making it easy to accidentally call the `Box<dyn Any>` implementation instead of the `dyn Any` implementation in the case of `type_id`. It would be much more fraught if `downcast_ref` worked like this, for example.
However, this does mean that having `Any` as a supertrait does not allow downcasting for your own `dyn Trait`s. Instead you have to first upcast to `dyn Any`, and then downcast. Once we have built-in supertrait upcasting, the process will involve much less boilerplate when an `Any` supertrait bound is acceptable.
But note that a supertrait `Any` bound is not the only solution for custom downcasting! We explore another approach below.
Some brief examples
In our other example, we used manual supertrait casting to turn a `dyn DynCompare` into a `dyn Any`. This was a case where we really just wished we could attempt to downcast `dyn DynCompare` itself.
Here we instead look at some simple examples of type erasing and downcasting concrete types directly.
The basics
Getting a `dyn Any` isn't any different than any other kind of type erasure:
```rust
use std::any::Any;

let mut i = 0;

let rf: &dyn Any = &();
let mt: &mut dyn Any = &mut i;
let bx: Box<dyn Any> = Box::new(String::new());
```
You have to keep in mind the `'static` requirement though:
```rust
use std::any::Any;

let local = ();
let borrow = &local;

// fails because `borrow` is not `'static`
let _: &dyn Any = &borrow;
```
On the upside, `dyn Any` is always `dyn Any + 'static`, which makes many trait object related borrow check errors impossible.
Although `Any` is implemented for unsized types, and unsized types can have `TypeId`s too, the `Sized` restriction for type erasure still applies:
```rust
use std::any::Any;

// fails because `str` is not `Sized`
let _: &dyn Any = "";
```
Fallible downcasting is pretty straightforward as well. For references the return is an `Option`:
```rust
use std::any::Any;

let mut i = 0;
let rf: &dyn Any = &();
let mt: &mut dyn Any = &mut i;

assert_eq!(rf.downcast_ref::<()>(), Some(&()));
assert_eq!(rf.downcast_ref::<String>(), None);

assert!(mt.downcast_mut::<i32>().is_some());
assert_eq!(mt.downcast_mut::<String>(), None);
```
For `Box`es, the return type is a `Result` so that you can keep ownership of the `Box<dyn Any>` if the downcast is not applicable. The `Ok` variant is a `Box<T>` so that you can choose whether it's appropriate to unbox the type or not.
```rust
use std::any::Any;

let bx: Box<dyn Any> = Box::new(String::new());

if let Err(bx) = bx.downcast::<i32>() {
    println!("Hmm, not an `i32`.");
    if let Ok(bx) = bx.downcast::<String>() {
        let s: String = *bx;
        println!("Yep, it was a `String`.");
    }
}
```
That's it for the basics!
The `TypeMap` pattern
For an example with a more practical bent, let's say you wanted to store a distinct value for each distinct type you may encounter, for some reason. Maybe you're storing callbacks for types which are likewise type erased, say, and the callback for a `Dog` would be different than that for a `Cat`, and you might not even have a callback for the `Mouse`.
One way to do this would be to have a data structure that maps a `TypeId` to the values. A "type map", if you will:
```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

pub struct TypeMap<V> {
    map: HashMap<TypeId, V>,
}

impl<V> TypeMap<V> {
    pub fn insert<T: Any>(&mut self, value: V) -> Option<V> {
        let id = TypeId::of::<T>();
        self.map.insert(id, value)
    }

    pub fn get_mut<T: Any>(&mut self) -> Option<&mut V> {
        self.get_mut_of(&TypeId::of::<T>())
    }

    pub fn get_mut_of(&mut self, id: &TypeId) -> Option<&mut V> {
        self.map.get_mut(id)
    }

    // ...
}
```
This could be used for the callback idea:
```rust
pub struct Visitor {
    map: TypeMap<Box<dyn FnMut(&dyn Any)>>,
}

impl Visitor {
    // Because we return closures we have previously created,
    // we should take care to not *assume* that the parameter
    // in the callback is of the correct type. If we never
    // let our closures escape to the outside world, we could
    // safely assume that the parameter was, in fact, `T`.
    //
    // It would be sound to `panic` if the parameter was not
    // `T` even if we let the closures escape, but it would
    // not be sound to use the unstable `downcast_ref_unchecked`
    // so long as we're letting the closure escape.
    pub fn register<T, F>(&mut self, mut callback: F) -> Option<Box<dyn FnMut(&dyn Any)>>
    where
        T: Any,
        F: 'static + FnMut(&T),
    {
        let callback = Box::new(move |any: &dyn Any| {
            if let Some(t) = any.downcast_ref::<T>() {
                callback(t);
            }
        });
        self.map.insert::<T>(callback)
    }

    pub fn get_callback<T: Any>(&mut self) -> Option<impl FnMut(&T) + '_> {
        self.map.get_mut::<T>().map(|f| |t: &T| f(t))
    }

    pub fn visit<T: Any>(&mut self, value: &T) -> bool {
        if let Some(mut callback) = self.get_callback::<T>() {
            callback(value);
            true
        } else {
            false
        }
    }

    pub fn visit_erased(&mut self, value: &dyn Any) -> bool {
        if let Some(callback) = self.map.get_mut_of(&value.type_id()) {
            callback(value);
            true
        } else {
            false
        }
    }
}
```
Above we have also type erased our callback signatures, since we needed a single type for our values. This is somewhat the data structure version of erasing a trait.
For whatever reason you might want to map by types, this is an existing pattern in the ecosystem.
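To make the pattern concrete, here's a pared-down `TypeMap` with a usage sketch (the `Dog`/`Cat`/`Mouse` stand-in types and the stored `&str` values are made up for the example):

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;

pub struct TypeMap<V> {
    map: HashMap<TypeId, V>,
}

impl<V> TypeMap<V> {
    pub fn new() -> Self {
        Self { map: HashMap::new() }
    }

    // One slot per distinct `'static` type.
    pub fn insert<T: Any>(&mut self, value: V) -> Option<V> {
        self.map.insert(TypeId::of::<T>(), value)
    }

    pub fn get<T: Any>(&self) -> Option<&V> {
        self.map.get(&TypeId::of::<T>())
    }
}

struct Dog;
struct Cat;
struct Mouse;

fn main() {
    let mut sounds: TypeMap<&str> = TypeMap::new();
    sounds.insert::<Dog>("woof");
    sounds.insert::<Cat>("meow");

    assert_eq!(sounds.get::<Dog>(), Some(&"woof"));
    // Nothing was registered for `Mouse`.
    assert!(sounds.get::<Mouse>().is_none());
}
```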
Custom downcasting
In this example, we explore how you can add (emulated) downcasting to your own traits, without an `Any` supertrait bound and without a `'static` bound on your entire trait, either. The implementation does still depend on `TypeId`, so the actual downcasting is still limited to types which satisfy a `'static` bound.

The approach is to make implementing the general idea possible by having a method in our trait that returns the `TypeId` of the implementing type.
```rust
use core::any::TypeId;

pub trait Trait {
    // Heads up: We'll need to revisit this definition in just a bit
    fn type_id(&self) -> TypeId
    where
        Self: 'static,
    {
        TypeId::of::<Self>()
    }
}
```
Recall that the compiler's implementation of `Trait for dyn Trait` will implicitly downcast the receiver and call this method. Therefore, we now have a way to get the `TypeId` of the base type out of a `dyn Trait`.
```rust
// n.b. this is `dyn Trait + 'static` :-)
impl dyn Trait {
    pub fn is<T: 'static>(&self) -> bool {
        TypeId::of::<T>() == self.type_id()
    }

    pub fn downcast<T: 'static>(self: Box<Self>) -> Result<Box<T>, Box<Self>> {
        if (*self).is::<T>() {
            let ptr = Box::into_raw(self) as *mut T;
            // SAFETY: Keep reading :-)
            unsafe { Ok(Box::from_raw(ptr)) }
        } else {
            Err(self)
        }
    }
}
```
...and similarly for `downcast_ref`, `downcast_mut`, and more implementations for `dyn Trait + Send`, `dyn Trait + Send + Sync`, and so on. (Yes, it can be a lot of boilerplate.)
However, there's a large soundness hole in this example. Implementors of `Trait` can supply their own implementation of the `type_id` method! They can override the default body and return a `TypeId` that is not that of the implementing type. That makes our `unsafe` code produce UB; since `type_id` isn't an `unsafe` method, our implementation is to blame.
We could make the method `unsafe`. In this example, we will instead make it impossible to override the default body. To do this, we need the method to be `final`. Well, Rust doesn't have `final` yet. However, we can seal the method by giving it a parameter which can only be named within our module.
```rust
// A private module (no `pub`)
mod private {
    // Containing a `pub` type (to avoid errors on our public trait method)
    pub struct Seal;
}
```
```diff
 pub trait Trait {
-    fn type_id(&self) -> TypeId
+    #[doc(hidden)]
+    fn type_id(&self, _: private::Seal) -> TypeId
     where
         Self: 'static,
     {
         TypeId::of::<Self>()
     }
 }

 impl dyn Trait {
     pub fn is<T: 'static>(&self) -> bool {
-        TypeId::of::<T>() == self.type_id()
+        TypeId::of::<T>() == self.type_id(private::Seal)
     }
 }
```
Now if an implementor tries to write out the signature of the `type_id` method, they'll get a privacy error.
Here's a playground with a couple more methods.
This example is based on `std::error::Error` and its own custom downcasting implementation.
Why `'static`?
The `Any` trait is implemented for all types which satisfy a `'static` bound, but no other types; in fact, it has a `'static` bound and thus cannot be implemented for types that do not meet a `'static` bound. This means that emulating dynamic typing with `Any` cannot be done for borrowing types (except those that borrow for `'static`), for example.
Why such a harsh restriction? In short, lifetimes are erased before runtime, so types with different lifetimes would have to share the same `TypeId` identifier, and thus downcasting based on the `TypeId` would ignore lifetimes and be wildly unsound. Lifetimes are a part of types, and certain relationships between them must be preserved for soundness; but as the lifetimes have been erased before runtime, it's not possible to preserve those relationships dynamically.

Thus there is just no sound way to use `TypeId` or any similar lifetime-unaware identifier to perform non-`'static` downcasts directly.
There is more information in this RFC PR, for the curious. Note that the PR was accepted but then later removed, and was never about non-`'static` downcasting; it was about a non-`'static` `type_id` method. The idea was to get a "type" identifier that ignored lifetimes.

It was withdrawn in large part because if such a thing existed, the chances of it being abused in some wildly unsound way are about 100%.
An alternative (as presented in that thread) is to have some way to dynamically check if two types (which are perhaps generic) are equal without imposing a `'static` bound. The check could only be meaningful for types that were "inherently `'static`", that is, types that do not involve any lifetime parameters at all. That would be possible without actually exposing a non-`'static` `TypeId` or otherwise enabling downcasting.
The tradeoff results in pretty unintuitive behavior: `&'static str` cannot be compared to `&'static str` with this approach, because there is a lifetime parameter involved with `&str`!
Another alternative is to provide some sort of "type lambda" which is itself `'static`, but can soundly map erased lifetimes back to their proper position. A sketch is provided here, but an in-depth exploration is out of scope for this guide.

A potential footgun around subtypes (subtitle: why not `const`?)
Let's take a minute to talk about types that do have a sub and supertype relationship in Rust! Types which are higher-ranked have this relationship. For example:
```rust
// More explicitly, this is a `for<'any> fn(&'any str)` function pointer.
// The type is higher-ranked over the lifetime.
let fp: fn(&str) = |_| {};

// This type is a supertype of the higher-ranked type.
let fp: fn(&'static str) = fp;

// This errors because you can't soundly downcast the types.
// let fp: fn(&str) = fp;
```
And as it turns out, it is possible for two Rust types which are more than superficially syntactically different to be subtypes of one another. Some parts of the language consider the existence of such a relationship to mean that the two types are equal. Let's say they are semantically equal.
Below is an example. Due to covariance, it's always possible to call either of the functions from the other, which helps explain why they are considered subtypes of one another.
```rust
let one: for<'a    > fn(&'a str, &'a str) = |_, _| {};
let two: for<'a, 'b> fn(&'a str, &'b str) = |_, _| {};

let mut fp = one;
fp = two;

let mut fp = two;
fp = one;
```
However, these two types have different `TypeId`s! So different parts of Rust currently disagree about what types are equal or not. As the issue explains, this is a bit of a footgun if you were expecting consistency.
Additionally, it's a blocker for a `const type_id` function, as it is possible to cause UB in safe code with a `const type_id` function so long as this inconsistency remains.
How the language will evolve around this is unclear. People want the `const` feature badly enough that some version with caveats about false negatives may be pursued. Personally I feel making the type system consistent would be the better solution, and worth waiting for.
More considerations around higher-ranked types
Even if the issue discussed above gets resolved and Rust becomes consistent about what types are equal, higher-ranked types introduce some nuance to be aware of. For example, when considering these two types:
```rust
trait Look<'s> {}

type HR = dyn for<'any> Look<'any> + 'static;
type ST = dyn Look<'static> + 'static;
```
`HR` is a subtype of `ST`, but not the same type.
However, they both satisfy a `'static` bound:
```rust
trait Look<'s> {}

type HR = dyn for<'any> Look<'any> + 'static;
type ST = dyn Look<'static> + 'static;

fn assert_static<T: ?Sized + 'static>() {}

assert_static::<HR>();
assert_static::<ST>();
```
As `'static` types, they have `TypeId`s. As distinct types, their `TypeId`s are different, even though one is a subtype of the other.
And this in turn means that you can't stop thinking about sub and supertypes by simply applying a `'static` bound. If you need to "disable" sub/supertype coercions in a generic context for soundness, you must make that context invariant or take other steps to avoid a soundness hole, even if you have a `'static` bound.
See this issue for a real-life example of such a soundness hole, and this comment in particular exploring the sub/super type relationships of higher-ranked function pointers.
The representation of `TypeId`

`TypeId` is intentionally opaque and subject to change. It was internally represented by a `u64` for quite some time; as of Rust 1.72 the representation is a `u128`. At some future time it could be something more exotic.

Long story short, you're not meant to rely on the exact representation of `TypeId`, only its type-comparing properties.