mirror of https://github.com/rust-lang/rfcs
Compare commits
10 Commits
66870f8c09
...
9d9f293266
Author | SHA1 | Date |
---|---|---|
Mara Bos | 9d9f293266 | |
Eric Huss | 8e7887c8b7 | |
Ralf Jung | 865c00519b | |
Mara Bos | 61083fbd7f | |
Mara Bos | 4aa6cd93ad | |
Mara Bos | 9b3e6d1c11 | |
Mara Bos | bb2acd3759 | |
Mara Bos | 2fb04332c4 | |
Mara Bos | 4054f1bc14 | |
Mara Bos | 33968d16fa |
|
@@ -115,7 +115,8 @@ merged into the RFC repository as a markdown file. At that point the RFC is
 feedback from the larger community, and the author should be prepared to
 revise it in response.
 - Now that your RFC has an open pull request, use the issue number of the PR
- to update your `0000-` prefix to that number.
+ to rename the file: update your `0000-` prefix to that number. Also
+ update the "RFC PR" link at the top of the file.
 - Each pull request will be labeled with the most relevant [sub-team], which
 will lead to its being triaged by that team in a future meeting and assigned
 to a member of the subteam.
|
||||
|
|
|
@@ -0,0 +1,403 @@
|
|||
- Feature Name: `export`
|
||||
- Start Date: 2023-04-19
|
||||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
|
||||
- Rust Issue: [rust-lang/rust#0000](https://github.com/rust-lang/rust/issues/0000)
|
||||
|
||||
# Dynamically Linked Crates
|
||||
|
||||
This is a proposal for a new `#[export]` attribute to greatly simplify
|
||||
the creation and use of dynamic libraries.
|
||||
|
||||
This proposal complements the ["crabi" ABI](https://github.com/rust-lang/rust/pull/105586) proposal.
|
||||
|
||||
## Problem statement
|
||||
|
||||
Imagine a simple library crate with just one simple function, and an application that uses it:
|
||||
|
||||
```rust
//! library crate

pub fn hello() {
    println!("Hello!");
}
```
|
||||
|
||||
```rust
//! application crate

fn main() {
    library::hello();
}
```
|
||||
|
||||
By default, Cargo will automatically build both crates and **statically** link them into a single binary.
|
||||
|
||||
However, there are many reasons why one might want to **dynamically** link the library instead.
|
||||
The use cases for dynamic linking can be roughly split into two:
|
||||
|
||||
1. Cases where both the dynamic library and application are compiled with the exact same compiler (on the same platform, with the same settings) and shipped together.
|
||||
2. Cases where the dynamic library and the application can be compiled and shipped separately from each other.
|
||||
|
||||
The first situation is currently relatively well supported by Rust.
|
||||
The Rust compiler itself falls into this category, where we ship a single `librustc_driver.so` (or .dll or equivalent)
|
||||
file that is used by `rustc`, `rustfmt`, `rustdoc`, and `clippy`.
|
||||
The motivation is simply to reduce the binary size of the overall package containing all these tools.
|
||||
|
||||
The second situation has far more use cases and is currently not well supported by Rust.
|
||||
A common use case is a library that is shipped as part of the system (e.g. `libz.so` or `kernel32.dll`),
|
||||
in which case you want to use the version provided by the system the program is run on,
|
||||
and not from the system it was compiled on.
|
||||
In these cases, dynamically linking is important to make sure the library can be independently updated.
|
||||
(And it also helps to not blow up binary sizes.)
|
||||
|
||||
We need a good solution for this second category of use cases.
|
||||
|
||||
### Solution today
|
||||
|
||||
Currently, a way to implement this would make use of a combination of `extern "C"`, `#[no_mangle]` and `unsafe`,
|
||||
each of which has major downsides.
|
||||
|
||||
It'd look something like this:
|
||||
|
||||
```rust
//! library crate

pub fn hello() {
    println!("Hello!");
}

#[no_mangle]
pub extern "C" fn some_unique_name_for_hello() {
    hello();
}
```
|
||||
|
||||
```rust
//! library bindings crate

#[link(name = "library")]
extern "C" {
    fn some_unique_name_for_hello();
}

#[inline]
pub fn hello() {
    unsafe { some_unique_name_for_hello() };
}
```
|
||||
|
||||
```rust
//! application crate

fn main() {
    library_bindings::hello();
}
```
|
||||
|
||||
This is bad. It's very verbose and error prone. More specifically:
|
||||
|
||||
- `#[no_mangle]` is needed to export a symbol under a stable name, but it requires manually picking a good unique name that won't collide with other items from other crates.
|
||||
- A stable ABI is necessary to allow linking code from a different compiler (version),
|
||||
but `extern "C"` puts severe limitations on the function signatures,
|
||||
as most Rust types can't directly pass through the C ABI.
|
||||
- `unsafe` code is required, because the compiler cannot validate the imported symbol matches the expected function signature.
|
||||
Importing the wrong library (with the same symbol name) could result in unsoundness.
|
||||
- There are now two library crates: one that will be compiled into the dynamic library (the .dll/.so/.dylib file),
|
||||
and one that provides the bindings to that dynamic library.
|
||||
The second library is likely fully inlined into the final application,
|
||||
as it only has wrappers, just to bring back the original (safe) function signatures.
|
||||
|
||||
Much of this solution could be automated by a procedural macro,
|
||||
but splitting a library crate in two falls outside of what a procedural macro can reasonably do.
|
||||
|
||||
### Proposed solution sketch
|
||||
|
||||
Instead of all the manual usage of `#[no_mangle]`, `extern`, and `unsafe`,
|
||||
a much better solution would look as close as possible to the original code.
|
||||
|
||||
With the proposal below, one only needs to add an `#[export]` attribute, and give the function a stable ABI
|
||||
(e.g. `extern "C"` or (in the future) `extern "crabi"`):
|
||||
|
||||
```rust
//! library crate

#[export]
pub extern "C" fn hello() {
    println!("Hello!");
}
```
|
||||
|
||||
```rust
//! application crate

fn main() {
    library::hello();
}
```
|
||||
|
||||
The library can then be either linked statically or dynamically, by informing cargo of the choice:
|
||||
|
||||
```diff
[dependencies]
- library = { path = "..." }
+ library = { path = "...", dynamic = true }
```
|
||||
|
||||
## Proposal
|
||||
|
||||
Creating and using dynamic libraries involves three things:
|
||||
|
||||
1. A stable ABI that can be used for the items that are exported/imported.
|
||||
2. A way to export and import items.
|
||||
3. A way to create and use dynamic libraries.
|
||||
|
||||
For (1) we currently only have `extern "C"`, which only suffices for very simple cases.
|
||||
This proposal does not include any improvements for (1),
|
||||
but the ["crabi" proposal](https://github.com/rust-lang/rust/pull/105586) proposes the creation
|
||||
of a new `extern "…"` ABI that is more flexible, which perfectly complements this proposal.
|
||||
|
||||
This proposal provides solutions for (2) and (3).
|
||||
Exporting (and importing) items is done through a new language feature: the `#[export]` attribute.
|
||||
Creating and using dynamic libraries is made easy through a new Cargo feature: `dynamic` dependencies.
|
||||
|
||||
### The `#[export]` Attribute
|
||||
|
||||
The `#[export]` attribute is used to mark items which are "stable" (in ABI/layout/signature)
|
||||
such that they can be used across the border between (separately compiled) dynamically linked libraries/binaries.
|
||||
|
||||
The `#[export]` attribute can be applied to any public item that is *exportable*.
|
||||
The set of *exportable* items can grow over time with future proposals.
|
||||
Initially, only the following items are *exportable*:
|
||||
|
||||
- Non-generic functions with a stable ABI (e.g. `extern "C"`)
|
||||
for which every user defined type used in the signature is also marked as `#[export]`.
|
||||
- This includes type associated functions ("methods").
|
||||
- Structs/enums/unions with a stable representation (e.g. `repr(i32)` or `repr(C)`).
|
||||
- Re-exports of those items (`use` statements, `type` aliases).
|
||||
|
||||
An `#[export]` attribute can also be applied to a crate, a module, or a non-generic type `impl` block,
|
||||
which is simply equivalent to applying the attribute to every public item within it.
|
||||
|
||||
For types, the `#[export]` attribute represents the commitment to keep the representation of the type stable.
|
||||
(To differentiate from, for example, a `#[repr(i32)]` that only exists as an optimization rather than as a stable promise.)
|
||||
|
||||
For functions, the `#[export]` attribute will make the function available from the dynamic library
|
||||
under a stable "mangled" symbol that uniquely represents its crate and module path *and full signature*.
|
||||
(More on that below.)
|
||||
|
||||
For aliases of functions, an `#[export]` attribute on the `use` statement will use the
|
||||
path (and name) of the alias, not of the original function.
|
||||
(So it won't 'leak' the name of any (possibly private/unstable) module it was re-exported from.)
|
||||
|
||||
### Privacy
|
||||
|
||||
It is an error to export an item that is not public, or is part of a non-public module.
|
||||
The set of exported items of a crate will always be a subset of the crate's public interface.
|
||||
|
||||
It's fine to `#[export]` a public alias of a public item from a private module:
|
||||
|
||||
```rust
mod a {
    pub extern "C" fn f() { … }
}

#[export]
pub mod b {
    pub use super::a::f;
}
```
|
||||
|
||||
(This will export the function `f` as `b::f`.)
|
||||
|
||||
### Importing Exported Items
|
||||
|
||||
Normally, when using a crate as a dependency, any `#[export]` attributes of that crate have no effect
|
||||
and the dependency is statically linked into the resulting binary.
|
||||
|
||||
When explicitly specifying `dynamic = true` for the dependency in `Cargo.toml`,
or when using an `extern dyn crate …;` statement in the source code,
|
||||
only the items marked as `#[export]` will be available and the dependency will be linked dynamically.
|
||||
|
||||
### Building Dynamic Dependencies
|
||||
|
||||
When using `dynamic = true` for a dependency, there is no need to build that full crate:
|
||||
only the signatures of its exported items are necessary.
|
||||
Cargo will pass a flag to the Rust compiler which will stop it from generating
|
||||
code for non-exported items and function bodies.
|
||||
|
||||
A clear separation between "public dependencies" (which are used in the interface)
|
||||
and "private dependencies" (which are only used in the implementation) is required
|
||||
to avoid building unnecessary indirect dependencies.
|
||||
A system for that has been proposed in [RFC 1977](https://rust-lang.github.io/rfcs/1977-public-private-dependencies.html).
|
||||
|
||||
### Name Mangling and Safety
|
||||
|
||||
Because a dynamic dependency and the crate that uses it are compiled separately
|
||||
and only combined at runtime,
|
||||
it is impossible for the compiler to perform any (safety, borrow, signature, …) checks.
|
||||
However, making a (perhaps accidental) change to a function signature or type
|
||||
should not lead to undefined behavior at runtime.
|
||||
|
||||
There are two ways to solve this problem:
|
||||
|
||||
1. Make it the responsibility of the user.
|
||||
2. Make it the responsibility of the loader/linker.
|
||||
|
||||
Option (1) simply means making everything `unsafe`, which isn't very helpful to the user.
|
||||
Option (2) means the loader (the part that loads the dynamic library at runtime) needs to perform the checks.
|
||||
|
||||
Unless we ship our own loader as part of Rust binaries,
|
||||
we can only make use of the one functionality available in the loaders of all operating systems:
|
||||
looking up symbols by their name.
|
||||
|
||||
So, in order to be able to provide safety, the symbol name has to be unique for the full signature,
|
||||
including all relevant type descriptions.
|
||||
|
||||
To avoid extremely long symbol names that contain a full (encoded) version of the function signature
|
||||
and all relevant type descriptions, we use a 128-bit hash based on all this information.
|
||||
|
||||
For example, an exported item in `foo::bar` in the crate `mycrate` would be exported with a symbol name such as:
|
||||
|
||||
```
_RNvNtC_7mycrate3foo3bar_f8771d213159376fafbff1d3b93bb212
```
|
||||
|
||||
Where the first part is the (mangled) path and name of the item,
|
||||
and the second part is the hexadecimal representation of a 128-bit hash of all relevant signature and type information.
|
||||
The hash algorithm is still to be determined.
|
||||
|
||||
(See also the "alternatives" section below.)
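The naming scheme described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper `export_symbol`, the textual signature format, and the prefix construction are hypothetical, and 128-bit FNV-1a is used purely as a stand-in because the proposal leaves the hash algorithm undecided.

```rust
/// 128-bit FNV-1a. A stand-in only: the proposal does not pick a hash algorithm.
fn fnv1a_128(data: &[u8]) -> u128 {
    const OFFSET_BASIS: u128 = 0x6c62272e07bb014262b821756295c58d;
    const PRIME: u128 = (1 << 88) + 0x13b;
    let mut hash = OFFSET_BASIS;
    for &byte in data {
        hash ^= u128::from(byte);
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}

/// Hypothetical helper: builds a symbol shaped like the example above for an
/// item inside one module of a crate. The path part uses length-prefixed
/// segments; the suffix is the hash of the signature as 32 hex digits.
fn export_symbol(krate: &str, module: &str, item: &str, signature: &str) -> String {
    format!(
        "_RNvNtC_{}{}{}{}{}{}_{:032x}",
        krate.len(),
        krate,
        module.len(),
        module,
        item.len(),
        item,
        fnv1a_128(signature.as_bytes()),
    )
}

fn main() {
    // The signature strings are an assumed encoding, chosen for readability.
    let old = export_symbol("mycrate", "foo", "bar", "extern \"C\" fn(i32) -> i32");
    let new = export_symbol("mycrate", "foo", "bar", "extern \"C\" fn(i64) -> i32");
    println!("{old}");
    println!("{new}");
    // Same path, different signature: the loader sees two unrelated symbols.
    assert_ne!(old, new);
}
```

Because the path prefix is identical and only the suffix differs, an application built against the old signature fails at load time with a missing symbol instead of calling a function with the wrong argument types.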
|
||||
|
||||
### Type Information
|
||||
|
||||
As mentioned above, the hash in a symbol name needs to cover _all_ relevant type information.
|
||||
However, exactly which information is and isn't relevant for safety is a complicated question.
|
||||
|
||||
#### Types with Public Fields
|
||||
|
||||
For a simple user defined type where all fields are public, like the `Point` struct below,
|
||||
the relevant parts are the size, alignment, and recursively all field information.
|
||||
|
||||
```rust
#[export]
#[repr(C)]
pub struct Point {
    pub x: f32,
    pub y: f32,
    // A reference field in a struct needs a lifetime to compile.
    pub name: &'static str,
}
```
|
||||
|
||||
The `#[export]` attribute is the user's commitment to keep the type stable, but without `unsafe`,
|
||||
any mistakes should _not_ result in unsoundness.
|
||||
Accidentally changing the struct to swap the `x` and `name` fields should result in a different hash,
|
||||
such that the `f32` won't get interpreted as a `&str`, for example.
|
||||
|
||||
Note that, technically, the names of the type and the fields are not relevant, at least _not for memory safety_.
|
||||
Swapping the `x` and `y` fields would result in surprises and bugs and shouldn't be done,
|
||||
but it won't result in undefined behaviour, since any Rust code can swap the fields without using `unsafe`.
|
||||
|
||||
However, for public fields, the field names are already part of the stable API, so we include them in the hash as well.
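To make the recursive hashing of public-field types concrete, here is a minimal sketch. The `TypeDesc` enum, the `fingerprint` function, and the hard-coded size and alignment are all assumptions for illustration, and the standard library's 64-bit `DefaultHasher` stands in for the 128-bit hash the proposal describes.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical description of the layout-relevant facts of an exported type.
#[derive(Hash)]
enum TypeDesc {
    Builtin(&'static str),
    Struct {
        name: &'static str,
        size: usize,
        align: usize,
        /// (field name, field type) pairs in declaration order.
        fields: Vec<(&'static str, TypeDesc)>,
    },
}

/// 64-bit stand-in for the 128-bit hash described in the text.
fn fingerprint(desc: &TypeDesc) -> u64 {
    let mut hasher = DefaultHasher::new();
    desc.hash(&mut hasher);
    hasher.finish()
}

/// Describes `Point` with its fields in the given order. Size and alignment
/// are assumed values for a typical 64-bit target.
fn point(order: [&'static str; 3]) -> TypeDesc {
    let fields = order
        .iter()
        .map(|&name| {
            let ty = if name == "name" { "&str" } else { "f32" };
            (name, TypeDesc::Builtin(ty))
        })
        .collect();
    TypeDesc::Struct { name: "Point", size: 24, align: 8, fields }
}

fn main() {
    let original = fingerprint(&point(["x", "y", "name"]));
    let swapped = fingerprint(&point(["name", "y", "x"]));
    // Swapping `x` and `name` changes the field order, so the hash changes.
    assert_ne!(original, swapped);
    println!("{original:016x} vs {swapped:016x}");
}
```

Since field names are part of the hashed data in this sketch, swapping `x` and `y` also changes the fingerprint, even though that swap would not affect memory safety.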
|
||||
|
||||
It is an error to use a plain `#[export]` attribute on a type without a stable `#[repr(…)]`,
|
||||
if it has any private fields,
|
||||
or if any of the fields are not of an `#[export]`ed or builtin type.
|
||||
|
||||
#### Types with Private Fields
|
||||
|
||||
For types where not all fields are public, the situation is much more complicated.
|
||||
|
||||
Private fields usually come with certain *invariants*, and come with `unsafe` code that makes assumptions about them.
|
||||
For example, the private fields of a `Vec` are assumed to represent a valid "owned" pointer to an allocation together with its capacity and initialized size.
|
||||
|
||||
If it were possible to define an identically named type with the same fields but different (or no) invariants/assumptions,
|
||||
or just change the invariants in an existing library,
|
||||
it'd be possible to cause undefined behavior by loading the "wrong" dynamic library.
|
||||
|
||||
Therefore, we can't allow a regular `#[export]` attribute on a type with private fields,
|
||||
since we have no way of automatically determining the invariants / unsafe assumptions about private fields.
|
||||
|
||||
Instead, for these types, we must require the user to *unsafely* commit to
|
||||
ABI stability if they want to make the type available to exported functions.
|
||||
|
||||
Using `#[export(unsafe_stable_abi = «hash»)]`, one can make the (unsafe) promise
|
||||
that the type will remain ABI compatible as long as the provided hash remains the same.
|
||||
The hash must have been randomly generated to ensure uniqueness (which is part of the unsafe promise).
|
||||
|
||||
```rust
#[export(unsafe_stable_abi = "ca83050b302bf0644a1417ac3fa6982a")]
#[repr(C)]
pub struct ListNode {
    next: *const ListNode,
    value: i32,
}
```
|
||||
|
||||
In this case, using the type as part of a function signature will not result in a hash based on the full (recursive) type definition,
|
||||
but will instead be based on the user provided hash (and the size and alignment of the type).
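As a rough illustration of that rule, the sketch below computes what would feed into a signature hash for such a type: the user-provided hash plus size and alignment, and nothing about the fields. The function `opaque_fingerprint` and the second struct `ListNodeV2` are hypothetical names introduced only for this example.

```rust
use std::mem::{align_of, size_of};

#[allow(dead_code)]
#[repr(C)]
pub struct ListNode {
    next: *const ListNode,
    value: i32,
}

/// Hypothetical later revision: private fields renamed and retyped, but with
/// the same size and alignment as `ListNode`.
#[allow(dead_code)]
#[repr(C)]
pub struct ListNodeV2 {
    link: *const ListNodeV2,
    payload: u32,
}

/// Hypothetical helper: the data that would be hashed for a type marked with
/// `unsafe_stable_abi`. Field names and types are deliberately left out.
fn opaque_fingerprint<T>(user_hash: u128) -> (u128, usize, usize) {
    (user_hash, size_of::<T>(), align_of::<T>())
}

fn main() {
    // The hash value from the example above, written as a 128-bit integer.
    const USER_HASH: u128 = 0xca83050b302bf0644a1417ac3fa6982a;

    let v1 = opaque_fingerprint::<ListNode>(USER_HASH);
    let v2 = opaque_fingerprint::<ListNodeV2>(USER_HASH);
    // Private fields changed, user hash kept: still treated as compatible.
    assert_eq!(v1, v2);

    // A freshly generated hash signals an ABI break, whatever the layout.
    let broken = opaque_fingerprint::<ListNode>(USER_HASH ^ 1);
    assert_ne!(v1, broken);
    println!("{v1:?}");
}
```

This is why the promise is `unsafe`: the compiler only sees the hash, size, and alignment, so keeping the hash while changing an invariant is an error that nothing else would catch.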
|
||||
|
||||
### Standard Library
|
||||
|
||||
Once the ["crabi"](https://github.com/rust-lang/rust/pull/105586) feature has progressed far enough,
|
||||
we should consider adding `#[export]` attributes to some standard library types, effectively committing to a stable ABI for those.
|
||||
For example, `Option`, `NonZero`, `File`,
|
||||
and many others are good candidates for `#[export(unsafe_stable_abi)]`
|
||||
(if the "crabi" ABI doesn't already handle them specially).
|
||||
|
||||
## Future Possibilities
|
||||
|
||||
- A `#[no_export]` attribute, which can be useful when marking an entire crate or module as `#![export]`.
|
||||
- Options within the `#[export(…)]` attribute for e.g. a future name mangling scheme version.
|
||||
- `#[export(by_path)]` to export a symbol only based on the path, without the hash of relevant safety/type information.
|
||||
This is useful in situations where safety is not a primary concern, to simplify cases like using Rust code from another language (with a simple symbol name).
|
||||
(Importing (or using) such a symbol from Rust will be `unsafe`.)
|
||||
- `#[export(opaque)]` (or `#[export(indirect)]`) for opaque types that can only be used indirectly (e.g. through a pointer or reference) in exported items,
|
||||
such that their size is not a stable ABI promise.
|
||||
- Exportable `static`s.
|
||||
- Exportable `trait`s, for e.g. dynamic trait objects. (See also https://github.com/rust-lang/rust/pull/105586.)
|
||||
- A tool to create a stripped, 'dynamic import only' version of the crate source code,
|
||||
with only the exported items, without the function bodies. (Essentially a "header file" or "bindings-only crate".)
|
||||
- Alternatively or in addition to that, a way to optionally include such a "header file" or "bindings"
|
||||
(or the same metadata in some other (perhaps Rust-agnostic) format)
|
||||
inside the resulting dynamic library file.
|
||||
- Allow exporting two identically named items to create a dynamic library that is backwards compatible with an older interface,
|
||||
including a symbol for both the old and the new interface.
|
||||
- Next to the hash of the type information,
|
||||
additionally and optionally include the full type information in an extra section,
|
||||
to allow for (debug) tools to accurately diagnose mismatched symbol errors.
|
||||
- Some kind of `#[export_inline]` feature to allow for functions that will be inlined into the calling crate,
|
||||
rather than being part of the dynamic library, which will only be able to call exported items.
|
||||
|
||||
## Alternatives
|
||||
|
||||
- Alternatives for using a hash of all relevant type information:
|
||||
- Don't include type information in the symbols,
|
||||
but make using a dynamic dependency `unsafe` by requiring e.g. `unsafe extern dyn crate …;`.
|
||||
- Include the full (encoded) type information in the symbols, without hashing it.
|
||||
This results in extremely long symbol names, and all the type information will be recoverable
|
||||
(which might be useful or might be undesirable, depending on the use case).
|
||||
This can result in significantly larger binary sizes.
|
||||
- Don't include type information in the symbols, but include the information in another way (e.g. an extra section).
|
||||
If we do this, we can't make use of the loader/linker for the safety checks,
|
||||
so we'll have to include extra code in Rust binaries that will perform the checks separately
|
||||
before using any dynamic dependency.
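For the last alternative, the "extra code" could look roughly like the following sketch, which compares the type-information hashes an application was compiled against with those a library reports. Everything here is an assumption for illustration: the `Fingerprints` table, the `check_compatible` function, and the idea that the table has already been read out of an extra section of the library.

```rust
use std::collections::HashMap;

/// Hypothetical metadata table: item path -> hash of its type information.
type Fingerprints = HashMap<&'static str, u128>;

/// Checks that every item the application expects is provided by the library
/// with a matching hash. Returns a sorted list of problems on failure.
fn check_compatible(expected: &Fingerprints, provided: &Fingerprints) -> Result<(), Vec<String>> {
    let mut problems = Vec::new();
    for (path, want) in expected {
        match provided.get(path) {
            Some(got) if got == want => {}
            Some(_) => problems.push(format!("{path}: type information mismatch")),
            None => problems.push(format!("{path}: missing from library")),
        }
    }
    problems.sort();
    if problems.is_empty() {
        Ok(())
    } else {
        Err(problems)
    }
}

fn main() {
    let expected: Fingerprints = HashMap::from([("mycrate::foo::bar", 1), ("mycrate::foo::baz", 2)]);
    // The library changed the signature of `bar` and no longer exports `baz`.
    let provided: Fingerprints = HashMap::from([("mycrate::foo::bar", 99)]);

    match check_compatible(&expected, &provided) {
        Ok(()) => println!("library is compatible"),
        Err(problems) => {
            for problem in &problems {
                println!("{problem}");
            }
        }
    }
}
```

Unlike the hash-in-symbol approach, this check has to run inside every Rust binary before the first call into the library, which is the cost the alternative describes.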
|
||||
|
||||
## What this Proposal is not
|
||||
|
||||
Questions like
|
||||
|
||||
- How do panics propagate across dynamically linked crates or FFI boundaries?
|
||||
- How can allocated types cross an export boundary and be dropped/deallocated on the other side?
|
||||
|
||||
are **not** solved by `#[export]`, but instead are the responsibility of the ABI.
|
||||
|
||||
The existing `extern "C"` ABI 'solves' these by simply not having any such features.
|
||||
|
||||
The [`extern "crabi"` ABI](https://github.com/rust-lang/rust/pull/105586)
|
||||
will attempt to solve these (but perhaps not in the first version),
|
||||
but that falls outside the scope of this RFC.
|
||||
|
||||
(A separate RFC for the first version of "crabi" might very well appear soon. ^^)
|
||||
|
||||
Separately, how this will be (optionally) used for the standard library is another question entirely,
|
||||
which is left for a later proposal. (Although the hope is that this RFC gives at least a rough idea of how that might work.)
|