Compare commits


49 Commits

Author SHA1 Message Date
Mads Marquart 19a40fb09b
Merge 3cd8b7b123 into 51817951d0 2024-04-27 12:17:53 -07:00
Eric Huss 51817951d0
Merge pull request #1468 from petrochenkov/debmac
Add docs for `#[collapse_debuginfo]` attribute
2024-04-27 17:54:45 +00:00
Eric Huss 2d51a2aec4 Add an example of collapse_debuginfo 2024-04-27 10:53:08 -07:00
Eric Huss 5854fcc286
Merge pull request #1420 from daxpedda/wasm-target-feature-phase-4-5
Stabilize Wasm target features that are in phase 4 and 5
2024-04-21 13:47:07 +00:00
Eric Huss 5e68de3dc2
Merge pull request #1493 from kpreid/patch-1
Expand and clarify primitive alignment
2024-04-20 14:05:08 +00:00
Eric Huss 735c5dbf05
Merge pull request #1492 from conradludgate/patch-1
Update clone reference to include closures
2024-04-17 15:06:26 +00:00
Eric Huss 330ef95694
Clone: Also mention closures that don't capture anything 2024-04-17 08:04:47 -07:00
Kevin Reid a432cf4afd
Expand and clarify primitive alignment
These changes are intended to make the section more informative and readable, without making any new normative claims.

* Specify that the alignment might be _less_ than the size, rather than just that it might be different. This is mandatory and stated in the previous section, but I think it's useful to reiterate here.
* Mention `u128`/`i128` as another example of alignment less than size, so that this doesn't sound like a mainly 32-bit thing.
* Add `usize`/`isize` to the size table, so it can be spotted at a glance.
2024-04-16 09:35:51 -07:00
Conrad Ludgate 4f47e3ffe7
Update clone reference to include closures 2024-04-16 06:55:59 +01:00
Eric Huss 585b9bcb72
Merge pull request #1491 from kpreid/neunit
Document how `non_exhaustive` interacts with tuple and unit-like structs.
2024-04-15 19:59:53 +00:00
Kevin Reid 076a798583 Replace “min()” visibility notation with English. 2024-04-15 11:13:52 -07:00
Eric Huss a60221ad9c
Merge pull request #1490 from jlokier/patch-1
Fix link to RISC-V Zkt spec; it was pointing to Zkr
2024-04-15 16:12:40 +00:00
Kevin Reid ec0065fd92 Document how `non_exhaustive` interacts with tuple and unit-like structs. 2024-04-14 10:48:46 -07:00
Jamie Lokier b4311de691
Fix link to RISC-V Zkt spec; it was pointing to Zkr 2024-04-14 14:07:07 +01:00
Eric Huss 55694913b1
Merge pull request #1449 from weiznich/diagnostic_namespace
Add the `#[diagnostic]` attribute namespace and the `#[diagnostic::on_unimplemented]` feature to the reference
2024-04-03 21:31:14 +00:00
Eric Huss 52874b8312 Update on_unimplemented for format string changes.
Updated in https://github.com/rust-lang/rust/pull/122402
2024-04-03 14:29:34 -07:00
Eric Huss 1c03c9d3b8
Merge pull request #1393 from dvdhrm/pr/align32
type-layout: be more specific about 32-bit alignments
2024-04-03 02:21:31 +00:00
Eric Huss 1e1fec30f1
Merge pull request #1488 from yotamofek/patch-1
Fix clippy warning in procedural macro example
2024-04-01 19:56:13 +00:00
Yotam Ofek a7a86824fa
Fix clippy warning in procedural macro example
I copy+pasted this example into my code and the `clippy::to_string_in_format_args` lint fired.
2024-03-30 23:14:03 +03:00
Eric Huss 984b36eca4
Merge pull request #1486 from aoyama-val/patch-1
Fix typo of shebang
2024-03-25 14:05:19 +00:00
Shotaro Aoyama 0b153cb607
fix typo of shebang 2024-03-24 14:02:49 +09:00
Eric Huss 824b9156b2
Merge pull request #1461 from clubby789/imported-main
Document importing `main`
2024-03-20 15:54:19 +00:00
Eric Huss b6779f40a1
Merge pull request #1481 from compiler-errors/atb
add grammar for `associated_type_bounds` in reference
2024-03-20 15:52:40 +00:00
Eric Huss be4f7be926
Merge pull request #1483 from mattheww/2024-03_unicode_escape_fix
Literal expressions: fix mistake in the definition of unicode escapes
2024-03-19 20:01:14 +00:00
Matthew Woodcraft 659915cc11 Literal expressions: fix mistake in the definition of unicode escapes 2024-03-19 19:36:32 +00:00
Eric Huss 5e29b0135e Various fixes and editing. 2024-03-12 09:21:17 -07:00
Georg Semmler 99b19d92c1 Apply more review suggestions manually
Co-authored-by: Eric Huss <eric@huss.org>
2024-03-12 08:23:37 -07:00
Georg Semmler 5baf87cdd9 Apply suggestions from code review
Co-authored-by: Eric Huss <eric@huss.org>
2024-03-12 08:23:37 -07:00
Georg Semmler 81fe01a111 Add the `#[diagnostic]` attribute namespace and the
`#[diagnostic::on_unimplemented]` feature to the reference
2024-03-12 08:23:37 -07:00
Michael Goulet 6c77f499ea
Update src/paths.md 2024-03-08 11:42:50 -05:00
Michael Goulet 9ad55f00b1
Fix copy/paste error 2024-03-08 11:42:32 -05:00
Michael Goulet 684b549fc7 add support for ATB in reference 2024-03-07 19:47:28 +00:00
Eric Huss 5afb503a4c
Merge pull request #1459 from mattheww/2024-01_input_format
Input format
2024-03-06 21:29:54 +00:00
Eric Huss 54400709b0
Merge pull request #1479 from mattheww/2024-03_lifetime_tokens
Lexer: say that lifetime-like tokens can't be immediately followed by '
2024-03-06 19:01:19 +00:00
Matthew Woodcraft 7bd81a6a03 tokens.md: say that lifetime-like tokens can't be immediately followed by '
Forms like 'ab'c are rejected, so we need some way to explain why they
don't tokenise as two consecutive LIFETIME_OR_LABEL tokens.

Address this by adding "not immediately followed by `'`" to each of the
lexer rules for the lifetime-like tokens.

This also means there can be no ambiguity between CHAR_LITERAL and these
tokens (at present we don't say how such ambiguities are resolved).
2024-03-04 21:32:01 +00:00
Eric Huss c495b9660f Link `collapse_debuginfo` in the index of built-in attributes. 2024-02-14 10:21:12 -08:00
Eric Huss 860fe4acc1 Use semantic line wrapping. 2024-02-14 10:18:40 -08:00
Eric Huss 4e9c91f0ec Place `rustc` behavior in a side note.
Generally the reference tries to stay focused on the language, and only
provide implementation notes as side-information.
2024-02-14 10:17:29 -08:00
Eric Huss 224b6c5306 Use em-dash separator 2024-02-14 10:16:40 -08:00
Eric Huss bb166095d1 Use standard template introducing an attribute. 2024-02-14 10:16:23 -08:00
Vadim Petrochenkov 0bf5d4e44c Add docs for `#[collapse_debuginfo]` attribute 2024-02-13 16:26:42 +03:00
clubby789 50a2c87f82 Document importing `main` 2024-01-31 16:22:22 +00:00
Matthew Woodcraft 8ba3c49114 Input format: note about include! macros 2024-01-28 18:44:28 +00:00
Matthew Woodcraft e364b6c6f9 lexical structure: move the description of shebang-removal
This takes place after CRLF normalization.

It's better not to list the shebang in a Lexer block, as it isn't a token that
can be fed to a macro.
2024-01-28 18:42:40 +00:00
Matthew Woodcraft 5f512692d3 lexical structure: move the description of BOM-removal
This takes place at the same time as CRLF normalisation.

It's better not to list it in a Lexer block, as it isn't a token that can be
fed to a macro.
2024-01-28 18:42:40 +00:00
Matthew Woodcraft fa56fdba0e Lexical structure: move the description of CRLF normalization
We now say that CRLF normalization happens as a separate pass before
tokenization.
2024-01-28 18:42:40 +00:00
Mads Marquart 3cd8b7b123
Proposal for update after RFC 3519 2023-11-21 15:11:08 +01:00
daxpedda d035af92a1
Stabilize Wasm target features that are in phase 4 and 5 2023-11-01 00:19:00 +01:00
David Rheinsberg fdee1043ca type-layout: be more specific about 32-bit alignments
The rust-reference implies that 64-bit types are aligned to 32-bit for
platforms with 32-bit addresses. This is not necessarily correct. Fix
the wording.

Note that there is no general rule how data-types greater than the
native address size are aligned. On most Unix'y systems, they use the
native alignment of the platform. However, the Windows ABI aligns them
to their size (up to at least 64-bit).

There are advantages for either of those decisions. But we should at
least make clear that there is no fixed rule for 32-bit platforms.

Signed-off-by: David Rheinsberg <david@readahead.eu>
2023-08-11 10:11:43 +02:00
15 changed files with 297 additions and 97 deletions

View File

@ -196,7 +196,7 @@ struct S {
pub fn f() {}
```
> Note: `rustc` currently recognizes the tools "clippy" and "rustfmt".
> Note: `rustc` currently recognizes the tools "clippy", "rustfmt" and "diagnostic".
## Built-in attributes index
@ -224,6 +224,8 @@ The following is an index of all built-in attributes.
- [`allow`], [`warn`], [`deny`], [`forbid`] — Alters the default lint level.
- [`deprecated`] — Generates deprecation notices.
- [`must_use`] — Generates a lint for unused values.
- [`diagnostic::on_unimplemented`] — Hints the compiler to emit a certain error
message if a trait is not implemented.
- ABI, linking, symbols, and FFI
- [`link`] — Specifies a native library to link with an `extern` block.
- [`link_name`] — Specifies the name of the symbol for functions or statics
@ -273,6 +275,7 @@ The following is an index of all built-in attributes.
added in future.
- Debugger
- [`debugger_visualizer`] — Embeds a file that specifies debugger output for a type.
- [`collapse_debuginfo`] — Controls how macro invocations are encoded in debuginfo.
[Doc comments]: comments.md#doc-comments
[ECMA-334]: https://www.ecma-international.org/publications-and-standards/standards/ecma-334/
@ -291,6 +294,7 @@ The following is an index of all built-in attributes.
[`cfg_attr`]: conditional-compilation.md#the-cfg_attr-attribute
[`cfg`]: conditional-compilation.md#the-cfg-attribute
[`cold`]: attributes/codegen.md#the-cold-attribute
[`collapse_debuginfo`]: attributes/debugger.md#the-collapse_debuginfo-attribute
[`crate_name`]: crates-and-source-files.md#the-crate_name-attribute
[`crate_type`]: linkage.md
[`debugger_visualizer`]: attributes/debugger.md#the-debugger_visualizer-attribute
@ -352,3 +356,4 @@ The following is an index of all built-in attributes.
[closure]: expressions/closure-expr.md
[function pointer]: types/function-pointer.md
[variadic functions]: items/external-blocks.html#variadic-functions
[`diagnostic::on_unimplemented`]: attributes/diagnostics.md#the-diagnosticon_unimplemented-attribute

View File

@ -262,7 +262,7 @@ Feature | Implicitly Enables | Description
[rv-zks]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zks.adoc
[rv-zksed]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zksed.adoc
[rv-zksh]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zksh.adoc
[rv-zkt]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkr.adoc
[rv-zkt]: https://github.com/riscv/riscv-crypto/blob/e2dd7d98b7f34d477e38cb5fd7a3af4379525189/doc/scalar/riscv-crypto-scalar-zkt.adoc
#### `wasm32` or `wasm64`
@ -273,10 +273,20 @@ attempting to use instructions unsupported by the Wasm engine will fail at load
time without the risk of being interpreted in a way different from what the
compiler expected.
Feature | Description
------------|-------------------
`simd128` | [WebAssembly simd proposal][simd128]
Feature | Description
----------------------|-------------------
`bulk-memory` | [WebAssembly bulk memory operations proposal][bulk-memory]
`extended-const` | [WebAssembly extended const expressions proposal][extended-const]
`mutable-globals` | [WebAssembly mutable global proposal][mutable-globals]
`nontrapping-fptoint` | [WebAssembly non-trapping float-to-int conversion proposal][nontrapping-fptoint]
`sign-ext` | [WebAssembly sign extension operators proposal][sign-ext]
`simd128` | [WebAssembly simd proposal][simd128]
[bulk-memory]: https://github.com/WebAssembly/bulk-memory-operations
[extended-const]: https://github.com/WebAssembly/extended-const
[mutable-globals]: https://github.com/WebAssembly/mutable-global
[nontrapping-fptoint]: https://github.com/WebAssembly/nontrapping-float-to-int-conversions
[sign-ext]: https://github.com/WebAssembly/sign-extension-ops
[simd128]: https://github.com/webassembly/simd
### Additional information

View File

@ -139,3 +139,32 @@ When the crate's debug executable is passed into GDB[^rust-gdb], `print bob` wil
[Natvis documentation]: https://docs.microsoft.com/en-us/visualstudio/debugger/create-custom-views-of-native-objects
[pretty printing documentation]: https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html
[_MetaListNameValueStr_]: ../attributes.md#meta-item-attribute-syntax
## The `collapse_debuginfo` attribute
The *`collapse_debuginfo` [attribute]* controls whether code locations from a macro definition are collapsed into a single location associated with the macro's call site,
when generating debuginfo for code calling this macro.
The attribute uses the [_MetaListIdents_] syntax to specify its inputs, and can only be applied to macro definitions.
Accepted options:
- `#[collapse_debuginfo(yes)]` — code locations in debuginfo are collapsed.
- `#[collapse_debuginfo(no)]` — code locations in debuginfo are not collapsed.
- `#[collapse_debuginfo(external)]` — code locations in debuginfo are collapsed only if the macro comes from a different crate.
The `external` behavior is the default for macros that don't have this attribute, unless they are built-in macros.
For built-in macros the default is `yes`.
> **Note**: `rustc` has a `-C collapse-macro-debuginfo` CLI option to override both the default collapsing behavior and `#[collapse_debuginfo]` attributes.
```rust
#[collapse_debuginfo(yes)]
macro_rules! example {
() => {
println!("hello!");
};
}
```
[attribute]: ../attributes.md
[_MetaListIdents_]: ../attributes.md#meta-item-attribute-syntax

View File

@ -301,6 +301,76 @@ When used on a function in a trait implementation, the attribute does nothing.
> let _ = five();
> ```
## The `diagnostic` tool attribute namespace
The `#[diagnostic]` attribute namespace is a home for attributes to influence compile-time error messages.
The hints provided by these attributes are not guaranteed to be used.
Unknown attributes in this namespace are accepted, though they may emit warnings for unused attributes.
Additionally, invalid inputs to known attributes will typically be a warning (see the attribute definitions for details).
This is meant to allow attributes to be added or discarded, and inputs to be changed, in the future without the need to keep non-meaningful attributes or options working.
### The `diagnostic::on_unimplemented` attribute
The `#[diagnostic::on_unimplemented]` attribute is a hint to the compiler to supplement the error message that would normally be generated in scenarios where a trait is required but not implemented on a type.
The attribute should be placed on a [trait declaration], though it is not an error for it to be located in other positions.
The attribute uses the [_MetaListNameValueStr_] syntax to specify its inputs, though malformed input to the attribute is not considered an error, in order to provide both forwards and backwards compatibility.
The following keys have the given meaning:
* `message` — The text for the top level error message.
* `label` — The text for the label shown inline in the broken code in the error message.
* `note` — Provides additional notes.
The `note` option can appear several times, which results in several note messages being emitted.
If any of the other options appears several times, the first occurrence of the relevant option specifies the value that is actually used.
Any other occurrence generates a lint warning.
For any other non-existing option a lint warning is generated.
All three options accept a string as an argument, interpreted using the same formatting as a [`std::fmt`] string.
Format parameters with the given named parameter will be replaced with the following text:
* `{Self}` — The name of the type implementing the trait.
* `{` *GenericParameterName* `}` — The name of the generic argument's type for the given generic parameter.
Any other format parameter will generate a warning, but will otherwise be included in the string as-is.
Invalid format strings may generate a warning; they are otherwise allowed, but may not display as intended.
Format specifiers may generate a warning, but are otherwise ignored.
In this example:
```rust,compile_fail,E0277
#[diagnostic::on_unimplemented(
message = "My Message for `ImportantTrait<{A}>` implemented for `{Self}`",
label = "My Label",
note = "Note 1",
note = "Note 2"
)]
trait ImportantTrait<A> {}
fn use_my_trait(_: impl ImportantTrait<i32>) {}
fn main() {
use_my_trait(String::new());
}
```
the compiler may generate an error message which looks like this:
```text
error[E0277]: My Message for `ImportantTrait<i32>` implemented for `String`
--> src/main.rs:14:18
|
14 | use_my_trait(String::new());
| ------------ ^^^^^^^^^^^^^ My Label
| |
| required by a bound introduced by this call
|
= help: the trait `ImportantTrait<i32>` is not implemented for `String`
= note: Note 1
= note: Note 2
```
[`std::fmt`]: ../../std/fmt/index.html
[Clippy]: https://github.com/rust-lang/rust-clippy
[_MetaListNameValueStr_]: ../attributes.md#meta-item-attribute-syntax
[_MetaListPaths_]: ../attributes.md#meta-item-attribute-syntax

View File

@ -20,6 +20,12 @@ pub struct Config {
pub window_height: u16,
}
#[non_exhaustive]
pub struct Token;
#[non_exhaustive]
pub struct Id(pub u64);
#[non_exhaustive]
pub enum Error {
Message(String),
@ -34,11 +40,13 @@ pub enum Message {
// Non-exhaustive structs can be constructed as normal within the defining crate.
let config = Config { window_width: 640, window_height: 480 };
let token = Token;
let id = Id(4);
// Non-exhaustive structs can be matched on exhaustively within the defining crate.
if let Config { window_width, window_height } = config {
// ...
}
let Config { window_width, window_height } = config;
let Token = token;
let Id(id_number) = id;
let error = Error::Other;
let message = Message::Reaction(3);
@ -64,30 +72,49 @@ Non-exhaustive types cannot be constructed outside of the defining crate:
- Non-exhaustive variants ([`struct`][struct] or [`enum` variant][enum]) cannot be constructed
with a [_StructExpression_] \(including with [functional update syntax]).
- The implicitly defined same-named constant of a [unit-like struct][struct],
or the same-named constructor function of a [tuple struct][struct],
has a [visibility] no greater than `pub(crate)`.
That is, if the struct's visibility is `pub`, then the constant or constructor's visibility
is `pub(crate)`, and otherwise the visibility of the two items is the same
(as is the case without `#[non_exhaustive]`).
- [`enum`][enum] instances can be constructed.
The following examples of construction do not compile when outside the defining crate:
<!-- ignore: requires external crates -->
```rust,ignore
// `Config`, `Error`, and `Message` are types defined in an upstream crate that have been
// annotated as `#[non_exhaustive]`.
use upstream::{Config, Error, Message};
// These are types defined in an upstream crate that have been annotated as
// `#[non_exhaustive]`.
use upstream::{Config, Token, Id, Error, Message};
// Cannot construct an instance of `Config`, if new fields were added in
// Cannot construct an instance of `Config`; if new fields were added in
// a new version of `upstream` then this would fail to compile, so it is
// disallowed.
let config = Config { window_width: 640, window_height: 480 };
// Can construct an instance of `Error`, new variants being introduced would
// Cannot construct an instance of `Token`; if new fields were added, then
// it would not be a unit-like struct any more, so the same-named constant
// created by it being a unit-like struct is not public outside the crate;
// this code fails to compile.
let token = Token;
// Cannot construct an instance of `Id`; if new fields were added, then
// its constructor function signature would change, so its constructor
// function is not public outside the crate; this code fails to compile.
let id = Id(5);
// Can construct an instance of `Error`; new variants being introduced would
// not result in this failing to compile.
let error = Error::Message("foo".to_string());
// Cannot construct an instance of `Message::Send` or `Message::Reaction`,
// Cannot construct an instance of `Message::Send` or `Message::Reaction`;
// if new fields were added in a new version of `upstream` then this would
// fail to compile, so it is disallowed.
let message = Message::Send { from: 0, to: 1, contents: "foo".to_string(), };
let message = Message::Reaction(0);
// Cannot construct an instance of `Message::Quit`, if this were converted to
// Cannot construct an instance of `Message::Quit`; if this were converted to
// a tuple-variant `upstream` then this would fail to compile.
let message = Message::Quit;
```
@ -95,16 +122,18 @@ let message = Message::Quit;
There are limitations when matching on non-exhaustive types outside of the defining crate:
- When pattern matching on a non-exhaustive variant ([`struct`][struct] or [`enum` variant][enum]),
a [_StructPattern_] must be used which must include a `..`. Tuple variant constructor visibility
is lowered to `min($vis, pub(crate))`.
a [_StructPattern_] must be used which must include a `..`. A tuple variant's constructor's
[visibility] is reduced to be no greater than `pub(crate)`.
- When pattern matching on a non-exhaustive [`enum`][enum], matching on a variant does not
contribute towards the exhaustiveness of the arms.
The following examples of matching do not compile when outside the defining crate:
<!-- ignore: requires external crates -->
```rust, ignore
// `Config`, `Error`, and `Message` are types defined in an upstream crate that have been
// annotated as `#[non_exhaustive]`.
use upstream::{Config, Error, Message};
// These are types defined in an upstream crate that have been annotated as
// `#[non_exhaustive]`.
use upstream::{Config, Token, Id, Error, Message};
// Cannot match on a non-exhaustive enum without including a wildcard arm.
match error {
@ -118,6 +147,13 @@ if let Ok(Config { window_width, window_height }) = config {
// would compile with: `..`
}
// Cannot match a non-exhaustive unit-like or tuple struct except by using
// braced struct syntax with a wildcard.
// This would compile as `let Token { .. } = token;`
let Token = token;
// This would compile as `let Id { 0: id_number, .. } = id;`
let Id(id_number) = id;
match message {
// Cannot match on a non-exhaustive struct enum variant without including a wildcard.
Message::Send { from, to, contents } => { },
@ -147,3 +183,4 @@ Non-exhaustive types are always considered inhabited in downstream crates.
[enum]: ../items/enumerations.md
[functional update syntax]: ../expressions/struct-expr.md#functional-update-syntax
[struct]: ../items/structs.md
[visibility]: ../visibility-and-privacy.md

View File

@ -30,7 +30,7 @@
> &nbsp;&nbsp; | INNER_BLOCK_DOC
>
> _IsolatedCR_ :\
> &nbsp;&nbsp; _A `\r` not followed by a `\n`_
> &nbsp;&nbsp; \\r
## Non-doc comments
@ -53,8 +53,9 @@ that follows. That is, they are equivalent to writing `#![doc="..."]` around
the body of the comment. `//!` comments are usually used to document
modules that occupy a source file.
Isolated CRs (`\r`), i.e. not followed by LF (`\n`), are not allowed in doc
comments.
The character `U+000D` (CR) is not allowed in doc comments.
> **Note**: The sequence `U+000D` (CR) immediately followed by `U+000A` (LF) would have been previously transformed into a single `U+000A` (LF).
## Examples

View File

@ -2,16 +2,9 @@
> **<sup>Syntax</sup>**\
> _Crate_ :\
> &nbsp;&nbsp; UTF8BOM<sup>?</sup>\
> &nbsp;&nbsp; SHEBANG<sup>?</sup>\
> &nbsp;&nbsp; [_InnerAttribute_]<sup>\*</sup>\
> &nbsp;&nbsp; [_Item_]<sup>\*</sup>
> **<sup>Lexer</sup>**\
> UTF8BOM : `\uFEFF`\
> SHEBANG : `#!` \~`\n`<sup>\+</sup>[†](#shebang)
> Note: Although Rust, like any other language, can be implemented by an
> interpreter as well as a compiler, the only existing implementation is a
> compiler, and the language has always been designed to be compiled. For these
@ -53,6 +46,8 @@ that apply to the containing module, most of which influence the behavior of
the compiler. The anonymous crate module can have additional attributes that
apply to the crate as a whole.
> **Note**: The file's contents may be preceded by a [shebang].
```rust
// Specify the crate name.
#![crate_name = "projx"]
@ -65,34 +60,6 @@ apply to the crate as a whole.
#![warn(non_camel_case_types)]
```
## Byte order mark
The optional [_UTF8 byte order mark_] (UTF8BOM production) indicates that the
file is encoded in UTF8. It can only occur at the beginning of the file and
is ignored by the compiler.
## Shebang
A source file can have a [_shebang_] (SHEBANG production), which indicates
to the operating system what program to use to execute this file. It serves
essentially to treat the source file as an executable script. The shebang
can only occur at the beginning of the file (but after the optional
_UTF8BOM_). It is ignored by the compiler. For example:
<!-- ignore: tests don't like shebang -->
```rust,ignore
#!/usr/bin/env rustx
fn main() {
println!("Hello!");
}
```
A restriction is imposed on the shebang syntax to avoid confusion with an
[attribute]. The `#!` characters must not be followed by a `[` token, ignoring
intervening [comments] or [whitespace]. If this restriction fails, then it is
not treated as a shebang, but instead as the start of an attribute.
## Preludes and `no_std`
This section has been moved to the [Preludes chapter](names/preludes.md).
@ -119,6 +86,17 @@ fn main() -> impl std::process::Termination {
}
```
The `main` function may be an import, e.g. from an external crate or from the current one.
```rust
mod foo {
pub fn bar() {
println!("Hello, world!");
}
}
use foo::bar as main;
```
> **Note**: Types with implementations of [`Termination`] in the standard library include:
>
> * `()`
@ -161,20 +139,17 @@ or `_` (U+005F) characters.
[_InnerAttribute_]: attributes.md
[_Item_]: items.md
[_MetaNameValueStr_]: attributes.md#meta-item-attribute-syntax
[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix)
[_utf8 byte order mark_]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
[`ExitCode`]: ../std/process/struct.ExitCode.html
[`Infallible`]: ../std/convert/enum.Infallible.html
[`Termination`]: ../std/process/trait.Termination.html
[attribute]: attributes.md
[attributes]: attributes.md
[comments]: comments.md
[function]: items/functions.md
[module]: items/modules.md
[module path]: paths.md
[shebang]: input-format.md#shebang-removal
[trait or lifetime bounds]: trait-bounds.md
[where clauses]: items/generics.md#where-clauses
[whitespace]: whitespace.md
<script>
(function() {

View File

@ -76,7 +76,7 @@ The escaped value is the character whose [Unicode scalar value] is the result of
The escape sequence consists of `\u{`, followed by a sequence of characters each of which is a hexadecimal digit or `_`, followed by `}`.
The escaped value is the character whose [Unicode scalar value] is the result of interpreting the hexadecimal digits contained in the escape sequence as a hexadecimal integer, as if by [`u8::from_str_radix`] with radix 16.
The escaped value is the character whose [Unicode scalar value] is the result of interpreting the hexadecimal digits contained in the escape sequence as a hexadecimal integer, as if by [`u32::from_str_radix`] with radix 16.
> **Note**: the permitted forms of a [CHAR_LITERAL] or [STRING_LITERAL] token ensure that there is such a character.
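As a rough illustration of why `u32` rather than `u8` is the appropriate width here, the following sketch computes the escaped value from the characters between the braces. The helper name `unicode_escape_value` is hypothetical and not part of any library, and the sketch assumes its input has already been accepted by the token grammar.

```rust
/// Hypothetical helper: computes the value of a `\u{...}` escape from the
/// characters found between the braces.
/// Assumes the input consists only of hexadecimal digits and `_`.
fn unicode_escape_value(digits: &str) -> Option<char> {
    // Underscores carry no value, so they are dropped before parsing.
    let hex: String = digits.chars().filter(|c| *c != '_').collect();
    // Up to six hexadecimal digits are allowed, which does not fit in a `u8`.
    let scalar = u32::from_str_radix(&hex, 16).ok()?;
    // `char::from_u32` rejects values that are not Unicode scalar values,
    // such as surrogates.
    char::from_u32(scalar)
}
```

With this sketch, the digits `1F600` yield `Some('\u{1F600}')`, whereas parsing the same digits with `u8::from_str_radix` would fail with an overflow error.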
@ -438,6 +438,7 @@ The expression's type is the primitive [boolean type], and its value is:
[`f64::INFINITY`]: ../../core/primitive.f64.md#associatedconstant.INFINITY
[`f64::NAN`]: ../../core/primitive.f64.md#associatedconstant.NAN
[`u8::from_str_radix`]: ../../core/primitive.u8.md#method.from_str_radix
[`u32::from_str_radix`]: ../../core/primitive.u32.md#method.from_str_radix
[`u128::from_str_radix`]: ../../core/primitive.u128.md#method.from_str_radix
[CHAR_LITERAL]: ../tokens.md#character-literals
[STRING_LITERAL]: ../tokens.md#string-literals

View File

@ -18,14 +18,14 @@ This requires a more complex lookup process than for other functions, since ther
The following procedure is used:
The first step is to build a list of candidate receiver types.
Obtain these by repeatedly [dereferencing][dereference] the receiver expression's type, adding each type encountered to the list, then finally attempting an [unsized coercion] at the end, and adding the result type if that is successful.
Obtain these by repeatedly adding each type encountered in the receiver expression's type's [`Receiver::Target`] to the list, then finally attempting an [unsized coercion] at the end, and adding the result type if that is successful.
Then, for each candidate `T`, add `&T` and `&mut T` to the list immediately after `T`.
For instance, if the receiver has type `Box<[i32;2]>`, then the candidate types will be `Box<[i32;2]>`, `&Box<[i32;2]>`, `&mut Box<[i32;2]>`, `[i32; 2]` (by dereferencing), `&[i32; 2]`, `&mut [i32; 2]`, `[i32]` (by unsized coercion), `&[i32]`, and finally `&mut [i32]`.
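A minimal sketch of the `Box<[i32;2]>` case, using only stable methods. Neither `Box<[i32; 2]>` nor `[i32; 2]` has an inherent `len` or `iter` method, so both calls below are expected to resolve at the `&[i32]` candidate, which is only reached after the unsized coercion step. The function name `candidate_demo` is made up for illustration.

```rust
/// Hypothetical demonstration of method lookup through the candidate list.
fn candidate_demo() -> (usize, i32) {
    let boxed: Box<[i32; 2]> = Box::new([10, 20]);

    // `len` is an inherent method of the slice type `[i32]` taking `&self`,
    // so it is found once the candidate `&[i32]` is considered.
    let length = boxed.len();

    // `iter` is likewise an inherent slice method taking `&self`.
    let total: i32 = boxed.iter().sum();

    (length, total)
}
```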
Then, for each candidate type `T`, search for a [visible] method with a receiver of that type in the following places:
1. `T`'s inherent methods (methods implemented directly on `T`).
1. `T`'s inherent methods, or receivers to `T`'s inherent methods (methods implemented directly on `T`, or on receivers to `T`).
1. Any of the methods provided by a [visible] trait implemented by `T`.
If `T` is a type parameter, methods provided by trait bounds on `T` are looked up first.
Then all remaining methods in scope are looked up.
@ -94,3 +94,4 @@ Just don't define inherent methods on trait objects with the same name as a trai
[methods]: ../items/associated-items.md#methods
[unsized coercion]: ../type-coercions.md#unsized-coercions
[`IntoIterator`]: ../../std/iter/trait.IntoIterator.html
[`Receiver::Target`]: ../../std/ops/trait.Receiver.html#associatedtype.Target

View File

@ -1,3 +1,55 @@
# Input format
Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.
This chapter describes how a source file is interpreted as a sequence of tokens.
See [Crates and source files] for a description of how programs are organised into files.
## Source encoding
Each source file is interpreted as a sequence of Unicode characters encoded in UTF-8.
It is an error if the file is not valid UTF-8.
## Byte order mark removal
If the first character in the sequence is `U+FEFF` ([BYTE ORDER MARK]), it is removed.
## CRLF normalization
Each pair of characters `U+000D` (CR) immediately followed by `U+000A` (LF) is replaced by a single `U+000A` (LF).
Other occurrences of the character `U+000D` (CR) are left in place (they are treated as [whitespace]).
## Shebang removal
If the remaining sequence begins with the characters `#!`, the characters up to and including the first `U+000A` (LF) are removed from the sequence.
For example, the first line of the following file would be ignored:
<!-- ignore: tests don't like shebang -->
```rust,ignore
#!/usr/bin/env rustx
fn main() {
println!("Hello!");
}
```
As an exception, if the `#!` characters are followed (ignoring intervening [comments] or [whitespace]) by a `[` token, nothing is removed.
This prevents an [inner attribute] at the start of a source file being removed.
> **Note**: The standard library [`include!`] macro applies byte order mark removal, CRLF normalization, and shebang removal to the file it reads. The [`include_str!`] and [`include_bytes!`] macros do not.
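The passes described above can be sketched as a single function. This is a simplified illustration rather than the compiler's implementation: the name `preprocess` is hypothetical, the check for a following `[` skips only whitespace (not comments), and removing the whole input when a shebang line has no terminating LF is an assumption.

```rust
/// Hypothetical helper applying byte order mark removal, CRLF normalization,
/// and shebang removal, in that order.
fn preprocess(source: &str) -> String {
    // Byte order mark removal: drop a leading U+FEFF if present.
    let without_bom = source.strip_prefix('\u{FEFF}').unwrap_or(source);

    // CRLF normalization: each CR LF pair becomes a single LF.
    // Other CR characters are left in place.
    let normalized = without_bom.replace("\r\n", "\n");

    // Shebang removal: only applies if the text starts with `#!` and the
    // next non-whitespace character is not `[` (simplification: comments
    // between `#!` and `[` are not handled here).
    if let Some(rest) = normalized.strip_prefix("#!") {
        let looks_like_attribute = rest.trim_start().starts_with('[');
        if !looks_like_attribute {
            return match normalized.find('\n') {
                // Remove everything up to and including the first LF.
                Some(index) => normalized[index + 1..].to_string(),
                // Assumption: with no LF, the whole input is the shebang line.
                None => String::new(),
            };
        }
    }
    normalized
}
```

Note that the order matters: because CRLF normalization runs first, a shebang line ending in CR LF is removed in full.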
## Tokenization
The resulting sequence of characters is then converted into tokens as described in the remainder of this chapter.
[`include!`]: ../std/macro.include.md
[`include_bytes!`]: ../std/macro.include_bytes.md
[`include_str!`]: ../std/macro.include_str.md
[inner attribute]: attributes.md
[BYTE ORDER MARK]: https://en.wikipedia.org/wiki/Byte_order_mark#UTF-8
[comments]: comments.md
[Crates and source files]: crates-and-source-files.md
[_shebang_]: https://en.wikipedia.org/wiki/Shebang_(Unix)
[whitespace]: whitespace.md

View File

@ -53,7 +53,7 @@ mod m {
> &nbsp;&nbsp; | `<` ( _GenericArg_ `,` )<sup>\*</sup> _GenericArg_ `,`<sup>?</sup> `>`
>
> _GenericArg_ :\
> &nbsp;&nbsp; [_Lifetime_] | [_Type_] | _GenericArgsConst_ | _GenericArgsBinding_
> &nbsp;&nbsp; [_Lifetime_] | [_Type_] | _GenericArgsConst_ | _GenericArgsBinding_ | _GenericArgsBounds_
>
> _GenericArgsConst_ :\
> &nbsp;&nbsp; &nbsp;&nbsp; [_BlockExpression_]\
@ -62,7 +62,10 @@ mod m {
> &nbsp;&nbsp; | [_SimplePathSegment_]
>
> _GenericArgsBinding_ :\
> &nbsp;&nbsp; [IDENTIFIER] `=` [_Type_]
> &nbsp;&nbsp; [IDENTIFIER] _GenericArgs_<sup>?</sup> `=` [_Type_]
>
> _GenericArgsBounds_ :\
> &nbsp;&nbsp; [IDENTIFIER] _GenericArgs_<sup>?</sup> `:` [_TypeParamBounds_]
Paths in expressions allow for paths with generic arguments to be specified. They are
used in various places in [expressions] and [patterns].
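
As a small illustration of the simplest _GenericArgsBinding_ form, the sketch below fixes an associated type with `Item = u32`. The function name `sum_all` is hypothetical. The _GenericArgsBounds_ form (for example `Iterator<Item: Clone>`) is deliberately not used here, because it is only accepted by toolchains that support associated type bounds.

```rust
/// `Iterator<Item = u32>` is a path whose generic arguments contain a
/// binding: IDENTIFIER `=` Type.
fn sum_all(values: impl Iterator<Item = u32>) -> u32 {
    values.sum()
}
```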
@ -396,6 +399,7 @@ mod without { // crate::without
[_SimplePathSegment_]: #simple-paths
[_Type_]: types.md#type-expressions
[_TypeNoBounds_]: types.md#type-expressions
[_TypeParamBounds_]: trait-bounds.md
[literal]: expressions/literal-expr.md
[item]: items.md
[variable]: variables.md


@ -234,8 +234,8 @@ shown in the comments after the function prefixed with "out:".
#[proc_macro_attribute]
pub fn show_streams(attr: TokenStream, item: TokenStream) -> TokenStream {
    println!("attr: \"{}\"", attr.to_string());
    println!("item: \"{}\"", item.to_string());
    println!("attr: \"{attr}\"");
    println!("item: \"{item}\"");
    item
}
```
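
The two `println!` styles in this hunk produce the same text for any type that implements `Display`. The sketch below checks that with `format!` on an ordinary value, since `TokenStream` is only usable inside a procedural macro crate. The names `old_style` and `new_style` are hypothetical.

```rust
use std::fmt::Display;

/// Formats through an explicit `to_string()` call and a positional `{}`.
fn old_style<T: Display>(attr: &T) -> String {
    format!("attr: \"{}\"", attr.to_string())
}

/// Formats by capturing the variable directly in the format string.
fn new_style<T: Display>(attr: &T) -> String {
    format!("attr: \"{attr}\"")
}
```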


@ -80,6 +80,7 @@ types:
* Types with a built-in `Copy` implementation (see above)
* [Tuples] of `Clone` types
* [Closures] that only capture values of `Clone` types or capture no values from the environment
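
A minimal sketch of the closure rule: both closures below can be cloned, the first because its only capture is a `Vec<String>` (a `Clone` type), the second because it captures nothing. The names `call_twice` and `demo` are hypothetical.

```rust
/// Calls the closure and a clone of it, which requires `F: Clone`.
fn call_twice<F: Fn() -> usize + Clone>(f: F) -> usize {
    let g = f.clone();
    f() + g()
}

fn demo() -> usize {
    let words = vec![String::from("a"), String::from("b")];
    // Captures `words` by value; `Vec<String>` is `Clone`, so the closure is too.
    let count = move || words.len();
    // Captures nothing from the environment.
    let constant = || 7;
    call_twice(count) + call_twice(constant)
}
```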
## `Send`


@ -37,6 +37,8 @@ Literals are tokens used in [literal expressions].
[^nsets]: The number of `#`s on each side of the same literal must be equivalent.
> **Note**: Character and string literal tokens never include the sequence of `U+000D` (CR) immediately followed by `U+000A` (LF): this pair would have been previously transformed into a single `U+000A` (LF).
#### ASCII escapes
| | Name |
@ -156,13 +158,10 @@ A _string literal_ is a sequence of any Unicode characters enclosed within two
`U+0022` (double-quote) characters, with the exception of `U+0022` itself,
which must be _escaped_ by a preceding `U+005C` character (`\`).
Line-breaks are allowed in string literals.
A line-break is either a newline (`U+000A`) or a pair of carriage return and newline (`U+000D`, `U+000A`).
Both byte sequences are translated to `U+000A`.
Line-breaks, represented by the character `U+000A` (LF), are allowed in string literals.
When an unescaped `U+005C` character (`\`) occurs immediately before a line break, the line break does not appear in the string represented by the token.
See [String continuation escapes] for details.
The character `U+000D` (CR) may not appear in a string literal other than as part of such a string continuation escape.
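
As an illustration, the first literal below keeps its line break as a single `U+000A` (LF), while the second uses a string continuation escape, so neither the line break nor the whitespace after it appears in the resulting string. The function names are hypothetical.

```rust
/// The line break inside the literal is part of the string.
fn with_line_break() -> &'static str {
    "first
second"
}

/// The backslash immediately before the line break starts a string
/// continuation escape: the line break and the following whitespace are skipped.
fn with_continuation() -> &'static str {
    "first\
     second"
}
```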
#### Character escapes
@ -198,10 +197,10 @@ following forms:
Raw string literals do not process any escapes. They start with the character
`U+0072` (`r`), followed by fewer than 256 of the character `U+0023` (`#`) and a
`U+0022` (double-quote) character. The _raw string body_ can contain any sequence
of Unicode characters and is terminated only by another `U+0022` (double-quote)
character, followed by the same number of `U+0023` (`#`) characters that preceded
the opening `U+0022` (double-quote) character.
`U+0022` (double-quote) character.
The _raw string body_ can contain any sequence of Unicode characters other than `U+000D` (CR).
It is terminated only by another `U+0022` (double-quote) character, followed by the same number of `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote) character.
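
A short sketch of both points: escapes are not processed, and a double-quote inside the body does not end the literal unless it is followed by the right number of `#` characters. The function names are hypothetical.

```rust
/// No escape processing: the body is the four characters `a`, `\`, `n`, `b`.
fn raw_backslash() -> &'static str {
    r"a\nb"
}

/// One `#` on each side, so a bare `"` inside the body does not terminate it.
fn raw_with_quotes() -> &'static str {
    r#"say "hi""#
}
```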
All Unicode characters contained in the raw string body represent themselves,
the characters `U+0022` (double-quote) (except when followed by at least as
@ -259,6 +258,11 @@ the literal, it must be _escaped_ by a preceding `U+005C` (`\`) character.
Alternatively, a byte string literal can be a _raw byte string literal_, defined
below.
Line-breaks, represented by the character `U+000A` (LF), are allowed in byte string literals.
When an unescaped `U+005C` character (`\`) occurs immediately before a line break, the line break does not appear in the string represented by the token.
See [String continuation escapes] for details.
The character `U+000D` (CR) may not appear in a byte string literal other than as part of such a string continuation escape.
Some additional _escapes_ are available in either byte or non-raw byte string
literals. An escape starts with a `U+005C` (`\`) and continues with one of the
following forms:
@ -281,19 +285,19 @@ following forms:
> &nbsp;&nbsp; `br` RAW_BYTE_STRING_CONTENT SUFFIX<sup>?</sup>
>
> RAW_BYTE_STRING_CONTENT :\
> &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII<sup>* (non-greedy)</sup> `"`\
> &nbsp;&nbsp; &nbsp;&nbsp; `"` ASCII_FOR_RAW<sup>* (non-greedy)</sup> `"`\
> &nbsp;&nbsp; | `#` RAW_BYTE_STRING_CONTENT `#`
>
> ASCII :\
> &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F)_
> ASCII_FOR_RAW :\
> &nbsp;&nbsp; _any ASCII (i.e. 0x00 to 0x7F) except IsolatedCR_
Raw byte string literals do not process any escapes. They start with the
character `U+0062` (`b`), followed by `U+0072` (`r`), followed by fewer than 256
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
_raw string body_ can contain any sequence of ASCII characters and is terminated
only by another `U+0022` (double-quote) character, followed by the same number of
`U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote)
character. A raw byte string literal can not contain any non-ASCII byte.
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character.
The _raw string body_ can contain any sequence of ASCII characters other than `U+000D` (CR).
It is terminated only by another `U+0022` (double-quote) character, followed by the same number of `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote) character.
A raw byte string literal can not contain any non-ASCII byte.
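
For comparison, the sketch below contrasts a raw byte string with an ordinary byte string containing the same source characters. The function names are hypothetical.

```rust
/// Raw: the four ASCII bytes `\`, `x`, `4`, `1`, with no escape processing.
fn raw_bytes() -> &'static [u8] {
    br"\x41"
}

/// Non-raw: `\x41` is a byte escape, giving the single byte 0x41 (`A`).
fn escaped_bytes() -> &'static [u8] {
    b"\x41"
}
```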
All characters contained in the raw string body represent their ASCII encoding,
the characters `U+0022` (double-quote) (except when followed by at least as
@ -339,6 +343,11 @@ C strings are implicitly terminated by byte `0x00`, so the C string literal
literal `b"\x00"`. Other than the implicit terminator, byte `0x00` is not
permitted within a C string.
Line-breaks, represented by the character `U+000A` (LF), are allowed in C string literals.
When an unescaped `U+005C` character (`\`) occurs immediately before a line break, the line break does not appear in the string represented by the token.
See [String continuation escapes] for details.
The character `U+000D` (CR) may not appear in a C string literal other than as part of such a string continuation escape.
Some additional _escapes_ are available in non-raw C string literals. An escape
starts with a `U+005C` (`\`) and continues with one of the following forms:
@ -381,11 +390,10 @@ c"\xC3\xA6";
Raw C string literals do not process any escapes. They start with the
character `U+0063` (`c`), followed by `U+0072` (`r`), followed by fewer than 256
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character. The
_raw C string body_ can contain any sequence of Unicode characters (other than
`U+0000`) and is terminated only by another `U+0022` (double-quote) character,
followed by the same number of `U+0023` (`#`) characters that preceded the
opening `U+0022` (double-quote) character.
of the character `U+0023` (`#`), and a `U+0022` (double-quote) character.
The _raw C string body_ can contain any sequence of Unicode characters other than `U+0000` (NUL) and `U+000D` (CR).
It is terminated only by another `U+0022` (double-quote) character, followed by the same number of `U+0023` (`#`) characters that preceded the opening `U+0022` (double-quote) character.
All characters contained in the raw C string body represent themselves in UTF-8
encoding. The characters `U+0022` (double-quote) (except when followed by at
@ -630,11 +638,14 @@ Examples of reserved forms:
> **<sup>Lexer</sup>**\
> LIFETIME_TOKEN :\
> &nbsp;&nbsp; &nbsp;&nbsp; `'` [IDENTIFIER_OR_KEYWORD][identifier]\
> &nbsp;&nbsp; &nbsp;&nbsp; `'` [IDENTIFIER_OR_KEYWORD][identifier]
> _(not immediately followed by `'`)_\
> &nbsp;&nbsp; | `'_`
> _(not immediately followed by `'`)_
>
> LIFETIME_OR_LABEL :\
> &nbsp;&nbsp; &nbsp;&nbsp; `'` [NON_KEYWORD_IDENTIFIER][identifier]
> _(not immediately followed by `'`)_
Lifetime parameters and [loop labels] use LIFETIME_OR_LABEL tokens. Any
LIFETIME_TOKEN will be accepted by the lexer, and for example, can be used in


@ -44,17 +44,20 @@ The size of most primitives is given in this table.
| `u32` / `i32` | 4 |
| `u64` / `i64` | 8 |
| `u128` / `i128` | 16 |
| `usize` / `isize` | See below |
| `f32` | 4 |
| `f64` | 8 |
| `char` | 4 |
`usize` and `isize` have a size big enough to contain every address on the
target platform. For example, on a 32 bit target, this is 4 bytes and on a 64
target platform. For example, on a 32 bit target, this is 4 bytes, and on a 64
bit target, this is 8 bytes.
Most primitives are generally aligned to their size, although this is
platform-specific behavior. In particular, on x86 u64 and f64 are only
aligned to 32 bits.
The alignment of primitives is platform-specific.
In most cases, their alignment is equal to their size, but it may be less.
In particular, `i128` and `u128` are often aligned to 4 or 8 bytes even though
their size is 16, and on many 32-bit platforms, `i64`, `u64`, and `f64` are only
aligned to 4 bytes, not 8.
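
The values for a particular target can be inspected with `std::mem::size_of` and `std::mem::align_of`. The sketch below collects them for a few primitives. The alignments it reports depend on the platform it is compiled for, so no specific alignment should be assumed. The function name `primitive_layouts` is hypothetical.

```rust
use std::mem::{align_of, size_of};

/// Returns (name, size, alignment) in bytes for a few primitives
/// on the target this code is compiled for.
fn primitive_layouts() -> Vec<(&'static str, usize, usize)> {
    vec![
        ("u32", size_of::<u32>(), align_of::<u32>()),
        ("u64", size_of::<u64>(), align_of::<u64>()),
        ("u128", size_of::<u128>(), align_of::<u128>()),
        ("usize", size_of::<usize>(), align_of::<usize>()),
        ("f64", size_of::<f64>(), align_of::<f64>()),
        ("char", size_of::<char>(), align_of::<char>()),
    ]
}
```

On every target the alignment is at most the size, and the size is a multiple of the alignment; whether the two are equal for `u64`, `f64` or `u128` is what varies.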
## Pointers and References Layout