Identifiers
Lexer:
IDENTIFIER_OR_KEYWORD :
      XID_Start XID_Continue*
   | _ XID_Continue+
RAW_IDENTIFIER : r# IDENTIFIER_OR_KEYWORD Except crate, self, super, Self
NON_KEYWORD_IDENTIFIER : IDENTIFIER_OR_KEYWORD Except a strict or reserved keyword
IDENTIFIER :
NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER
RESERVED_RAW_IDENTIFIER : r#_
Identifiers follow the specification in Unicode Standard Annex #31 for Unicode version 16.0, with the additions described below. Some examples of identifiers:
foo
_identifier
r#true
Москва
東京
The profile used from UAX #31 is:
- Start := XID_Start, plus the underscore character (U+005F)
- Continue := XID_Continue
- Medial := empty
with the additional constraint that a single underscore character is not an identifier.
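The profile above can be sketched as a small checker. This is only an illustration, not how rustc implements its lexer: the function names are hypothetical, and because the standard library does not expose the XID_Start and XID_Continue properties, the character classes are passed in as predicates. The ASCII-only stand-ins shown here cover just the ASCII subset of those classes. The sketch checks the shape of IDENTIFIER_OR_KEYWORD only and does not reject keywords.

```rust
/// Checks `s` against the identifier profile, given predicates for the
/// Start and Continue character classes (hypothetical helper).
fn is_identifier_with(
    s: &str,
    is_xid_start: impl Fn(char) -> bool,
    is_xid_continue: impl Fn(char) -> bool,
) -> bool {
    let mut chars = s.chars();
    let first = match chars.next() {
        Some(c) => c,
        None => return false, // the empty string is not an identifier
    };
    // Start := XID_Start, plus the underscore character (U+005F).
    if !(first == '_' || is_xid_start(first)) {
        return false;
    }
    // Additional constraint: a single underscore is not an identifier.
    if s == "_" {
        return false;
    }
    // Continue := XID_Continue. Medial is empty, so nothing else is allowed.
    chars.all(|c| is_xid_continue(c))
}

/// ASCII-only stand-in for XID_Start (assumption: keeps the example
/// dependency-free; a full check needs Unicode property tables).
fn ascii_xid_start(c: char) -> bool {
    c.is_ascii_alphabetic()
}

/// ASCII-only stand-in for XID_Continue (same assumption as above).
fn ascii_xid_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Convenience wrapper that applies the ASCII stand-ins.
fn is_ascii_identifier(s: &str) -> bool {
    is_identifier_with(s, ascii_xid_start, ascii_xid_continue)
}
```

With these stand-ins, `foo`, `_identifier`, and `__` are accepted, while `_`, the empty string, and `1abc` are rejected. Accepting non-ASCII identifiers such as `東京` would require predicates backed by real Unicode data.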
Note
Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in rustc.
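As a small illustration of this convention, the following sketch binds values that are never read. The function and variable names are hypothetical and chosen only for the example.

```rust
/// Returns the sum of the first two elements and ignores the third.
fn sum_first_two(values: (i32, i32, i32)) -> i32 {
    // `_third` is bound but never read. The leading underscore tells
    // rustc not to emit an unused-variable warning for it.
    let (first, second, _third) = values;
    first + second
}

/// The same convention applies to function parameters.
fn always_zero(_input: &str) -> i32 {
    0
}
```

Renaming `_third` to `third` would leave the behavior unchanged but would make rustc warn about the unused binding.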
Identifiers may not be a strict or reserved keyword without the r# prefix described below in raw identifiers.
Zero width non-joiner (ZWNJ U+200C) and zero width joiner (ZWJ U+200D) characters are not allowed in identifiers.
Identifiers are restricted to the ASCII subset of XID_Start and XID_Continue in the following situations:
- extern crate declarations (except the AsClause identifier)
- External crate names referenced in a path
- Module names loaded from the filesystem without a path attribute
- no_mangle attributed items
- Item names in external blocks
Normalization
Identifiers are normalized using Normalization Form C (NFC) as defined in Unicode Standard Annex #15. Two identifiers are equal if their NFC forms are equal.
Procedural and declarative macros receive normalized identifiers in their input.
Raw identifiers
A raw identifier is like a normal identifier, but prefixed by r#. (Note that the r# prefix is not included as part of the actual identifier.)
Unlike a normal identifier, a raw identifier may be any strict or reserved keyword except the ones listed above for RAW_IDENTIFIER.
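The following sketch shows raw identifiers in use. The type and function names are hypothetical. It uses the keywords `type` and `match` as a field name and a function name, and it shows that for a non-keyword the raw and plain spellings refer to the same identifier, since the prefix is not part of the name.

```rust
/// A struct with a field named after the strict keyword `type`.
struct Token {
    r#type: String,
}

/// A function named after the strict keyword `match`.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

/// Reads the keyword-named field; the prefix is needed at every use
/// site because `type` on its own is still a keyword.
fn describe(token: &Token) -> String {
    format!("token of type {}", token.r#type)
}

/// For a non-keyword, `r#count` and `count` are the same identifier.
fn raw_and_plain_are_equal() -> i32 {
    let r#count = 41;
    count + 1
}
```

Writing `r#crate`, `r#self`, `r#super`, or `r#Self` in place of these names would be rejected, in line with the exceptions listed for RAW_IDENTIFIER.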
It is an error to use the RESERVED_RAW_IDENTIFIER token r#_ in order to avoid confusion with the WildcardPattern.