The Rust Programming Language
Foreword
It wasn’t always so clear, but the Rust programming language is fundamentally about
empowerment: no matter what kind of code you are writing now, Rust empowers you to reach
farther, to program with confidence in a wider variety of domains than you did before.
Take, for example, “systems-level” work that deals with low-level details of memory management,
data representation, and concurrency. Traditionally, this realm of programming is seen as arcane,
accessible only to a select few who have devoted the necessary years learning to avoid its
infamous pitfalls. And even those who practice it do so with caution, lest their code be open to
exploits, crashes, or corruption.
Rust breaks down these barriers by eliminating the old pitfalls and providing a friendly, polished
set of tools to help you along the way. Programmers who need to “dip down” into lower-level
control can do so with Rust, without taking on the customary risk of crashes or security holes, and
without having to learn the fine points of a fickle toolchain. Better yet, the language is designed to
guide you naturally towards reliable code that is efficient in terms of speed and memory usage.
Programmers who are already working with low-level code can use Rust to raise their ambitions.
For example, introducing parallelism in Rust is a relatively low-risk operation: the compiler will
catch the classical mistakes for you. And you can tackle more aggressive optimizations in your
code with the confidence that you won’t accidentally introduce crashes or vulnerabilities.
But Rust isn’t limited to low-level systems programming. It’s expressive and ergonomic enough to
make CLI apps, web servers, and many other kinds of code quite pleasant to write; you’ll find
simple examples of both later in the book. Working with Rust allows you to build skills that
transfer from one domain to another; you can learn Rust by writing a web app, then apply those
same skills to target your Raspberry Pi.
This book fully embraces the potential of Rust to empower its users. It’s a friendly and
approachable text intended to help you level up not just your knowledge of Rust, but also your
reach and confidence as a programmer in general. So dive in, get ready to learn, and welcome to
the Rust community!
Introduction
Note: This edition of the book is the same as The Rust Programming Language available in
print and ebook format from No Starch Press.
Welcome to The Rust Programming Language, an introductory book about Rust. The Rust
programming language helps you write faster, more reliable software. High-level ergonomics and
low-level control are often at odds in programming language design; Rust challenges that conflict.
Through balancing powerful technical capacity and a great developer experience, Rust gives you
the option to control low-level details (such as memory usage) without all the hassle traditionally
associated with such control.
Teams of Developers
Rust is proving to be a productive tool for collaborating among large teams of developers with
varying levels of systems programming knowledge. Low-level code is prone to a variety of subtle
bugs, which in most other languages can be caught only through extensive testing and careful
code review by experienced developers. In Rust, the compiler plays a gatekeeper role by refusing
to compile code with these elusive bugs, including concurrency bugs. By working alongside the
compiler, the team can spend their time focusing on the program’s logic rather than chasing down
bugs.
Rust also brings contemporary developer tools to the systems programming world:
Cargo, the included dependency manager and build tool, makes adding, compiling, and
managing dependencies painless and consistent across the Rust ecosystem.
Rustfmt ensures a consistent coding style across developers.
The Rust Language Server powers Integrated Development Environment (IDE) integration
for code completion and inline error messages.
By using these and other tools in the Rust ecosystem, developers can be productive while writing
systems-level code.
Students
Rust is for students and those who are interested in learning about systems concepts. Using Rust,
many people have learned about topics like operating systems development. The community is
very welcoming and happy to answer student questions. Through efforts such as this book, the
Rust teams want to make systems concepts more accessible to more people, especially those new
to programming.
Companies
Hundreds of companies, large and small, use Rust in production for a variety of tasks. Those tasks
include command line tools, web services, DevOps tooling, embedded devices, audio and video
analysis and transcoding, cryptocurrencies, bioinformatics, search engines, Internet of Things
applications, machine learning, and even major parts of the Firefox web browser.
Rust is for people who want to build the Rust programming language, community, developer tools,
and libraries. We’d love to have you contribute to the Rust language.
Rust is for people who crave speed and stability in a language. By speed, we mean the speed of
the programs that you can create with Rust and the speed at which Rust lets you write them. The
Rust compiler’s checks ensure stability through feature additions and refactoring. This is in
contrast to the brittle legacy code in languages without these checks, which developers are often
afraid to modify. By striving for zero-cost abstractions, higher-level features that compile to lower-
level code as fast as code written manually, Rust endeavors to make safe code be fast code as
well.
The Rust language hopes to support many other users as well; those mentioned here are merely
some of the biggest stakeholders. Overall, Rust’s greatest ambition is to eliminate the trade-offs
that programmers have accepted for decades by providing safety and productivity, speed and
ergonomics. Give Rust a try and see if its choices work for you.
from a wide variety of programming backgrounds. We don’t spend a lot of time talking about what
programming is or how to think about it. If you’re entirely new to programming, you would be
better served by reading a book that specifically provides an introduction to programming.
You’ll find two kinds of chapters in this book: concept chapters and project chapters. In concept
chapters, you’ll learn about an aspect of Rust. In project chapters, we’ll build small programs
together, applying what you’ve learned so far. Chapters 2, 12, and 20 are project chapters; the rest
are concept chapters.
Chapter 1 explains how to install Rust, how to write a Hello, world! program, and how to use
Cargo, Rust’s package manager and build tool. Chapter 2 is a hands-on introduction to the Rust
language. Here we cover concepts at a high level, and later chapters will provide additional detail.
If you want to get your hands dirty right away, Chapter 2 is the place for that. At first, you might
even want to skip Chapter 3, which covers Rust features similar to those of other programming
languages, and head straight to Chapter 4 to learn about Rust’s ownership system. However, if
you’re a particularly meticulous learner who prefers to learn every detail before moving on to the
next, you might want to skip Chapter 2 and go straight to Chapter 3, returning to Chapter 2 when
you’d like to work on a project applying the details you’ve learned.
Chapter 5 discusses structs and methods, and Chapter 6 covers enums, match expressions, and
the if let control flow construct. You’ll use structs and enums to make custom types in Rust.
In Chapter 7, you’ll learn about Rust’s module system and about privacy rules for organizing your
code and its public Application Programming Interface (API). Chapter 8 discusses some common
collection data structures that the standard library provides, such as vectors, strings, and hash
maps. Chapter 9 explores Rust’s error-handling philosophy and techniques.
Chapter 10 digs into generics, traits, and lifetimes, which give you the power to define code that
applies to multiple types. Chapter 11 is all about testing, which even with Rust’s safety guarantees
is necessary to ensure your program’s logic is correct. In Chapter 12, we’ll build our own
implementation of a subset of functionality from the grep command line tool that searches for
text within files. For this, we’ll use many of the concepts we discussed in the previous chapters.
Chapter 13 explores closures and iterators: features of Rust that come from functional
programming languages. In Chapter 14, we’ll examine Cargo in more depth and talk about best
practices for sharing your libraries with others. Chapter 15 discusses smart pointers that the
standard library provides and the traits that enable their functionality.
In Chapter 16, we’ll walk through different models of concurrent programming and talk about how
Rust helps you to program in multiple threads fearlessly. Chapter 17 looks at how Rust idioms
compare to object-oriented programming principles you might be familiar with.
Chapter 18 is a reference on patterns and pattern matching, which are powerful ways of
expressing ideas throughout Rust programs. Chapter 19 contains a smorgasbord of advanced
topics of interest, including unsafe Rust, macros, and more about lifetimes, traits, types, functions,
and closures.
In Chapter 20, we’ll complete a project in which we’ll implement a low-level multithreaded web
server!
Finally, some appendixes contain useful information about the language in a more reference-like
format. Appendix A covers Rust’s keywords, Appendix B covers Rust’s operators and symbols,
Appendix C covers derivable traits provided by the standard library, Appendix D covers some
useful development tools, and Appendix E explains Rust editions.
There is no wrong way to read this book: if you want to skip ahead, go for it! You might have to
jump back to earlier chapters if you experience any confusion. But do whatever works for you.
An important part of the process of learning Rust is learning how to read the error messages the
compiler displays: these will guide you toward working code. As such, we’ll provide many
examples of code that doesn’t compile along with the error message the compiler will show you in
each situation. Know that if you enter and run a random example, it may not compile! Make sure
you read the surrounding text to see whether the example you’re trying to run is meant to error.
Ferris will also help you distinguish code that isn’t meant to work:
[Table: Ferris images and their meanings; the images are not reproduced here]
In most situations, we’ll lead you to the correct version of any code that doesn’t compile.
Source Code
The source files from which this book is generated can be found on GitHub.
Getting Started
Let’s start your Rust journey! There’s a lot to learn, but every journey starts somewhere. In this
chapter, we’ll discuss:
Installation
The first step is to install Rust. We’ll download Rust through rustup, a command line tool for
managing Rust versions and associated tools. You’ll need an internet connection for the download.
Note: If you prefer not to use rustup for some reason, please see the Rust installation page
for other options.
The following steps install the latest stable version of the Rust compiler. Rust’s stability guarantees
ensure that all the examples in the book that compile will continue to compile with newer Rust
versions. The output might differ slightly between versions, because Rust often improves error
messages and warnings. In other words, any newer, stable version of Rust you install using these
steps should work as expected with the content of this book.
In this chapter and throughout the book, we’ll show some commands used in the terminal.
Lines that you should enter in a terminal all start with $ . You don’t need to type in the $
character; it indicates the start of each command. Lines that don’t start with $ typically
show the output of the previous command. Additionally, PowerShell-specific examples will
use > rather than $.
If you’re using Linux or macOS, open a terminal and enter the following command:
The command downloads a script and starts the installation of the rustup tool, which installs the
latest stable version of Rust. You might be prompted for your password. If the install is successful,
the following line will appear:
If you prefer, feel free to download the script and inspect it before running it.
The installation script automatically adds Rust to your system PATH after your next login. If you
want to start using Rust right away instead of restarting your terminal, run the following command
in your shell to add Rust to your system PATH manually:
$ source $HOME/.cargo/env
$ export PATH="$HOME/.cargo/bin:$PATH"
Additionally, you’ll need a linker of some kind. It’s likely one is already installed, but when you try
to compile a Rust program and get errors indicating that a linker could not execute, that means a
linker isn’t installed on your system and you’ll need to install one manually. C compilers usually
come with the correct linker. Check your platform’s documentation for how to install a C compiler.
Also, some common Rust packages depend on C code and will need a C compiler. Therefore, it
might be worth installing one now.
The rest of this book uses commands that work in both cmd.exe and PowerShell. If there are
specific differences, we’ll explain which to use.
After you’ve installed Rust via rustup , updating to the latest version is easy. From your shell, run
the following update script:
$ rustup update
To uninstall Rust and rustup , run the following uninstall script from your shell:
Troubleshooting
To check whether you have Rust installed correctly, open a shell and enter this line:
$ rustc --version
You should see the version number, commit hash, and commit date for the latest stable version
that has been released in the following format:
If you see this information, you have installed Rust successfully! If you don’t see this information
and you’re on Windows, check that Rust is in your %PATH% system variable. If that’s all correct and
Rust still isn’t working, there are a number of places you can get help. The easiest is the #rust IRC
channel on irc.mozilla.org, which you can access through Mibbit. At that address you can chat with
other Rustaceans (a silly nickname we call ourselves) who can help you out. Other great resources
include the Users forum and Stack Overflow.
Local Documentation
The installer also includes a copy of the documentation locally, so you can read it offline. Run
rustup doc to open the local documentation in your browser.
Any time a type or function is provided by the standard library and you’re not sure what it does or
how to use it, use the application programming interface (API) documentation to find out!
Hello, World!
Now that you’ve installed Rust, let’s write your first Rust program. It’s traditional when learning a
new language to write a little program that prints the text Hello, world! to the screen, so we’ll
do the same here!
Note: This book assumes basic familiarity with the command line. Rust makes no specific
demands about your editing or tooling or where your code lives, so if you prefer to use an
integrated development environment (IDE) instead of the command line, feel free to use
your favorite IDE. Many IDEs now have some degree of Rust support; check the IDE’s
documentation for details. Recently, the Rust team has been focusing on enabling great IDE
support, and progress has been made rapidly on that front!
You’ll start by making a directory to store your Rust code. It doesn’t matter to Rust where your
code lives, but for the exercises and projects in this book, we suggest making a projects directory in
your home directory and keeping all your projects there.
Open a terminal and enter the following commands to make a projects directory and a directory
for the Hello, world! project within the projects directory.
$ mkdir ~/projects
$ cd ~/projects
$ mkdir hello_world
$ cd hello_world
Next, make a new source file and call it main.rs. Rust files always end with the .rs extension. If
you’re using more than one word in your filename, use an underscore to separate them. For
example, use hello_world.rs rather than helloworld.rs.
Now open the main.rs file you just created and enter the code in Listing 1-1.
Filename: main.rs
fn main() {
    println!("Hello, world!");
}
Save the file and go back to your terminal window. On Linux or macOS, enter the following
commands to compile and run the file:
$ rustc main.rs
$ ./main
Hello, world!
Regardless of your operating system, the string Hello, world! should print to the terminal. If you
don’t see this output, refer back to the “Troubleshooting” part of the Installation section for ways
to get help.
If Hello, world! did print, congratulations! You’ve officially written a Rust program. That makes
you a Rust programmer—welcome!
Let’s review in detail what just happened in your Hello, world! program. Here’s the first piece of
the puzzle:
fn main() {
These lines define a function in Rust. The main function is special: it is always the first code that
runs in every executable Rust program. The first line declares a function named main that has no
parameters and returns nothing. If there were parameters, they would go inside the parentheses,
().
Also, note that the function body is wrapped in curly brackets, {} . Rust requires these around all
function bodies. It’s good style to place the opening curly bracket on the same line as the function
declaration, adding one space in between.
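To make the point about parameters concrete, here is a minimal sketch of a function that does take parameters. The add function and its parameter names are hypothetical and are not part of the Hello, world! program; return types like the one shown here are covered properly in a later chapter.

```rust
// `add` is a hypothetical helper used only for illustration.
// Parameters go inside the parentheses, each with a name and a type.
fn add(left: i32, right: i32) -> i32 {
    // The last expression, written without a semicolon, is the return value.
    left + right
}

fn main() {
    // The function body is wrapped in curly brackets, just like `main`.
    let sum = add(2, 3);
    println!("2 + 3 = {}", sum);
}
```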
At the time of this writing, an automatic formatter tool called rustfmt is under development. If
you want to stick to a standard style across Rust projects, rustfmt will format your code in a
particular style. The Rust team plans to eventually include this tool with the standard Rust
distribution, like rustc . So depending on when you read this book, it might already be installed
on your computer! Check the online documentation for more details.
println!("Hello, world!");
This line does all the work in this little program: it prints text to the screen. There are four
important details to notice here. First, Rust style is to indent with four spaces, not a tab.
Second, println! calls a Rust macro. If it called a function instead, it would be entered as
println (without the ! ). We’ll discuss Rust macros in more detail in Chapter 19. For now, you just
need to know that using a ! means that you’re calling a macro instead of a normal function.
Third, you see the "Hello, world!" string. We pass this string as an argument to println! , and
the string is printed to the screen.
Fourth, we end the line with a semicolon ( ; ), which indicates that this expression is over and the
next one is ready to begin. Most lines of Rust code end with a semicolon.
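The four details above can be seen together in a slightly larger sketch. It assumes a hypothetical greeting helper that is not part of the book’s listing, and it uses format!, another standard library macro that builds a string instead of printing it.

```rust
// `greeting` is a hypothetical helper used only for illustration.
fn greeting(name: &str) -> String {
    // `format!` ends with `!`, so it is a macro, just like `println!`.
    // The `{}` placeholder is replaced by the value of `name`.
    format!("Hello, {}!", name)
}

fn main() {
    // The body is indented with four spaces, and each statement ends with a semicolon.
    let message = greeting("world");
    println!("{}", message);
}
```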
You’ve just run a newly created program, so let’s examine each step in the process.
Before running a Rust program, you must compile it using the Rust compiler by entering the
rustc command and passing it the name of your source file, like this:
$ rustc main.rs
If you have a C or C++ background, you’ll notice that this is similar to gcc or clang . After
compiling successfully, Rust outputs a binary executable.
On Linux and macOS you can see the executable by entering the ls command in your shell as
follows:
$ ls
main main.rs
With PowerShell on Windows, you can use ls as well, but you’ll see three files:
> ls
Directory: Path:\to\the\project
With CMD on Windows, you would enter the following:
> dir /B %= the /B option says to only show the file names =%
main.exe
main.pdb
main.rs
This shows the source code file with the .rs extension, the executable file (main.exe on Windows,
but main on all other platforms), and, when using Windows, a file containing debugging information
with the .pdb extension. From here, you run the main or main.exe file, like this:
If main.rs was your Hello, world! program, this line would print Hello, world! to your terminal.
If you’re more familiar with a dynamic language, such as Ruby, Python, or JavaScript, you might not
be used to compiling and running a program as separate steps. Rust is an ahead-of-time compiled
language, meaning you can compile a program and give the executable to someone else, and they
can run it even without having Rust installed. If you give someone a .rb, .py, or .js file, they need to
have a Ruby, Python, or JavaScript implementation installed (respectively). But in those languages,
you only need one command to compile and run your program. Everything is a trade-off in
language design.
Just compiling with rustc is fine for simple programs, but as your project grows, you’ll want to
manage all the options and make it easy to share your code. Next, we’ll introduce you to the Cargo
tool, which will help you write real-world Rust programs.
Hello, Cargo!
Cargo is Rust’s build system and package manager. Most Rustaceans use this tool to manage their
Rust projects because Cargo handles a lot of tasks for you, such as building your code,
downloading the libraries your code depends on, and building those libraries. (We call libraries
your code needs dependencies.)
The simplest Rust programs, like the one we’ve written so far, don’t have any dependencies. So if
we had built the Hello, world! project with Cargo, it would only use the part of Cargo that handles
building your code. As you write more complex Rust programs, you’ll add dependencies, and if you
start a project using Cargo, adding dependencies will be much easier to do.
Because the vast majority of Rust projects use Cargo, the rest of this book assumes that you’re
using Cargo too. Cargo comes installed with Rust if you used the official installers discussed in the
“Installation” section. If you installed Rust through some other means, check whether Cargo is
installed by entering the following into your terminal:
$ cargo --version
If you see a version number, you have it! If you see an error, such as command not found , look at
the documentation for your method of installation to determine how to install Cargo separately.
Let’s create a new project using Cargo and look at how it differs from our original Hello, world!
project. Navigate back to your projects directory (or wherever you decided to store your code).
Then, on any operating system, run the following:
The first command creates a new directory called hello_cargo. We’ve named our project
hello_cargo, and Cargo creates its files in a directory of the same name.
Go into the hello_cargo directory and list the files. You’ll see that Cargo has generated two files and
one directory for us: a Cargo.toml file and a src directory with a main.rs file inside. It has also
initialized a new Git repository along with a .gitignore file.
Note: Git is a common version control system. You can change cargo new to use a different
version control system or no version control system by using the --vcs flag. Run
cargo new --help to see the available options.
Open Cargo.toml in your text editor of choice. It should look similar to the code in Listing 1-2.
Filename: Cargo.toml
[package]
name = "hello_cargo"
version = "0.1.0"
authors = ["Your Name <you@example.com>"]
edition = "2018"
[dependencies]
This file is in the TOML (Tom’s Obvious, Minimal Language) format, which is Cargo’s configuration
format.
The first line, [package], is a section heading that indicates that the following statements are
configuring a package. As we add more information to this file, we’ll add other sections.
The next four lines set the configuration information Cargo needs to compile your program: the
name, the version, who wrote it, and the edition of Rust to use. Cargo gets your name and email
information from your environment, so if that information is not correct, fix the information now
and then save the file. We’ll talk about the edition key in Appendix E.
The last line, [dependencies] , is the start of a section for you to list any of your project’s
dependencies. In Rust, packages of code are referred to as crates. We won’t need any other crates
for this project, but we will in the first project in Chapter 2, so we’ll use this dependencies section
then.
Filename: src/main.rs
fn main() {
    println!("Hello, world!");
}
Cargo has generated a Hello, world! program for you, just like the one we wrote in Listing 1-1! So
far, the differences between our previous project and the project Cargo generates are that Cargo
placed the code in the src directory, and we have a Cargo.toml configuration file in the top
directory.
Cargo expects your source files to live inside the src directory. The top-level project directory is just
for README files, license information, configuration files, and anything else not related to your
code. Using Cargo helps you organize your projects. There’s a place for everything, and everything
is in its place.
If you started a project that doesn’t use Cargo, as we did with the Hello, world! project, you can
convert it to a project that does use Cargo. Move the project code into the src directory and create
an appropriate Cargo.toml file.
Now let’s look at what’s different when we build and run the Hello, world! program with Cargo!
From your hello_cargo directory, build your project by entering the following command:
$ cargo build
Compiling hello_cargo v0.1.0 (file:///projects/hello_cargo)
Finished dev [unoptimized + debuginfo] target(s) in 2.85 secs
This command creates an executable file in target/debug/hello_cargo. If all goes well, running
./target/debug/hello_cargo should print Hello, world! to the terminal. Running cargo build for the first
time also causes Cargo to create a new file at the top level: Cargo.lock. This file keeps track of the
exact versions of dependencies in your project. This project doesn’t have dependencies, so the file
is a bit sparse. You won’t ever need to change this �le manually; Cargo manages its contents for
you.
We just built a project with cargo build and ran it with ./target/debug/hello_cargo , but we
can also use cargo run to compile the code and then run the resulting executable all in one
command:
$ cargo run
Finished dev [unoptimized + debuginfo] target(s) in 0.0 secs
Running `target/debug/hello_cargo`
Hello, world!
Notice that this time we didn’t see output indicating that Cargo was compiling hello_cargo. Cargo
figured out that the files hadn’t changed, so it just ran the binary. If you had modified your source
code, Cargo would have rebuilt the project before running it, and you would have seen this
output:
$ cargo run
Compiling hello_cargo v0.1.0 (file:///projects/hello_cargo)
Finished dev [unoptimized + debuginfo] target(s) in 0.33 secs
Running `target/debug/hello_cargo`
Hello, world!
Cargo also provides a command called cargo check . This command quickly checks your code to
make sure it compiles but doesn’t produce an executable:
$ cargo check
Checking hello_cargo v0.1.0 (file:///projects/hello_cargo)
Finished dev [unoptimized + debuginfo] target(s) in 0.32 secs
Why would you not want an executable? Often, cargo check is much faster than cargo build ,
because it skips the step of producing an executable. If you’re continually checking your work
while writing the code, using cargo check will speed up the process! As such, many Rustaceans
run cargo check periodically as they write their program to make sure it compiles. Then they run
cargo build when they’re ready to use the executable.
An additional advantage of using Cargo is that the commands are the same no matter which
operating system you’re working on. So, at this point, we’ll no longer provide specific instructions
for Linux and macOS versus Windows.
When your project is finally ready for release, you can use cargo build --release to compile it
with optimizations. This command will create an executable in target/release instead of
target/debug. The optimizations make your Rust code run faster, but turning them on lengthens
the time it takes for your program to compile. This is why there are two different profiles: one for
development, when you want to rebuild quickly and often, and another for building the final
program you’ll give to a user that won’t be rebuilt repeatedly and that will run as fast as possible. If
you’re benchmarking your code’s running time, be sure to run cargo build --release and
benchmark with the executable in target/release.
Cargo as Convention
With simple projects, Cargo doesn’t provide a lot of value over just using rustc , but it will prove
its worth as your programs become more intricate. With complex projects composed of multiple
crates, it’s much easier to let Cargo coordinate the build.
Even though the hello_cargo project is simple, it now uses much of the real tooling you’ll use in
the rest of your Rust career. In fact, to work on any existing projects, you can use the following
commands to check out the code using Git, change to that project’s directory, and build:
Summary
You’re already off to a great start on your Rust journey! In this chapter, you’ve learned how to:
This is a great time to build a more substantial program to get used to reading and writing Rust
code. So, in Chapter 2, we’ll build a guessing game program. If you would rather start by learning
how common programming concepts work in Rust, see Chapter 3 and then return to Chapter 2.
We’ll implement a classic beginner programming problem: a guessing game. Here’s how it works:
the program will generate a random integer between 1 and 100. It will then prompt the player to
enter a guess. After a guess is entered, the program will indicate whether the guess is too low or
too high. If the guess is correct, the game will print a congratulatory message and exit.
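The comparison at the heart of the game can be sketched on its own before any input handling is written. This is only an illustration under stated assumptions: the check_guess function, the message strings, and the fixed secret number 42 are hypothetical stand-ins, and a real game would generate the secret number randomly.

```rust
use std::cmp::Ordering;

// `check_guess` is a hypothetical helper: it compares a guess to the secret
// number and returns the message the game would show.
fn check_guess(guess: u32, secret: u32) -> &'static str {
    match guess.cmp(&secret) {
        Ordering::Less => "Too small!",
        Ordering::Greater => "Too big!",
        Ordering::Equal => "You win!",
    }
}

fn main() {
    // Assumption: a fixed secret number stands in for a random one.
    let secret = 42;
    for &guess in [10, 90, 42].iter() {
        println!("{} -> {}", guess, check_guess(guess, secret));
    }
}
```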
The first command, cargo new, takes the name of the project (guessing_game) as the first
argument. The second command changes to the new project’s directory.
Filename: Cargo.toml
[package]
name = "guessing_game"
version = "0.1.0"
authors = ["Your Name <you@example.com>"]
edition = "2018"
[dependencies]
If the author information that Cargo obtained from your environment is not correct, fix that in the
file and save it again.
As you saw in Chapter 1, cargo new generates a “Hello, world!” program for you. Check out the
src/main.rs file:
Filename: src/main.rs
fn main() {
    println!("Hello, world!");
}
Now let’s compile this “Hello, world!” program and run it in the same step using the cargo run
command:
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 1.50 secs
Running `target/debug/guessing_game`
Hello, world!
The run command comes in handy when you need to rapidly iterate on a project, as we’ll do in
this game, quickly testing each iteration before moving on to the next one.
Reopen the src/main.rs file. You’ll be writing all the code in this file.
Processing a Guess
The first part of the guessing game program will ask for user input, process that input, and check
that the input is in the expected form. To start, we’ll allow the player to input a guess. Enter the
code in Listing 2-1 into src/main.rs.
Filename: src/main.rs
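use std::io;
fn main() {
    println!("Guess the number!");
    println!("Please input your guess.");
    let mut guess = String::new();
    io::stdin().read_line(&mut guess)
        .expect("Failed to read line");
    println!("You guessed: {}", guess);
}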
Listing 2-1: Code that gets a guess from the user and prints it
This code contains a lot of information, so let’s go over it line by line. To obtain user input and then
print the result as output, we need to bring the io (input/output) library into scope. The io
library comes from the standard library (which is known as std ):
use std::io;
By default, Rust brings only a few types into the scope of every program in the prelude. If a type
you want to use isn’t in the prelude, you have to bring that type into scope explicitly with a use
statement. Using the std::io library provides you with a number of useful features, including the
ability to accept user input.
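To see what the use statement buys you, here is a small sketch that reads one line through the io library without needing a keyboard. It is an illustration rather than part of the game: the read_guess helper is hypothetical, and an in-memory io::Cursor stands in for standard input so the example can run anywhere.

```rust
// `self` brings `io` into scope; `BufRead` is the trait that provides `read_line`.
use std::io::{self, BufRead};

// `read_guess` is a hypothetical helper: it reads one line from any buffered reader.
fn read_guess<R: BufRead>(mut reader: R) -> String {
    let mut guess = String::new();
    reader
        .read_line(&mut guess)
        .expect("Failed to read line");
    guess
}

fn main() {
    // Assumption: an in-memory cursor stands in for what the player would type.
    let fake_input = io::Cursor::new("42\n");
    let guess = read_guess(fake_input);
    // `read_line` keeps the trailing newline, so trim it before printing.
    println!("You guessed: {}", guess.trim());
}
```

Without the use statement, the same type could still be reached by writing the full path, std::io::Cursor, each time.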
As you saw in Chapter 1, the main function is the entry point into the program:
fn main() {
The fn syntax declares a new function, the parentheses, () , indicate there are no parameters,
and the curly bracket, { , starts the body of the function.
As you also learned in Chapter 1, println! is a macro that prints a string to the screen:
println!("Guess the number!");
println!("Please input your guess.");
This code is printing a prompt stating what the game is and requesting input from the user.
Next, we’ll create a place to store the user input, like this:
let mut guess = String::new();
Now the program is getting interesting! There’s a lot going on in this little line. Notice that this is a
let statement, which is used to create a variable. Here’s another example:
let foo = bar;
This line creates a new variable named foo and binds it to the value of the bar variable. In Rust,
variables are immutable by default. We’ll be discussing this concept in detail in the “Variables and
Mutability” section in Chapter 3. The following example shows how to use mut before the variable
name to make a variable mutable:
let foo = 5; // immutable
let mut bar = 5; // mutable
Note: The // syntax starts a comment that continues until the end of the line. Rust ignores
everything in comments, which are discussed in more detail in Chapter 3.
Let's return to the guessing game program. You now know that let mut guess will introduce a
mutable variable named guess . On the other side of the equal sign ( = ) is the value that guess is
bound to, which is the result of calling String::new , a function that returns a new instance of a
String . String is a string type provided by the standard library that is a growable, UTF-8
encoded bit of text.
The :: syntax in the ::new line indicates that new is an associated function of the String type.
An associated function is implemented on a type, in this case String , rather than on a particular
instance of a String . Some languages call this a static method.
This new function creates a new, empty string. You’ll find a new function on many types, because
it’s a common name for a function that makes a new value of some kind.
To summarize, the let mut guess = String::new(); line has created a mutable variable that is
currently bound to a new, empty instance of a String . Whew!
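To make the difference between an associated function and a method concrete, here is a small stand-alone sketch. The helper name build_prompt is made up for this illustration and is not part of the guessing game; the program only shows that String::new is called on the type, while push_str is called on a mutable instance.

```rust
// Hypothetical helper for illustration; not part of the guessing game.
fn build_prompt() -> String {
    // `String::new` is an associated function: it is called on the type itself.
    let mut prompt = String::new();
    // `push_str` is a method: it is called on a particular instance.
    // These calls compile only because `prompt` was declared with `mut`.
    prompt.push_str("Please input ");
    prompt.push_str("your guess.");
    prompt
}

fn main() {
    let empty = String::new();
    println!("a new string has length {}", empty.len());
    println!("{}", build_prompt());
}
```

If you remove mut from the declaration of prompt, the compiler should reject the push_str calls, which is the immutability-by-default rule at work.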
Recall that we included the input/output functionality from the standard library with
use std::io; on the first line of the program. Now we’ll call the stdin function from the io
module:
io::stdin().read_line(&mut guess)
    .expect("Failed to read line");
If we hadn’t listed the use std::io line at the beginning of the program, we could have written
this function call as std::io::stdin . The stdin function returns an instance of std::io::Stdin ,
which is a type that represents a handle to the standard input for your terminal.
The next part of the code, .read_line(&mut guess) , calls the read_line method on the standard
input handle to get input from the user. We’re also passing one argument to read_line :
&mut guess .
The job of read_line is to take whatever the user types into standard input and place that into a
string, so it takes that string as an argument. The string argument needs to be mutable so the
method can change the string’s content by adding the user input.
The & indicates that this argument is a reference, which gives you a way to let multiple parts of
your code access one piece of data without needing to copy that data into memory multiple times.
References are a complex feature, and one of Rust’s major advantages is how safe and easy it is to
use references. You don’t need to know a lot of those details to finish this program. For now, all
you need to know is that like variables, references are immutable by default. Hence, you need to
write &mut guess rather than &guess to make it mutable. (Chapter 4 will explain references more
thoroughly.)
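The following sketch contrasts the two kinds of references mentioned above. The functions length_of and append_newline are hypothetical names chosen for this example; they simply show that a function taking &String can read the caller’s data, while a function taking &mut String can change it in place.

```rust
// Hypothetical helpers for illustration only.

// Takes an immutable reference: it can read the string but not change it.
fn length_of(text: &String) -> usize {
    text.len()
}

// Takes a mutable reference: it can change the caller's string in place.
fn append_newline(text: &mut String) {
    text.push('\n');
}

fn main() {
    let mut guess = String::from("42");
    let before = length_of(&guess);
    // Writing `&guess` here instead of `&mut guess` would not compile.
    append_newline(&mut guess);
    let after = length_of(&guess);
    println!("length went from {} to {}", before, after);
}
```

In neither call is the string copied; both functions work on the single String owned by main.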
We’re not quite done with this line of code. Although what we’ve discussed so far is a single line of
text, it’s only the first part of the single logical line of code. The second part is this method:
.expect("Failed to read line");
When you call a method with the .foo() syntax, it’s often wise to introduce a newline and other
whitespace to help break up long lines. We could have written this code as:
io::stdin().read_line(&mut guess).expect("Failed to read line");
However, one long line is difficult to read, so it’s best to divide it: two lines for two method calls.
Now let’s discuss what this line does.
As mentioned earlier, read_line puts what the user types into the string we’re passing it, but it
also returns a value—in this case, an io::Result . Rust has a number of types named Result in
its standard library: a generic Result as well as specific versions for submodules, such as
io::Result .
The Result types are enumerations, often referred to as enums. An enumeration is a type that can
have a fixed set of values, and those values are called the enum’s variants. Chapter 6 will cover
enums in more detail.
For Result , the variants are Ok or Err . The Ok variant indicates the operation was successful,
and inside Ok is the successfully generated value. The Err variant means the operation failed,
and Err contains information about how or why the operation failed.
The purpose of these Result types is to encode error-handling information. Values of the Result
type, like values of any type, have methods defined on them. An instance of io::Result has an
expect method that you can call. If this instance of io::Result is an Err value, expect will
cause the program to crash and display the message that you passed as an argument to expect .
If the read_line method returns an Err , it would likely be the result of an error coming from the
underlying operating system. If this instance of io::Result is an Ok value, expect will take the
return value that Ok is holding and return just that value to you so you can use it. In this case, that
value is the number of bytes in what the user entered into standard input.
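To see the Ok value that read_line produces without typing at a keyboard, we can sketch the same call against an in-memory reader. This is an assumption made for the example: the helper read_guess is hypothetical, and it uses the read_line method of the BufRead trait on a byte slice as a stand-in for the standard input handle.

```rust
use std::io::{self, BufRead};

// Hypothetical helper: reads one line from any buffered reader into `buffer`
// and passes along the `io::Result` that `read_line` returns.
fn read_guess<R: BufRead>(mut reader: R, buffer: &mut String) -> io::Result<usize> {
    reader.read_line(buffer)
}

fn main() {
    // A byte slice implements `BufRead`, so it can stand in for the keyboard.
    let fake_input: &[u8] = b"6\n";
    let mut guess = String::new();
    // `expect` hands back the value inside `Ok`, or panics with this message on `Err`.
    let bytes = read_guess(fake_input, &mut guess).expect("Failed to read line");
    println!("read {} bytes into {:?}", bytes, guess);
}
```

The byte count includes the newline, which is why the game later has to trim the string before parsing it.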
If you don’t call expect , the program will compile, but you’ll get a warning:
$ cargo build
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
warning: unused `std::result::Result` which must be used
--> src/main.rs:10:5
|
10 | io::stdin().read_line(&mut guess);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: #[warn(unused_must_use)] on by default
Rust warns that you haven’t used the Result value returned from read_line , indicating that the
program hasn’t handled a possible error.
The right way to suppress the warning is to actually write error handling, but because you just
want to crash this program when a problem occurs, you can use expect . You’ll learn about
recovering from errors in Chapter 9.
Aside from the closing curly brackets, there’s only one more line to discuss in the code added so
far, which is the following:
println!("You guessed: {}", guess);
This line prints the string we saved the user’s input in. The set of curly brackets, {} , is a
placeholder: think of {} as little crab pincers that hold a value in place. You can print more than
one value using curly brackets: the first set of curly brackets holds the first value listed after the
format string, the second set holds the second value, and so on. Printing multiple values in one
call to println! would look like this:
let x = 5;
let y = 10;
println!("x = {} and y = {}", x, y);
This code would print x = 5 and y = 10.
Let’s test the first part of the guessing game. Run it using cargo run :
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 2.53 secs
Running `target/debug/guessing_game`
Guess the number!
Please input your guess.
6
You guessed: 6
At this point, the first part of the game is done: we’re getting input from the keyboard and then
printing it.
Remember that a crate is a collection of Rust source code files. The project we’ve been building is
a binary crate, which is an executable. The rand crate is a library crate, which contains code
intended to be used in other programs.
Cargo’s use of external crates is where it really shines. Before we can write code that uses rand ,
we need to modify the Cargo.toml file to include the rand crate as a dependency. Open that file
now and add the following line to the bottom beneath the [dependencies] section header that
Cargo created for you:
Filename: Cargo.toml
[dependencies]
rand = "0.3.14"
In the Cargo.toml file, everything that follows a header is part of a section that continues until
another section starts. The [dependencies] section is where you tell Cargo which external crates
your project depends on and which versions of those crates you require. In this case, we’ll specify
the rand crate with the semantic version specifier 0.3.14 . Cargo understands Semantic
Versioning (sometimes called SemVer), which is a standard for writing version numbers. The
number 0.3.14 is actually shorthand for ^0.3.14 , which means “any version that has a public
API compatible with version 0.3.14.”
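To get a feel for what “compatible” means here, the caret rule can be sketched as a small function. This is a simplified model written for this example, not Cargo’s actual implementation: the Version alias and the caret_matches function are hypothetical, versions are plain (major, minor, patch) tuples, and pre-release versions are ignored.

```rust
// Simplified sketch of the caret rule; this is NOT Cargo's implementation.
// A version is modeled as (major, minor, patch).
type Version = (u64, u64, u64);

// Returns true if `candidate` satisfies the caret requirement `^req`.
fn caret_matches(req: Version, candidate: Version) -> bool {
    // Tuples compare component by component, so this rejects older versions.
    if candidate < req {
        return false;
    }
    // The leftmost non-zero component of the requirement must stay the same.
    if req.0 > 0 {
        candidate.0 == req.0
    } else if req.1 > 0 {
        candidate.0 == 0 && candidate.1 == req.1
    } else {
        candidate == req
    }
}

fn main() {
    let req = (0, 3, 14);
    for candidate in &[(0, 3, 13), (0, 3, 14), (0, 3, 15), (0, 4, 0)] {
        println!("{:?} matches ^0.3.14: {}", candidate, caret_matches(req, *candidate));
    }
}
```

Under this model, 0.3.15 is accepted for the requirement 0.3.14 while 0.4.0 is not, because for a 0.x version the minor number is the one that signals a breaking change.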
Now, without changing any of the code, let’s build the project, as shown in Listing 2-2.
$ cargo build
Updating registry `https://github.com/rust-lang/crates.io-index`
Downloading rand v0.3.14
Downloading libc v0.2.14
Compiling libc v0.2.14
Compiling rand v0.3.14
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 2.53 secs
Listing 2-2: The output from running cargo build after adding the rand crate as a dependency
You may see different version numbers (but they will all be compatible with the code, thanks to
SemVer!), and the lines may be in a different order.
Now that we have an external dependency, Cargo fetches the latest versions of everything from
the registry, which is a copy of data from Crates.io. Crates.io is where people in the Rust ecosystem
post their open source Rust projects for others to use.
After updating the registry, Cargo checks the [dependencies] section and downloads any crates
you don’t have yet. In this case, although we only listed rand as a dependency, Cargo also
grabbed a copy of libc , because rand depends on libc to work. After downloading the crates,
Rust compiles them and then compiles the project with the dependencies available.
If you immediately run cargo build again without making any changes, you won’t get any output
aside from the Finished line. Cargo knows it has already downloaded and compiled the
dependencies, and you haven’t changed anything about them in your Cargo.toml file. Cargo also
knows that you haven’t changed anything about your code, so it doesn’t recompile that either.
With nothing to do, it simply exits.
If you open up the src/main.rs file, make a trivial change, and then save it and build again, you’ll
only see two lines of output:
$ cargo build
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 2.53 secs
These lines show Cargo only updates the build with your tiny change to the src/main.rs file. Your
dependencies haven’t changed, so Cargo knows it can reuse what it has already downloaded and
compiled for those. It just rebuilds your part of the code.
Cargo has a mechanism that ensures you can rebuild the same artifact every time you or anyone
else builds your code: Cargo will use only the versions of the dependencies you specified until you
indicate otherwise. For example, what happens if next week version 0.3.15 of the rand crate
comes out and contains an important bug fix but also contains a regression that will break your
code?
The answer to this problem is the Cargo.lock file, which was created the first time you ran
cargo build and is now in your guessing_game directory. When you build a project for the first
time, Cargo figures out all the versions of the dependencies that fit the criteria and then writes
them to the Cargo.lock file. When you build your project in the future, Cargo will see that the
Cargo.lock file exists and use the versions specified there rather than doing all the work of figuring
out versions again. This lets you have a reproducible build automatically. In other words, your
project will remain at 0.3.14 until you explicitly upgrade, thanks to the Cargo.lock file.
When you do want to update a crate, Cargo provides another command, update , which will ignore
the Cargo.lock file and figure out all the latest versions that fit your specifications in Cargo.toml. If
that works, Cargo will write those versions to the Cargo.lock file.
But by default, Cargo will only look for versions that are at least 0.3.14 and smaller than 0.4.0 . If the
rand crate has released two new versions, 0.3.15 and 0.4.0 , you would see the following if you
ran cargo update :
$ cargo update
Updating registry `https://github.com/rust-lang/crates.io-index`
Updating rand v0.3.14 -> v0.3.15
At this point, you would also notice a change in your Cargo.lock file noting that the version of the
rand crate you are now using is 0.3.15 .
If you wanted to use rand version 0.4.0 or any version in the 0.4.x series, you’d have to
update the Cargo.toml file to look like this instead:
[dependencies]
rand = "0.4.0"
The next time you run cargo build , Cargo will update the registry of crates available and
reevaluate your rand requirements according to the new version you have specified.
There’s a lot more to say about Cargo and its ecosystem which we’ll discuss in Chapter 14, but for
now, that’s all you need to know. Cargo makes it very easy to reuse libraries, so Rustaceans are
able to write smaller projects that are assembled from a number of packages.
Now that you’ve added the rand crate to Cargo.toml, let’s start using rand . The next step is to
update src/main.rs, as shown in Listing 2-3.
Filename: src/main.rs
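use std::io;
use rand::Rng;
fn main() {
    println!("Guess the number!");
    let secret_number = rand::thread_rng().gen_range(1, 101);
    println!("The secret number is: {}", secret_number);
    println!("Please input your guess.");
    let mut guess = String::new();
    io::stdin().read_line(&mut guess)
        .expect("Failed to read line");
    println!("You guessed: {}", guess);
}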
First, we add a line that lets Rust know we’ll be using the rand crate as an external dependency.
This also does the equivalent of calling use rand , so now we can call anything in the rand crate
by placing rand:: before it.
Next, we add another use line: use rand::Rng . The Rng trait defines methods that random
number generators implement, and this trait must be in scope for us to use those methods.
Chapter 10 will cover traits in detail.
Also, we’re adding two more lines in the middle. The rand::thread_rng function will give us the
particular random number generator that we’re going to use: one that is local to the current
thread of execution and seeded by the operating system. Next, we call the gen_range method on
the random number generator. This method is defined by the Rng trait that we brought into
scope with the use rand::Rng statement. The gen_range method takes two numbers as
arguments and generates a random number between them. It’s inclusive on the lower bound but
exclusive on the upper bound, so we need to specify 1 and 101 to request a number between 1
and 100.
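The “inclusive lower bound, exclusive upper bound” convention is easy to get wrong by one, so here is a sketch that uses only the standard library. The function into_range is a hypothetical helper that folds an arbitrary number into such a range; it illustrates the bounds only and is not how the rand crate produces its numbers.

```rust
// Hypothetical helper: maps an arbitrary raw value into the half-open
// range [low, high). It only illustrates the bounds; it is not how the
// rand crate generates numbers.
fn into_range(raw: u32, low: u32, high: u32) -> u32 {
    assert!(low < high, "range must not be empty");
    low + raw % (high - low)
}

fn main() {
    // With bounds 1 and 101 there are 100 possible results: 1 through 100.
    for raw in &[0u32, 99, 100, 12345] {
        println!("{} maps to {}", raw, into_range(*raw, 1, 101));
    }
}
```

The upper bound 101 itself can never be produced, which is why the game passes 101 to get numbers up to and including 100.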
Note: You won’t just know which traits to use and which methods and functions to call from
a crate. Instructions for using a crate are in each crate’s documentation. Another neat
feature of Cargo is that you can run the cargo doc --open command, which will build
documentation provided by all of your dependencies locally and open it in your browser. If
you’re interested in other functionality in the rand crate, for example, run
cargo doc --open and click rand in the sidebar on the left.
The second line that we added to the code prints the secret number. This is useful while we’re
developing the program to be able to test it, but we’ll delete it from the final version. It’s not much
of a game if the program prints the answer as soon as it starts!
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 2.53 secs
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 7
Please input your guess.
4
You guessed: 4
$ cargo run
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 83
Please input your guess.
5
You guessed: 5
You should get different random numbers, and they should all be numbers between 1 and 100.
Great job!
Filename: src/main.rs
use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
    // --snip--
    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
}
Listing 2-4: Handling the possible return values of comparing two numbers
The first new bit here is another use statement, bringing a type called std::cmp::Ordering into
scope from the standard library. Like Result , Ordering is another enum, but the variants for
Ordering are Less , Greater , and Equal . These are the three outcomes that are possible when
you compare two values.
Then we add five new lines at the bottom that use the Ordering type. The cmp method compares
two values and can be called on anything that can be compared. It takes a reference to whatever
you want to compare with: here it’s comparing the guess to the secret_number . Then it returns a
variant of the Ordering enum we brought into scope with the use statement. We use a match
expression to decide what to do next based on which variant of Ordering was returned from the
call to cmp with the values in guess and secret_number .
A match expression is made up of arms. An arm consists of a pattern and the code that should be
run if the value given to the beginning of the match expression fits that arm’s pattern. Rust takes
the value given to match and looks through each arm’s pattern in turn. The match construct and
patterns are powerful features in Rust that let you express a variety of situations your code might
encounter and make sure that you handle them all. These features will be covered in detail in
Chapter 6 and Chapter 18, respectively.
Let’s walk through an example of what would happen with the match expression used here. Say
that the user has guessed 50 and the randomly generated secret number this time is 38. When the
code compares 50 to 38, the cmp method will return Ordering::Greater , because 50 is greater
than 38. The match expression gets the Ordering::Greater value and starts checking each arm’s
pattern. It looks at the first arm’s pattern, Ordering::Less , and sees that the value
Ordering::Greater does not match Ordering::Less , so it ignores the code in that arm and
moves to the next arm. The next arm’s pattern, Ordering::Greater , does match
Ordering::Greater ! The associated code in that arm will execute and print Too big! to the
screen. The match expression ends because it has no need to look at the last arm in this scenario.
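The walkthrough above can be tried in isolation with two plain numbers. In this sketch, hint is a hypothetical function introduced only for the example; it returns the message instead of printing it so that each arm’s result is easy to inspect.

```rust
use std::cmp::Ordering;

// Hypothetical helper: returns the message the game would print for a guess.
fn hint(guess: u32, secret_number: u32) -> &'static str {
    // `cmp` returns one of the three `Ordering` variants,
    // and `match` runs the arm whose pattern fits that variant.
    match guess.cmp(&secret_number) {
        Ordering::Less => "Too small!",
        Ordering::Greater => "Too big!",
        Ordering::Equal => "You win!",
    }
}

fn main() {
    println!("{}", hint(50, 38));
    println!("{}", hint(20, 38));
    println!("{}", hint(38, 38));
}
```

Because both arguments are u32 values, this version compiles; the type mismatch discussed next arises only when one side of the comparison is still a String.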
However, the code in Listing 2-4 won’t compile yet. Let’s try it:
$ cargo build
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
error[E0308]: mismatched types
--> src/main.rs:23:21
|
23 | match guess.cmp(&secret_number) {
| ^^^^^^^^^^^^^^ expected struct `std::string::String`, found integral variable
|
= note: expected type `&std::string::String`
= note: found type `&{integer}`
The core of the error states that there are mismatched types. Rust has a strong, static type system.
However, it also has type inference. When we wrote let mut guess = String::new(); , Rust was
able to infer that guess should be a String and didn’t make us write the type. The
secret_number , on the other hand, is a number type. A few number types can have a value
between 1 and 100: i32 , a 32-bit number; u32 , an unsigned 32-bit number; i64 , a 64-bit
number; as well as others. Rust defaults to an i32 , which is the type of secret_number unless
you add type information elsewhere that would cause Rust to infer a different numerical type. The
reason for the error is that Rust cannot compare a string and a number type.
Ultimately, we want to convert the String the program reads as input into a real number type so
we can compare it numerically to the secret number. We can do that by adding the following two lines to
the main function body:
Filename: src/main.rs
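    // --snip--
    let mut guess = String::new();
    io::stdin().read_line(&mut guess)
        .expect("Failed to read line");
    let guess: u32 = guess.trim().parse()
        .expect("Please type a number!");
    println!("You guessed: {}", guess);
    match guess.cmp(&secret_number) {
        Ordering::Less => println!("Too small!"),
        Ordering::Greater => println!("Too big!"),
        Ordering::Equal => println!("You win!"),
    }
}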
We create a variable named guess . But wait, doesn’t the program already have a variable named
guess ? It does, but Rust allows us to shadow the previous value of guess with a new one. This
feature is often used in situations in which you want to convert a value from one type to another
type. Shadowing lets us reuse the guess variable name rather than forcing us to create two
unique variables, such as guess_str and guess for example. (Chapter 3 covers shadowing in
more detail.)
We bind guess to the expression guess.trim().parse() . The guess in the expression refers to
the original guess that was a String with the input in it. The trim method on a String instance
will eliminate any whitespace at the beginning and end. Although u32 can contain only numerical
characters, the user must press enter to satisfy read_line . When the user presses enter, a
newline character is added to the string. For example, if the user types 5 and presses enter, guess
looks like this: 5\n . The \n represents “newline,” the result of pressing enter. The trim method
eliminates \n , resulting in just 5 .
The parse method on strings parses a string into some kind of number. Because this method can
parse a variety of number types, we need to tell Rust the exact number type we want by using
let guess: u32 . The colon ( : ) after guess tells Rust we’ll annotate the variable’s type. Rust has
a few built-in number types; the u32 seen here is an unsigned, 32-bit integer. It’s a good default
choice for a small positive number. You’ll learn about other number types in Chapter 3.
Additionally, the u32 annotation in this example program and the comparison with
secret_number means that Rust will infer that secret_number should be a u32 as well. So now
the comparison will be between two values of the same type!
The call to parse could easily cause an error. If, for example, the string contained A👍% , there
would be no way to convert that to a number. Because it might fail, the parse method returns a
Result type, much as the read_line method does (discussed earlier in “Handling Potential
Failure with the Result Type”). We’ll treat this Result the same way by using the expect method
again. If parse returns an Err Result variant because it couldn’t create a number from the
string, the expect call will crash the game and print the message we give it. If parse can
successfully convert the string to a number, it will return the Ok variant of Result , and expect
will return the number that we want from the Ok value.
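Here is a small sketch that isolates the trim and parse steps so you can experiment with different inputs. The function parse_guess is a hypothetical helper for this example; it returns the Result from parse unchanged so that both the Ok and the Err case can be observed.

```rust
use std::num::ParseIntError;

// Hypothetical helper: trims surrounding whitespace, then parses a u32.
fn parse_guess(input: &str) -> Result<u32, ParseIntError> {
    // The return type tells `parse` which number type to produce.
    input.trim().parse()
}

fn main() {
    // Simulated input: leading spaces plus the newline left by pressing enter.
    let guess = String::from("  76\n");
    // Shadowing: the new `guess` is a u32 that replaces the String binding.
    let guess: u32 = parse_guess(&guess).expect("Please type a number!");
    println!("You guessed: {}", guess);

    // Input that is not a number produces the Err variant instead.
    println!("is \"quit\" an error? {}", parse_guess("quit\n").is_err());
}
```

Note that a negative number such as -5 is also an error here, because u32 cannot represent it.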
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 0.43 secs
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 58
Please input your guess.
  76
You guessed: 76
Too big!
Nice! Even though spaces were added before the guess, the program still figured out that the user
guessed 76. Run the program a few times to verify the different behavior with different kinds of
input: guess the number correctly, guess a number that is too high, and guess a number that is
too low.
We have most of the game working now, but the user can make only one guess. Let’s change that
by adding a loop!
Filename: src/main.rs
    // --snip--
    loop {
        println!("Please input your guess.");
        // --snip--
        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => println!("You win!"),
        }
    }
}
As you can see, we’ve moved everything into a loop from the guess input prompt onward. Be sure
to indent the lines inside the loop another four spaces each and run the program again. Notice
that there is a new problem because the program is doing exactly what we told it to do: ask for
another guess forever! It doesn’t seem like the user can quit!
The user could always halt the program by using the keyboard shortcut ctrl-c. But there’s another
way to escape this insatiable monster, as mentioned in the parse discussion in “Comparing the
Guess to the Secret Number”: if the user enters a non-number answer, the program will crash. The
user can take advantage of that in order to quit, as shown here:
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Finished dev [unoptimized + debuginfo] target(s) in 1.50 secs
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 59
Please input your guess.
45
You guessed: 45
Too small!
Please input your guess.
60
You guessed: 60
Too big!
Please input your guess.
59
You guessed: 59
You win!
Please input your guess.
quit
thread 'main' panicked at 'Please type a number!: ParseIntError { kind: InvalidDigit }', src/libcore/result.rs:785
note: Run with `RUST_BACKTRACE=1` for a backtrace.
error: Process didn't exit successfully: `target/debug/guess` (exit code: 101)
Typing quit actually quits the game, but so will any other non-number input. However, this is
suboptimal to say the least. We want the game to automatically stop when the correct number is
guessed.
Let’s program the game to quit when the user wins by adding a break statement:
Filename: src/main.rs
        // --snip--
        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You win!");
                break;
            }
        }
    }
}
Adding the break line after You win! makes the program exit the loop when the user guesses
the secret number correctly. Exiting the loop also means exiting the program, because the loop is
the last part of main .
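To watch break end a loop without needing the keyboard, we can replay a scripted list of guesses. This is a sketch under that assumption: guesses_until_win is a hypothetical function, and the slice of numbers stands in for what a player would type.

```rust
use std::cmp::Ordering;

// Hypothetical helper: replays a scripted list of guesses instead of reading
// the keyboard, and reports how many guesses were needed to win.
fn guesses_until_win(secret_number: u32, guesses: &[u32]) -> Option<usize> {
    let mut remaining = guesses.iter();
    let mut count = 0;
    loop {
        let guess = match remaining.next() {
            Some(&guess) => guess,
            None => return None, // ran out of scripted guesses without winning
        };
        count += 1;
        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You win!");
                break; // leave the loop; later guesses are never examined
            }
        }
    }
    Some(count)
}

fn main() {
    let result = guesses_until_win(59, &[45, 60, 59, 10]);
    println!("guesses needed: {:?}", result);
}
```

The final guess in the script, 10, is never compared, because break leaves the loop as soon as 59 matches.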
To further refine the game’s behavior, rather than crashing the program when the user inputs a
non-number, let’s make the game ignore a non-number so the user can continue guessing. We
can do that by altering the line where guess is converted from a String to a u32 , as shown in
Listing 2-5.
Filename: src/main.rs
        // --snip--
        io::stdin().read_line(&mut guess)
            .expect("Failed to read line");
        let guess: u32 = match guess.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        // --snip--
Listing 2-5: Ignoring a non-number guess and asking for another guess instead of crashing the program
Switching from an expect call to a match expression is how you generally move from crashing on
an error to handling the error. Remember that parse returns a Result type and Result is an
enum that has the variants Ok or Err . We’re using a match expression here, as we did with the
Ordering result of the cmp method.
If parse is able to successfully turn the string into a number, it will return an Ok value that
contains the resulting number. That Ok value will match the first arm’s pattern, and the match
expression will just return the num value that parse produced and put inside the Ok value. That
number will end up right where we want it in the new guess variable we’re creating.
If parse is not able to turn the string into a number, it will return an Err value that contains more
information about the error. The Err value does not match the Ok(num) pattern in the first
match arm, but it does match the Err(_) pattern in the second arm. The underscore, _ , is a
catchall value; in this example, we’re saying we want to match all Err values, no matter what
information they have inside them. So the program will execute the second arm’s code, continue ,
which tells the program to go to the next iteration of the loop and ask for another guess. So
effectively, the program ignores all errors that parse might encounter!
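The same Ok / Err(_) pattern can be exercised on a fixed list of inputs. In this sketch, valid_guesses is a hypothetical function written for the example; each string plays the role of one line typed by the user, and continue skips the ones that are not numbers.

```rust
// Hypothetical helper: keeps only the lines that parse as a u32,
// silently skipping everything else, as the game loop does.
fn valid_guesses(lines: &[&str]) -> Vec<u32> {
    let mut guesses = Vec::new();
    for line in lines {
        let guess: u32 = match line.trim().parse() {
            Ok(num) => num,
            Err(_) => continue, // skip this line and move on to the next one
        };
        guesses.push(guess);
    }
    guesses
}

fn main() {
    let typed = ["10\n", "foo\n", "99\n", "\n", "61\n"];
    println!("{:?}", valid_guesses(&typed));
}
```

Ignoring every error is fine for a small game, but it also hides the reason an input was rejected; a larger program would usually inspect the value inside Err.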
Now everything in the program should work as expected. Let’s try it:
$ cargo run
Compiling guessing_game v0.1.0 (file:///projects/guessing_game)
Running `target/debug/guessing_game`
Guess the number!
The secret number is: 61
Please input your guess.
10
You guessed: 10
Too small!
Please input your guess.
99
You guessed: 99
Too big!
Please input your guess.
foo
Please input your guess.
61
You guessed: 61
You win!
Awesome! With one tiny final tweak, we will finish the guessing game. Recall that the program is
still printing the secret number. That worked well for testing, but it ruins the game. Let’s delete the
println! that outputs the secret number. Listing 2-6 shows the final code.
Filename: src/main.rs
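use std::io;
use std::cmp::Ordering;
use rand::Rng;
fn main() {
    println!("Guess the number!");
    let secret_number = rand::thread_rng().gen_range(1, 101);
    loop {
        println!("Please input your guess.");
        let mut guess = String::new();
        io::stdin().read_line(&mut guess)
            .expect("Failed to read line");
        let guess: u32 = match guess.trim().parse() {
            Ok(num) => num,
            Err(_) => continue,
        };
        println!("You guessed: {}", guess);
        match guess.cmp(&secret_number) {
            Ordering::Less => println!("Too small!"),
            Ordering::Greater => println!("Too big!"),
            Ordering::Equal => {
                println!("You win!");
                break;
            }
        }
    }
}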
Summary
At this point, you’ve successfully built the guessing game. Congratulations!
This project was a hands-on way to introduce you to many new Rust concepts: let , match ,
methods, associated functions, the use of external crates, and more. In the next few chapters,
you’ll learn about these concepts in more detail. Chapter 3 covers concepts that most
programming languages have, such as variables, data types, and functions, and shows how to use
them in Rust. Chapter 4 explores ownership, a feature that makes Rust different from other
languages. Chapter 5 discusses structs and method syntax, and Chapter 6 explains how enums
work.
Specifically, you’ll learn about variables, basic types, functions, comments, and control flow. These
foundations will be in every Rust program, and learning them early will give you a strong core to
start from.
Keywords
The Rust language has a set of keywords that are reserved for use by the language only, much as in
other languages. Keep in mind that you cannot use these words as names of variables or
functions. Most of the keywords have special meanings, and you’ll be using them to do various
tasks in your Rust programs; a few have no current functionality associated with them but have
been reserved for functionality that might be added to Rust in the future. You can find a list of the
keywords in Appendix A.
Identifiers
We’re going to be explaining a bunch of concepts in this book: variables, functions, structs, lots of
things. All of these things need names. A name in Rust is called an “identifier,” and can be made up
of any nonempty ASCII string, with some restrictions:
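Either:
  The first character is a letter.
  The remaining characters are alphanumeric or _ .
or:
  The first character is _ .
  The identifier is more than one character. _ alone is not an identifier.
  The remaining characters are alphanumeric or _ .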
Raw identifiers
Sometimes, you may need to use a name that’s a keyword for another purpose. Maybe you need
to call a function named match that is coming from a C library, where ‘match’ is not a keyword. To
do this, you can use a “raw identifier.” Raw identifiers start with r# :
let r#fn = "this variable is named 'fn' even though that's a keyword";
You won’t need raw identifiers often, but when you do, you really need them.
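As a sketch of the function case mentioned above, the following program defines and calls a function named match. The function body is invented for this example (it just checks whether one string contains another); the point is only that the r# prefix is needed both at the definition and at the call site.

```rust
// Hypothetical function whose name collides with the `match` keyword.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

fn main() {
    // The call site needs the r# prefix as well.
    let found = r#match("foo", "foobar");
    println!("found: {}", found);
}
```

Without the prefix, the compiler would read match as the start of a match expression and report a syntax error.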
When a variable is immutable, once a value is bound to a name, you can’t change that value. To
illustrate this, let’s generate a new project called variables in your projects directory by using
cargo new variables .
Then, in your new variables directory, open src/main.rs and replace its code with the following code
that won’t compile just yet:
Filename: src/main.rs
fn main() {
    let x = 5;
    println!("The value of x is: {}", x);
    x = 6;
    println!("The value of x is: {}", x);
}
Save and run the program using cargo run . You should receive an error message, as shown in
this output:
This example shows how the compiler helps you find errors in your programs. Even though
compiler errors can be frustrating, they only mean your program isn’t safely doing what you want
it to do yet; they do not mean that you’re not a good programmer! Experienced Rustaceans still get
compiler errors.
The error message indicates that the cause of the error is that you
cannot assign twice to immutable variable x , because you tried to assign a second value to
the immutable x variable.
It’s important that we get compile-time errors when we attempt to change a value that we
previously designated as immutable because this very situation can lead to bugs. If one part of our
code operates on the assumption that a value will never change and another part of our code
changes that value, it’s possible that the first part of the code won’t do what it was designed to do.
The cause of this kind of bug can be difficult to track down after the fact, especially when the
second piece of code changes the value only sometimes.
In Rust, the compiler guarantees that when you state that a value won’t change, it really won’t
change. That means that when you’re reading and writing code, you don’t have to keep track of
how and where a value might change. Your code is thus easier to reason through.
But mutability can be very useful. Variables are immutable only by default; as you did in Chapter 2,
you can make them mutable by adding mut in front of the variable name. In addition to allowing
this value to change, mut conveys intent to future readers of the code by indicating that other
parts of the code will be changing this variable value.
Filename: src/main.rs
fn main() {
    let mut x = 5;
    println!("The value of x is: {}", x);
    x = 6;
    println!("The value of x is: {}", x);
}
$ cargo run
Compiling variables v0.1.0 (file:///projects/variables)
Finished dev [unoptimized + debuginfo] target(s) in 0.30 secs
Running `target/debug/variables`
The value of x is: 5
The value of x is: 6
We’re allowed to change the value that x binds to from 5 to 6 when mut is used. In some cases,
you’ll want to make a variable mutable because it makes the code more convenient to write than if
it had only immutable variables.
There are multiple trade-offs to consider in addition to the prevention of bugs. For example, in
cases where you’re using large data structures, mutating an instance in place may be faster than
copying and returning newly allocated instances. With smaller data structures, creating new
instances and writing in a more functional programming style may be easier to think through, so
lower performance might be a worthwhile penalty for gaining that clarity.
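To make this trade-off concrete, here is a minimal sketch of both styles. The function names double_in_place and doubled are hypothetical and chosen only for this illustration, and the example uses vectors and references, which later chapters cover.

```rust
/// Doubles every element in place; no new collection is allocated.
/// (Hypothetical helper, named only for this sketch.)
fn double_in_place(values: &mut [i32]) {
    for v in values.iter_mut() {
        *v *= 2;
    }
}

/// Leaves the input untouched and returns a newly allocated vector.
/// (Hypothetical helper, named only for this sketch.)
fn doubled(values: &[i32]) -> Vec<i32> {
    values.iter().map(|v| v * 2).collect()
}

fn main() {
    // Mutable style: the binding must be declared with `mut`.
    let mut scores = vec![1, 2, 3];
    double_in_place(&mut scores);
    println!("mutated in place: {:?}", scores);

    // Functional style: the original stays immutable and unchanged.
    let original = vec![1, 2, 3];
    let copy = doubled(&original);
    println!("original: {:?}, new instance: {:?}", original, copy);
}
```

The first style avoids an allocation, while the second lets every binding stay immutable. Which one is preferable depends on the size of the data and on how much the added clarity matters.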
Being unable to change the value of a variable might have reminded you of another programming
concept that most other languages have: constants. Like immutable variables, constants are values
that are bound to a name and are not allowed to change, but there are a few differences between
constants and variables.
First, you aren’t allowed to use mut with constants. Constants aren’t just immutable by default
—they’re always immutable.
You declare constants using the const keyword instead of the let keyword, and the type of the
value must be annotated. We’re about to cover types and type annotations in the next section,
“Data Types,” so don’t worry about the details right now. Just know that you must always annotate
the type.
Constants can be declared in any scope, including the global scope, which makes them useful for
values that many parts of code need to know about.
The last difference is that constants may be set only to a constant expression, not the result of a
function call or any other value that could only be computed at runtime.
Here’s an example of a constant declaration where the constant’s name is MAX_POINTS and its
value is set to 100,000. (Rust’s naming convention for constants is to use all uppercase with
underscores between words, and underscores can be inserted in numeric literals to improve
readability):
const MAX_POINTS: u32 = 100_000;
Constants are valid for the entire time a program runs, within the scope they were declared in,
making them a useful choice for values in your application domain that multiple parts of the
program might need to know about, such as the maximum number of points any player of a game
is allowed to earn or the speed of light.
Naming hardcoded values used throughout your program as constants is useful in conveying the
meaning of that value to future maintainers of the code. It also helps to have only one place in
your code you would need to change if the hardcoded value needed to be updated in the future.
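As a rough sketch of these rules working together, the following program declares constants in the global scope and uses them from more than one function. SECONDS_PER_HOUR and remaining_points are hypothetical names introduced only for this example.

```rust
// Declared in the global scope, so every function in this file can use it.
const MAX_POINTS: u32 = 100_000;

// A constant expression: the multiplication is evaluated at compile time.
// (SECONDS_PER_HOUR is a hypothetical name used only in this sketch.)
const SECONDS_PER_HOUR: u32 = 60 * 60;

// Hypothetical helper that reads the global constant.
fn remaining_points(earned: u32) -> u32 {
    // saturating_sub stops at 0 instead of overflowing.
    MAX_POINTS.saturating_sub(earned)
}

fn main() {
    println!("max points: {}", MAX_POINTS);
    println!("seconds per hour: {}", SECONDS_PER_HOUR);
    println!("points remaining: {}", remaining_points(250));
}
```

If the limit ever changes, only the MAX_POINTS declaration needs to be updated.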
Shadowing
As you saw in the guessing game tutorial in the “Comparing the Guess to the Secret Number”
section in Chapter 2, you can declare a new variable with the same name as a previous variable,
and the new variable shadows the previous variable. Rustaceans say that the first variable is
shadowed by the second, which means that the second variable’s value is what appears when the
variable is used. We can shadow a variable by using the same variable’s name and repeating the
use of the let keyword as follows:
Filename: src/main.rs
fn main() {
    let x = 5;
    let x = x + 1;
    let x = x * 2;
    println!("The value of x is: {}", x);
}
This program first binds x to a value of 5 . Then it shadows x by repeating let x = , taking the
original value and adding 1 so the value of x is then 6 . The third let statement also shadows
x , multiplying the previous value by 2 to give x a final value of 12 . When we run this program,
it will output the following:
$ cargo run
Compiling variables v0.1.0 (file:///projects/variables)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/variables`
The value of x is: 12
Shadowing is different from marking a variable as mut , because we’ll get a compile-time error if
we accidentally try to reassign to this variable without using the let keyword. By using let , we
can perform a few transformations on a value but have the variable be immutable after those
transformations have been completed.
The other difference between mut and shadowing is that because we’re effectively creating a new
variable when we use the let keyword again, we can change the type of the value but reuse the
same name. For example, say our program asks a user to show how many spaces they want
between some text by inputting space characters, but we really want to store that input as a
number:
let spaces = " ";
let spaces = spaces.len();
This construct is allowed because the first spaces variable is a string type and the second spaces
variable, which is a brand-new variable that happens to have the same name as the first one, is a
number type. Shadowing thus spares us from having to come up with different names, such as
spaces_str and spaces_num ; instead, we can reuse the simpler spaces name. However, if we try
to use mut for this, as shown here, we’ll get a compile-time error:
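A minimal sketch of both approaches follows. The count_spaces function is a hypothetical wrapper added so the shadowing version can be run, and the mut version is left commented out so that the example still compiles.

```rust
// Hypothetical wrapper used only for this sketch.
fn count_spaces(input: &str) -> usize {
    // Shadowing: the second `spaces` is a new variable with a new type.
    let spaces = input; // a string slice
    let spaces = spaces.len(); // a number
    spaces
}

fn main() {
    println!("number of spaces: {}", count_spaces("   "));

    // With `mut`, the variable keeps the type of its first value, so the
    // compiler rejects the following lines with a mismatched-types error:
    //
    // let mut spaces = "   ";
    // spaces = spaces.len();
}
```

We’re not allowed to mutate a variable’s type, which is why only the shadowing version is accepted.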
Now that we’ve explored how variables work, let’s look at more data types they can have.
Data Types
Every value in Rust is of a certain data type, which tells Rust what kind of data is being specified so
it knows how to work with that data. We’ll look at two data type subsets: scalar and compound.
Keep in mind that Rust is a statically typed language, which means that it must know the types of
all variables at compile time. The compiler can usually infer what type we want to use based on
the value and how we use it. In cases when many types are possible, such as when we converted a
String to a numeric type using parse in the “Comparing the Guess to the Secret Number”
section in Chapter 2, we must add a type annotation, like this:
let guess: u32 = "42".parse().expect("Not a number!");
If we don’t add the type annotation here, Rust will display the following error, which means the
compiler needs more information from us to know which type we want to use:
Scalar Types
A scalar type represents a single value. Rust has four primary scalar types: integers, floating-point
numbers, Booleans, and characters. You may recognize these from other programming
languages. Let’s jump into how they work in Rust.
Integer Types
An integer is a number without a fractional component. We used one integer type in Chapter 2, the
u32 type. This type declaration indicates that the value it’s associated with should be an unsigned
integer (signed integer types start with i , instead of u ) that takes up 32 bits of space. Table 3-1
shows the built-in integer types in Rust. Each variant in the Signed and Unsigned columns (for
example, i16 ) can be used to declare the type of an integer value.
Each variant can be either signed or unsigned and has an explicit size. Signed and unsigned refer to
whether it’s possible for the number to be negative or positive—in other words, whether the
number needs to have a sign with it (signed) or whether it will only ever be positive and can
therefore be represented without a sign (unsigned). It’s like writing numbers on paper: when the
sign matters, a number is shown with a plus sign or a minus sign; however, when it’s safe to
assume the number is positive, it’s shown with no sign. Signed numbers are stored using two’s
complement representation (if you’re unsure what this is, you can search for it online; an
explanation is outside the scope of this book).
Each signed variant can store numbers from $-(2^{n-1})$ to $2^{n-1} - 1$ inclusive, where $n$ is the number of
bits that variant uses. So an i8 can store numbers from $-(2^7)$ to $2^7 - 1$, which equals -128 to 127.
Unsigned variants can store numbers from 0 to $2^n - 1$, so a u8 can store numbers from 0 to $2^8 - 1$,
which equals 0 to 255.
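The following sketch checks these formulas against the limits that the standard library exposes as associated constants such as i8::MIN and u8::MAX . The helpers signed_range and unsigned_max are hypothetical names, and they assume a bit width below 64 so that the intermediate values fit in 64-bit integers.

```rust
/// Range of a signed integer with `bits` bits: -(2^(n-1)) to 2^(n-1) - 1.
/// Hypothetical helper; assumes 1 <= bits < 64.
fn signed_range(bits: u32) -> (i64, i64) {
    let half = 2i64.pow(bits - 1);
    (-half, half - 1)
}

/// Largest value of an unsigned integer with `bits` bits: 2^n - 1.
/// Hypothetical helper; assumes bits < 64.
fn unsigned_max(bits: u32) -> u64 {
    2u64.pow(bits) - 1
}

fn main() {
    // The formulas agree with the limits built into the integer types.
    assert_eq!(signed_range(8), (i64::from(i8::MIN), i64::from(i8::MAX)));
    assert_eq!(unsigned_max(8), u64::from(u8::MAX));

    println!("i8: {} to {}", i8::MIN, i8::MAX);
    println!("u8: {} to {}", u8::MIN, u8::MAX);
    println!("i32: {} to {}", i32::MIN, i32::MAX);
}
```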
Additionally, the isize and usize types depend on the kind of computer your program is
running on: 64 bits if you’re on a 64-bit architecture and 32 bits if you’re on a 32-bit architecture.
You can write integer literals in any of the forms shown in Table 3-2. Note that all number literals
except the byte literal allow a type suffix, such as 57u8 , and _ as a visual separator, such as
1_000 .
Table 3-2: Integer Literals in Rust
Number literals    Example
Decimal            98_222
Hex                0xff
Octal              0o77
Binary             0b1111_0000
Byte (u8 only)     b'A'
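To see that these forms are only different spellings of ordinary integers, here is a small sketch that returns one literal of each kind. The literal_values function is a hypothetical helper introduced for this example.

```rust
// Hypothetical helper that returns one literal of each form.
fn literal_values() -> (i32, i32, i32, i32, u8, u8) {
    (
        98_222,      // decimal, with _ as a visual separator
        0xff,        // hex
        0o77,        // octal
        0b1111_0000, // binary, with _ as a visual separator
        b'A',        // byte literal (u8 only)
        57u8,        // decimal with a type suffix
    )
}

fn main() {
    let (decimal, hex, octal, binary, byte, suffixed) = literal_values();
    println!("decimal:  {}", decimal);
    println!("hex:      {}", hex);
    println!("octal:    {}", octal);
    println!("binary:   {}", binary);
    println!("byte:     {}", byte);
    println!("suffixed: {}", suffixed);
}
```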
So how do you know which type of integer to use? If you’re unsure, Rust’s defaults are generally
good choices, and integer types default to i32 : this type is generally the fastest, even on 64-bit
systems. The primary situation in which you’d use isize or usize is when indexing some sort of
collection.
Integer Overflow
Let’s say that you have a u8 , which can hold values between zero and 255 . What happens if you
try to change it to 256 ? This is called “integer overflow,” and Rust has some interesting rules
around this behavior. When compiling in debug mode, Rust checks for this kind of issue and will
cause your program to panic, which is the term Rust uses when a program exits with an error.
We’ll discuss panics more in Chapter 9.
In release builds, Rust does not check for overflow, and instead will do something called “two’s
complement wrapping.” In short, 256 becomes 0 , 257 becomes 1 , etc. Relying on overflow is
considered an error, even if this behavior happens. If you want values to wrap, the standard
library has a type, Wrapping , that provides this behavior explicitly.
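Here is a minimal sketch of asking for wrapping explicitly. It uses std::num::Wrapping as mentioned above, together with the wrapping_add and checked_add methods that the standard library provides on integer types. The wrap_add function is a hypothetical helper.

```rust
use std::num::Wrapping;

// Hypothetical helper: adds two u8 values with explicit wrapping.
fn wrap_add(a: u8, b: u8) -> u8 {
    (Wrapping(a) + Wrapping(b)).0
}

fn main() {
    let max: u8 = 255;

    // Wrapping arithmetic behaves the same in debug and release builds.
    println!("255 + 1 wraps to {}", wrap_add(max, 1));
    println!("255 + 2 wraps to {}", wrap_add(max, 2));

    // The same operation as a method on the integer itself.
    assert_eq!(max.wrapping_add(1), 0);

    // checked_add reports overflow as None instead of wrapping or panicking.
    assert_eq!(max.checked_add(1), None);
    assert_eq!(max.checked_add(0), Some(255));
}
```

Because the wrapping is requested in the code itself, the result does not depend on whether the program was compiled in debug or release mode.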
Floating-Point Types
Rust also has two primitive types for floating-point numbers, which are numbers with decimal
points. Rust’s floating-point types are f32 and f64 , which are 32 bits and 64 bits in size,
respectively. The default type is f64 because on modern CPUs it’s roughly the same speed as
f32 but is capable of more precision.
Filename: src/main.rs
fn main() {
    let x = 2.0; // f64
    let y: f32 = 3.0; // f32
}
Floating-point numbers are represented according to the IEEE-754 standard. The f32 type is a
single-precision float, and f64 has double precision.
Numeric Operations
Rust supports the basic mathematical operations you’d expect for all of the number types:
addition, subtraction, multiplication, division, and remainder. The following code shows how you’d
use each one in a let statement:
Filename: src/main.rs
fn main() {
    // addition
    let sum = 5 + 10;
    // subtraction
    let difference = 95.5 - 4.3;
    // multiplication
    let product = 4 * 30;
    // division
    let quotient = 56.7 / 32.2;
    // remainder
    let remainder = 43 % 5;
}
Each expression in these statements uses a mathematical operator and evaluates to a single
value, which is then bound to a variable. Appendix B contains a list of all operators that Rust
provides.
As in most other programming languages, a Boolean type in Rust has two possible values: true
and false . The Boolean type in Rust is specified using bool . For example:
Filename: src/main.rs
fn main() {
    let t = true;
    let f: bool = false; // with explicit type annotation
}
The main way to use Boolean values is through conditionals, such as an if expression. We’ll
cover how if expressions work in Rust in the “Control Flow” section.
So far we’ve worked only with numbers, but Rust supports letters too. Rust’s char type is the
language’s most primitive alphabetic type, and the following code shows one way to use it. (Note
that the char literal is specified with single quotes, as opposed to string literals, which use double
quotes.)
Filename: src/main.rs
fn main() {
    let c = 'z';
    let z = 'ℤ';
    let heart_eyed_cat = '😻';
}
Rust’s char type represents a Unicode Scalar Value, which means it can represent a lot more than
just ASCII. Accented letters; Chinese, Japanese, and Korean characters; emoji; and zero-width
spaces are all valid char values in Rust. Unicode Scalar Values range from U+0000 to U+D7FF and
U+E000 to U+10FFFF inclusive. However, a “character” isn’t really a concept in Unicode, so your
human intuition for what a “character” is may not match up with what a char is in Rust. We’ll
discuss this topic in detail in “Strings” in Chapter 8.
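As a small illustration of the gap between a char and a human “character,” the sketch below counts the Unicode Scalar Values in a string that displays as a single accented letter. The scalar_count function is a hypothetical helper. The example also checks that a char always occupies four bytes.

```rust
// Hypothetical helper: counts the Unicode Scalar Values (chars) in a string.
fn scalar_count(text: &str) -> usize {
    text.chars().count()
}

fn main() {
    // A char is always four bytes, whatever scalar value it holds.
    println!("size of char: {} bytes", std::mem::size_of::<char>());

    // The letter 'e' followed by U+0301 (combining acute accent) is displayed
    // as one accented letter, but it is made of two char values.
    let accented = "e\u{301}";
    println!("{} contains {} chars", accented, scalar_count(accented));
}
```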
Compound Types
Compound types can group multiple values into one type. Rust has two primitive compound types:
tuples and arrays.
A tuple is a general way of grouping together some number of other values with a variety of types
into one compound type. Tuples have a fixed length: once declared, they cannot grow or shrink in
size.
We create a tuple by writing a comma-separated list of values inside parentheses. Each position in
the tuple has a type, and the types of the different values in the tuple don’t have to be the same.
We’ve added optional type annotations in this example:
Filename: src/main.rs
fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);
}
The variable tup binds to the entire tuple, because a tuple is considered a single compound
element. To get the individual values out of a tuple, we can use pattern matching to destructure a
tuple value, like this:
Filename: src/main.rs
fn main() {
    let tup = (500, 6.4, 1);
    let (x, y, z) = tup;
    println!("The value of y is: {}", y);
}
This program first creates a tuple and binds it to the variable tup . It then uses a pattern with let
to take tup and turn it into three separate variables, x , y , and z . This is called destructuring,
because it breaks the single tuple into three parts. Finally, the program prints the value of y ,
which is 6.4 .
In addition to destructuring through pattern matching, we can access a tuple element directly by
using a period ( . ) followed by the index of the value we want to access. For example:
Filename: src/main.rs
fn main() {
    let x: (i32, f64, u8) = (500, 6.4, 1);
    let five_hundred = x.0;
    let six_point_four = x.1;
    let one = x.2;
}
This program creates a tuple, x , and then makes new variables for each element by using their
index. As with most programming languages, the first index in a tuple is 0.
Another way to have a collection of multiple values is with an array. Unlike a tuple, every element
of an array must have the same type. Arrays in Rust are di�erent from arrays in some other
languages because arrays in Rust have a fixed length, like tuples.
In Rust, the values going into an array are written as a comma-separated list inside square
brackets:
Filename: src/main.rs
fn main() {
    let a = [1, 2, 3, 4, 5];
}
Arrays are useful when you want your data allocated on the stack rather than the heap (we will
discuss the stack and the heap more in Chapter 4) or when you want to ensure you always have a
fixed number of elements. An array isn’t as flexible as the vector type, though. A vector is a similar
collection type provided by the standard library that is allowed to grow or shrink in size. If you’re
unsure whether to use an array or a vector, you should probably use a vector. Chapter 8 discusses
vectors in more detail.
An example of when you might want to use an array rather than a vector is in a program that
needs to know the names of the months of the year. It’s very unlikely that such a program will
need to add or remove months, so you can use an array because you know it will always contain
12 items:
let months = ["January", "February", "March", "April", "May", "June", "July",
              "August", "September", "October", "November", "December"];
Arrays have an interesting type; it looks like this: [type; number] . For example:
let a: [i32; 5] = [1, 2, 3, 4, 5];
First, there are square brackets; they look like the syntax for creating an array. Inside, there are two
pieces of information, separated by a semicolon. The first is the type of each element of the array.
Since all elements have the same type, we only need to list it once. After the semicolon, there’s a
number that indicates the length of the array. Since an array has a fixed size, this number is
always the same: even if the array’s elements are modified, the array cannot grow or shrink.
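The sketch below shows that modifying an element leaves the length, and therefore the type, unchanged. The set_first function is a hypothetical helper. The example also uses the [value; length] initializer, which Rust provides for creating an array whose elements all start with the same value.

```rust
// Hypothetical helper: takes a five-element array, changes its first
// element, and returns it. The type [i32; 5] is the same going in and out.
fn set_first(mut a: [i32; 5], value: i32) -> [i32; 5] {
    a[0] = value;
    a
}

fn main() {
    let a: [i32; 5] = [1, 2, 3, 4, 5];
    let b = set_first(a, 10);
    println!("{:?} has length {}", b, b.len());

    // [value; length] creates an array with every element set to `value`.
    let zeros: [i32; 5] = [0; 5];
    println!("{:?} has length {}", zeros, zeros.len());
}
```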
An array is a single chunk of memory allocated on the stack. You can access elements of an array
using indexing, like this:
Filename: src/main.rs
fn main() {
    let a = [1, 2, 3, 4, 5];
    let first = a[0];
    let second = a[1];
}
In this example, the variable named first will get the value 1 , because that is the value at index
[0] in the array. The variable named second will get the value 2 from index [1] in the array.
What happens if you try to access an element of an array that is past the end of the array? Say you
change the example to the following code, which will compile but exit with an error when it runs:
Filename: src/main.rs
fn main() {
    let a = [1, 2, 3, 4, 5];
    let index = 10;
    let element = a[index];
    println!("The value of element is: {}", element);
}
Running this code using cargo run produces the following result:
$ cargo run
Compiling arrays v0.1.0 (file:///projects/arrays)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/arrays`
thread '<main>' panicked at 'index out of bounds: the len is 5 but the index is 10', src/main.rs:6
note: Run with `RUST_BACKTRACE=1` for a backtrace.
The compilation didn’t produce any errors, but the program resulted in a runtime error and didn’t
exit successfully. When you attempt to access an element using indexing, Rust will check that the
index you’ve specified is less than the array length. If the index is greater than or equal to the
length, Rust will panic.
This is the first example of Rust’s safety principles in action. In many low-level languages, this kind
of check is not done, and when you provide an incorrect index, invalid memory can be accessed.
Rust protects you against this kind of error by immediately exiting instead of allowing the memory
access and continuing. Chapter 9 discusses more of Rust’s error handling.
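When an index might be out of range and a panic is not what we want, the same bounds check can be requested in a form that lets the program carry on. The sketch below uses the standard library’s get method on slices, which returns an Option instead of panicking. The element_at function is a hypothetical helper, and Option and match are covered in later chapters.

```rust
// Hypothetical helper: returns the element at `index`, or None when the
// index is greater than or equal to the length.
fn element_at(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    let a = [1, 2, 3, 4, 5];

    for index in [1, 10] {
        match element_at(&a, index) {
            Some(element) => println!("The value of element is: {}", element),
            None => println!(
                "index {} is out of bounds for an array of length {}",
                index,
                a.len()
            ),
        }
    }
}
```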
Functions
Functions are pervasive in Rust code. You’ve already seen one of the most important functions in
the language: the main function, which is the entry point of many programs. You’ve also seen the
fn keyword, which allows you to declare new functions.
Rust code uses snake case as the conventional style for function and variable names. In snake case,
all letters are lowercase and underscores separate words. Here’s a program that contains an
example function definition:
Filename: src/main.rs
fn main() {
    println!("Hello, world!");
    another_function();
}

fn another_function() {
    println!("Another function.");
}
Function definitions in Rust start with fn and have a set of parentheses after the function name.
The curly brackets tell the compiler where the function body begins and ends.
We can call any function we’ve defined by entering its name followed by a set of parentheses.
Because another_function is defined in the program, it can be called from inside the main
function. Note that we defined another_function after the main function in the source code; we
could have defined it before as well. Rust doesn’t care where you define your functions, only that
they’re defined somewhere.
Let’s start a new binary project named functions to explore functions further. Place the
another_function example in src/main.rs and run it. You should see the following output:
$ cargo run
Compiling functions v0.1.0 (file:///projects/functions)
Finished dev [unoptimized + debuginfo] target(s) in 0.28 secs
Running `target/debug/functions`
Hello, world!
Another function.
The lines execute in the order in which they appear in the main function. First, the “Hello, world!”
message prints, and then another_function is called and its message is printed.
Function Parameters
Functions can also be defined to have parameters, which are special variables that are part of a
function’s signature. When a function has parameters, you can provide it with concrete values for
those parameters. Technically, the concrete values are called arguments, but in casual
conversation, people tend to use the words parameter and argument interchangeably for either
the variables in a function’s definition or the concrete values passed in when you call a function.
The following rewritten version of another_function shows what parameters look like in Rust:
Filename: src/main.rs
fn main() {
    another_function(5);
}

fn another_function(x: i32) {
    println!("The value of x is: {}", x);
}
Try running this program; you should get the following output:
$ cargo run
Compiling functions v0.1.0 (file:///projects/functions)
Finished dev [unoptimized + debuginfo] target(s) in 1.21 secs
Running `target/debug/functions`
The value of x is: 5
The declaration of another_function has one parameter named x . The type of x is specified as
i32 . When 5 is passed to another_function , the println! macro puts 5 where the pair of
curly brackets were in the format string.
In function signatures, you must declare the type of each parameter. This is a deliberate decision
in Rust’s design: requiring type annotations in function definitions means the compiler almost
never needs you to use them elsewhere in the code to figure out what you mean.
When you want a function to have multiple parameters, separate the parameter declarations with
commas, like this:
Filename: src/main.rs
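fn main() {
    another_function(5, 6);
}

fn another_function(x: i32, y: i32) {
    println!("The value of x is: {}", x);
    println!("The value of y is: {}", y);
}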
This example creates a function with two parameters, both of which are i32 types. The function
then prints the values in both of its parameters. Note that function parameters don’t all need to
be the same type; they just happen to be in this example.
Let’s try running this code. Replace the program currently in your functions project’s src/main.rs file
with the preceding example and run it using cargo run :
$ cargo run
Compiling functions v0.1.0 (file:///projects/functions)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/functions`
The value of x is: 5
The value of y is: 6
Because we called the function with 5 as the value for x and 6 as the value for y , the
two strings are printed with these values.
Function bodies are made up of a series of statements optionally ending in an expression. So far,
we’ve only covered functions without an ending expression, but you have seen an expression as
part of a statement. Because Rust is an expression-based language, this is an important distinction
to understand. Other languages don’t have the same distinctions, so let’s look at what statements
and expressions are and how their differences affect the bodies of functions.
We’ve actually already used statements and expressions. Statements are instructions that perform
some action and do not return a value. Expressions evaluate to a resulting value. Let’s look at some
examples.
Creating a variable and assigning a value to it with the let keyword is a statement. In Listing 3-1,
let y = 6; is a statement.
Filename: src/main.rs
fn main() {
    let y = 6;
}
Function definitions are also statements; the entire preceding example is a statement in itself.
Statements do not return values. Therefore, you can’t assign a let statement to another variable,
as the following code tries to do; you’ll get an error:
Filename: src/main.rs
fn main() {
    let x = (let y = 6);
}
When you run this program, the error you’ll get looks like this:
$ cargo run
Compiling functions v0.1.0 (file:///projects/functions)
error: expected expression, found statement (`let`)
 --> src/main.rs:2:14
  |
2 |     let x = (let y = 6);
  |              ^^^
  |
  = note: variable declaration using `let` is a statement
The let y = 6 statement does not return a value, so there isn’t anything for x to bind to. This is
different from what happens in other languages, such as C and Ruby, where the assignment
returns the value of the assignment. In those languages, you can write x = y = 6 and have both
x and y have the value 6 ; that is not the case in Rust.
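A related detail can be checked with a short sketch: assigning to an existing variable is accepted where a value is expected, but the value it produces is the empty tuple () rather than the number that was assigned. The assign_six function is a hypothetical helper.

```rust
// Hypothetical helper: assigns 6 to y and returns both the value produced
// by the assignment and the final value of y.
fn assign_six() -> ((), i32) {
    let mut y = 5;
    println!("y starts as {}", y);

    // The block's final expression is the assignment `y = 6`.
    // Its value is the empty tuple, not 6.
    let x: () = { y = 6 };

    (x, y)
}

fn main() {
    let (x, y) = assign_six();
    println!("x is {:?} and y is {}", x, y);
}
```

So even for plain assignment, chaining in the style of x = y = 6 does not give x the value 6 .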
Expressions evaluate to something and make up most of the rest of the code that you’ll write in
Rust. Consider a simple math operation, such as 5 + 6 , which is an expression that evaluates to
the value 11 . Expressions can be part of statements: in Listing 3-1, the 6 in the statement
let y = 6; is an expression that evaluates to the value 6 . Calling a function is an expression.
Calling a macro is an expression. The block that we use to create new scopes, {} , is an
expression, for example:
Filename: src/main.rs
fn main() {
    let x = 5;
    let y = {
        let x = 3;
        x + 1
    };
    println!("The value of y is: {}", y);
}
This expression:
{
    let x = 3;
    x + 1
}
is a block that, in this case, evaluates to 4 . That value gets bound to y as part of the let
statement. Note the x + 1 line without a semicolon at the end, which is unlike most of the lines
you’ve seen so far. Expressions do not include ending semicolons. If you add a semicolon to the
end of an expression, you turn it into a statement, which will then not return a value. Keep this in
mind as you explore function return values and expressions next.
Functions can return values to the code that calls them. We don’t name return values, but we do
declare their type after an arrow ( -> ). In Rust, the return value of the function is synonymous with
the value of the final expression in the block of the body of a function. You can return early from a
function by using the return keyword and specifying a value, but most functions return the last
expression implicitly. Here’s an example of a function that returns a value:
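As a sketch of both ways of returning a value, the hypothetical function below uses return to leave early in one case and otherwise falls through to a final expression. The name clamp_to_limit and the limit of 100 are assumptions made for this example.

```rust
// Hypothetical function: values above 100 are reduced to 100.
fn clamp_to_limit(n: i32) -> i32 {
    if n > 100 {
        // Early return: the rest of the body is skipped.
        return 100;
    }

    // Final expression with no semicolon: returned implicitly.
    n
}

fn main() {
    println!("clamp_to_limit(250) is {}", clamp_to_limit(250));
    println!("clamp_to_limit(7) is {}", clamp_to_limit(7));
}
```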
Filename: src/main.rs
fn five() -> i32 {
    5
}

fn main() {
    let x = five();
    println!("The value of x is: {}", x);
}
There are no function calls, macros, or even let statements in the five function—just the
number 5 by itself. That’s a perfectly valid function in Rust. Note that the function’s return type is
specified, too, as -> i32 . Try running this code; the output should look like this:
$ cargo run
Compiling functions v0.1.0 (file:///projects/functions)
Finished dev [unoptimized + debuginfo] target(s) in 0.30 secs
Running `target/debug/functions`
The value of x is: 5
The 5 in five is the function’s return value, which is why the return type is i32 . Let’s examine
this in more detail. There are two important bits: first, the line let x = five(); shows that we’re
using the return value of a function to initialize a variable. Because the function five returns a 5 ,
that line is the same as the following:
let x = 5;
Second, the five function has no parameters and defines the type of the return value, but the
body of the function is a lonely 5 with no semicolon because it’s an expression whose value we
want to return.
Filename: src/main.rs
fn main() {
    let x = plus_one(5);
    println!("The value of x is: {}", x);
}

fn plus_one(x: i32) -> i32 {
    x + 1
}
Running this code will print The value of x is: 6 . But if we place a semicolon at the end of the
line containing x + 1 , changing it from an expression to a statement, we’ll get an error.
Filename: src/main.rs
fn main() {
    let x = plus_one(5);
    println!("The value of x is: {}", x);
}

fn plus_one(x: i32) -> i32 {
    x + 1;
}
The main error message, “mismatched types,” reveals the core issue with this code. The definition
of the function plus_one says that it will return an i32 , but statements don’t evaluate to a value,
which is expressed by () , the empty tuple. Therefore, nothing is returned, which contradicts the
function definition and results in an error. In this output, Rust provides a message to possibly help
rectify this issue: it suggests removing the semicolon, which would fix the error.
Comments
All programmers strive to make their code easy to understand, but sometimes extra explanation is
warranted. In these cases, programmers leave notes, or comments, in their source code that the
compiler will ignore but people reading the source code may find useful.
// hello, world
In Rust, comments must start with two slashes and continue until the end of the line. For
comments that extend beyond a single line, you’ll need to include // on each line, like this:
// So we’re doing something complicated here, long enough that we need
// multiple lines of comments to do it! Whew! Hopefully, this comment will
// explain what’s going on.
Filename: src/main.rs
fn main() {
    let lucky_number = 7; // I’m feeling lucky today
}
But you’ll more often see them used in this format, with the comment on a separate line above
the code it’s annotating:
Filename: src/main.rs
fn main() {
    // I’m feeling lucky today
    let lucky_number = 7;
}
Rust also has another kind of comment, documentation comments, which we’ll discuss in Chapter
14.
Control Flow
Deciding whether or not to run some code depending on if a condition is true and deciding to run
some code repeatedly while a condition is true are basic building blocks in most programming
languages. The most common constructs that let you control the flow of execution of Rust code
are if expressions and loops.
if Expressions
An if expression allows you to branch your code depending on conditions. You provide a
condition and then state, “If this condition is met, run this block of code. If the condition is not met,
do not run this block of code.”
Create a new project called branches in your projects directory to explore the if expression. In the
src/main.rs file, input the following:
Filename: src/main.rs
fn main() {
    let number = 3;
    if number < 5 {
        println!("condition was true");
    } else {
        println!("condition was false");
    }
}
All if expressions start with the keyword if , which is followed by a condition. In this case, the
condition checks whether or not the variable number has a value less than 5. The block of code we
want to execute if the condition is true is placed immediately after the condition inside curly
brackets. Blocks of code associated with the conditions in if expressions are sometimes called
arms, just like the arms in match expressions that we discussed in the “Comparing the Guess to
the Secret Number” section of Chapter 2.
Optionally, we can also include an else expression, which we chose to do here, to give the
program an alternative block of code to execute should the condition evaluate to false. If you don’t
provide an else expression and the condition is false, the program will just skip the if block and
move on to the next bit of code.
Try running this code; you should see the following output:
$ cargo run
Compiling branches v0.1.0 (file:///projects/branches)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/branches`
condition was true
Let’s try changing the value of number to a value that makes the condition false to see what
happens:
let number = 7;
$ cargo run
Compiling branches v0.1.0 (file:///projects/branches)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/branches`
condition was false
It’s also worth noting that the condition in this code must be a bool . If the condition isn’t a bool ,
we’ll get an error. For example, try running the following code:
Filename: src/main.rs
fn main() {
    let number = 3;
    if number {
        println!("number was three");
    }
}
The if condition evaluates to a value of 3 this time, and Rust throws an error:
The error indicates that Rust expected a bool but got an integer. Unlike languages such as Ruby
and JavaScript, Rust will not automatically try to convert non-Boolean types to a Boolean. You
must be explicit and always provide if with a Boolean as its condition. If we want the if code
block to run only when a number is not equal to 0 , for example, we can change the if
expression to the following:
Filename: src/main.rs
fn main() {
    let number = 3;
    if number != 0 {
        println!("number was something other than zero");
    }
}
Running this code will print number was something other than zero .
You can have multiple conditions by combining if and else in an else if expression. For
example:
Filename: src/main.rs
fn main() {
    let number = 6;
    if number % 4 == 0 {
        println!("number is divisible by 4");
    } else if number % 3 == 0 {
        println!("number is divisible by 3");
    } else if number % 2 == 0 {
        println!("number is divisible by 2");
    } else {
        println!("number is not divisible by 4, 3, or 2");
    }
}
This program has four possible paths it can take. After running it, you should see the following
output:
$ cargo run
Compiling branches v0.1.0 (file:///projects/branches)
Finished dev [unoptimized + debuginfo] target(s) in 0.31 secs
Running `target/debug/branches`
number is divisible by 3
When this program executes, it checks each if expression in turn and executes the first body for
which the condition holds true. Note that even though 6 is divisible by 2, we don’t see the output
number is divisible by 2 , nor do we see the number is not divisible by 4, 3, or 2 text
from the else block. That’s because Rust only executes the block for the �rst true condition, and
once it �nds one, it doesn’t even check the rest.
Using too many else if expressions can clutter your code, so if you have more than one, you
might want to refactor your code. Chapter 6 describes a powerful Rust branching construct called
match for these cases.
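As a preview only, here is a minimal sketch of how the divisibility checks might look with match. The describe helper and the tuple-of-remainders pattern are illustrative assumptions, not the form Chapter 6 necessarily uses; the function returns the message rather than printing it so the chosen branch is easy to check.

```rust
// Hypothetical helper (not from the text): returns the message instead of
// printing it, so the chosen branch is easy to check.
fn describe(number: i32) -> &'static str {
    // Match on all three remainders at once; `_` matches any value.
    // Arms are tried top to bottom, and only the first matching arm runs.
    match (number % 4, number % 3, number % 2) {
        (0, _, _) => "number is divisible by 4",
        (_, 0, _) => "number is divisible by 3",
        (_, _, 0) => "number is divisible by 2",
        _ => "number is not divisible by 4, 3, or 2",
    }
}

fn main() {
    // 6 is divisible by both 3 and 2, but only the first matching arm is used.
    println!("{}", describe(6));
}
```

As with the else if chain, the order of the arms matters: swapping the second and third arms would change the result for 6.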
Because if is an expression, we can use it on the right side of a let statement, as in Listing 3-2.
Filename: src/main.rs
fn main() {
    let condition = true;
    let number = if condition {
        5
    } else {
        6
    };
    println!("The value of number is: {}", number);
}
The number variable will be bound to a value based on the outcome of the if expression. Run
this code to see what happens:
$ cargo run
Compiling branches v0.1.0 (file:///projects/branches)
Finished dev [unoptimized + debuginfo] target(s) in 0.30 secs
Running `target/debug/branches`
The value of number is: 5
Remember that blocks of code evaluate to the last expression in them, and numbers by
themselves are also expressions. In this case, the value of the whole if expression depends on
which block of code executes. This means the values that have the potential to be results from
each arm of the if must be the same type; in Listing 3-2, the results of both the if arm and the
else arm were i32 integers. If the types are mismatched, as in the following example, we’ll get
an error:
Filename: src/main.rs
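fn main() {
    let condition = true;
    let number = if condition { 5 } else { "six" }; // does not compile: the arms have different types
    println!("The value of number is: {}", number);
}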
When we try to compile this code, we’ll get an error. The if and else arms have value types that
are incompatible, and Rust indicates exactly where to find the problem in the program:
The expression in the if block evaluates to an integer, and the expression in the else block
evaluates to a string. This won’t work because variables must have a single type. Rust needs to
know at compile time what type the number variable is, definitively, so it can verify at compile time
that its type is valid everywhere we use number . Rust wouldn’t be able to do that if the type of
number was only determined at runtime; the compiler would be more complex and would make
fewer guarantees about the code if it had to keep track of multiple hypothetical types for any
variable.
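One way to satisfy this rule is to make both arms produce the same type. The sketch below is an illustration under that assumption: the describe helper is hypothetical, and it converts the integer to a String so that both arms agree.

```rust
// Hypothetical helper: both arms of the `if` evaluate to a String,
// so the variable bound by `let` has a single, known type.
fn describe(condition: bool) -> String {
    let text = if condition {
        5.to_string() // integer converted to a String
    } else {
        String::from("six")
    };
    text
}

fn main() {
    println!("The value is: {}", describe(true));
    println!("The value is: {}", describe(false));
}
```

Whether converting to a String is the right fix depends on what the program needs; the point is only that the compiler can name one type for the whole if expression.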
It’s often useful to execute a block of code more than once. For this task, Rust provides several
loops. A loop runs through the code inside the loop body to the end and then starts immediately
back at the beginning. To experiment with loops, let’s make a new project called loops.
Rust has three kinds of loops: loop , while , and for . Let’s try each one.
The loop keyword tells Rust to execute a block of code over and over again forever or until you
explicitly tell it to stop.
As an example, change the src/main.rs file in your loops directory to look like this:
Filename: src/main.rs
fn main() {
loop {
println!("again!");
}
}
When we run this program, we’ll see again! printed over and over continuously until we stop the
program manually. Most terminals support a keyboard shortcut, ctrl-c, to halt a program that is
stuck in a continual loop. Give it a try:
$ cargo run
Compiling loops v0.1.0 (file:///projects/loops)
Finished dev [unoptimized + debuginfo] target(s) in 0.29 secs
Running `target/debug/loops`
again!
again!
again!
again!
^Cagain!
The symbol ^C represents where you pressed ctrl-c . You may or may not see the word again!
printed after the ^C , depending on where the code was in the loop when it received the halt
signal.
Fortunately, Rust provides another, more reliable way to break out of a loop. You can place the
break keyword within the loop to tell the program when to stop executing the loop. Recall that
we did this in the guessing game in the “Quitting After a Correct Guess” section of Chapter 2 to exit
the program when the user won the game by guessing the correct number.
One of the uses of a loop is to retry an operation you know can fail, such as checking if a thread
completed its job. However, you might need to pass the result of that operation to the rest of your
code. If you add it to the break expression you use to stop the loop, it will be returned by the
broken loop:
fn main() {
    let mut counter = 0;
    let result = loop {
        counter += 1;
        if counter == 10 {
            break counter * 2; // the value after break becomes the value of the loop
        }
    };
    assert_eq!(result, 20);
}
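The retry scenario mentioned above can be sketched in the same way. Here job_is_done is a made-up stand-in for "checking if a thread completed its job" (it simply reports success on the third check), and checks_needed is a hypothetical wrapper that returns the loop's value.

```rust
// Hypothetical stand-in for a real check: pretends the job finishes
// by the time of the third check.
fn job_is_done(check: u32) -> bool {
    check >= 3
}

// Keeps checking until the job is done and returns how many checks it took.
fn checks_needed() -> u32 {
    let mut checks = 0;
    // The expression after `break` is the value of the whole loop.
    let result = loop {
        checks += 1;
        if job_is_done(checks) {
            break checks;
        }
    };
    result
}

fn main() {
    println!("job finished after {} checks", checks_needed());
}
```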
It’s often useful for a program to evaluate a condition within a loop. While the condition is true, the
loop runs. When the condition ceases to be true, the program calls break , stopping the loop. This
loop type could be implemented using a combination of loop , if , else , and break ; you could
try that now in a program, if you’d like.
However, this pattern is so common that Rust has a built-in language construct for it, called a
while loop. Listing 3-3 uses while : the program loops three times, counting down each time,
and then, after the loop, it prints another message and exits.
Filename: src/main.rs
fn main() {
let mut number = 3;
while number != 0 {
println!("{}!", number);
number = number - 1;
}
println!("LIFTOFF!!!");
}
Listing 3-3: Using a while loop to run code while a condition holds true
This construct eliminates a lot of nesting that would be necessary if you used loop , if , else ,
and break , and it’s clearer. While a condition holds true, the code runs; otherwise, it exits the
loop.
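For comparison, here is a rough sketch of the same countdown written with loop, if, else, and break. The countdown_with_loop function is an assumption made for illustration; it collects the lines into a Vec (a collection type covered later in the book) so the result can be checked.

```rust
// Hypothetical helper: the countdown from Listing 3-3, written without `while`.
fn countdown_with_loop(start: u32) -> Vec<String> {
    let mut lines = Vec::new();
    let mut number = start;
    loop {
        if number != 0 {
            lines.push(format!("{}!", number));
            number -= 1;
        } else {
            // The condition no longer holds, so stop the loop by hand.
            break;
        }
    }
    lines.push(String::from("LIFTOFF!!!"));
    lines
}

fn main() {
    for line in countdown_with_loop(3) {
        println!("{}", line);
    }
}
```

The extra level of nesting and the explicit break are exactly what the while form removes.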
You could use the while construct to loop over the elements of a collection, such as an array. For
example, let’s look at Listing 3-4.
Filename: src/main.rs
fn main() {
    let a = [10, 20, 30, 40, 50];
    let mut index = 0;
    while index < 5 {
        println!("the value is: {}", a[index]);
        index = index + 1;
    }
}
Listing 3-4: Looping through each element of a collection using a while loop
Here, the code counts up through the elements in the array. It starts at index 0 , and then loops
until it reaches the final index in the array (that is, when index < 5 is no longer true). Running
this code will print every element in the array:
$ cargo run
Compiling loops v0.1.0 (file:///projects/loops)
Finished dev [unoptimized + debuginfo] target(s) in 0.32 secs
Running `target/debug/loops`
the value is: 10
the value is: 20
the value is: 30
the value is: 40
the value is: 50
All five array values appear in the terminal, as expected. Even though index will reach a value of
5 at some point, the loop stops executing before trying to fetch a sixth value from the array.
But this approach is error prone; we could cause the program to panic if the index value or the
loop condition is incorrect. It’s also slow, because the compiler adds runtime code to perform a
bounds check on the index on every iteration through the loop.
As a more concise alternative, you can use a for loop and execute some code for each item in a
collection. A for loop looks like the code in Listing 3-5.
Filename: src/main.rs
fn main() {
    let a = [10, 20, 30, 40, 50];
    for element in a.iter() {
        println!("the value is: {}", element);
    }
}
Listing 3-5: Looping through each element of a collection using a for loop
When we run this code, we’ll see the same output as in Listing 3-4. More importantly, we’ve now
increased the safety of the code and eliminated the chance of bugs that might result from going
beyond the end of the array or not going far enough and missing some items.
For example, in the code in Listing 3-4, if you removed an item from the a array but forgot to
update the condition to while index < 4 , the code would panic. Using the for loop, you
wouldn’t need to remember to change any other code if you changed the number of values in the
array.
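To make this concrete, here is a small sketch of a for loop that keeps working when the number of values changes. The sum_all helper is hypothetical, and it takes a slice (a type introduced later in the book) so that arrays of different lengths can be passed in.

```rust
// Hypothetical helper: adds up every element, however many there are.
// No index and no hard-coded length appear anywhere in the loop.
fn sum_all(values: &[i32]) -> i32 {
    let mut total = 0;
    for value in values.iter() {
        total += value;
    }
    total
}

fn main() {
    let a = [10, 20, 30, 40, 50];
    let shorter = [10, 20, 30, 40]; // one item removed; the loop needs no change
    println!("{} {}", sum_all(&a), sum_all(&shorter));
}
```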
The safety and conciseness of for loops make them the most commonly used loop construct in
Rust. Even in situations in which you want to run some code a certain number of times, as in the
countdown example that used a while loop in Listing 3-3, most Rustaceans would use a for
loop. The way to do that would be to use a Range , which is a type provided by the standard library
that generates all numbers in sequence starting from one number and ending before another
number.
Here’s what the countdown would look like using a for loop and another method we’ve not yet
talked about, rev , to reverse the range:
Filename: src/main.rs
fn main() {
for number in (1..4).rev() {
println!("{}!", number);
}
println!("LIFTOFF!!!");
}
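To see which numbers a range actually produces, the following sketch collects them into a Vec (a collection covered later in the book). The countdown helper is a hypothetical name used only for this illustration.

```rust
// Hypothetical helper: the numbers from `from` down to 1.
// The range 1..from + 1 starts at 1 and ends before from + 1.
fn countdown(from: u32) -> Vec<u32> {
    (1..from + 1).rev().collect()
}

fn main() {
    // 1..4 yields 1, 2, and 3; the end of the range is excluded.
    let forward: Vec<u32> = (1..4).collect();
    println!("{:?}", forward);

    for number in countdown(3) {
        println!("{}!", number);
    }
    println!("LIFTOFF!!!");
}
```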
Summary
You made it! That was a sizable chapter: you learned about variables, scalar and compound data
types, functions, comments, if expressions, and loops! If you want to practice with the concepts
discussed in this chapter, try building programs to do the following:
When you’re ready to move on, we’ll talk about a concept in Rust that doesn’t commonly exist in
other programming languages: ownership.
Understanding Ownership
Ownership is Rust’s most unique feature, and it enables Rust to make memory safety guarantees
without needing a garbage collector. Therefore, it’s important to understand how ownership
works in Rust. In this chapter, we’ll talk about ownership as well as several related features:
borrowing, slices, and how Rust lays data out in memory.
What Is Ownership?
Rust’s central feature is ownership. Although the feature is straightforward to explain, it has deep
implications for the rest of the language.
All programs have to manage the way they use a computer’s memory while running. Some
languages have garbage collection that constantly looks for no longer used memory as the
program runs; in other languages, the programmer must explicitly allocate and free the memory.
Rust uses a third approach: memory is managed through a system of ownership with a set of rules
that the compiler checks at compile time. None of the ownership features slow down your
program while it’s running.
Because ownership is a new concept for many programmers, it does take some time to get used
to. The good news is that the more experienced you become with Rust and the rules of the
ownership system, the more you’ll be able to naturally develop code that is safe and efficient.
Keep at it!
When you understand ownership, you’ll have a solid foundation for understanding the features
that make Rust unique. In this chapter, you’ll learn ownership by working through some examples
that focus on a very common data structure: strings.
In many programming languages, you don’t have to think about the stack and the heap very
often. But in a systems programming language like Rust, whether a value is on the stack or
the heap has more of an effect on how the language behaves and why you have to make
certain decisions. Parts of ownership will be described in relation to the stack and the heap
later in this chapter, so here is a brief explanation in preparation.
Both the stack and the heap are parts of memory that are available to your code to use at
runtime, but they are structured in different ways. The stack stores values in the order it
gets them and removes the values in the opposite order. This is referred to as last in, first
out. Think of a stack of plates: when you add more plates, you put them on top of the pile,
and when you need a plate, you take one off the top. Adding or removing plates from the
middle or bottom wouldn’t work as well! Adding data is called pushing onto the stack, and
removing data is called popping off the stack.
The stack is fast because of the way it accesses the data: it never has to search for a place to
put new data or a place to get data from because that place is always the top. Another
property that makes the stack fast is that all data on the stack must take up a known, fixed
size.
Data with a size unknown at compile time or a size that might change can be stored on the
heap instead. The heap is less organized: when you put data on the heap, you ask for some
amount of space. The operating system finds an empty spot somewhere in the heap that is
big enough, marks it as being in use, and returns a pointer, which is the address of that
location. This process is called allocating on the heap, sometimes abbreviated as just
“allocating.” Pushing values onto the stack is not considered allocating. Because the pointer
is a known, fixed size, you can store the pointer on the stack, but when you want the actual
data, you have to follow the pointer.
Think of being seated at a restaurant. When you enter, you state the number of people in
your group, and the staff finds an empty table that fits everyone and leads you there. If
someone in your group comes late, they can ask where you’ve been seated to find you.
Accessing data in the heap is slower than accessing data on the stack because you have to
follow a pointer to get there. Contemporary processors are faster if they jump around less
in memory. Continuing the analogy, consider a server at a restaurant taking orders from
many tables. It’s most efficient to get all the orders at one table before moving on to the
next table. Taking an order from table A, then an order from table B, then one from A again,
and then one from B again would be a much slower process. By the same token, a
processor can do its job better if it works on data that’s close to other data (as it is on the
stack) rather than farther away (as it can be on the heap). Allocating a large amount of space
on the heap can also take time.
When your code calls a function, the values passed into the function (including, potentially,
pointers to data on the heap) and the function’s local variables get pushed onto the stack.
When the function is over, those values get popped off the stack.
Keeping track of what parts of code are using what data on the heap, minimizing the
amount of duplicate data on the heap, and cleaning up unused data on the heap so you
don’t run out of space are all problems that ownership addresses. Once you understand
ownership, you won’t need to think about the stack and the heap very often, but knowing
that managing heap data is why ownership exists can help explain why it works the way it
does.
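The idea of a "known, fixed size" can be observed directly. The sketch below is only an illustration: stack_sizes is a hypothetical helper, and it uses std::mem::size_of_val to measure the part of each value that would live on the stack. For a String, that measurement covers only the fixed-size handle, not the text it points to on the heap.

```rust
use std::mem::size_of_val;

// Hypothetical helper: reports the fixed sizes of a few values.
fn stack_sizes() -> (usize, usize, bool) {
    let n: i32 = 5;
    let a: [i32; 5] = [10, 20, 30, 40, 50];
    let short = String::from("hi");
    let long = String::from("a much longer piece of text");
    (
        size_of_val(&n), // an i32 is always 4 bytes
        size_of_val(&a), // five i32 values, 20 bytes
        // The String handles are the same size even though the heap
        // contents they point to differ in length.
        size_of_val(&short) == size_of_val(&long),
    )
}

fn main() {
    println!("{:?}", stack_sizes());
}
```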
Ownership Rules
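First, let’s take a look at the ownership rules. Keep these rules in mind as we work through the
examples that illustrate them:
Each value in Rust has a variable that’s called its owner.
There can only be one owner at a time.
When the owner goes out of scope, the value will be dropped.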
Variable Scope
We’ve walked through an example of a Rust program already in Chapter 2. Now that we’re past
basic syntax, we won’t include all the fn main() { code in examples, so if you’re following along,
you’ll have to put the following examples inside a main function manually. As a result, our
examples will be a bit more concise, letting us focus on the actual details rather than boilerplate
code.
As a first example of ownership, we’ll look at the scope of some variables. A scope is the range
within a program for which an item is valid. Let’s say we have a variable that looks like this:
let s = "hello";
The variable s refers to a string literal, where the value of the string is hardcoded into the text of
our program. The variable is valid from the point at which it’s declared until the end of the current
scope. Listing 4-1 has comments annotating where the variable s is valid.
{ // s is not valid here, it’s not yet declared
let s = "hello"; // s is valid from this point forward
// do stuff with s
} // this scope is now over, and s is no longer valid
At this point, the relationship between scopes and when variables are valid is similar to that in
other programming languages. Now we’ll build on top of this understanding by introducing the
String type.
To illustrate the rules of ownership, we need a data type that is more complex than the ones we
covered in the “Data Types” section of Chapter 3. The types covered previously are all stored on
the stack and popped off the stack when their scope is over, but we want to look at data that is
stored on the heap and explore how Rust knows when to clean up that data.
We’ll use String as the example here and concentrate on the parts of String that relate to
ownership. These aspects also apply to other complex data types provided by the standard library
and that you create. We’ll discuss String in more depth in Chapter 8.
We’ve already seen string literals, where a string value is hardcoded into our program. String
literals are convenient, but they aren’t suitable for every situation in which we may want to use
text. One reason is that they’re immutable. Another is that not every string value can be known
when we write our code: for example, what if we want to take user input and store it? For these
situations, Rust has a second string type, String . This type is allocated on the heap and as such is
able to store an amount of text that is unknown to us at compile time. You can create a String
from a string literal using the from function, like so:
let s = String::from("hello");
The double colon ( :: ) is an operator that allows us to namespace this particular from function
under the String type rather than using some sort of name like string_from . We’ll discuss this
syntax more in the “Method Syntax” section of Chapter 5 and when we talk about namespacing
with modules in “Paths for Referring to an Item in the Module Tree” in Chapter 7.
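let mut s = String::from("hello");
s.push_str(", world!"); // push_str() appends a literal to a String
println!("{}", s); // This will print `hello, world!`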
So, what’s the difference here? Why can String be mutated but literals cannot? The difference is
how these two types deal with memory.
In the case of a string literal, we know the contents at compile time, so the text is hardcoded
directly into the final executable. This is why string literals are fast and efficient. But these
properties only come from the string literal’s immutability. Unfortunately, we can’t put a blob of
memory into the binary for each piece of text whose size is unknown at compile time and whose
size might change while running the program.
With the String type, in order to support a mutable, growable piece of text, we need to allocate
an amount of memory on the heap, unknown at compile time, to hold the contents. This means:
The memory must be requested from the operating system at runtime.
We need a way of returning this memory to the operating system when we’re done with our String.
That first part is done by us: when we call String::from, its implementation requests the
memory it needs. This is pretty much universal in programming languages.
However, the second part is different. In languages with a garbage collector (GC), the GC keeps
track of and cleans up memory that isn’t being used anymore, and we don’t need to think about it.
Without a GC, it’s our responsibility to identify when memory is no longer being used and call code
to explicitly return it, just as we did to request it. Doing this correctly has historically been a
difficult programming problem. If we forget, we’ll waste memory. If we do it too early, we’ll have an
invalid variable. If we do it twice, that’s a bug too. We need to pair exactly one allocate with
exactly one free.
Rust takes a different path: the memory is automatically returned once the variable that owns it
goes out of scope. Here’s a version of our scope example from Listing 4-1 using a String instead
of a string literal:
{
let s = String::from("hello"); // s is valid from this point forward
// do stuff with s
} // this scope is now over, and s is no
// longer valid
There is a natural point at which we can return the memory our String needs to the operating
system: when s goes out of scope. When a variable goes out of scope, Rust calls a special
function for us. This function is called drop , and it’s where the author of String can put the code
to return the memory. Rust calls drop automatically at the closing curly bracket.
Note: In C++, this pattern of deallocating resources at the end of an item’s lifetime is
sometimes called Resource Acquisition Is Initialization (RAII). The drop function in Rust will be
familiar to you if you’ve used RAII patterns.
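To watch drop run at the closing curly bracket, one option is to implement the Drop trait on a small type of our own. Everything in this sketch is an assumption made for illustration: Tracked is a made-up type, and it uses a Cell and a lifetime annotation (both covered much later in the book) only so the effect can be observed from outside the scope.

```rust
use std::cell::Cell;

// Hypothetical type: flips a shared flag when it is dropped.
struct Tracked<'a> {
    dropped: &'a Cell<bool>,
}

impl Drop for Tracked<'_> {
    // Rust calls this automatically when a Tracked value goes out of scope.
    fn drop(&mut self) {
        self.dropped.set(true);
    }
}

// Returns the flag as seen inside the inner scope and after it ends.
fn dropped_at_scope_end() -> (bool, bool) {
    let flag = Cell::new(false);
    let inside;
    {
        let _value = Tracked { dropped: &flag };
        inside = flag.get(); // _value is still alive here
    } // _value goes out of scope here, and drop is called
    (inside, flag.get())
}

fn main() {
    println!("{:?}", dropped_at_scope_end());
}
```

String does the same kind of thing in its own drop code, except that it returns its heap memory instead of setting a flag.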
This pattern has a profound impact on the way Rust code is written. It may seem simple right now,
but the behavior of code can be unexpected in more complicated situations when we want to
have multiple variables use the data we’ve allocated on the heap. Let’s explore some of those
situations now.
Multiple variables can interact with the same data in different ways in Rust. Let’s look at an
example using an integer in Listing 4-2.
let x = 5;
let y = x;
We can probably guess what this is doing: “bind the value 5 to x ; then make a copy of the value
in x and bind it to y .” We now have two variables, x and y , and both equal 5 . This is indeed
what is happening, because integers are simple values with a known, fixed size, and these two 5
values are pushed onto the stack.
let s1 = String::from("hello");
let s2 = s1;
This looks very similar to the previous code, so we might assume that the way it works would be
the same: that is, the second line would make a copy of the value in s1 and bind it to s2 . But this
isn’t quite what happens.
Take a look at Figure 4-1 to see what is happening to String under the covers. A String is made
up of three parts, shown on the left: a pointer to the memory that holds the contents of the string,
a length, and a capacity. This group of data is stored on the stack. On the right is the memory on
the heap that holds the contents.
[Figure 4-1 diagram: on the left, s1 with fields ptr, len = 5, and capacity = 5; ptr points to heap data on the right, indexes 0 through 4 holding h, e, l, l, o]
Figure 4-1: Representation in memory of a String holding the value "hello" bound to s1
The length is how much memory, in bytes, the contents of the String is currently using. The
capacity is the total amount of memory, in bytes, that the String has received from the operating
system. The difference between length and capacity matters, but not in this context, so for now,
it’s fine to ignore the capacity.
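The length and capacity can be inspected with the len and capacity methods on String. The sketch below is illustrative: string_parts is a hypothetical helper, and it only asserts that capacity is at least the requested or needed amount, since the exact figure is up to the implementation.

```rust
// Hypothetical helper: looks at the length and capacity of two Strings.
fn string_parts() -> (usize, bool, usize, bool) {
    let s1 = String::from("hello");

    // Ask for room for at least 32 bytes up front, then use only 2 of them.
    let mut roomy = String::with_capacity(32);
    roomy.push_str("hi");

    (
        s1.len(),                    // bytes currently in use: 5
        s1.capacity() >= s1.len(),   // capacity is never smaller than length
        roomy.len(),                 // 2 bytes in use
        roomy.capacity() >= 32,      // but at least 32 bytes were received
    )
}

fn main() {
    println!("{:?}", string_parts());
}
```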
When we assign s1 to s2 , the String data is copied, meaning we copy the pointer, the length,
and the capacity that are on the stack. We do not copy the data on the heap that the pointer refers
to. In other words, the data representation in memory looks like Figure 4-2.
[Figure 4-2 diagram: s1 and s2 each hold ptr, len = 5, and capacity = 5; both ptr fields point to the same heap data, indexes 0 through 4 holding h, e, l, l, o]
Figure 4-2: Representation in memory of the variable s2 that has a copy of the pointer, length, and capacity of s1
The representation does not look like Figure 4-3, which is what memory would look like if Rust
instead copied the heap data as well. If Rust did this, the operation s2 = s1 could be very
expensive in terms of runtime performance if the data on the heap were large.
[Figure 4-3 diagram: s1 and s2 each hold ptr, len = 5, and capacity = 5; each ptr points to its own separate copy of the heap data h, e, l, l, o]
Figure 4-3: Another possibility for what s2 = s1 might do if Rust copied the heap data as well
Earlier, we said that when a variable goes out of scope, Rust automatically calls the drop function
and cleans up the heap memory for that variable. But Figure 4-2 shows both data pointers
pointing to the same location. This is a problem: when s2 and s1 go out of scope, they will both
try to free the same memory. This is known as a double free error and is one of the memory safety
bugs we mentioned previously. Freeing memory twice can lead to memory corruption, which can
potentially lead to security vulnerabilities.
To ensure memory safety, there’s one more detail to what happens in this situation in Rust.
Instead of trying to copy the allocated memory, Rust considers s1 to no longer be valid and,
therefore, Rust doesn’t need to free anything when s1 goes out of scope. Check out what
happens when you try to use s1 after s2 is created; it won’t work:
let s1 = String::from("hello");
let s2 = s1;
println!("{}, world!", s1); // does not compile: s1 was moved into s2
You’ll get an error like this because Rust prevents you from using the invalidated reference:
If you’ve heard the terms shallow copy and deep copy while working with other languages, the
concept of copying the pointer, length, and capacity without copying the data probably sounds like
making a shallow copy. But because Rust also invalidates the first variable, instead of being called
a shallow copy, it’s known as a move. In this example, we would say that s1 was moved into s2 .
So what actually happens is shown in Figure 4-4.
[Figure 4-4 diagram: the same layout as Figure 4-2, but with s1 no longer valid, so only the ptr in s2 points to the heap data h, e, l, l, o]
That solves our problem! With only s2 valid, when it goes out of scope, it alone will free the
memory, and we’re done.
In addition, there’s a design choice that’s implied by this: Rust will never automatically create
“deep” copies of your data. Therefore, any automatic copying can be assumed to be inexpensive in
terms of runtime performance.
If we do want to deeply copy the heap data of the String , not just the stack data, we can use a
common method called clone . We’ll discuss method syntax in Chapter 5, but because methods
are a common feature in many programming languages, you’ve probably seen them before.
let s1 = String::from("hello");
let s2 = s1.clone();
This works just fine and explicitly produces the behavior shown in Figure 4-3, where the heap data
does get copied.
When you see a call to clone , you know that some arbitrary code is being executed and that code
may be expensive. It’s a visual indicator that something different is going on.
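The difference between a move and a clone can be checked by comparing the addresses of the heap data. This is a sketch under stated assumptions: move_vs_clone is a hypothetical helper, and it uses as_ptr, which returns a raw pointer to the String's buffer; the raw pointer is only compared, never dereferenced.

```rust
// Hypothetical helper: compares heap addresses before and after
// a move, and between a String and its clone.
fn move_vs_clone() -> (bool, bool) {
    let s1 = String::from("hello");
    let before = s1.as_ptr(); // address of the heap data

    let s2 = s1; // move: only the pointer, length, and capacity are copied
    let moved_same_buffer = s2.as_ptr() == before;

    let s3 = s2.clone(); // clone: the heap data itself is copied
    let clone_new_buffer = s3.as_ptr() != s2.as_ptr();

    (moved_same_buffer, clone_new_buffer)
}

fn main() {
    println!("{:?}", move_vs_clone());
}
```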
There’s another wrinkle we haven’t talked about yet. This code using integers, part of which was
shown in Listing 4-2, works and is valid:
let x = 5;
let y = x;
But this code seems to contradict what we just learned: we don’t have a call to clone , but x is
still valid and wasn’t moved into y .
The reason is that types such as integers that have a known size at compile time are stored
entirely on the stack, so copies of the actual values are quick to make. That means there’s no
reason we would want to prevent x from being valid after we create the variable y . In other
words, there’s no difference between deep and shallow copying here, so calling clone wouldn’t
do anything different from the usual shallow copying and we can leave it out.
Rust has a special annotation called the Copy trait that we can place on types like integers that are
stored on the stack (we’ll talk more about traits in Chapter 10). If a type has the Copy trait, an
older variable is still usable after assignment. Rust won’t let us annotate a type with the Copy trait
if the type, or any of its parts, has implemented the Drop trait. If the type needs something special
to happen when the value goes out of scope and we add the Copy annotation to that type, we’ll
get a compile-time error. To learn about how to add the Copy annotation to your type, see
“Derivable Traits” in Appendix C.
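So what types are Copy? You can check the documentation for the given type to be sure, but as a
general rule, any group of simple scalar values can be Copy, and nothing that requires allocation
or is some form of resource is Copy. Here are some of the types that are Copy:
All the integer types, such as u32.
The Boolean type, bool, with values true and false.
All the floating point types, such as f64.
The character type, char.
Tuples, if they only contain types that are also Copy. For example, (i32, i32) is Copy, but (i32, String) is not.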
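A short sketch can confirm that values of such types remain usable after assignment. The Point struct and the copies_stay_usable helper are assumptions for illustration; Point opts in to Copy with a derive attribute, which works here because both of its fields are themselves Copy.

```rust
// Hypothetical type: a struct made only of Copy fields can derive Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Every original variable is still used after being assigned elsewhere.
fn copies_stay_usable() -> bool {
    let x = 5;
    let y = x; // integer: copied, x stays valid

    let flag = true;
    let flag2 = flag; // bool: copied

    let pair = (1, 2.5);
    let pair2 = pair; // tuple of Copy types: copied

    let p = Point { x: 1, y: 2 };
    let q = p; // derived Copy: copied, p stays valid

    x == y && flag == flag2 && pair == pair2 && p == q && p.x + p.y == 3
}

fn main() {
    println!("{}", copies_stay_usable());
}
```

Adding a String field to Point would make the derive fail to compile, because String is not Copy.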
The semantics for passing a value to a function are similar to those for assigning a value to a
variable. Passing a variable to a function will move or copy, just as assignment does. Listing 4-3
has an example with some annotations showing where variables go into and out of scope.
Filename: src/main.rs
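fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s); // s's value moves into the function and so is no longer valid here
    let x = 5; // x comes into scope; i32 is Copy, so passing x to a function would leave it usable
} // Here, x goes out of scope, then s. But because s's value was moved, nothing
  // special happens.
fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{}", some_string);
} // Here, some_string goes out of scope and `drop` is called.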
If we tried to use s after the call to takes_ownership , Rust would throw a compile-time error.
These static checks protect us from mistakes. Try adding code to main that uses s and x to see
where you can use them and where the ownership rules prevent you from doing so.
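Here is one possible way to run that experiment. It is a sketch, not the book's listing: makes_copy is a hypothetical helper, and both functions return a value (which the text does not require) purely so the effect of each call can be checked.

```rust
// Takes ownership of the String; it is dropped when this function returns.
fn takes_ownership(some_string: String) -> usize {
    println!("{}", some_string);
    some_string.len()
}

// Hypothetical helper: receives a copy of the integer, leaving the
// caller's variable untouched.
fn makes_copy(some_integer: i32) -> i32 {
    println!("{}", some_integer);
    some_integer + 1
}

fn main() {
    let s = String::from("hello");
    let len = takes_ownership(s); // s is moved and can no longer be used
    // println!("{}", s); // uncommenting this line causes a compile-time error

    let x = 5;
    let y = makes_copy(x); // x is copied, so it is still valid below
    println!("len = {}, x = {}, y = {}", len, x, y);
}
```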
Returning values can also transfer ownership. Listing 4-4 is an example with similar annotations to
those in Listing 4-3.
Filename: src/main.rs
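fn main() {
    let s1 = gives_ownership(); // gives_ownership moves its return
                                // value into s1
    println!("{}", s1);
} // Here, s1 goes out of scope and is dropped.
fn gives_ownership() -> String { // moves its return value into the function that calls it
    String::from("hello") // contents assumed here; the new String moves out to the caller
}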
The ownership of a variable follows the same pattern every time: assigning a value to another
variable moves it. When a variable that includes data on the heap goes out of scope, the value will
be cleaned up by drop unless the data has been moved to be owned by another variable.
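The pattern can be sketched with two small functions. gives_ownership appears in the listing above; takes_and_gives_back is a hypothetical name for a function that accepts a String and hands the same String back, and the "hello" contents are assumed.

```rust
// Creates a String and moves it out to the caller.
fn gives_ownership() -> String {
    let some_string = String::from("hello");
    some_string // returned, so it is moved rather than dropped
}

// Hypothetical helper: takes ownership of a String and returns it,
// moving it back to the caller.
fn takes_and_gives_back(a_string: String) -> String {
    a_string
}

fn main() {
    let s1 = gives_ownership(); // the return value moves into s1
    let s2 = String::from("hello");
    let s3 = takes_and_gives_back(s2); // s2 moves in; the result moves into s3
    println!("{} {}", s1, s3);
} // s3 and s1 are dropped here; s2 was moved, so nothing happens for it
```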
Taking ownership and then returning ownership with every function is a bit tedious. What if we
want to let a function use a value but not take ownership? It’s quite annoying that anything we
pass in also needs to be passed back if we want to use it again, in addition to any data resulting
from the body of the function that we might want to return as well.
It’s possible to return multiple values using a tuple, as shown in Listing 4-5.
Filename: src/main.rs
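fn main() {
    let s1 = String::from("hello");
    let (s2, len) = calculate_length(s1);
    println!("The length of '{}' is {}.", s2, len);
}
fn calculate_length(s: String) -> (String, usize) {
    let length = s.len(); // len() returns the length of a String
    (s, length)
}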
But this is too much ceremony and a lot of work for a concept that should be common. Luckily for
us, Rust has a feature for this concept, called references.
Here is how you would define and use a calculate_length function that has a reference to an
object as a parameter instead of taking ownership of the value:
Filename: src/main.rs
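fn main() {
    let s1 = String::from("hello");
    let len = calculate_length(&s1);
    println!("The length of '{}' is {}.", s1, len);
}
fn calculate_length(s: &String) -> usize {
    s.len()
}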
First, notice that all the tuple code in the variable declaration and the function return value is
gone. Second, note that we pass &s1 into calculate_length and, in its definition, we take
&String rather than String.
These ampersands are references, and they allow you to refer to some value without taking
ownership of it. Figure 4-5 shows a diagram.
[Figure 4-5 diagram: s holds a ptr that points to s1; s1 holds ptr, len = 5, and capacity = 5, and its ptr points to heap data, indexes 0 through 4 holding h, e, l, l, o]
Note: The opposite of referencing by using & is dereferencing, which is accomplished with
the dereference operator, * . We’ll see some uses of the dereference operator in Chapter 8
and discuss details of dereferencing in Chapter 15.
let s1 = String::from("hello");
let len = calculate_length(&s1);
The &s1 syntax lets us create a reference that refers to the value of s1 but does not own it.
Because it does not own it, the value it points to will not be dropped when the reference goes out
of scope.
Likewise, the signature of the function uses & to indicate that the type of the parameter s is a
reference. Let’s add some explanatory annotations:
fn calculate_length(s: &String) -> usize { // s is a reference to a String
s.len()
} // Here, s goes out of scope. But because it does not have ownership of what
// it refers to, nothing happens.
The scope in which the variable s is valid is the same as any function parameter’s scope, but we
don’t drop what the reference points to when it goes out of scope because we don’t have
ownership. When functions have references as parameters instead of the actual values, we won’t
need to return the values in order to give back ownership, because we never had ownership.
We call having references as function parameters borrowing. As in real life, if a person owns
something, you can borrow it from them. When you’re done, you have to give it back.
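Because borrowing never takes ownership, the same value can be lent out several times and is still ours afterward. In this sketch, shout and borrow_twice are hypothetical helpers added for illustration; calculate_length is the function discussed above.

```rust
fn calculate_length(s: &String) -> usize {
    s.len()
}

// Hypothetical helper: reads the borrowed String and builds a new one.
fn shout(s: &String) -> String {
    s.to_uppercase()
}

// Lends s1 out twice, then hands s1 itself back to the caller.
fn borrow_twice() -> (usize, String, String) {
    let s1 = String::from("hello");
    let len = calculate_length(&s1); // first borrow ends when the call returns
    let loud = shout(&s1); // second borrow
    (len, loud, s1) // s1 is still owned here, so it can be moved out
}

fn main() {
    println!("{:?}", borrow_twice());
}
```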
So what happens if we try to modify something we’re borrowing? Try the code in Listing 4-6.
Spoiler alert: it doesn’t work!
Filename: src/main.rs
fn main() {
let s = String::from("hello");
change(&s);
}
fn change(some_string: &String) {
some_string.push_str(", world");
}
Just as variables are immutable by default, so are references. We’re not allowed to modify
something we have a reference to.
Mutable References
We can fix the error in the code from Listing 4-6 with just a small tweak:
Filename: src/main.rs
fn main() {
    let mut s = String::from("hello");
    change(&mut s);
}
fn change(some_string: &mut String) {
    some_string.push_str(", world");
}
First, we had to change s to be mut . Then we had to create a mutable reference with &mut s
and accept a mutable reference with some_string: &mut String .
But mutable references have one big restriction: you can have only one mutable reference to a
particular piece of data in a particular scope. This code will fail:
Filename: src/main.rs
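let mut s = String::from("hello");
let r1 = &mut s;
let r2 = &mut s; // does not compile: second mutable borrow while r1 is still in use
println!("{}, {}", r1, r2);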
This restriction allows for mutation but in a very controlled fashion. It’s something that new
Rustaceans struggle with, because most languages let you mutate whenever you’d like.
The benefit of having this restriction is that Rust can prevent data races at compile time. A data
race is similar to a race condition and happens when these three behaviors occur:
Two or more pointers access the same data at the same time.
At least one of the pointers is being used to write to the data.
There’s no mechanism being used to synchronize access to the data.
Data races cause undefined behavior and can be difficult to diagnose and fix when you’re trying to
track them down at runtime; Rust prevents this problem from happening because it won’t even
compile code with data races!
As always, we can use curly brackets to create a new scope, allowing for multiple mutable
references, just not simultaneous ones:
let mut s = String::from("hello");
{
    let r1 = &mut s;
} // r1 goes out of scope here, so we can make a new reference with no problems.
let r2 = &mut s;
A similar rule exists for combining mutable and immutable references: taking a mutable reference
to a value while immutable references to it are still in use results in a compile-time error.
We cannot have a mutable reference while we have an immutable one. Users of an
immutable reference don’t expect the values to suddenly change out from under them! However,
multiple immutable references are okay because no one who is just reading the data has the
ability to affect anyone else’s reading of the data.
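The rule above can be sketched in a small runnable program. This is only an illustration: the function name borrow_demo is made up for this example, and it assumes a compiler that treats a borrow as finished after its last use, so the mutable borrow is accepted once the immutable ones are no longer needed.

```rust
// Hypothetical demo function: reads through two immutable references, then
// mutates through a single mutable reference once the readers are done.
fn borrow_demo() -> String {
    let mut s = String::from("hello");

    let r1 = &s; // immutable borrow
    let r2 = &s; // a second immutable borrow is fine: both only read
    // let r3 = &mut s; // rejected if uncommented: r1 and r2 are used below
    let total_len = r1.len() + r2.len();
    assert_eq!(total_len, 10);

    // r1 and r2 are not used past this point, so a mutable borrow is accepted.
    let r3 = &mut s;
    r3.push_str(", world");
    s
}

fn main() {
    println!("{}", borrow_demo());
}
```

Uncommenting the marked line makes the compiler report the conflict at the exact place where the mutable borrow overlaps the immutable ones.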
Even though these errors may be frustrating at times, remember that it’s the Rust compiler
pointing out a potential bug early (at compile time rather than at runtime) and showing you
exactly where the problem is. Then you don’t have to track down why your data isn’t what you
thought it was.
Dangling References
In languages with pointers, it’s easy to erroneously create a dangling pointer, a pointer that
references a location in memory that may have been given to someone else, by freeing some
memory while preserving a pointer to that memory. In Rust, by contrast, the compiler guarantees
that references will never be dangling references: if you have a reference to some data, the
compiler will ensure that the data will not go out of scope before the reference to the data does.
Let’s try to create a dangling reference, which Rust will prevent with a compile-time error:
Filename: src/main.rs
fn main() {
    let reference_to_nothing = dangle();
}
fn dangle() -> &String {
    let s = String::from("hello");
    &s
}
This error message refers to a feature we haven’t covered yet: lifetimes. We’ll discuss lifetimes in
detail in Chapter 10. But, if you disregard the parts about lifetimes, the message does contain the
key to why this code is a problem:
this function's return type contains a borrowed value, but there is no value
for it to be borrowed from.
Let’s take a closer look at exactly what’s happening at each stage of our dangle code. The
function promises to return a reference to a String . Inside it, s is a new String , and the final
expression &s is a reference to that String .
Because s is created inside dangle , when the code of dangle is finished, s will be deallocated.
But we tried to return a reference to it. That means this reference would be pointing to an invalid
String . That’s no good! Rust won’t let us do this.
The solution here is to return the String directly:
fn no_dangle() -> String {
    let s = String::from("hello");
    s
}
This works without any problems. Ownership is moved out, and nothing is deallocated.
At any given time, you can have either one mutable reference or any number of immutable
references.
References must always be valid.
Another data type that does not have ownership is the slice. Slices let you reference a contiguous
sequence of elements in a collection rather than the whole collection.
Here’s a small programming problem: write a function that takes a string and returns the first
word it finds in that string. If the function doesn’t find a space in the string, the whole string must
be one word, so the entire string should be returned.
This function, first_word , has a &String as a parameter. We don’t want ownership, so this is
fine. But what should we return? We don’t really have a way to talk about part of a string. However,
we could return the index of the end of the word. Let’s try that, as shown in Listing 4-7.
Filename: src/main.rs
fn first_word(s: &String) -> usize {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' { return i; }
    }
    s.len()
}
Listing 4-7: The first_word function that returns a byte index value into the String parameter
Because we need to go through the String element by element and check whether a value is a
space, we’ll convert our String to an array of bytes using the as_bytes method:
let bytes = s.as_bytes();
Next, we create an iterator over the array of bytes using the iter method:
for (i, &item) in bytes.iter().enumerate() {
We’ll discuss iterators in more detail in Chapter 13. For now, know that iter is a method that
returns each element in a collection and that enumerate wraps the result of iter and returns
each element as part of a tuple instead. The first element of the tuple returned from enumerate is
the index, and the second element is a reference to the element. This is a bit more convenient
than calculating the index ourselves.
Because the enumerate method returns a tuple, we can use patterns to destructure that tuple,
just like everywhere else in Rust. So in the for loop, we specify a pattern that has i for the index
in the tuple and &item for the single byte in the tuple. Because we get a reference to the element
from .iter().enumerate() , we use & in the pattern.
Inside the for loop, we search for the byte that represents the space by using the byte literal
syntax. If we find a space, we return the position. Otherwise, we return the length of the string by
using s.len() :
    if item == b' ' {
        return i;
    }
}
s.len()
We now have a way to find out the index of the end of the first word in the string, but there’s a
problem. We’re returning a usize on its own, but it’s only a meaningful number in the context of
the &String . In other words, because it’s a separate value from the String , there’s no guarantee
that it will still be valid in the future. Consider the program in Listing 4-8 that uses the first_word
function from Listing 4-7.
Filename: src/main.rs
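fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // word will get the value 5
    s.clear(); // this empties the String, making it equal to ""
    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}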
Listing 4-8: Storing the result from calling the first_word function and then changing the String contents
This program compiles without any errors and would also do so if we used word after calling
s.clear() . Because word isn’t connected to the state of s at all, word still contains the value 5 .
We could use that value 5 with the variable s to try to extract the first word out, but this would
be a bug because the contents of s have changed since we saved 5 in word .
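To see the stale index concretely, here is a minimal sketch. The helper stale_index_demo is a hypothetical name introduced only for this illustration; first_word follows the index-returning approach described for Listing 4-7.

```rust
// Byte index of the end of the first word, as in the index-based approach.
fn first_word(s: &String) -> usize {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return i;
        }
    }
    s.len()
}

// Hypothetical helper: returns the saved index together with the length of
// the string after it has been cleared.
fn stale_index_demo() -> (usize, usize) {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear();
    (word, s.len())
}

fn main() {
    let (word, len) = stale_index_demo();
    // The saved index still exists, but it no longer describes the string.
    println!("saved index = {}, current length = {}", word, len);
}
```

Nothing in the types connects the saved index to the string it came from, which is why the compiler has no way to flag the mismatch.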
Having to worry about the index in word getting out of sync with the data in s is tedious and
error prone! Managing these indices is even more brittle if we write a second_word function. Its
signature would have to look like this:
fn second_word(s: &String) -> (usize, usize) {
Now we’re tracking a starting and an ending index, and we have even more values that were
calculated from data in a particular state but aren’t tied to that state at all. We now have three
unrelated variables floating around that need to be kept in sync.
String Slices
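A string slice is a reference to part of a String , and it looks like this:
let s = String::from("hello world");
let (hello, world) = (&s[0..5], &s[6..11]);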
This is similar to taking a reference to the whole String but with the extra [0..5] bit. Rather
than a reference to the entire String , it’s a reference to a portion of the String . The
start..end syntax is a range that begins at start and continues up to, but not including, end . If
we want to include end , we can use ..= instead of .. :
let s = String::from("hello world");
let (hello, world) = (&s[0..=4], &s[6..=10]);
The = means that we’re including the last number, if that helps you remember the difference
between .. and ..= .
[Figure: the String s (ptr, len 11, capacity 11) stores the bytes of "hello world" at indices 0
through 10; the string slice world (ptr, len 5) points at index 6 of that same data.]
With Rust’s .. range syntax, if you want to start at the first index (zero), you can drop the value
before the two periods. In other words, these are equal:
let s = String::from("hello");
By the same token, if your slice includes the last byte of the String , you can drop the trailing
number. That means these are equal:
let s = String::from("hello");
You can also drop both values to take a slice of the entire string. So these are equal:
let s = String::from("hello");
Note: String slice range indices must occur at valid UTF-8 character boundaries. If you
attempt to create a string slice in the middle of a multibyte character, your program will exit
with an error. For the purposes of introducing string slices, we are assuming ASCII only in
this section; a more thorough discussion of UTF-8 handling is in the “Storing UTF-8 Encoded
Text with Strings” section of Chapter 8.
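The range shorthands and the character-boundary rule above can be checked with a short program. This is a sketch rather than the book’s own listing: the helper prefix is a hypothetical name, and it uses the standard str::get method, which returns None instead of exiting with an error when a range does not fall on a character boundary.

```rust
// Hypothetical helper: a checked prefix slice. `str::get` returns None when
// the range is out of bounds or does not fall on a character boundary.
fn prefix(s: &str, end: usize) -> Option<&str> {
    s.get(..end)
}

fn main() {
    let s = String::from("hello");
    let len = s.len();

    // Dropping a bound is shorthand for starting at 0 or ending at len.
    assert_eq!(&s[0..2], &s[..2]);
    assert_eq!(&s[3..len], &s[3..]);
    assert_eq!(&s[0..len], &s[..]);

    // "\u{e9}" (e with an acute accent) is two bytes long in UTF-8, so byte
    // index 1 falls in the middle of the character.
    let accented = "\u{e9}";
    assert_eq!(prefix(accented, 1), None);
    assert_eq!(prefix(accented, 2), Some(accented));
    // Indexing with &accented[..1] would panic at runtime for the same reason.

    println!("all slice checks passed");
}
```

Using the checked form is a design choice for this example only; the rest of this section keeps to plain indexing on ASCII text.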
With all this information in mind, let’s rewrite first_word to return a slice. The type that signifies
“string slice” is written as &str :
Filename: src/main.rs
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' { return &s[0..i]; }
    }
    &s[..]
}
We get the index for the end of the word in the same way as we did in Listing 4-7, by looking for
the first occurrence of a space. When we find a space, we return a string slice using the start of the
string and the index of the space as the starting and ending indices.
Now when we call first_word , we get back a single value that is tied to the underlying data. The
value is made up of a reference to the starting point of the slice and the number of elements in
the slice.
We now have a straightforward API that’s much harder to mess up, because the compiler will
ensure the references into the String remain valid. Remember the bug in the program in Listing
4-8, when we got the index to the end of the first word but then cleared the string so our index
was invalid? That code was logically incorrect but didn’t show any immediate errors. The problems
would show up later if we kept trying to use the first word index with an emptied string. Slices
make this bug impossible and let us know we have a problem with our code much sooner. Using
the slice version of first_word will throw a compile-time error:
Filename: src/main.rs
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear(); // error!
    println!("the first word is: {}", word);
}
Recall from the borrowing rules that if we have an immutable reference to something, we cannot
also take a mutable reference. Because clear needs to truncate the String , it tries to take a
mutable reference, which fails. Not only has Rust made our API easier to use, but it has also
eliminated an entire class of errors at compile time!
Recall that we talked about string literals being stored inside the binary. Now that we know about
slices, we can properly understand string literals:
let s = "Hello, world!";
The type of s here is &str : it’s a slice pointing to that specific point of the binary. This is also why
string literals are immutable; &str is an immutable reference.
Knowing that you can take slices of literals and String values leads us to one more improvement
on first_word , and that’s its signature:
fn first_word(s: &String) -> &str {
A more experienced Rustacean would write the signature shown in Listing 4-9 instead because it
allows us to use the same function on both String values and &str values.
fn first_word(s: &str) -> &str {
Listing 4-9: Improving the first_word function by using a string slice for the type of the s parameter
If we have a string slice, we can pass that directly. If we have a String , we can pass a slice of the
entire String . Defining a function to take a string slice instead of a reference to a String makes
our API more general and useful without losing any functionality:
Filename: src/main.rs
fn main() {
    let my_string = String::from("hello world");
    let word = first_word(&my_string[..]);
}
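As a runnable sketch of this more general signature, the program below calls a &str version of first_word with a slice of a String and with a string literal. The variable names in main are illustrative only.

```rust
// Takes any string slice, so it works for Strings (via a slice) and literals.
fn first_word(s: &str) -> &str {
    for (i, &item) in s.as_bytes().iter().enumerate() {
        if item == b' ' {
            return &s[..i];
        }
    }
    s
}

fn main() {
    let my_string = String::from("hello world");
    // A slice of the entire String.
    let from_string = first_word(&my_string[..]);

    let my_string_literal = "hello world";
    // String literals are already string slices, so no slice syntax is needed.
    let from_literal = first_word(my_string_literal);

    println!("{} {}", from_string, from_literal);
}
```

Both calls return a slice tied to the data it was taken from, so the borrowing rules continue to protect either kind of input.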
Other Slices
String slices, as you might imagine, are specific to strings. But there’s a more general slice type,
too. Consider this array:
let a = [1, 2, 3, 4, 5];
Just as we might want to refer to a part of a string, we might want to refer to part of an array. We’d
do so like this:
let a = [1, 2, 3, 4, 5];
let slice = &a[1..3];
This slice has the type &[i32] . It works the same way as string slices do, by storing a reference to
the first element and a length. You’ll use this kind of slice for all sorts of other collections. We’ll
discuss these collections in detail when we talk about vectors in Chapter 8.
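One reason slices of this kind are handy is that a single function can accept part of an array or part of a vector. The sketch below assumes a hypothetical helper named sum; it is not part of the original text.

```rust
// Hypothetical helper: adds up whatever elements the slice covers.
fn sum(values: &[i32]) -> i32 {
    values.iter().sum()
}

fn main() {
    let a = [1, 2, 3, 4, 5];
    let middle = &a[1..3]; // the elements at indices 1 and 2
    assert_eq!(middle, [2, 3]);

    let v = vec![10, 20, 30];
    // The same function accepts a slice of an array and a slice of a vector.
    println!("{} {}", sum(middle), sum(&v[..2]));
}
```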
Summary
The concepts of ownership, borrowing, and slices ensure memory safety in Rust programs at
compile time. The Rust language gives you control over your memory usage in the same way as
other systems programming languages, but having the owner of data automatically clean up that
data when the owner goes out of scope means you don’t have to write and debug extra code to
get this control.
Ownership affects how lots of other parts of Rust work, so we’ll talk about these concepts further
throughout the rest of the book. Let’s move on to Chapter 5 and look at grouping pieces of data
together in a struct .
To define a struct, we enter the keyword struct and name the entire struct. A struct’s name
should describe the significance of the pieces of data being grouped together. Then, inside curly
brackets, we define the names and types of the pieces of data, which we call fields. For example,
Listing 5-1 shows a struct that stores information about a user account.
struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}
To use a struct after we’ve defined it, we create an instance of that struct by specifying concrete
values for each of the fields. We create an instance by stating the name of the struct and then add
curly brackets containing key: value pairs, where the keys are the names of the fields and the
values are the data we want to store in those fields. We don’t have to specify the fields in the same
order in which we declared them in the struct. In other words, the struct definition is like a general
template for the type, and instances fill in that template with particular data to create values of the
type. For example, we can declare a particular user as shown in Listing 5-2.
let user1 = User {
    email: String::from("someone@example.com"),
    username: String::from("someusername123"),
    active: true,
    sign_in_count: 1,
};
To get a specific value from a struct, we can use dot notation. If we wanted just this user’s email
address, we could use user1.email wherever we wanted to use this value. If the instance is
mutable, we can change a value by using the dot notation and assigning into a particular field.
Listing 5-3 shows how to change the value in the email field of a mutable User instance.
let mut user1 = User {
    email: String::from("someone@example.com"),
    username: String::from("someusername123"),
    active: true,
    sign_in_count: 1,
};
user1.email = String::from("anotheremail@example.com");
Listing 5-3: Changing the value in the email field of a User instance
Note that the entire instance must be mutable; Rust doesn’t allow us to mark only certain fields as
mutable. As with any expression, we can construct a new instance of the struct as the last
expression in the function body to implicitly return that new instance.
Listing 5-4 shows a build_user function that returns a User instance with the given email and
username. The active field gets the value of true , and the sign_in_count gets a value of 1 .
fn build_user(email: String, username: String) -> User {
    User {
        email: email,
        username: username,
        active: true,
        sign_in_count: 1,
    }
}
Listing 5-4: A build_user function that takes an email and username and returns a User instance
It makes sense to name the function parameters with the same name as the struct fields, but
having to repeat the email and username field names and variables is a bit tedious. If the struct
had more fields, repeating each name would get even more annoying. Luckily, there’s a
convenient shorthand!
Using the Field Init Shorthand when Variables and Fields Have the Same Name
Because the parameter names and the struct field names are exactly the same in Listing 5-4, we
can use the field init shorthand syntax to rewrite build_user so that it behaves exactly the same
but doesn’t have the repetition of email and username , as shown in Listing 5-5.
fn build_user(email: String, username: String) -> User {
    User {
        email,
        username,
        active: true,
        sign_in_count: 1,
    }
}
Listing 5-5: A build_user function that uses field init shorthand because the email and username
parameters have the same name as struct fields
Here, we’re creating a new instance of the User struct, which has a field named email . We want
to set the email field’s value to the value in the email parameter of the build_user function.
Because the email field and the email parameter have the same name, we only need to write
email rather than email: email .
It’s often useful to create a new instance of a struct that uses most of an old instance’s values but
changes some. You’ll do this using struct update syntax.
First, Listing 5-6 shows how we create a new User instance in user2 without the update syntax.
We set new values for email and username but otherwise use the same values from user1 that
we created in Listing 5-2.
let user2 = User {
    email: String::from("another@example.com"),
    username: String::from("anotherusername567"),
    active: user1.active,
    sign_in_count: user1.sign_in_count,
};
Listing 5-6: Creating a new User instance using some of the values from user1
Using struct update syntax, we can achieve the same effect with less code, as shown in Listing 5-7.
The syntax .. specifies that the remaining fields not explicitly set should have the same value as
the fields in the given instance.
let user2 = User {
    email: String::from("another@example.com"),
    username: String::from("anotherusername567"),
    ..user1
};
Listing 5-7: Using struct update syntax to set new email and username values for a User instance but use
the rest of the values from the fields of the instance in the user1 variable
The code in Listing 5-7 also creates an instance in user2 that has a different value for email and
username but has the same values for the active and sign_in_count fields from user1 .
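Putting the field init shorthand and struct update syntax together, a complete program might look like the following sketch. The helper with_new_identity and the example.com addresses are assumptions made for illustration; only build_user and the User fields come from the listings above.

```rust
struct User {
    username: String,
    email: String,
    sign_in_count: u64,
    active: bool,
}

// Field init shorthand: the parameters share their names with the fields.
fn build_user(email: String, username: String) -> User {
    User {
        email,
        username,
        active: true,
        sign_in_count: 1,
    }
}

// Hypothetical helper: builds a new User from `base`, replacing only the
// email and username. It takes `base` by value because `..base` reads the
// remaining fields out of it.
fn with_new_identity(base: User, email: &str, username: &str) -> User {
    User {
        email: String::from(email),
        username: String::from(username),
        ..base
    }
}

fn main() {
    let user1 = build_user(
        String::from("someone@example.com"),
        String::from("someusername123"),
    );
    let user2 = with_new_identity(user1, "another@example.com", "anotherusername567");
    println!(
        "{} <{}> active={} sign-ins={}",
        user2.username, user2.email, user2.active, user2.sign_in_count
    );
}
```

Here only the active and sign_in_count values are carried over from the base instance, matching what Listing 5-7 describes.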
You can also define structs that look similar to tuples, called tuple structs. Tuple structs have the
added meaning the struct name provides but don’t have names associated with their fields;
rather, they just have the types of the fields. Tuple structs are useful when you want to give the
whole tuple a name and make the tuple be a different type than other tuples, and naming each
field as in a regular struct would be verbose or redundant.
To define a tuple struct, start with the struct keyword and the struct name followed by the types
in the tuple. For example, here are definitions and usages of two tuple structs named Color and
Point :
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);
let black = Color(0, 0, 0);
let origin = Point(0, 0, 0);
Note that the black and origin values are different types, because they’re instances of different
tuple structs. Each struct you define is its own type, even though the fields within the struct have
the same types. For example, a function that takes a parameter of type Color cannot take a
Point as an argument, even though both types are made up of three i32 values. Otherwise,
tuple struct instances behave like tuples: you can destructure them into their individual pieces,
you can use a . followed by the index to access an individual value, and so on.
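A small sketch of these points follows. The helpers channel_sum and coords are hypothetical names added for the example; they show index access, destructuring, and the fact that the two types are not interchangeable.

```rust
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

// Hypothetical helper that accepts only a Color, using index access.
fn channel_sum(color: &Color) -> i32 {
    color.0 + color.1 + color.2
}

// Hypothetical helper that destructures a Point into its three parts.
fn coords(point: Point) -> (i32, i32, i32) {
    let Point(x, y, z) = point;
    (x, y, z)
}

fn main() {
    let black = Color(0, 0, 0);
    let origin = Point(0, 0, 0);

    println!("{}", channel_sum(&black));
    // channel_sum(&origin); // rejected: expected &Color, found &Point
    println!("{:?}", coords(origin));
}
```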
You can also define structs that don’t have any fields! These are called unit-like structs because they
behave similarly to () , the unit type. Unit-like structs can be useful in situations in which you
need to implement a trait on some type but don’t have any data that you want to store in the type
itself. We’ll discuss traits in Chapter 10.
In the User struct definition in Listing 5-1, we used the owned String type rather than the
&str string slice type. This is a deliberate choice because we want instances of this struct to
own all of its data and for that data to be valid for as long as the entire struct is valid.
It’s possible for structs to store references to data owned by something else, but to do so
requires the use of lifetimes, a Rust feature that we’ll discuss in Chapter 10. Lifetimes ensure
that the data referenced by a struct is valid for as long as the struct is. Let’s say you try to
store a reference in a struct without specifying lifetimes, like this, which won’t work:
Filename: src/main.rs
struct User {
    username: &str,
    email: &str,
    sign_in_count: u64,
    active: bool,
}
fn main() {
    let user1 = User {
        email: "someone@example.com",
        username: "someusername123",
        active: true,
        sign_in_count: 1,
    };
}
In Chapter 10, we’ll discuss how to fix these errors so you can store references in structs,
but for now, we’ll fix errors like these using owned types like String instead of references
like &str .
Let’s make a new binary project with Cargo called rectangles that will take the width and height of a
rectangle specified in pixels and calculate the area of the rectangle. Listing 5-8 shows a short
program with one way of doing exactly that in our project’s src/main.rs.
Filename: src/main.rs
fn main() {
    let width1 = 30;
    let height1 = 50;
    println!(
        "The area of the rectangle is {} square pixels.",
        area(width1, height1)
    );
}
fn area(width: u32, height: u32) -> u32 {
    width * height
}
Listing 5-8: Calculating the area of a rectangle specified by separate width and height variables
Even though Listing 5-8 works and figures out the area of the rectangle by calling the area
function with each dimension, we can do better. The width and the height are related to each
other because together they describe one rectangle.
The area function is supposed to calculate the area of one rectangle, but the function we wrote
has two parameters. The parameters are related, but that’s not expressed anywhere in our
program. It would be more readable and more manageable to group width and height together.
We’ve already discussed one way we might do that in “The Tuple Type” section of Chapter 3: by
using tuples.
Listing 5-9 shows another version of our program that uses tuples.
Filename: src/main.rs
fn main() {
    let rect1 = (30, 50);
    println!(
        "The area of the rectangle is {} square pixels.",
        area(rect1)
    );
}
fn area(dimensions: (u32, u32)) -> u32 {
    dimensions.0 * dimensions.1
}
Listing 5-9: Specifying the width and height of the rectangle with a tuple
In one way, this program is better. Tuples let us add a bit of structure, and we’re now passing just
one argument. But in another way, this version is less clear: tuples don’t name their elements, so
our calculation has become more confusing because we have to index into the parts of the tuple.
It doesn’t matter if we mix up width and height for the area calculation, but if we want to draw the
rectangle on the screen, it would matter! We would have to keep in mind that width is the tuple
index 0 and height is the tuple index 1 . If someone else worked on this code, they would have
to figure this out and keep it in mind as well. It would be easy to forget or mix up these values and
cause errors, because we haven’t conveyed the meaning of our data in our code.
We use structs to add meaning by labeling the data. We can transform the tuple we’re using into a
data type with a name for the whole as well as names for the parts, as shown in Listing 5-10.
Filename: src/main.rs
struct Rectangle {
    width: u32,
    height: u32,
}
fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!(
        "The area of the rectangle is {} square pixels.",
        area(&rect1)
    );
}
fn area(rectangle: &Rectangle) -> u32 {
    rectangle.width * rectangle.height
}
Here we’ve defined a struct and named it Rectangle . Inside the curly brackets, we defined the
fields as width and height , both of which have type u32 . Then in main , we created a particular
instance of Rectangle that has a width of 30 and a height of 50.
Our area function is now defined with one parameter, which we’ve named rectangle , whose
type is an immutable borrow of a struct Rectangle instance. As mentioned in Chapter 4, we want
to borrow the struct rather than take ownership of it. This way, main retains its ownership and
can continue using rect1 , which is the reason we use the & in the function signature and where
we call the function.
The area function accesses the width and height fields of the Rectangle instance. Our
function signature for area now says exactly what we mean: calculate the area of Rectangle ,
using its width and height fields. This conveys that the width and height are related to each
other, and it gives descriptive names to the values rather than using the tuple index values of 0
and 1 . This is a win for clarity.
It’d be nice to be able to print an instance of Rectangle while we’re debugging our program and
see the values for all its fields. Listing 5-11 tries using the println! macro as we have used in
previous chapters. This won’t work, however.
Filename: src/main.rs
struct Rectangle {
    width: u32,
    height: u32,
}
fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!("rect1 is {}", rect1);
}
When we run this code, we get an error whose core message is that Rectangle doesn’t implement
the std::fmt::Display trait.
The println! macro can do many kinds of formatting, and by default, the curly brackets tell
println! to use formatting known as Display : output intended for direct end user
consumption. The primitive types we’ve seen so far implement Display by default, because
there’s only one way you’d want to show a 1 or any other primitive type to a user. But with
structs, the way println! should format the output is less clear because there are more display
possibilities: Do you want commas or not? Do you want to print the curly brackets? Should all the
fields be shown? Due to this ambiguity, Rust doesn’t try to guess what we want, and structs don’t
have a provided implementation of Display .
The compiler’s output also suggests trying the {:?} format specifier instead.
Let’s try it! The println! macro call will now look like println!("rect1 is {:?}", rect1); .
Putting the specifier :? inside the curly brackets tells println! we want to use an output format
called Debug . The Debug trait enables us to print our struct in a way that is useful for developers
so we can see its value while we’re debugging our code.
Run the code with this change. Drat! We still get an error, this time because Rectangle doesn’t
implement the Debug trait.
Rust does include functionality to print out debugging information, but we have to explicitly opt in
to make that functionality available for our struct. To do that, we add the annotation
#[derive(Debug)] just before the struct definition, as shown in Listing 5-12.
Filename: src/main.rs
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}
fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!("rect1 is {:?}", rect1);
}
Listing 5-12: Adding the annotation to derive the Debug trait and printing the Rectangle instance using
debug formatting
Now when we run the program, we won’t get any errors, and we’ll see the following output:
rect1 is Rectangle { width: 30, height: 50 }
Nice! It’s not the prettiest output, but it shows the values of all the fields for this instance, which
would definitely help during debugging. When we have larger structs, it’s useful to have output
that’s a bit easier to read; in those cases, we can use {:#?} instead of {:?} in the println!
string. When we use the {:#?} style in the example, the output will look like this:
rect1 is Rectangle {
    width: 30,
    height: 50,
}
Rust has provided a number of traits for us to use with the derive annotation that can add useful
behavior to our custom types. Those traits and their behaviors are listed in Appendix C. We’ll cover
how to implement these traits with custom behavior as well as how to create your own traits in
Chapter 10.
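As a quick check of what the derived Debug implementation produces, the sketch below formats a Rectangle into a String with format! rather than printing it directly. The helper debug_string is a hypothetical name used only here.

```rust
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}

fn area(rectangle: &Rectangle) -> u32 {
    rectangle.width * rectangle.height
}

// Hypothetical helper: renders the compact Debug form as a String.
fn debug_string(rectangle: &Rectangle) -> String {
    format!("{:?}", rectangle)
}

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!("rect1 is {}", debug_string(&rect1));
    // The pretty-printed form puts each field on its own line.
    println!("rect1 is {:#?}", rect1);
    println!("area is {}", area(&rect1));
}
```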
Our area function is very specific: it only computes the area of rectangles. It would be helpful to
tie this behavior more closely to our Rectangle struct, because it won’t work with any other type.
Let’s look at how we can continue to refactor this code by turning the area function into an area
method defined on our Rectangle type.
Method Syntax
Methods are similar to functions: they’re declared with the fn keyword and their name, they can
have parameters and a return value, and they contain some code that is run when they’re called
from somewhere else. However, methods are different from functions in that they’re defined
within the context of a struct (or an enum or a trait object, which we cover in Chapters 6 and 17,
respectively), and their first parameter is always self , which represents the instance of the struct
the method is being called on.
Defining Methods
Let’s change the area function that has a Rectangle instance as a parameter and instead make
an area method defined on the Rectangle struct, as shown in Listing 5-13.
Filename: src/main.rs
#[derive(Debug)]
struct Rectangle {
    width: u32,
    height: u32,
}
impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
}
fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    println!(
        "The area of the rectangle is {} square pixels.",
        rect1.area()
    );
}
To define the function within the context of Rectangle , we start an impl (implementation) block.
Then we move the area function within the impl curly brackets and change the first (and in this
case, only) parameter to be self in the signature and everywhere within the body. In main ,
where we called the area function and passed rect1 as an argument, we can instead use
method syntax to call the area method on our Rectangle instance. The method syntax goes after
an instance: we add a dot followed by the method name, parentheses, and any arguments.
In the signature for area , we use &self instead of rectangle: &Rectangle because Rust knows
the type of self is Rectangle due to this method’s being inside the impl Rectangle context.
Note that we still need to use the & before self , just as we did in &Rectangle . Methods can take
ownership of self , borrow self immutably as we’ve done here, or borrow self mutably, just
as they can any other parameter.
We’ve chosen &self here for the same reason we used &Rectangle in the function version: we
don’t want to take ownership, and we just want to read the data in the struct, not write to it. If we
wanted to change the instance that we’ve called the method on as part of what the method does,
we’d use &mut self as the first parameter. Having a method that takes ownership of the instance
by using just self as the first parameter is rare; this technique is usually used when the method
transforms self into something else and you want to prevent the caller from using the original
instance after the transformation.
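The three kinds of receiver can be compared side by side in a short sketch. Only area comes from the listing above; scale and into_square are hypothetical methods invented to illustrate &mut self and self.

```rust
#[derive(Debug, PartialEq)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // &self: reads the instance without taking ownership.
    fn area(&self) -> u32 {
        self.width * self.height
    }

    // &mut self (hypothetical method): changes the instance in place.
    fn scale(&mut self, factor: u32) {
        self.width *= factor;
        self.height *= factor;
    }

    // self (hypothetical method): consumes the instance and returns a new
    // value, so the caller cannot use the original afterward.
    fn into_square(self) -> Rectangle {
        let side = self.width.max(self.height);
        Rectangle { width: side, height: side }
    }
}

fn main() {
    let mut rect = Rectangle { width: 2, height: 3 };
    println!("area before scaling: {}", rect.area());
    rect.scale(2);
    let square = rect.into_square();
    // rect.area(); // rejected: rect was moved by into_square
    println!("{:?}", square);
}
```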
The main benefit of using methods instead of functions, in addition to using method syntax and
not having to repeat the type of self in every method’s signature, is for organization. We’ve put
all the things we can do with an instance of a type in one impl block rather than making future
users of our code search for capabilities of Rectangle in various places in the library we provide.
In C and C++, two different operators are used for calling methods: you use . if you’re
calling a method on the object directly and -> if you’re calling the method on a pointer to
the object and need to dereference the pointer first. In other words, if object is a pointer,
object->something() is similar to (*object).something() .
Rust doesn’t have an equivalent to the -> operator; instead, Rust has a feature called
automatic referencing and dereferencing. Calling methods is one of the few places in Rust that
has this behavior.
Here’s how it works: when you call a method with object.something() , Rust automatically
adds in & , &mut , or * so object matches the signature of the method. In other words, the
following are the same:
p1.distance(&p2);
(&p1).distance(&p2);
The first one looks much cleaner. This automatic referencing behavior works because
methods have a clear receiver: the type of self . Given the receiver and name of a method,
Rust can figure out definitively whether the method is reading ( &self ), mutating (
&mut self ), or consuming ( self ). The fact that Rust makes borrowing implicit for method
receivers is a big part of making ownership ergonomic in practice.
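The two calls above refer to a distance method that is not defined in this section. The sketch below supplies one possible definition, assuming a hypothetical Point struct with f64 coordinates, to show that both spellings give the same result.

```rust
// Hypothetical type: a point in the plane, introduced only for this example.
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Euclidean distance between two points; the receiver is borrowed.
    fn distance(&self, other: &Point) -> f64 {
        let dx = self.x - other.x;
        let dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }
}

fn main() {
    let p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    // Rust borrows p1 automatically in the first call; the second call
    // spells out the same borrow by hand.
    let implicit = p1.distance(&p2);
    let explicit = (&p1).distance(&p2);
    assert_eq!(implicit, explicit);
    println!("distance = {}", implicit);
}
```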
Let’s practice using methods by implementing a second method on the Rectangle struct. This
time, we want an instance of Rectangle to take another instance of Rectangle and return true
if the second Rectangle can fit completely within self ; otherwise it should return false . That
is, we want to be able to write the program shown in Listing 5-14, once we’ve defined the
can_hold method.
Filename: src/main.rs
fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    let rect2 = Rectangle { width: 10, height: 40 };
    let rect3 = Rectangle { width: 60, height: 45 };
    println!("Can rect1 hold rect2? {}", rect1.can_hold(&rect2));
    println!("Can rect1 hold rect3? {}", rect1.can_hold(&rect3));
}
And the expected output would look like the following, because both dimensions of rect2 are
smaller than the dimensions of rect1 but rect3 is wider than rect1 :
We know we want to de�ne a method, so it will be within the impl Rectangle block. The method
name will be can_hold , and it will take an immutable borrow of another Rectangle as a
parameter. We can tell what the type of the parameter will be by looking at the code that calls the
method: rect1.can_hold(&rect2) passes in &rect2 , which is an immutable borrow to rect2 , an
instance of Rectangle . This makes sense because we only need to read rect2 (rather than write,
which would mean we’d need a mutable borrow), and we want main to retain ownership of
rect2 so we can use it again after calling the can_hold method. The return value of can_hold
will be a Boolean, and the implementation will check whether the width and height of self are
both greater than the width and height of the other Rectangle , respectively. Let’s add the new
can_hold method to the impl block from Listing 5-13, shown in Listing 5-15.
Filename: src/main.rs
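impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}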
Listing 5-15: Implementing the can_hold method on Rectangle that takes another Rectangle instance as
a parameter
When we run this code with the main function in Listing 5-14, we’ll get our desired output.
Methods can take multiple parameters that we add to the signature after the self parameter,
and those parameters work just like parameters in functions.
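Put together, the pieces above can be sketched as one runnable program. The Rectangle struct here is assumed to match the one defined earlier in the chapter (two u32 fields named width and height):

```rust
// Assumed to match the Rectangle struct defined earlier in the chapter.
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // `self` is borrowed immutably; `other` is a second, ordinary parameter.
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}

fn main() {
    let rect1 = Rectangle { width: 30, height: 50 };
    let rect2 = Rectangle { width: 10, height: 40 };
    let rect3 = Rectangle { width: 60, height: 45 };

    println!("Can rect1 hold rect2? {}", rect1.can_hold(&rect2));
    println!("Can rect1 hold rect3? {}", rect1.can_hold(&rect3));

    // `main` still owns rect2 and rect3, so they remain usable here.
    println!("rect2 is {} wide, rect3 is {} wide", rect2.width, rect3.width);
}
```

Passing &rect2 rather than rect2 is what lets main keep using both rectangles after the calls.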
Associated Functions
Another useful feature of impl blocks is that we’re allowed to define functions within impl blocks
that don’t take self as a parameter. These are called associated functions because they’re
associated with the struct. They’re still functions, not methods, because they don’t have an
instance of the struct to work with. You’ve already used the String::from associated function.
Associated functions are often used for constructors that will return a new instance of the struct.
For example, we could provide an associated function that would have one dimension parameter
and use that as both width and height, thus making it easier to create a square Rectangle rather
than having to specify the same value twice:
Filename: src/main.rs
impl Rectangle {
    fn square(size: u32) -> Rectangle {
        Rectangle { width: size, height: size }
    }
}
To call this associated function, we use the :: syntax with the struct name;
let sq = Rectangle::square(3); is an example. This function is namespaced by the struct: the
:: syntax is used for both associated functions and namespaces created by modules. We’ll
discuss modules in Chapter 7.
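As a small sketch of the difference between the two call syntaxes, the program below calls the square associated function with :: and then calls the area method with a dot. The Rectangle definition is assumed from earlier in the chapter, and the derived traits are added here only so the result can be compared and printed:

```rust
// Assumed Rectangle definition; Debug and PartialEq are added for this sketch.
#[derive(Debug, PartialEq)]
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // Associated function: no `self`, so it is called as Rectangle::square(..).
    fn square(size: u32) -> Rectangle {
        Rectangle { width: size, height: size }
    }

    // Method: takes `&self`, so it is called on an instance with a dot.
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let sq = Rectangle::square(3);
    assert_eq!(sq, Rectangle { width: 3, height: 3 });
    println!("{:?} has area {}", sq, sq.area());
}
```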
Each struct is allowed to have multiple impl blocks. For example, Listing 5-15 is equivalent to the
code shown in Listing 5-16, which has each method in its own impl block.
impl Rectangle {
    fn area(&self) -> u32 {
        self.width * self.height
    }
}
impl Rectangle {
    fn can_hold(&self, other: &Rectangle) -> bool {
        self.width > other.width && self.height > other.height
    }
}
There’s no reason to separate these methods into multiple impl blocks here, but this is valid
syntax. We’ll see a case in which multiple impl blocks are useful in Chapter 10, where we discuss
generic types and traits.
Summary
Structs let you create custom types that are meaningful for your domain. By using structs, you can
keep associated pieces of data connected to each other and name each piece to make your code
clear. Methods let you specify the behavior that instances of your structs have, and associated
functions let you namespace functionality that is particular to your struct without having an
instance available.
But structs aren’t the only way you can create custom types: let’s turn to Rust’s enum feature to
add another tool to your toolbox.
Enums are a feature in many languages, but their capabilities differ in each language. Rust’s
enums are most similar to algebraic data types in functional languages, such as F#, OCaml, and
Haskell.
Defining an Enum
Let’s look at a situation we might want to express in code and see why enums are useful and more
appropriate than structs in this case. Say we need to work with IP addresses. Currently, two major
standards are used for IP addresses: version four and version six. These are the only possibilities
for an IP address that our program will come across: we can enumerate all possible values, which is
where enumeration gets its name.
Any IP address can be either a version four or a version six address, but not both at the same
time. That property of IP addresses makes the enum data structure appropriate, because enum
values can only be one of the variants. Both version four and version six addresses are still
fundamentally IP addresses, so they should be treated as the same type when the code is
handling situations that apply to any kind of IP address.
We can express this concept in code by defining an IpAddrKind enumeration and listing the
possible kinds an IP address can be, V4 and V6 . These are known as the variants of the enum:
enum IpAddrKind {
    V4,
    V6,
}
IpAddrKind is now a custom data type that we can use elsewhere in our code.
Enum Values
We can create instances of each of the two variants of IpAddrKind like this:
let four = IpAddrKind::V4;
let six = IpAddrKind::V6;
Note that the variants of the enum are namespaced under its identifier, and we use a double
colon to separate the two. The reason this is useful is that now both values IpAddrKind::V4 and
IpAddrKind::V6 are of the same type: IpAddrKind . We can then, for instance, define a function
that takes any IpAddrKind :
fn route(ip_type: IpAddrKind) { }
route(IpAddrKind::V4);
route(IpAddrKind::V6);
Using enums has even more advantages. Thinking more about our IP address type, at the moment
we don’t have a way to store the actual IP address data; we only know what kind it is. Given that
you just learned about structs in Chapter 5, you might tackle this problem as shown in Listing 6-1.
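enum IpAddrKind {
    V4,
    V6,
}
struct IpAddr {
    kind: IpAddrKind,
    address: String,
}
let home = IpAddr {
    kind: IpAddrKind::V4,
    address: String::from("127.0.0.1"),
};
let loopback = IpAddr {
    kind: IpAddrKind::V6,
    address: String::from("::1"),
};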
Listing 6-1: Storing the data and IpAddrKind variant of an IP address using a struct
Here, we’ve defined a struct IpAddr that has two fields: a kind field that is of type IpAddrKind
(the enum we defined previously) and an address field of type String . We have two instances of
this struct. The first, home , has the value IpAddrKind::V4 as its kind with associated address
data of 127.0.0.1 . The second instance, loopback , has the other variant of IpAddrKind as its
kind value, V6 , and has address ::1 associated with it. We’ve used a struct to bundle the kind
and address values together, so now the variant is associated with the value.
We can represent the same concept in a more concise way using just an enum, rather than an
enum inside a struct, by putting data directly into each enum variant. This new definition of the
IpAddr enum says that both V4 and V6 variants will have associated String values:
enum IpAddr {
    V4(String),
    V6(String),
}
We attach data to each variant of the enum directly, so there is no need for an extra struct.
There’s another advantage to using an enum rather than a struct: each variant can have different
types and amounts of associated data. Version four type IP addresses will always have four
numeric components that will have values between 0 and 255. If we wanted to store V4
addresses as four u8 values but still express V6 addresses as one String value, we wouldn’t be
able to with a struct. Enums handle this case with ease:
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}
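To see values of this enum in use, here is a minimal sketch that builds one address of each variant and turns it into text. The describe helper is hypothetical and exists only for illustration; it uses a match expression, which is introduced later in this chapter:

```rust
#[derive(Debug)]
enum IpAddr {
    V4(u8, u8, u8, u8),
    V6(String),
}

// Hypothetical helper: renders either kind of address as text.
fn describe(addr: &IpAddr) -> String {
    match addr {
        // The V4 variant carries four separate u8 values.
        IpAddr::V4(a, b, c, d) => format!("{}.{}.{}.{}", a, b, c, d),
        // The V6 variant carries a single String.
        IpAddr::V6(text) => text.clone(),
    }
}

fn main() {
    let home = IpAddr::V4(127, 0, 0, 1);
    let loopback = IpAddr::V6(String::from("::1"));

    println!("{:?} is {}", home, describe(&home));
    println!("{:?} is {}", loopback, describe(&loopback));
}
```

Both values have the single type IpAddr even though they hold different shapes of data, which is why one function can accept either.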
We’ve shown several different ways to define data structures to store version four and version six
IP addresses. However, as it turns out, wanting to store IP addresses and encode which kind they
are is so common that the standard library has a definition we can use! Let’s look at how the
standard library defines IpAddr : it has the exact enum and variants that we’ve defined and used,
but it embeds the address data inside the variants in the form of two different structs, which are
defined differently for each variant:
struct Ipv4Addr {
    // --snip--
}
struct Ipv6Addr {
    // --snip--
}
enum IpAddr {
    V4(Ipv4Addr),
    V6(Ipv6Addr),
}
This code illustrates that you can put any kind of data inside an enum variant: strings, numeric
types, or structs, for example. You can even include another enum! Also, standard library types are
often not much more complicated than what you might come up with.
Note that even though the standard library contains a definition for IpAddr , we can still create
and use our own definition without conflict because we haven’t brought the standard library’s
definition into our scope. We’ll talk more about bringing types into scope in Chapter 7.
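For comparison, here is a short sketch that uses the standard library’s types from std::net instead of our own. The home and loopback helper functions are hypothetical names chosen for this example:

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// Hypothetical helper: the IPv4 loopback address used in this chapter.
fn home() -> IpAddr {
    IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1))
}

// Hypothetical helper: ::1 written out as eight 16-bit segments.
fn loopback() -> IpAddr {
    IpAddr::V6(Ipv6Addr::new(0, 0, 0, 0, 0, 0, 0, 1))
}

fn main() {
    let v4 = home();
    let v6 = loopback();

    // Both values share the type IpAddr, whichever variant they are.
    assert!(v4.is_ipv4());
    assert!(v6.is_ipv6());
    assert!(v4.is_loopback() && v6.is_loopback());

    println!("{} and {}", v4, v6);
}
```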
Let’s look at another example of an enum in Listing 6-2: this one has a wide variety of types
embedded in its variants.
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}
Listing 6-2: A Message enum whose variants each store different amounts and types of values
Defining an enum with variants such as the ones in Listing 6-2 is similar to defining different kinds
of struct definitions, except the enum doesn’t use the struct keyword and all the variants are
grouped together under the Message type. The following structs could hold the same data that
the preceding enum variants hold:
struct QuitMessage; // unit struct
struct MoveMessage {
    x: i32,
    y: i32,
}
struct WriteMessage(String); // tuple struct
struct ChangeColorMessage(i32, i32, i32); // tuple struct
But if we used the different structs, which each have their own type, we couldn’t as easily define a
function to take any of these kinds of messages as we could with the Message enum defined in
Listing 6-2, which is a single type.
There is one more similarity between enums and structs: just as we’re able to define methods on
structs using impl , we’re also able to define methods on enums. Here’s a method named call
that we could define on our Message enum:
impl Message {
    fn call(&self) {
        // method body would be defined here
    }
}
let m = Message::Write(String::from("hello"));
m.call();
The body of the method would use self to get the value that we called the method on. In this
example, we’ve created a variable m that has the value Message::Write(String::from("hello"))
, and that is what self will be in the body of the call method when m.call() runs.
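As one possible illustration of what such a body might do, the sketch below has call return a description of the message. The return type and the wording of the strings are assumptions made for this example, not part of the original method; the body uses match, which is covered later in this chapter:

```rust
enum Message {
    Quit,
    Move { x: i32, y: i32 },
    Write(String),
    ChangeColor(i32, i32, i32),
}

impl Message {
    // Assumed body: the original leaves this undefined. Here `self` is
    // inspected and a short description of the variant is returned.
    fn call(&self) -> String {
        match self {
            Message::Quit => String::from("quit"),
            Message::Move { x, y } => format!("move to ({}, {})", x, y),
            Message::Write(text) => format!("write {}", text),
            Message::ChangeColor(r, g, b) => format!("color ({}, {}, {})", r, g, b),
        }
    }
}

fn main() {
    let messages = vec![
        Message::Quit,
        Message::Move { x: 1, y: 2 },
        Message::Write(String::from("hello")),
        Message::ChangeColor(0, 128, 255),
    ];
    // One method handles every variant, because they share one type.
    for m in &messages {
        println!("{}", m.call());
    }
}
```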
Let’s look at another enum in the standard library that is very common and useful: Option .
In the previous section, we looked at how the IpAddr enum let us use Rust’s type system to
encode more information than just the data into our program. This section explores a case study
of Option , which is another enum defined by the standard library. The Option type is used in
many places because it encodes the very common scenario in which a value could be something
or it could be nothing. Expressing this concept in terms of the type system means the compiler
can check whether you’ve handled all the cases you should be handling; this functionality can
prevent bugs that are extremely common in other programming languages.
Programming language design is often thought of in terms of which features you include, but the
features you exclude are important too. Rust doesn’t have the null feature that many other
languages have. Null is a value that means there is no value there. In languages with null, variables
can always be in one of two states: null or not-null.
In his 2009 presentation “Null References: The Billion Dollar Mistake,” Tony Hoare, the inventor of
null, has this to say:
I call it my billion-dollar mistake. At that time, I was designing the first comprehensive type
system for references in an object-oriented language. My goal was to ensure that all use of
references should be absolutely safe, with checking performed automatically by the
compiler. But I couldn’t resist the temptation to put in a null reference, simply because it
was so easy to implement. This has led to innumerable errors, vulnerabilities, and system
crashes, which have probably caused a billion dollars of pain and damage in the last forty
years.
The problem with null values is that if you try to use a null value as a not-null value, you’ll get an
error of some kind. Because this null or not-null property is pervasive, it’s extremely easy to make
this kind of error.
However, the concept that null is trying to express is still a useful one: a null is a value that is
currently invalid or absent for some reason.
The problem isn’t really with the concept but with the particular implementation. As such, Rust
does not have nulls, but it does have an enum that can encode the concept of a value being
present or absent. This enum is Option<T> , and it is defined by the standard library as follows:
enum Option<T> {
    Some(T),
    None,
}
The Option<T> enum is so useful that it’s even included in the prelude; you don’t need to bring it
into scope explicitly. In addition, so are its variants: you can use Some and None directly without
the Option:: prefix. The Option<T> enum is still just a regular enum, and Some(T) and None
are still variants of type Option<T> .
The <T> syntax is a feature of Rust we haven’t talked about yet. It’s a generic type parameter, and
we’ll cover generics in more detail in Chapter 10. For now, all you need to know is that <T> means
the Some variant of the Option enum can hold one piece of data of any type. Here are some
examples of using Option values to hold number types and string types:
let some_number = Some(5);
let some_string = Some("a string");
let absent_number: Option<i32> = None;
If we use None rather than Some , we need to tell Rust what type of Option<T> we have, because
the compiler can’t infer the type that the Some variant will hold by looking only at a None value.
When we have a Some value, we know that a value is present and the value is held within the
Some . When we have a None value, in some sense, it means the same thing as null: we don’t have
a valid value. So why is having Option<T> any better than having null?
In short, because Option<T> and T (where T can be any type) are different types, the compiler
won’t let us use an Option<T> value as if it were definitely a valid value. For example, this code
won’t compile because it’s trying to add an i8 to an Option<i8> :
let x: i8 = 5;
let y: Option<i8> = Some(5);
let sum = x + y;
Intense! In effect, this error message means that Rust doesn’t understand how to add an i8 and
an Option<i8> , because they’re different types. When we have a value of a type like i8 in Rust,
the compiler will ensure that we always have a valid value. We can proceed confidently without
having to check for null before using that value. Only when we have an Option<i8> (or whatever
type of value we’re working with) do we have to worry about possibly not having a value, and the
compiler will make sure we handle that case before using the value.
In other words, you have to convert an Option<T> to a T before you can perform T operations
with it. Generally, this helps catch one of the most common issues with null: assuming that
something isn’t null when it actually is.
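A minimal sketch of that conversion, assuming a hypothetical add_optional function that treats a missing value as adding nothing (that policy is a choice made for this example):

```rust
// Hypothetical helper: adds `y` to `x` only when `y` actually holds a value.
fn add_optional(x: i8, y: Option<i8>) -> i8 {
    match y {
        // Inside this arm, `value` is a plain i8, so `+` is allowed.
        Some(value) => x + value,
        // There is no inner value here, so we must decide what to do instead.
        None => x,
    }
}

fn main() {
    let x: i8 = 5;
    let y: Option<i8> = Some(5);

    // `x + y` would not compile; going through the match does.
    println!("sum = {}", add_optional(x, y));
    println!("sum with None = {}", add_optional(x, None));
}
```

The compiler forces the None arm to exist, so the missing-value case cannot be forgotten.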
Not having to worry about incorrectly assuming a not-null value helps you to be more confident in
your code. In order to have a value that can possibly be null, you must explicitly opt in by making
the type of that value Option<T> . Then, when you use that value, you are required to explicitly
handle the case when the value is null. Everywhere that a value has a type that isn’t an Option<T> ,
you can safely assume that the value isn’t null. This was a deliberate design decision for Rust to
limit null’s pervasiveness and increase the safety of Rust code.
So, how do you get the T value out of a Some variant when you have a value of type Option<T>
so you can use that value? The Option<T> enum has a large number of methods that are useful in
a variety of situations; you can check them out in its documentation. Becoming familiar with the
methods on Option<T> will be extremely useful in your journey with Rust.
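A few of those methods, sketched with the standard library’s is_some, is_none, unwrap_or, and map. The double_or_zero helper is a hypothetical name used only for this example:

```rust
// Hypothetical helper: doubles the inner value, or falls back to 0 for None.
fn double_or_zero(value: Option<i32>) -> i32 {
    value.map(|n| n * 2).unwrap_or(0)
}

fn main() {
    let some_number = Some(5);
    let absent_number: Option<i32> = None;

    // is_some and is_none report which variant we have.
    assert!(some_number.is_some());
    assert!(absent_number.is_none());

    // unwrap_or extracts the inner value, substituting a default for None.
    assert_eq!(some_number.unwrap_or(0), 5);
    assert_eq!(absent_number.unwrap_or(0), 0);

    // map transforms the inner value and leaves None untouched.
    assert_eq!(double_or_zero(some_number), 10);
    assert_eq!(double_or_zero(absent_number), 0);
}
```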
In general, in order to use an Option<T> value, you want to have code that will handle each
variant. You want some code that will run only when you have a Some(T) value, and this code is
allowed to use the inner T . You want some other code to run if you have a None value, and that
code doesn’t have a T value available. The match expression is a control flow construct that does
just this when used with enums: it will run different code depending on which variant of the enum
it has, and that code can use the data inside the matching value.
Think of a match expression as being like a coin-sorting machine: coins slide down a track with
variously sized holes along it, and each coin falls through the first hole it encounters that it fits
into. In the same way, values go through each pattern in a match , and at the first pattern the
value “fits,” the value falls into the associated code block to be used during execution.
Because we just mentioned coins, let’s use them as an example using match ! We can write a
function that can take an unknown United States coin and, much like the coin-sorting
machine, determine which coin it is and return its value in cents, as shown here in Listing 6-3.
enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}
Listing 6-3: An enum and a match expression that has the variants of the enum as its patterns
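A minimal sketch of the value_in_cents function that the caption and the walkthrough below describe, written to be consistent with the later versions of the function in this chapter:

```rust
enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter,
}

fn value_in_cents(coin: Coin) -> u32 {
    // Each arm pairs a pattern with the value to return when it matches.
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter => 25,
    }
}

fn main() {
    let total = value_in_cents(Coin::Penny)
        + value_in_cents(Coin::Nickel)
        + value_in_cents(Coin::Dime)
        + value_in_cents(Coin::Quarter);
    println!("one of each coin is worth {} cents", total);
}
```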
Let’s break down the match in the value_in_cents function. First, we list the match keyword
followed by an expression, which in this case is the value coin . This seems very similar to an
expression used with if , but there’s a big difference: with if , the expression needs to return a
Boolean value, but here, it can be any type. The type of coin in this example is the Coin enum
that we defined on line 1.
Next are the match arms. An arm has two parts: a pattern and some code. The first arm here has
a pattern that is the value Coin::Penny and then the => operator that separates the pattern and
the code to run. The code in this case is just the value 1 . Each arm is separated from the next with
a comma.
When the match expression executes, it compares the resulting value against the pattern of each
arm, in order. If a pattern matches the value, the code associated with that pattern is executed. If
that pattern doesn’t match the value, execution continues to the next arm, much as in a coin-
sorting machine. We can have as many arms as we need: in Listing 6-3, our match has four arms.
The code associated with each arm is an expression, and the resulting value of the expression in
the matching arm is the value that gets returned for the entire match expression.
Curly brackets typically aren’t used if the match arm code is short, as it is in Listing 6-3 where each
arm just returns a value. If you want to run multiple lines of code in a match arm, you can use
curly brackets. For example, the following code would print “Lucky penny!” every time the function
was called with a Coin::Penny but would still return the last value of the block, 1 :
fn value_in_cents(coin: Coin) -> u32 {
    match coin {
        Coin::Penny => {
            println!("Lucky penny!");
            1
        },
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter => 25,
    }
}
Another useful feature of match arms is that they can bind to the parts of the values that match
the pattern. This is how we can extract values out of enum variants.
As an example, let’s change one of our enum variants to hold data inside it. From 1999 through
2008, the United States minted quarters with different designs for each of the 50 states on one
side. No other coins got state designs, so only quarters have this extra value. We can add this
information to our enum by changing the Quarter variant to include a UsState value stored
inside it, which we’ve done here in Listing 6-4.
#[derive(Debug)] // so we can inspect the state in a minute
enum UsState {
    Alabama,
    Alaska,
    // --snip--
}
enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),
}
Listing 6-4: A Coin enum in which the Quarter variant also holds a UsState value
Let’s imagine that a friend of ours is trying to collect all 50 state quarters. While we sort our loose
change by coin type, we’ll also call out the name of the state associated with each quarter so if it’s
one our friend doesn’t have, they can add it to their collection.
In the match expression for this code, we add a variable called state to the pattern that matches
values of the variant Coin::Quarter . When a Coin::Quarter matches, the state variable will
bind to the value of that quarter’s state. Then we can use state in the code for that arm, like so:
fn value_in_cents(coin: Coin) -> u32 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter(state) => {
            println!("State quarter from {:?}!", state);
            25
        },
    }
}
In the previous section, we wanted to get the inner T value out of the Some case when using
Option<T> ; we can also handle Option<T> using match as we did with the Coin enum! Instead
of comparing coins, we’ll compare the variants of Option<T> , but the way that the match
expression works remains the same.
Let’s say we want to write a function that takes an Option<i32> and, if there’s a value inside, adds
1 to that value. If there isn’t a value inside, the function should return the None value and not
attempt to perform any operations.
This function is very easy to write, thanks to match , and will look like Listing 6-5.
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}
let five = Some(5);
let six = plus_one(five);
let none = plus_one(None);
Let’s examine the first execution of plus_one in more detail. When we call plus_one(five) , the
variable x in the body of plus_one will have the value Some(5) . We then compare that against
each match arm.
The Some(5) value doesn’t match the pattern None , so we continue to the next arm.
Does Some(5) match Some(i) ? Why yes it does! We have the same variant. The i binds to the
value contained in Some , so i takes the value 5 . The code in the match arm is then executed, so
we add 1 to the value of i and create a new Some value with our total 6 inside.
Now let’s consider the second call of plus_one in Listing 6-5, where x is None . We enter the
match and compare to the first arm.
It matches! There’s no value to add to, so the program stops and returns the None value on the
right side of => . Because the first arm matched, no other arms are compared.
Combining match and enums is useful in many situations. You’ll see this pattern a lot in Rust
code: match against an enum, bind a variable to the data inside, and then execute code based on
it. It’s a bit tricky at first, but once you get used to it, you’ll wish you had it in all languages. It’s
consistently a user favorite.
There’s one other aspect of match we need to discuss. Consider this version of our plus_one
function that has a bug and won’t compile:
We didn’t handle the None case, so this code will cause a bug. Luckily, it’s a bug Rust knows how to
catch. If we try to compile this code, we’ll get this error:
Rust knows that we didn’t cover every possible case and even knows which pattern we forgot!
Matches in Rust are exhaustive: we must exhaust every last possibility in order for the code to be
valid. Especially in the case of Option<T> , when Rust prevents us from forgetting to explicitly
handle the None case, it protects us from assuming that we have a value when we might have
null, thus making the billion-dollar mistake discussed earlier impossible.
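A sketch of the exhaustive version, with the arm the compiler asks for in place. Deleting the None arm would reproduce the rejected function described above:

```rust
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        // Without this arm the match is not exhaustive, and the compiler
        // rejects the function because the None pattern is not covered.
        None => None,
        Some(i) => Some(i + 1),
    }
}

fn main() {
    let five = Some(5);
    let six = plus_one(five);
    let none = plus_one(None);

    println!("{:?} {:?}", six, none);
}
```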
The _ Placeholder
Rust also has a pattern we can use when we don’t want to list all possible values. For example, a
u8 can have valid values of 0 through 255. If we only care about the values 1, 3, 5, and 7, we don’t
want to have to list out 0, 2, 4, 6, 8, 9 all the way up to 255. Fortunately, we don’t have to: we can
use the special pattern _ instead:
let some_u8_value = 0u8;
match some_u8_value {
    1 => println!("one"),
    3 => println!("three"),
    5 => println!("five"),
    7 => println!("seven"),
    _ => (),
}
The _ pattern will match any value. By putting it after our other arms, the _ will match all the
possible cases that aren’t specified before it. The () is just the unit value, so nothing will happen
in the _ case. As a result, we can say that we want to do nothing for all the possible values that we
don’t list before the _ placeholder.
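The same idea can be sketched as a function that returns a value from every arm instead of printing. The odd_name function and its fallback string are hypothetical, chosen only to make the effect of the _ arm visible:

```rust
// Hypothetical variant of the match above that returns a name.
fn odd_name(value: u8) -> &'static str {
    match value {
        1 => "one",
        3 => "three",
        5 => "five",
        7 => "seven",
        // Every other u8 value, from 0 to 255, lands here.
        _ => "something else",
    }
}

fn main() {
    for value in [0u8, 1, 3, 4, 7, 255].iter() {
        println!("{} is {}", value, odd_name(*value));
    }
}
```

Because _ matches anything, it needs to be the last arm; any arm placed after it could never be reached.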
However, the match expression can be a bit wordy in a situation in which we care about only one
of the cases. For this situation, Rust provides if let .
let some_u8_value = Some(0u8);
match some_u8_value {
    Some(3) => println!("three"),
    _ => (),
}
Listing 6-6: A match that only cares about executing code when the value is Some(3)
We want to do something with the Some(3) match but do nothing with any other Some value
or the None value. To satisfy the match expression, we have to add _ => () after processing just
one variant, which is a lot of boilerplate code to add.
Instead, we could write this in a shorter way using if let . The following code behaves the same
as the match in Listing 6-6:
if let Some(3) = some_u8_value {
    println!("three");
}
The syntax if let takes a pattern and an expression separated by an equal sign. It works the
same way as a match , where the expression is given to the match and the pattern is its first arm.
Using if let means less typing, less indentation, and less boilerplate code. However, you lose
the exhaustive checking that match enforces. Choosing between match and if let depends on
what you’re doing in your particular situation and whether gaining conciseness is an appropriate
trade-off for losing exhaustive checking.
In other words, you can think of if let as syntax sugar for a match that runs code when the
value matches one pattern and then ignores all other values.
We can include an else with an if let . The block of code that goes with the else is the same
as the block of code that would go with the _ case in the match expression that is equivalent to
the if let and else . Recall the Coin enum definition in Listing 6-4, where the Quarter variant
also held a UsState value. If we wanted to count all non-quarter coins we see while also
announcing the state of the quarters, we could do that with a match expression like this:
let mut count = 0;
match coin {
    Coin::Quarter(state) => println!("State quarter from {:?}!", state),
    _ => count += 1,
}
Or we could use an if let and else expression like this:
let mut count = 0;
if let Coin::Quarter(state) = coin {
    println!("State quarter from {:?}!", state);
} else {
    count += 1;
}
If you have a situation in which your program has logic that is too verbose to express using a
match , remember that if let is in your Rust toolbox as well.
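To make the counting example runnable, here is a sketch that applies the if let and else form to several coins. The count_non_quarters function and the use of a vector of coins are assumptions added for this example; the enums follow Listing 6-4 with only two states listed:

```rust
#[derive(Debug)]
enum UsState {
    Alabama,
    Alaska,
}

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),
}

// Hypothetical helper: announces each quarter's state and counts the rest.
fn count_non_quarters(coins: Vec<Coin>) -> u32 {
    let mut count = 0;
    for coin in coins {
        if let Coin::Quarter(state) = coin {
            println!("State quarter from {:?}!", state);
        } else {
            // Plays the role of the `_` arm in the equivalent match.
            count += 1;
        }
    }
    count
}

fn main() {
    let coins = vec![
        Coin::Penny,
        Coin::Quarter(UsState::Alaska),
        Coin::Nickel,
        Coin::Quarter(UsState::Alabama),
        Coin::Dime,
    ];
    println!("non-quarter coins: {}", count_non_quarters(coins));
}
```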
Summary
We’ve now covered how to use enums to create custom types that can be one of a set of
enumerated values. We’ve shown how the standard library’s Option<T> type helps you use the
type system to prevent errors. When enum values have data inside them, you can use match or
if let to extract and use those values, depending on how many cases you need to handle.
Your Rust programs can now express concepts in your domain using structs and enums. Creating
custom types to use in your API ensures type safety: the compiler will make certain your functions
get only values of the type each function expects.
In order to provide a well-organized API to your users that is straightforward to use and only
exposes exactly what your users will need, let’s now turn to Rust’s modules.
A key question when writing programs is scope: what names does the compiler know about at this
location in the code? What functions am I allowed to call? What does this variable refer to?
Rust has a number of features related to scopes. This is sometimes called “the module system,”
but it encompasses more than just modules:
Packages are a Cargo feature that let you build, test, and share crates.
Crates are a tree of modules that produce a library or executable.
Modules and the use keyword let you control the scope and privacy of paths.
A path is a way of naming an item such as a struct, function, or module.
This chapter will cover all of these concepts. You’ll be bringing names into scopes, defining scopes,
and exporting names to scopes like a pro soon!
Because Cargo created a Cargo.toml, that means we now have a package. If we look at the
contents of Cargo.toml, there’s no mention of src/main.rs. However, Cargo’s conventions are that if
you have a src directory containing main.rs in the same directory as a package’s Cargo.toml, Cargo
knows this package contains a binary crate with the same name as the package, and src/main.rs is
its crate root. Another convention of Cargo’s is that if the package directory contains src/lib.rs, the
package contains a library crate with the same name as the package, and src/lib.rs is its crate root.
The crate root files are passed by Cargo to rustc to actually build the library or binary.
A package can contain zero or one library crates and as many binary crates as you’d like. There
must be at least one crate (either a library or a binary) in a package.
If a package contains both src/main.rs and src/lib.rs, then it has two crates: a library and a binary,
both with the same name. If we only had one of the two, the package would have either a single
library or binary crate. A package can have multiple binary crates by placing files in the src/bin
directory: each file will be a separate binary crate.
First up, modules. Modules let us organize code into groups. Listing 7-1 has an example of some
code that defines a module named sound that contains a function named guitar .
Filename: src/main.rs
mod sound {
    fn guitar() {
        // Function body code goes here
    }
}
fn main() {
}
Listing 7-1: A sound module containing a guitar function and a main function
We’ve defined two functions, guitar and main . We’ve defined the guitar function within a mod
block. This block defines a module named sound .
To organize code into a hierarchy of modules, you can nest modules inside of other modules, as
shown in Listing 7-2.
Filename: src/main.rs
mod sound {
    mod instrument {
        mod woodwind {
            fn clarinet() {
                // Function body code goes here
            }
        }
    }
    mod voice {
    }
}
fn main() {
}
In this example, we defined a sound module in the same way as we did in Listing 7-1. We then
defined two modules within the sound module named instrument and voice . The instrument
module has another module defined within it, woodwind , and that module contains a function
named clarinet .
We mentioned in the “Packages and Crates for Making Libraries and Executables” section that
src/main.rs and src/lib.rs are called crate roots. They are called crate roots because the contents of
either of these two files form a module named crate at the root of the crate’s module tree. So in
Listing 7-2, we have a module tree that looks like Listing 7-3:
crate
└── sound
    ├── instrument
    │   └── woodwind
    └── voice
Listing 7-3: The module tree for the code in Listing 7-2
This tree shows how some of the modules nest inside one another (such as woodwind nests inside
instrument ) and how some modules are siblings to each other ( instrument and voice are both
defined within sound ). The entire module tree is rooted under the implicit module named crate .
This tree might remind you of the directory tree of the filesystem you have on your computer; this
is a very apt comparison! Just like directories in a filesystem, you place code inside whichever
module will create the organization you’d like. Another similarity is that to refer to an item in a
filesystem or a module tree, you use its path.
If we want to call a function, we need to know its path. “Path” is a synonym for “name” in a way, but
it evokes that filesystem metaphor. Additionally, functions, structs, and other items may have
multiple paths that refer to the same item, so “name” isn’t quite the right concept.
An absolute path starts from a crate root by using a crate name or a literal crate .
A relative path starts from the current module and uses self , super , or an identifier in the
current module.
Both absolute and relative paths are followed by one or more identifiers separated by double
colons ( :: ).
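The path forms described above can be sketched in one small program. The tune and duet functions are hypothetical additions, and every item is marked pub only so that the sketch compiles; what matters here is how each path starts:

```rust
mod sound {
    // Called from the child module below through a `super` path.
    pub fn tune() -> &'static str {
        "tuned"
    }

    pub mod instrument {
        pub fn clarinet() -> String {
            // Relative path: `super` names the parent module, `sound`.
            format!("clarinet {}", super::tune())
        }

        pub fn duet() -> String {
            // Relative path: `self` names the current module, `instrument`.
            format!("{} + {}", self::clarinet(), self::clarinet())
        }
    }
}

fn main() {
    // Absolute path, starting from the crate root with the literal `crate`.
    let absolute = crate::sound::instrument::clarinet();
    // Relative path, starting from an identifier in the current module.
    let relative = sound::instrument::clarinet();

    // Both paths name the same function.
    assert_eq!(absolute, relative);
    println!("{}", sound::instrument::duet());
}
```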
How do we call the clarinet function in the main function in Listing 7-2? That is, what’s the path
of the clarinet function? In Listing 7-4, let’s simplify our code a bit by removing some of the
modules, and we’ll show two ways to call the clarinet function from main . This example won’t
compile just yet; we’ll explain why in a bit.
Filename: src/main.rs
mod sound {
    mod instrument {
        fn clarinet() {
            // Function body code goes here
        }
    }
}
fn main() {
    // Absolute path
    crate::sound::instrument::clarinet();
    // Relative path
    sound::instrument::clarinet();
}
Listing 7-4: Calling the clarinet function in a simplified module tree from the main function using absolute
and relative paths
The first way we’re calling the clarinet function from the main function uses an absolute path.
Because clarinet is defined within the same crate as main , we use the crate keyword to start
an absolute path. Then we include each of the modules until we make our way to clarinet . This
is similar to specifying the path /sound/instrument/clarinet to run the program at that location
on your computer; using the crate name to start from the crate root is like using / to start from
the �lesystem root in your shell.
The second way we’re calling the clarinet function from the main function uses a relative path.
The path starts with the name sound , a module de�ned at the same level of the module tree as
the main function. This is similar to specifying the path sound/instrument/clarinet to run the
program at that location on your computer; starting with a name means that the path is relative.
We mentioned that Listing 7-4 won’t compile yet, let’s try to compile it and �nd out why not! The
error we get is shown in Listing 7-5.
$ cargo build
   Compiling sampleproject v0.1.0 (file:///projects/sampleproject)
error[E0603]: module `instrument` is private
  --> src/main.rs:11:19
   |
11 |     crate::sound::instrument::clarinet();
   |                   ^^^^^^^^^^
Listing 7-5: Compiler errors from building the code in Listing 7-4
The error message says that module instrument is private. We can see that we have the correct
paths for the instrument module and the clarinet function, but Rust won’t let us use them
because they’re private. It’s time to learn about the pub keyword!
Earlier, we talked about the syntax of modules and that they can be used for organization. There’s
another reason Rust has modules: modules are the privacy boundary in Rust. If you want to make
an item like a function or struct private, you put it in a module. Here are the privacy rules:
All items (functions, methods, structs, enums, modules, and constants) are private by
default.
You can use the pub keyword to make an item public.
You aren’t allowed to use private code defined in modules that are children of the current
module.
You are allowed to use any code defined in ancestor modules or the current module.
In other words, items without the pub keyword are private as you look “down” the module tree
from the current module, but items without the pub keyword are public as you look “up” the tree
from the current module. Again, think of a filesystem: if you don’t have permissions to a directory,
you can’t look into it from its parent directory. If you do have permissions to a directory, you can
look inside it and any of its ancestor directories.
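The privacy rules above can be sketched in a small program. The names used here (garden, shed, water, open, lock, and tend) are hypothetical and chosen only for illustration. The sketch assumes everything lives in one file: a child module calls a private function in its parent, while the parent can reach only the child's pub items.

```rust
mod garden {
    // Private: usable inside `garden` and by modules nested within it.
    fn water() -> &'static str {
        "watering"
    }

    pub mod shed {
        // Looking "up" the tree: a child may use a private item of its parent.
        pub fn open() -> &'static str {
            super::water()
        }

        // Private to `shed`: code in `garden` is not allowed to call this.
        #[allow(dead_code)]
        fn lock() {}
    }

    pub fn tend() -> &'static str {
        // Looking "down" the tree: only the `pub` items of `shed` are reachable.
        // Calling `shed::lock()` here would be rejected with error E0603.
        shed::open()
    }
}

fn main() {
    println!("{}", garden::tend());
}
```

Removing pub from open would make the call in tend fail in the same way as the listings in this section.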
The error in Listing 7-5 said the instrument module is private. Let’s mark the instrument module
with the pub keyword so that we can use it from the main function. This change is shown in
Listing 7-6, which still won’t compile, but we’ll get a different error:
Filename: src/main.rs
mod sound {
    pub mod instrument {
        fn clarinet() {
            // Function body code goes here
        }
    }
}

fn main() {
    // Absolute path
    crate::sound::instrument::clarinet();

    // Relative path
    sound::instrument::clarinet();
}
Listing 7-6: Declaring the instrument module as pub so that we’re allowed to use it from main
Adding the pub keyword in front of mod instrument makes the module public. With this change,
if we’re allowed to access sound, we can access instrument. The contents of instrument are still
private; making the module public does not make its contents public. The pub keyword on a
module lets code in its parent module refer to it.
The code in Listing 7-6 still results in an error, though, as shown in Listing 7-7:
$ cargo build
   Compiling sampleproject v0.1.0 (file:///projects/sampleproject)
error[E0603]: function `clarinet` is private
  --> src/main.rs:11:31
   |
11 |     crate::sound::instrument::clarinet();
   |                               ^^^^^^^^
Listing 7-7: Compiler errors from building the code in Listing 7-6
The errors now say that the clarinet function is private. The privacy rules apply to structs,
enums, functions, and methods as well as modules.
Let’s make the clarinet function public as well by adding the pub keyword before its definition,
as shown in Listing 7-8:
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

fn main() {
    // Absolute path
    crate::sound::instrument::clarinet();

    // Relative path
    sound::instrument::clarinet();
}
Listing 7-8: Adding the pub keyword to both mod instrument and fn clarinet lets us call the function
from main
This will now compile! Let’s look at both the absolute and the relative path and double-check why
adding the pub keyword lets us use these paths in main.
In the absolute path case, we start with crate, the root of our crate. From there, we have sound,
and it is a module that is defined in the crate root. The sound module isn’t public, but because the
main function is defined in the same module in which sound is defined, we’re allowed to refer to
sound from main. Next is instrument, which is a module marked with pub. We can access the
parent module of instrument, so we’re allowed to access instrument. Finally, clarinet is a
function marked with pub and we can access its parent module, so this function call works!
In the relative path case, the logic is the same as the absolute path except for the first step. Rather
than starting from the crate root, the path starts from sound. The sound module is defined within
the same module as main is, so the relative path starting from the module in which main is
defined works. Then because instrument and clarinet are marked with pub, the rest of the
path works and this function call is valid as well!
You can also construct relative paths beginning with super. Doing so is like starting a filesystem
path with ..: the path starts from the parent module, rather than the current module. This is
useful in situations such as the example in Listing 7-9, where the function clarinet calls the
function breathe_in by specifying a path to breathe_in that starts with super:
Filename: src/lib.rs
mod instrument {
    fn clarinet() {
        super::breathe_in();
    }
}

fn breathe_in() {
    // Function body code goes here
}
Listing 7-9: Calling a function using a relative path starting with super to look in the parent module
The clarinet function is in the instrument module, so we can use super to go to the parent
module of instrument, which in this case is crate, the root. From there, we look for breathe_in,
and find it. Success!
The reason you might want to choose a relative path starting with super rather than an absolute
path starting with crate is that using super may make it easier to update your code to have a
different module hierarchy, if the code defining the item and the code calling the item are moved
together. For example, if we decide to put the instrument module and the breathe_in function
into a module named sound, we would only need to add the sound module, as shown in Listing
7-10.
Filename: src/lib.rs
mod sound {
    mod instrument {
        fn clarinet() {
            super::breathe_in();
        }
    }

    fn breathe_in() {
        // Function body code goes here
    }
}
Listing 7-10: Adding a parent module named sound doesn’t affect the relative path super::breathe_in
The call to super::breathe_in from the clarinet function will continue to work in Listing 7-10 as
it did in Listing 7-9, without needing to update the path. If instead of super::breathe_in we had
used crate::breathe_in in the clarinet function, when we add the parent sound module, we
would need to update the clarinet function to use the path crate::sound::breathe_in instead.
Using a relative path can mean fewer updates are necessary when rearranging modules.
You can designate structs and enums to be public in a similar way as we’ve shown with modules
and functions, with a few additional details.
If you use pub before a struct definition, you make the struct public. However, the struct’s fields
are still private. You can choose to make each field public or not on a case-by-case basis. In Listing
7-11, we’ve defined a public plant::Vegetable struct with a public name field but a private id
field.
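Filename: src/main.rs
mod plant {
    pub struct Vegetable {
        pub name: String,
        id: i32,
    }

    impl Vegetable {
        pub fn new(name: &str) -> Vegetable {
            Vegetable {
                name: String::from(name),
                id: 1,
            }
        }
    }
}

fn main() {
    let mut v = plant::Vegetable::new("squash");

    v.name = String::from("butternut squash");
    println!("{} are delicious", v.name);

    // The next line won't compile if we uncomment it:
    // println!("The ID is {}", v.id);
}
Listing 7-11: A struct with some public fields and some private fields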
Because the name field of the plant::Vegetable struct is public, in main we can write to and read
from the name field by using dot notation. We’re not allowed to use the id field in main because it’s
private. Try uncommenting the line printing the id field value to see what error you get! Also note
that because plant::Vegetable has a private field, the struct needs to provide a public
associated function that constructs an instance of Vegetable (we’ve used the conventional name
new here). If Vegetable didn’t have such a function, we wouldn’t be able to create an instance of
Vegetable in main because we’re not allowed to set the value of the private id field in main.
In contrast, if you make a public enum, all of its variants are public. You only need the pub before
the enum keyword, as shown in Listing 7-12.
Filename: src/main.rs
mod menu {
    pub enum Appetizer {
        Soup,
        Salad,
    }
}

fn main() {
    let order1 = menu::Appetizer::Soup;
    let order2 = menu::Appetizer::Salad;
}
Listing 7-12: Designating an enum as public makes all its variants public
Because we made the Appetizer enum public, we’re able to use the Soup and Salad variants in
main.
There’s one more situation involving pub that we haven’t covered, and that’s with our last module
system feature: the use keyword. Let’s cover use by itself, and then we’ll show how pub and
use can be combined.
You may have been thinking that many of the paths we’ve written to call functions in the listings in
this chapter are long and repetitive. For example, in Listing 7-8, whether we chose the absolute or
relative path to the clarinet function, every time we wanted to call clarinet we had to specify
sound and instrument too. Luckily, there’s a way to bring a path into a scope once and then call
the items in that path as if they’re local items: with the use keyword. In Listing 7-13, we bring the
crate::sound::instrument module into the scope of the main function so that we only have to
specify instrument::clarinet to call the clarinet function in main.
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

use crate::sound::instrument;

fn main() {
    instrument::clarinet();
    instrument::clarinet();
    instrument::clarinet();
}
Listing 7-13: Bringing a module into scope with use and an absolute path to shorten the path we have to
specify to call an item within that module
Adding use and a path in a scope is similar to creating a symbolic link in the filesystem. By adding
use crate::sound::instrument in the crate root, instrument is now a valid name in that scope
as if the instrument module had been defined in the crate root. We can now reach items in the
instrument module through the older, full paths, or we can reach items through the new, shorter
path that we’ve created with use. Paths brought into scope with use also check privacy, like any
other paths.
If you want to bring an item into scope with use and a relative path, there’s a small difference
from directly calling the item using a relative path: instead of starting from a name in the current
scope, you must start the path given to use with self. Listing 7-14 shows how to specify a
relative path to get the same behavior as Listing 7-13 that used an absolute path.
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

use self::sound::instrument;

fn main() {
    instrument::clarinet();
    instrument::clarinet();
    instrument::clarinet();
}
Listing 7-14: Bringing a module into scope with use and a relative path starting with self
Starting relative paths with self when specified after use might not be necessary in the future;
it’s an inconsistency in the language that people are working on eliminating.
Choosing to specify absolute paths with use can make updates easier if the code calling the items
moves to a different place in the module tree but the code defining the items does not, as
opposed to when they moved together in the changes we made in Listing 7-10. For example, if we
decide to take the code from Listing 7-13, extract the behavior in the main function to a function
called clarinet_trio, and move that function into a module named performance_group, the
path specified in use wouldn’t need to change, as shown in Listing 7-15.
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

mod performance_group {
    use crate::sound::instrument;

    pub fn clarinet_trio() {
        instrument::clarinet();
        instrument::clarinet();
        instrument::clarinet();
    }
}

fn main() {
    performance_group::clarinet_trio();
}
Listing 7-15: The absolute path doesn’t need to be updated when moving the code that calls the item
In contrast, if we made the same change to the code in Listing 7-14 that specifies a relative path,
we would need to change use self::sound::instrument to use super::sound::instrument.
Choosing whether relative or absolute paths will result in fewer updates can be a guess if you’re
not sure how your module tree will change in the future, but your authors tend to specify absolute
paths starting with crate because code defining and calling items is more likely to be moved
around the module tree independently of each other, rather than together as we saw in Listing
7-10.
In Listing 7-13, you may have wondered why we specified use crate::sound::instrument and
then called instrument::clarinet in main, rather than the code shown in Listing 7-16 that has
the same behavior:
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

use crate::sound::instrument::clarinet;

fn main() {
    clarinet();
    clarinet();
    clarinet();
}
Listing 7-16: Bringing the clarinet function into scope with use, which is unidiomatic
For functions, it’s considered idiomatic to specify the function’s parent module with use, and then
specify the parent module when calling the function. Doing so rather than specifying the path to
the function with use, as Listing 7-16 does, makes it clear that the function isn’t locally defined,
while still minimizing repetition of the full path.
For structs, enums, and other items, specifying the full path to the item with use is idiomatic. For
example, Listing 7-17 shows the idiomatic way to bring the standard library’s HashMap struct into
scope.
Filename: src/main.rs
use std::collections::HashMap;

fn main() {
    let mut map = HashMap::new();
    map.insert(1, 2);
}
In contrast, the code in Listing 7-18 that brings the parent module of HashMap into scope would
not be considered idiomatic. There’s not a strong reason for this idiom; this is the convention that
has emerged and folks have gotten used to reading and writing.
Filename: src/main.rs
use std::collections;

fn main() {
    let mut map = collections::HashMap::new();
    map.insert(1, 2);
}
The exception to this idiom is if the use statements would bring two items with the same name
into scope, which isn’t allowed. Listing 7-19 shows how to bring two Result types that have
different parent modules into scope and refer to them.
Filename: src/lib.rs
use std::fmt;
use std::io;
Listing 7-19: Bringing two types with the same name into the same scope requires using their parent modules
If instead we specified use std::fmt::Result and use std::io::Result, we’d have two Result
types in the same scope and Rust wouldn’t know which one we meant when we used Result. Try
it and see what compiler error you get!
There’s another solution to the problem of bringing two types of the same name into the same
scope: we can specify a new local name for the type by adding as and a new name after the use.
Listing 7-20 shows another way to write the code from Listing 7-19 by renaming one of the two
Result types using as.
Filename: src/lib.rs
use std::fmt::Result;
use std::io::Result as IoResult;
Listing 7-20: Renaming a type when it’s brought into scope with the as keyword
In the second use statement, we chose the new name IoResult for the std::io::Result type,
which won’t conflict with the Result from std::fmt that we’ve also brought into scope. This is
also considered idiomatic; choosing between the code in Listing 7-19 and Listing 7-20 is up to you.
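To see both approaches used in function signatures, here is a minimal sketch. The Celsius type and the parse_reading function are hypothetical and exist only to give the two Result types something to do: fmt::Result is reached through its parent module, and std::io::Result is reached through the IoResult alias.

```rust
use std::fmt;
use std::io::Result as IoResult;

// Hypothetical type used only to have something to format.
struct Celsius(f64);

impl fmt::Display for Celsius {
    // `fmt::Result` is named through its parent module.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{:.1} C", self.0)
    }
}

// `IoResult` is the local name we gave to `std::io::Result`.
fn parse_reading(text: &str) -> IoResult<Celsius> {
    text.trim()
        .parse::<f64>()
        .map(Celsius)
        .map_err(|e| std::io::Error::new(std::io::ErrorKind::InvalidInput, e))
}

fn main() {
    match parse_reading("21.5") {
        Ok(reading) => println!("{}", reading),
        Err(e) => println!("error: {}", e),
    }
}
```

Neither signature mentions a bare Result, so the reader never has to work out which one is meant.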
When you bring a name into scope with the use keyword, the name being available in the new
scope is private. If you want to enable code calling your code to be able to refer to the type as if it
were defined in that scope just as your code does, you can combine pub and use. This technique
is called re-exporting because you’re bringing an item into scope but also making that item
available for others to bring into their scope.
For example, Listing 7-21 shows the code from Listing 7-15 with the use within the
performance_group module changed to pub use .
Filename: src/main.rs
mod sound {
    pub mod instrument {
        pub fn clarinet() {
            // Function body code goes here
        }
    }
}

mod performance_group {
    pub use crate::sound::instrument;

    pub fn clarinet_trio() {
        instrument::clarinet();
        instrument::clarinet();
        instrument::clarinet();
    }
}

fn main() {
    performance_group::clarinet_trio();
    performance_group::instrument::clarinet();
}
Listing 7-21: Making a name available for any code to use from a new scope with pub use
By using pub use, the main function can now call the clarinet function through this new path
with performance_group::instrument::clarinet. If we hadn’t specified pub use, the
clarinet_trio function could still call instrument::clarinet in its scope, but main wouldn’t be
allowed to take advantage of this new path.
In Chapter 2, we programmed a guessing game. That project used an external package, rand, to
get random numbers. To use rand in our project, we added this line to Cargo.toml:
Filename: Cargo.toml
[dependencies]
rand = "0.5.5"
Adding rand as a dependency in Cargo.toml tells Cargo to download the rand package and its
dependencies from https://crates.io and make its code available to our project.
Then, to bring rand definitions into the scope of our package, we added a use line starting with
the name of the package, rand, and listing the items we wanted to bring into scope. Recall that in
the “Generating a Random Number” section in Chapter 2, we brought the Rng trait into scope and
called the rand::thread_rng function:
use rand::Rng;

fn main() {
    let secret_number = rand::thread_rng().gen_range(1, 101);
}
There are many packages that members of the community have published on https://crates.io, and
pulling any of them in to your package involves these same steps: listing them in your package’s
Cargo.toml and bringing items defined in them into a scope in your package with use.
Note that the standard library (std) is also a crate that’s external to your package. Because the
standard library is shipped with the Rust language, you don’t need to change Cargo.toml to include
std, but you refer to it in use to bring items the standard library defines into your package’s
scope, such as with HashMap:
use std::collections::HashMap;
This is an absolute path starting with std, the name of the standard library crate.
When you use many items defined by the same package or in the same module, listing each item
on its own line can take up a lot of vertical space in your files. For example, these two use
statements we had in Listing 2-4 in the Guessing Game both bring items from std into scope:
Filename: src/main.rs
use std::cmp::Ordering;
use std::io;
// ---snip---
We can use nested paths to bring the same items into scope in one line instead of two, by
specifying the common part of the path, then two colons, then curly brackets around a list of the
parts of the paths that differ, as shown in Listing 7-22.
Filename: src/main.rs
use std::{cmp::Ordering, io};
// ---snip---
Listing 7-22: Specifying a nested path to bring multiple items with the same prefix into scope in one line
instead of two
In programs bringing many items into scope from the same package or module, using nested
paths can reduce the number of separate use statements needed by a lot!
We can also deduplicate paths where one path is completely shared with part of another path. For
example, Listing 7-23 shows two use statements: one that brings std::io into scope, and one
that brings std::io::Write into scope:
Filename: src/lib.rs
use std::io;
use std::io::Write;
Listing 7-23: Bringing two paths into scope in two use statements where one is a sub-path of the other
The common part between these two paths is std::io, and that’s the complete first path. To
deduplicate these two paths into one use statement, we can use self in the nested path as
shown in Listing 7-24.
Filename: src/lib.rs
use std::io::{self, Write};
Listing 7-24: Deduplicating the paths from Listing 7-23 into one use statement
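As a quick check that the single use statement really brings in both names, here is a small sketch. The write_greeting function is a hypothetical helper, not part of the guessing game: it names the io module (brought in by self) in its return type and the Write trait in its generic bound.

```rust
use std::io::{self, Write};

// `io` (from `self`) names the module; `Write` names the trait.
// Hypothetical helper that writes one line to any writer.
fn write_greeting<W: Write>(out: &mut W, name: &str) -> io::Result<()> {
    writeln!(out, "Hello, {}!", name)
}

fn main() {
    let mut stdout = io::stdout();
    write_greeting(&mut stdout, "Rust").expect("writing to stdout failed");
}
```

Because the function accepts any writer, the same code can also write into a Vec<u8>, which is handy when you want to inspect the output.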
Bringing All Public Definitions into Scope with the Glob Operator
If you’d like to bring all public items defined in a path into scope, you can specify that path
followed by *, the glob operator:
use std::collections::*;
This use statement brings all public items defined in std::collections into the current scope.
Be careful with using the glob operator! It makes it harder to tell what names are in scope and
where a name your program uses was defined.
The glob operator is often used when testing to bring everything under test into the tests
module; we’ll talk about that in the “How to Write Tests” section of Chapter 11. The glob operator
is also sometimes used as part of the prelude pattern; see the standard library documentation for
more information on that pattern.
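As a preview of the testing use case, here is a minimal sketch. The add_two function and the adds_two test are hypothetical names chosen for illustration. Inside the tests module, use super::* brings add_two into scope so the test can call it without a path.

```rust
// Hypothetical function under test.
pub fn add_two(n: i32) -> i32 {
    n + 2
}

#[cfg(test)]
mod tests {
    // The glob brings `add_two` (and the parent module's other items) into scope.
    use super::*;

    #[test]
    fn adds_two() {
        assert_eq!(add_two(2), 4);
    }
}

fn main() {
    println!("{}", add_two(2));
}
```

The tests module is compiled only when running cargo test, so the glob import has no effect on a normal build.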
All of the examples in this chapter so far defined multiple modules in one file. When modules get
large, you may want to move their definitions to a separate file to make the code easier to
navigate.
For example, if we started from the code in Listing 7-8, we can move the sound module to its own
file src/sound.rs by changing the crate root file (in this case, src/main.rs) to contain the code shown
in Listing 7-25.
Filename: src/main.rs
mod sound;

fn main() {
    // Absolute path
    crate::sound::instrument::clarinet();

    // Relative path
    sound::instrument::clarinet();
}
Listing 7-25: Declaring the sound module whose body will be in src/sound.rs
And src/sound.rs gets the definitions from the body of the sound module, shown in Listing 7-26.
Filename: src/sound.rs
pub mod instrument {
    pub fn clarinet() {
        // Function body code goes here
    }
}
Using a semicolon after mod sound instead of a block tells Rust to load the contents of the module
from another file with the same name as the module.
To continue with our example and extract the instrument module to its own file as well, we
change src/sound.rs to contain only the declaration of the instrument module:
Filename: src/sound.rs
pub mod instrument;
Then we create a src/sound directory and a file src/sound/instrument.rs to contain the definitions
made in the instrument module:
Filename: src/sound/instrument.rs
pub fn clarinet() {
    // Function body code goes here
}
The module tree remains the same and the function calls in main continue to work without any
modification, even though the definitions live in different files. This lets you move modules to new
files as they grow in size.
Summary
Rust provides ways to organize your packages into crates, your crates into modules, and to refer
to items defined in one module from another by specifying absolute or relative paths. These paths
can be brought into a scope with a use statement so that you can use a shorter path for multiple
uses of the item in that scope. Modules define code that’s private by default, but you can choose
to make definitions public by adding the pub keyword.
Next, we’ll look at some collection data structures in the standard library that you can use in your
nice, neat code.
Common Collections
Rust’s standard library includes a number of very useful data structures called collections. Most
other data types represent one specific value, but collections can contain multiple values. Unlike
the built-in array and tuple types, the data these collections point to is stored on the heap, which
means the amount of data does not need to be known at compile time and can grow or shrink as
the program runs. Each kind of collection has different capabilities and costs, and choosing an
appropriate one for your current situation is a skill you’ll develop over time. In this chapter, we’ll
discuss three collections that are used very often in Rust programs:
A vector allows you to store a variable number of values next to each other.
A string is a collection of characters. We’ve mentioned the String type previously, but in
this chapter we’ll talk about it in depth.
A hash map allows you to associate a value with a particular key. It’s a particular
implementation of the more general data structure called a map.
To learn about the other kinds of collections provided by the standard library, see the
documentation.
We’ll discuss how to create and update vectors, strings, and hash maps, as well as what makes
each special.
To create a new, empty vector, we can call the Vec::new function, as shown in Listing 8-1.
let v: Vec<i32> = Vec::new();
Listing 8-1: Creating a new, empty vector to hold values of type i32
Note that we added a type annotation here. Because we aren’t inserting any values into this
vector, Rust doesn’t know what kind of elements we intend to store. This is an important point.
Vectors are implemented using generics; we’ll cover how to use generics with your own types in
Chapter 10. For now, know that the Vec<T> type provided by the standard library can hold any
type, and when a specific vector holds a specific type, the type is specified within angle brackets. In
Listing 8-1, we’ve told Rust that the Vec<T> in v will hold elements of the i32 type.
In more realistic code, Rust can often infer the type of value you want to store once you insert
values, so you rarely need to do this type annotation. It’s more common to create a Vec<T> that
has initial values, and Rust provides the vec! macro for convenience. The macro will create a new
vector that holds the values you give it. Listing 8-2 creates a new Vec<i32> that holds the values
1, 2, and 3.
let v = vec![1, 2, 3];
Because we’ve given initial i32 values, Rust can infer that the type of v is Vec<i32>, and the type
annotation isn’t necessary. Next, we’ll look at how to modify a vector.
Updating a Vector
To create a vector and then add elements to it, we can use the push method, as shown in Listing
8-3.
let mut v = Vec::new();
v.push(5);
v.push(6);
v.push(7);
v.push(8);
As with any variable, if we want to be able to change its value, we need to make it mutable using
the mut keyword, as discussed in Chapter 3. The numbers we place inside are all of type i32, and
Rust infers this from the data, so we don’t need the Vec<i32> annotation.
Like any other struct, a vector is freed when it goes out of scope, as annotated in Listing 8-4.
{
    let v = vec![1, 2, 3, 4];

    // do stuff with v
} // <- v goes out of scope and is freed here
Listing 8-4: Showing where the vector and its elements are dropped
When the vector gets dropped, all of its contents are also dropped, meaning those integers it
holds will be cleaned up. This may seem like a straightforward point but can get a bit more
complicated when you start to introduce references to the elements of the vector. Let’s tackle that
next!
Now that you know how to create, update, and destroy vectors, knowing how to read their
contents is a good next step. There are two ways to reference a value stored in a vector. In the
examples, we’ve annotated the types of the values that are returned from these functions for
extra clarity.
Listing 8-5 shows both methods of accessing a value in a vector, either with indexing syntax or the
get method.
let v = vec![1, 2, 3, 4, 5];

let third: &i32 = &v[2];
println!("The third element is {}", third);

match v.get(2) {
    Some(third) => println!("The third element is {}", third),
    None => println!("There is no third element."),
}
Listing 8-5: Using indexing syntax or the get method to access an item in a vector
Note two details here. First, we use the index value of 2 to get the third element: vectors are
indexed by number, starting at zero. Second, the two ways to get the third element are by using &
and [], which gives us a reference, or by using the get method with the index passed as an
argument, which gives us an Option<&T>.
Rust has two ways to reference an element so you can choose how the program behaves when
you try to use an index value that the vector doesn’t have an element for. As an example, let’s see
what a program will do if it has a vector that holds five elements and then tries to access an
element at index 100, as shown in Listing 8-6.
let v = vec![1, 2, 3, 4, 5];
Listing 8-6: Attempting to access the element at index 100 in a vector containing five elements
When we run this code, the first [] method will cause the program to panic because it references
a nonexistent element. This method is best used when you want your program to crash if there’s
an attempt to access an element past the end of the vector.
When the get method is passed an index that is outside the vector, it returns None without
panicking. You would use this method if accessing an element beyond the range of the vector
happens occasionally under normal circumstances. Your code will then have logic to handle
having either Some(&element) or None, as discussed in Chapter 6. For example, the index could
be coming from a person entering a number. If they accidentally enter a number that’s too large
and the program gets a None value, you could tell the user how many items are in the current
vector and give them another chance to enter a valid value. That would be more user-friendly than
crashing the program due to a typo!
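The user-friendly handling described above might look something like the following sketch. The describe function and its messages are hypothetical: it uses get so that an out-of-range index produces a message that reports the vector's length instead of a panic.

```rust
// Hypothetical helper: describes the element at `index`, or reports the
// vector's length when the index is out of range.
fn describe(v: &[i32], index: usize) -> String {
    match v.get(index) {
        Some(value) => format!("Element {} is {}", index, value),
        None => format!(
            "Index {} is out of range; the vector has {} elements",
            index,
            v.len()
        ),
    }
}

fn main() {
    let v = vec![1, 2, 3, 4, 5];

    println!("{}", describe(&v, 2));
    println!("{}", describe(&v, 100));

    // By contrast, evaluating `&v[100]` here would panic at runtime.
}
```

A real program would probably loop and ask the user for another index rather than just printing the message.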
When the program has a valid reference, the borrow checker enforces the ownership and
borrowing rules (covered in Chapter 4) to ensure this reference and any other references to the
contents of the vector remain valid. Recall the rule that states you can’t have mutable and
immutable references in the same scope. That rule applies in Listing 8-7, where we hold an
immutable reference to the first element in a vector and try to add an element to the end, which
won’t work.
v.push(6);
Listing 8-7: Attempting to add an element to a vector while holding a reference to an item
The code in Listing 8-7 might look like it should work: why should a reference to the first element
care about what changes at the end of the vector? This error is due to the way vectors work:
adding a new element onto the end of the vector might require allocating new memory and
copying the old elements to the new space, if there isn’t enough room to put all the elements next
to each other where the vector currently is. In that case, the reference to the first element would
be pointing to deallocated memory. The borrowing rules prevent programs from ending up in that
situation.
Note: For more on the implementation details of the Vec<T> type, see “The Rustonomicon”
at https://doc.rust-lang.org/stable/nomicon/vec.html.
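One way to satisfy the borrow checker in a situation like this is to avoid holding a reference across the call to push. The sketch below is one possible fix, not the only one, and first_then_push is a hypothetical helper. It relies on i32 being Copy, so the first element is copied out of the vector instead of borrowed.

```rust
// Hypothetical helper: returns the first element, then pushes 6.
// It panics on an empty vector because `v[0]` is out of bounds there.
fn first_then_push(v: &mut Vec<i32>) -> i32 {
    // `i32` is Copy, so this copies the value rather than borrowing from `v`.
    let first = v[0];

    // No reference into `v` is alive here, so pushing is allowed.
    v.push(6);

    first
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5];
    let first = first_then_push(&mut v);
    println!("The first element is: {}", first);
}
```

Had we written let first = &v[0] and used first after the push, the compiler would reject the code with error E0502, because the vector would be borrowed as mutable while the immutable borrow is still in use.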
If we want to access each element in a vector in turn, we can iterate through all of the elements
rather than use indexes to access one at a time. Listing 8-8 shows how to use a for loop to get
immutable references to each element in a vector of i32 values and print them.
let v = vec![100, 32, 57];
for i in &v {
println!("{}", i);
}
Listing 8-8: Printing each element in a vector by iterating over the elements using a for loop
We can also iterate over mutable references to each element in a mutable vector in order to make
changes to all the elements. The for loop in Listing 8-9 will add 50 to each element.
let mut v = vec![100, 32, 57];
for i in &mut v {
    *i += 50;
}
To change the value that the mutable reference refers to, we have to use the dereference
operator (*) to get to the value in i before we can use the += operator. We’ll talk more about *
in Chapter 15.
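To confirm what the loop does to the vector, here is a small sketch that wraps the same idea in a function. The add_to_each function is a hypothetical helper, and iter_mut is used as an equivalent way to iterate over mutable references.

```rust
// Hypothetical helper: adds `amount` to every element in place.
fn add_to_each(values: &mut [i32], amount: i32) {
    for i in values.iter_mut() {
        // Dereference to reach the i32 behind the mutable reference.
        *i += amount;
    }
}

fn main() {
    let mut v = vec![100, 32, 57];
    add_to_each(&mut v, 50);
    println!("{:?}", v); // prints [150, 82, 107]
}
```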
At the beginning of this chapter, we said that vectors can only store values that are the same type.
This can be inconvenient; there are definitely use cases for needing to store a list of items of
different types. Fortunately, the variants of an enum are defined under the same enum type, so
when we need to store elements of a different type in a vector, we can define and use an enum!
For example, say we want to get values from a row in a spreadsheet in which some of the columns
in the row contain integers, some floating-point numbers, and some strings. We can define an
enum whose variants will hold the different value types, and then all the enum variants will be
considered the same type: that of the enum. Then we can create a vector that holds that enum
and so, ultimately, holds different types. We’ve demonstrated this in Listing 8-10.
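enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}
let row = vec![
    SpreadsheetCell::Int(3),
    SpreadsheetCell::Text(String::from("blue")),
    SpreadsheetCell::Float(10.12),
];
Listing 8-10: Defining an enum to store values of different types in one vector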
Rust needs to know what types will be in the vector at compile time so it knows exactly how much
memory on the heap will be needed to store each element. A secondary advantage is that we can
be explicit about what types are allowed in this vector. If Rust allowed a vector to hold any type,
there would be a chance that one or more of the types would cause errors with the operations
performed on the elements of the vector. Using an enum plus a match expression means that
Rust will ensure at compile time that every possible case is handled, as discussed in Chapter 6.
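To make the exhaustiveness point concrete, here is a small sketch that handles every variant of SpreadsheetCell with match. The describe function and its output format are assumptions made for this example; removing any one arm would produce a compile-time error.

```rust
enum SpreadsheetCell {
    Int(i32),
    Float(f64),
    Text(String),
}

/// Renders one cell as text; the `match` must cover every variant.
fn describe(cell: &SpreadsheetCell) -> String {
    match cell {
        SpreadsheetCell::Int(n) => format!("int {}", n),
        SpreadsheetCell::Float(x) => format!("float {}", x),
        SpreadsheetCell::Text(s) => format!("text {}", s),
    }
}
```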
When you’re writing a program, if you don’t know the exhaustive set of types the program will get
at runtime to store in a vector, the enum technique won’t work. Instead, you can use a trait object,
which we’ll cover in Chapter 17.
Now that we’ve discussed some of the most common ways to use vectors, be sure to review the
API documentation for the many useful methods defined on Vec<T> by the standard library.
For example, in addition to push , a pop method removes and returns the last element. Let’s
move on to the next collection type: String !
It’s useful to discuss strings in the context of collections because strings are implemented as a
collection of bytes, plus some methods to provide useful functionality when those bytes are
interpreted as text. In this section, we’ll talk about the operations on String that every collection
type has, such as creating, updating, and reading. We’ll also discuss the ways in which String is
different from the other collections, namely how indexing into a String is complicated by the
differences between how people and computers interpret String data.
What Is a String?
We’ll first define what we mean by the term string. Rust has only one string type in the core
language, which is the string slice str that is usually seen in its borrowed form &str . In Chapter
4, we talked about string slices, which are references to some UTF-8 encoded string data stored
elsewhere. String literals, for example, are stored in the binary output of the program and are
therefore string slices.
The String type, which is provided by Rust’s standard library rather than coded into the core
language, is a growable, mutable, owned, UTF-8 encoded string type. When Rustaceans refer to
“strings” in Rust, they usually mean the String and the string slice &str types, not just one of
those types. Although this section is largely about String , both types are used heavily in Rust’s
standard library, and both String and string slices are UTF-8 encoded.
Rust’s standard library also includes a number of other string types, such as OsString , OsStr ,
CString , and CStr . Library crates can provide even more options for storing string data. See how
those names all end in String or Str ? They refer to owned and borrowed variants, just like the
String and str types you’ve seen previously. These string types can store text in different
encodings or be represented in memory in a different way, for example. We won’t discuss these
other string types in this chapter; see their API documentation for more about how to use them
and when each is appropriate.
Many of the same operations available with Vec<T> are available with String as well, starting
with the new function to create a string, shown in Listing 8-11.
let mut s = String::new();
This line creates a new empty string called s , which we can then load data into. Often, we’ll have
some initial data that we want to start the string with. For that, we use the to_string method,
which is available on any type that implements the Display trait, as string literals do. Listing 8-12
shows two examples.
let data = "initial contents";
let s = data.to_string();
// the method also works on a literal directly:
let s = "initial contents".to_string();
Listing 8-12: Using the to_string method to create a String from a string literal
We can also use the function String::from to create a String from a string literal. The code in
Listing 8-13 is equivalent to the code from Listing 8-12 that uses to_string .
let s = String::from("initial contents");
Listing 8-13: Using the String::from function to create a String from a string literal
Because strings are used for so many things, we can use many different generic APIs for strings,
providing us with a lot of options. Some of them can seem redundant, but they all have their
place! In this case, String::from and to_string do the same thing, so which you choose is a
matter of style.
Remember that strings are UTF-8 encoded, so we can include any properly encoded data in them,
as shown in Listing 8-14.
let hello = String::from("السلام عليكم");
let hello = String::from("Dobrý den");
let hello = String::from("Hello");
let hello = String::from("שָׁלוֹם");
let hello = String::from("नमस्ते");
let hello = String::from("こんにちは");
let hello = String::from("안녕하세요");
let hello = String::from("你好");
let hello = String::from("Olá");
let hello = String::from("Здравствуйте");
let hello = String::from("Hola");
Listing 8-14: Storing greetings in different languages in strings
Updating a String
A String can grow in size and its contents can change, just like the contents of a Vec<T> , if you
push more data into it. In addition, you can conveniently use the + operator or the format!
macro to concatenate String values.
We can grow a String by using the push_str method to append a string slice, as shown in
Listing 8-15.
let mut s = String::from("foo");
s.push_str("bar");
Listing 8-15: Appending a string slice to a String using the push_str method
After these two lines, s will contain foobar . The push_str method takes a string slice because
we don’t necessarily want to take ownership of the parameter. For example, the code in Listing
8-16 shows that it would be unfortunate if we weren’t able to use s2 after appending its contents
to s1 .
let mut s1 = String::from("foo");
let s2 = "bar";
s1.push_str(s2);
println!("s2 is {}", s2);
Listing 8-16: Using a string slice after appending its contents to a String
If the push_str method took ownership of s2 , we wouldn’t be able to print its value on the last
line. However, this code works as we’d expect!
The push method takes a single character as a parameter and adds it to the String . Listing 8-17
shows code that adds the letter l to a String using the push method.
let mut s = String::from("lo");
s.push('l');
Often, you’ll want to combine two existing strings. One way is to use the + operator, as shown in
Listing 8-18.
let s1 = String::from("Hello, ");
let s2 = String::from("world!");
let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used
Listing 8-18: Using the + operator to combine two String values into a new String value
The string s3 will contain Hello, world! as a result of this code. The reason s1 is no longer
valid after the addition and the reason we used a reference to s2 has to do with the signature of
the method that gets called when we use the + operator. The + operator uses the add method,
whose signature looks something like this:
fn add(self, s: &str) -> String {
This isn’t the exact signature that’s in the standard library: in the standard library, add is defined
using generics. Here, we’re looking at the signature of add with concrete types substituted for the
generic ones, which is what happens when we call this method with String values. We’ll discuss
generics in Chapter 10. This signature gives us the clues we need to understand the tricky bits of
the + operator.
First, s2 has an & , meaning that we’re adding a reference to the second string to the first string
because of the s parameter in the add function: we can only add a &str to a String ; we can’t
add two String values together. But wait: the type of &s2 is &String , not &str , as specified in
the second parameter to add . So why does Listing 8-18 compile?
The reason we’re able to use &s2 in the call to add is that the compiler can coerce the &String
argument into a &str . When we call the add method, Rust uses a deref coercion, which here turns
&s2 into &s2[..] . We’ll discuss deref coercion in more depth in Chapter 15. Because add does
not take ownership of the s parameter, s2 will still be a valid String after this operation.
Second, we can see in the signature that add takes ownership of self , because self does not
have an & . This means s1 in Listing 8-18 will be moved into the add call and no longer be valid
after that. So although let s3 = s1 + &s2; looks like it will copy both strings and create a new
one, this statement actually takes ownership of s1 , appends a copy of the contents of s2 , and
then returns ownership of the result. In other words, it looks like it’s making a lot of copies but
isn’t; the implementation is more efficient than copying.
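The ownership behavior described above can be sketched with a small helper. The join function is hypothetical, and it takes &String (rather than the more idiomatic &str) only to show the coercion at work: the left operand is moved, while the right operand is merely borrowed.

```rust
/// Concatenates by consuming `left` and borrowing `right`.
fn join(left: String, right: &String) -> String {
    // `left` is moved into `add`, and its buffer is reused for the result.
    // `right` is coerced from `&String` to `&str`; its contents are copied in.
    left + right
}
```

After a call such as join(s1, &s2), s1 can no longer be used, but s2 is still a valid String.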
If we need to concatenate multiple strings, the behavior of the + operator gets unwieldy:
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");
let s = s1 + "-" + &s2 + "-" + &s3;
At this point, s will be tic-tac-toe . With all of the + and " characters, it’s difficult to see what’s
going on. For more complicated string combining, we can use the format! macro:
let s1 = String::from("tic");
let s2 = String::from("tac");
let s3 = String::from("toe");
let s = format!("{}-{}-{}", s1, s2, s3);
This code also sets s to tic-tac-toe . The format! macro works in the same way as println! ,
but instead of printing the output to the screen, it returns a String with the contents. The version
of the code using format! is much easier to read and doesn’t take ownership of any of its
parameters.
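As a quick check of that last claim, the sketch below builds the same string through a helper that only borrows its inputs. The function name tic_tac_toe is an assumption for illustration; the caller’s strings remain usable after the call.

```rust
/// Joins three pieces with hyphens without taking ownership of any of them.
fn tic_tac_toe(s1: &str, s2: &str, s3: &str) -> String {
    format!("{}-{}-{}", s1, s2, s3)
}
```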
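In many other languages, accessing individual characters in a string by index is a valid and common
operation. In Rust, however, indexing into a String is a compile-time error. Consider this invalid code:
let s1 = String::from("hello");
let h = s1[0];
The compiler rejects this code and reports that a String cannot be indexed by an integer: Rust
strings don’t support indexing. But why not? To answer that question, we need to discuss how
Rust stores strings in memory.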
Internal Representation
A String is a wrapper over a Vec<u8> . Let’s look at some of our properly encoded UTF-8 example
strings from Listing 8-14. First, this one:
let len = String::from("Hola").len();
In this case, len will be 4, which means the vector storing the string “Hola” is 4 bytes long. Each of
these letters takes 1 byte when encoded in UTF-8. But what about the following line? (Note that
this string begins with the capital Cyrillic letter Ze, not the Arabic number 3.)
let len = String::from("Здравствуйте").len();
Asked how long the string is, you might say 12. However, Rust’s answer is 24: that’s the number of
bytes it takes to encode “Здравствуйте” in UTF-8, because each Unicode scalar value in that string
takes 2 bytes of storage. Therefore, an index into the string’s bytes will not always correlate to a
valid Unicode scalar value. To demonstrate, consider this invalid Rust code:
let hello = "Здравствуйте";
let answer = &hello[0];
What should the value of answer be? Should it be З , the first letter? When encoded in UTF-8, the
first byte of З is 208 and the second is 151 , so answer should in fact be 208 , but 208 is not a
valid character on its own. Returning 208 is likely not what a user would want if they asked for the
first letter of this string; however, that’s the only data that Rust has at byte index 0. Users generally
don’t want the byte value returned, even if the string contains only Latin letters: if &"hello"[0]
were valid code that returned the byte value, it would return 104 , not h . To avoid returning an
unexpected value and causing bugs that might not be discovered immediately, Rust doesn’t
compile this code at all and prevents misunderstandings early in the development process.
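A small sketch makes the byte view concrete. The helper len_and_first_byte is a name invented here; it reports what len and the first raw byte look like for a given string, using as_bytes, which exposes the underlying u8 values.

```rust
/// Returns the length in bytes and the first raw byte of `s`, if any.
fn len_and_first_byte(s: &str) -> (usize, Option<u8>) {
    // `len` counts bytes, not characters; `as_bytes` views the UTF-8 data.
    (s.len(), s.as_bytes().first().copied())
}
```

For “Hola” this gives 4 bytes, and for “Здравствуйте” it gives 24 bytes with a first byte of 208, matching the discussion above.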
Another point about UTF-8 is that there are actually three relevant ways to look at strings from
Rust’s perspective: as bytes, scalar values, and grapheme clusters (the closest thing to what we
would call letters).
If we look at the Hindi word “नमस्ते” written in the Devanagari script, it is stored as a vector of u8
values that looks like this:
[224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164,
224, 165, 135]
That’s 18 bytes and is how computers ultimately store this data. If we look at them as Unicode
scalar values, which are what Rust’s char type is, those bytes look like this:
['न', 'म', 'स', '्', 'त', 'े']
There are six char values here, but the fourth and sixth are not letters: they’re diacritics that don’t
make sense on their own. Finally, if we look at them as grapheme clusters, we’d get what a person
would call the four letters that make up the Hindi word:
["न", "म", "स्", "ते"]
Rust provides different ways of interpreting the raw string data that computers store so that each
program can choose the interpretation it needs, no matter what human language the data is in.
A final reason Rust doesn’t allow us to index into a String to get a character is that indexing
operations are expected to always take constant time (O(1)). But it isn’t possible to guarantee that
performance with a String , because Rust would have to walk through the contents from the
beginning to the index to determine how many valid characters there were.
Slicing Strings
Indexing into a string is often a bad idea because it’s not clear what the return type of the string-
indexing operation should be: a byte value, a character, a grapheme cluster, or a string slice.
Therefore, Rust asks you to be more specific if you really need to use indices to create string slices.
To be more specific in your indexing and indicate that you want a string slice, rather than indexing
using [] with a single number, you can use [] with a range to create a string slice containing
particular bytes:
let hello = "Здравствуйте";
let s = &hello[0..4];
Here, s will be a &str that contains the first 4 bytes of the string. Earlier, we mentioned that each
of these characters was 2 bytes, which means s will be Зд .
What would happen if we used &hello[0..1] ? The answer: Rust would panic at runtime in the
same way as if an invalid index were accessed in a vector:
thread 'main' panicked at 'byte index 1 is not a char boundary; it is inside 'З'
(bytes 0..2) of `Здравствуйте`', src/libcore/str/mod.rs:2188:4
You should use ranges to create string slices with caution, because doing so can crash your
program.
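When the indices come from data we don’t control, a non-panicking alternative is the get method on str, which returns an Option instead of crashing. The wrapper below is a sketch; checked_slice is a hypothetical name.

```rust
/// Returns the slice for `start..end`, or `None` when an index is out of
/// range or falls inside a multi-byte character.
fn checked_slice(s: &str, start: usize, end: usize) -> Option<&str> {
    s.get(start..end)
}
```

The related is_char_boundary method can be used to test a single byte index before slicing.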
If you need to perform operations on individual Unicode scalar values, the best way to do so is to
use the chars method. Calling chars on “नमस्ते” separates out and returns six values of type
char , and you can iterate over the result to access each element:
for c in "नमस्ते".chars() {
    println!("{}", c);
}
न
म
स
◌्
त
◌े
The bytes method returns each raw byte, which might be appropriate for your domain:
for b in "नमस्ते".bytes() {
    println!("{}", b);
}
This code will print the 18 bytes that make up this String :
224
164
// --snip--
165
135
But be sure to remember that valid Unicode scalar values may be made up of more than 1 byte.
Getting grapheme clusters from strings is complex, so this functionality is not provided by the
standard library. Crates are available on crates.io if this is the functionality you need.
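To compare the two views side by side, this sketch counts bytes and Unicode scalar values for a string; byte_and_char_counts is an illustrative name. For the Hindi word used above, it yields 18 bytes and 6 char values.

```rust
/// Returns (number of bytes, number of Unicode scalar values) in `s`.
fn byte_and_char_counts(s: &str) -> (usize, usize) {
    (s.bytes().count(), s.chars().count())
}
```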
To summarize, strings are complicated. Different programming languages make different choices
about how to present this complexity to the programmer. Rust has chosen to make the correct
handling of String data the default behavior for all Rust programs, which means programmers
have to put more thought into handling UTF-8 data upfront. This trade-off exposes more of the
complexity of strings than is apparent in other programming languages, but it prevents you from
having to handle errors involving non-ASCII characters later in your development life cycle.
Hash maps are useful when you want to look up data not by using an index, as you can with
vectors, but by using a key that can be of any type. For example, in a game, you could keep track
of each team’s score in a hash map in which each key is a team’s name and the values are each
team’s score. Given a team name, you can retrieve its score.
We’ll go over the basic API of hash maps in this section, but many more goodies are hiding in the
functions defined on HashMap<K, V> by the standard library. As always, check the standard library
documentation for more information.
You can create an empty hash map with new and add elements with insert . In Listing 8-20, we’re
keeping track of the scores of two teams whose names are Blue and Yellow. The Blue team starts
with 10 points, and the Yellow team starts with 50.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
Listing 8-20: Creating a new hash map and inserting some keys and values
Note that we need to first use the HashMap from the collections portion of the standard library.
Of our three common collections, this one is the least often used, so it’s not included in the
features brought into scope automatically in the prelude. Hash maps also have less support from
the standard library; there’s no built-in macro to construct them, for example.
Just like vectors, hash maps store their data on the heap. This HashMap has keys of type String
and values of type i32 . Like vectors, hash maps are homogeneous: all of the keys must have the
same type, and all of the values must have the same type.
Another way of constructing a hash map is by using the collect method on a vector of tuples,
where each tuple consists of a key and its value. The collect method gathers data into a number
of collection types, including HashMap . For example, if we had the team names and initial scores in
two separate vectors, we could use the zip method to create a vector of tuples where “Blue” is
paired with 10, and so forth. Then we could use the collect method to turn that vector of tuples
into a hash map, as shown in Listing 8-21.
use std::collections::HashMap;
Listing 8-21: Creating a hash map from a list of teams and a list of scores
The type annotation HashMap<_, _> is needed here because it’s possible to collect into many
different data structures and Rust doesn’t know which you want unless you specify. For the
parameters for the key and value types, however, we use underscores, and Rust can infer the
types that the hash map contains based on the types of the data in the vectors.
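The approach just described can be sketched as follows. The function scores_from and its parameter names are assumptions for this example, and unlike a version that iterates by reference, it consumes both vectors with into_iter so that the map owns its keys and values.

```rust
use std::collections::HashMap;

/// Pairs each team name with the score at the same position.
fn scores_from(teams: Vec<String>, initial_scores: Vec<i32>) -> HashMap<String, i32> {
    // `zip` yields (team, score) tuples; `collect` gathers them into a map.
    // The underscores let Rust infer the key and value types.
    let scores: HashMap<_, _> = teams.into_iter().zip(initial_scores).collect();
    scores
}
```

If the two vectors differ in length, zip stops at the shorter one, so any extra names or scores are dropped.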
For types that implement the Copy trait, like i32 , the values are copied into the hash map. For
owned values like String , the values will be moved and the hash map will be the owner of those
values, as demonstrated in Listing 8-22.
use std::collections::HashMap;
Listing 8-22: Showing that keys and values are owned by the hash map once they’re inserted
We aren’t able to use the variables field_name and field_value after they’ve been moved into
the hash map with the call to insert .
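A minimal sketch of this ownership transfer follows, with store_field as a hypothetical helper. Both String arguments are moved into the map by insert, so the only way to reach them afterward is through the map itself.

```rust
use std::collections::HashMap;

/// Moves a name and a value into a fresh map and returns the map.
fn store_field(field_name: String, field_value: String) -> HashMap<String, String> {
    let mut map = HashMap::new();
    map.insert(field_name, field_value); // both strings are moved here
    // Using `field_name` or `field_value` past this point would not compile.
    map
}
```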
If we insert references to values into the hash map, the values won’t be moved into the hash map.
The values that the references point to must be valid for at least as long as the hash map is valid.
We’ll talk more about these issues in the “Validating References with Lifetimes” section in Chapter
10.
We can get a value out of the hash map by providing its key to the get method, as shown in
Listing 8-23.
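use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
let team_name = String::from("Blue");
let score = scores.get(&team_name);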
Listing 8-23: Accessing the score for the Blue team stored in the hash map
Here, score will have the value that’s associated with the Blue team, and the result will be
Some(&10) . The result is wrapped in Some because get returns an Option<&V> ; if there’s no
value for that key in the hash map, get will return None . The program will need to handle the
Option in one of the ways that we covered in Chapter 6.
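One way to handle that Option is sketched below, under the assumption that a missing team should count as zero points. The name score_or_zero is illustrative.

```rust
use std::collections::HashMap;

/// Looks up a team's score, treating an unknown team as 0 points.
fn score_or_zero(scores: &HashMap<String, i32>, team: &str) -> i32 {
    match scores.get(team) {
        Some(score) => *score, // `get` hands back a reference, so copy it out
        None => 0,
    }
}
```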
We can iterate over each key/value pair in a hash map in a similar manner as we do with vectors,
using a for loop:
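use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Yellow"), 50);
for (key, value) in &scores {
    println!("{}: {}", key, value);
}
This code will print each pair in an arbitrary order:
Yellow: 50
Blue: 10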
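Because iteration order over a hash map is arbitrary, code that needs stable output usually sorts the results. The sketch below, with the hypothetical name sorted_lines, formats each pair and then sorts the formatted lines.

```rust
use std::collections::HashMap;

/// Formats every pair as "key: value" and sorts the lines lexicographically.
fn sorted_lines(scores: &HashMap<String, i32>) -> Vec<String> {
    let mut lines: Vec<String> = scores
        .iter()
        .map(|(key, value)| format!("{}: {}", key, value))
        .collect();
    lines.sort();
    lines
}
```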
Although the number of keys and values is growable, each key can only have one value associated
with it at a time. When you want to change the data in a hash map, you have to decide how to
handle the case when a key already has a value assigned. You could replace the old value with the
new value, completely disregarding the old value. You could keep the old value and ignore the
new value, only adding the new value if the key doesn’t already have a value. Or you could combine
the old value and the new value. Let’s look at how to do each of these!
Overwriting a Value
If we insert a key and a value into a hash map and then insert that same key with a different value,
the value associated with that key will be replaced. Even though the code in Listing 8-24 calls
insert twice, the hash map will only contain one key/value pair because we’re inserting the value
for the Blue team’s key both times.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.insert(String::from("Blue"), 25);
println!("{:?}", scores);
Listing 8-24: Replacing a value stored with a particular key
This code will print {"Blue": 25} . The original value of 10 has been overwritten.
It’s common to check whether a particular key has a value and, if it doesn’t, insert a value for it.
Hash maps have a special API for this called entry that takes the key you want to check as a
parameter. The return value of the entry method is an enum called Entry that represents a
value that might or might not exist. Let’s say we want to check whether the key for the Yellow
team has a value associated with it. If it doesn’t, we want to insert the value 50, and the same for
the Blue team. Using the entry API, the code looks like Listing 8-25.
use std::collections::HashMap;
let mut scores = HashMap::new();
scores.insert(String::from("Blue"), 10);
scores.entry(String::from("Yellow")).or_insert(50);
scores.entry(String::from("Blue")).or_insert(50);
println!("{:?}", scores);
Listing 8-25: Using the entry method to only insert if the key does not already have a value
The or_insert method on Entry is defined to return a mutable reference to the value for the
corresponding Entry key if that key exists, and if not, inserts the parameter as the new value for
this key and returns a mutable reference to the new value. This technique is much cleaner than
writing the logic ourselves and, in addition, plays more nicely with the borrow checker.
Running the code in Listing 8-25 will print {"Yellow": 50, "Blue": 10} . The first call to entry
will insert the key for the Yellow team with the value 50 because the Yellow team doesn’t have a
value already. The second call to entry will not change the hash map because the Blue team
already has the value 10.
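The behavior of or_insert can be wrapped in a small helper to observe both cases. This is a sketch; score_or_insert is a name chosen for illustration.

```rust
use std::collections::HashMap;

/// Inserts `default` for `team` only when the team has no score yet,
/// and returns the score stored in the map afterward.
fn score_or_insert(scores: &mut HashMap<String, i32>, team: &str, default: i32) -> i32 {
    // `or_insert` returns `&mut i32`; dereference to copy the value out.
    *scores.entry(team.to_string()).or_insert(default)
}
```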
Another common use case for hash maps is to look up a key’s value and then update it based on
the old value. For instance, Listing 8-26 shows code that counts how many times each word
appears in some text. We use a hash map with the words as keys and increment the value to keep
track of how many times we’ve seen that word. If it’s the first time we’ve seen a word, we’ll first
insert the value 0.
use std::collections::HashMap;
println!("{:?}", map);
Listing 8-26: Counting occurrences of words using a hash map that stores words and counts
This code will print {"world": 2, "hello": 1, "wonderful": 1} . The or_insert method
actually returns a mutable reference ( &mut V ) to the value for this key. Here we store that
mutable reference in the count variable, so in order to assign to that value, we must first
dereference count using the asterisk ( * ). The mutable reference goes out of scope at the end of
the for loop, so all of these changes are safe and allowed by the borrowing rules.
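The counting technique described above can be sketched as a function. The name word_counts and the choice to borrow the words from the input text (so the keys are &str) are assumptions made for this example.

```rust
use std::collections::HashMap;

/// Counts how many times each whitespace-separated word occurs in `text`.
fn word_counts(text: &str) -> HashMap<&str, i32> {
    let mut map = HashMap::new();
    for word in text.split_whitespace() {
        // Insert 0 the first time a word is seen, then bump the count.
        let count = map.entry(word).or_insert(0);
        *count += 1;
    }
    map
}
```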
Hashing Functions
By default, HashMap uses a “cryptographically strong”1 hashing function that can provide
resistance to Denial of Service (DoS) attacks. This is not the fastest hashing algorithm available, but
the trade-off for better security that comes with the drop in performance is worth it. If you profile
your code and find that the default hash function is too slow for your purposes, you can switch to
another function by specifying a different hasher. A hasher is a type that implements the
BuildHasher trait. We’ll talk about traits and how to implement them in Chapter 10. You don’t
necessarily have to implement your own hasher from scratch; crates.io has libraries shared by
other Rust users that provide hashers implementing many common hashing algorithms.
1 https://www.131002.net/siphash/siphash.pdf
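To show where a different hasher plugs in without depending on an external crate, the sketch below spells out the hasher type using BuildHasherDefault and the standard library’s own DefaultHasher. This only illustrates the mechanism: it is not meant to be faster than the default, and because hashers created through default all start from the same state, a map built this way gives up the per-map random seeding behind the DoS resistance described above. ExplicitMap and new_explicit_map are illustrative names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

/// A map type whose hasher is named explicitly as the third type parameter.
type ExplicitMap = HashMap<String, i32, BuildHasherDefault<DefaultHasher>>;

/// Creates an empty map that uses the explicitly chosen hasher.
fn new_explicit_map() -> ExplicitMap {
    HashMap::with_hasher(BuildHasherDefault::<DefaultHasher>::default())
}
```

A hasher from a third-party crate would take the place of DefaultHasher in the type alias; the rest of the map’s API is unchanged.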
Summary
Vectors, strings, and hash maps will provide a large amount of functionality necessary in programs
when you need to store, access, and modify data. Here are some exercises you should now be
equipped to solve:
Given a list of integers, use a vector and return the mean (the average value), median (when
sorted, the value in the middle position), and mode (the value that occurs most often; a
hash map will be helpful here) of the list.
Convert strings to pig latin. The first consonant of each word is moved to the end of the
word and “ay” is added, so “first” becomes “irst-fay.” Words that start with a vowel have “hay”
added to the end instead (“apple” becomes “apple-hay”). Keep in mind the details about
UTF-8 encoding!
Using a hash map and vectors, create a text interface to allow a user to add employee
names to a department in a company. For example, “Add Sally to Engineering” or “Add Amir
to Sales.” Then let the user retrieve a list of all people in a department or all people in the
company by department, sorted alphabetically.
The standard library API documentation describes methods that vectors, strings, and hash maps
have that will be helpful for these exercises!
We’re getting into more complex programs in which operations can fail, so it’s a perfect time to
discuss error handling. We’ll do that next!
Error Handling
Rust’s commitment to reliability extends to error handling. Errors are a fact of life in software, so
Rust has a number of features for handling situations in which something goes wrong. In many
cases, Rust requires you to acknowledge the possibility of an error and take some action before
your code will compile. This requirement makes your program more robust by ensuring that you’ll
discover errors and handle them appropriately before you’ve deployed your code to production!
Rust groups errors into two major categories: recoverable and unrecoverable errors. For a
recoverable error, such as a file not found error, it’s reasonable to report the problem to the user
and retry the operation. Unrecoverable errors are always symptoms of bugs, like trying to access a
location beyond the end of an array.
Most languages don’t distinguish between these two kinds of errors and handle both in the same
way, using mechanisms such as exceptions. Rust doesn’t have exceptions. Instead, it has the type
Result<T, E> for recoverable errors and the panic! macro that stops execution when the
program encounters an unrecoverable error. This chapter covers calling panic! first and then
talks about returning Result<T, E> values. Additionally, we’ll explore considerations when
deciding whether to try to recover from an error or to stop execution.
By default, when a panic occurs, the program starts unwinding, which means Rust walks
back up the stack and cleans up the data from each function it encounters. But this walking
back and cleanup is a lot of work. The alternative is to immediately abort, which ends the
program without cleaning up. Memory that the program was using will then need to be
cleaned up by the operating system. If in your project you need to make the resulting binary
as small as possible, you can switch from unwinding to aborting upon a panic by adding
panic = 'abort' to the appropriate [profile] sections in your Cargo.toml file. For
example, if you want to abort on panic in release mode, add this:
[profile.release]
panic = 'abort'
Filename: src/main.rs
fn main() {
    panic!("crash and burn");
}
When you run the program, you’ll see something like this:
$ cargo run
Compiling panic v0.1.0 (file:///projects/panic)
Finished dev [unoptimized + debuginfo] target(s) in 0.25 secs
Running `target/debug/panic`
thread 'main' panicked at 'crash and burn', src/main.rs:2:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.
The call to panic! causes the error message contained in the last two lines. The first line shows
our panic message and the place in our source code where the panic occurred: src/main.rs:2:4
indicates that it’s the second line, fourth character of our src/main.rs file.
In this case, the line indicated is part of our code, and if we go to that line, we see the panic!
macro call. In other cases, the panic! call might be in code that our code calls, and the filename
and line number reported by the error message will be someone else’s code where the panic!
macro is called, not the line of our code that eventually led to the panic! call. We can use the
backtrace of the functions the panic! call came from to figure out the part of our code that is
causing the problem. We’ll discuss what a backtrace is in more detail next.
Let’s look at another example to see what it’s like when a panic! call comes from a library
because of a bug in our code instead of from our code calling the macro directly. Listing 9-1 has
some code that attempts to access an element by index in a vector.
Filename: src/main.rs
fn main() {
    let v = vec![1, 2, 3];
    v[99];
}
Listing 9-1: Attempting to access an element beyond the end of a vector, which will cause a call to panic!
Here, we’re attempting to access the 100th element of our vector (which is at index 99 because
indexing starts at zero), but it has only 3 elements. In this situation, Rust will panic. Using [] is
supposed to return an element, but if you pass an invalid index, there’s no element that Rust
could return here that would be correct.
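When an out-of-range index is an expected possibility rather than a bug, the get method on vectors and slices returns an Option instead of panicking. A minimal sketch, with element_at as an illustrative name:

```rust
/// Returns the element at `index`, or `None` when the index is out of range.
fn element_at(v: &[i32], index: usize) -> Option<i32> {
    v.get(index).copied()
}
```

With a three-element vector, asking for index 99 yields None, and the caller decides what to do next.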
Other languages, like C, will attempt to give you exactly what you asked for in this situation, even
though it isn’t what you want: you’ll get whatever is at the location in memory that would
correspond to that element in the vector, even though the memory doesn’t belong to the vector.
This is called a buffer overread and can lead to security vulnerabilities if an attacker is able to
manipulate the index in such a way as to read data they shouldn’t be allowed to that is stored
after the array.
To protect your program from this sort of vulnerability, if you try to read an element at an index
that doesn’t exist, Rust will stop execution and refuse to continue. Let’s try it and see:
$ cargo run
Compiling panic v0.1.0 (file:///projects/panic)
Finished dev [unoptimized + debuginfo] target(s) in 0.27 secs
Running `target/debug/panic`
thread 'main' panicked at 'index out of bounds: the len is 3 but the index is
99', /checkout/src/liballoc/vec.rs:1555:10
note: Run with `RUST_BACKTRACE=1` for a backtrace.
This error points at a file we didn’t write, vec.rs. That’s the implementation of Vec<T> in the
standard library. The code that gets run when we use [] on our vector v is in vec.rs, and that is
where the panic! is actually happening.
The next note line tells us that we can set the RUST_BACKTRACE environment variable to get a
backtrace of exactly what happened to cause the error. A backtrace is a list of all the functions that
have been called to get to this point. Backtraces in Rust work as they do in other languages: the
key to reading the backtrace is to start from the top and read until you see files you wrote. That’s
the spot where the problem originated. The lines above the lines mentioning your files are code
that your code called; the lines below are code that called your code. These lines might include
core Rust code, standard library code, or crates that you’re using. Let’s try getting a backtrace by
setting the RUST_BACKTRACE environment variable to any value except 0. Listing 9-2 shows output
similar to what you’ll see.
Listing 9-2: The backtrace generated by a call to panic! displayed when the environment variable
RUST_BACKTRACE is set
That’s a lot of output! The exact output you see might be different depending on your operating
system and Rust version. In order to get backtraces with this information, debug symbols must be
enabled. Debug symbols are enabled by default when using cargo build or cargo run without
the --release flag, as we have here.
In the output in Listing 9-2, line 11 of the backtrace points to the line in our project that’s causing
the problem: line 4 of src/main.rs. If we don’t want our program to panic, the location pointed to by
the first line mentioning a file we wrote is where we should start investigating. In Listing 9-1, where
we deliberately wrote code that would panic in order to demonstrate how to use backtraces, the
way to fix the panic is to not request an element at index 99 from a vector that only contains 3
items. When your code panics in the future, you’ll need to figure out what action the code is taking
with what values to cause the panic and what the code should do instead.
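As a hedged illustration of one such fix (this sketch is not part of Listing 9-1, and the helper name element_or_default is hypothetical), the get method on a vector or slice returns an Option instead of panicking, so an out-of-range index can be handled explicitly:

```rust
// Hypothetical helper: returns the element at `index`, or `default` when the
// index is out of bounds, instead of panicking the way `v[index]` would.
fn element_or_default(v: &[i32], index: usize, default: i32) -> i32 {
    match v.get(index) {
        Some(value) => *value,
        None => default,
    }
}

fn main() {
    let v = vec![1, 2, 3];

    // Index 1 exists, so the stored element is used.
    println!("index 1: {}", element_or_default(&v, 1, 0));

    // Index 99 does not exist, so the default is used and nothing panics.
    println!("index 99: {}", element_or_default(&v, 99, 0));
}
```

Whether a default value is the right response depends on the program; the point is only that the out-of-range case becomes a value the code can inspect.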
We’ll come back to panic! and when we should and should not use panic! to handle error
conditions in the “To panic! or Not to panic! ” section later in this chapter. Next, we’ll look at
how to recover from an error using Result .
Recall from “Handling Potential Failure with the Result Type” in Chapter 2 that the Result enum
is defined as having two variants, Ok and Err, as follows:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
The T and E are generic type parameters: we’ll discuss generics in more detail in Chapter 10.
What you need to know right now is that T represents the type of the value that will be returned
in a success case within the Ok variant, and E represents the type of the error that will be
returned in a failure case within the Err variant. Because Result has these generic type
parameters, we can use the Result type and the functions that the standard library has defined
on it in many different situations where the successful value and error value we want to return
may differ.
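To make the role of T and E concrete, here is a minimal sketch with two functions whose Result types fill in the parameters differently. Both function names are hypothetical and chosen only for illustration; parse is the standard library method on str:

```rust
use std::num::ParseIntError;

// Here T is i32 and E is ParseIntError, the error type that
// `str::parse::<i32>` produces.
fn parse_number(text: &str) -> Result<i32, ParseIntError> {
    text.trim().parse::<i32>()
}

// Here T is f64 and E is String (a hand-written message, an assumption
// made to keep the sketch short).
fn safe_divide(a: f64, b: f64) -> Result<f64, String> {
    if b == 0.0 {
        Err(String::from("division by zero"))
    } else {
        Ok(a / b)
    }
}

fn main() {
    println!("{:?}", parse_number("42"));
    println!("{:?}", parse_number("forty-two"));
    println!("{:?}", safe_divide(1.0, 4.0));
    println!("{:?}", safe_divide(1.0, 0.0));
}
```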
Let’s call a function that returns a Result value because the function could fail. In Listing 9-3 we try to open a file.
Filename: src/main.rs
use std::fs::File;

fn main() {
    let f = File::open("hello.txt");
}
How do we know File::open returns a Result ? We could look at the standard library API
documentation, or we could ask the compiler! If we give f a type annotation that we know is not
the return type of the function and then try to compile the code, the compiler will tell us that the
types don’t match. The error message will then tell us what the type of f is. Let’s try it! We know
that the return type of File::open isn’t of type u32 , so let’s change the let f statement to this:
let f: u32 = File::open("hello.txt");
Attempting to compile now fails with a mismatched types error: the compiler expected u32 but
found a Result whose success type is std::fs::File and whose error type is std::io::Error.
This tells us the return type of the File::open function is a Result<T, E>. The generic
parameter T has been filled in here with the type of the success value, std::fs::File, which is a
file handle. The type of E used in the error value is std::io::Error.
This return type means the call to File::open might succeed and return a file handle that we can
read from or write to. The function call also might fail: for example, the file might not exist, or we
might not have permission to access the file. The File::open function needs to have a way to tell
us whether it succeeded or failed and at the same time give us either the file handle or error
information. This information is exactly what the Result enum conveys.
In the case where File::open succeeds, the value in the variable f will be an instance of Ok that
contains a file handle. In the case where it fails, the value in f will be an instance of Err that
contains more information about the kind of error that happened.
We need to add to the code in Listing 9-3 to take different actions depending on the value
File::open returns. Listing 9-4 shows one way to handle the Result using a basic tool, the
match expression that we discussed in Chapter 6.
Filename: src/main.rs
use std::fs::File;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => {
            panic!("There was a problem opening the file: {:?}", error)
        },
    };
}
Listing 9-4: Using a match expression to handle the Result variants that might be returned
Note that, like the Option enum, the Result enum and its variants have been brought into scope
by the prelude, so we don’t need to specify Result:: before the Ok and Err variants in the
match arms.
Here we tell Rust that when the result is Ok, return the inner file value out of the Ok variant,
and we then assign that file handle value to the variable f. After the match, we can use the file
handle for reading or writing.
The other arm of the match handles the case where we get an Err value from File::open. In
this example, we’ve chosen to call the panic! macro. If there’s no file named hello.txt in our
current directory and we run this code, we’ll see the following output from the panic! macro:
thread 'main' panicked at 'There was a problem opening the file: Error { repr:
Os { code: 2, message: "No such file or directory" } }', src/main.rs:9:12
The code in Listing 9-4 will panic! no matter why File::open failed. What we want to do instead
is take different actions for different failure reasons: if File::open failed because the file doesn’t
exist, we want to create the file and return the handle to the new file. If File::open failed for any
other reason (for example, because we didn’t have permission to open the file), we still want the
code to panic! in the same way as it did in Listing 9-4. Look at Listing 9-5, which adds another
arm to the match.
Filename: src/main.rs
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let f = File::open("hello.txt");

    let f = match f {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Tried to create file but there was a problem: {:?}", e),
            },
            other_error => panic!("There was a problem opening the file: {:?}", other_error),
        },
    };
}
The type of the value that File::open returns inside the Err variant is io::Error, which is a
struct provided by the standard library. This struct has a method kind that we can call to get an
io::ErrorKind value. The enum io::ErrorKind is provided by the standard library and has
variants representing the different kinds of errors that might result from an io operation. The
variant we want to use is ErrorKind::NotFound, which indicates the file we’re trying to open
doesn’t exist yet. So we match on f, but we also have an inner match on error.kind().
The condition we want to check in the inner match is whether the value returned by
error.kind() is the NotFound variant of the ErrorKind enum. If it is, we try to create the file
with File::create. However, because File::create could also fail, we need to add another
inner match expression as well. When the file can’t be created, a different error message will be
printed. The last arm of the match on error.kind() panics as before, so the program panics on any
error besides the missing file error.
That’s a lot of match! match is very powerful, but also very much a primitive. In Chapter 13, we’ll
learn about closures. The Result<T, E> type has many methods that accept a closure and are
implemented using match expressions. A more seasoned Rustacean might write this:
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let f = File::open("hello.txt").unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            File::create("hello.txt").unwrap_or_else(|error| {
                panic!("Tried to create file but there was a problem: {:?}", error);
            })
        } else {
            panic!("There was a problem opening the file: {:?}", error);
        }
    });
}
Come back to this example after you’ve read Chapter 13, and look up what the
unwrap_or_else method does in the standard library documentation. There are many more of these
methods that can clean up huge nested match expressions when dealing with errors.
Using match works well enough, but it can be a bit verbose and doesn’t always communicate
intent well. The Result<T, E> type has many helper methods defined on it to do various tasks.
One of those methods, called unwrap, is a shortcut method that is implemented just like the
match expression we wrote in Listing 9-4. If the Result value is the Ok variant, unwrap will
return the value inside the Ok. If the Result is the Err variant, unwrap will call the panic!
macro for us. Here is an example of unwrap in action:
Filename: src/main.rs
use std::fs::File;

fn main() {
    let f = File::open("hello.txt").unwrap();
}
If we run this code without a hello.txt file, we’ll see an error message from the panic! call that the
unwrap method makes:
Another method, expect , which is similar to unwrap , lets us also choose the panic! error
message. Using expect instead of unwrap and providing good error messages can convey your
intent and make tracking down the source of a panic easier. The syntax of expect looks like this:
Filename: src/main.rs
use std::fs::File;

fn main() {
    let f = File::open("hello.txt").expect("Failed to open hello.txt");
}
We use expect in the same way as unwrap: to return the file handle or call the panic! macro.
The error message used by expect in its call to panic! will be the parameter that we pass to
expect, rather than the default panic! message that unwrap uses. Here’s what it looks like:
Because this error message starts with the text we specified, Failed to open hello.txt, it will be
easier to find where in the code this error message is coming from. If we use unwrap in multiple
places, it can take more time to figure out exactly which unwrap is causing the panic because all
unwrap calls that panic print the same message.
Propagating Errors
When you’re writing a function whose implementation calls something that might fail, instead of
handling the error within this function, you can return the error to the calling code so that it can
decide what to do. This is known as propagating the error and gives more control to the calling
code, where there might be more information or logic that dictates how the error should be
handled than what you have available in the context of your code.
For example, Listing 9-6 shows a function that reads a username from a file. If the file doesn’t exist
or can’t be read, this function will return those errors to the code that called this function:
Filename: src/main.rs
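use std::io;
use std::io::Read;
use std::fs::File;

fn read_username_from_file() -> Result<String, io::Error> {
    let f = File::open("hello.txt");
    let mut f = match f {
        Ok(file) => file,
        Err(e) => return Err(e),
    };
    let mut s = String::new();
    match f.read_to_string(&mut s) {
        Ok(_) => Ok(s),
        Err(e) => Err(e),
    }
}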
Listing 9-6: A function that returns errors to the calling code using match
This function can be written in a much shorter way, but we’re going to start by doing a lot of it
manually in order to explore error handling; at the end, we’ll show the easy way. Let’s look at the
return type of the function first: Result<String, io::Error>. This means the function is
returning a value of the type Result<T, E> where the generic parameter T has been filled in
with the concrete type String, and the generic type E has been filled in with the concrete type
io::Error. If this function succeeds without any problems, the code that calls this function will
receive an Ok value that holds a String: the username that this function read from the file. If
this function encounters any problems, the code that calls this function will receive an Err value
that holds an instance of io::Error that contains more information about what the problems
were. We chose io::Error as the return type of this function because that happens to be the
type of the error value returned from both of the operations we’re calling in this function’s body
that might fail: the File::open function and the read_to_string method.
The body of the function starts by calling the File::open function. Then we handle the Result
value returned with a match similar to the match in Listing 9-4, only instead of calling panic! in
the Err case, we return early from this function and pass the error value from File::open back
to the calling code as this function’s error value. If File::open succeeds, we store the file handle
in the variable f and continue.
Then we create a new String in variable s and call the read_to_string method on the file
handle in f to read the contents of the file into s. The read_to_string method also returns a
Result because it might fail, even though File::open succeeded. So we need another match to
handle that Result: if read_to_string succeeds, then our function has succeeded, and we
return the username from the file that’s now in s wrapped in an Ok. If read_to_string fails, we
return the error value in the same way that we returned the error value in the match that handled
the return value of File::open . However, we don’t need to explicitly say return , because this is
the last expression in the function.
The code that calls this code will then handle getting either an Ok value that contains a username
or an Err value that contains an io::Error . We don’t know what the calling code will do with
those values. If the calling code gets an Err value, it could call panic! and crash the program,
use a default username, or look up the username from somewhere other than a file, for example.
We don’t have enough information on what the calling code is actually trying to do, so we
propagate all the success or error information upward for it to handle appropriately.
This pattern of propagating errors is so common in Rust that Rust provides the question mark
operator ? to make this easier.
Listing 9-7 shows an implementation of read_username_from_file that has the same functionality
as it had in Listing 9-6, but this implementation uses the question mark operator:
Filename: src/main.rs
use std::io;
use std::io::Read;
use std::fs::File;
Listing 9-7: A function that returns errors to the calling code using ?
The ? placed after a Result value is defined to work in almost the same way as the match
expressions we defined to handle the Result values in Listing 9-6. If the value of the Result is an
Ok, the value inside the Ok will get returned from this expression, and the program will continue.
If the value is an Err, the Err will be returned from the whole function as if we had used the
return keyword so the error value gets propagated to the calling code.
There is a difference between what the match expression from Listing 9-6 and ? do: error values
taken by ? go through the from function, defined in the From trait in the standard library, which
is used to convert errors from one type into another. When ? calls the from function, the error
type received is converted into the error type defined in the return type of the current function.
This is useful when a function returns one error type to represent all the ways a function might
fail, even if parts might fail for many different reasons. As long as each error type implements the
from function to define how to convert itself to the returned error type, ? takes care of the
conversion automatically.
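The conversion step can be sketched as follows. ConfigError and parse_port are hypothetical names invented for this illustration; the only standard library pieces assumed are the From trait, Option::ok_or, and str::parse:

```rust
use std::num::ParseIntError;

// Hypothetical application-level error type that represents every way
// `parse_port` can fail.
#[derive(Debug)]
enum ConfigError {
    Missing,
    BadNumber(ParseIntError),
}

// Implementing From tells `?` how to turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> ConfigError {
        ConfigError::BadNumber(err)
    }
}

fn parse_port(input: Option<&str>) -> Result<u16, ConfigError> {
    // The error here is already a ConfigError, so `?` returns it unchanged.
    let text = input.ok_or(ConfigError::Missing)?;

    // `parse` fails with a ParseIntError; `?` converts it through
    // `ConfigError::from` before returning early.
    let port = text.trim().parse::<u16>()?;

    Ok(port)
}

fn main() {
    println!("{:?}", parse_port(Some("8080")));
    println!("{:?}", parse_port(Some("not a port")));
    println!("{:?}", parse_port(None));
}
```

Without the From implementation, the second ? would not compile, because the compiler would have no way to turn a ParseIntError into the function’s declared error type.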
In the context of Listing 9-7, the ? at the end of the File::open call will return the value inside an
Ok to the variable f . If an error occurs, ? will return early out of the whole function and give any
Err value to the calling code. The same thing applies to the ? at the end of the read_to_string
call.
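For reference, a function of the shape this passage describes might look like the sketch below. The path parameter and the name read_username_from_path are assumptions added so the sketch can be pointed at any file; the function discussed in the text takes no arguments and opens hello.txt directly:

```rust
use std::fs::File;
use std::io;
use std::io::Read;
use std::path::Path;

// Assumption: the path is a parameter here purely so the sketch is easy to try.
fn read_username_from_path(path: &Path) -> Result<String, io::Error> {
    // On Err, `?` returns the io::Error to the caller; on Ok, it yields the file handle.
    let mut f = File::open(path)?;
    let mut s = String::new();
    // The same applies here: a read failure is propagated, success continues.
    f.read_to_string(&mut s)?;
    Ok(s)
}

fn main() {
    match read_username_from_path(Path::new("hello.txt")) {
        Ok(name) => println!("username: {}", name),
        Err(e) => println!("could not read username: {}", e),
    }
}
```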
The ? operator eliminates a lot of boilerplate and makes this function’s implementation simpler.
We could even shorten this code further by chaining method calls immediately after the ? , as
shown in Listing 9-8:
Filename: src/main.rs
use std::io;
use std::io::Read;
use std::fs::File;

fn read_username_from_file() -> Result<String, io::Error> {
    let mut s = String::new();

    File::open("hello.txt")?.read_to_string(&mut s)?;

    Ok(s)
}
We’ve moved the creation of the new String in s to the beginning of the function; that part
hasn’t changed. Instead of creating a variable f , we’ve chained the call to read_to_string
directly onto the result of File::open("hello.txt")? . We still have a ? at the end of the
read_to_string call, and we still return an Ok value containing the username in s when both
File::open and read_to_string succeed rather than returning errors. The functionality is again
the same as in Listing 9-6 and Listing 9-7; this is just a different, more ergonomic way to write it.
Speaking of different ways to write this function, there’s a way to make this even shorter:
Filename: src/main.rs
use std::io;
use std::fs;
Reading a file into a string is a fairly common operation, and so Rust provides a convenience
function called fs::read_to_string that will open the file, create a new String, read the
contents of the file, and put the contents into that String, and then return it. Of course, this
doesn’t give us the opportunity to show off all of this error handling, so we did it the hard way at
first.
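A minimal sketch of that shorter form follows. As an assumption made for illustration, the function takes the path as a parameter (and is named read_username_from_path) rather than hard-coding hello.txt:

```rust
use std::fs;
use std::io;
use std::path::Path;

// fs::read_to_string opens the file, reads it into a new String, and returns
// the String or the io::Error, so the function body is a single expression.
fn read_username_from_path(path: &Path) -> Result<String, io::Error> {
    fs::read_to_string(path)
}

fn main() {
    match read_username_from_path(Path::new("hello.txt")) {
        Ok(name) => println!("username: {}", name),
        Err(e) => println!("could not read username: {}", e),
    }
}
```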
The ? operator can only be used in functions that have a return type of Result , because it is
defined to work in the same way as the match expression we defined in Listing 9-6. The part of
the match that requires a return type of Result is return Err(e) , so the return type of the
function must be a Result to be compatible with this return .
Let’s look at what happens if we use ? in the main function, which you’ll recall has a return type
of () :
use std::fs::File;

fn main() {
    let f = File::open("hello.txt")?;
}
error[E0277]: the `?` operator can only be used in a function that returns `Result`
or `Option` (or another type that implements `std::ops::Try`)
--> src/main.rs:4:13
|
4 | let f = File::open("hello.txt")?;
| ^^^^^^^^^^^^^^^^^^^^^^^^ cannot use the `?` operator in a function
that returns `()`
|
= help: the trait `std::ops::Try` is not implemented for `()`
= note: required by `std::ops::Try::from_error`
This error points out that we’re only allowed to use ? in a function that returns Result<T, E> . In
functions that don’t return Result<T, E> , when you call other functions that return
Result<T, E> , you’ll need to use a match or one of the Result<T, E> methods to handle the
Result<T, E> instead of using ? to potentially propagate the error to the calling code.
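However, the main function is also allowed to return a Result<T, E>, so another option is to write it like this:
use std::error::Error;
use std::fs::File;

fn main() -> Result<(), Box<dyn Error>> {
    let f = File::open("hello.txt")?;

    Ok(())
}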
The Box<dyn Error> is called a “trait object”, which we’ll talk about in Chapter 17. For now, you
can read Box<dyn Error> to mean “any kind of error.”
Now that we’ve discussed the details of calling panic! or returning Result , let’s return to the
topic of how to decide which is appropriate to use in which cases.
In rare situations, it’s more appropriate to write code that panics instead of returning a Result .
Let’s explore why it’s appropriate to panic in examples, prototype code, and tests. Then we’ll
discuss situations in which the compiler can’t tell that failure is impossible, but you as a human
can. The chapter will conclude with some general guidelines on how to decide whether to panic in
library code.
When you’re writing an example to illustrate some concept, having robust error-handling code in
the example as well can make the example less clear. In examples, it’s understood that a call to a
method like unwrap that could panic is meant as a placeholder for the way you’d want your
application to handle errors, which can differ based on what the rest of your code is doing.
Similarly, the unwrap and expect methods are very handy when prototyping, before you’re ready
to decide how to handle errors. They leave clear markers in your code for when you’re ready to
make your program more robust.
If a method call fails in a test, you’d want the whole test to fail, even if that method isn’t the
functionality under test. Because panic! is how a test is marked as a failure, calling unwrap or
expect is exactly what should happen.
It would also be appropriate to call unwrap when you have some other logic that ensures the
Result will have an Ok value, but the logic isn’t something the compiler understands. You’ll still
have a Result value that you need to handle: whatever operation you’re calling still has the
possibility of failing in general, even though it’s logically impossible in your particular situation. If
you can ensure by manually inspecting the code that you’ll never have an Err variant, it’s
perfectly acceptable to call unwrap . Here’s an example:
use std::net::IpAddr;

let home: IpAddr = "127.0.0.1".parse().unwrap();
We’re creating an IpAddr instance by parsing a hardcoded string. We can see that 127.0.0.1 is a
valid IP address, so it’s acceptable to use unwrap here. However, having a hardcoded, valid string
doesn’t change the return type of the parse method: we still get a Result value, and the
compiler will still make us handle the Result as if the Err variant is a possibility because the
compiler isn’t smart enough to see that this string is always a valid IP address. If the IP address
string came from a user rather than being hardcoded into the program and therefore did have a
possibility of failure, we’d definitely want to handle the Result in a more robust way instead.
It’s advisable to have your code panic when it’s possible that your code could end up in a bad
state. In this context, a bad state is when some assumption, guarantee, contract, or invariant has
been broken, such as when invalid values, contradictory values, or missing values are passed to
your code—plus one or more of the following:
If someone calls your code and passes in values that don’t make sense, the best choice might be
to call panic! and alert the person using your library to the bug in their code so they can fix it
during development. Similarly, panic! is often appropriate if you’re calling external code that is
out of your control and it returns an invalid state that you have no way of fixing.
However, when failure is expected, it is more appropriate to return a Result than to make a
panic! call. Examples include a parser being given malformed data or an HTTP request returning
a status that indicates you have hit a rate limit. In these cases, returning a Result indicates that
failure is an expected possibility that the calling code must decide how to handle.
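As a rough sketch of the parser case, the hypothetical function below treats malformed input as an expected failure and reports it through Result, leaving the decision about what to do next to the caller. The name parse_percentage and the choice of String as the error type are assumptions made for brevity:

```rust
// Hypothetical parser: bad input is expected to happen now and then, so it is
// returned as an Err value rather than causing a panic.
fn parse_percentage(input: &str) -> Result<u8, String> {
    let value = input
        .trim()
        .parse::<u8>()
        .map_err(|e| format!("'{}' is not a valid number: {}", input, e))?;

    if value > 100 {
        return Err(format!("{} is larger than 100", value));
    }

    Ok(value)
}

fn main() {
    for input in ["42", "abc", "150"] {
        // The caller decides how to react to each failure.
        match parse_percentage(input) {
            Ok(p) => println!("parsed {}%", p),
            Err(msg) => println!("rejected input: {}", msg),
        }
    }
}
```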
When your code performs operations on values, your code should verify the values are valid first
and panic if the values aren’t valid. This is mostly for safety reasons: attempting to operate on
invalid data can expose your code to vulnerabilities. This is the main reason the standard library
will call panic! if you attempt an out-of-bounds memory access: trying to access memory that
doesn’t belong to the current data structure is a common security problem. Functions often have
contracts: their behavior is only guaranteed if the inputs meet particular requirements. Panicking
when the contract is violated makes sense because a contract violation always indicates a caller-
side bug and it’s not a kind of error you want the calling code to have to explicitly handle. In fact,
there’s no reasonable way for calling code to recover; the calling programmers need to fix the
code. Contracts for a function, especially when a violation will cause a panic, should be explained
in the API documentation for the function.
However, having lots of error checks in all of your functions would be verbose and annoying.
Fortunately, you can use Rust’s type system (and thus the type checking the compiler does) to do
many of the checks for you. If your function has a particular type as a parameter, you can proceed
with your code’s logic knowing that the compiler has already ensured you have a valid value. For
example, if you have a type rather than an Option , your program expects to have something
rather than nothing. Your code then doesn’t have to handle two cases for the Some and None
variants: it will only have one case for definitely having a value. Code trying to pass nothing to your
function won’t even compile, so your function doesn’t have to check for that case at runtime.
Another example is using an unsigned integer type such as u32 , which ensures the parameter is
never negative.
Let’s take the idea of using Rust’s type system to ensure we have a valid value one step further and
look at creating a custom type for validation. Recall the guessing game in Chapter 2 in which our
code asked the user to guess a number between 1 and 100. We never validated that the user’s
guess was between those numbers before checking it against our secret number; we only
validated that the guess was positive. In this case, the consequences were not very dire: our
output of “Too high” or “Too low” would still be correct. But it would be a useful enhancement to
guide the user toward valid guesses and have different behavior when a user guesses a number
that’s out of range versus when a user types, for example, letters instead.
One way to do this would be to parse the guess as an i32 instead of only a u32 to allow
potentially negative numbers, and then add a check for the number being in range, like so:
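loop {
    // --snip--

    if guess < 1 || guess > 100 {
        println!("The secret number will be between 1 and 100.");
        continue;
    }

    match guess.cmp(&secret_number) {
        // --snip--
    }
}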
The if expression checks whether our value is out of range, tells the user about the problem,
and calls continue to start the next iteration of the loop and ask for another guess. After the if
expression, we can proceed with the comparisons between guess and the secret number
knowing that guess is between 1 and 100.
However, this is not an ideal solution: if it was absolutely critical that the program only operated
on values between 1 and 100, and it had many functions with this requirement, having a check like
this in every function would be tedious (and might impact performance).
Instead, we can make a new type and put the validations in a function to create an instance of the
type rather than repeating the validations everywhere. That way, it’s safe for functions to use the
new type in their signatures and confidently use the values they receive. Listing 9-10 shows one
way to define a Guess type that will only create an instance of Guess if the new function receives
a value between 1 and 100:
pub struct Guess {
    value: i32,
}

impl Guess {
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }

        Guess {
            value
        }
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}
Listing 9-10: A Guess type that will only continue with values between 1 and 100
First, we define a struct named Guess that has a field named value that holds an i32. This is
where the number will be stored.
Then we implement an associated function named new on Guess that creates instances of Guess
values. The new function is defined to have one parameter named value of type i32 and to
return a Guess. The code in the body of the new function tests value to make sure it’s between 1
and 100. If value doesn’t pass this test, we make a panic! call, which will alert the programmer
who is writing the calling code that they have a bug they need to fix, because creating a Guess
with a value outside this range would violate the contract that Guess::new is relying on. The
conditions in which Guess::new might panic should be discussed in its public-facing API
documentation; we’ll cover documentation conventions indicating the possibility of a panic! in
the API documentation that you create in Chapter 14. If value does pass the test, we create a new
Guess with its value field set to the value parameter and return the Guess.
Next, we implement a method named value that borrows self, doesn’t have any other
parameters, and returns an i32. This kind of method is sometimes called a getter, because its
purpose is to get some data from its fields and return it. This public method is necessary because
the value field of the Guess struct is private. It’s important that the value field be private so
code using the Guess struct is not allowed to set value directly: code outside the module must
use the Guess::new function to create an instance of Guess, thereby ensuring there’s no way for
a Guess to have a value that hasn’t been checked by the conditions in the Guess::new function.
A function that has a parameter or returns only numbers between 1 and 100 could then declare in
its signature that it takes or returns a Guess rather than an i32 and wouldn’t need to do any
additional checks in its body.
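To illustrate the last point, the sketch below repeats the Guess type described above and adds a hypothetical function, distance_from_secret, that accepts Guess values in its signature and therefore needs no range checks of its own:

```rust
pub struct Guess {
    value: i32,
}

impl Guess {
    // Panics if `value` is outside 1..=100, as described in the text.
    pub fn new(value: i32) -> Guess {
        if value < 1 || value > 100 {
            panic!("Guess value must be between 1 and 100, got {}.", value);
        }
        Guess { value }
    }

    pub fn value(&self) -> i32 {
        self.value
    }
}

// Hypothetical consumer: both arguments are already known to be in range
// because the only way to build a Guess is through Guess::new.
fn distance_from_secret(guess: &Guess, secret: &Guess) -> i32 {
    (guess.value() - secret.value()).abs()
}

fn main() {
    let guess = Guess::new(42);
    let secret = Guess::new(57);
    println!("off by {}", distance_from_secret(&guess, &secret));
}
```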
Summary
Rust’s error handling features are designed to help you write more robust code. The panic!
macro signals that your program is in a state it can’t handle and lets you tell the process to stop
instead of trying to proceed with invalid or incorrect values. The Result enum uses Rust’s type
system to indicate that operations might fail in a way that your code could recover from. You can
use Result to tell code that calls your code that it needs to handle potential success or failure as
well. Using panic! and Result in the appropriate situations will make your code more reliable in
the face of inevitable problems.
Now that you’ve seen useful ways that the standard library uses generics with the Option and
Result enums, we’ll talk about how generics work and how you can use them in your code.
Similar to the way a function takes parameters with unknown values to run the same code on
multiple concrete values, functions can take parameters of some generic type instead of a
concrete type, like i32 or String . In fact, we’ve already used generics in Chapter 6 with
Option<T> , Chapter 8 with Vec<T> and HashMap<K, V> , and Chapter 9 with Result<T, E> . In
this chapter, you’ll explore how to define your own types, functions, and methods with generics!
First, we’ll review how to extract a function to reduce code duplication. Next, we’ll use the same
technique to make a generic function from two functions that differ only in the types of their
parameters. We’ll also explain how to use generic types in struct and enum definitions.
Then you’ll learn how to use traits to define behavior in a generic way. You can combine traits with
generic types to constrain a generic type to only those types that have a particular behavior, as
opposed to just any type.
Finally, we’ll discuss lifetimes, a variety of generics that give the compiler information about how
references relate to each other. Lifetimes allow us to borrow values in many situations while still
enabling the compiler to check that the references are valid.
Consider a short program that finds the largest number in a list, as shown in Listing 10-1.
Filename: src/main.rs
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
This code stores a list of integers in the variable number_list and places the first number in the
list in a variable named largest . Then it iterates through all the numbers in the list, and if the
current number is greater than the number stored in largest , it replaces the number in that
variable. However, if the current number is less than the largest number seen so far, the variable
doesn’t change, and the code moves on to the next number in the list. After considering all the
numbers in the list, largest should hold the largest number, which in this case is 100.
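A minimal sketch of a program matching this description follows; the println! line at the end is an assumption added so the result is visible:

```rust
fn main() {
    let number_list = vec![34, 50, 25, 100, 65];

    // Start with the first number as the largest seen so far.
    let mut largest = number_list[0];

    // Replace `largest` whenever a bigger number is found.
    for number in number_list {
        if number > largest {
            largest = number;
        }
    }

    println!("The largest number is {}", largest);
}
```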
To find the largest number in two different lists of numbers, we can duplicate the code in Listing
10-1 and use the same logic at two different places in the program, as shown in Listing 10-2.
Filename: src/main.rs
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
Listing 10-2: Code to find the largest number in two lists of numbers
Although this code works, duplicating code is tedious and error prone. We also have to update the
code in multiple places when we want to change it.
To eliminate this duplication, we can create an abstraction by defining a function that operates on
any list of integers given to it in a parameter. This solution makes our code clearer and lets us
express the concept of finding the largest number in a list abstractly.
In Listing 10-3, we extracted the code that finds the largest number into a function named
largest. Unlike the code in Listing 10-1, which can find the largest number in only one particular
list, this program can find the largest number in two different lists.
Filename: src/main.rs
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
Listing 10-3: Abstracted code to find the largest number in two lists
The largest function has a parameter called list, which represents any concrete slice of i32
values that we might pass into the function. As a result, when we call the function, the code runs
on the specific values that we pass in.
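The extraction described above can be sketched as follows. This is a minimal reconstruction based on the prose, not necessarily the exact listing; the contents of the second list are an assumption chosen for illustration.

```rust
// Returns the largest value in a slice of i32 values.
// Like the prose, this sketch assumes the slice is not empty.
fn largest(list: &[i32]) -> i32 {
    // Start with the first element as the largest seen so far.
    let mut largest = list[0];

    for &item in list {
        // Replace the stored value only when a strictly greater one appears.
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&number_list));

    // Assumed second list, used only to show the function being reused.
    let number_list = vec![102, 34, 6000, 89, 54, 2, 43, 8];
    println!("The largest number is {}", largest(&number_list));
}
```

Because the logic lives in one function, a later change to how the largest value is found needs to be made in only one place.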
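In sum, here are the steps we took to change the code from Listing 10-2 to Listing 10-3:
1. Identify duplicate code.
2. Extract the duplicate code into the body of the function and specify the inputs and return
values of that code in the function signature.
3. Update the two instances of duplicated code to call the function instead.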
Next, we’ll use these same steps with generics to reduce code duplication in different ways. In the
same way that the function body can operate on an abstract list instead of specific values,
generics allow code to operate on abstract types.
For example, say we had two functions: one that finds the largest item in a slice of i32 values and
one that finds the largest item in a slice of char values. How would we eliminate that duplication?
Let’s find out!
In Function Definitions
When defining a function that uses generics, we place the generics in the signature of the function
where we would usually specify the data types of the parameters and return value. Doing so
makes our code more flexible and provides more functionality to callers of our function while
preventing code duplication.
Continuing with our largest function, Listing 10-4 shows two functions that both find the largest
value in a slice.
Filename: src/main.rs
largest
}
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
Listing 10-4: Two functions that differ only in their names and the types in their signatures
The largest_i32 function is the one we extracted in Listing 10-3 that finds the largest i32 in a
slice. The largest_char function finds the largest char in a slice. The function bodies have the
same code, so let’s eliminate the duplication by introducing a generic type parameter in a single
function.
To parameterize the types in the new function we’ll define, we need to name the type parameter,
just as we do for the value parameters to a function. You can use any identifier as a type
parameter name. But we’ll use T because, by convention, type parameter names in Rust are short,
often just a letter, and Rust’s type-naming convention is CamelCase. Short for “type,” T is the
default choice of most Rust programmers.
When we use a parameter in the body of the function, we have to declare the parameter name in
the signature so the compiler knows what that name means. Similarly, when we use a type
parameter name in a function signature, we have to declare the type parameter name before we
use it. To define the generic largest function, place type name declarations inside angle
brackets, <>, between the name of the function and the parameter list, like this:
fn largest<T>(list: &[T]) -> T {
We read this definition as: the function largest is generic over some type T. This function has
one parameter named list, which is a slice of values of type T. The largest function will return
a value of the same type T.
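To see the declaration syntax in a form that compiles without trait bounds, here is a small sketch. The function first_item is hypothetical and not part of the chapter; it returns a reference rather than a T so that no comparison or copying is needed.

```rust
// Hypothetical helper: generic over any element type T.
// The type parameter is declared in angle brackets between the function
// name and the parameter list, then used in the parameter and return types.
fn first_item<T>(list: &[T]) -> &T {
    // Indexing panics on an empty slice; this sketch assumes non-empty input.
    &list[0]
}

fn main() {
    let numbers = vec![34, 50, 25];
    let letters = vec!['y', 'm', 'a'];

    // The compiler infers T = i32 for the first call and T = char for the second.
    println!("first number: {}", first_item(&numbers));
    println!("first letter: {}", first_item(&letters));
}
```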
Listing 10-5 shows the combined largest function definition using the generic data type in its
signature. The listing also shows how we can call the function with either a slice of i32 values or
char values. Note that this code won’t compile yet, but we’ll fix it later in this chapter.
Filename: src/main.rs
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
Listing 10-5: A definition of the largest function that uses generic type parameters but doesn’t compile yet
Compiling this code produces an error whose note mentions std::cmp::PartialOrd, which is a trait. We’ll talk about traits in the next
section. For now, this error states that the body of largest won’t work for all possible types that
T could be. Because we want to compare values of type T in the body, we can only use types
whose values can be ordered. To enable comparisons, the standard library has the
std::cmp::PartialOrd trait that you can implement on types (see Appendix C for more on this
trait). You’ll learn how to specify that a generic type has a particular trait in the “Trait Bounds”
section, but let’s first explore other ways of using generic type parameters.
In Struct Definitions
We can also define structs to use a generic type parameter in one or more fields using the <>
syntax. Listing 10-6 shows how to define a Point<T> struct to hold x and y coordinate values of
any type.
Filename: src/main.rs
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let integer = Point { x: 5, y: 10 };
    let float = Point { x: 1.0, y: 4.0 };
}
The syntax for using generics in struct definitions is similar to that used in function definitions.
First, we declare the name of the type parameter inside angle brackets just after the name of the
struct. Then we can use the generic type in the struct definition where we would otherwise specify
concrete data types.
Note that because we’ve used only one generic type to define Point<T>, this definition says that
the Point<T> struct is generic over some type T, and the fields x and y are both that same
type, whatever that type may be. If we create an instance of a Point<T> that has values of
different types, as in Listing 10-7, our code won’t compile.
Filename: src/main.rs
struct Point<T> {
    x: T,
    y: T,
}

fn main() {
    let wont_work = Point { x: 5, y: 4.0 };
}
Listing 10-7: The fields x and y must be the same type because both have the same generic data type T.
In this example, when we assign the integer value 5 to x, we let the compiler know that the
generic type T will be an integer for this instance of Point<T>. Then when we specify 4.0 for y,
which we’ve defined to have the same type as x, we’ll get a type mismatch error: the compiler
expected an integer for y but found a floating-point number.
To define a Point struct where x and y are both generics but could have different types, we can
use multiple generic type parameters. For example, in Listing 10-8, we can change the definition of
Point to be generic over types T and U where x is of type T and y is of type U.
Filename: src/main.rs
struct Point<T, U> {
    x: T,
    y: U,
}
fn main() {
    let both_integer = Point { x: 5, y: 10 };
    let both_float = Point { x: 1.0, y: 4.0 };
    let integer_and_float = Point { x: 5, y: 4.0 };
}
Listing 10-8: A Point<T, U> generic over two types so that x and y can be values of different types
Now all the instances of Point shown are allowed! You can use as many generic type parameters
in a definition as you want, but using more than a few makes your code hard to read. When you
need lots of generic types in your code, it could indicate that your code needs restructuring into
smaller pieces.
In Enum Definitions
As we did with structs, we can define enums to hold generic data types in their variants. Let’s take
another look at the Option<T> enum that the standard library provides, which we used in Chapter
6:
enum Option<T> {
    Some(T),
    None,
}
This definition should now make more sense to you. As you can see, Option<T> is an enum that is
generic over type T and has two variants: Some, which holds one value of type T, and a None
variant that doesn’t hold any value. By using the Option<T> enum, we can express the abstract
concept of having an optional value, and because Option<T> is generic, we can use this
abstraction no matter what the type of the optional value is.
Enums can use multiple generic types as well. The definition of the Result enum that we used in
Chapter 9 is one example:
enum Result<T, E> {
    Ok(T),
    Err(E),
}
The Result enum is generic over two types, T and E, and has two variants: Ok, which holds a
value of type T, and Err, which holds a value of type E. This definition makes it convenient to
use the Result enum anywhere we have an operation that might succeed (return a value of some
type T) or fail (return an error of some type E). In fact, this is what we used to open a file in
Listing 9-3, where T was filled in with the type std::fs::File when the file was opened
successfully and E was filled in with the type std::io::Error when there were problems
opening the file.
When you recognize situations in your code with multiple struct or enum definitions that differ
only in the types of the values they hold, you can avoid duplication by using generic types instead.
In Method Definitions
We can implement methods on structs and enums (as we did in Chapter 5) and use generic types
in their definitions, too. Listing 10-9 shows the Point<T> struct we defined in Listing 10-6 with a
method named x implemented on it.
Filename: src/main.rs
struct Point<T> {
    x: T,
    y: T,
}

impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

fn main() {
    let p = Point { x: 5, y: 10 };

    println!("p.x = {}", p.x());
}
Listing 10-9: Implementing a method named x on the Point<T> struct that will return a reference to the x
field of type T
Here, we’ve defined a method named x on Point<T> that returns a reference to the data in the
field x.
Note that we have to declare T just after impl so we can use it to specify that we’re
implementing methods on the type Point<T>. By declaring T as a generic type after impl, Rust
can identify that the type in the angle brackets in Point is a generic type rather than a concrete
type.
We could, for example, implement methods only on Point<f32> instances rather than on
Point<T> instances with any generic type. In Listing 10-10 we use the concrete type f32,
meaning we don’t declare any types after impl.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}
Listing 10-10: An impl block that only applies to a struct with a particular concrete type for the generic type
parameter T
This code means the type Point<f32> will have a method named distance_from_origin and
other instances of Point<T> where T is not of type f32 will not have this method defined. The
method measures how far our point is from the point at coordinates (0.0, 0.0) and uses
mathematical operations that are available only for floating point types.
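The two kinds of impl block can be put side by side in one program. The sketch below combines the generic method x with the f32-only method distance_from_origin; the coordinate values are assumptions picked so the distance works out to a round number.

```rust
struct Point<T> {
    x: T,
    y: T,
}

// Available on every Point<T>, whatever T is.
impl<T> Point<T> {
    fn x(&self) -> &T {
        &self.x
    }
}

// Available only when T is the concrete type f32.
impl Point<f32> {
    fn distance_from_origin(&self) -> f32 {
        (self.x.powi(2) + self.y.powi(2)).sqrt()
    }
}

fn main() {
    let integers = Point { x: 5, y: 10 };
    println!("integers.x = {}", integers.x());

    let floats = Point { x: 3.0_f32, y: 4.0_f32 };
    println!("floats.x = {}", floats.x());
    println!("distance = {}", floats.distance_from_origin());

    // integers.distance_from_origin(); // would not compile: Point<i32> has no such method
}
```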
Generic type parameters in a struct definition aren’t always the same as those you use in that
struct’s method signatures. For example, Listing 10-11 defines the method mixup on the
Point<T, U> struct from Listing 10-8. The method takes another Point as a parameter, which
might have different types than the self Point we’re calling mixup on. The method creates a
new Point instance with the x value from the self Point (of type T) and the y value from
the passed-in Point (of type W).
Filename: src/main.rs
fn main() {
    let p1 = Point { x: 5, y: 10.4 };
    let p2 = Point { x: "Hello", y: 'c' };
    let p3 = p1.mixup(p2);
    println!("p3.x = {}, p3.y = {}", p3.x, p3.y);
}
Listing 10-11: A method that uses different generic types than its struct’s definition
In main, we’ve defined a Point that has an i32 for x (with value 5) and an f64 for y (with
value 10.4). The p2 variable is a Point struct that has a string slice for x (with value "Hello")
and a char for y (with value 'c'). Calling mixup on p1 with the argument p2 gives us p3, which
will have an i32 for x, because x came from p1. The p3 variable will have a char for y,
because y came from p2. The println! macro call will print p3.x = 5, p3.y = c.
The purpose of this example is to demonstrate a situation in which some generic parameters are
declared with impl and some are declared with the method definition. Here, the generic
parameters T and U are declared after impl, because they go with the struct definition. The
generic parameters V and W are declared after fn mixup, because they’re only relevant to the
method.
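A complete program matching this description might look like the sketch below. The struct and method are reconstructed from the prose (T and U on the impl, V and W on mixup), so treat the exact layout as an assumption rather than the original listing.

```rust
struct Point<T, U> {
    x: T,
    y: U,
}

// T and U belong to the struct, so they are declared after impl.
impl<T, U> Point<T, U> {
    // V and W are only relevant to this method, so they are declared after fn mixup.
    fn mixup<V, W>(self, other: Point<V, W>) -> Point<T, W> {
        Point {
            x: self.x,  // keeps the x value (type T) from self
            y: other.y, // takes the y value (type W) from the other point
        }
    }
}

fn main() {
    let p1 = Point { x: 5, y: 10.4 };
    let p2 = Point { x: "Hello", y: 'c' };

    let p3 = p1.mixup(p2);

    println!("p3.x = {}, p3.y = {}", p3.x, p3.y);
}
```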
You might be wondering whether there is a runtime cost when you’re using generic type
parameters. The good news is that Rust implements generics in such a way that your code doesn’t
run any slower using generic types than it would with concrete types.
Rust accomplishes this by performing monomorphization of the code that is using generics at
compile time. Monomorphization is the process of turning generic code into specific code by filling
in the concrete types that are used when compiled.
In this process, the compiler does the opposite of the steps we used to create the generic function
in Listing 10-5: the compiler looks at all the places where generic code is called and generates code
for the concrete types the generic code is called with.
Let’s look at how this works with an example that uses the standard library’s Option<T> enum:
let integer = Some(5);
let float = Some(5.0);
When Rust compiles this code, it performs monomorphization. During that process, the compiler
reads the values that have been used in Option<T> instances and identifies two kinds of
Option<T>: one is i32 and the other is f64. As such, it expands the generic definition of
Option<T> into Option_i32 and Option_f64, thereby replacing the generic definition with the
specific ones.
The monomorphized version of the code looks like the following. The generic Option<T> is
replaced with the specific definitions created by the compiler:
Filename: src/main.rs
enum Option_i32 {
    Some(i32),
    None,
}

enum Option_f64 {
    Some(f64),
    None,
}

fn main() {
    let integer = Option_i32::Some(5);
    let float = Option_f64::Some(5.0);
}
Because Rust compiles generic code into code that specifies the type in each instance, we pay no
runtime cost for using generics. When the code runs, it performs just as it would if we had
duplicated each definition by hand. The process of monomorphization makes Rust’s generics
extremely efficient at runtime.
Note: Traits are similar to a feature often called interfaces in other languages, although with
some differences.
Defining a Trait
A type’s behavior consists of the methods we can call on that type. Different types share the same
behavior if we can call the same methods on all of those types. Trait definitions are a way to group
method signatures together to define a set of behaviors necessary to accomplish some purpose.
For example, let’s say we have multiple structs that hold various kinds and amounts of text: a
NewsArticle struct that holds a news story filed in a particular location and a Tweet that can
have at most 280 characters along with metadata that indicates whether it was a new tweet, a
retweet, or a reply to another tweet.
We want to make a media aggregator library that can display summaries of data that might be
stored in a NewsArticle or Tweet instance. To do this, we need a summary from each type, and
we need to request that summary by calling a summarize method on an instance. Listing 10-12
shows the definition of a Summary trait that expresses this behavior.
Filename: src/lib.rs
pub trait Summary {
    fn summarize(&self) -> String;
}
Listing 10-12: A Summary trait that consists of the behavior provided by a summarize method
Here, we declare a trait using the trait keyword and then the trait’s name, which is Summary in
this case. Inside the curly brackets, we declare the method signatures that describe the behaviors
of the types that implement this trait, which in this case is fn summarize(&self) -> String .
After the method signature, instead of providing an implementation within curly brackets, we use
a semicolon. Each type implementing this trait must provide its own custom behavior for the body
of the method. The compiler will enforce that any type that has the Summary trait will have the
method summarize defined with this signature exactly.
A trait can have multiple methods in its body: the method signatures are listed one per line and
each line ends in a semicolon.
Now that we’ve defined the desired behavior using the Summary trait, we can implement it on the
types in our media aggregator. Listing 10-13 shows an implementation of the Summary trait on the
NewsArticle struct that uses the headline, the author, and the location to create the return value
of summarize. For the Tweet struct, we define summarize as the username followed by the entire
text of the tweet, assuming that tweet content is already limited to 280 characters.
Filename: src/lib.rs
pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}
Listing 10-13: Implementing the Summary trait on the NewsArticle and Tweet types
Implementing a trait on a type is similar to implementing regular methods. The difference is that
after impl, we put the trait name that we want to implement, then use the for keyword, and
then specify the name of the type we want to implement the trait for. Within the impl block, we
put the method signatures that the trait definition has defined. Instead of adding a semicolon
after each signature, we use curly brackets and fill in the method body with the specific behavior
that we want the methods of the trait to have for the particular type.
After implementing the trait, we can call the methods on instances of NewsArticle and Tweet in
the same way we call regular methods: for a Tweet instance named tweet, the call is tweet.summarize().
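The following sketch puts the trait, both implementations, and a call together in one file. The Tweet field names (username, content, reply, retweet) and the sample values are assumptions based on the description above; the summary formats follow the prose (headline, author, and location for articles; username followed by the text for tweets).

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct NewsArticle {
    pub headline: String,
    pub location: String,
    pub author: String,
    pub content: String,
}

impl Summary for NewsArticle {
    // Built from the headline, the author, and the location.
    fn summarize(&self) -> String {
        format!("{}, by {} ({})", self.headline, self.author, self.location)
    }
}

// Assumed layout: the text only says a Tweet carries its text plus
// metadata about being a reply or a retweet.
pub struct Tweet {
    pub username: String,
    pub content: String,
    pub reply: bool,
    pub retweet: bool,
}

impl Summary for Tweet {
    // The username followed by the entire text of the tweet.
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

fn main() {
    let tweet = Tweet {
        username: String::from("horse_ebooks"),
        content: String::from("of course, as you probably already know, people"),
        reply: false,
        retweet: false,
    };

    // Trait methods are called like regular methods.
    println!("1 new tweet: {}", tweet.summarize());
}
```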
Note that because we defined the Summary trait and the NewsArticle and Tweet types in the
same lib.rs in Listing 10-13, they’re all in the same scope. Let’s say this lib.rs is for a crate we’ve
called aggregator and someone else wants to use our crate’s functionality to implement the
Summary trait on a struct defined within their library’s scope. They would need to bring the trait
into their scope first. They would do so by specifying use aggregator::Summary;, which then
would enable them to implement Summary for their type. The Summary trait would also need to be
a public trait for another crate to implement it, which it is because we put the pub keyword before
trait in Listing 10-12.
One restriction to note with trait implementations is that we can implement a trait on a type only if
either the trait or the type is local to our crate. For example, we can implement standard library
traits like Display on a custom type like Tweet as part of our aggregator crate functionality,
because the type Tweet is local to our aggregator crate. We can also implement Summary on
Vec<T> in our aggregator crate, because the trait Summary is local to our aggregator crate.
But we can’t implement external traits on external types. For example, we can’t implement the
Display trait on Vec<T> within our aggregator crate, because Display and Vec<T> are defined
in the standard library and aren’t local to our aggregator crate. This restriction is part of a
property of programs called coherence, and more specifically the orphan rule, so named because
the parent type is not present. This rule ensures that other people’s code can’t break your code
and vice versa. Without the rule, two crates could implement the same trait for the same type, and
Rust wouldn’t know which implementation to use.
Default Implementations
Sometimes it’s useful to have default behavior for some or all of the methods in a trait instead of
requiring implementations for all methods on every type. Then, as we implement the trait on a
particular type, we can keep or override each method’s default behavior.
Listing 10-14 shows how to specify a default string for the summarize method of the Summary trait
instead of only defining the method signature, as we did in Listing 10-12.
Filename: src/lib.rs
pub trait Summary {
    fn summarize(&self) -> String {
        String::from("(Read more...)")
    }
}
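Listing 10-14: Definition of a Summary trait with a default implementation of the summarize method
To use the default implementation to summarize instances of NewsArticle instead of defining a
custom implementation, we specify an empty impl block: impl Summary for NewsArticle {}.
Even though we’re no longer defining the summarize method on NewsArticle directly, we’ve
provided a default implementation and specified that NewsArticle implements the Summary
trait. As a result, we can still call the summarize method on an instance of NewsArticle, and
the call returns the default string (Read more...).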
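As a rough sketch of how a type opts into the default behavior: an empty impl block is enough, because the trait already supplies a body for summarize. The NewsArticle here is trimmed to a single field, and its sample value is an assumption made to keep the example short.

```rust
pub trait Summary {
    // Default implementation, used by any implementor that does not override it.
    fn summarize(&self) -> String {
        String::from("(Read more...)")
    }
}

// Trimmed-down stand-in for the chapter's NewsArticle struct.
pub struct NewsArticle {
    pub headline: String,
}

// Empty impl block: NewsArticle implements Summary and keeps the default method.
impl Summary for NewsArticle {}

fn main() {
    let article = NewsArticle {
        headline: String::from("Penguins win the Stanley Cup Championship!"),
    };

    println!("{}", article.headline);
    println!("New article available! {}", article.summarize());
}
```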
Creating a default implementation for summarize doesn’t require us to change anything about the
implementation of Summary on Tweet in Listing 10-13. The reason is that the syntax for overriding
a default implementation is the same as the syntax for implementing a trait method that doesn’t
have a default implementation.
Default implementations can call other methods in the same trait, even if those other methods
don’t have a default implementation. In this way, a trait can provide a lot of useful functionality
and only require implementors to specify a small part of it. For example, we could define the
Summary trait to have a summarize_author method whose implementation is required, and then
define a summarize method that has a default implementation that calls the summarize_author
method:
pub trait Summary {
fn summarize_author(&self) -> String;
To use this version of Summary, we only need to define summarize_author when we implement
the trait on a type.
After we define summarize_author, we can call summarize on instances of the Tweet struct, and
the default implementation of summarize will call the definition of summarize_author that we’ve
provided. Because we’ve implemented summarize_author, the Summary trait has given us the
behavior of the summarize method without requiring us to write any more code.
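One way this arrangement could look is sketched below. The exact wording of the default summary string, the @ prefix, and the two-field Tweet are assumptions for illustration; what matters is that the default summarize calls the required summarize_author.

```rust
pub trait Summary {
    // Required: every implementor must provide this.
    fn summarize_author(&self) -> String;

    // Provided: the default body calls the required method above.
    fn summarize(&self) -> String {
        format!("(Read more from {}...)", self.summarize_author())
    }
}

// Trimmed-down Tweet for this sketch.
pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    // Only the required method is written; summarize comes for free.
    fn summarize_author(&self) -> String {
        format!("@{}", self.username)
    }
}

fn main() {
    let tweet = Tweet {
        username: String::from("horse_ebooks"),
        content: String::from("of course, as you probably already know, people"),
    };

    println!("{}", tweet.content);
    println!("1 new tweet: {}", tweet.summarize());
}
```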
Note that it isn’t possible to call the default implementation from an overriding implementation of
that same method.
Traits as Arguments
Now that you know how to define traits and implement those traits on types, we can explore how
to use traits to accept arguments of many different types.
For example, in Listing 10-13, we implemented the Summary trait on the types NewsArticle and
Tweet. We can define a function notify that calls the summarize method on its parameter item,
which is of some type that implements the Summary trait. To do this, we can use the impl Trait
syntax, declaring the parameter as item: impl Summary.
In the body of notify, we can call any methods on item that come from the Summary trait, like
summarize.
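A minimal sketch of such a notify function follows. Returning the message as a String rather than printing it inside notify is a choice made here so the result is easy to inspect; the message wording and the trimmed-down Tweet are also assumptions.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// `item` may be a value of any type that implements Summary.
pub fn notify(item: impl Summary) -> String {
    // Only methods that come from the Summary trait are available on `item`.
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("generics are fun"),
    };

    println!("{}", notify(tweet));
}
```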
Trait Bounds
The impl Trait syntax works for short examples, but is syntax sugar for a longer form. This is
called a trait bound, and with it the signature of notify becomes pub fn notify<T: Summary>(item: T).
This is equivalent to the impl Trait form, but is a bit more verbose. We place trait bounds with the
declaration of the generic type parameter, after a colon and inside angle brackets. Because of the
trait bound on T, we can call notify and pass in any instance of NewsArticle or Tweet. Code
that calls the function with any other type, like a String or an i32, won’t compile, because those
types don’t implement Summary.
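The trait bound form can be sketched the same way. As before, notify returns its message instead of printing it, and the trimmed-down Tweet is an assumption; the commented-out line marks a call the compiler would reject.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// The bound `T: Summary` sits with the declaration of the type parameter.
pub fn notify<T: Summary>(item: T) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("trait bounds in action"),
    };

    println!("{}", notify(tweet));

    // notify(String::from("plain text")); // would not compile: String does not implement Summary
}
```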
When should you use this form over impl Trait? While impl Trait is nice for shorter examples,
trait bounds are nice for more complex ones. For example, say we wanted to take two things that
implement Summary, declaring the parameters as item1: impl Summary and item2: impl Summary.
This would work well if item1 and item2 were allowed to have different types (as long as both
implement Summary). But what if you wanted to force both to have the exact same type? That is
only possible if you use a trait bound, declaring T: Summary once and giving both parameters the type T.
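The difference between the two forms shows up when the arguments have different types. In this sketch the function names notify_any and notify_same are hypothetical (Rust has no overloading, so the two variants need distinct names), and both types are trimmed-down stand-ins.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
}

pub struct NewsArticle {
    pub headline: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("tweet by {}", self.username)
    }
}

impl Summary for NewsArticle {
    fn summarize(&self) -> String {
        format!("article: {}", self.headline)
    }
}

// Each `impl Summary` is its own anonymous type parameter,
// so the two arguments may have different types.
pub fn notify_any(item1: impl Summary, item2: impl Summary) -> String {
    format!("{} | {}", item1.summarize(), item2.summarize())
}

// A single type parameter T forces both arguments to share one type.
pub fn notify_same<T: Summary>(item1: T, item2: T) -> String {
    format!("{} | {}", item1.summarize(), item2.summarize())
}

fn main() {
    let tweet = Tweet { username: String::from("a") };
    let article = NewsArticle { headline: String::from("News") };
    println!("{}", notify_any(tweet, article));

    let first = Tweet { username: String::from("a") };
    let second = Tweet { username: String::from("b") };
    println!("{}", notify_same(first, second));

    // notify_same(Tweet { .. }, NewsArticle { .. }) would not compile: the types differ.
}
```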
If notify needed to use display formatting on item, as well as the summarize method, then
item would need to implement two different traits at the same time: Display and Summary. This
can be done using the + syntax, as in the bound T: Summary + Display.
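A sketch of a function that needs both traits is shown below. The Display implementation for Tweet, the message format, and the String return value are all assumptions added so the example is complete and checkable.

```rust
use std::fmt::{self, Display};

pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// Assumed Display implementation, needed so Tweet satisfies both bounds.
impl Display for Tweet {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "@{}", self.username)
    }
}

// `+` requires T to implement Summary and Display at the same time.
pub fn notify<T: Summary + Display>(item: T) -> String {
    // `{}` uses Display; summarize comes from Summary.
    format!("Breaking news from {}! {}", item, item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("generics are fun"),
    };

    println!("{}", notify(tweet));
}
```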
However, there are downsides to using too many trait bounds. Each generic has its own trait
bounds, so functions with multiple generic type parameters can have lots of trait bound
information between a function’s name and its parameter list, making the function signature hard
to read. For this reason, Rust has alternate syntax for specifying trait bounds inside a where
clause after the function signature. Instead of listing every bound inside the angle brackets, we
declare only the type parameter names there and write the bounds in the where clause.
With a where clause, the function’s signature is less cluttered in that the function name, parameter list, and return type
are close together, similar to a function without lots of trait bounds.
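To compare the two spellings, the sketch below writes the same bounds once inline and once in a where clause. The function names describe_inline and describe_where, their bodies, and the particular bounds are hypothetical; only the placement of the bounds is the point.

```rust
use std::fmt::{Debug, Display};

// Bounds written inline: they sit between the function name and the parameter list.
fn describe_inline<T: Display + Clone, U: Clone + Debug>(t: T, u: U) -> String {
    format!("{} / {:?}", t, u)
}

// The same bounds moved into a where clause after the signature.
fn describe_where<T, U>(t: T, u: U) -> String
where
    T: Display + Clone,
    U: Clone + Debug,
{
    format!("{} / {:?}", t, u)
}

fn main() {
    // Both functions accept exactly the same argument types.
    println!("{}", describe_inline(5, vec![1, 2]));
    println!("{}", describe_where(5, vec![1, 2]));
}
```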
Returning Traits
We can use the impl Trait syntax in return position as well, to return something that
implements a trait, as in a function signature ending in -> impl Summary.
This signature says, “I’m going to return something that implements the Summary trait, but I’m not
going to tell you the exact type.” If the function body builds and returns a Tweet, the caller
doesn’t know that.
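A small sketch of impl Trait in return position follows. The function name returns_summarizable and the sample tweet are assumptions; callers can use only what the Summary trait exposes.

```rust
pub trait Summary {
    fn summarize(&self) -> String;
}

pub struct Tweet {
    pub username: String,
    pub content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("{}: {}", self.username, self.content)
    }
}

// The concrete type (Tweet) is hidden from callers; they see only `impl Summary`.
fn returns_summarizable() -> impl Summary {
    Tweet {
        username: String::from("horse_ebooks"),
        content: String::from("of course, as you probably already know, people"),
    }
}

fn main() {
    let item = returns_summarizable();

    // Only Summary methods are usable here; fields such as `username` are not visible.
    println!("{}", item.summarize());
}
```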
Why is this useful? In Chapter 13, we’re going to learn about two features that rely heavily on traits:
closures and iterators. These features create types that only the compiler knows, or types that are
very, very long. impl Trait lets you simply say “this returns an Iterator” without needing to
write out a really long type.
This only works if you have a single type that you’re returning, however. For example, a function
declared to return impl Summary that returns a NewsArticle in one branch and a Tweet in another
would not work. Such a function is rejected because of restrictions
around how impl Trait works. To write this code, you’ll have to wait until the “Using Trait Objects
that Allow for Values of Different Types” section of Chapter 17.
Now that you know how to specify the behavior you want to use using the generic type
parameter’s bounds, let’s return to Listing 10-5 to fix the definition of the largest function that
uses a generic type parameter! Last time we tried to compile that code, we received an error whose
note mentioned std::cmp::PartialOrd.
In the body of largest we wanted to compare two values of type T using the greater than (>)
operator. Because that operator is defined as a default method on the standard library trait
std::cmp::PartialOrd, we need to specify PartialOrd in the trait bounds for T so the largest
function can work on slices of any type that we can compare. We don’t need to bring PartialOrd
into scope because it’s in the prelude. Change the signature of largest to look like this:
fn largest<T: PartialOrd>(list: &[T]) -> T {
This time when we compile the code, we get a different set of errors.
The key line in these errors is cannot move out of type [T], a non-copy slice. With our non-generic
versions of the largest function, we were only trying to find the largest i32 or char. As
discussed in the “Stack-Only Data: Copy” section in Chapter 4, types like i32 and char that have a
known size can be stored on the stack, so they implement the Copy trait. But when we made the
largest function generic, it became possible for the list parameter to have types in it that
don’t implement the Copy trait. Consequently, we wouldn’t be able to move the value out of
list[0] and into the largest variable, resulting in this error.
To call this code with only those types that implement the Copy trait, we can add Copy to the trait
bounds of T! Listing 10-15 shows the complete code of a generic largest function that will
compile as long as the types of the values in the slice that we pass into the function implement the
PartialOrd and Copy traits, like i32 and char do.
Filename: src/main.rs
largest
}
fn main() {
let number_list = vec![34, 50, 25, 100, 65];
Listing 10-15: A working definition of the largest function that works on any generic type that implements
the PartialOrd and Copy traits
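A version of largest with both bounds might look like the following sketch. It is reconstructed from the description above; the contents of the char list are an assumption.

```rust
// T must support comparison (PartialOrd) and cheap copying (Copy).
fn largest<T: PartialOrd + Copy>(list: &[T]) -> T {
    // Copy lets us take the value out of the slice without moving it.
    let mut largest = list[0];

    for &item in list.iter() {
        // PartialOrd provides the > operator.
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&number_list));

    // Assumed list of chars, compared by their Unicode scalar values.
    let char_list = vec!['y', 'm', 'a', 'q'];
    println!("The largest char is {}", largest(&char_list));
}
```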
If we don’t want to restrict the largest function to the types that implement the Copy trait, we
could specify that T has the trait bound Clone instead of Copy . Then we could clone each value
in the slice when we want the largest function to have ownership. Using the clone function
means we’re potentially making more heap allocations in the case of types that own heap data like
String , and heap allocations can be slow if we’re working with large amounts of data.
Another way we could implement largest is for the function to return a reference to a T value in
the slice. If we change the return type to &T instead of T , thereby changing the body of the
function to return a reference, we wouldn’t need the Clone or Copy trait bounds and we could
avoid heap allocations. Try implementing these alternate solutions on your own!
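For readers who want to check their attempt, here is one possible sketch of the reference-returning variant. It needs only the PartialOrd bound, so it also works for types such as String that are not Copy; the sample lists are assumptions.

```rust
// Returns a reference into the slice, so no Copy or Clone bound is required.
fn largest<T: PartialOrd>(list: &[T]) -> &T {
    let mut largest = &list[0];

    for item in list {
        // Both sides are &T; comparing references compares the values they point to.
        if item > largest {
            largest = item;
        }
    }

    largest
}

fn main() {
    let number_list = vec![34, 50, 25, 100, 65];
    println!("The largest number is {}", largest(&number_list));

    // String owns heap data and is not Copy, yet nothing is cloned here.
    let words = vec![
        String::from("apple"),
        String::from("pear"),
        String::from("fig"),
    ];
    println!("The largest word is {}", largest(&words));
}
```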
By using a trait bound with an impl block that uses generic type parameters, we can implement
methods conditionally for types that implement the specified traits. For example, the type
Pair<T> in Listing 10-16 always implements the new function. But Pair<T> only implements the
cmp_display method if its inner type T implements the PartialOrd trait that enables
comparison and the Display trait that enables printing.
use std::fmt::Display;

struct Pair<T> {
    x: T,
    y: T,
}

impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self {
        Self {
            x,
            y,
        }
    }
}
Listing 10-16: Conditionally implement methods on a generic type depending on trait bounds
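The conditional impl block described for cmp_display can be sketched as follows. The method body and printed wording are reconstructed from the description, and the Opaque type is a hypothetical stand-in for a type that implements neither Display nor PartialOrd.

```rust
use std::fmt::Display;

struct Pair<T> {
    x: T,
    y: T,
}

// Always available, for every T.
impl<T> Pair<T> {
    fn new(x: T, y: T) -> Self {
        Self { x, y }
    }
}

// Available only when T can be both printed and compared.
impl<T: Display + PartialOrd> Pair<T> {
    fn cmp_display(&self) {
        if self.x >= self.y {
            println!("The largest member is x = {}", self.x);
        } else {
            println!("The largest member is y = {}", self.y);
        }
    }
}

// Hypothetical type with neither Display nor PartialOrd.
struct Opaque;

fn main() {
    let numbers = Pair::new(3, 7);
    numbers.cmp_display();

    // Pair::new still works for Opaque, but cmp_display does not exist on it.
    let _opaque = Pair::new(Opaque, Opaque);
    // _opaque.cmp_display(); // would not compile: Opaque lacks the required traits
}
```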
We can also conditionally implement a trait for any type that implements another trait.
Implementations of a trait on any type that satisfies the trait bounds are called blanket
implementations and are extensively used in the Rust standard library. For example, the standard
library implements the ToString trait on any type that implements the Display trait. The impl
block in the standard library looks similar to this code:
impl<T: Display> ToString for T {
    // --snip--
}
Because the standard library has this blanket implementation, we can call the to_string method
defined by the ToString trait on any type that implements the Display trait. For example, we
can turn integers into their corresponding String values like this because integers implement
Display:
let s = 3.to_string();
Blanket implementations appear in the documentation for the trait in the “Implementors” section.
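The same pattern works for traits of our own. In this sketch the trait Bracketed and its method are hypothetical, invented only to show a blanket implementation over every type that implements Display.

```rust
use std::fmt::Display;

// Hypothetical trait defined in our own crate.
trait Bracketed {
    fn bracketed(&self) -> String;
}

// Blanket implementation: every type that implements Display gets Bracketed.
impl<T: Display> Bracketed for T {
    fn bracketed(&self) -> String {
        format!("<{}>", self)
    }
}

fn main() {
    // i32 and char implement Display, so both gain the method automatically.
    println!("{}", 42_i32.bracketed());
    println!("{}", 'c'.bracketed());

    // The standard library's blanket impl of ToString works the same way.
    let s = 3.to_string();
    println!("{}", s);
}
```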
Traits and trait bounds let us write code that uses generic type parameters to reduce duplication
but also specify to the compiler that we want the generic type to have particular behavior. The
compiler can then use the trait bound information to check that all the concrete types used with
our code provide the correct behavior. In dynamically typed languages, we would get an error at
runtime if we called a method on a type that the type didn’t implement. But Rust moves these
errors to compile time so we’re forced to fix the problems before our code is even able to run.
Additionally, we don’t have to write code that checks for behavior at runtime because we’ve
already checked at compile time. Doing so improves performance without having to give up the
flexibility of generics.
Another kind of generic that we’ve already been using is called lifetimes. Rather than ensuring that
a type has the behavior we want, lifetimes ensure that references are valid as long as we need
them to be. Let’s look at how lifetimes do that.
The concept of lifetimes is somewhat different from tools in other programming languages,
arguably making lifetimes Rust’s most distinctive feature. Although we won’t cover lifetimes in
their entirety in this chapter, we’ll discuss common ways you might encounter lifetime syntax so
you can become familiar with the concepts. See the “Advanced Lifetimes” section in Chapter 19 for
more detailed information.
The main aim of lifetimes is to prevent dangling references, which cause a program to reference
data other than the data it’s intended to reference. Consider the program in Listing 10-17, which
has an outer scope and an inner scope.
{
    let r;

    {
        let x = 5;
        r = &x;
    }

    println!("r: {}", r);
}
Listing 10-17: An attempt to use a reference whose value has gone out of scope
Note: The examples in Listings 10-17, 10-18, and 10-24 declare variables without giving them
an initial value, so the variable name exists in the outer scope. At first glance, this might
appear to be in conflict with Rust’s having no null values. However, if we try to use a variable
before giving it a value, we’ll get a compile-time error, which shows that Rust indeed does
not allow null values.
The outer scope declares a variable named r with no initial value, and the inner scope declares a
variable named x with the initial value of 5. Inside the inner scope, we attempt to set the value of
r as a reference to x . Then the inner scope ends, and we attempt to print the value in r . This
code won’t compile because the value r is referring to has gone out of scope before we try to use
it. The compiler’s error message reports that x does not live long enough.
The variable x doesn’t “live long enough.” The reason is that x will be out of scope when the
inner scope ends on line 7. But r is still valid for the outer scope; because its scope is larger, we
say that it “lives longer.” If Rust allowed this code to work, r would be referencing memory that
was deallocated when x went out of scope, and anything we tried to do with r wouldn’t work
correctly. So how does Rust determine that this code is invalid? It uses a borrow checker.
The Rust compiler has a borrow checker that compares scopes to determine whether all borrows
are valid. Listing 10-18 shows the same code as Listing 10-17 but with annotations showing the
lifetimes of the variables.
{
    let r;                // ---------+-- 'a
                          //          |
    {                     //          |
        let x = 5;        // -+-- 'b  |
        r = &x;           //  |       |
    }                     // -+       |
                          //          |
    println!("r: {}", r); //          |
}                         // ---------+
Listing 10-18: Annotations of the lifetimes of r and x , named 'a and 'b , respectively
Here, we’ve annotated the lifetime of r with 'a and the lifetime of x with 'b . As you can see,
the inner 'b block is much smaller than the outer 'a lifetime block. At compile time, Rust
compares the size of the two lifetimes and sees that r has a lifetime of 'a but that it refers to
memory with a lifetime of 'b . The program is rejected because 'b is shorter than 'a : the
subject of the reference doesn’t live as long as the reference.
Listing 10-19 fixes the code so it doesn’t have a dangling reference and compiles without any
errors.
{
    let x = 5;            // ----------+-- 'b
                          //           |
    let r = &x;           // --+-- 'a  |
                          //   |       |
    println!("r: {}", r); //   |       |
                          // --+       |
}                         // ----------+
Listing 10-19: A valid reference because the data has a longer lifetime than the reference
Here, x has the lifetime 'b , which in this case is larger than 'a . This means r can reference x
because Rust knows that the reference in r will always be valid while x is valid.
Now that you know where the lifetimes of references are and how Rust analyzes lifetimes to
ensure references will always be valid, let’s explore generic lifetimes of parameters and return
values in the context of functions.
Let’s write a function that returns the longer of two string slices. This function will take two string
slices and return a string slice. After we’ve implemented the longest function, the code in Listing
10-20 should print The longest string is abcd .
Filename: src/main.rs
fn main() {
    let string1 = String::from("abcd");
    let string2 = "xyz";

    let result = longest(string1.as_str(), string2);
    println!("The longest string is {}", result);
}
Listing 10-20: A main function that calls the longest function to find the longer of two string slices
Note that we want the function to take string slices, which are references, because we don’t want
the longest function to take ownership of its parameters. We want to allow the function to
accept slices of a String (the type stored in the variable string1 ) as well as string literals (which
is what variable string2 contains).
Refer to the “String Slices as Parameters” section in Chapter 4 for more discussion about why the
parameters we use in Listing 10-20 are the ones we want.
If we try to implement the longest function as shown in Listing 10-21, it won’t compile.
Filename: src/main.rs
fn longest(x: &str, y: &str) -> &str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
Listing 10-21: An implementation of the longest function that returns the longer of two string slices but does
not yet compile
The help text reveals that the return type needs a generic lifetime parameter on it because Rust
can’t tell whether the reference being returned refers to x or y . Actually, we don’t know either,
because the if block in the body of this function returns a reference to x and the else block
returns a reference to y !
When we’re defining this function, we don’t know the concrete values that will be passed into this
function, so we don’t know whether the if case or the else case will execute. We also don’t
know the concrete lifetimes of the references that will be passed in, so we can’t look at the scopes
as we did in Listings 10-18 and 10-19 to determine whether the reference we return will always be
valid. The borrow checker can’t determine this either, because it doesn’t know how the lifetimes of
x and y relate to the lifetime of the return value. To fix this error, we’ll add generic lifetime
parameters that define the relationship between the references so the borrow checker can
perform its analysis.
Lifetime annotations don’t change how long any of the references live. Just as functions can accept
any type when the signature specifies a generic type parameter, functions can accept references
with any lifetime by specifying a generic lifetime parameter. Lifetime annotations describe the
relationships of the lifetimes of multiple references to each other without affecting the lifetimes.
Lifetime annotations have a slightly unusual syntax: the names of lifetime parameters must start
with an apostrophe ( ' ) and are usually all lowercase and very short, like generic types. Most
people use the name 'a . We place lifetime parameter annotations after the & of a reference,
using a space to separate the annotation from the reference’s type.
Here are some examples: a reference to an i32 without a lifetime parameter, a reference to an
i32 that has a lifetime parameter named 'a , and a mutable reference to an i32 that also has
the lifetime 'a .
&i32        // a reference
&'a i32     // a reference with an explicit lifetime
&'a mut i32 // a mutable reference with an explicit lifetime
One lifetime annotation by itself doesn’t have much meaning, because the annotations are meant
to tell Rust how generic lifetime parameters of multiple references relate to each other. For
example, let’s say we have a function with the parameter first that is a reference to an i32 with
lifetime 'a . The function also has another parameter named second that is another reference to
an i32 that also has the lifetime 'a . The lifetime annotations indicate that the references first
and second must both live as long as that generic lifetime.
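A minimal sketch of such a signature follows. The function name larger and its body are assumptions made for illustration; the text only describes the two parameters.

```rust
// Hypothetical function: `first`, `second`, and the return value all share 'a.
fn larger<'a>(first: &'a i32, second: &'a i32) -> &'a i32 {
    // Either reference may be returned, so the result can only be trusted
    // for as long as both inputs are valid.
    if *first >= *second {
        first
    } else {
        second
    }
}
```

Calling larger(&a, &b) with two local integers yields a reference that is usable only while both a and b are still in scope.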
Now let’s examine lifetime annotations in the context of the longest function. As with generic
type parameters, we need to declare generic lifetime parameters inside angle brackets between
the function name and the parameter list. The constraint we want to express in this signature is
that all the references in the parameters and the return value must have the same lifetime. We’ll
name the lifetime 'a and then add it to each reference, as shown in Listing 10-22.
Filename: src/main.rs
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}
Listing 10-22: The longest function definition specifying that all the references in the signature must have
the same lifetime 'a
This code should compile and produce the result we want when we use it with the main function
in Listing 10-20.
The function signature now tells Rust that for some lifetime 'a , the function takes two
parameters, both of which are string slices that live at least as long as lifetime 'a . The function
signature also tells Rust that the string slice returned from the function will live at least as long as
lifetime 'a . These constraints are what we want Rust to enforce. Remember, when we specify the
lifetime parameters in this function signature, we’re not changing the lifetimes of any values
passed in or returned. Rather, we’re specifying that the borrow checker should reject any values
that don’t adhere to these constraints. Note that the longest function doesn’t need to know
exactly how long x and y will live, only that some scope can be substituted for 'a that will
satisfy this signature.
When annotating lifetimes in functions, the annotations go in the function signature, not in the
function body. Rust can analyze the code within the function without any help. However, when a
function has references to or from code outside that function, it becomes almost impossible for
Rust to figure out the lifetimes of the parameters or return values on its own. The lifetimes might
be different each time the function is called. This is why we need to annotate the lifetimes
manually.
When we pass concrete references to longest , the concrete lifetime that is substituted for 'a is
the part of the scope of x that overlaps with the scope of y . In other words, the generic lifetime
'a will get the concrete lifetime that is equal to the smaller of the lifetimes of x and y . Because
we’ve annotated the returned reference with the same lifetime parameter 'a , the returned
reference will also be valid for the length of the smaller of the lifetimes of x and y .
Let’s look at how the lifetime annotations restrict the longest function by passing in references
that have different concrete lifetimes. Listing 10-23 is a straightforward example.
Filename: src/main.rs
fn main() {
    let string1 = String::from("long string is long");

    {
        let string2 = String::from("xyz");
        let result = longest(string1.as_str(), string2.as_str());
        println!("The longest string is {}", result);
    }
}
Listing 10-23: Using the longest function with references to String values that have different concrete
lifetimes
In this example, string1 is valid until the end of the outer scope, string2 is valid until the end of
the inner scope, and result references something that is valid until the end of the inner scope.
Run this code, and you’ll see that the borrow checker approves of this code; it will compile and
print The longest string is long string is long .
Next, let’s try an example that shows that the lifetime of the reference in result must be the
smaller lifetime of the two arguments. We’ll move the declaration of the result variable outside
the inner scope but leave the assignment of the value to the result variable inside the scope
with string2 . Then we’ll move the println! that uses result outside the inner scope, after the
inner scope has ended. The code in Listing 10-24 will not compile.
Filename: src/main.rs
fn main() {
    let string1 = String::from("long string is long");
    let result;
    {
        let string2 = String::from("xyz");
        result = longest(string1.as_str(), string2.as_str());
    }
    println!("The longest string is {}", result);
}
Listing 10-24: Attempting to use result after string2 has gone out of scope
The error shows that for result to be valid for the println! statement, string2 would need to
be valid until the end of the outer scope. Rust knows this because we annotated the lifetimes of
the function parameters and return values using the same lifetime parameter 'a .
As humans, we can look at this code and see that string1 is longer than string2 and therefore
result will contain a reference to string1 . Because string1 has not gone out of scope yet, a
reference to string1 will still be valid for the println! statement. However, the compiler can’t
see that the reference is valid in this case. We’ve told Rust that the lifetime of the reference
returned by the longest function is the same as the smaller of the lifetimes of the references
passed in. Therefore, the borrow checker disallows the code in Listing 10-24 as possibly having an
invalid reference.
Try designing more experiments that vary the values and lifetimes of the references passed in to
the longest function and how the returned reference is used. Make hypotheses about whether
or not your experiments will pass the borrow checker before you compile; then check to see if
you’re right!
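One such experiment, sketched under the assumption that copying the text is acceptable, converts the returned slice into an owned String before the inner scope ends. The wrapper name longest_copied is hypothetical; longest is the function from Listing 10-22.

```rust
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical wrapper used only for this experiment.
fn longest_copied(string1: &str) -> String {
    let result;
    {
        let string2 = String::from("xyz");
        // The borrowed slice is copied into an owned String here,
        // so nothing borrowed from string2 escapes this block.
        result = longest(string1, string2.as_str()).to_string();
    }
    // result owns its data, so it is still valid after string2 is dropped.
    result
}
```

The borrow checker accepts this version because result no longer holds a reference with lifetime 'a; the cost is one allocation for the copy.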
The way in which you need to specify lifetime parameters depends on what your function is doing.
For example, if we changed the implementation of the longest function to always return the first
parameter rather than the longest string slice, we wouldn’t need to specify a lifetime on the y
parameter. The following code will compile:
Filename: src/main.rs
fn longest<'a>(x: &'a str, y: &str) -> &'a str {
    x
}
In this example, we’ve specified a lifetime parameter 'a for the parameter x and the return type,
but not for the parameter y , because the lifetime of y does not have any relationship with the
lifetime of x or the return value.
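A short sketch shows what this looser signature permits. The names first_of and first_of_demo are hypothetical; the shape of the scopes mirrors Listing 10-24.

```rust
// Same signature shape as above: only x is tied to the return value.
fn first_of<'a>(x: &'a str, _y: &str) -> &'a str {
    x
}

// Hypothetical demo function.
fn first_of_demo() -> String {
    let string1 = String::from("abcd");
    let result;
    {
        let string2 = String::from("xyz");
        result = first_of(string1.as_str(), string2.as_str());
    }
    // string2 is gone, but result only borrows from string1,
    // which is still in scope, so this use is accepted.
    result.to_string()
}
```

With the signature from Listing 10-22, the same arrangement of scopes is rejected, because the return value would also be tied to the second argument.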
When returning a reference from a function, the lifetime parameter for the return type needs to
match the lifetime parameter for one of the parameters. If the reference returned does not refer
to one of the parameters, it must refer to a value created within this function, which would be a
dangling reference because the value will go out of scope at the end of the function. Consider this
attempted implementation of the longest function that won’t compile:
Filename: src/main.rs
Here, even though we’ve specified a lifetime parameter 'a for the return type, this
implementation will fail to compile because the return value lifetime is not related to the lifetime
of the parameters at all. Here is the error message we get:
The problem is that result goes out of scope and gets cleaned up at the end of the longest
function. We’re also trying to return a reference to result from the function. There is no way we
can specify lifetime parameters that would change the dangling reference, and Rust won’t let us
create a dangling reference. In this case, the best fix would be to return an owned data type rather
than a reference so the calling function is then responsible for cleaning up the value.
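A minimal sketch of that fix, assuming the function really does need to build a new value: return a String instead of a &str. The name longest_repeated and its behavior (repeating the longer slice) are invented for illustration.

```rust
// Hypothetical variant that creates a value and hands ownership to the caller.
fn longest_repeated(x: &str, y: &str) -> String {
    let longer = if x.len() > y.len() { x } else { y };

    // `result` is created inside the function...
    let result = longer.repeat(2);

    // ...and moved out to the caller, so no reference to a local value escapes.
    result
}
```

No lifetime annotations are needed here, because the return type is not a reference and the caller becomes responsible for cleaning up the String.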
Ultimately, lifetime syntax is about connecting the lifetimes of various parameters and return
values of functions. Once they’re connected, Rust has enough information to allow memory-safe
operations and disallow operations that would create dangling pointers or otherwise violate
memory safety.
So far, we’ve only defined structs to hold owned types. It’s possible for structs to hold references,
but in that case we would need to add a lifetime annotation on every reference in the struct’s
definition. Listing 10-25 has a struct named ImportantExcerpt that holds a string slice.
Filename: src/main.rs
struct ImportantExcerpt<'a> {
    part: &'a str,
}

fn main() {
    let novel = String::from("Call me Ishmael. Some years ago...");
    let first_sentence = novel.split('.')
        .next()
        .expect("Could not find a '.'");
    let i = ImportantExcerpt { part: first_sentence };
}
Listing 10-25: A struct that holds a reference, so its definition needs a lifetime annotation
This struct has one field, part , that holds a string slice, which is a reference. As with generic data
types, we declare the name of the generic lifetime parameter inside angle brackets after the name
of the struct so we can use the lifetime parameter in the body of the struct definition. This
annotation means an instance of ImportantExcerpt can’t outlive the reference it holds in its
part field.
The main function here creates an instance of the ImportantExcerpt struct that holds a
reference to the first sentence of the String owned by the variable novel . The data in novel
exists before the ImportantExcerpt instance is created. In addition, novel doesn’t go out of
scope until after the ImportantExcerpt goes out of scope, so the reference in the
ImportantExcerpt instance is valid.
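The same relationship can be sketched as a function that builds the struct from a borrowed string. The helper name first_sentence is an assumption; the struct is the one from Listing 10-25.

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

// Hypothetical helper: the returned struct borrows from `text`, so the
// signature ties the struct's lifetime parameter to the input reference.
fn first_sentence<'a>(text: &'a str) -> ImportantExcerpt<'a> {
    // Fall back to the whole text if the iterator yields nothing.
    let part = text.split('.').next().unwrap_or(text);
    ImportantExcerpt { part }
}
```

Because of the 'a in the return type, the compiler rejects any caller that tries to keep the ImportantExcerpt around after the string it was built from has been dropped.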
Lifetime Elision
You’ve learned that every reference has a lifetime and that you need to specify lifetime
parameters for functions or structs that use references. However, in Chapter 4 we had a function
in Listing 4-9, which is shown again in Listing 10-26, that compiled without lifetime annotations.
Filename: src/lib.rs
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' { return &s[0..i]; }
    }
    &s[..]
}
Listing 10-26: A function we defined in Listing 4-9 that compiled without lifetime annotations, even though the
parameter and return type are references
The reason this function compiles without lifetime annotations is historical: in early versions
(pre-1.0) of Rust, this code wouldn’t have compiled because every reference needed an explicit
lifetime. At that time, the function signature would have been written like this:
fn first_word<'a>(s: &'a str) -> &'a str {
After writing a lot of Rust code, the Rust team found that Rust programmers were entering the
same lifetime annotations over and over in particular situations. These situations were predictable
and followed a few deterministic patterns. The developers programmed these patterns into the
compiler’s code so the borrow checker could infer the lifetimes in these situations and wouldn’t
need explicit annotations.
This piece of Rust history is relevant because it’s possible that more deterministic patterns will
emerge and be added to the compiler. In the future, even fewer lifetime annotations might be
required.
The patterns programmed into Rust’s analysis of references are called the lifetime elision rules.
These aren’t rules for programmers to follow; they’re a set of particular cases that the compiler
will consider, and if your code fits these cases, you don’t need to write the lifetimes explicitly.
The elision rules don’t provide full inference. If Rust deterministically applies the rules but there is
still ambiguity as to what lifetimes the references have, the compiler won’t guess what the lifetime
of the remaining references should be. In this case, instead of guessing, the compiler will give you
an error that you can resolve by adding the lifetime annotations that specify how the references
relate to each other.
Lifetimes on function or method parameters are called input lifetimes, and lifetimes on return
values are called output lifetimes.
The compiler uses three rules to figure out what lifetimes references have when there aren’t
explicit annotations. The first rule applies to input lifetimes, and the second and third rules apply
to output lifetimes. If the compiler gets to the end of the three rules and there are still references
for which it can’t figure out lifetimes, the compiler will stop with an error.
The first rule is that each parameter that is a reference gets its own lifetime parameter. In other
words, a function with one parameter gets one lifetime parameter: fn foo<'a>(x: &'a i32) ; a
function with two parameters gets two separate lifetime parameters:
fn foo<'a, 'b>(x: &'a i32, y: &'b i32) ; and so on.
The second rule is if there is exactly one input lifetime parameter, that lifetime is assigned to all
output lifetime parameters: fn foo<'a>(x: &'a i32) -> &'a i32 .
The third rule is if there are multiple input lifetime parameters, but one of them is &self or
&mut self because this is a method, the lifetime of self is assigned to all output lifetime
parameters. This third rule makes methods much nicer to read and write because fewer symbols
are necessary.
Let’s pretend we’re the compiler. We’ll apply these rules to figure out what the lifetimes of the
references in the signature of the first_word function in Listing 10-26 are. The signature starts
without any lifetimes associated with the references:
fn first_word(s: &str) -> &str {
Then the compiler applies the first rule, which specifies that each parameter gets its own lifetime.
We’ll call it 'a as usual, so now the signature is this:
fn first_word<'a>(s: &'a str) -> &str {
The second rule applies because there is exactly one input lifetime. The second rule specifies that
the lifetime of the one input parameter gets assigned to the output lifetime, so the signature is
now this:
fn first_word<'a>(s: &'a str) -> &'a str {
Now all the references in this function signature have lifetimes, and the compiler can continue its
analysis without needing the programmer to annotate the lifetimes in this function signature.
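To see that the elided and annotated forms are interchangeable, here is a small sketch with both written out. The function names and the simplified body (splitting on a space instead of scanning bytes) are assumptions for illustration.

```rust
// Elided form: the compiler fills in the lifetimes using the first two rules.
fn first_word_elided(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// Explicit form: what the signature looks like once the rules are applied.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split(' ').next().unwrap_or("")
}
```

Both functions have the same type as far as the borrow checker is concerned; the annotations in the second one only spell out what the compiler already inferred for the first.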
Let’s look at another example, this time using the longest function that had no lifetime
parameters when we started working with it in Listing 10-21:
fn longest(x: &str, y: &str) -> &str {
Let’s apply the first rule: each parameter gets its own lifetime. This time we have two parameters
instead of one, so we have two lifetimes:
fn longest<'a, 'b>(x: &'a str, y: &'b str) -> &str {
You can see that the second rule doesn’t apply because there is more than one input lifetime. The
third rule doesn’t apply either, because longest is a function rather than a method, so none of
the parameters are self . After working through all three rules, we still haven’t figured out what
the return type’s lifetime is. This is why we got an error trying to compile the code in Listing 10-21:
the compiler worked through the lifetime elision rules but still couldn’t figure out all the lifetimes
of the references in the signature.
Because the third rule really only applies in method signatures, we’ll look at lifetimes in that
context next to see why the third rule means we don’t have to annotate lifetimes in method
signatures very often.
When we implement methods on a struct with lifetimes, we use the same syntax as that of generic
type parameters shown in Listing 10-11. Where we declare and use the lifetime parameters
depends on whether they’re related to the struct fields or the method parameters and return
values.
Lifetime names for struct fields always need to be declared after the impl keyword and then used
after the struct’s name, because those lifetimes are part of the struct’s type.
In method signatures inside the impl block, references might be tied to the lifetime of references
in the struct’s fields, or they might be independent. In addition, the lifetime elision rules often
make it so that lifetime annotations aren’t necessary in method signatures. Let’s look at some
examples using the struct named ImportantExcerpt that we defined in Listing 10-25.
First, we’ll use a method named level whose only parameter is a reference to self and whose
return value is an i32 , which is not a reference to anything:
impl<'a> ImportantExcerpt<'a> {
    fn level(&self) -> i32 {
        3
    }
}
The lifetime parameter declaration after impl and use after the type name is required, but we’re
not required to annotate the lifetime of the reference to self because of the first elision rule.
Here is an example where the third lifetime elision rule applies:
impl<'a> ImportantExcerpt<'a> {
    fn announce_and_return_part(&self, announcement: &str) -> &str {
        println!("Attention please: {}", announcement);
        self.part
    }
}
There are two input lifetimes, so Rust applies the first lifetime elision rule and gives both &self
and announcement their own lifetimes. Then, because one of the parameters is &self , the return
type gets the lifetime of &self , and all lifetimes have been accounted for.
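Written out by hand, the result of applying the first and third rules might look like the following sketch. The method name announce_and_return_part_explicit and the lifetime names 's and 'b are hypothetical; the compiler's internal names are not visible to us.

```rust
struct ImportantExcerpt<'a> {
    part: &'a str,
}

impl<'a> ImportantExcerpt<'a> {
    // Rule one gives &self and announcement their own lifetimes ('s and 'b);
    // rule three assigns the lifetime of &self to the return type.
    fn announce_and_return_part_explicit<'s, 'b>(
        &'s self,
        announcement: &'b str,
    ) -> &'s str {
        println!("Attention please: {}", announcement);
        self.part
    }
}
```

The announcement argument plays no part in the returned reference, so it can be a short-lived value such as a temporary string.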
One special lifetime we need to discuss is 'static , which denotes the entire duration of the
program. All string literals have the 'static lifetime, which we can annotate as follows:
let s: &'static str = "I have a static lifetime.";
The text of this string is stored directly in the binary of your program, which is always available.
Therefore, the lifetime of all string literals is 'static .
You might see suggestions to use the 'static lifetime in error messages. But before specifying
'static as the lifetime for a reference, think about whether the reference you have actually lives
the entire lifetime of your program or not. You might consider whether you want it to live that
long, even if it could. Most of the time, the problem results from attempting to create a dangling
reference or a mismatch of the available lifetimes. In such cases, the solution is fixing those
problems, not specifying the 'static lifetime.
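As a rough sketch of that advice (both function names are hypothetical), 'static fits data that really is baked into the binary, while data built at runtime is usually better returned as an owned value:

```rust
// Hypothetical function: a string literal lives in the program binary,
// so returning it as &'static str is accurate.
fn default_greeting() -> &'static str {
    "hello"
}

// Hypothetical function: this text is assembled at runtime, so the function
// returns an owned String instead of claiming a 'static reference.
fn custom_greeting(name: &str) -> String {
    format!("hello, {}", name)
}
```

Annotating the second function's return type as &'static str would not make the data live longer; it would only produce a different compile error.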
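Generic type parameters, trait bounds, and lifetimes can all be specified together in one function:
use std::fmt::Display;

fn longest_with_an_announcement<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
    where T: Display
{
    println!("Announcement! {}", ann);
    if x.len() > y.len() {
        x
    } else {
        y
    }
}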
This is the longest function from Listing 10-22 that returns the longer of two string slices. But
now it has an extra parameter named ann of the generic type T , which can be filled in by any
type that implements the Display trait as specified by the where clause. This extra parameter
will be printed before the function compares the lengths of the string slices, which is why the
Display trait bound is necessary. Because lifetimes are a type of generic, the declarations of the
lifetime parameter 'a and the generic type parameter T go in the same list inside the angle
brackets after the function name.
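A possible way to call this function is sketched below. The caller announce_twice and its argument values are assumptions; the point is only that ann accepts any Display type while the two slices share 'a.

```rust
use std::fmt::Display;

fn longest_with_an_announcement<'a, T>(x: &'a str, y: &'a str, ann: T) -> &'a str
where
    T: Display,
{
    println!("Announcement! {}", ann);
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

// Hypothetical caller: passes a &str and then an integer as the announcement.
fn announce_twice() -> (String, String) {
    let a = String::from("abcd");
    let b = "xyz";

    let first = longest_with_an_announcement(a.as_str(), b, "Comparing now").to_string();
    let second = longest_with_an_announcement(a.as_str(), b, 42).to_string();

    (first, second)
}
```

T is inferred separately at each call site, so the same function works with a string announcement in one call and a number in the next.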
Summary
We covered a lot in this chapter! Now that you know about generic type parameters, traits and
trait bounds, and generic lifetime parameters, you’re ready to write code without repetition that
works in many different situations. Generic type parameters let you apply the code to different
types. Traits and trait bounds ensure that even though the types are generic, they’ll have the
behavior the code needs. You learned how to use lifetime annotations to ensure that this flexible
code won’t have any dangling references. And all of this analysis happens at compile time, which
doesn’t affect runtime performance!
Believe it or not, there is much more to learn on the topics we discussed in this chapter: Chapter
17 discusses trait objects, which are another way to use traits. Chapter 19 covers more complex
scenarios involving lifetime annotations as well as some advanced type system features. But next,
you’ll learn how to write tests in Rust so you can make sure your code is working the way it should.
their absence.” That doesn’t mean we shouldn’t try to test as much as we can!
Correctness in our programs is the extent to which our code does what we intend it to do. Rust is
designed with a high degree of concern about the correctness of programs, but correctness is
complex and not easy to prove. Rust’s type system shoulders a huge part of this burden, but the
type system cannot catch every kind of incorrectness. As such, Rust includes support for writing
automated software tests within the language.
As an example, say we write a function called add_two that adds 2 to whatever number is passed
to it. This function’s signature accepts an integer as a parameter and returns an integer as a result.
When we implement and compile that function, Rust does all the type checking and borrow
checking that you’ve learned so far to ensure that, for instance, we aren’t passing a String value
or an invalid reference to this function. But Rust can’t check that this function will do precisely
what we intend, which is return the parameter plus 2 rather than, say, the parameter plus 10 or
the parameter minus 50! That’s where tests come in.
We can write tests that assert, for example, that when we pass 3 to the add_two function, the
returned value is 5 . We can run these tests whenever we make changes to our code to make sure
any existing correct behavior has not changed.
Testing is a complex skill: although we can’t cover every detail about how to write good tests in
one chapter, we’ll discuss the mechanics of Rust’s testing facilities. We’ll talk about the annotations
and macros available to you when writing your tests, the default behavior and options provided
for running your tests, and how to organize tests into unit tests and integration tests.
Let’s look at the features Rust provides specifically for writing tests that take these actions, which
include the test attribute, a few macros, and the should_panic attribute.
At its simplest, a test in Rust is a function that’s annotated with the test attribute. Attributes are
metadata about pieces of Rust code; one example is the derive attribute we used with structs in
Chapter 5. To change a function into a test function, add #[test] on the line before fn . When
you run your tests with the cargo test command, Rust builds a test runner binary that runs the
functions annotated with the test attribute and reports on whether each test function passes or
fails.
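As a minimal sketch, assuming the add_two function described earlier is implemented as a + 2 (the test function name is hypothetical), a test for it could look like this:

```rust
// Assumed implementation of the add_two function described in the text.
pub fn add_two(a: i32) -> i32 {
    a + 2
}

// The test attribute marks this function as a test. It is compiled and run
// by `cargo test` and is left out of ordinary builds.
#[test]
fn add_two_returns_five_for_three() {
    assert_eq!(add_two(3), 5);
}
```

If add_two were changed to add 10 instead, the types would still check, but cargo test would report this function as failed.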
When we make a new library project with Cargo, a test module with a test function in it is
automatically generated for us. This module helps you start writing your tests so you don’t have to
look up the exact structure and syntax of test functions every time you start a new project. You
can add as many additional test functions and as many test modules as you want!
We’ll explore some aspects of how tests work by experimenting with the template test generated
for us without actually testing any code. Then we’ll write some real-world tests that call some code
that we’ve w