2022-08-22 00:04:52 +00:00

<!-- DO NOT EDIT THIS FILE.

This file is periodically generated from the content in the `/src/`
directory, so all fixes need to be made in `/src/`.
-->

[TOC]

# An I/O Project: Building a Command Line Program

This chapter is a recap of the many skills you’ve learned so far and an
exploration of a few more standard library features. We’ll build a command line
tool that interacts with file and command line input/output to practice some of
the Rust concepts you now have under your belt.

Rust’s speed, safety, single binary output, and cross-platform support make it
an ideal language for creating command line tools, so for our project, we’ll
make our own version of the classic command line search tool `grep`
(**g**lobally search a **r**egular **e**xpression and **p**rint). In the
simplest use case, `grep` searches a specified file for a specified string. To
do so, `grep` takes as its arguments a file path and a string. Then it reads
the file, finds lines in that file that contain the string argument, and prints
those lines.

Along the way, we’ll show how to make our command line tool use the terminal
features that many other command line tools use. We’ll read the value of an
environment variable to allow the user to configure the behavior of our tool.
We’ll also print error messages to the standard error console stream (`stderr`)
instead of standard output (`stdout`) so that, for example, the user can
redirect successful output to a file while still seeing error messages onscreen.

One Rust community member, Andrew Gallant, has already created a fully
featured, very fast version of `grep`, called `ripgrep`. By comparison, our
version will be fairly simple, but this chapter will give you some of the
background knowledge you need to understand a real-world project such as
`ripgrep`.

Our `grep` project will combine a number of concepts you’ve learned so far:

* Organizing code (Chapter 7)
* Using vectors and strings (Chapter 8)
* Handling errors (Chapter 9)
* Using traits and lifetimes where appropriate (Chapter 10)
* Writing tests (Chapter 11)

We’ll also briefly introduce closures, iterators, and trait objects, which
Chapter 13 and Chapter 17 will cover in detail.

## Accepting Command Line Arguments

Let’s create a new project with, as always, `cargo new`. We’ll call our project
`minigrep` to distinguish it from the `grep` tool that you might already have
on your system.

```
$ cargo new minigrep
     Created binary (application) `minigrep` project
$ cd minigrep
```

The first task is to make `minigrep` accept its two command line arguments: the
file path and a string to search for. That is, we want to be able to run our
program with `cargo run`, two hyphens to indicate the following arguments are
for our program rather than for `cargo`, a string to search for, and a path to
a file to search in, like so:

```
$ cargo run -- searchstring example-filename.txt
```

Right now, the program generated by `cargo new` cannot process arguments we
give it. Some existing libraries on *https://crates.io* can help with writing a
program that accepts command line arguments, but because you’re just learning
this concept, let’s implement this capability ourselves.

### Reading the Argument Values

To enable `minigrep` to read the values of command line arguments we pass to
it, we’ll need the `std::env::args` function provided in Rust’s standard
library. This function returns an iterator of the command line arguments passed
to `minigrep`. We’ll cover iterators fully in Chapter 13. For now, you only
need to know two details about iterators: iterators produce a series of values,
and we can call the `collect` method on an iterator to turn it into a
collection, such as a vector, that contains all the elements the iterator
produces.

The code in Listing 12-1 allows your `minigrep` program to read any command
line arguments passed to it, and then collect the values into a vector.

Filename: src/main.rs

```
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();
    dbg!(args);
}
```

Listing 12-1: Collecting the command line arguments into a vector and printing
them

First we bring the `std::env` module into scope with a `use` statement so we
can use its `args` function. Notice that the `std::env::args` function is
nested in two levels of modules. As we discussed in Chapter 7, in cases where
the desired function is nested in more than one module, we’ve chosen to bring
the parent module into scope rather than the function. By doing so, we can
easily use other functions from `std::env`. It’s also less ambiguous than
adding `use std::env::args` and then calling the function with just `args`,
because `args` might easily be mistaken for a function that’s defined in the
current module.

> ### The args Function and Invalid Unicode
>
> Note that `std::env::args` will panic if any argument contains invalid
> Unicode. If your program needs to accept arguments containing invalid Unicode,
> use `std::env::args_os` instead. That function returns an iterator that
> produces `OsString` values instead of `String` values. We’ve chosen to use
> `std::env::args` here for simplicity because `OsString` values differ per
> platform and are more complex to work with than `String` values.
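
As a rough illustration of the alternative mentioned in the sidebar, the sketch below reads arguments through `std::env::args_os` and converts each one with `to_string_lossy`, which substitutes the replacement character U+FFFD for invalid Unicode rather than panicking. The helper name `lossy_args` is hypothetical and is not used anywhere else in this chapter; lossy conversion is only one possible policy, and a program that must preserve the exact bytes would keep the `OsString` values instead.

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical helper: converts OS-native arguments to `String`s,
/// replacing invalid Unicode with U+FFFD instead of panicking.
fn lossy_args<I>(args: I) -> Vec<String>
where
    I: Iterator<Item = OsString>,
{
    args.map(|arg| arg.to_string_lossy().into_owned()).collect()
}

fn main() {
    // `args_os` yields `OsString` values, so no Unicode check happens here.
    let args = lossy_args(env::args_os());
    println!("{:?}", args);
}
```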

On the first line of `main`, we call `env::args`, and we immediately use
`collect` to turn the iterator into a vector containing all the values produced
by the iterator. We can use the `collect` function to create many kinds of
collections, so we explicitly annotate the type of `args` to specify that we
want a vector of strings. Although you very rarely need to annotate types in
Rust, `collect` is one function you do often need to annotate because Rust
isn’t able to infer the kind of collection you want.
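
To make the role of the annotation concrete, here is a minimal sketch, separate from `minigrep`, in which the same kind of iterator is collected into two different collections. The helper names `to_vec` and `to_set` are hypothetical and exist only for this illustration. In the first, the function's return type tells `collect` what to build; in the second, the "turbofish" form `collect::<...>()` annotates the call itself.

```rust
use std::collections::HashSet;

/// Collects the words into a vector, keeping order and duplicates.
fn to_vec(words: &[&str]) -> Vec<String> {
    // The return type tells `collect` which collection to build.
    words.iter().map(|w| w.to_string()).collect()
}

/// Collects the same words into a set, dropping duplicates.
fn to_set(words: &[&str]) -> HashSet<String> {
    // The turbofish annotates the call instead of relying on the return type.
    words.iter().map(|w| w.to_string()).collect::<HashSet<String>>()
}

fn main() {
    let words = ["needle", "haystack", "needle"];
    println!("{:?}", to_vec(&words));
    println!("{} unique words", to_set(&words).len());
}
```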

Finally, we print the vector using the debug macro. Let’s try running the code
first with no arguments and then with two arguments:

```
$ cargo run
--snip--
[src/main.rs:5] args = [
    "target/debug/minigrep",
]
$ cargo run -- needle haystack
--snip--
[src/main.rs:5] args = [
    "target/debug/minigrep",
    "needle",
    "haystack",
]
```

Notice that the first value in the vector is `"target/debug/minigrep"`, which
is the name of our binary. This matches the behavior of the arguments list in
C, letting programs use the name by which they were invoked in their execution.
It’s often convenient to have access to the program name in case you want to
print it in messages or change the behavior of the program based on what
command line alias was used to invoke the program. But for the purposes of this
chapter, we’ll ignore it and save only the two arguments we need.
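
The chapter ignores the program name simply by indexing from 1. As an alternative sketch, assuming you never need the name at all, the iterator adapter `skip(1)` can drop it before collecting. The helper `user_args` is a hypothetical name introduced here for illustration; note that with this approach the first user-supplied argument sits at index 0, not index 1.

```rust
use std::env;

/// Hypothetical helper: returns only the user-supplied arguments,
/// dropping the program name that comes first.
fn user_args<I>(args: I) -> Vec<String>
where
    I: Iterator<Item = String>,
{
    // `skip(1)` discards the first item produced by the iterator.
    args.skip(1).collect()
}

fn main() {
    let args = user_args(env::args());
    println!("{:?}", args);
}
```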

### Saving the Argument Values in Variables

The program is currently able to access the values specified as command line
arguments. Now we need to save the values of the two arguments in variables so
we can use the values throughout the rest of the program. We do that in Listing
12-2.

Filename: src/main.rs

```
use std::env;

fn main() {
    let args: Vec<String> = env::args().collect();

    let query = &args[1];
    let file_path = &args[2];

    println!("Searching for {}", query);
    println!("In file {}", file_path);
}
```

Listing 12-2: Creating variables to hold the query argument and file path
argument

As we saw when we printed the vector, the program’s name takes up the first
value in the vector at `args[0]`, so we’re starting arguments at index 1. The
first argument `minigrep` takes is the string we’re searching for, so we put a
reference to the first argument in the variable `query`. The second argument
will be the file path, so we put a reference to the second argument in the
variable `file_path`.

We temporarily print the values of these variables to prove that the code is
working as we intend. Let’s run this program again with the arguments `test`
and `sample.txt`:

```
$ cargo run -- test sample.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep test sample.txt`
Searching for test
In file sample.txt
```

Great, the program is working! The values of the arguments we need are being
saved into the right variables. Later we’ll add some error handling to deal
with potential error cases, such as when the user provides no arguments; for
now, we’ll ignore that situation and work on adding file-reading capabilities
instead.
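
Until that error handling is in place, `&args[1]` panics when an argument is missing. One possible way to avoid the panic, shown here only as a sketch and not as the approach this chapter goes on to develop, is the slice method `get`, which returns `None` for an index that is out of bounds. The function name `query_and_path` and the usage message are assumptions made for this example.

```rust
/// Hypothetical helper: returns the query and file path if both were supplied.
fn query_and_path(args: &[String]) -> Option<(&str, &str)> {
    // `get` returns `None` for a missing index instead of panicking,
    // and `?` passes that `None` straight back to the caller.
    let query = args.get(1)?;
    let file_path = args.get(2)?;
    Some((query.as_str(), file_path.as_str()))
}

fn main() {
    let args: Vec<String> = std::env::args().collect();
    match query_and_path(&args) {
        Some((query, file_path)) => {
            println!("Searching for {query}");
            println!("In file {file_path}");
        }
        None => println!("usage: minigrep <query> <file_path>"),
    }
}
```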

## Reading a File

Now we’ll add functionality to read the file specified in the `file_path`
argument. First we need a sample file to test it with: we’ll use a file with a
small amount of text over multiple lines with some repeated words. Listing 12-3
has an Emily Dickinson poem that will work well! Create a file called
*poem.txt* at the root level of your project, and enter the poem “I’m Nobody!
Who are you?”

Filename: poem.txt

```
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know.

How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!
```

Listing 12-3: A poem by Emily Dickinson makes a good test case.

With the text in place, edit *src/main.rs* and add code to read the file, as
shown in Listing 12-4.

Filename: src/main.rs

```
use std::env;
use std::fs; // [1]

fn main() {
    // --snip--
    println!("In file {}", file_path);

    let contents = fs::read_to_string(file_path) // [2]
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}"); // [3]
}
```

Listing 12-4: Reading the contents of the file specified by the second argument

First we bring in a relevant part of the standard library with a `use`
statement: we need `std::fs` to handle files [1].

In `main`, the new statement `fs::read_to_string` takes the `file_path`, opens
that file, and returns a value of type `std::io::Result<String>` that contains
the file’s contents [2].

After that, we again add a temporary `println!` statement that prints the value
of `contents` after the file is read, so we can check that the program is
working so far [3].

Let’s run this code with any string as the first command line argument (because
we haven’t implemented the searching part yet) and the *poem.txt* file as the
second argument:

```
$ cargo run -- the poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep the poem.txt`
Searching for the
In file poem.txt
With text:
I'm nobody! Who are you?
Are you nobody, too?
Then there's a pair of us - don't tell!
They'd banish us, you know.

How dreary to be somebody!
How public, like a frog
To tell your name the livelong day
To an admiring bog!
```

Great! The code read and then printed the contents of the file. But the code
has a few flaws. At the moment, the `main` function has multiple
responsibilities: generally, functions are clearer and easier to maintain if
each function is responsible for only one idea. The other problem is that we’re
not handling errors as well as we could. The program is still small, so these
flaws aren’t a big problem, but as the program grows, it will be harder to fix
them cleanly. It’s a good practice to begin refactoring early on when
developing a program because it’s much easier to refactor smaller amounts of
code. We’ll do that next.

## Refactoring to Improve Modularity and Error Handling

To improve our program, we’ll fix four problems that have to do with the
program’s structure and how it’s handling potential errors. First, our `main`
function now performs two tasks: it parses arguments and reads files. As our
program grows, the number of separate tasks the `main` function handles will
increase. As a function gains responsibilities, it becomes more difficult to
reason about, harder to test, and harder to change without breaking one of its
parts. It’s best to separate functionality so each function is responsible for
one task.

This issue also ties into the second problem: although `query` and `file_path`
are configuration variables to our program, variables like `contents` are used
to perform the program’s logic. The longer `main` becomes, the more variables
we’ll need to bring into scope; the more variables we have in scope, the harder
it will be to keep track of the purpose of each. It’s best to group the
configuration variables into one structure to make their purpose clear.

The third problem is that we’ve used `expect` to print an error message when
reading the file fails, but the error message just prints `Should have been
able to read the file`. Reading a file can fail in a number of ways: for
example, the file could be missing, or we might not have permission to open it.
Right now, regardless of the situation, we’d print the same error message for
everything, which wouldn’t give the user any information!

Fourth, we use `expect` to handle an error, and if the user runs our program
without specifying enough arguments, they’ll get an `index out of bounds` error
from Rust that doesn’t clearly explain the problem. It would be best if all the
error-handling code were in one place so future maintainers had only one place
to consult the code if the error-handling logic needed to change. Having all
the error-handling code in one place will also ensure that we’re printing
messages that will be meaningful to our end users.

Let’s address these four problems by refactoring our project.
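
To illustrate the third problem, the following sketch shows how the `std::io::Error` carried by the `Err` variant can distinguish failure causes that a single `expect` message hides. The helper `describe_read` and the wording of its messages are hypothetical and are not part of the refactoring that follows; the sketch only demonstrates that the error value holds more information than the current program reports.

```rust
use std::fs;
use std::io;

/// Hypothetical helper: reads a file and describes the outcome as text
/// instead of panicking with one fixed message.
fn describe_read(file_path: &str) -> String {
    match fs::read_to_string(file_path) {
        Ok(contents) => format!("read {} bytes", contents.len()),
        // `kind()` lets us single out specific causes, such as a missing file.
        Err(e) if e.kind() == io::ErrorKind::NotFound => {
            format!("{file_path}: file not found")
        }
        // Any other cause is reported using the error's own description.
        Err(e) => format!("{file_path}: {e}"),
    }
}

fn main() {
    println!("{}", describe_read("poem.txt"));
}
```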

### Separation of Concerns for Binary Projects

The organizational problem of allocating responsibility for multiple tasks to
the `main` function is common to many binary projects. As a result, the Rust
community has developed guidelines for splitting the separate concerns of a
binary program when `main` starts getting large. This process has the following
steps:

* Split your program into a *main.rs* file and a *lib.rs* file and move your
  program’s logic to *lib.rs*.
* As long as your command line parsing logic is small, it can remain in
  *main.rs*.
* When the command line parsing logic starts getting complicated, extract it
  from *main.rs* and move it to *lib.rs*.

The responsibilities that remain in the `main` function after this process
should be limited to the following:

* Calling the command line parsing logic with the argument values
* Setting up any other configuration
* Calling a `run` function in *lib.rs*
* Handling the error if `run` returns an error

This pattern is about separating concerns: *main.rs* handles running the
program and *lib.rs* handles all the logic of the task at hand. Because you
can’t test the `main` function directly, this structure lets you test all of
your program’s logic by moving it into functions in *lib.rs*. The code that
remains in *main.rs* will be small enough to verify its correctness by reading
it. Let’s rework our program by following this process.
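
The responsibilities listed above can be sketched in a single file as follows. This is only an outline under stated assumptions: the signature of `run`, the usage text, and the error message are invented for illustration, and in a real project `run` would live in *src/lib.rs*, not beside `main`.

```rust
use std::env;
use std::error::Error;
use std::fs;
use std::process;

/// Stand-in for the `run` function that would live in src/lib.rs.
/// Its signature is an assumption made for this sketch.
fn run(query: &str, file_path: &str) -> Result<(), Box<dyn Error>> {
    // `?` hands any I/O error back to the caller instead of panicking.
    let contents = fs::read_to_string(file_path)?;
    println!("Searching for {query} in {} bytes of text", contents.len());
    Ok(())
}

fn main() {
    // Call the command line parsing logic with the argument values.
    let args: Vec<String> = env::args().collect();
    if args.len() < 3 {
        // A fuller program would likely exit with a non-zero status here.
        eprintln!("usage: minigrep <query> <file_path>");
        return;
    }

    // Call `run`, then handle the error if it returns one.
    if let Err(e) = run(&args[1], &args[2]) {
        eprintln!("Application error: {e}");
        process::exit(1);
    }
}
```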

#### Extracting the Argument Parser

We’ll extract the functionality for parsing arguments into a function that
`main` will call to prepare for moving the command line parsing logic to
*src/lib.rs*. Listing 12-5 shows the new start of `main` that calls a new
function `parse_config`, which we’ll define in *src/main.rs* for the moment.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let (query, file_path) = parse_config(&args);

    // --snip--
}

fn parse_config(args: &[String]) -> (&str, &str) {
    let query = &args[1];
    let file_path = &args[2];

    (query, file_path)
}
```

Listing 12-5: Extracting a `parse_config` function from `main`

We’re still collecting the command line arguments into a vector, but instead of
assigning the argument value at index 1 to the variable `query` and the
argument value at index 2 to the variable `file_path` within the `main`
function, we pass the whole vector to the `parse_config` function. The
`parse_config` function then holds the logic that determines which argument
goes in which variable and passes the values back to `main`. We still create
the `query` and `file_path` variables in `main`, but `main` no longer has the
responsibility of determining how the command line arguments and variables
correspond.

This rework may seem like overkill for our small program, but we’re refactoring
in small, incremental steps. After making this change, run the program again to
verify that the argument parsing still works. It’s good to check your progress
often, to help identify the cause of problems when they occur.

#### Grouping Configuration Values

We can take another small step to improve the `parse_config` function further.
At the moment, we’re returning a tuple, but then we immediately break that
tuple into individual parts again. This is a sign that perhaps we don’t have
the right abstraction yet.

Another indicator that shows there’s room for improvement is the `config` part
of `parse_config`, which implies that the two values we return are related and
are both part of one configuration value. We’re not currently conveying this
meaning in the structure of the data other than by grouping the two values into
a tuple; we’ll instead put the two values into one struct and give each of the
struct fields a meaningful name. Doing so will make it easier for future
maintainers of this code to understand how the different values relate to each
other and what their purpose is.

Listing 12-6 shows the improvements to the `parse_config` function.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = parse_config(&args); // [1]

    println!("Searching for {}", config.query); // [2]
    println!("In file {}", config.file_path); // [3]

    let contents = fs::read_to_string(config.file_path) // [4]
        .expect("Should have been able to read the file");

    // --snip--
}

struct Config { // [5]
    query: String,
    file_path: String,
}

fn parse_config(args: &[String]) -> Config { // [6]
    let query = args[1].clone(); // [7]
    let file_path = args[2].clone(); // [8]

    Config { query, file_path }
}
```

Listing 12-6: Refactoring `parse_config` to return an instance of a `Config`
struct

We’ve added a struct named `Config` defined to have fields named `query` and
`file_path` [5]. The signature of `parse_config` now indicates that it returns
a `Config` value [6]. In the body of `parse_config`, where we used to return
string slices that reference `String` values in `args`, we now define `Config`
to contain owned `String` values. The `args` variable in `main` is the owner of
the argument values and is only letting the `parse_config` function borrow
them, which means we’d violate Rust’s borrowing rules if `Config` tried to take
ownership of the values in `args`.

There are a number of ways we could manage the `String` data; the easiest,
though somewhat inefficient, route is to call the `clone` method on the values
[7] [8]. This will make a full copy of the data for the `Config` instance to
own, which takes more time and memory than storing a reference to the string
data. However, cloning the data also makes our code very straightforward
because we don’t have to manage the lifetimes of the references; in this
circumstance, giving up a little performance to gain simplicity is a worthwhile
trade-off.

> ### The Trade-Offs of Using clone
>
> There’s a tendency among many Rustaceans to avoid using `clone` to fix
> ownership problems because of its runtime cost. In Chapter 13, you’ll learn how
> to use more efficient methods in this type of situation. But for now, it’s okay
> to copy a few strings to continue making progress because you’ll make these
> copies only once and your file path and query string are very small. It’s
> better to have a working program that’s a bit inefficient than to try to
> hyperoptimize code on your first pass. As you become more experienced with
> Rust, it’ll be easier to start with the most efficient solution, but for now,
> it’s perfectly acceptable to call `clone`.

We’ve updated `main` so it places the instance of `Config` returned by
`parse_config` into a variable named `config` [1], and we updated the code that
previously used the separate `query` and `file_path` variables so it now uses
the fields on the `Config` struct instead [2] [3] [4].

Now our code more clearly conveys that `query` and `file_path` are related and
that their purpose is to configure how the program will work. Any code that
uses these values knows to find them in the `config` instance in the fields
named for their purpose.
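
For comparison with the `clone` approach in Listing 12-6, here is a sketch of the borrowing alternative that cloning lets us sidestep. The names `BorrowedConfig` and `parse_borrowed` are hypothetical. The struct stores string slices, so it needs a lifetime parameter, and the argument vector must outlive every use of the struct; that is the bookkeeping the text above refers to as managing the lifetimes of the references.

```rust
/// Hypothetical borrowing variant of `Config`: the fields are string
/// slices tied to the lifetime `'a` of the argument vector.
struct BorrowedConfig<'a> {
    query: &'a str,
    file_path: &'a str,
}

/// Builds the config without copying any string data.
fn parse_borrowed(args: &[String]) -> BorrowedConfig<'_> {
    BorrowedConfig {
        query: &args[1],
        file_path: &args[2],
    }
}

fn main() {
    let args = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];
    let config = parse_borrowed(&args);
    // `args` must stay alive for as long as `config` is used.
    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);
}
```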

#### Creating a Constructor for Config

So far, we’ve extracted the logic responsible for parsing the command line
arguments from `main` and placed it in the `parse_config` function. Doing so
helped us see that the `query` and `file_path` values were related, and that
relationship should be conveyed in our code. We then added a `Config` struct to
name the related purpose of `query` and `file_path` and to be able to return
the values’ names as struct field names from the `parse_config` function.

So now that the purpose of the `parse_config` function is to create a `Config`
instance, we can change `parse_config` from a plain function to a function
named `new` that is associated with the `Config` struct. Making this change
will make the code more idiomatic. We can create instances of types in the
standard library, such as `String`, by calling `String::new`. Similarly, by
changing `parse_config` into a `new` function associated with `Config`, we’ll
be able to create instances of `Config` by calling `Config::new`. Listing 12-7
shows the changes we need to make.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::new(&args); // [1]

    // --snip--
}

// --snip--

impl Config { // [2]
    fn new(args: &[String]) -> Config { // [3]
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}
```
|
|
|
|
|
|
2022-09-13 16:54:09 +00:00
|
|
|
|
Listing 12-7: Changing `parse_config` into `Config::new`
|
2021-12-28 21:35:17 +00:00
|
|
|
|
|
|
|
|
|
We’ve updated `main` where we were calling `parse_config` to instead call
`Config::new` [1]. We’ve changed the name of `parse_config` to `new` [3] and
moved it within an `impl` block [2], which associates the `new` function with
`Config`. Try compiling this code again to make sure it works.

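If you’d like to try the associated-function pattern in isolation, the
following is a minimal sketch rather than the project’s actual *src/main.rs*.
The hard-coded argument vector is an assumption that stands in for
`env::args().collect()` so the example runs without any command line input.

```rust
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // An associated function: called as `Config::new(...)`, much like
    // `String::new()`. As in Listing 12-7, it still panics if `args`
    // holds fewer than three items.
    fn new(args: &[String]) -> Config {
        let query = args[1].clone();
        let file_path = args[2].clone();

        Config { query, file_path }
    }
}

fn main() {
    // Hypothetical arguments standing in for `env::args().collect()`.
    let args: Vec<String> = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    let config = Config::new(&args);
    println!("Searching for {} in {}", config.query, config.file_path);
}
```

The first element plays the role of the program name, which is why the query
and file path are read from index 1 and index 2.
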
### Fixing the Error Handling

Now we’ll work on fixing our error handling. Recall that attempting to access
the values in the `args` vector at index 1 or index 2 will cause the program to
panic if the vector contains fewer than three items. Try running the program
without any arguments; it will look like this:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`
thread 'main' panicked at 'index out of bounds: the len is 1 but
the index is 1', src/main.rs:27:21
note: run with `RUST_BACKTRACE=1` environment variable to display
a backtrace
```

The line `index out of bounds: the len is 1 but the index is 1` is an error
message intended for programmers. It won’t help our end users understand what
they should do instead. Let’s fix that now.

#### Improving the Error Message

In Listing 12-8, we add a check in the `new` function that will verify that the
slice is long enough before accessing index 1 and index 2. If the slice isn’t
long enough, the program panics and displays a better error message.

Filename: src/main.rs

```
    --snip--
    fn new(args: &[String]) -> Config {
        if args.len() < 3 {
            panic!("not enough arguments");
        }
        --snip--
```

Listing 12-8: Adding a check for the number of arguments

This code is similar to the `Guess::new` function we wrote in Listing 9-13,
where we called `panic!` when the `value` argument was out of the range of
valid values. Instead of checking for a range of values here, we’re checking
that the length of `args` is at least `3` and the rest of the function can
operate under the assumption that this condition has been met. If `args` has
fewer than three items, this condition will be `true`, and we call the `panic!`
macro to end the program immediately.

With these extra few lines of code in `new`, let’s run the program without any
arguments again to see what the error looks like now:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep`
thread 'main' panicked at 'not enough arguments',
src/main.rs:26:13
note: run with `RUST_BACKTRACE=1` environment variable to display
a backtrace
```

This output is better: we now have a reasonable error message. However, we also
have extraneous information we don’t want to give to our users. Perhaps the
technique we used in Listing 9-13 isn’t the best one to use here: a call to
`panic!` is more appropriate for a programming problem than a usage problem, as
discussed in Chapter 9. Instead, we’ll use the other technique you learned
about in Chapter 9: returning a `Result` that indicates either success or an
error.

#### Returning a Result Instead of Calling panic!

We can instead return a `Result` value that will contain a `Config` instance in
the successful case and will describe the problem in the error case. We’re also
going to change the function name from `new` to `build` because many
programmers expect `new` functions to never fail. When `Config::build` is
communicating to `main`, we can use the `Result` type to signal there was a
problem. Then we can change `main` to convert an `Err` variant into a more
practical error for our users without the surrounding text about `thread
'main'` and `RUST_BACKTRACE` that a call to `panic!` causes.

Listing 12-9 shows the changes we need to make to the return value of the
function we’re now calling `Config::build` and the body of the function needed
to return a `Result`. Note that this won’t compile until we update `main` as
well, which we’ll do in the next listing.

Filename: src/main.rs

```
impl Config {
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}
```

Listing 12-9: Returning a `Result` from `Config::build`

Our `build` function returns a `Result` with a `Config` instance in the success
case and an `&'static str` in the error case. Our error values will always be
string literals that have the `'static` lifetime.

We’ve made two changes in the body of the function: instead of calling `panic!`
when the user doesn’t pass enough arguments, we now return an `Err` value, and
we’ve wrapped the `Config` return value in an `Ok`. These changes make the
function conform to its new type signature.

Returning an `Err` value from `Config::build` allows the `main` function to
handle the `Result` value returned from the `build` function and exit the
process more cleanly in the error case.

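To see both outcomes of `Config::build` side by side, here is a small
self-contained sketch. The argument vectors are made up for illustration, and
the `match` in `main` is only one possible way to inspect the `Result`; the
chapter itself goes on to use a different approach.

```rust
#[derive(Debug, PartialEq)]
struct Config {
    query: String,
    file_path: String,
}

impl Config {
    // Same shape as Listing 12-9: `Err` carries a string literal,
    // `Ok` carries the finished `Config`.
    fn build(args: &[String]) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        Ok(Config { query, file_path })
    }
}

fn main() {
    // Hypothetical inputs standing in for real command line arguments.
    let too_few = vec![String::from("minigrep")];
    let enough = vec![
        String::from("minigrep"),
        String::from("needle"),
        String::from("haystack.txt"),
    ];

    for args in [too_few, enough] {
        match Config::build(&args) {
            Ok(config) => {
                println!("query = {}, file = {}", config.query, config.file_path)
            }
            Err(message) => println!("could not build Config: {message}"),
        }
    }
}
```

Because the error type is a plain `&'static str`, the caller can print it
directly without any conversion.
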
#### Calling Config::build and Handling Errors

To handle the error case and print a user-friendly message, we need to update
`main` to handle the `Result` being returned by `Config::build`, as shown in
Listing 12-10. We’ll also take the responsibility of exiting the command line
tool with a nonzero error code away from `panic!` and instead implement it by
hand. A nonzero exit status is a convention to signal to the process that
called our program that the program exited with an error state.

Filename: src/main.rs

```
1 use std::process;

fn main() {
    let args: Vec<String> = env::args().collect();

  2 let config = Config::build(&args).3 unwrap_or_else(|4 err| {
      5 println!("Problem parsing arguments: {err}");
      6 process::exit(1);
    });

    --snip--
```

Listing 12-10: Exiting with an error code if building a `Config` fails

In this listing, we’ve used a method we haven’t covered in detail yet:
`unwrap_or_else`, which is defined on `Result<T, E>` by the standard library
[2]. Using `unwrap_or_else` allows us to define some custom, non-`panic!` error
handling. If the `Result` is an `Ok` value, this method’s behavior is similar
to `unwrap`: it returns the inner value that `Ok` is wrapping. However, if the
value is an `Err` value, this method calls the code in the *closure*, which is
an anonymous function we define and pass as an argument to `unwrap_or_else`
[3]. We’ll cover closures in more detail in Chapter 13. For now, you just need
to know that `unwrap_or_else` will pass the inner value of the `Err`, which in
this case is the static string `"not enough arguments"` that we added in
Listing 12-9, to our closure in the argument `err` that appears between the
vertical pipes [4]. The code in the closure can then use the `err` value when
it runs.

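The way `unwrap_or_else` hands the `Err` value to a closure can be tried out
separately from `minigrep`. In this sketch, `parse_count` is a hypothetical
helper, and its closure returns a fallback value instead of exiting the
process, which is a deliberate simplification so that both branches can be
observed in one run.

```rust
// Hypothetical helper, not part of minigrep: parses a number and falls
// back to 0 when parsing fails.
fn parse_count(input: &str) -> usize {
    input.parse::<usize>().unwrap_or_else(|err| {
        // `err` is the value that was wrapped in `Err`; the closure only
        // runs in the error case.
        println!("could not parse {input:?}: {err}");
        0
    })
}

fn main() {
    // `Ok` case: the inner value is returned and the closure never runs.
    println!("{}", parse_count("42"));

    // `Err` case: the closure runs and its return value is used instead.
    println!("{}", parse_count("forty-two"));
}
```

The closure must produce a value of the same type as the `Ok` variant, unless
it never returns at all, as is the case when it ends with a call that exits
the process.
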
We’ve added a new `use` line to bring `process` from the standard library into
scope [1]. The code in the closure that will be run in the error case is only
two lines: we print the `err` value [5] and then call `process::exit` [6]. The
`process::exit` function will stop the program immediately and return the
number that was passed as the exit status code. This is similar to the
`panic!`-based handling we used in Listing 12-8, but we no longer get all the
extra output. Let’s try it:

```
$ cargo run
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.48s
     Running `target/debug/minigrep`
Problem parsing arguments: not enough arguments
```

Great! This output is much friendlier for our users.

### Extracting Logic from main

Now that we’ve finished refactoring the configuration parsing, let’s turn to
the program’s logic. As we stated in “Separation of Concerns for Binary
Projects,” we’ll extract a function named `run` that will hold all
the logic currently in the `main` function that isn’t involved with setting up
configuration or handling errors. When we’re done, `main` will be concise and
easy to verify by inspection, and we’ll be able to write tests for all the
other logic.

Listing 12-11 shows the extracted `run` function. For now, we’re just making
the small, incremental improvement of extracting the function. We’re still
defining the function in *src/main.rs*.

Filename: src/main.rs

```
fn main() {
    --snip--

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    run(config);
}

fn run(config: Config) {
    let contents = fs::read_to_string(config.file_path)
        .expect("Should have been able to read the file");

    println!("With text:\n{contents}");
}

--snip--
```

Listing 12-11: Extracting a `run` function containing the rest of the program
logic

The `run` function now contains all the remaining logic from `main`, starting
from reading the file. The `run` function takes the `Config` instance as an
argument.

#### Returning Errors from the run Function

With the remaining program logic separated into the `run` function, we can
improve the error handling, as we did with `Config::build` in Listing 12-9.
Instead of allowing the program to panic by calling `expect`, the `run`
function will return a `Result<T, E>` when something goes wrong. This will let
us further consolidate the logic around handling errors into `main` in a
user-friendly way. Listing 12-12 shows the changes we need to make to the
signature and body of `run`.

Filename: src/main.rs

```
1 use std::error::Error;

--snip--

2 fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)3 ?;

    println!("With text:\n{contents}");

  4 Ok(())
}
```

Listing 12-12: Changing the `run` function to return `Result`

We’ve made three significant changes here. First, we changed the return type of
the `run` function to `Result<(), Box<dyn Error>>` [2]. This function
previously returned the unit type, `()`, and we keep that as the value returned
in the `Ok` case.

For the error type, we used the *trait object* `Box<dyn Error>` (and we’ve
brought `std::error::Error` into scope with a `use` statement at the top [1]).
We’ll cover trait objects in Chapter 17. For now, just know that `Box<dyn
Error>` means the function will return a type that implements the `Error`
trait, but we don’t have to specify what particular type the return value will
be. This gives us flexibility to return error values that may be of different
types in different error cases. The `dyn` keyword is short for *dynamic*.

Second, we’ve removed the call to `expect` in favor of the `?` operator [3], as
we talked about in Chapter 9. Rather than `panic!` on an error, `?` will return
the error value from the current function for the caller to handle.

Third, the `run` function now returns an `Ok` value in the success case [4].
We’ve declared the `run` function’s success type as `()` in the signature,
which means we need to wrap the unit type value in the `Ok` value. This
`Ok(())` syntax might look a bit strange at first, but using `()` like this is
the idiomatic way to indicate that we’re calling `run` for its side effects
only; it doesn’t return a value we need.

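To make the flexibility of `Box<dyn Error>` more concrete, here is a sketch in
which one function can fail with three different error types. The
`check_dimensions` helper and its input format are invented for this
illustration and have nothing to do with `minigrep` itself; the point is only
that each `?` converts a different concrete error into the same boxed trait
object, and that the success path ends in `Ok(())`.

```rust
use std::error::Error;

// Hypothetical helper: validates text of the form "width,height".
fn check_dimensions(input: &str) -> Result<(), Box<dyn Error>> {
    // A string literal can be converted into `Box<dyn Error>` by `?`.
    let (width, height) = input
        .split_once(',')
        .ok_or("expected two comma-separated values")?;

    // Each `?` returns early on failure, boxing a different error type.
    let width: u32 = width.trim().parse()?; // std::num::ParseIntError
    let height: f64 = height.trim().parse()?; // std::num::ParseFloatError

    println!("{width} x {height}");

    // Nothing useful to return: we only ran the function for its effects.
    Ok(())
}

fn main() {
    for input in ["3, 4.5", "3", "x, 4.5", "3, tall"] {
        match check_dimensions(input) {
            Ok(()) => println!("{input:?} is valid"),
            Err(e) => println!("{input:?} is invalid: {e}"),
        }
    }
}
```

The caller sees a single error type and can print it with `{e}`, because
every type that implements `Error` also implements `Display`.
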
When you run this code, it will compile but will display a warning:

```
warning: unused `Result` that must be used
  --> src/main.rs:19:5
   |
19 |     run(config);
   |     ^^^^^^^^^^^^
   |
   = note: `#[warn(unused_must_use)]` on by default
   = note: this `Result` may be an `Err` variant, which should be
handled
```

Rust tells us that our code ignored the `Result` value and the `Result` value
might indicate that an error occurred. But we’re not checking to see whether or
not there was an error, and the compiler reminds us that we probably meant to
have some error-handling code here! Let’s rectify that problem now.

#### Handling Errors Returned from run in main

We’ll check for errors and handle them using a technique similar to one we used
with `Config::build` in Listing 12-10, but with a slight difference:

Filename: src/main.rs

```
fn main() {
    --snip--

    println!("Searching for {}", config.query);
    println!("In file {}", config.file_path);

    if let Err(e) = run(config) {
        println!("Application error: {e}");
        process::exit(1);
    }
}
```

We use `if let` rather than `unwrap_or_else` to check whether `run` returns an
`Err` value and to call `process::exit(1)` if it does. The `run` function
doesn’t return a value that we want to `unwrap` in the same way that
`Config::build` returns the `Config` instance. Because `run` returns `()` in
the success case, we only care about detecting an error, so we don’t need
`unwrap_or_else` to return the unwrapped value, which would only be `()`.

The bodies of the `if let` block and the `unwrap_or_else` closure are the same
in both cases: we print the error and exit.

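The `if let Err(e)` pattern can also be exercised on its own. In the sketch
below, `run` is a stand-in that fails on request, and `exit_code` is a
hypothetical helper that returns the status instead of calling
`process::exit` directly; that split is an assumption made purely so the
error path can be checked without ending the program early.

```rust
use std::error::Error;
use std::process;

// Stand-in for the real `run`: succeeds or fails depending on its argument.
fn run(should_fail: bool) -> Result<(), Box<dyn Error>> {
    if should_fail {
        return Err("something went wrong".into());
    }

    Ok(())
}

// Hypothetical helper: maps the outcome to an exit status. Only the `Err`
// case needs handling because the `Ok` case carries nothing but `()`.
fn exit_code(result: Result<(), Box<dyn Error>>) -> i32 {
    if let Err(e) = result {
        println!("Application error: {e}");
        return 1;
    }

    0
}

fn main() {
    let code = exit_code(run(false));
    process::exit(code);
}
```

There is no value to unwrap on success, so `if let` reads more directly here
than `unwrap_or_else` would.
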
### Splitting Code into a Library Crate

Our `minigrep` project is looking good so far! Now we’ll split the
*src/main.rs* file and put some code into the *src/lib.rs* file. That way, we
can test the code and have a *src/main.rs* file with fewer responsibilities.

Let’s move all the code that isn’t in the `main` function from *src/main.rs* to
*src/lib.rs*:

* The `run` function definition
* The relevant `use` statements
* The definition of `Config`
* The `Config::build` function definition

The contents of *src/lib.rs* should have the signatures shown in Listing 12-13
(we’ve omitted the bodies of the functions for brevity). Note that this won’t
compile until we modify *src/main.rs* in Listing 12-14.

Filename: src/lib.rs

```
use std::error::Error;
use std::fs;

pub struct Config {
    pub query: String,
    pub file_path: String,
}

impl Config {
    pub fn build(
        args: &[String],
    ) -> Result<Config, &'static str> {
        --snip--
    }
}

pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    --snip--
}
```

Listing 12-13: Moving `Config` and `run` into *src/lib.rs*

We’ve made liberal use of the `pub` keyword: on `Config`, on its fields and its
`build` method, and on the `run` function. We now have a library crate that has
a public API we can test!

Now we need to bring the code we moved to *src/lib.rs* into the scope of the
binary crate in *src/main.rs*, as shown in Listing 12-14.

Filename: src/main.rs

```
use std::env;
use std::process;

use minigrep::Config;

fn main() {
    --snip--
    if let Err(e) = minigrep::run(config) {
        --snip--
    }
}
```

Listing 12-14: Using the `minigrep` library crate in *src/main.rs*

We add a `use minigrep::Config` line to bring the `Config` type from the
library crate into the binary crate’s scope, and we prefix the `run` function
with our crate name. Now all the functionality should be connected and should
work. Run the program with `cargo run` and make sure everything works correctly.

Whew! That was a lot of work, but we’ve set ourselves up for success in the
future. Now it’s much easier to handle errors, and we’ve made the code more
modular. Almost all of our work will be done in *src/lib.rs* from here on out.

Let’s take advantage of this newfound modularity by doing something that would
have been difficult with the old code but is easy with the new code: we’ll
write some tests!

## Developing the Library’s Functionality with Test-Driven Development

Now that we’ve extracted the logic into *src/lib.rs* and left the argument
collecting and error handling in *src/main.rs*, it’s much easier to write tests
for the core functionality of our code. We can call functions directly with
various arguments and check return values without having to call our binary
from the command line.

In this section, we’ll add the searching logic to the `minigrep` program using
the test-driven development (TDD) process with the following steps:

1. Write a test that fails and run it to make sure it fails for the reason you
   expect.
2. Write or modify just enough code to make the new test pass.
3. Refactor the code you just added or changed and make sure the tests continue
   to pass.
4. Repeat from step 1!

Though it’s just one of many ways to write software, TDD can help drive code
design. Writing the test before you write the code that makes the test pass
helps to maintain high test coverage throughout the process.

We’ll test-drive the implementation of the functionality that will actually do
the searching for the query string in the file contents and produce a list of
lines that match the query. We’ll add this functionality in a function called
`search`.

### Writing a Failing Test

Because we don’t need them anymore, let’s remove the `println!` statements from
*src/lib.rs* and *src/main.rs* that we used to check the program’s behavior.
Then, in *src/lib.rs*, we’ll add a `tests` module with a test function, as we
did in Chapter 11. The test function specifies the behavior we want the
`search` function to have: it will take a query and the text to search, and it
will return only the lines from the text that contain the query. Listing 12-15
shows this test, which won’t compile yet.

Filename: src/lib.rs

```
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn one_result() {
        let query = "duct";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.";

        assert_eq!(
            vec!["safe, fast, productive."],
            search(query, contents)
        );
    }
}
```

Listing 12-15: Creating a failing test for the `search` function we wish we had

This test searches for the string `"duct"`. The text we’re searching is three
lines, only one of which contains `"duct"` (note that the backslash after the
opening double quote tells Rust not to put a newline character at the beginning
of the contents of this string literal). We assert that the value returned from
the `search` function contains only the line we expect.

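The effect of that backslash is easy to check directly. The two constants
below are illustrative names, not part of the project; they differ only in
whether the opening quote is followed by a backslash.

```rust
// With the backslash, the literal starts directly with "Rust:".
const WITH_BACKSLASH: &str = "\
Rust:
safe, fast, productive.";

// Without it, the literal starts with a newline character.
const WITHOUT_BACKSLASH: &str = "
Rust:
safe, fast, productive.";

fn main() {
    println!("{:?}", WITH_BACKSLASH.lines().next());
    println!("{:?}", WITHOUT_BACKSLASH.lines().next());
}
```

Without the backslash, the first line of the text would be empty, so a search
over the lines would start with a line we never meant to include.
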
We aren’t yet able to run this test and watch it fail because the test doesn’t
even compile: the `search` function doesn’t exist yet! In accordance with TDD
principles, we’ll add just enough code to get the test to compile and run by
adding a definition of the `search` function that always returns an empty
vector, as shown in Listing 12-16. Then the test should compile and fail
because an empty vector doesn’t match a vector containing the line `"safe,
fast, productive."`.

Filename: src/lib.rs

```
pub fn search<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    vec![]
}
```

Listing 12-16: Defining just enough of the `search` function so our test will
compile

Notice that we need to define an explicit lifetime `'a` in the signature of
`search` and use that lifetime with the `contents` argument and the return
value. Recall from Chapter 10 that the lifetime parameters specify which
argument lifetime is connected to the lifetime of the return value. In this
case, we indicate that the returned vector should contain string slices that
reference slices of the argument `contents` (rather than the argument `query`).

In other words, we tell Rust that the data returned by the `search` function
will live as long as the data passed into the `search` function in the
`contents` argument. This is important! The data referenced *by* a slice needs
to be valid for the reference to be valid; if the compiler assumes we’re making
string slices of `query` rather than `contents`, it will do its safety checking
incorrectly.

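One way to see what this annotation buys us is to let the query go out of
scope while the result is still in use. The sketch below uses a hypothetical
`first_match` helper with the same lifetime relationship as `search`; its
one-line body is only a placeholder for illustration and is not the
implementation this chapter builds.

```rust
// Hypothetical helper: like `search`, the return value borrows from
// `contents` only, never from `query`.
fn first_match<'a>(query: &str, contents: &'a str) -> Option<&'a str> {
    contents.lines().find(|line| line.contains(query))
}

fn main() {
    let contents = String::from("Rust:\nsafe, fast, productive.\nPick three.");
    let found;

    {
        let query = String::from("duct");
        found = first_match(&query, &contents);
    } // `query` is dropped here.

    // Still valid: `found` is tied to `contents`, which is alive.
    println!("{found:?}");
}
```

If the signature tied the return value to `query` instead, the compiler would
reject this program, because `found` would then outlive the data it borrows.
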
If we forget the lifetime annotations and try to compile this function, we’ll
get this error:

```
error[E0106]: missing lifetime specifier
  --> src/lib.rs:31:10
   |
29 |     query: &str,
   |            ----
30 |     contents: &str,
   |               ----
31 | ) -> Vec<&str> {
   |          ^ expected named lifetime parameter
   |
   = help: this function's return type contains a borrowed value, but the
signature does not say whether it is borrowed from `query` or `contents`
help: consider introducing a named lifetime parameter
   |
28 ~ pub fn search<'a>(
29 ~     query: &'a str,
30 ~     contents: &'a str,
31 ~ ) -> Vec<&'a str> {
   |
```

Rust can’t possibly know which of the two arguments we need, so we need to tell
it explicitly. Because `contents` is the argument that contains all of our text
and we want to return the parts of that text that match, we know `contents` is
the argument that should be connected to the return value using the lifetime
syntax.

Other programming languages don’t require you to connect arguments to return
values in the signature, but this practice will get easier over time. You might
want to compare this example with the examples in “Validating References with
Lifetimes” in Chapter 10.

Now let’s run the test:

```
$ cargo test
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished test [unoptimized + debuginfo] target(s) in 0.97s
     Running unittests src/lib.rs (target/debug/deps/minigrep-9cd200e5fac0fc94)

running 1 test
test tests::one_result ... FAILED

failures:

---- tests::one_result stdout ----
thread 'tests::one_result' panicked at 'assertion failed: `(left == right)`
  left: `["safe, fast, productive."]`,
 right: `[]`', src/lib.rs:47:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace


failures:
    tests::one_result

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out;
finished in 0.00s

error: test failed, to rerun pass '--lib'
```

Great, the test fails, exactly as we expected. Let’s get the test to pass!

### Writing Code to Pass the Test

Currently, our test is failing because we always return an empty vector. To fix
that and implement `search`, our program needs to follow these steps:

1. Iterate through each line of the contents.
1. Check whether the line contains our query string.
1. If it does, add it to the list of values we’re returning.
1. If it doesn’t, do nothing.
1. Return the list of results that match.

Let’s work through each step, starting with iterating through lines.

#### Iterating Through Lines with the lines Method

Rust has a helpful method to handle line-by-line iteration of strings,
conveniently named `lines`, that works as shown in Listing 12-17. Note that
this won’t compile yet.

Filename: src/lib.rs

```
pub fn search<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    for line in contents.lines() {
        // do something with line
    }
}
```

Listing 12-17: Iterating through each line in `contents`

The `lines` method returns an iterator. We’ll talk about iterators in depth in
Chapter 13, but recall that you saw this way of using an iterator in Listing
3-5, where we used a `for` loop with an iterator to run some code on each item
in a collection.

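As a small standalone illustration of what `lines` yields, the following sketch collects each line into a vector. The function name `split_into_lines` is made up for this example. Each item is a string slice borrowed from the input, with the line ending (`\n` or `\r\n`) removed:

```rust
// Hypothetical helper: gathers the slices produced by `str::lines`.
fn split_into_lines(text: &str) -> Vec<&str> {
    let mut out = Vec::new();
    // `lines` returns an iterator; the `for` loop visits one line at a time.
    for line in text.lines() {
        out.push(line);
    }
    out
}
```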
#### Searching Each Line for the Query

Next, we’ll check whether the current line contains our query string.
Fortunately, strings have a helpful method named `contains` that does this for
us! Add a call to the `contains` method in the `search` function, as shown in
Listing 12-18. Note that this still won’t compile yet.

Filename: src/lib.rs

```
pub fn search<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    for line in contents.lines() {
        if line.contains(query) {
            // do something with line
        }
    }
}
```

Listing 12-18: Adding functionality to see whether the line contains the string
in `query`

At the moment, we’re building up functionality. To get the code to compile, we
need to return a value from the body as we indicated we would in the function
signature.

#### Storing Matching Lines

To finish this function, we need a way to store the matching lines that we want
to return. For that, we can make a mutable vector before the `for` loop and
call the `push` method to store a `line` in the vector. After the `for` loop,
we return the vector, as shown in Listing 12-19.

Filename: src/lib.rs

```
pub fn search<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    let mut results = Vec::new();

    for line in contents.lines() {
        if line.contains(query) {
            results.push(line);
        }
    }

    results
}
```
Listing 12-19: Storing the lines that match so we can return them

Now the `search` function should return only the lines that contain `query`,
and our test should pass. Let’s run the test:

```
$ cargo test
--snip--
running 1 test
test tests::one_result ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0
filtered out; finished in 0.00s
```
Our test passed, so we know it works!

At this point, we could consider opportunities for refactoring the
implementation of the search function while keeping the tests passing to
maintain the same functionality. The code in the search function isn’t too bad,
but it doesn’t take advantage of some useful features of iterators. We’ll
return to this example in Chapter 13, where we’ll explore iterators in detail,
and look at how to improve it.

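As a brief preview of the kind of refactoring meant here, one possible iterator-based version is sketched below. The name `search_with_filter` is an assumption for this example, and the chapter text itself does not show this form; it relies on the standard `filter` and `collect` iterator adapters:

```rust
// Hypothetical alternative to `search`: same inputs and outputs, but the
// loop and the mutable vector are replaced by an iterator chain.
pub fn search_with_filter<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    contents
        .lines()
        // Keep only the lines that contain the query.
        .filter(|line| line.contains(query))
        // Gather the remaining slices into a `Vec<&'a str>`.
        .collect()
}
```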
#### Using the search Function in the run Function

Now that the `search` function is working and tested, we need to call `search`
from our `run` function. We need to pass the `config.query` value and the
`contents` that `run` reads from the file to the `search` function. Then `run`
will print each line returned from `search`:

Filename: src/lib.rs

```
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    for line in search(&config.query, &contents) {
        println!("{line}");
    }

    Ok(())
}
```
We’re still using a `for` loop, this time to take each line returned from
`search` and print it.

Now the entire program should work! Let’s try it out, first with a word that
should return exactly one line from the Emily Dickinson poem: *frog*.

```
$ cargo run -- frog poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.38s
     Running `target/debug/minigrep frog poem.txt`
How public, like a frog
```
Cool! Now let’s try a word that will match multiple lines, like *body*:

```
$ cargo run -- body poem.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep body poem.txt`
I'm nobody! Who are you?
Are you nobody, too?
How dreary to be somebody!
```

And finally, let’s make sure that we don’t get any lines when we search for a
word that isn’t anywhere in the poem, such as *monomorphization*:

```
$ cargo run -- monomorphization poem.txt
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep monomorphization poem.txt`
```
Excellent! We’ve built our own mini version of a classic tool and learned a lot
about how to structure applications. We’ve also learned a bit about file input
and output, lifetimes, testing, and command line parsing.

To round out this project, we’ll briefly demonstrate how to work with
environment variables and how to print to standard error, both of which are
useful when you’re writing command line programs.

## Working with Environment Variables

We’ll improve `minigrep` by adding an extra feature: an option for
case-insensitive searching that the user can turn on via an environment
variable. We could make this feature a command line option and require that
users enter it each time they want it to apply, but by instead making it an
environment variable, we allow our users to set the environment variable once
and have all their searches be case insensitive in that terminal session.

### Writing a Failing Test for the Case-Insensitive search Function

We first add a new `search_case_insensitive` function that will be called when
the environment variable has a value. We’ll continue to follow the TDD process,
so the first step is again to write a failing test. We’ll add a new test for
the new `search_case_insensitive` function and rename our old test from
`one_result` to `case_sensitive` to clarify the differences between the two
tests, as shown in Listing 12-20.

Filename: src/lib.rs

```
#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn case_sensitive() {
        let query = "duct";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.
Duct tape.";

        assert_eq!(
            vec!["safe, fast, productive."],
            search(query, contents)
        );
    }

    #[test]
    fn case_insensitive() {
        let query = "rUsT";
        let contents = "\
Rust:
safe, fast, productive.
Pick three.
Trust me.";

        assert_eq!(
            vec!["Rust:", "Trust me."],
            search_case_insensitive(query, contents)
        );
    }
}
```
Listing 12-20: Adding a new failing test for the case-insensitive function
we’re about to add

Note that we’ve edited the old test’s `contents` too. We’ve added a new line
with the text `"Duct tape."` using a capital *D* that shouldn’t match the query
`"duct"` when we’re searching in a case-sensitive manner. Changing the old test
in this way helps ensure that we don’t accidentally break the case-sensitive
search functionality that we’ve already implemented. This test should pass now
and should continue to pass as we work on the case-insensitive search.

The new test for the case-*insensitive* search uses `"rUsT"` as its query. In
the `search_case_insensitive` function we’re about to add, the query `"rUsT"`
should match the line containing `"Rust:"` with a capital *R* and match the
line `"Trust me."` even though both have different casing from the query. This
is our failing test, and it will fail to compile because we haven’t yet defined
the `search_case_insensitive` function. Feel free to add a skeleton
implementation that always returns an empty vector, similar to the way we did
for the `search` function in Listing 12-16 to see the test compile and fail.

### Implementing the search_case_insensitive Function

The `search_case_insensitive` function, shown in Listing 12-21, will be almost
the same as the `search` function. The only difference is that we’ll lowercase
the `query` and each `line` so that whatever the case of the input arguments,
they’ll be the same case when we check whether the line contains the query.

Filename: src/lib.rs

```
pub fn search_case_insensitive<'a>(
    query: &str,
    contents: &'a str,
) -> Vec<&'a str> {
    let query = query.to_lowercase(); // [1]
    let mut results = Vec::new();

    for line in contents.lines() {
        // [2] is the call to `line.to_lowercase()`; [3] is the `&` on `query`
        if line.to_lowercase().contains(&query) {
            results.push(line);
        }
    }

    results
}
```
Listing 12-21: Defining the `search_case_insensitive` function to lowercase the
query and the line before comparing them

First we lowercase the `query` string and store it in a shadowed variable with
the same name [1]. Calling `to_lowercase` on the query is necessary so that no
matter whether the user’s query is `"rust"`, `"RUST"`, `"Rust"`, or `"rUsT"`,
we’ll treat the query as if it were `"rust"` and be insensitive to the case.
While `to_lowercase` will handle basic Unicode, it won’t be 100% accurate. If
we were writing a real application, we’d want to do a bit more work here, but
this section is about environment variables, not Unicode, so we’ll leave it at
that here.

Note that `query` is now a `String` rather than a string slice because calling
`to_lowercase` creates new data rather than referencing existing data. Say the
query is `"rUsT"`, as an example: that string slice doesn’t contain a lowercase
`u` or `t` for us to use, so we have to allocate a new `String` containing
`"rust"`. When we pass `query` as an argument to the `contains` method now, we
need to add an ampersand [3] because the signature of `contains` is defined to
take a string slice.

Next, we add a call to `to_lowercase` on each `line` to lowercase all
characters [2]. Now that we’ve converted `line` and `query` to lowercase, we’ll
find matches no matter what the case of the query is.

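The `String` versus string slice point can be isolated in a very small sketch. The helper name `contains_ignoring_case` is hypothetical and only mirrors the comparison made inside the loop of Listing 12-21:

```rust
// Hypothetical helper: checks a single line against a query, ignoring case.
fn contains_ignoring_case(line: &str, query: &str) -> bool {
    // `to_lowercase` allocates a new `String`; it cannot borrow from `query`
    // because the lowercase characters may not exist in the original slice.
    let query: String = query.to_lowercase();
    // Passing `&query` borrows the `String` so `contains` can treat it as a
    // string slice pattern.
    line.to_lowercase().contains(&query)
}
```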
Let’s see if this implementation passes the tests:

```
running 2 tests
test tests::case_insensitive ... ok
test tests::case_sensitive ... ok

test result: ok. 2 passed; 0 failed; 0 ignored; 0 measured; 0
filtered out; finished in 0.00s
```
Great! They passed. Now, let’s call the new `search_case_insensitive` function
from the `run` function. First we’ll add a configuration option to the `Config`
struct to switch between case-sensitive and case-insensitive search. Adding
this field will cause compiler errors because we aren’t initializing this field
anywhere yet:

Filename: src/lib.rs

```
pub struct Config {
    pub query: String,
    pub file_path: String,
    pub ignore_case: bool,
}
```
We added the `ignore_case` field that holds a Boolean. Next, we need the `run`
function to check the `ignore_case` field’s value and use that to decide
whether to call the `search` function or the `search_case_insensitive`
function, as shown in Listing 12-22. This still won’t compile yet.

Filename: src/lib.rs

```
pub fn run(config: Config) -> Result<(), Box<dyn Error>> {
    let contents = fs::read_to_string(config.file_path)?;

    let results = if config.ignore_case {
        search_case_insensitive(&config.query, &contents)
    } else {
        search(&config.query, &contents)
    };

    for line in results {
        println!("{line}");
    }

    Ok(())
}
```
Listing 12-22: Calling either `search` or `search_case_insensitive` based on
the value in `config.ignore_case`

Finally, we need to check for the environment variable. The functions for
working with environment variables are in the `env` module in the standard
library, so we bring that module into scope at the top of *src/lib.rs*. Then
we’ll use the `var` function from the `env` module to check to see if any value
has been set for an environment variable named `IGNORE_CASE`, as shown in
Listing 12-23.

Filename: src/lib.rs

```
use std::env;
--snip--

impl Config {
    pub fn build(
        args: &[String]
    ) -> Result<Config, &'static str> {
        if args.len() < 3 {
            return Err("not enough arguments");
        }

        let query = args[1].clone();
        let file_path = args[2].clone();

        let ignore_case = env::var("IGNORE_CASE").is_ok();

        Ok(Config {
            query,
            file_path,
            ignore_case,
        })
    }
}
```
Listing 12-23: Checking for any value in an environment variable named
`IGNORE_CASE`

Here, we create a new variable, `ignore_case`. To set its value, we call the
`env::var` function and pass it the name of the `IGNORE_CASE` environment
variable. The `env::var` function returns a `Result` that will be the
successful `Ok` variant that contains the value of the environment variable if
the environment variable is set to any value. It will return the `Err` variant
if the environment variable is not set.

We’re using the `is_ok` method on the `Result` to check whether the environment
variable is set, which means the program should do a case-insensitive search.
If the `IGNORE_CASE` environment variable isn’t set to anything, `is_ok` will
return `false` and the program will perform a case-sensitive search. We don’t
care about the *value* of the environment variable, just whether it’s set or
unset, so we’re checking `is_ok` rather than using `unwrap`, `expect`, or any
of the other methods we’ve seen on `Result`.

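One consequence of checking only `is_ok` is that any value enables the option, including `0`. The sketch below makes this visible without touching the real process environment: the hypothetical function `ignore_case_from` takes the `Result` that `env::var` would return as a parameter. In a program it could be called as `ignore_case_from(std::env::var("IGNORE_CASE"))`.

```rust
use std::env::VarError;

// Hypothetical helper mirroring the `is_ok` check from Listing 12-23.
// The argument has the same type that `std::env::var` returns.
fn ignore_case_from(var: Result<String, VarError>) -> bool {
    // The contained value is never inspected: `Ok("0")` and `Ok("1")`
    // both count as "set".
    var.is_ok()
}
```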
We pass the value in the `ignore_case` variable to the `Config` instance so the
`run` function can read that value and decide whether to call
`search_case_insensitive` or `search`, as we implemented in Listing 12-22.

Let’s give it a try! First we’ll run our program without the environment
variable set and with the query `to`, which should match any line that contains
the word *to* in all lowercase:

```
$ cargo run -- to poem.txt
   Compiling minigrep v0.1.0 (file:///projects/minigrep)
    Finished dev [unoptimized + debuginfo] target(s) in 0.0s
     Running `target/debug/minigrep to poem.txt`
Are you nobody, too?
How dreary to be somebody!
```
Looks like that still works! Now let’s run the program with `IGNORE_CASE` set
to `1` but with the same query `to`:

```
$ IGNORE_CASE=1 cargo run -- to poem.txt
```

If you’re using PowerShell, you will need to set the environment variable and
run the program as separate commands:

```
PS> $Env:IGNORE_CASE=1; cargo run -- to poem.txt
```

This will make `IGNORE_CASE` persist for the remainder of your shell session.
It can be unset with the `Remove-Item` cmdlet:

```
PS> Remove-Item Env:IGNORE_CASE
```

We should get lines that contain *to* that might have uppercase letters:

```
Are you nobody, too?
How dreary to be somebody!
To tell your name the livelong day
To an admiring bog!
```
Excellent, we also got lines containing *To*! Our `minigrep` program can now do
case-insensitive searching controlled by an environment variable. Now you know
how to manage options set using either command line arguments or environment
variables.

Some programs allow arguments *and* environment variables for the same
configuration. In those cases, the programs decide that one or the other takes
precedence. For another exercise on your own, try controlling case sensitivity
through either a command line argument or an environment variable. Decide
whether the command line argument or the environment variable should take
precedence if the program is run with one set to case sensitive and one set to
ignore case.

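As a starting point for that exercise, here is one possible precedence policy, sketched under the assumption that an explicit command line choice should win over the environment. The function `resolve_ignore_case` and its parameters are hypothetical; how the command line flag is parsed is left open:

```rust
// Hypothetical policy: `cli_choice` is `Some(true)` or `Some(false)` when the
// user passed an explicit flag, and `None` when no flag was given.
// `env_is_set` is the result of the `IGNORE_CASE` check from Listing 12-23.
fn resolve_ignore_case(cli_choice: Option<bool>, env_is_set: bool) -> bool {
    // An explicit command line choice takes precedence; otherwise fall back
    // to the environment variable.
    cli_choice.unwrap_or(env_is_set)
}
```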
The `std::env` module contains many more useful features for dealing with
environment variables: check out its documentation to see what is available.

## Writing Error Messages to Standard Error Instead of Standard Output

At the moment, we’re writing all of our output to the terminal using the
`println!` macro. In most terminals, there are two kinds of output: *standard
output* (`stdout`) for general information and *standard error* (`stderr`) for
error messages. This distinction enables users to choose to direct the
successful output of a program to a file but still print error messages to the
screen.

The `println!` macro is only capable of printing to standard output, so we have
to use something else to print to standard error.

### Checking Where Errors Are Written

First let’s observe how the content printed by `minigrep` is currently being
written to standard output, including any error messages we want to write to
standard error instead. We’ll do that by redirecting the standard output stream
to a file while intentionally causing an error. We won’t redirect the standard
error stream, so any content sent to standard error will continue to display on
the screen.

Command line programs are expected to send error messages to the standard error
stream so we can still see error messages on the screen even if we redirect the
standard output stream to a file. Our program is not currently well behaved:
we’re about to see that it saves the error message output to a file instead!

To demonstrate this behavior, we’ll run the program with `>` and the file path,
*output.txt*, that we want to redirect the standard output stream to. We won’t
pass any arguments, which should cause an error:

```
$ cargo run > output.txt
```
The `>` syntax tells the shell to write the contents of standard output to
*output.txt* instead of the screen. We didn’t see the error message we were
expecting printed to the screen, so that means it must have ended up in the
file. This is what *output.txt* contains:

```
Problem parsing arguments: not enough arguments
```
Yup, our error message is being printed to standard output. It’s much more
useful for error messages like this to be printed to standard error so only
data from a successful run ends up in the file. We’ll change that.

### Printing Errors to Standard Error

We’ll use the code in Listing 12-24 to change how error messages are printed.
Because of the refactoring we did earlier in this chapter, all the code that
prints error messages is in one function, `main`. The standard library provides
the `eprintln!` macro that prints to the standard error stream, so let’s change
the two places we were calling `println!` to print errors to use `eprintln!`
instead.

Filename: src/main.rs

```
fn main() {
    let args: Vec<String> = env::args().collect();

    let config = Config::build(&args).unwrap_or_else(|err| {
        eprintln!("Problem parsing arguments: {err}");
        process::exit(1);
    });

    if let Err(e) = minigrep::run(config) {
        eprintln!("Application error: {e}");
        process::exit(1);
    }
}
```
Listing 12-24: Writing error messages to standard error instead of standard
output using `eprintln!`

Let’s now run the program again in the same way, without any arguments and
redirecting standard output with `>`:

```
$ cargo run > output.txt
Problem parsing arguments: not enough arguments
```
Now we see the error onscreen and *output.txt* contains nothing, which is the
behavior we expect of command line programs.

Let’s run the program again with arguments that don’t cause an error but still
redirect standard output to a file, like so:

```
$ cargo run -- to poem.txt > output.txt
```

We won’t see any output to the terminal, and *output.txt* will contain our
results:

Filename: output.txt

```
Are you nobody, too?
How dreary to be somebody!
```
This demonstrates that we’re now using standard output for successful output
and standard error for error output as appropriate.

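The separation of the two streams can also be modeled with ordinary writers, which makes the behavior easy to check. In this sketch the function `report` and its parameters are hypothetical; it uses the standard `writeln!` macro and the `std::io::Write` trait rather than `println!` and `eprintln!`. In a real program it could be called with `&mut io::stdout()` and `&mut io::stderr()`:

```rust
use std::io::{self, Write};

// Hypothetical helper: sends a successful line to `out` and an error
// message to `err`, the way `minigrep` now uses stdout and stderr.
fn report<O: Write, E: Write>(
    out: &mut O,
    err: &mut E,
    outcome: Result<&str, &str>,
) -> io::Result<()> {
    match outcome {
        // Successful output goes to the first writer (stdout in practice).
        Ok(line) => writeln!(out, "{line}"),
        // Error output goes to the second writer (stderr in practice).
        Err(msg) => writeln!(err, "Problem parsing arguments: {msg}"),
    }
}
```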
## Summary

This chapter recapped some of the major concepts you’ve learned so far and
covered how to perform common I/O operations in Rust. By using command line
arguments, files, environment variables, and the `eprintln!` macro for printing
errors, you’re now prepared to write command line applications. Combined with
the concepts in previous chapters, your code will be well organized, store data
effectively in the appropriate data structures, handle errors nicely, and be
well tested.

Next, we’ll explore some Rust features that were influenced by functional
languages: closures and iterators.