Propagate edits to chapter 20 to src

This commit is contained in:
Carol (Nichols || Goulding) 2022-06-07 21:44:03 -04:00
parent 89f9c10353
commit 5df6909c57
No known key found for this signature in database
GPG Key ID: E907EE5A736F87D4
5 changed files with 125 additions and 113 deletions


@ -502,6 +502,7 @@ TextField
That'd
there'd
ThreadPool
threadpool
timestamp
Tiếng
timeline


@ -12,7 +12,7 @@ Figure 20-1 in a web browser.
<span class="caption">Figure 20-1: Our final shared project</span>
Here is the plan to build the web server:
Here is our plan for building the web server:
1. Learn a bit about TCP and HTTP.
2. Listen for TCP connections on a socket.
@ -20,15 +20,14 @@ Here is the plan to build the web server:
4. Create a proper HTTP response.
5. Improve the throughput of our server with a thread pool.
But before we get started, we should mention one detail: the method we'll use
won't be the best way to build a web server with Rust. A number of
production-ready crates are available on [crates.io](https://crates.io/) that
provide more complete web server and thread pool implementations than we'll
build.
However, our intention in this chapter is to help you learn, not to take the
easy route. Because Rust is a systems programming language, we can choose the
level of abstraction we want to work with and can go to a lower level than is
possible or practical in other languages. We'll write the basic HTTP server and
thread pool manually so you can learn the general ideas and techniques behind
the crates you might use in the future.
Before we get started, we should mention one detail: the method we'll use won't
be the best way to build a web server with Rust. Community members have
published a number of production-ready crates available on
[crates.io](https://crates.io/) that provide more complete web server and
thread pool implementations than we'll build. However, our intention in this
chapter is to help you learn, not to take the easy route. Because Rust is a
systems programming language, we can choose the level of abstraction we want to
work with and can go to a lower level than is possible or practical in other
languages. We'll therefore write the basic HTTP server and thread pool manually
so you can learn the general ideas and techniques behind the crates you might
use in the future.


@ -5,12 +5,11 @@ let's look at a quick overview of the protocols involved in building web
servers. The details of these protocols are beyond the scope of this book, but
a brief overview will give you the information you need.
The two main protocols involved in web servers are the *Hypertext Transfer
Protocol* *(HTTP)* and the *Transmission Control Protocol* *(TCP)*. Both
protocols are *request-response* protocols, meaning a *client* initiates
requests and a *server* listens to the requests and provides a response to the
client. The contents of those requests and responses are defined by the
protocols.
The two main protocols involved in web servers are *Hypertext Transfer
Protocol* *(HTTP)* and *Transmission Control Protocol* *(TCP)*. Both protocols
are *request-response* protocols, meaning a *client* initiates requests and a
*server* listens to the requests and provides a response to the client. The
contents of those requests and responses are defined by the protocols.
TCP is the lower-level protocol that describes the details of how information
gets from one server to another but doesn't specify what that information is.
@ -32,8 +31,8 @@ $ cd hello
```
Now enter the code in Listing 20-1 in *src/main.rs* to start. This code will
listen at the address `127.0.0.1:7878` for incoming TCP streams. When it gets
an incoming stream, it will print `Connection established!`.
listen at the local address `127.0.0.1:7878` for incoming TCP streams. When it
gets an incoming stream, it will print `Connection established!`.
<span class="filename">Filename: src/main.rs</span>
@ -48,23 +47,24 @@ Using `TcpListener`, we can listen for TCP connections at the address
`127.0.0.1:7878`. In the address, the section before the colon is an IP address
representing your computer (this is the same on every computer and doesn't
represent the author's computer specifically), and `7878` is the port. We've
chosen this port for two reasons: HTTP isn't normally accepted on this port, and
7878 is *rust* typed on a telephone.
chosen this port for two reasons: HTTP isn't normally accepted on this port, so
our server is unlikely to conflict with any other web server you might have
running on your machine, and 7878 is *rust* typed on a telephone.
The `bind` function in this scenario works like the `new` function in that it
will return a new `TcpListener` instance. The reason the function is called
`bind` is that in networking, connecting to a port to listen to is known as
“binding to a port.”
will return a new `TcpListener` instance. The function is called `bind`
because, in networking, connecting to a port to listen to is known as “binding
to a port.”
The `bind` function returns a `Result<T, E>`, which indicates that binding
might fail. For example, connecting to port 80 requires administrator
privileges (nonadministrators can listen only on ports higher than 1023), so if
we tried to connect to port 80 without being an administrator, binding wouldn't
work. As another example, binding wouldn't work if we ran two instances of our
program and so had two programs listening to the same port. Because we're
writing a basic server just for learning purposes, we won't worry about
handling these kinds of errors; instead, we use `unwrap` to stop the program if
errors happen.
The `bind` function returns a `Result<T, E>`, which indicates that it's
possible for binding to fail. For example, connecting to port 80 requires
administrator privileges (nonadministrators can listen only on ports higher
than 1023), so if we tried to connect to port 80 without being an
administrator, binding wouldn't work. Binding also wouldn't work, for example,
if we ran two instances of our program and so had two programs listening to the
same port. Because we're writing a basic server just for learning purposes, we
won't worry about handling these kinds of errors; instead, we use `unwrap` to
stop the program if errors happen.
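As a quick, runnable sketch of the point about `bind` returning a `Result` (this is not one of the chapter's listings; the use of port 0 here is purely for illustration):

```rust
use std::net::TcpListener;

fn main() {
    // Port 0 asks the OS for any free port, so this sketch runs without
    // clashing with a server already listening on 7878.
    let listener = TcpListener::bind("127.0.0.1:0").unwrap();
    let addr = listener.local_addr().unwrap();
    println!("bound to {addr}");

    // While `listener` is still alive and listening, a second bind to the
    // same concrete address fails: `bind` hands us a
    // Result<TcpListener, std::io::Error> to inspect instead of crashing.
    assert!(TcpListener::bind(addr).is_err());
}
```

Calling `unwrap` on the first `bind`, as the chapter does, simply turns that `Err` case into a panic.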
The `incoming` method on `TcpListener` returns an iterator that gives us a
sequence of streams (more specifically, streams of type `TcpStream`). A single
@ -160,10 +160,10 @@ more gracefully, but we're choosing to stop the program in the error case for
simplicity.
The browser signals the end of an HTTP request by sending two newline
characters in a row, so to get one request from the stream, we take lines while
they're not the empty string. Once we've collected the lines into the vector,
we're printing them out using pretty debug formatting so we can take a look at
the instructions the web browser is sending to our server.
characters in a row, so to get one request from the stream, we take lines until
we get a line that is the empty string. Once we've collected the lines into the
vector, we're printing them out using pretty debug formatting so we can take a
look at the instructions the web browser is sending to our server.
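The take-lines-until-empty idea can be sketched without a live socket, because `BufReader` works over anything that implements `Read`; the raw request bytes here are made up for illustration:

```rust
use std::io::{BufRead, BufReader};

fn main() {
    // A hypothetical raw request; a real server would wrap a TcpStream,
    // but a byte slice implements Read just as well.
    let raw = "GET / HTTP/1.1\r\nHost: 127.0.0.1\r\n\r\nignored body";
    let reader = BufReader::new(raw.as_bytes());

    // lines() strips the trailing \r\n, so the blank line that ends the
    // headers arrives as an empty string and stops take_while.
    let http_request: Vec<String> = reader
        .lines()
        .map(|line| line.unwrap())
        .take_while(|line| !line.is_empty())
        .collect();

    assert_eq!(http_request[0], "GET / HTTP/1.1");
    assert_eq!(http_request.len(), 2);
    println!("{:#?}", http_request);
}
```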
Let's try this code! Start the program and make a request in a web browser
again. Note that we'll still get an error page in the browser, but our
@ -215,7 +215,8 @@ message-body
The first line is the *request line* that holds information about what the
client is requesting. The first part of the request line indicates the *method*
being used, such as `GET` or `POST`, which describes how the client is making
this request. Our client used a `GET` request.
this request. Our client used a `GET` request, which means it is asking for
information.
The next part of the request line is */*, which indicates the *Uniform Resource
Identifier* *(URI)* the client is requesting: a URI is almost, but not quite,
@ -297,7 +298,7 @@ request and sending a response!
### Returning Real HTML
Let's implement the functionality for returning more than a blank page. Create
a new file, *hello.html*, in the root of your project directory, not in the
the new file *hello.html* in the root of your project directory, not in the
*src* directory. You can input any HTML you want; Listing 20-4 shows one
possibility.
@ -340,9 +341,10 @@ should see your HTML rendered!
Currently, we're ignoring the request data in `http_request` and just sending
back the contents of the HTML file unconditionally. That means if you try
requesting *127.0.0.1:7878/something-else* in your browser, you'll still get
back this same HTML response. Our server is very limited and is not what most
web servers do. We want to customize our responses depending on the request and
only send back the HTML file for a well-formed request to */*.
back this same HTML response. At the moment, our server is very limited and
does not do what most web servers do. We want to customize our responses
depending on the request and only send back the HTML file for a well-formed
request to */*.
### Validating the Request and Selectively Responding
@ -360,8 +362,8 @@ received against what we know a request for */* looks like and adds `if` and
{{#rustdoc_include ../listings/ch20-web-server/listing-20-06/src/main.rs:here}}
```
<span class="caption">Listing 20-6: Looking at the request line and handling
requests to */* differently from other requests</span>
<span class="caption">Listing 20-6: Handling requests to */* differently from
other requests</span>
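The core decision in that listing can be sketched as a small function (the helper name `status_line_for` is invented here; the chapter's listing branches inline instead):

```rust
// Compare the request line against the one well-formed request we accept,
// and pick a status line accordingly.
fn status_line_for(request_line: &str) -> &'static str {
    if request_line == "GET / HTTP/1.1" {
        "HTTP/1.1 200 OK"
    } else {
        "HTTP/1.1 404 NOT FOUND"
    }
}

fn main() {
    assert_eq!(status_line_for("GET / HTTP/1.1"), "HTTP/1.1 200 OK");
    assert_eq!(status_line_for("GET /other HTTP/1.1"), "HTTP/1.1 404 NOT FOUND");
}
```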
We're only going to be looking at the first line of the HTTP request, so rather
than reading the entire request into a vector, we're calling `next` to get the


@ -21,8 +21,8 @@ for 5 seconds before responding.
{{#rustdoc_include ../listings/ch20-web-server/listing-20-10/src/main.rs:here}}
```
<span class="caption">Listing 20-10: Simulating a slow request by recognizing
*/sleep* and sleeping for 5 seconds</span>
<span class="caption">Listing 20-10: Simulating a slow request by sleeping for
5 seconds</span>
We switched from `if` to `match` now that we have three cases. We need to
explicitly match on a slice of `request_line` to pattern match against the
@ -43,9 +43,8 @@ you enter the */* URI a few times, as before, you'll see it respond quickly.
But if you enter */sleep* and then load */*, you'll see that */* waits until
`sleep` has slept for its full 5 seconds before loading.
There are multiple ways we could change how our web server works to avoid
having more requests back up behind a slow request; the one we'll implement is
a thread pool.
There are multiple techniques we could use to avoid requests backing up behind
a slow request; the one we'll implement is a thread pool.
### Improving Throughput with a Thread Pool
@ -64,19 +63,19 @@ for each request as it came in, someone making 10 million requests to our
server could create havoc by using up all our server's resources and grinding
the processing of requests to a halt.
Rather than spawning unlimited threads, we'll have a fixed number of threads
waiting in the pool. As requests come in, they'll be sent to the pool for
Rather than spawning unlimited threads, then, we'll have a fixed number of
threads waiting in the pool. Requests that come in are sent to the pool for
processing. The pool will maintain a queue of incoming requests. Each of the
threads in the pool will pop off a request from this queue, handle the request,
and then ask the queue for another request. With this design, we can process
`N` requests concurrently, where `N` is the number of threads. If each thread
is responding to a long-running request, subsequent requests can still back up
in the queue, but we've increased the number of long-running requests we can
handle before reaching that point.
and then ask the queue for another request. With this design, we can process up
to `N` requests concurrently, where `N` is the number of threads. If each
thread is responding to a long-running request, subsequent requests can still
back up in the queue, but we've increased the number of long-running requests
we can handle before reaching that point.
This technique is just one of many ways to improve the throughput of a web
server. Other options you might explore are the fork/join model and the
single-threaded async I/O model. If you're interested in this topic, you can
server. Other options you might explore are the *fork/join model* and the
*single-threaded async I/O model*. If you're interested in this topic, you can
read more about other solutions and try to implement them; with a low-level
language like Rust, all of these options are possible.
@ -90,15 +89,21 @@ designing the public API.
Similar to how we used test-driven development in the project in Chapter 12,
we'll use compiler-driven development here. We'll write the code that calls the
functions we want, and then we'll look at errors from the compiler to determine
what we should change next to get the code to work.
what we should change next to get the code to work. Before we do that, however,
we'll explore the technique we're not going to use as a starting point.
#### Code Structure If We Could Spawn a Thread for Each Request
<!-- Old headings. Do not remove or links may break. -->
<a id="code-structure-if-we-could-spawn-a-thread-for-each-request"></a>
#### Spawning a Thread for Each Request
First, let's explore how our code might look if it did create a new thread for
every connection. As mentioned earlier, this isn't our final plan due to the
problems with potentially spawning an unlimited number of threads, but it is a
starting point. Listing 20-11 shows the changes to make to `main` to spawn a
new thread to handle each stream within the `for` loop.
starting point to get a working multithreaded server first. Then we'll add the
thread pool as an improvement, and contrasting the two solutions will be
easier. Listing 20-11 shows the changes to make to `main` to spawn a new thread
to handle each stream within the `for` loop.
<span class="filename">Filename: src/main.rs</span>
@ -112,11 +117,14 @@ stream</span>
As you learned in Chapter 16, `thread::spawn` will create a new thread and then
run the code in the closure in the new thread. If you run this code and load
*/sleep* in your browser, then */* in two more browser tabs, you'll indeed see
that the requests to */* don't have to wait for */sleep* to finish. But as we
mentioned, this will eventually overwhelm the system because you'd be making
that the requests to */* don't have to wait for */sleep* to finish. However, as
we mentioned, this will eventually overwhelm the system because you'd be making
new threads without any limit.
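The shape of that per-connection pattern can be sketched without sockets; plain closures stand in here for the chapter's `handle_connection(stream)` calls:

```rust
use std::thread;

fn main() {
    // One thread per "connection", spawned as work arrives; nothing bounds
    // how many of these a flood of requests could create.
    let handles: Vec<_> = (0..3)
        .map(|conn_id| {
            thread::spawn(move || {
                // Simulated per-connection work.
                format!("handled connection {conn_id}")
            })
        })
        .collect();

    for handle in handles {
        println!("{}", handle.join().unwrap());
    }
}
```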
#### Creating a Similar Interface for a Finite Number of Threads
<!-- Old headings. Do not remove or links may break. -->
<a id="creating-a-similar-interface-for-a-finite-number-of-threads"></a>
#### Creating a Finite Number of Threads
We want our thread pool to work in a similar, familiar way so switching from
threads to a thread pool doesn't require large changes to the code that uses
@ -138,7 +146,10 @@ run for each stream. We need to implement `pool.execute` so it takes the
closure and gives it to a thread in the pool to run. This code won't yet
compile, but we'll try so the compiler can guide us in how to fix it.
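To make the intended interface concrete, here is a toy sketch of the shape `execute` should have, mirroring `thread::spawn`'s closure bound. This toy runs each job immediately on the calling thread purely to show the signature; the real pool hands the job to a worker thread:

```rust
pub struct ThreadPool;

impl ThreadPool {
    /// Create a pool; a real implementation would spawn `size` threads.
    pub fn new(size: usize) -> ThreadPool {
        assert!(size > 0);
        ThreadPool
    }

    /// Same bounds as thread::spawn's closure parameter, minus the
    /// return value: FnOnce, Send, and 'static.
    pub fn execute<F>(&self, f: F)
    where
        F: FnOnce() + Send + 'static,
    {
        // Toy behavior only: run inline instead of dispatching to a worker.
        f();
    }
}

fn main() {
    let pool = ThreadPool::new(4);
    pool.execute(|| println!("job ran"));
}
```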
#### Building the `ThreadPool` Struct Using Compiler Driven Development
<!-- Old headings. Do not remove or links may break. -->
<a id="building-the-threadpool-struct-using-compiler-driven-development"></a>
#### Building `ThreadPool` Using Compiler Driven Development
Make the changes in Listing 20-12 to *src/main.rs*, and then let's use the
compiler errors from `cargo check` to drive our development. Here is the first
@ -206,12 +217,11 @@ Let's check the code again:
```
Now the error occurs because we don't have an `execute` method on `ThreadPool`.
Recall from the [“Creating a Similar Interface for a Finite Number of
Threads”](#creating-a-similar-interface-for-a-finite-number-of-threads)<!--
ignore --> section that we decided our thread pool should have an interface
similar to `thread::spawn`. In addition, we'll implement the `execute` function
so it takes the closure it's given and gives it to an idle thread in the pool
to run.
Recall from the [“Creating a Finite Number of
Threads”](#creating-a-finite-number-of-threads)<!-- ignore --> section that we
decided our thread pool should have an interface similar to `thread::spawn`. In
addition, we'll implement the `execute` function so it takes the closure it's
given and gives it to an idle thread in the pool to run.
We'll define the `execute` method on `ThreadPool` to take a closure as a
parameter. Recall from the [“Moving Captured Values Out of the Closure and the
@ -315,8 +325,8 @@ pub fn new(size: usize) -> Result<ThreadPool, PoolCreationError> {
Now that we have a way to know we have a valid number of threads to store in
the pool, we can create those threads and store them in the `ThreadPool` struct
before returning it. But how do we “store” a thread? Let's take another look at
the `thread::spawn` signature:
before returning the struct. But how do we “store” a thread? Let's take another
look at the `thread::spawn` signature:
```rust,ignore
pub fn spawn<F, T>(f: F) -> JoinHandle<T>
@ -351,12 +361,11 @@ using `thread::JoinHandle` as the type of the items in the vector in
`ThreadPool`.
Once a valid size is received, our `ThreadPool` creates a new vector that can
hold `size` items. We haven't used the `with_capacity` function in this book
yet, which performs the same task as `Vec::new` but with an important
difference: it preallocates space in the vector. Because we know we need to
store `size` elements in the vector, doing this allocation up front is slightly
more efficient than using `Vec::new`, which resizes itself as elements are
inserted.
hold `size` items. The `with_capacity` function performs the same task as
`Vec::new` but with an important difference: it preallocates space in the
vector. Because we know we need to store `size` elements in the vector, doing
this allocation up front is slightly more efficient than using `Vec::new`,
which resizes itself as elements are inserted.
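The difference is easy to observe directly; this small sketch (not a chapter listing) checks the length and capacity of a preallocated vector:

```rust
fn main() {
    // with_capacity preallocates space, but the vector starts empty:
    // len is 0 while capacity is already at least the requested size.
    let mut workers: Vec<std::thread::JoinHandle<()>> = Vec::with_capacity(4);
    assert_eq!(workers.len(), 0);
    assert!(workers.capacity() >= 4);

    // Pushing up to the requested number of items needs no reallocation.
    for _ in 0..4 {
        workers.push(std::thread::spawn(|| {}));
    }
    assert!(workers.capacity() >= 4);

    for handle in workers {
        handle.join().unwrap();
    }
}
```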
When you run `cargo check` again, it should succeed.
@ -373,10 +382,11 @@ implement it manually.
We'll implement this behavior by introducing a new data structure between the
`ThreadPool` and the threads that will manage this new behavior. We'll call
this data structure `Worker`, which is a common term in pooling
implementations. Think of people working in the kitchen at a restaurant: the
workers wait until orders come in from customers, and then they're responsible
for taking those orders and filling them.
this data structure *Worker*, which is a common term in pooling
implementations. The Worker picks up code that needs to be run and runs the
code in the Worker's thread. Think of people working in the kitchen at a
restaurant: the workers wait until orders come in from customers, and then
they're responsible for taking those orders and filling them.
Instead of storing a vector of `JoinHandle<()>` instances in the thread pool,
we'll store instances of the `Worker` struct. Each `Worker` will store a single
@ -385,9 +395,9 @@ take a closure of code to run and send it to the already running thread for
execution. We'll also give each worker an `id` so we can distinguish between
the different workers in the pool when logging or debugging.
Let's make the following changes to what happens when we create a `ThreadPool`.
We'll implement the code that sends the closure to the thread after we have
`Worker` set up in this way:
Here is the new process that will happen when we create a `ThreadPool`. We'll
implement the code that sends the closure to the thread after we have `Worker`
set up in this way:
1. Define a `Worker` struct that holds an `id` and a `JoinHandle<()>`.
2. Change `ThreadPool` to hold a vector of `Worker` instances.
@ -428,19 +438,18 @@ the closure that we get in `execute`. Let's look at how to do that next.
#### Sending Requests to Threads via Channels
Now we'll tackle the problem that the closures given to `thread::spawn` do
The next problem we'll tackle is that the closures given to `thread::spawn` do
absolutely nothing. Currently, we get the closure we want to execute in the
`execute` method. But we need to give `thread::spawn` a closure to run when we
create each `Worker` during the creation of the `ThreadPool`.
We want the `Worker` structs that we just created to fetch code to run from a
queue held in the `ThreadPool` and send that code to its thread to run.
We want the `Worker` structs that we just created to fetch the code to run from
a queue held in the `ThreadPool` and send that code to its thread to run.
In Chapter 16, you learned about *channels*—a simple way to communicate between
two threads—that would be perfect for this use case. We'll use a channel to
function as the queue of jobs, and `execute` will send a job from the
`ThreadPool` to the `Worker` instances, which will send the job to its thread.
Here is the plan:
The channels we learned about in Chapter 16—a simple way to communicate between
two threads—would be perfect for this use case. We'll use a channel to function
as the queue of jobs, and `execute` will send a job from the `ThreadPool` to
as the queue of jobs, and `execute` will send a job from the `ThreadPool` to
the `Worker` instances, which will send the job to its thread. Here is the plan:
1. The `ThreadPool` will create a channel and hold on to the sender.
2. Each `Worker` will hold on to the receiver.
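The plan so far can be sketched end to end in miniature. This is a condensed sketch, not the chapter's final listings: the pool keeps the sender, workers share one receiver behind `Arc<Mutex<..>>`, and dropping the sender is used here as a stand-in for shutdown:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A job is a boxed closure, much like the type alias the chapter defines.
type Job = Box<dyn FnOnce() + Send + 'static>;

fn main() {
    // The pool side creates the channel and holds on to the sender.
    let (sender, receiver) = mpsc::channel::<Job>();
    // Workers share one receiver; the Mutex ensures only one worker
    // pops any given job off the queue.
    let receiver = Arc::new(Mutex::new(receiver));

    let workers: Vec<_> = (0..2)
        .map(|_| {
            let receiver = Arc::clone(&receiver);
            thread::spawn(move || loop {
                // The `let` drops the MutexGuard at the end of the
                // statement, so the lock is not held while the job runs.
                let message = receiver.lock().unwrap().recv();
                match message {
                    Ok(job) => job(),
                    Err(_) => break, // sender dropped: no more jobs
                }
            })
        })
        .collect();

    let counter = Arc::new(AtomicUsize::new(0));
    for _ in 0..4 {
        let counter = Arc::clone(&counter);
        sender
            .send(Box::new(move || {
                counter.fetch_add(1, Ordering::SeqCst);
            }))
            .unwrap();
    }
    drop(sender); // close the channel so workers exit their loops

    for worker in workers {
        worker.join().unwrap();
    }
    assert_eq!(counter.load(Ordering::SeqCst), 4);
}
```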
@ -528,8 +537,8 @@ Let's finally implement the `execute` method on `ThreadPool`. We'll also cha
`Job` from a struct to a type alias for a trait object that holds the type of
closure that `execute` receives. As discussed in the [“Creating Type Synonyms
with Type Aliases”][creating-type-synonyms-with-type-aliases]<!-- ignore -->
section of Chapter 19, type aliases allow us to make long types shorter. Look
at Listing 20-19.
section of Chapter 19, type aliases allow us to make long types shorter for
ease of use. Look at Listing 20-19.
<span class="filename">Filename: src/lib.rs</span>
@ -659,8 +668,8 @@ processed. The reason is somewhat subtle: the `Mutex` struct has no public
the `MutexGuard<T>` within the `LockResult<MutexGuard<T>>` that the `lock`
method returns. At compile time, the borrow checker can then enforce the rule
that a resource guarded by a `Mutex` cannot be accessed unless we hold the
lock. But this implementation can also result in the lock being held longer
than intended if we don't think carefully about the lifetime of the
lock. However, this implementation can also result in the lock being held
longer than intended if we aren't mindful of the lifetime of the
`MutexGuard<T>`.
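The guard-lifetime point can be checked in a few lines (a standalone sketch, not the chapter's worker loop): with a `let` binding, the temporary `MutexGuard<T>` is dropped at the end of the statement, so the lock is already free on the next line:

```rust
use std::sync::{mpsc, Arc, Mutex};

fn main() {
    let (sender, receiver) = mpsc::channel::<i32>();
    let receiver = Arc::new(Mutex::new(receiver));
    sender.send(10).unwrap();

    // The MutexGuard produced by lock() is a temporary in this statement,
    // so it is dropped as soon as the statement ends.
    let job = receiver.lock().unwrap().recv().unwrap();
    assert_eq!(job, 10);

    // The mutex is free again here: try_lock succeeds immediately.
    assert!(receiver.try_lock().is_ok());
}
```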
The code in Listing 20-20 that uses `let job =


@ -8,11 +8,12 @@ class="keystroke">ctrl-c</span> method to halt the main thread, all other
threads are stopped immediately as well, even if they're in the middle of
serving a request.
Now we'll implement the `Drop` trait to call `join` on each of the threads in
the pool so they can finish the requests they're working on before closing.
Then we'll implement a way to tell the threads they should stop accepting new
requests and shut down. To see this code in action, we'll modify our server to
accept only two requests before gracefully shutting down its thread pool.
Next, then, we'll implement the `Drop` trait to call `join` on each of the
threads in the pool so they can finish the requests they're working on before
closing. Then we'll implement a way to tell the threads they should stop
accepting new requests and shut down. To see this code in action, we'll modify
our server to accept only two requests before gracefully shutting down its
thread pool.
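The join-on-drop idea can be sketched in isolation. This sketch uses `drain` to take ownership of each worker inside `drop`; the chapter's own implementation handles the ownership question differently, and the struct shapes here are assumed for illustration:

```rust
use std::thread;

struct Worker {
    id: usize,
    thread: thread::JoinHandle<()>,
}

struct ThreadPool {
    workers: Vec<Worker>,
}

impl Drop for ThreadPool {
    fn drop(&mut self) {
        // drain moves each Worker out of the vector, giving us ownership
        // of the JoinHandle so we can call join, which blocks until the
        // worker's thread finishes its current work.
        for worker in self.workers.drain(..) {
            println!("Shutting down worker {}", worker.id);
            worker.thread.join().unwrap();
        }
    }
}

fn main() {
    let pool = ThreadPool {
        workers: (0..2)
            .map(|id| Worker {
                id,
                thread: thread::spawn(|| {}),
            })
            .collect(),
    };
    drop(pool); // joins both worker threads before returning
}
```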
### Implementing the `Drop` Trait on `ThreadPool`
@ -97,13 +98,13 @@ cleaned up, so nothing happens in that case.
### Signaling to the Threads to Stop Listening for Jobs
With all the changes we've made, our code compiles without any warnings. But
the bad news is this code doesn't function the way we want it to yet. The key
is the logic in the closures run by the threads of the `Worker` instances: at
the moment, we call `join`, but that won't shut down the threads because they
`loop` forever looking for jobs. If we try to drop our `ThreadPool` with our
current implementation of `drop`, the main thread will block forever waiting
for the first thread to finish.
With all the changes we've made, our code compiles without any warnings.
However, the bad news is this code doesn't function the way we want it to yet.
The key is the logic in the closures run by the threads of the `Worker`
instances: at the moment, we call `join`, but that won't shut down the threads
because they `loop` forever looking for jobs. If we try to drop our
`ThreadPool` with our current implementation of `drop`, the main thread will
block forever waiting for the first thread to finish.
To fix this problem, we'll need a change in the `ThreadPool` `drop`
implementation and then a change in the `Worker` loop.