rustls/BENCHMARKING.md

83 lines
4.0 KiB
Markdown
Raw Permalink Normal View History

# Benchmarking
This repository includes benchmarks for multiple use cases. They are described below, along with
information on how to run them.
## Throughput and memory usage benchmarks
These benchmarks measure the throughput and memory footprint you get from rustls. They have been
used in the past to compare performance against OpenSSL (see the results of [December
2023](https://github.com/aochagavia/rustls-bench-results) and [July
2019](https://jbp.io/2019/07/01/rustls-vs-openssl-performance.html)). You can also use them to
evaluate rustls' performance on different hardware (e.g. a bare-metal server with support for
AVX-512 instructions vs. a cloud VM with a consumer-grade CPU).
The measured aspects are:
1. Bulk data transfer throughput in MiB/s;
2. Handshake throughput (full, session id, tickets) in handshakes per second;
3. Memory usage per connection.
If you are interested in comparing against OpenSSL, check out the [twin OpenSSL
benchmarks](https://github.com/ctz/openssl-bench), which produce similar measurements.
#### Building
The benchmarks are implemented in the form of "example code" in `rustls/examples/internal/bench.rs`.
Use `cargo build --release -p rustls --example bench` to obtain the corresponding binary (you can
toggle conditionally compiled code with the `--no-default-features` and `--features` flags).
Note: while `cargo build --release --example bench` also works, it results in surprises when used
together with `--no-default-features` because of how Cargo's feature unification works (some
features get enabled automatically by other subcrates).
#### Running
There is a makefile in [admin/bench-measure.mk](admin/bench-measure.mk) providing useful commands to
facilitate benchmarking:
- `make measure`: runs bulk transfer and handshake throughput benchmarks using a predefined list of
cipher suites.
- `make memory`: measures memory usage for different amounts of connections.
You can inspect the makefile to get an idea of the command line arguments accepted by `bench`. With
the right arguments, you can run benchmarks for other cipher suites (through `cargo run --release`
or by directly launching the compiled binary).
#### Reducing noise
We usually extend the duration of the benchmarks in an attempt to neutralize the effect of cold CPU
and page caches, giving us more accurate results. This is done through the `BENCH_MULTIPLIER`
environment variable, which tells the benchmark runner to multiply the amount of work done. For
instance, `BENCH_MULTIPLIER=8` will ensure we do 8 times the work.
Additional ways to reduce noise are:
- Disabling ASLR (through `setarch -R`).
- Disabling CPU dynamic frequency scaling (usually on the BIOS/UEFI level).
- Disabling CPU hyper-threading (usually on the BIOS/UEFI level).
- Setting the Linux CPU governor to performance for all cores.
- Running the benchmarks multiple times (e.g. 30) and taking the median for each scenario (the
[December 2023 results](https://github.com/aochagavia/rustls-bench-results) include Python code
doing this).
## CI benchmarks
These benchmarks are meant to provide _automated_ and _accurate_ feedback on a PR's performance
impact compared to the main branch. By automating them we ensure they are regularly used, by keeping
them accurate we ensure they are actionable (i.e. too much noise would train reviewers to ignore the
information).
The benchmarks themselves are located under [ci-bench](ci-bench), together with a detailed readme
(including instructions on how to run them locally). The automated runner lives in its own
[repository](https://github.com/rustls/rustls-bench-app) and is deployed to a bare-metal machine to
ensure low-noise results.
## Nightly benchmarks
There are some `#[bench]` benchmarks spread throughout the codebase. We do not use them
systematically, but they help understand the performance of smaller pieces of code (one or two
functions), which would be difficult to see when the unit-of-benchmark is an entire handshake.
These benchmarks require a nightly compiler. If you are using `rustup`, you can run them with
`RUSTFLAGS=--cfg=bench cargo +nightly bench`