# Cargo Benchmarking

This directory contains benchmarks for Cargo itself. The benchmarks run with
[Criterion]; reading the Criterion book is recommended to get familiar with
how to use it. Basic usage:

```sh
cd benches/benchsuite
cargo bench
```

However, running all of the benchmarks takes many minutes, so in most cases it
is better to run only the benchmarks relevant to the section of code you are
working on.

## Benchmarks

There are several different kinds of benchmarks in the `benchsuite/benches` directory:

* `global_cache_tracker` — Benchmarks saving data to the global cache tracker
  database using samples of real-world data.
* `resolve` — Benchmarks the resolver against simulations of real-world workspaces.
* `workspace_initialization` — Benchmarks initialization of a workspace
  against simulations of real-world workspaces.
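
Each name above is a separate bench target, so it can be selected with
`--bench`. The following is a small sketch rather than part of the suite: the
helper name `bench_cmd` is hypothetical, and it only prints the command lines
so you can inspect them before running one.

```sh
# Hypothetical helper: print the cargo command for one bench target,
# optionally narrowed by a Criterion filter (for example resolve_ws/rust).
bench_cmd() {
    bench_target="$1"
    bench_filter="${2:-}"
    if [ -n "$bench_filter" ]; then
        printf 'cargo bench -p benchsuite --bench %s -- %s\n' \
            "$bench_target" "$bench_filter"
    else
        printf 'cargo bench -p benchsuite --bench %s\n' "$bench_target"
    fi
}

# Print one command per bench target listed above.
for name in global_cache_tracker resolve workspace_initialization; do
    bench_cmd "$name"
done
```

Copy the printed line for the target you care about, or pass a filter as the
second argument, as in `bench_cmd resolve resolve_ws/rust`.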

### Resolve benchmarks

The resolve benchmarks involve downloading the index and benchmarking against some
real-world and artificial workspaces located in the [`workspaces`](workspaces)
directory.

**Beware** that the initial download can take a long time (at least 10
minutes, even on an extremely fast network) and requires significant disk
space (around 4.5GB). The benchsuite caches the index and downloaded
crates in the `target/tmp/bench` directory, so subsequent runs should be
faster. You can (and probably should) specify individual benchmarks to
narrow the run down to a more reasonable set, for example:

```sh
cargo bench -p benchsuite --bench resolve -- resolve_ws/rust
```

This will only download what's necessary for the rust-lang/rust workspace
(which is about 330MB) and run the benchmarks against it (which should take
about a minute). To get a list of all the benchmarks, run:

```sh
cargo bench -p benchsuite --bench resolve -- --list
```
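
Since a full resolve run needs around 4.5GB, it can be worth checking free
disk space first. This is a rough sketch, not something the benchsuite does
itself: the helper name `has_free_kb` and the threshold are assumptions, and
it checks the filesystem of the current directory, so adjust the path if your
`target` directory lives elsewhere.

```sh
# Hypothetical pre-flight check: succeed only if the filesystem holding
# directory $1 has at least $2 kilobytes available.
has_free_kb() {
    # -P selects the portable one-line-per-filesystem format; -k reports
    # sizes in 1024-byte units. Column 4 is the available space.
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge "$2" ]
}

# About 4.5GB in kilobytes (4.5 * 1024 * 1024). Assumed threshold based on
# the estimate above.
needed_kb=4718592

if has_free_kb . "$needed_kb"; then
    echo "enough free space for the full resolve benchmarks"
else
    echo "low on disk space; consider a filter such as resolve_ws/rust"
fi
```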

### Global cache tracker

The `global_cache_tracker` benchmark tests saving data to the global cache
tracker database using samples of real-world data. This benchmark should run
relatively quickly.

The real-world data is based on a capture of my personal development
environment, which has accumulated a large cache, so it is somewhat arbitrary
but hopefully representative of a challenging environment. The data is
captured with the `capture-last-use` binary, which you can run if you need
to rebuild the database. Try to run it on a system with a relatively full
cache in the Cargo home directory. To run the benchmark:

```sh
cargo bench -p benchsuite --bench global_cache_tracker
```

## Viewing reports

The benchmarks display some basic information on the command line while they
run. A more complete HTML report can be found at
`target/criterion/report/index.html`, which contains links to all the
benchmarks and summaries. Check out the Criterion book for more information on
the extensive reporting capabilities.

## Comparing implementations

Knowing the raw numbers can be useful, but what you're probably most
interested in is checking if your changes help or hurt performance. To do
that, you need to run the benchmarks multiple times.

First, run the benchmarks from the master branch of cargo without any changes.
To make it easier to compare, Criterion supports naming the baseline so that
you can iterate on your code and compare against it multiple times.

```sh
cargo bench -- --save-baseline master
```

Now switch to the branch with your changes and re-run the benchmarks,
comparing against the baseline:

```sh
cargo bench -- --baseline master
```

You can repeat the last command as you make changes to re-compare against the
master baseline.

Without the baseline arguments, Criterion compares against the previous run,
which can be helpful for comparing incremental changes.
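
The two-step comparison can be wrapped in a small script. This is only a
sketch under a few assumptions: the function names `run` and
`compare_to_master` and the `DRY_RUN` variable are hypothetical, the working
tree is clean enough for `git checkout` to succeed, and the function is called
from the directory where you normally run `cargo bench`.

```sh
# Hypothetical wrapper: with DRY_RUN=1 the command is printed instead of
# being executed, which makes it easy to review the steps first.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: compare_to_master <branch>
# Saves a baseline named "master" on the master branch, then benchmarks
# <branch> against it. Stops at the first step that fails.
compare_to_master() {
    branch="$1"
    run git checkout master &&
        run cargo bench -- --save-baseline master &&
        run git checkout "$branch" &&
        run cargo bench -- --baseline master
}
```

For example, `DRY_RUN=1 compare_to_master my-feature` prints the four commands
without running them; drop `DRY_RUN=1` to execute them.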

## Capturing workspaces

The [`workspaces`](workspaces) directory contains a variety of workspaces
intended to provide good exercises for the benchmarks. Some of these are
shadow copies of real-world workspaces, created with the tool in the
[`capture`](capture) directory. The tool copies `Cargo.lock` and all of the
`Cargo.toml` files of the workspace members. It also adds an empty `lib.rs`
so Cargo won't error, and sanitizes the `Cargo.toml` to some degree, removing
unwanted elements. Finally, it compresses everything into a `tgz`.

To run it, do:

```sh
cd benches/capture
cargo run -- /path/to/workspace/foo
```

The resolver benchmarks also support the `CARGO_BENCH_WORKSPACES` environment
variable, which you can point to a Cargo workspace if you want to try
different workspaces. For example:

```sh
CARGO_BENCH_WORKSPACES=/path/to/some/workspace cargo bench
```

## TODO

This is just a start for establishing a benchmarking suite for Cargo. There's
a lot that can be added. Some ideas:

* Fix the benchmarks so that the resolver setup doesn't run every iteration.
* Benchmark [this section of
  code](https://github.com/rust-lang/cargo/blob/a821e2cb24d7b6013433f069ab3bad53d160e100/src/cargo/ops/cargo_compile.rs#L470-L549)
  which builds the unit graph. The performance there isn't great, and it would
  be good to keep an eye on it. Unfortunately that would mean doing a bit of
  work to make `generate_targets` publicly visible, and there is a bunch of
  setup code that may need to be duplicated.
* Benchmark the fingerprinting code.
* Benchmark running the `cargo` executable. Running something like `cargo
  build` or `cargo check` with everything "Fresh" would be a good end-to-end
  exercise to measure the overall overhead of Cargo.
* Benchmark pathological resolver scenarios. There might be some cases where
  the resolver can spend a significant amount of time. It would be good to
  identify if these exist, and create benchmarks for them. This may require
  creating an artificial index, similar to the `resolver-tests`. This should
  also consider scenarios where the resolver ultimately fails.
* Benchmark without `Cargo.lock`. I'm not sure if this is particularly
  valuable, since we are mostly concerned with incremental builds which will
  always have a lock file.
* Benchmark just
  [`resolve::resolve`](https://github.com/rust-lang/cargo/blob/a821e2cb24d7b6013433f069ab3bad53d160e100/src/cargo/core/resolver/mod.rs#L122)
  without anything else. This can help focus on just the resolver.

[Criterion]: https://bheisler.github.io/criterion.rs/book/