mirror of https://github.com/rust-lang/cargo
2054 lines
84 KiB
Rust
2054 lines
84 KiB
Rust
//! # Fingerprints
|
||
//!
|
||
//! This module implements change-tracking so that Cargo can know whether or
|
||
//! not something needs to be recompiled. A Cargo `Unit` can be either "dirty"
|
||
//! (needs to be recompiled) or "fresh" (it does not need to be recompiled).
|
||
//! There are several mechanisms that influence a Unit's freshness:
|
||
//!
|
||
//! - The `Fingerprint` is a hash, saved to the filesystem in the
|
||
//! `.fingerprint` directory, that tracks information about the Unit. If the
|
||
//! fingerprint is missing (such as the first time the unit is being
|
||
//! compiled), then the unit is dirty. If any of the fingerprint fields
|
||
//! change (like the name of the source file), then the Unit is considered
|
||
//! dirty.
|
||
//!
|
||
//! The `Fingerprint` also tracks the fingerprints of all its dependencies,
|
||
//! so a change in a dependency will propagate the "dirty" status up.
|
||
//!
|
||
//! - Filesystem mtime tracking is also used to check if a unit is dirty.
|
||
//! See the section below on "Mtime comparison" for more details. There
|
||
//! are essentially two parts to mtime tracking:
|
||
//!
|
||
//! 1. The mtime of a Unit's output files is compared to the mtime of all
|
||
//! its dependencies' output file mtimes (see `check_filesystem`). If any
|
||
//! output is missing, or is older than a dependency's output, then the
|
||
//! unit is dirty.
|
||
//! 2. The mtime of a Unit's source files is compared to the mtime of its
|
||
//! dep-info file in the fingerprint directory (see `find_stale_file`).
|
||
//! The dep-info file is used as an anchor to know when the last build of
|
||
//! the unit was done. See the "dep-info files" section below for more
|
||
//! details. If any input files are missing, or are newer than the
|
||
//! dep-info, then the unit is dirty.
|
||
//!
|
||
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
|
||
//! is notoriously imprecise and problematic. Only a small part of the
|
||
//! environment is captured. This is a balance of performance, simplicity, and
|
||
//! completeness. Sandboxing, hashing file contents, tracking every file
|
||
//! access, environment variable, and network operation would ensure more
|
||
//! reliable and reproducible builds at the cost of being complex, slow, and
|
||
//! platform-dependent.
|
||
//!
|
||
//! ## Fingerprints and Metadata
|
||
//!
|
||
//! The `Metadata` hash is a hash added to the output filenames to isolate
|
||
//! each unit. See the documentation in the `compilation_files` module for
|
||
//! more details. NOTE: Not all output files are isolated via filename hashes
|
||
//! (like dylibs). The fingerprint directory uses a hash, but sometimes units
|
||
//! share the same fingerprint directory (when they don't have Metadata) so
|
||
//! care should be taken to handle this!
|
||
//!
|
||
//! Fingerprints and Metadata are similar, and track some of the same things.
|
||
//! The Metadata contains information that is required to keep Units separate.
|
||
//! The Fingerprint includes additional information that should cause a
|
||
//! recompile, but it is desired to reuse the same filenames. A comparison
|
||
//! of what is tracked:
|
||
//!
|
||
//! Value | Fingerprint | Metadata
|
||
//! -------------------------------------------|-------------|----------
|
||
//! rustc | ✓ | ✓
|
||
//! Profile | ✓ | ✓
|
||
//! `cargo rustc` extra args | ✓ | ✓
|
||
//! CompileMode | ✓ | ✓
|
||
//! Target Name | ✓ | ✓
|
||
//! TargetKind (bin/lib/etc.) | ✓ | ✓
|
||
//! Enabled Features | ✓ | ✓
|
||
//! Immediate dependency’s hashes | ✓[^1] | ✓
|
||
//! CompileKind (host/target) | ✓ | ✓
|
||
//! __CARGO_DEFAULT_LIB_METADATA[^4] | | ✓
|
||
//! package_id | | ✓
|
||
//! authors, description, homepage, repo | ✓ |
|
||
//! Target src path relative to ws | ✓ |
|
||
//! Target flags (test/bench/for_host/edition) | ✓ |
|
||
//! -C incremental=… flag | ✓ |
|
||
//! mtime of sources | ✓[^3] |
|
||
//! RUSTFLAGS/RUSTDOCFLAGS | ✓ |
|
||
//! LTO flags | ✓ | ✓
|
||
//! config settings[^5] | ✓ |
|
||
//! is_std | | ✓
|
||
//!
|
||
//! [^1]: Build script and bin dependencies are not included.
|
||
//!
|
||
//! [^3]: See below for details on mtime tracking.
|
||
//!
|
||
//! [^4]: `__CARGO_DEFAULT_LIB_METADATA` is set by rustbuild to embed the
|
||
//! release channel (bootstrap/stable/beta/nightly) in libstd.
|
||
//!
|
||
//! [^5]: Config settings that are not otherwise captured anywhere else.
|
||
//! Currently, this is only `doc.extern-map`.
|
||
//!
|
||
//! When deciding what should go in the Metadata vs the Fingerprint, consider
|
||
//! that some files (like dylibs) do not have a hash in their filename. Thus,
|
||
//! if a value changes, only the fingerprint will detect the change (consider,
|
||
//! for example, swapping between different features). Fields that are only in
|
||
//! Metadata generally aren't relevant to the fingerprint because they
|
||
//! fundamentally change the output (like target vs host changes the directory
|
||
//! where it is emitted).
|
||
//!
|
||
//! ## Fingerprint files
|
||
//!
|
||
//! Fingerprint information is stored in the
|
||
//! `target/{debug,release}/.fingerprint/` directory. Each Unit is stored in a
|
||
//! separate directory. Each Unit directory contains:
|
||
//!
|
||
//! - A file with a 16 hex-digit hash. This is the Fingerprint hash, used for
|
||
//! quick loading and comparison.
|
||
//! - A `.json` file that contains details about the Fingerprint. This is only
|
||
//! used to log details about *why* a fingerprint is considered dirty.
|
||
//! `CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build` can be
|
||
//! used to display this log information.
|
||
//! - A "dep-info" file which is a translation of rustc's `*.d` dep-info files
|
||
//! to a Cargo-specific format that tweaks file names and is optimized for
|
||
//! reading quickly.
|
||
//! - An `invoked.timestamp` file whose filesystem mtime is updated every time
|
||
//! the Unit is built. This is used for capturing the time when the build
|
||
//! starts, to detect if files are changed in the middle of the build. See
|
||
//! below for more details.
|
||
//!
|
||
//! Note that some units are a little different. A Unit for *running* a build
|
||
//! script or for `rustdoc` does not have a dep-info file (it's not
|
||
//! applicable). Build script `invoked.timestamp` files are in the build
|
||
//! output directory.
|
||
//!
|
||
//! ## Fingerprint calculation
|
||
//!
|
||
//! After the list of Units has been calculated, the Units are added to the
|
||
//! `JobQueue`. As each one is added, the fingerprint is calculated, and the
|
||
//! dirty/fresh status is recorded. A closure is used to update the fingerprint
|
||
//! on-disk when the Unit successfully finishes. The closure will recompute the
|
||
//! Fingerprint based on the updated information. If the Unit fails to compile,
|
||
//! the fingerprint is not updated.
|
||
//!
|
||
//! Fingerprints are cached in the `Context`. This makes computing
|
||
//! Fingerprints faster, but also is necessary for properly updating
|
||
//! dependency information. Since a Fingerprint includes the Fingerprints of
|
||
//! all dependencies, when it is updated, by using `Arc` clones, it
|
||
//! automatically picks up the updates to its dependencies.
|
||
//!
|
||
//! ### dep-info files
|
||
//!
|
||
//! Cargo passes the `--emit=dep-info` flag to `rustc` so that `rustc` will
|
||
//! generate a "dep info" file (with the `.d` extension). This is a
|
||
//! Makefile-like syntax that includes all of the source files used to build
|
||
//! the crate. This file is used by Cargo to know which files to check to see
|
||
//! if the crate will need to be rebuilt.
|
||
//!
|
||
//! After `rustc` exits successfully, Cargo will read the dep info file and
|
||
//! translate it into a binary format that is stored in the fingerprint
|
||
//! directory (`translate_dep_info`). The mtime of the fingerprint dep-info
|
||
//! file itself is used as the reference for comparing the source files to
|
||
//! determine if any of the source files have been modified (see below for
|
||
//! more detail). Note that Cargo parses the special `# env-var:...` comments in
|
||
//! dep-info files to learn about environment variables that the rustc compile
|
||
//! depends on. Cargo then later uses this to trigger a recompile if a
|
||
//! referenced env var changes (even if the source didn't change).
|
||
//!
|
||
//! There is also a third dep-info file. Cargo will extend the file created by
|
||
//! rustc with some additional information and saves this into the output
|
||
//! directory. This is intended for build system integration. See the
|
||
//! `output_depinfo` module for more detail.
|
||
//!
|
||
//! #### -Zbinary-dep-depinfo
|
||
//!
|
||
//! `rustc` has an experimental flag `-Zbinary-dep-depinfo`. This causes
|
||
//! `rustc` to include binary files (like rlibs) in the dep-info file. This is
|
||
//! primarily to support rustc development, so that Cargo can check the
|
||
//! implicit dependency to the standard library (which lives in the sysroot).
|
||
//! We want Cargo to recompile whenever the standard library rlib/dylibs
|
||
//! change, and this is a generic mechanism to make that work.
|
||
//!
|
||
//! ### Mtime comparison
|
||
//!
|
||
//! The use of modification timestamps is the most common way a unit will be
|
||
//! determined to be dirty or fresh between builds. There are many subtle
|
||
//! issues and edge cases with mtime comparisons. This gives a high-level
|
||
//! overview, but you'll need to read the code for the gritty details. Mtime
|
||
//! handling is different for different unit kinds. The different styles are
|
||
//! driven by the `Fingerprint.local` field, which is set based on the unit
|
||
//! kind.
|
||
//!
|
||
//! The status of whether or not the mtime is "stale" or "up-to-date" is
|
||
//! stored in `Fingerprint.fs_status`.
|
||
//!
|
||
//! All units will compare the mtime of its newest output file with the mtimes
|
||
//! of the outputs of all its dependencies. If any output file is missing,
|
||
//! then the unit is stale. If any dependency is newer, the unit is stale.
|
||
//!
|
||
//! #### Normal package mtime handling
|
||
//!
|
||
//! `LocalFingerprint::CheckDepinfo` is used for checking the mtime of
|
||
//! packages. It compares the mtime of the input files (the source files) to
|
||
//! the mtime of the dep-info file (which is written last after a build is
|
||
//! finished). If the dep-info is missing, the unit is stale (it has never
|
||
//! been built). The list of input files comes from the dep-info file. See the
|
||
//! section above for details on dep-info files.
|
||
//!
|
||
//! Also note that although registry and git packages use `CheckDepInfo`, none
|
||
//! of their source files are included in the dep-info (see
|
||
//! `translate_dep_info`), so for those kinds no mtime checking is done
|
||
//! (unless `-Zbinary-dep-depinfo` is used). Repository and git packages are
|
||
//! static, so there is no need to check anything.
|
||
//!
|
||
//! When a build is complete, the mtime of the dep-info file in the
|
||
//! fingerprint directory is modified to rewind it to the time when the build
|
||
//! started. This is done by creating an `invoked.timestamp` file when the
|
||
//! build starts to capture the start time. The mtime is rewound to the start
|
||
//! to handle the case where the user modifies a source file while a build is
|
||
//! running. Cargo can't know whether or not the file was included in the
|
||
//! build, so it takes a conservative approach of assuming the file was *not*
|
||
//! included, and it should be rebuilt during the next build.
|
||
//!
|
||
//! #### Rustdoc mtime handling
|
||
//!
|
||
//! Rustdoc does not emit a dep-info file, so Cargo currently has a relatively
|
||
//! simple system for detecting rebuilds. `LocalFingerprint::Precalculated` is
|
||
//! used for rustdoc units. For registry packages, this is the package
|
||
//! version. For git packages, it is the git hash. For path packages, it is
|
||
//! the a string of the mtime of the newest file in the package.
|
||
//!
|
||
//! There are some known bugs with how this works, so it should be improved at
|
||
//! some point.
|
||
//!
|
||
//! #### Build script mtime handling
|
||
//!
|
||
//! Build script mtime handling runs in different modes. There is the "old
|
||
//! style" where the build script does not emit any `rerun-if` directives. In
|
||
//! this mode, Cargo will use `LocalFingerprint::Precalculated`. See the
|
||
//! "rustdoc" section above how it works.
|
||
//!
|
||
//! In the new-style, each `rerun-if` directive is translated to the
|
||
//! corresponding `LocalFingerprint` variant. The `RerunIfChanged` variant
|
||
//! compares the mtime of the given filenames against the mtime of the
|
||
//! "output" file.
|
||
//!
|
||
//! Similar to normal units, the build script "output" file mtime is rewound
|
||
//! to the time just before the build script is executed to handle mid-build
|
||
//! modifications.
|
||
//!
|
||
//! ## Considerations for inclusion in a fingerprint
|
||
//!
|
||
//! Over time we've realized a few items which historically were included in
|
||
//! fingerprint hashings should not actually be included. Examples are:
|
||
//!
|
||
//! * Modification time values. We strive to never include a modification time
|
||
//! inside a `Fingerprint` to get hashed into an actual value. While
|
||
//! theoretically fine to do, in practice this causes issues with common
|
||
//! applications like Docker. Docker, after a layer is built, will zero out
|
||
//! the nanosecond part of all filesystem modification times. This means that
|
||
//! the actual modification time is different for all build artifacts, which
|
||
//! if we tracked the actual values of modification times would cause
|
||
//! unnecessary recompiles. To fix this we instead only track paths which are
|
||
//! relevant. These paths are checked dynamically to see if they're up to
|
||
//! date, and the modification time doesn't make its way into the fingerprint
|
||
//! hash.
|
||
//!
|
||
//! * Absolute path names. We strive to maintain a property where if you rename
|
||
//! a project directory Cargo will continue to preserve all build artifacts
|
||
//! and reuse the cache. This means that we can't ever hash an absolute path
|
||
//! name. Instead we always hash relative path names and the "root" is passed
|
||
//! in at runtime dynamically. Some of this is best effort, but the general
|
||
//! idea is that we assume all accesses within a crate stay within that
|
||
//! crate.
|
||
//!
|
||
//! These are pretty tricky to test for unfortunately, but we should have a good
|
||
//! test suite nowadays and lord knows Cargo gets enough testing in the wild!
|
||
//!
|
||
//! ## Build scripts
|
||
//!
|
||
//! The *running* of a build script (`CompileMode::RunCustomBuild`) is treated
|
||
//! significantly different than all other Unit kinds. It has its own function
|
||
//! for calculating the Fingerprint (`calculate_run_custom_build`) and has some
|
||
//! unique considerations. It does not track the same information as a normal
|
||
//! Unit. The information tracked depends on the `rerun-if-changed` and
|
||
//! `rerun-if-env-changed` statements produced by the build script. If the
|
||
//! script does not emit either of these statements, the Fingerprint runs in
|
||
//! "old style" mode where an mtime change of *any* file in the package will
|
||
//! cause the build script to be re-run. Otherwise, the fingerprint *only*
|
||
//! tracks the individual "rerun-if" items listed by the build script.
|
||
//!
|
||
//! The "rerun-if" statements from a *previous* build are stored in the build
|
||
//! output directory in a file called `output`. Cargo parses this file when
|
||
//! the Unit for that build script is prepared for the `JobQueue`. The
|
||
//! Fingerprint code can then use that information to compute the Fingerprint
|
||
//! and compare against the old fingerprint hash.
|
||
//!
|
||
//! Care must be taken with build script Fingerprints because the
|
||
//! `Fingerprint::local` value may be changed after the build script runs
|
||
//! (such as if the build script adds or removes "rerun-if" items).
|
||
//!
|
||
//! Another complication is if a build script is overridden. In that case, the
|
||
//! fingerprint is the hash of the output of the override.
|
||
//!
|
||
//! ## Special considerations
|
||
//!
|
||
//! Registry dependencies do not track the mtime of files. This is because
|
||
//! registry dependencies are not expected to change (if a new version is
|
||
//! used, the Package ID will change, causing a rebuild). Cargo currently
|
||
//! partially works with Docker caching. When a Docker image is built, it has
|
||
//! normal mtime information. However, when a step is cached, the nanosecond
|
||
//! portions of all files is zeroed out. Currently this works, but care must
|
||
//! be taken for situations like these.
|
||
//!
|
||
//! HFS on macOS only supports 1 second timestamps. This causes a significant
|
||
//! number of problems, particularly with Cargo's testsuite which does rapid
|
||
//! builds in succession. Other filesystems have various degrees of
|
||
//! resolution.
|
||
//!
|
||
//! Various weird filesystems (such as network filesystems) also can cause
|
||
//! complications. Network filesystems may track the time on the server
|
||
//! (except when the time is set manually such as with
|
||
//! `filetime::set_file_times`). Not all filesystems support modifying the
|
||
//! mtime.
|
||
//!
|
||
//! See the `A-rebuild-detection` flag on the issue tracker for more:
|
||
//! <https://github.com/rust-lang/cargo/issues?q=is%3Aissue+is%3Aopen+label%3AA-rebuild-detection>
|
||
|
||
use std::collections::hash_map::{Entry, HashMap};
|
||
use std::convert::TryInto;
|
||
use std::env;
|
||
use std::hash::{self, Hasher};
|
||
use std::path::{Path, PathBuf};
|
||
use std::str;
|
||
use std::sync::{Arc, Mutex};
|
||
use std::time::SystemTime;
|
||
|
||
use anyhow::{bail, format_err};
|
||
use cargo_util::ProcessBuilder;
|
||
use filetime::FileTime;
|
||
use log::{debug, info};
|
||
use serde::de;
|
||
use serde::ser;
|
||
use serde::{Deserialize, Serialize};
|
||
|
||
use crate::core::compiler::unit_graph::UnitDep;
|
||
use crate::core::Package;
|
||
use crate::util;
|
||
use crate::util::errors::{CargoResult, CargoResultExt};
|
||
use crate::util::interning::InternedString;
|
||
use crate::util::paths;
|
||
use crate::util::{internal, path_args, profile};
|
||
|
||
use super::custom_build::BuildDeps;
|
||
use super::job::{Job, Work};
|
||
use super::{BuildContext, Context, FileFlavor, Unit};
|
||
|
||
/// Determines if a `unit` is up-to-date, and if not prepares necessary work to
|
||
/// update the persisted fingerprint.
|
||
///
|
||
/// This function will inspect `unit`, calculate a fingerprint for it, and then
|
||
/// return an appropriate `Job` to run. The returned `Job` will be a noop if
|
||
/// `unit` is considered "fresh", or if it was previously built and cached.
|
||
/// Otherwise the `Job` returned will write out the true fingerprint to the
|
||
/// filesystem, to be executed after the unit's work has completed.
|
||
///
|
||
/// The `force` flag is a way to force the `Job` to be "dirty", or always
|
||
/// update the fingerprint. **Beware using this flag** because it does not
|
||
/// transitively propagate throughout the dependency graph, it only forces this
|
||
/// one unit which is very unlikely to be what you want unless you're
|
||
/// exclusively talking about top-level units.
|
||
pub fn prepare_target(cx: &mut Context<'_, '_>, unit: &Unit, force: bool) -> CargoResult<Job> {
|
||
let _p = profile::start(format!(
|
||
"fingerprint: {} / {}",
|
||
unit.pkg.package_id(),
|
||
unit.target.name()
|
||
));
|
||
let bcx = cx.bcx;
|
||
let loc = cx.files().fingerprint_file_path(unit, "");
|
||
|
||
debug!("fingerprint at: {}", loc.display());
|
||
|
||
// Figure out if this unit is up to date. After calculating the fingerprint
|
||
// compare it to an old version, if any, and attempt to print diagnostic
|
||
// information about failed comparisons to aid in debugging.
|
||
let fingerprint = calculate(cx, unit)?;
|
||
let mtime_on_use = cx.bcx.config.cli_unstable().mtime_on_use;
|
||
let compare = compare_old_fingerprint(&loc, &*fingerprint, mtime_on_use);
|
||
log_compare(unit, &compare);
|
||
|
||
// If our comparison failed (e.g., we're going to trigger a rebuild of this
|
||
// crate), then we also ensure the source of the crate passes all
|
||
// verification checks before we build it.
|
||
//
|
||
// The `Source::verify` method is intended to allow sources to execute
|
||
// pre-build checks to ensure that the relevant source code is all
|
||
// up-to-date and as expected. This is currently used primarily for
|
||
// directory sources which will use this hook to perform an integrity check
|
||
// on all files in the source to ensure they haven't changed. If they have
|
||
// changed then an error is issued.
|
||
if compare.is_err() {
|
||
let source_id = unit.pkg.package_id().source_id();
|
||
let sources = bcx.packages.sources();
|
||
let source = sources
|
||
.get(source_id)
|
||
.ok_or_else(|| internal("missing package source"))?;
|
||
source.verify(unit.pkg.package_id())?;
|
||
}
|
||
|
||
if compare.is_ok() && !force {
|
||
return Ok(Job::new_fresh());
|
||
}
|
||
|
||
// Clear out the old fingerprint file if it exists. This protects when
|
||
// compilation is interrupted leaving a corrupt file. For example, a
|
||
// project with a lib.rs and integration test (two units):
|
||
//
|
||
// 1. Build the library and integration test.
|
||
// 2. Make a change to lib.rs (NOT the integration test).
|
||
// 3. Build the integration test, hit Ctrl-C while linking. With gcc, this
|
||
// will leave behind an incomplete executable (zero size, or partially
|
||
// written). NOTE: The library builds successfully, it is the linking
|
||
// of the integration test that we are interrupting.
|
||
// 4. Build the integration test again.
|
||
//
|
||
// Without the following line, then step 3 will leave a valid fingerprint
|
||
// on the disk. Then step 4 will think the integration test is "fresh"
|
||
// because:
|
||
//
|
||
// - There is a valid fingerprint hash on disk (written in step 1).
|
||
// - The mtime of the output file (the corrupt integration executable
|
||
// written in step 3) is newer than all of its dependencies.
|
||
// - The mtime of the integration test fingerprint dep-info file (written
|
||
// in step 1) is newer than the integration test's source files, because
|
||
// we haven't modified any of its source files.
|
||
//
|
||
// But the executable is corrupt and needs to be rebuilt. Clearing the
|
||
// fingerprint at step 3 ensures that Cargo never mistakes a partially
|
||
// written output as up-to-date.
|
||
if loc.exists() {
|
||
// Truncate instead of delete so that compare_old_fingerprint will
|
||
// still log the reason for the fingerprint failure instead of just
|
||
// reporting "failed to read fingerprint" during the next build if
|
||
// this build fails.
|
||
paths::write(&loc, b"")?;
|
||
}
|
||
|
||
let write_fingerprint = if unit.mode.is_run_custom_build() {
|
||
// For build scripts the `local` field of the fingerprint may change
|
||
// while we're executing it. For example it could be in the legacy
|
||
// "consider everything a dependency mode" and then we switch to "deps
|
||
// are explicitly specified" mode.
|
||
//
|
||
// To handle this movement we need to regenerate the `local` field of a
|
||
// build script's fingerprint after it's executed. We do this by
|
||
// using the `build_script_local_fingerprints` function which returns a
|
||
// thunk we can invoke on a foreign thread to calculate this.
|
||
let build_script_outputs = Arc::clone(&cx.build_script_outputs);
|
||
let metadata = cx.get_run_build_script_metadata(unit);
|
||
let (gen_local, _overridden) = build_script_local_fingerprints(cx, unit);
|
||
let output_path = cx.build_explicit_deps[unit].build_script_output.clone();
|
||
Work::new(move |_| {
|
||
let outputs = build_script_outputs.lock().unwrap();
|
||
let output = outputs
|
||
.get(metadata)
|
||
.expect("output must exist after running");
|
||
let deps = BuildDeps::new(&output_path, Some(output));
|
||
|
||
// FIXME: it's basically buggy that we pass `None` to `call_box`
|
||
// here. See documentation on `build_script_local_fingerprints`
|
||
// below for more information. Despite this just try to proceed and
|
||
// hobble along if it happens to return `Some`.
|
||
if let Some(new_local) = (gen_local)(&deps, None)? {
|
||
*fingerprint.local.lock().unwrap() = new_local;
|
||
}
|
||
|
||
write_fingerprint(&loc, &fingerprint)
|
||
})
|
||
} else {
|
||
Work::new(move |_| write_fingerprint(&loc, &fingerprint))
|
||
};
|
||
|
||
Ok(Job::new_dirty(write_fingerprint))
|
||
}
|
||
|
||
/// Dependency edge information for fingerprints. This is generated for each
|
||
/// dependency and is stored in a `Fingerprint` below.
|
||
#[derive(Clone)]
|
||
struct DepFingerprint {
|
||
/// The hash of the package id that this dependency points to
|
||
pkg_id: u64,
|
||
/// The crate name we're using for this dependency, which if we change we'll
|
||
/// need to recompile!
|
||
name: InternedString,
|
||
/// Whether or not this dependency is flagged as a public dependency or not.
|
||
public: bool,
|
||
/// Whether or not this dependency is an rmeta dependency or a "full"
|
||
/// dependency. In the case of an rmeta dependency our dependency edge only
|
||
/// actually requires the rmeta from what we depend on, so when checking
|
||
/// mtime information all files other than the rmeta can be ignored.
|
||
only_requires_rmeta: bool,
|
||
/// The dependency's fingerprint we recursively point to, containing all the
|
||
/// other hash information we'd otherwise need.
|
||
fingerprint: Arc<Fingerprint>,
|
||
}
|
||
|
||
/// A fingerprint can be considered to be a "short string" representing the
|
||
/// state of a world for a package.
|
||
///
|
||
/// If a fingerprint ever changes, then the package itself needs to be
|
||
/// recompiled. Inputs to the fingerprint include source code modifications,
|
||
/// compiler flags, compiler version, etc. This structure is not simply a
|
||
/// `String` due to the fact that some fingerprints cannot be calculated lazily.
|
||
///
|
||
/// Path sources, for example, use the mtime of the corresponding dep-info file
|
||
/// as a fingerprint (all source files must be modified *before* this mtime).
|
||
/// This dep-info file is not generated, however, until after the crate is
|
||
/// compiled. As a result, this structure can be thought of as a fingerprint
|
||
/// to-be. The actual value can be calculated via `hash()`, but the operation
|
||
/// may fail as some files may not have been generated.
|
||
///
|
||
/// Note that dependencies are taken into account for fingerprints because rustc
|
||
/// requires that whenever an upstream crate is recompiled that all downstream
|
||
/// dependents are also recompiled. This is typically tracked through
|
||
/// `DependencyQueue`, but it also needs to be retained here because Cargo can
|
||
/// be interrupted while executing, losing the state of the `DependencyQueue`
|
||
/// graph.
|
||
#[derive(Serialize, Deserialize)]
|
||
pub struct Fingerprint {
|
||
/// Hash of the version of `rustc` used.
|
||
rustc: u64,
|
||
/// Sorted list of cfg features enabled.
|
||
features: String,
|
||
/// Hash of the `Target` struct, including the target name,
|
||
/// package-relative source path, edition, etc.
|
||
target: u64,
|
||
/// Hash of the `Profile`, `CompileMode`, and any extra flags passed via
|
||
/// `cargo rustc` or `cargo rustdoc`.
|
||
profile: u64,
|
||
/// Hash of the path to the base source file. This is relative to the
|
||
/// workspace root for path members, or absolute for other sources.
|
||
path: u64,
|
||
/// Fingerprints of dependencies.
|
||
deps: Vec<DepFingerprint>,
|
||
/// Information about the inputs that affect this Unit (such as source
|
||
/// file mtimes or build script environment variables).
|
||
local: Mutex<Vec<LocalFingerprint>>,
|
||
/// Cached hash of the `Fingerprint` struct. Used to improve performance
|
||
/// for hashing.
|
||
#[serde(skip)]
|
||
memoized_hash: Mutex<Option<u64>>,
|
||
/// RUSTFLAGS/RUSTDOCFLAGS environment variable value (or config value).
|
||
rustflags: Vec<String>,
|
||
/// Hash of some metadata from the manifest, such as "authors", or
|
||
/// "description", which are exposed as environment variables during
|
||
/// compilation.
|
||
metadata: u64,
|
||
/// Hash of various config settings that change how things are compiled.
|
||
config: u64,
|
||
/// The rustc target. This is only relevant for `.json` files, otherwise
|
||
/// the metadata hash segregates the units.
|
||
compile_kind: u64,
|
||
/// Description of whether the filesystem status for this unit is up to date
|
||
/// or should be considered stale.
|
||
#[serde(skip)]
|
||
fs_status: FsStatus,
|
||
/// Files, relative to `target_root`, that are produced by the step that
|
||
/// this `Fingerprint` represents. This is used to detect when the whole
|
||
/// fingerprint is out of date if this is missing, or if previous
|
||
/// fingerprints output files are regenerated and look newer than this one.
|
||
#[serde(skip)]
|
||
outputs: Vec<PathBuf>,
|
||
}
|
||
|
||
/// Indication of the status on the filesystem for a particular unit.
|
||
enum FsStatus {
|
||
/// This unit is to be considered stale, even if hash information all
|
||
/// matches. The filesystem inputs have changed (or are missing) and the
|
||
/// unit needs to subsequently be recompiled.
|
||
Stale,
|
||
|
||
/// This unit is up-to-date. All outputs and their corresponding mtime are
|
||
/// listed in the payload here for other dependencies to compare against.
|
||
UpToDate { mtimes: HashMap<PathBuf, FileTime> },
|
||
}
|
||
|
||
impl FsStatus {
|
||
fn up_to_date(&self) -> bool {
|
||
match self {
|
||
FsStatus::UpToDate { .. } => true,
|
||
FsStatus::Stale => false,
|
||
}
|
||
}
|
||
}
|
||
|
||
impl Default for FsStatus {
|
||
fn default() -> FsStatus {
|
||
FsStatus::Stale
|
||
}
|
||
}
|
||
|
||
impl Serialize for DepFingerprint {
|
||
fn serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error>
|
||
where
|
||
S: ser::Serializer,
|
||
{
|
||
(
|
||
&self.pkg_id,
|
||
&self.name,
|
||
&self.public,
|
||
&self.fingerprint.hash(),
|
||
)
|
||
.serialize(ser)
|
||
}
|
||
}
|
||
|
||
impl<'de> Deserialize<'de> for DepFingerprint {
|
||
fn deserialize<D>(d: D) -> Result<DepFingerprint, D::Error>
|
||
where
|
||
D: de::Deserializer<'de>,
|
||
{
|
||
let (pkg_id, name, public, hash) = <(u64, String, bool, u64)>::deserialize(d)?;
|
||
Ok(DepFingerprint {
|
||
pkg_id,
|
||
name: InternedString::new(&name),
|
||
public,
|
||
fingerprint: Arc::new(Fingerprint {
|
||
memoized_hash: Mutex::new(Some(hash)),
|
||
..Fingerprint::new()
|
||
}),
|
||
// This field is never read since it's only used in
|
||
// `check_filesystem` which isn't used by fingerprints loaded from
|
||
// disk.
|
||
only_requires_rmeta: false,
|
||
})
|
||
}
|
||
}
|
||
|
||
/// A `LocalFingerprint` represents something that we use to detect direct
|
||
/// changes to a `Fingerprint`.
|
||
///
|
||
/// This is where we track file information, env vars, etc. This
|
||
/// `LocalFingerprint` struct is hashed and if the hash changes will force a
|
||
/// recompile of any fingerprint it's included into. Note that the "local"
|
||
/// terminology comes from the fact that it only has to do with one crate, and
|
||
/// `Fingerprint` tracks the transitive propagation of fingerprint changes.
|
||
///
|
||
/// Note that because this is hashed its contents are carefully managed. Like
|
||
/// mentioned in the above module docs, we don't want to hash absolute paths or
|
||
/// mtime information.
|
||
///
|
||
/// Also note that a `LocalFingerprint` is used in `check_filesystem` to detect
|
||
/// when the filesystem contains stale information (based on mtime currently).
|
||
/// The paths here don't change much between compilations but they're used as
|
||
/// inputs when we probe the filesystem looking at information.
|
||
#[derive(Debug, Serialize, Deserialize, Hash)]
|
||
enum LocalFingerprint {
|
||
/// This is a precalculated fingerprint which has an opaque string we just
|
||
/// hash as usual. This variant is primarily used for rustdoc where we
|
||
/// don't have a dep-info file to compare against.
|
||
///
|
||
/// This is also used for build scripts with no `rerun-if-*` statements, but
|
||
/// that's overall a mistake and causes bugs in Cargo. We shouldn't use this
|
||
/// for build scripts.
|
||
Precalculated(String),
|
||
|
||
/// This is used for crate compilations. The `dep_info` file is a relative
|
||
/// path anchored at `target_root(...)` to the dep-info file that Cargo
|
||
/// generates (which is a custom serialization after parsing rustc's own
|
||
/// `dep-info` output).
|
||
///
|
||
/// The `dep_info` file, when present, also lists a number of other files
|
||
/// for us to look at. If any of those files are newer than this file then
|
||
/// we need to recompile.
|
||
CheckDepInfo { dep_info: PathBuf },
|
||
|
||
/// This represents a nonempty set of `rerun-if-changed` annotations printed
|
||
/// out by a build script. The `output` file is a relative file anchored at
|
||
/// `target_root(...)` which is the actual output of the build script. That
|
||
/// output has already been parsed and the paths printed out via
|
||
/// `rerun-if-changed` are listed in `paths`. The `paths` field is relative
|
||
/// to `pkg.root()`
|
||
///
|
||
/// This is considered up-to-date if all of the `paths` are older than
|
||
/// `output`, otherwise we need to recompile.
|
||
RerunIfChanged {
|
||
output: PathBuf,
|
||
paths: Vec<PathBuf>,
|
||
},
|
||
|
||
/// This represents a single `rerun-if-env-changed` annotation printed by a
|
||
/// build script. The exact env var and value are hashed here. There's no
|
||
/// filesystem dependence here, and if the values are changed the hash will
|
||
/// change forcing a recompile.
|
||
RerunIfEnvChanged { var: String, val: Option<String> },
|
||
}
|
||
|
||
enum StaleItem {
|
||
MissingFile(PathBuf),
|
||
ChangedFile {
|
||
reference: PathBuf,
|
||
reference_mtime: FileTime,
|
||
stale: PathBuf,
|
||
stale_mtime: FileTime,
|
||
},
|
||
ChangedEnv {
|
||
var: String,
|
||
previous: Option<String>,
|
||
current: Option<String>,
|
||
},
|
||
}
|
||
|
||
impl LocalFingerprint {
|
||
/// Checks dynamically at runtime if this `LocalFingerprint` has a stale
|
||
/// item inside of it.
|
||
///
|
||
/// The main purpose of this function is to handle two different ways
|
||
/// fingerprints can be invalidated:
|
||
///
|
||
/// * One is a dependency listed in rustc's dep-info files is invalid. Note
|
||
/// that these could either be env vars or files. We check both here.
|
||
///
|
||
/// * Another is the `rerun-if-changed` directive from build scripts. This
|
||
/// is where we'll find whether files have actually changed
|
||
fn find_stale_item(
|
||
&self,
|
||
mtime_cache: &mut HashMap<PathBuf, FileTime>,
|
||
pkg_root: &Path,
|
||
target_root: &Path,
|
||
) -> CargoResult<Option<StaleItem>> {
|
||
match self {
|
||
// We need to parse `dep_info`, learn about the crate's dependencies.
|
||
//
|
||
// For each env var we see if our current process's env var still
|
||
// matches, and for each file we see if any of them are newer than
|
||
// the `dep_info` file itself whose mtime represents the start of
|
||
// rustc.
|
||
LocalFingerprint::CheckDepInfo { dep_info } => {
|
||
let dep_info = target_root.join(dep_info);
|
||
let info = match parse_dep_info(pkg_root, target_root, &dep_info)? {
|
||
Some(info) => info,
|
||
None => return Ok(Some(StaleItem::MissingFile(dep_info))),
|
||
};
|
||
for (key, previous) in info.env.iter() {
|
||
let current = env::var(key).ok();
|
||
if current == *previous {
|
||
continue;
|
||
}
|
||
return Ok(Some(StaleItem::ChangedEnv {
|
||
var: key.clone(),
|
||
previous: previous.clone(),
|
||
current,
|
||
}));
|
||
}
|
||
Ok(find_stale_file(mtime_cache, &dep_info, info.files.iter()))
|
||
}
|
||
|
||
// We need to verify that no paths listed in `paths` are newer than
|
||
// the `output` path itself, or the last time the build script ran.
|
||
LocalFingerprint::RerunIfChanged { output, paths } => Ok(find_stale_file(
|
||
mtime_cache,
|
||
&target_root.join(output),
|
||
paths.iter().map(|p| pkg_root.join(p)),
|
||
)),
|
||
|
||
// These have no dependencies on the filesystem, and their values
|
||
// are included natively in the `Fingerprint` hash so nothing
|
||
// tocheck for here.
|
||
LocalFingerprint::RerunIfEnvChanged { .. } => Ok(None),
|
||
LocalFingerprint::Precalculated(..) => Ok(None),
|
||
}
|
||
}
|
||
|
||
fn kind(&self) -> &'static str {
|
||
match self {
|
||
LocalFingerprint::Precalculated(..) => "precalculated",
|
||
LocalFingerprint::CheckDepInfo { .. } => "dep-info",
|
||
LocalFingerprint::RerunIfChanged { .. } => "rerun-if-changed",
|
||
LocalFingerprint::RerunIfEnvChanged { .. } => "rerun-if-env-changed",
|
||
}
|
||
}
|
||
}
|
||
|
||
#[derive(Debug)]
|
||
struct MtimeSlot(Mutex<Option<FileTime>>);
|
||
|
||
impl Fingerprint {
|
||
fn new() -> Fingerprint {
|
||
Fingerprint {
|
||
rustc: 0,
|
||
target: 0,
|
||
profile: 0,
|
||
path: 0,
|
||
features: String::new(),
|
||
deps: Vec::new(),
|
||
local: Mutex::new(Vec::new()),
|
||
memoized_hash: Mutex::new(None),
|
||
rustflags: Vec::new(),
|
||
metadata: 0,
|
||
config: 0,
|
||
compile_kind: 0,
|
||
fs_status: FsStatus::Stale,
|
||
outputs: Vec::new(),
|
||
}
|
||
}
|
||
|
||
/// For performance reasons fingerprints will memoize their own hash, but
|
||
/// there's also internal mutability with its `local` field which can
|
||
/// change, for example with build scripts, during a build.
|
||
///
|
||
/// This method can be used to bust all memoized hashes just before a build
|
||
/// to ensure that after a build completes everything is up-to-date.
|
||
pub fn clear_memoized(&self) {
|
||
*self.memoized_hash.lock().unwrap() = None;
|
||
}
|
||
|
||
fn hash(&self) -> u64 {
|
||
if let Some(s) = *self.memoized_hash.lock().unwrap() {
|
||
return s;
|
||
}
|
||
let ret = util::hash_u64(self);
|
||
*self.memoized_hash.lock().unwrap() = Some(ret);
|
||
ret
|
||
}
|
||
|
||
/// Compares this fingerprint with an old version which was previously
|
||
/// serialized to filesystem.
|
||
///
|
||
/// The purpose of this is exclusively to produce a diagnostic message
|
||
/// indicating why we're recompiling something. This function always returns
|
||
/// an error, it will never return success.
|
||
fn compare(&self, old: &Fingerprint) -> CargoResult<()> {
|
||
if self.rustc != old.rustc {
|
||
bail!("rust compiler has changed")
|
||
}
|
||
if self.features != old.features {
|
||
bail!(
|
||
"features have changed: previously {}, now {}",
|
||
old.features,
|
||
self.features
|
||
)
|
||
}
|
||
if self.target != old.target {
|
||
bail!("target configuration has changed")
|
||
}
|
||
if self.path != old.path {
|
||
bail!("path to the source has changed")
|
||
}
|
||
if self.profile != old.profile {
|
||
bail!("profile configuration has changed")
|
||
}
|
||
if self.rustflags != old.rustflags {
|
||
bail!(
|
||
"RUSTFLAGS has changed: previously {:?}, now {:?}",
|
||
old.rustflags,
|
||
self.rustflags
|
||
)
|
||
}
|
||
if self.metadata != old.metadata {
|
||
bail!("metadata changed")
|
||
}
|
||
if self.config != old.config {
|
||
bail!("configuration settings have changed")
|
||
}
|
||
if self.compile_kind != old.compile_kind {
|
||
bail!("compile kind (rustc target) changed")
|
||
}
|
||
let my_local = self.local.lock().unwrap();
|
||
let old_local = old.local.lock().unwrap();
|
||
if my_local.len() != old_local.len() {
|
||
bail!("local lens changed");
|
||
}
|
||
for (new, old) in my_local.iter().zip(old_local.iter()) {
|
||
match (new, old) {
|
||
(LocalFingerprint::Precalculated(a), LocalFingerprint::Precalculated(b)) => {
|
||
if a != b {
|
||
bail!(
|
||
"precalculated components have changed: previously {}, now {}",
|
||
b,
|
||
a
|
||
)
|
||
}
|
||
}
|
||
(
|
||
LocalFingerprint::CheckDepInfo { dep_info: adep },
|
||
LocalFingerprint::CheckDepInfo { dep_info: bdep },
|
||
) => {
|
||
if adep != bdep {
|
||
bail!(
|
||
"dep info output changed: previously {:?}, now {:?}",
|
||
bdep,
|
||
adep
|
||
)
|
||
}
|
||
}
|
||
(
|
||
LocalFingerprint::RerunIfChanged {
|
||
output: aout,
|
||
paths: apaths,
|
||
},
|
||
LocalFingerprint::RerunIfChanged {
|
||
output: bout,
|
||
paths: bpaths,
|
||
},
|
||
) => {
|
||
if aout != bout {
|
||
bail!(
|
||
"rerun-if-changed output changed: previously {:?}, now {:?}",
|
||
bout,
|
||
aout
|
||
)
|
||
}
|
||
if apaths != bpaths {
|
||
bail!(
|
||
"rerun-if-changed output changed: previously {:?}, now {:?}",
|
||
bpaths,
|
||
apaths,
|
||
)
|
||
}
|
||
}
|
||
(
|
||
LocalFingerprint::RerunIfEnvChanged {
|
||
var: akey,
|
||
val: avalue,
|
||
},
|
||
LocalFingerprint::RerunIfEnvChanged {
|
||
var: bkey,
|
||
val: bvalue,
|
||
},
|
||
) => {
|
||
if *akey != *bkey {
|
||
bail!("env vars changed: previously {}, now {}", bkey, akey);
|
||
}
|
||
if *avalue != *bvalue {
|
||
bail!(
|
||
"env var `{}` changed: previously {:?}, now {:?}",
|
||
akey,
|
||
bvalue,
|
||
avalue
|
||
)
|
||
}
|
||
}
|
||
(a, b) => bail!(
|
||
"local fingerprint type has changed ({} => {})",
|
||
b.kind(),
|
||
a.kind()
|
||
),
|
||
}
|
||
}
|
||
|
||
if self.deps.len() != old.deps.len() {
|
||
bail!("number of dependencies has changed")
|
||
}
|
||
for (a, b) in self.deps.iter().zip(old.deps.iter()) {
|
||
if a.name != b.name {
|
||
let e = format_err!("`{}` != `{}`", a.name, b.name)
|
||
.context("unit dependency name changed");
|
||
return Err(e);
|
||
}
|
||
|
||
if a.fingerprint.hash() != b.fingerprint.hash() {
|
||
let e = format_err!(
|
||
"new ({}/{:x}) != old ({}/{:x})",
|
||
a.name,
|
||
a.fingerprint.hash(),
|
||
b.name,
|
||
b.fingerprint.hash()
|
||
)
|
||
.context("unit dependency information changed");
|
||
return Err(e);
|
||
}
|
||
}
|
||
|
||
if !self.fs_status.up_to_date() {
|
||
bail!("current filesystem status shows we're outdated");
|
||
}
|
||
|
||
// This typically means some filesystem modifications happened or
|
||
// something transitive was odd. In general we should strive to provide
|
||
// a better error message than this, so if you see this message a lot it
|
||
// likely means this method needs to be updated!
|
||
bail!("two fingerprint comparison turned up nothing obvious");
|
||
}
|
||
|
||
/// Dynamically inspect the local filesystem to update the `fs_status` field
|
||
/// of this `Fingerprint`.
|
||
///
|
||
/// This function is used just after a `Fingerprint` is constructed to check
|
||
/// the local state of the filesystem and propagate any dirtiness from
|
||
/// dependencies up to this unit as well. This function assumes that the
|
||
/// unit starts out as `FsStatus::Stale` and then it will optionally switch
|
||
/// it to `UpToDate` if it can.
|
||
fn check_filesystem(
|
||
&mut self,
|
||
mtime_cache: &mut HashMap<PathBuf, FileTime>,
|
||
pkg_root: &Path,
|
||
target_root: &Path,
|
||
) -> CargoResult<()> {
|
||
assert!(!self.fs_status.up_to_date());
|
||
|
||
let mut mtimes = HashMap::new();
|
||
|
||
// Get the `mtime` of all outputs. Optionally update their mtime
|
||
// afterwards based on the `mtime_on_use` flag. Afterwards we want the
|
||
// minimum mtime as it's the one we'll be comparing to inputs and
|
||
// dependencies.
|
||
for output in self.outputs.iter() {
|
||
let mtime = match paths::mtime(output) {
|
||
Ok(mtime) => mtime,
|
||
|
||
// This path failed to report its `mtime`. It probably doesn't
|
||
// exists, so leave ourselves as stale and bail out.
|
||
Err(e) => {
|
||
debug!("failed to get mtime of {:?}: {}", output, e);
|
||
return Ok(());
|
||
}
|
||
};
|
||
assert!(mtimes.insert(output.clone(), mtime).is_none());
|
||
}
|
||
|
||
let opt_max = mtimes.iter().max_by_key(|kv| kv.1);
|
||
let (max_path, max_mtime) = match opt_max {
|
||
Some(mtime) => mtime,
|
||
|
||
// We had no output files. This means we're an overridden build
|
||
// script and we're just always up to date because we aren't
|
||
// watching the filesystem.
|
||
None => {
|
||
self.fs_status = FsStatus::UpToDate { mtimes };
|
||
return Ok(());
|
||
}
|
||
};
|
||
debug!(
|
||
"max output mtime for {:?} is {:?} {}",
|
||
pkg_root, max_path, max_mtime
|
||
);
|
||
|
||
for dep in self.deps.iter() {
|
||
let dep_mtimes = match &dep.fingerprint.fs_status {
|
||
FsStatus::UpToDate { mtimes } => mtimes,
|
||
// If our dependency is stale, so are we, so bail out.
|
||
FsStatus::Stale => return Ok(()),
|
||
};
|
||
|
||
// If our dependency edge only requires the rmeta file to be present
|
||
// then we only need to look at that one output file, otherwise we
|
||
// need to consider all output files to see if we're out of date.
|
||
let (dep_path, dep_mtime) = if dep.only_requires_rmeta {
|
||
dep_mtimes
|
||
.iter()
|
||
.find(|(path, _mtime)| {
|
||
path.extension().and_then(|s| s.to_str()) == Some("rmeta")
|
||
})
|
||
.expect("failed to find rmeta")
|
||
} else {
|
||
match dep_mtimes.iter().max_by_key(|kv| kv.1) {
|
||
Some(dep_mtime) => dep_mtime,
|
||
// If our dependencies is up to date and has no filesystem
|
||
// interactions, then we can move on to the next dependency.
|
||
None => continue,
|
||
}
|
||
};
|
||
debug!(
|
||
"max dep mtime for {:?} is {:?} {}",
|
||
pkg_root, dep_path, dep_mtime
|
||
);
|
||
|
||
// If the dependency is newer than our own output then it was
|
||
// recompiled previously. We transitively become stale ourselves in
|
||
// that case, so bail out.
|
||
//
|
||
// Note that this comparison should probably be `>=`, not `>`, but
|
||
// for a discussion of why it's `>` see the discussion about #5918
|
||
// below in `find_stale`.
|
||
if dep_mtime > max_mtime {
|
||
info!(
|
||
"dependency on `{}` is newer than we are {} > {} {:?}",
|
||
dep.name, dep_mtime, max_mtime, pkg_root
|
||
);
|
||
return Ok(());
|
||
}
|
||
}
|
||
|
||
// If we reached this far then all dependencies are up to date. Check
|
||
// all our `LocalFingerprint` information to see if we have any stale
|
||
// files for this package itself. If we do find something log a helpful
|
||
// message and bail out so we stay stale.
|
||
for local in self.local.get_mut().unwrap().iter() {
|
||
if let Some(item) = local.find_stale_item(mtime_cache, pkg_root, target_root)? {
|
||
item.log();
|
||
return Ok(());
|
||
}
|
||
}
|
||
|
||
// Everything was up to date! Record such.
|
||
self.fs_status = FsStatus::UpToDate { mtimes };
|
||
debug!("filesystem up-to-date {:?}", pkg_root);
|
||
|
||
Ok(())
|
||
}
|
||
}
|
||
|
||
impl hash::Hash for Fingerprint {
|
||
fn hash<H: Hasher>(&self, h: &mut H) {
|
||
let Fingerprint {
|
||
rustc,
|
||
ref features,
|
||
target,
|
||
path,
|
||
profile,
|
||
ref deps,
|
||
ref local,
|
||
metadata,
|
||
config,
|
||
compile_kind,
|
||
ref rustflags,
|
||
..
|
||
} = *self;
|
||
let local = local.lock().unwrap();
|
||
(
|
||
rustc,
|
||
features,
|
||
target,
|
||
path,
|
||
profile,
|
||
&*local,
|
||
metadata,
|
||
config,
|
||
compile_kind,
|
||
rustflags,
|
||
)
|
||
.hash(h);
|
||
|
||
h.write_usize(deps.len());
|
||
for DepFingerprint {
|
||
pkg_id,
|
||
name,
|
||
public,
|
||
fingerprint,
|
||
only_requires_rmeta: _, // static property, no need to hash
|
||
} in deps
|
||
{
|
||
pkg_id.hash(h);
|
||
name.hash(h);
|
||
public.hash(h);
|
||
// use memoized dep hashes to avoid exponential blowup
|
||
h.write_u64(Fingerprint::hash(fingerprint));
|
||
}
|
||
}
|
||
}
|
||
|
||
impl hash::Hash for MtimeSlot {
|
||
fn hash<H: Hasher>(&self, h: &mut H) {
|
||
self.0.lock().unwrap().hash(h)
|
||
}
|
||
}
|
||
|
||
impl ser::Serialize for MtimeSlot {
|
||
fn serialize<S>(&self, s: S) -> Result<S::Ok, S::Error>
|
||
where
|
||
S: ser::Serializer,
|
||
{
|
||
self.0
|
||
.lock()
|
||
.unwrap()
|
||
.map(|ft| (ft.unix_seconds(), ft.nanoseconds()))
|
||
.serialize(s)
|
||
}
|
||
}
|
||
|
||
impl<'de> de::Deserialize<'de> for MtimeSlot {
|
||
fn deserialize<D>(d: D) -> Result<MtimeSlot, D::Error>
|
||
where
|
||
D: de::Deserializer<'de>,
|
||
{
|
||
let kind: Option<(i64, u32)> = de::Deserialize::deserialize(d)?;
|
||
Ok(MtimeSlot(Mutex::new(
|
||
kind.map(|(s, n)| FileTime::from_unix_time(s, n)),
|
||
)))
|
||
}
|
||
}
|
||
|
||
impl DepFingerprint {
|
||
fn new(cx: &mut Context<'_, '_>, parent: &Unit, dep: &UnitDep) -> CargoResult<DepFingerprint> {
|
||
let fingerprint = calculate(cx, &dep.unit)?;
|
||
// We need to be careful about what we hash here. We have a goal of
|
||
// supporting renaming a project directory and not rebuilding
|
||
// everything. To do that, however, we need to make sure that the cwd
|
||
// doesn't make its way into any hashes, and one source of that is the
|
||
// `SourceId` for `path` packages.
|
||
//
|
||
// We already have a requirement that `path` packages all have unique
|
||
// names (sort of for this same reason), so if the package source is a
|
||
// `path` then we just hash the name, but otherwise we hash the full
|
||
// id as it won't change when the directory is renamed.
|
||
let pkg_id = if dep.unit.pkg.package_id().source_id().is_path() {
|
||
util::hash_u64(dep.unit.pkg.package_id().name())
|
||
} else {
|
||
util::hash_u64(dep.unit.pkg.package_id())
|
||
};
|
||
|
||
Ok(DepFingerprint {
|
||
pkg_id,
|
||
name: dep.extern_crate_name,
|
||
public: dep.public,
|
||
fingerprint,
|
||
only_requires_rmeta: cx.only_requires_rmeta(parent, &dep.unit),
|
||
})
|
||
}
|
||
}
|
||
|
||
impl StaleItem {
|
||
/// Use the `log` crate to log a hopefully helpful message in diagnosing
|
||
/// what file is considered stale and why. This is intended to be used in
|
||
/// conjunction with `CARGO_LOG` to determine why Cargo is recompiling
|
||
/// something. Currently there's no user-facing usage of this other than
|
||
/// that.
|
||
fn log(&self) {
|
||
match self {
|
||
StaleItem::MissingFile(path) => {
|
||
info!("stale: missing {:?}", path);
|
||
}
|
||
StaleItem::ChangedFile {
|
||
reference,
|
||
reference_mtime,
|
||
stale,
|
||
stale_mtime,
|
||
} => {
|
||
info!("stale: changed {:?}", stale);
|
||
info!(" (vs) {:?}", reference);
|
||
info!(" {:?} != {:?}", reference_mtime, stale_mtime);
|
||
}
|
||
StaleItem::ChangedEnv {
|
||
var,
|
||
previous,
|
||
current,
|
||
} => {
|
||
info!("stale: changed env {:?}", var);
|
||
info!(" {:?} != {:?}", previous, current);
|
||
}
|
||
}
|
||
}
|
||
}
|
||
|
||
/// Calculates the fingerprint for a `unit`.
|
||
///
|
||
/// This fingerprint is used by Cargo to learn about when information such as:
|
||
///
|
||
/// * A non-path package changes (changes version, changes revision, etc).
|
||
/// * Any dependency changes
|
||
/// * The compiler changes
|
||
/// * The set of features a package is built with changes
|
||
/// * The profile a target is compiled with changes (e.g., opt-level changes)
|
||
/// * Any other compiler flags change that will affect the result
|
||
///
|
||
/// Information like file modification time is only calculated for path
|
||
/// dependencies.
|
||
fn calculate(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Arc<Fingerprint>> {
|
||
// This function is slammed quite a lot, so the result is memoized.
|
||
if let Some(s) = cx.fingerprints.get(unit) {
|
||
return Ok(Arc::clone(s));
|
||
}
|
||
let mut fingerprint = if unit.mode.is_run_custom_build() {
|
||
calculate_run_custom_build(cx, unit)?
|
||
} else if unit.mode.is_doc_test() {
|
||
panic!("doc tests do not fingerprint");
|
||
} else {
|
||
calculate_normal(cx, unit)?
|
||
};
|
||
|
||
// After we built the initial `Fingerprint` be sure to update the
|
||
// `fs_status` field of it.
|
||
let target_root = target_root(cx);
|
||
fingerprint.check_filesystem(&mut cx.mtime_cache, unit.pkg.root(), &target_root)?;
|
||
|
||
let fingerprint = Arc::new(fingerprint);
|
||
cx.fingerprints
|
||
.insert(unit.clone(), Arc::clone(&fingerprint));
|
||
Ok(fingerprint)
|
||
}
|
||
|
||
/// Calculate a fingerprint for a "normal" unit, or anything that's not a build
|
||
/// script. This is an internal helper of `calculate`, don't call directly.
|
||
fn calculate_normal(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Fingerprint> {
|
||
// Recursively calculate the fingerprint for all of our dependencies.
|
||
//
|
||
// Skip fingerprints of binaries because they don't actually induce a
|
||
// recompile, they're just dependencies in the sense that they need to be
|
||
// built.
|
||
//
|
||
// Create Vec since mutable cx is needed in closure.
|
||
let deps = Vec::from(cx.unit_deps(unit));
|
||
let mut deps = deps
|
||
.into_iter()
|
||
.filter(|dep| !dep.unit.target.is_bin())
|
||
.map(|dep| DepFingerprint::new(cx, unit, &dep))
|
||
.collect::<CargoResult<Vec<_>>>()?;
|
||
deps.sort_by(|a, b| a.pkg_id.cmp(&b.pkg_id));
|
||
|
||
// Afterwards calculate our own fingerprint information.
|
||
let target_root = target_root(cx);
|
||
let local = if unit.mode.is_doc() {
|
||
// rustdoc does not have dep-info files.
|
||
let fingerprint = pkg_fingerprint(cx.bcx, &unit.pkg).chain_err(|| {
|
||
format!(
|
||
"failed to determine package fingerprint for documenting {}",
|
||
unit.pkg
|
||
)
|
||
})?;
|
||
vec![LocalFingerprint::Precalculated(fingerprint)]
|
||
} else {
|
||
let dep_info = dep_info_loc(cx, unit);
|
||
let dep_info = dep_info.strip_prefix(&target_root).unwrap().to_path_buf();
|
||
vec![LocalFingerprint::CheckDepInfo { dep_info }]
|
||
};
|
||
|
||
// Figure out what the outputs of our unit is, and we'll be storing them
|
||
// into the fingerprint as well.
|
||
let outputs = cx
|
||
.outputs(unit)?
|
||
.iter()
|
||
.filter(|output| !matches!(output.flavor, FileFlavor::DebugInfo | FileFlavor::Auxiliary))
|
||
.map(|output| output.path.clone())
|
||
.collect();
|
||
|
||
// Fill out a bunch more information that we'll be tracking typically
|
||
// hashed to take up less space on disk as we just need to know when things
|
||
// change.
|
||
let extra_flags = if unit.mode.is_doc() {
|
||
cx.bcx.rustdocflags_args(unit)
|
||
} else {
|
||
cx.bcx.rustflags_args(unit)
|
||
}
|
||
.to_vec();
|
||
|
||
let profile_hash = util::hash_u64((
|
||
&unit.profile,
|
||
unit.mode,
|
||
cx.bcx.extra_args_for(unit),
|
||
cx.lto[unit],
|
||
));
|
||
// Include metadata since it is exposed as environment variables.
|
||
let m = unit.pkg.manifest().metadata();
|
||
let metadata = util::hash_u64((&m.authors, &m.description, &m.homepage, &m.repository));
|
||
let config = if unit.mode.is_doc() && cx.bcx.config.cli_unstable().rustdoc_map {
|
||
cx.bcx
|
||
.config
|
||
.doc_extern_map()
|
||
.map_or(0, |map| util::hash_u64(map))
|
||
} else {
|
||
0
|
||
};
|
||
let compile_kind = unit.kind.fingerprint_hash();
|
||
Ok(Fingerprint {
|
||
rustc: util::hash_u64(&cx.bcx.rustc().verbose_version),
|
||
target: util::hash_u64(&unit.target),
|
||
profile: profile_hash,
|
||
// Note that .0 is hashed here, not .1 which is the cwd. That doesn't
|
||
// actually affect the output artifact so there's no need to hash it.
|
||
path: util::hash_u64(path_args(cx.bcx.ws, unit).0),
|
||
features: format!("{:?}", unit.features),
|
||
deps,
|
||
local: Mutex::new(local),
|
||
memoized_hash: Mutex::new(None),
|
||
metadata,
|
||
config,
|
||
compile_kind,
|
||
rustflags: extra_flags,
|
||
fs_status: FsStatus::Stale,
|
||
outputs,
|
||
})
|
||
}
|
||
|
||
/// Calculate a fingerprint for an "execute a build script" unit. This is an
|
||
/// internal helper of `calculate`, don't call directly.
|
||
fn calculate_run_custom_build(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<Fingerprint> {
|
||
assert!(unit.mode.is_run_custom_build());
|
||
// Using the `BuildDeps` information we'll have previously parsed and
|
||
// inserted into `build_explicit_deps` built an initial snapshot of the
|
||
// `LocalFingerprint` list for this build script. If we previously executed
|
||
// the build script this means we'll be watching files and env vars.
|
||
// Otherwise if we haven't previously executed it we'll just start watching
|
||
// the whole crate.
|
||
let (gen_local, overridden) = build_script_local_fingerprints(cx, unit);
|
||
let deps = &cx.build_explicit_deps[unit];
|
||
let local = (gen_local)(
|
||
deps,
|
||
Some(&|| {
|
||
pkg_fingerprint(cx.bcx, &unit.pkg).chain_err(|| {
|
||
format!(
|
||
"failed to determine package fingerprint for build script for {}",
|
||
unit.pkg
|
||
)
|
||
})
|
||
}),
|
||
)?
|
||
.unwrap();
|
||
let output = deps.build_script_output.clone();
|
||
|
||
// Include any dependencies of our execution, which is typically just the
|
||
// compilation of the build script itself. (if the build script changes we
|
||
// should be rerun!). Note though that if we're an overridden build script
|
||
// we have no dependencies so no need to recurse in that case.
|
||
let deps = if overridden {
|
||
// Overridden build scripts don't need to track deps.
|
||
vec![]
|
||
} else {
|
||
// Create Vec since mutable cx is needed in closure.
|
||
let deps = Vec::from(cx.unit_deps(unit));
|
||
deps.into_iter()
|
||
.map(|dep| DepFingerprint::new(cx, unit, &dep))
|
||
.collect::<CargoResult<Vec<_>>>()?
|
||
};
|
||
|
||
Ok(Fingerprint {
|
||
local: Mutex::new(local),
|
||
rustc: util::hash_u64(&cx.bcx.rustc().verbose_version),
|
||
deps,
|
||
outputs: if overridden { Vec::new() } else { vec![output] },
|
||
|
||
// Most of the other info is blank here as we don't really include it
|
||
// in the execution of the build script, but... this may be a latent
|
||
// bug in Cargo.
|
||
..Fingerprint::new()
|
||
})
|
||
}
|
||
|
||
/// Get ready to compute the `LocalFingerprint` values for a `RunCustomBuild`
|
||
/// unit.
|
||
///
|
||
/// This function has, what's on the surface, a seriously wonky interface.
|
||
/// You'll call this function and it'll return a closure and a boolean. The
|
||
/// boolean is pretty simple in that it indicates whether the `unit` has been
|
||
/// overridden via `.cargo/config`. The closure is much more complicated.
|
||
///
|
||
/// This closure is intended to capture any local state necessary to compute
|
||
/// the `LocalFingerprint` values for this unit. It is `Send` and `'static` to
|
||
/// be sent to other threads as well (such as when we're executing build
|
||
/// scripts). That deduplication is the rationale for the closure at least.
|
||
///
|
||
/// The arguments to the closure are a bit weirder, though, and I'll apologize
|
||
/// in advance for the weirdness too. The first argument to the closure is a
|
||
/// `&BuildDeps`. This is the parsed version of a build script, and when Cargo
|
||
/// starts up this is cached from previous runs of a build script. After a
|
||
/// build script executes the output file is reparsed and passed in here.
|
||
///
|
||
/// The second argument is the weirdest, it's *optionally* a closure to
|
||
/// call `pkg_fingerprint` below. The `pkg_fingerprint` below requires access
|
||
/// to "source map" located in `Context`. That's very non-`'static` and
|
||
/// non-`Send`, so it can't be used on other threads, such as when we invoke
|
||
/// this after a build script has finished. The `Option` allows us to for sure
|
||
/// calculate it on the main thread at the beginning, and then swallow the bug
|
||
/// for now where a worker thread after a build script has finished doesn't
|
||
/// have access. Ideally there would be no second argument or it would be more
|
||
/// "first class" and not an `Option` but something that can be sent between
|
||
/// threads. In any case, it's a bug for now.
|
||
///
|
||
/// This isn't the greatest of interfaces, and if there's suggestions to
|
||
/// improve please do so!
|
||
///
|
||
/// FIXME(#6779) - see all the words above
|
||
fn build_script_local_fingerprints(
|
||
cx: &mut Context<'_, '_>,
|
||
unit: &Unit,
|
||
) -> (
|
||
Box<
|
||
dyn FnOnce(
|
||
&BuildDeps,
|
||
Option<&dyn Fn() -> CargoResult<String>>,
|
||
) -> CargoResult<Option<Vec<LocalFingerprint>>>
|
||
+ Send,
|
||
>,
|
||
bool,
|
||
) {
|
||
assert!(unit.mode.is_run_custom_build());
|
||
// First up, if this build script is entirely overridden, then we just
|
||
// return the hash of what we overrode it with. This is the easy case!
|
||
if let Some(fingerprint) = build_script_override_fingerprint(cx, unit) {
|
||
debug!("override local fingerprints deps {}", unit.pkg);
|
||
return (
|
||
Box::new(
|
||
move |_: &BuildDeps, _: Option<&dyn Fn() -> CargoResult<String>>| {
|
||
Ok(Some(vec![fingerprint]))
|
||
},
|
||
),
|
||
true, // this is an overridden build script
|
||
);
|
||
}
|
||
|
||
// ... Otherwise this is a "real" build script and we need to return a real
|
||
// closure. Our returned closure classifies the build script based on
|
||
// whether it prints `rerun-if-*`. If it *doesn't* print this it's where the
|
||
// magical second argument comes into play, which fingerprints a whole
|
||
// package. Remember that the fact that this is an `Option` is a bug, but a
|
||
// longstanding bug, in Cargo. Recent refactorings just made it painfully
|
||
// obvious.
|
||
let pkg_root = unit.pkg.root().to_path_buf();
|
||
let target_dir = target_root(cx);
|
||
let calculate =
|
||
move |deps: &BuildDeps, pkg_fingerprint: Option<&dyn Fn() -> CargoResult<String>>| {
|
||
if deps.rerun_if_changed.is_empty() && deps.rerun_if_env_changed.is_empty() {
|
||
match pkg_fingerprint {
|
||
// FIXME: this is somewhat buggy with respect to docker and
|
||
// weird filesystems. The `Precalculated` variant
|
||
// constructed below will, for `path` dependencies, contain
|
||
// a stringified version of the mtime for the local crate.
|
||
// This violates one of the things we describe in this
|
||
// module's doc comment, never hashing mtimes. We should
|
||
// figure out a better scheme where a package fingerprint
|
||
// may be a string (like for a registry) or a list of files
|
||
// (like for a path dependency). Those list of files would
|
||
// be stored here rather than the the mtime of them.
|
||
Some(f) => {
|
||
let s = f()?;
|
||
debug!(
|
||
"old local fingerprints deps {:?} precalculated={:?}",
|
||
pkg_root, s
|
||
);
|
||
return Ok(Some(vec![LocalFingerprint::Precalculated(s)]));
|
||
}
|
||
None => return Ok(None),
|
||
}
|
||
}
|
||
|
||
// Ok so now we're in "new mode" where we can have files listed as
|
||
// dependencies as well as env vars listed as dependencies. Process
|
||
// them all here.
|
||
Ok(Some(local_fingerprints_deps(deps, &target_dir, &pkg_root)))
|
||
};
|
||
|
||
// Note that `false` == "not overridden"
|
||
(Box::new(calculate), false)
|
||
}
|
||
|
||
/// Create a `LocalFingerprint` for an overridden build script.
|
||
/// Returns None if it is not overridden.
|
||
fn build_script_override_fingerprint(
|
||
cx: &mut Context<'_, '_>,
|
||
unit: &Unit,
|
||
) -> Option<LocalFingerprint> {
|
||
// Build script output is only populated at this stage when it is
|
||
// overridden.
|
||
let build_script_outputs = cx.build_script_outputs.lock().unwrap();
|
||
let metadata = cx.get_run_build_script_metadata(unit);
|
||
// Returns None if it is not overridden.
|
||
let output = build_script_outputs.get(metadata)?;
|
||
let s = format!(
|
||
"overridden build state with hash: {}",
|
||
util::hash_u64(output)
|
||
);
|
||
Some(LocalFingerprint::Precalculated(s))
|
||
}
|
||
|
||
/// Compute the `LocalFingerprint` values for a `RunCustomBuild` unit for
|
||
/// non-overridden new-style build scripts only. This is only used when `deps`
|
||
/// is already known to have a nonempty `rerun-if-*` somewhere.
|
||
fn local_fingerprints_deps(
|
||
deps: &BuildDeps,
|
||
target_root: &Path,
|
||
pkg_root: &Path,
|
||
) -> Vec<LocalFingerprint> {
|
||
debug!("new local fingerprints deps {:?}", pkg_root);
|
||
let mut local = Vec::new();
|
||
|
||
if !deps.rerun_if_changed.is_empty() {
|
||
// Note that like the module comment above says we are careful to never
|
||
// store an absolute path in `LocalFingerprint`, so ensure that we strip
|
||
// absolute prefixes from them.
|
||
let output = deps
|
||
.build_script_output
|
||
.strip_prefix(target_root)
|
||
.unwrap()
|
||
.to_path_buf();
|
||
let paths = deps
|
||
.rerun_if_changed
|
||
.iter()
|
||
.map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
|
||
.collect();
|
||
local.push(LocalFingerprint::RerunIfChanged { output, paths });
|
||
}
|
||
|
||
for var in deps.rerun_if_env_changed.iter() {
|
||
let val = env::var(var).ok();
|
||
local.push(LocalFingerprint::RerunIfEnvChanged {
|
||
var: var.clone(),
|
||
val,
|
||
});
|
||
}
|
||
|
||
local
|
||
}
|
||
|
||
fn write_fingerprint(loc: &Path, fingerprint: &Fingerprint) -> CargoResult<()> {
|
||
debug_assert_ne!(fingerprint.rustc, 0);
|
||
// fingerprint::new().rustc == 0, make sure it doesn't make it to the file system.
|
||
// This is mostly so outside tools can reliably find out what rust version this file is for,
|
||
// as we can use the full hash.
|
||
let hash = fingerprint.hash();
|
||
debug!("write fingerprint ({:x}) : {}", hash, loc.display());
|
||
paths::write(loc, util::to_hex(hash).as_bytes())?;
|
||
|
||
let json = serde_json::to_string(fingerprint).unwrap();
|
||
if cfg!(debug_assertions) {
|
||
let f: Fingerprint = serde_json::from_str(&json).unwrap();
|
||
assert_eq!(f.hash(), hash);
|
||
}
|
||
paths::write(&loc.with_extension("json"), json.as_bytes())?;
|
||
Ok(())
|
||
}
|
||
|
||
/// Prepare for work when a package starts to build
|
||
pub fn prepare_init(cx: &mut Context<'_, '_>, unit: &Unit) -> CargoResult<()> {
|
||
let new1 = cx.files().fingerprint_dir(unit);
|
||
|
||
// Doc tests have no output, thus no fingerprint.
|
||
if !new1.exists() && !unit.mode.is_doc_test() {
|
||
paths::create_dir_all(&new1)?;
|
||
}
|
||
|
||
Ok(())
|
||
}
|
||
|
||
/// Returns the location that the dep-info file will show up at for the `unit`
|
||
/// specified.
|
||
pub fn dep_info_loc(cx: &mut Context<'_, '_>, unit: &Unit) -> PathBuf {
|
||
cx.files().fingerprint_file_path(unit, "dep-")
|
||
}
|
||
|
||
/// Returns an absolute path that target directory.
|
||
/// All paths are rewritten to be relative to this.
|
||
fn target_root(cx: &Context<'_, '_>) -> PathBuf {
|
||
cx.bcx.ws.target_dir().into_path_unlocked()
|
||
}
|
||
|
||
fn compare_old_fingerprint(
|
||
loc: &Path,
|
||
new_fingerprint: &Fingerprint,
|
||
mtime_on_use: bool,
|
||
) -> CargoResult<()> {
|
||
let old_fingerprint_short = paths::read(loc)?;
|
||
|
||
if mtime_on_use {
|
||
// update the mtime so other cleaners know we used it
|
||
let t = FileTime::from_system_time(SystemTime::now());
|
||
debug!("mtime-on-use forcing {:?} to {}", loc, t);
|
||
paths::set_file_time_no_err(loc, t);
|
||
}
|
||
|
||
let new_hash = new_fingerprint.hash();
|
||
|
||
if util::to_hex(new_hash) == old_fingerprint_short && new_fingerprint.fs_status.up_to_date() {
|
||
return Ok(());
|
||
}
|
||
|
||
let old_fingerprint_json = paths::read(&loc.with_extension("json"))?;
|
||
let old_fingerprint: Fingerprint = serde_json::from_str(&old_fingerprint_json)
|
||
.chain_err(|| internal("failed to deserialize json"))?;
|
||
// Fingerprint can be empty after a failed rebuild (see comment in prepare_target).
|
||
if !old_fingerprint_short.is_empty() {
|
||
debug_assert_eq!(util::to_hex(old_fingerprint.hash()), old_fingerprint_short);
|
||
}
|
||
let result = new_fingerprint.compare(&old_fingerprint);
|
||
assert!(result.is_err());
|
||
result
|
||
}
|
||
|
||
fn log_compare(unit: &Unit, compare: &CargoResult<()>) {
|
||
let ce = match compare {
|
||
Ok(..) => return,
|
||
Err(e) => e,
|
||
};
|
||
info!(
|
||
"fingerprint error for {}/{:?}/{:?}",
|
||
unit.pkg, unit.mode, unit.target,
|
||
);
|
||
info!(" err: {:?}", ce);
|
||
}
|
||
|
||
/// Parses Cargo's internal `EncodedDepInfo` structure that was previously
|
||
/// serialized to disk.
|
||
///
|
||
/// Note that this is not rustc's `*.d` files.
|
||
///
|
||
/// Also note that rustc's `*.d` files are translated to Cargo-specific
|
||
/// `EncodedDepInfo` files after compilations have finished in
|
||
/// `translate_dep_info`.
|
||
///
|
||
/// Returns `None` if the file is corrupt or couldn't be read from disk. This
|
||
/// indicates that the crate should likely be rebuilt.
|
||
pub fn parse_dep_info(
|
||
pkg_root: &Path,
|
||
target_root: &Path,
|
||
dep_info: &Path,
|
||
) -> CargoResult<Option<RustcDepInfo>> {
|
||
let data = match paths::read_bytes(dep_info) {
|
||
Ok(data) => data,
|
||
Err(_) => return Ok(None),
|
||
};
|
||
let info = match EncodedDepInfo::parse(&data) {
|
||
Some(info) => info,
|
||
None => {
|
||
log::warn!("failed to parse cargo's dep-info at {:?}", dep_info);
|
||
return Ok(None);
|
||
}
|
||
};
|
||
let mut ret = RustcDepInfo::default();
|
||
ret.env = info.env;
|
||
for (ty, path) in info.files {
|
||
let path = match ty {
|
||
DepInfoPathType::PackageRootRelative => pkg_root.join(path),
|
||
// N.B. path might be absolute here in which case the join will have no effect
|
||
DepInfoPathType::TargetRootRelative => target_root.join(path),
|
||
};
|
||
ret.files.push(path);
|
||
}
|
||
Ok(Some(ret))
|
||
}
|
||
|
||
fn pkg_fingerprint(bcx: &BuildContext<'_, '_>, pkg: &Package) -> CargoResult<String> {
|
||
let source_id = pkg.package_id().source_id();
|
||
let sources = bcx.packages.sources();
|
||
|
||
let source = sources
|
||
.get(source_id)
|
||
.ok_or_else(|| internal("missing package source"))?;
|
||
source.fingerprint(pkg)
|
||
}
|
||
|
||
fn find_stale_file<I>(
|
||
mtime_cache: &mut HashMap<PathBuf, FileTime>,
|
||
reference: &Path,
|
||
paths: I,
|
||
) -> Option<StaleItem>
|
||
where
|
||
I: IntoIterator,
|
||
I::Item: AsRef<Path>,
|
||
{
|
||
let reference_mtime = match paths::mtime(reference) {
|
||
Ok(mtime) => mtime,
|
||
Err(..) => return Some(StaleItem::MissingFile(reference.to_path_buf())),
|
||
};
|
||
|
||
for path in paths {
|
||
let path = path.as_ref();
|
||
let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
|
||
Entry::Occupied(o) => *o.get(),
|
||
Entry::Vacant(v) => {
|
||
let mtime = match paths::mtime_recursive(path) {
|
||
Ok(mtime) => mtime,
|
||
Err(..) => return Some(StaleItem::MissingFile(path.to_path_buf())),
|
||
};
|
||
*v.insert(mtime)
|
||
}
|
||
};
|
||
|
||
// TODO: fix #5918.
|
||
// Note that equal mtimes should be considered "stale". For filesystems with
|
||
// not much timestamp precision like 1s this is would be a conservative approximation
|
||
// to handle the case where a file is modified within the same second after
|
||
// a build starts. We want to make sure that incremental rebuilds pick that up!
|
||
//
|
||
// For filesystems with nanosecond precision it's been seen in the wild that
|
||
// its "nanosecond precision" isn't really nanosecond-accurate. It turns out that
|
||
// kernels may cache the current time so files created at different times actually
|
||
// list the same nanosecond precision. Some digging on #5919 picked up that the
|
||
// kernel caches the current time between timer ticks, which could mean that if
|
||
// a file is updated at most 10ms after a build starts then Cargo may not
|
||
// pick up the build changes.
|
||
//
|
||
// All in all, an equality check here would be a conservative assumption that,
|
||
// if equal, files were changed just after a previous build finished.
|
||
// Unfortunately this became problematic when (in #6484) cargo switch to more accurately
|
||
// measuring the start time of builds.
|
||
if path_mtime <= reference_mtime {
|
||
continue;
|
||
}
|
||
|
||
return Some(StaleItem::ChangedFile {
|
||
reference: reference.to_path_buf(),
|
||
reference_mtime,
|
||
stale: path.to_path_buf(),
|
||
stale_mtime: path_mtime,
|
||
});
|
||
}
|
||
|
||
debug!(
|
||
"all paths up-to-date relative to {:?} mtime={}",
|
||
reference, reference_mtime
|
||
);
|
||
None
|
||
}
|
||
|
||
enum DepInfoPathType {
|
||
// src/, e.g. src/lib.rs
|
||
PackageRootRelative,
|
||
// target/debug/deps/lib...
|
||
// or an absolute path /.../sysroot/...
|
||
TargetRootRelative,
|
||
}
|
||
|
||
/// Parses the dep-info file coming out of rustc into a Cargo-specific format.
|
||
///
|
||
/// This function will parse `rustc_dep_info` as a makefile-style dep info to
|
||
/// learn about the all files which a crate depends on. This is then
|
||
/// re-serialized into the `cargo_dep_info` path in a Cargo-specific format.
|
||
///
|
||
/// The `pkg_root` argument here is the absolute path to the directory
|
||
/// containing `Cargo.toml` for this crate that was compiled. The paths listed
|
||
/// in the rustc dep-info file may or may not be absolute but we'll want to
|
||
/// consider all of them relative to the `root` specified.
|
||
///
|
||
/// The `rustc_cwd` argument is the absolute path to the cwd of the compiler
|
||
/// when it was invoked.
|
||
///
|
||
/// If the `allow_package` argument is true, then package-relative paths are
|
||
/// included. If it is false, then package-relative paths are skipped and
|
||
/// ignored (typically used for registry or git dependencies where we assume
|
||
/// the source never changes, and we don't want the cost of running `stat` on
|
||
/// all those files). See the module-level docs for the note about
|
||
/// `-Zbinary-dep-depinfo` for more details on why this is done.
|
||
///
|
||
/// The serialized Cargo format will contain a list of files, all of which are
|
||
/// relative if they're under `root`. or absolute if they're elsewhere.
|
||
pub fn translate_dep_info(
|
||
rustc_dep_info: &Path,
|
||
cargo_dep_info: &Path,
|
||
rustc_cwd: &Path,
|
||
pkg_root: &Path,
|
||
target_root: &Path,
|
||
rustc_cmd: &ProcessBuilder,
|
||
allow_package: bool,
|
||
) -> CargoResult<()> {
|
||
let depinfo = parse_rustc_dep_info(rustc_dep_info)?;
|
||
|
||
let target_root = target_root.canonicalize()?;
|
||
let pkg_root = pkg_root.canonicalize()?;
|
||
let mut on_disk_info = EncodedDepInfo::default();
|
||
on_disk_info.env = depinfo.env;
|
||
|
||
// This is a bit of a tricky statement, but here we're *removing* the
|
||
// dependency on environment variables that were defined specifically for
|
||
// the command itself. Environment variables returend by `get_envs` includes
|
||
// environment variables like:
|
||
//
|
||
// * `OUT_DIR` if applicable
|
||
// * env vars added by a build script, if any
|
||
//
|
||
// The general idea here is that the dep info file tells us what, when
|
||
// changed, should cause us to rebuild the crate. These environment
|
||
// variables are synthesized by Cargo and/or the build script, and the
|
||
// intention is that their values are tracked elsewhere for whether the
|
||
// crate needs to be rebuilt.
|
||
//
|
||
// For example a build script says when it needs to be rerun and otherwise
|
||
// it's assumed to produce the same output, so we're guaranteed that env
|
||
// vars defined by the build script will always be the same unless the build
|
||
// script itself reruns, in which case the crate will rerun anyway.
|
||
//
|
||
// For things like `OUT_DIR` it's a bit sketchy for now. Most of the time
|
||
// that's used for code generation but this is technically buggy where if
|
||
// you write a binary that does `println!("{}", env!("OUT_DIR"))` we won't
|
||
// recompile that if you move the target directory. Hopefully that's not too
|
||
// bad of an issue for now...
|
||
on_disk_info
|
||
.env
|
||
.retain(|(key, _)| !rustc_cmd.get_envs().contains_key(key));
|
||
|
||
for file in depinfo.files {
|
||
// The path may be absolute or relative, canonical or not. Make sure
|
||
// it is canonicalized so we are comparing the same kinds of paths.
|
||
let abs_file = rustc_cwd.join(file);
|
||
// If canonicalization fails, just use the abs path. There is currently
|
||
// a bug where --remap-path-prefix is affecting .d files, causing them
|
||
// to point to non-existent paths.
|
||
let canon_file = abs_file.canonicalize().unwrap_or_else(|_| abs_file.clone());
|
||
|
||
let (ty, path) = if let Ok(stripped) = canon_file.strip_prefix(&target_root) {
|
||
(DepInfoPathType::TargetRootRelative, stripped)
|
||
} else if let Ok(stripped) = canon_file.strip_prefix(&pkg_root) {
|
||
if !allow_package {
|
||
continue;
|
||
}
|
||
(DepInfoPathType::PackageRootRelative, stripped)
|
||
} else {
|
||
// It's definitely not target root relative, but this is an absolute path (since it was
|
||
// joined to rustc_cwd) and as such re-joining it later to the target root will have no
|
||
// effect.
|
||
(DepInfoPathType::TargetRootRelative, &*abs_file)
|
||
};
|
||
on_disk_info.files.push((ty, path.to_owned()));
|
||
}
|
||
paths::write(cargo_dep_info, on_disk_info.serialize()?)?;
|
||
Ok(())
|
||
}
|
||
|
||
#[derive(Default)]
|
||
pub struct RustcDepInfo {
|
||
/// The list of files that the main target in the dep-info file depends on.
|
||
pub files: Vec<PathBuf>,
|
||
/// The list of environment variables we found that the rustc compilation
|
||
/// depends on.
|
||
///
|
||
/// The first element of the pair is the name of the env var and the second
|
||
/// item is the value. `Some` means that the env var was set, and `None`
|
||
/// means that the env var wasn't actually set and the compilation depends
|
||
/// on it not being set.
|
||
pub env: Vec<(String, Option<String>)>,
|
||
}
|
||
|
||
// Same as `RustcDepInfo` except avoids absolute paths as much as possible to
|
||
// allow moving around the target directory.
|
||
//
|
||
// This is also stored in an optimized format to make parsing it fast because
|
||
// Cargo will read it for crates on all future compilations.
|
||
#[derive(Default)]
|
||
struct EncodedDepInfo {
|
||
files: Vec<(DepInfoPathType, PathBuf)>,
|
||
env: Vec<(String, Option<String>)>,
|
||
}
|
||
|
||
impl EncodedDepInfo {
|
||
fn parse(mut bytes: &[u8]) -> Option<EncodedDepInfo> {
|
||
let bytes = &mut bytes;
|
||
let nfiles = read_usize(bytes)?;
|
||
let mut files = Vec::with_capacity(nfiles as usize);
|
||
for _ in 0..nfiles {
|
||
let ty = match read_u8(bytes)? {
|
||
0 => DepInfoPathType::PackageRootRelative,
|
||
1 => DepInfoPathType::TargetRootRelative,
|
||
_ => return None,
|
||
};
|
||
let bytes = read_bytes(bytes)?;
|
||
files.push((ty, util::bytes2path(bytes).ok()?));
|
||
}
|
||
|
||
let nenv = read_usize(bytes)?;
|
||
let mut env = Vec::with_capacity(nenv as usize);
|
||
for _ in 0..nenv {
|
||
let key = str::from_utf8(read_bytes(bytes)?).ok()?.to_string();
|
||
let val = match read_u8(bytes)? {
|
||
0 => None,
|
||
1 => Some(str::from_utf8(read_bytes(bytes)?).ok()?.to_string()),
|
||
_ => return None,
|
||
};
|
||
env.push((key, val));
|
||
}
|
||
return Some(EncodedDepInfo { files, env });
|
||
|
||
fn read_usize(bytes: &mut &[u8]) -> Option<usize> {
|
||
let ret = bytes.get(..4)?;
|
||
*bytes = &bytes[4..];
|
||
Some(u32::from_le_bytes(ret.try_into().unwrap()) as usize)
|
||
}
|
||
|
||
fn read_u8(bytes: &mut &[u8]) -> Option<u8> {
|
||
let ret = *bytes.get(0)?;
|
||
*bytes = &bytes[1..];
|
||
Some(ret)
|
||
}
|
||
|
||
fn read_bytes<'a>(bytes: &mut &'a [u8]) -> Option<&'a [u8]> {
|
||
let n = read_usize(bytes)? as usize;
|
||
let ret = bytes.get(..n)?;
|
||
*bytes = &bytes[n..];
|
||
Some(ret)
|
||
}
|
||
}
|
||
|
||
fn serialize(&self) -> CargoResult<Vec<u8>> {
|
||
let mut ret = Vec::new();
|
||
let dst = &mut ret;
|
||
write_usize(dst, self.files.len());
|
||
for (ty, file) in self.files.iter() {
|
||
match ty {
|
||
DepInfoPathType::PackageRootRelative => dst.push(0),
|
||
DepInfoPathType::TargetRootRelative => dst.push(1),
|
||
}
|
||
write_bytes(dst, util::path2bytes(file)?);
|
||
}
|
||
|
||
write_usize(dst, self.env.len());
|
||
for (key, val) in self.env.iter() {
|
||
write_bytes(dst, key);
|
||
match val {
|
||
None => dst.push(0),
|
||
Some(val) => {
|
||
dst.push(1);
|
||
write_bytes(dst, val);
|
||
}
|
||
}
|
||
}
|
||
return Ok(ret);
|
||
|
||
fn write_bytes(dst: &mut Vec<u8>, val: impl AsRef<[u8]>) {
|
||
let val = val.as_ref();
|
||
write_usize(dst, val.len());
|
||
dst.extend_from_slice(val);
|
||
}
|
||
|
||
fn write_usize(dst: &mut Vec<u8>, val: usize) {
|
||
dst.extend(&u32::to_le_bytes(val as u32));
|
||
}
|
||
}
|
||
}
|
||
|
||
/// Parse the `.d` dep-info file generated by rustc.
|
||
pub fn parse_rustc_dep_info(rustc_dep_info: &Path) -> CargoResult<RustcDepInfo> {
|
||
let contents = paths::read(rustc_dep_info)?;
|
||
let mut ret = RustcDepInfo::default();
|
||
let mut found_deps = false;
|
||
|
||
for line in contents.lines() {
|
||
if let Some(rest) = line.strip_prefix("# env-dep:") {
|
||
let mut parts = rest.splitn(2, '=');
|
||
let env_var = match parts.next() {
|
||
Some(s) => s,
|
||
None => continue,
|
||
};
|
||
let env_val = match parts.next() {
|
||
Some(s) => Some(unescape_env(s)?),
|
||
None => None,
|
||
};
|
||
ret.env.push((unescape_env(env_var)?, env_val));
|
||
} else if let Some(pos) = line.find(": ") {
|
||
if found_deps {
|
||
continue;
|
||
}
|
||
found_deps = true;
|
||
let mut deps = line[pos + 2..].split_whitespace();
|
||
|
||
while let Some(s) = deps.next() {
|
||
let mut file = s.to_string();
|
||
while file.ends_with('\\') {
|
||
file.pop();
|
||
file.push(' ');
|
||
file.push_str(deps.next().ok_or_else(|| {
|
||
internal("malformed dep-info format, trailing \\".to_string())
|
||
})?);
|
||
}
|
||
ret.files.push(file.into());
|
||
}
|
||
}
|
||
}
|
||
return Ok(ret);
|
||
|
||
// rustc tries to fit env var names and values all on a single line, which
|
||
// means it needs to escape `\r` and `\n`. The escape syntax used is "\n"
|
||
// which means that `\` also needs to be escaped.
|
||
fn unescape_env(s: &str) -> CargoResult<String> {
|
||
let mut ret = String::with_capacity(s.len());
|
||
let mut chars = s.chars();
|
||
while let Some(c) = chars.next() {
|
||
if c != '\\' {
|
||
ret.push(c);
|
||
continue;
|
||
}
|
||
match chars.next() {
|
||
Some('\\') => ret.push('\\'),
|
||
Some('n') => ret.push('\n'),
|
||
Some('r') => ret.push('\r'),
|
||
Some(c) => bail!("unknown escape character `{}`", c),
|
||
None => bail!("unterminated escape character"),
|
||
}
|
||
}
|
||
Ok(ret)
|
||
}
|
||
}
|