cargo/src/cargo/util/mod.rs

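//! Shared utility helpers and submodules used throughout Cargo.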
use std::time::Duration;
pub use self::canonical_url::CanonicalUrl;
pub use self::config::{homedir, Config, ConfigValue};
pub use self::dependency_queue::DependencyQueue;
pub use self::diagnostic_server::RustfixDiagnosticServer;
pub use self::errors::{internal, CargoResult, CargoResultExt, CliResult, Test};
pub use self::errors::{CargoTestError, CliError};
pub use self::flock::{FileLock, Filesystem};
pub use self::graph::Graph;
pub use self::hasher::StableHasher;
pub use self::hex::{hash_u64, short_hash, to_hex};
pub use self::into_url::IntoUrl;
pub use self::into_url_with_base::IntoUrlWithBase;
pub use self::lev_distance::{closest, closest_msg, lev_distance};
pub use self::lockserver::{LockServer, LockServerClient, LockServerStarted};
pub use self::paths::{bytes2path, dylib_path, join_paths, path2bytes};
pub use self::paths::{dylib_path_envvar, normalize_path};
pub use self::progress::{Progress, ProgressStyle};
pub use self::queue::Queue;
pub use self::restricted_names::validate_package_name;
pub use self::rustc::Rustc;
pub use self::sha256::Sha256;
pub use self::to_semver::ToSemver;
pub use self::vcs::{existing_vcs_repo, FossilRepo, GitRepo, HgRepo, PijulRepo};
pub use self::workspace::{
    add_path_args, path_args, print_available_benches, print_available_binaries,
    print_available_examples, print_available_packages, print_available_tests,
};
mod canonical_url;
pub mod command_prelude;
pub mod config;
pub mod cpu;
mod dependency_queue;
pub mod diagnostic_server;
pub mod errors;
mod flock;
pub mod graph;
mod hasher;
pub mod hex;
pub mod important_paths;
pub mod interning;
pub mod into_url;
mod into_url_with_base;
pub mod job;
pub mod lev_distance;
mod lockserver;
pub mod machine_message;
pub mod network;
pub mod paths;
pub mod profile;
mod progress;
mod queue;
pub mod restricted_names;
pub mod rustc;
mod sha256;
pub mod to_semver;
pub mod toml;
mod vcs;
mod workspace;
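/// Formats a `Duration` as a compact, human-readable string: durations of a
/// minute or more render as `"<minutes>m <seconds>s"`, while shorter durations
/// render as fractional seconds such as `"2.50s"`.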
pub fn elapsed(duration: Duration) -> String {
    let secs = duration.as_secs();
    if secs >= 60 {
        format!("{}m {:02}s", secs / 60, secs % 60)
    } else {
        format!("{}.{:02}s", secs, duration.subsec_nanos() / 10_000_000)
    }
}
/// Whether or not this is running in a Continuous Integration environment.
pub fn is_ci() -> bool {
    std::env::var("CI").is_ok() || std::env::var("TF_BUILD").is_ok()
}
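/// Returns `text` with every non-empty line indented, leaving empty lines
/// untouched; each line of the result is newline-terminated.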
pub fn indented_lines(text: &str) -> String {
    text.lines()
        .map(|line| {
            if line.is_empty() {
                String::from("\n")
            } else {
                format!("  {}\n", line)
            }
        })
        .collect()
}
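// Illustrative examples (a sketch added for clarity, not part of the upstream
// module): these tests exercise the formatting helpers above and show the
// output shapes they produce.
#[cfg(test)]
mod tests {
    use super::{elapsed, indented_lines};
    use std::time::Duration;

    #[test]
    fn elapsed_formats_short_and_long_durations() {
        // Sub-minute durations are rendered as fractional seconds.
        assert_eq!(elapsed(Duration::from_millis(2500)), "2.50s");
        // A minute or more switches to "<m>m <ss>s" with zero-padded seconds.
        assert_eq!(elapsed(Duration::from_secs(65)), "1m 05s");
    }

    #[test]
    fn indented_lines_indents_only_nonempty_lines() {
        let out = indented_lines("first\n\nsecond");
        // Non-empty lines gain a leading indent; empty lines are preserved.
        assert!(out
            .lines()
            .filter(|line| !line.is_empty())
            .all(|line| line.starts_with(' ')));
        // Every line of the result, including the last, ends with a newline.
        assert!(out.ends_with('\n'));
    }
}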