Skip to content

Commit

Permalink
Add support for OS locking (#33)
Browse files Browse the repository at this point in the history
As pointed out in #30, gix's locking mechanism is insufficient in the
face of SIGKILL or other process interruption that can't be caught by
gix's signal handler, in addition to the giant footgun that tame-index
doesn't actually hook that signal handler for you, meaning applications
using it that forget to hook it also run into issues.

In addition, I raised #17 while doing
crate-ci/cargo-release#693
(https://github.com/crate-ci/cargo-release/pull/693/files/012d3e9a7be23db14096e6a0d41cea7528f9348c#r1301688472)
as right now tame-index can perform mutation concurrently with cargo
itself, or potentially read partial data while it is being written.

This PR resolves both of these issues by forcing all R/W operations on
indexes to take a `&FileLock` argument, as well as providing a
`LockOptions` "builder" to create file locks. These locks are created
using flock on unix and LockFileEx on windows, meaning they can properly
be cleaned up by the OS in all situations, including SIGKILL and power
loss etc, unlike gix's locks, and is the same mechanism that cargo uses
for its global package lock, meaning downstream users can ensure they
play nicely with cargo.

The lock facilities are part of the public API of tame-index as I opted
to roll my own implementation instead of using fslock, as it is very
outdated, and doesn't support timeouts. This does mean a lot of unsafe
has been added, but it is tested and not _too_ bad. This can potentially
be moved out to a separate crate in the future, but is fine for now.
This means it could be used to resolve
rustsec/rustsec#1011, and is something I will
use in cargo-deny for the same thing, protecting access to the advisory
database during mutation.

It should also be noted that one can also just construct a
`FileLock::unlocked()` to satisfy the API, without actually performing
any locking, for cases where it's not needed/testing/etc.

Resolves: #17
Resolves: #30
  • Loading branch information
Jake-Shadle authored Sep 29, 2023
1 parent f530166 commit c2b8c03
Show file tree
Hide file tree
Showing 26 changed files with 1,106 additions and 106 deletions.
3 changes: 2 additions & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
/target
**/*.rs.bk
Cargo.lock
Cargo.lock
tests/flock/target
16 changes: 10 additions & 6 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -53,26 +53,30 @@ tokio = { version = "1.0", default-features = false, features = [
"time",
], optional = true }
# cargo config parsing
toml = "0.7"
toml = "0.8"
# Faster hashing
twox-hash = { version = "1.6", default-features = false }

[dependencies.gix]
optional = true
version = "0.53.1"
version = "0.54"
default-features = false
features = [
"blocking-http-transport-reqwest",
]
features = ["blocking-http-transport-reqwest"]

[dependencies.reqwest]
optional = true
version = "0.11"
default-features = false
features = ["blocking", "gzip"]

[target.'cfg(unix)'.dependencies]
libc = "0.2"

[target.'cfg(windows)'.dependencies]
windows-targets = "0.48"

[dev-dependencies]
cargo_metadata = "0.17"
cargo_metadata = "0.18"
rayon = "1.7"
tempfile = "3.6"
tiny-bench = "0.3"
Expand Down
15 changes: 13 additions & 2 deletions benches/sparse.rs
Original file line number Diff line number Diff line change
Expand Up @@ -60,7 +60,11 @@ fn main() {
}

fn blocking(rsi: &tame_index::index::RemoteSparseIndex, krates: &KrateSet) {
let krates = rsi.krates(krates.clone(), true);
let krates = rsi.krates(
krates.clone(),
true,
&tame_index::index::FileLock::unlocked(),
);

for (krate, res) in krates {
if let Err(err) = res {
Expand All @@ -70,7 +74,14 @@ fn blocking(rsi: &tame_index::index::RemoteSparseIndex, krates: &KrateSet) {
}

fn asunc(rsi: &tame_index::index::AsyncRemoteSparseIndex, krates: &KrateSet) {
let krates = rsi.krates_blocking(krates.clone(), true, None).unwrap();
let krates = rsi
.krates_blocking(
krates.clone(),
true,
None,
&tame_index::index::FileLock::unlocked(),
)
.unwrap();

for (krate, res) in krates {
if let Err(err) = res {
Expand Down
3 changes: 3 additions & 0 deletions src/error.rs
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,9 @@ pub enum Error {
#[cfg(feature = "local")]
#[error(transparent)]
Local(#[from] LocalRegistryError),
/// Failed to lock a file
#[error(transparent)]
Lock(#[from] crate::utils::flock::FileLockError),
}

impl From<std::path::PathBuf> for Error {
Expand Down
9 changes: 6 additions & 3 deletions src/index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -29,6 +29,8 @@ pub use sparse::SparseIndex;
#[cfg(feature = "sparse")]
pub use sparse_remote::{AsyncRemoteSparseIndex, RemoteSparseIndex};

pub use crate::utils::flock::FileLock;

/// Global configuration of an index, reflecting the [contents of config.json](https://doc.rust-lang.org/cargo/reference/registries.html#index-format).
#[derive(Eq, PartialEq, PartialOrd, Ord, Clone, Debug, serde::Deserialize, serde::Serialize)]
pub struct IndexConfig {
Expand Down Expand Up @@ -119,12 +121,13 @@ impl ComboIndexCache {
pub fn cached_krate(
&self,
name: crate::KrateName<'_>,
lock: &FileLock,
) -> Result<Option<crate::IndexKrate>, Error> {
match self {
Self::Git(index) => index.cached_krate(name),
Self::Sparse(index) => index.cached_krate(name),
Self::Git(index) => index.cached_krate(name, lock),
Self::Sparse(index) => index.cached_krate(name, lock),
#[cfg(feature = "local")]
Self::Local(lr) => lr.cached_krate(name),
Self::Local(lr) => lr.cached_krate(name, lock),
}
}

Expand Down
17 changes: 14 additions & 3 deletions src/index/cache.rs
Original file line number Diff line number Diff line change
Expand Up @@ -32,6 +32,7 @@ pub const INDEX_V_MAX: u32 = 2;
/// The byte representation of [`INDEX_V_MAX`]
const INDEX_V_MAX_BYTES: [u8; 4] = INDEX_V_MAX.to_le_bytes();

use super::FileLock;
use crate::{CacheError, Error, IndexKrate, KrateName, PathBuf};

/// A wrapper around a byte buffer that has been (partially) validated to be a
Expand Down Expand Up @@ -227,8 +228,9 @@ impl IndexCache {
&self,
name: KrateName<'_>,
revision: Option<&str>,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
let Some(contents) = self.read_cache_file(name)? else {
let Some(contents) = self.read_cache_file(name, lock)? else {
return Ok(None);
};

Expand All @@ -237,7 +239,12 @@ impl IndexCache {
}

/// Writes the specified crate and revision to the cache
pub fn write_to_cache(&self, krate: &IndexKrate, revision: &str) -> Result<PathBuf, Error> {
pub fn write_to_cache(
&self,
krate: &IndexKrate,
revision: &str,
_lock: &FileLock,
) -> Result<PathBuf, Error> {
let name = krate.name().try_into()?;
let cache_path = self.cache_path(name);

Expand Down Expand Up @@ -280,7 +287,11 @@ impl IndexCache {
///
/// It is recommended to use [`Self::cached_krate`]
#[inline]
pub fn read_cache_file(&self, name: KrateName<'_>) -> Result<Option<Vec<u8>>, Error> {
pub fn read_cache_file(
&self,
name: KrateName<'_>,
_lock: &FileLock,
) -> Result<Option<Vec<u8>>, Error> {
let cache_path = self.cache_path(name);

match std::fs::read(&cache_path) {
Expand Down
21 changes: 13 additions & 8 deletions src/index/combo.rs
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
#[cfg(feature = "local")]
use crate::index::LocalRegistry;
use crate::{
index::{RemoteGitIndex, RemoteSparseIndex},
index::{FileLock, RemoteGitIndex, RemoteSparseIndex},
Error, IndexKrate, KrateName,
};

Expand Down Expand Up @@ -29,23 +29,28 @@ impl ComboIndex {
&self,
name: KrateName<'_>,
write_cache_entry: bool,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
match self {
Self::Git(index) => index.krate(name, write_cache_entry),
Self::Sparse(index) => index.krate(name, write_cache_entry),
Self::Git(index) => index.krate(name, write_cache_entry, lock),
Self::Sparse(index) => index.krate(name, write_cache_entry, lock),
#[cfg(feature = "local")]
Self::Local(lr) => lr.cached_krate(name),
Self::Local(lr) => lr.cached_krate(name, lock),
}
}

/// Retrieves the cached crate metadata if it exists
#[inline]
pub fn cached_krate(&self, name: KrateName<'_>) -> Result<Option<IndexKrate>, Error> {
pub fn cached_krate(
&self,
name: KrateName<'_>,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
match self {
Self::Git(index) => index.cached_krate(name),
Self::Sparse(index) => index.cached_krate(name),
Self::Git(index) => index.cached_krate(name, lock),
Self::Sparse(index) => index.cached_krate(name, lock),
#[cfg(feature = "local")]
Self::Local(lr) => lr.cached_krate(name),
Self::Local(lr) => lr.cached_krate(name, lock),
}
}
}
Expand Down
13 changes: 9 additions & 4 deletions src/index/git.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use super::IndexCache;
use super::{FileLock, IndexCache};
use crate::{Error, IndexKrate, KrateName, PathBuf};

/// The URL of the crates.io index for use with git
Expand Down Expand Up @@ -66,8 +66,12 @@ impl GitIndex {
/// There are no guarantees around freshness, and no network I/O will be
/// performed.
#[inline]
pub fn cached_krate(&self, name: KrateName<'_>) -> Result<Option<IndexKrate>, Error> {
self.cache.cached_krate(name, self.head_commit())
pub fn cached_krate(
&self,
name: KrateName<'_>,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
self.cache.cached_krate(name, self.head_commit(), lock)
}

/// Writes the specified crate to the cache.
Expand All @@ -79,10 +83,11 @@ impl GitIndex {
&self,
krate: &IndexKrate,
blob_id: Option<&str>,
lock: &FileLock,
) -> Result<Option<PathBuf>, Error> {
let Some(id) = blob_id.or_else(|| self.head_commit()) else {
return Ok(None);
};
self.cache.write_to_cache(krate, id).map(Some)
self.cache.write_to_cache(krate, id, lock).map(Some)
}
}
40 changes: 20 additions & 20 deletions src/index/git_remote.rs
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
use super::GitIndex;
use super::{FileLock, GitIndex};
use crate::{Error, IndexKrate, KrateName};
use std::sync::atomic::AtomicBool;

Expand Down Expand Up @@ -30,14 +30,12 @@ impl RemoteGitIndex {
/// if the process is interrupted with Ctrl+C. To support `panic = abort` you also need to register
/// the `gix` signal handler to clean up the locks, see [`gix::interrupt::init_handler`].
#[inline]
pub fn new(index: GitIndex) -> Result<Self, Error> {
pub fn new(index: GitIndex, lock: &FileLock) -> Result<Self, Error> {
Self::with_options(
index,
gix::progress::Discard,
&gix::interrupt::IS_INTERRUPTED,
gix::lock::acquire::Fail::AfterDurationWithBackoff(std::time::Duration::from_secs(
60 * 10, /* 10 minutes */
)),
lock,
)
}

Expand All @@ -60,7 +58,7 @@ impl RemoteGitIndex {
mut index: GitIndex,
progress: P,
should_interrupt: &AtomicBool,
lock_policy: gix::lock::acquire::Fail,
_lock: &FileLock,
) -> Result<Self, Error>
where
P: gix::NestedProgress,
Expand All @@ -82,14 +80,6 @@ impl RemoteGitIndex {
mapping.reduced = open_with_complete_config.clone();
mapping.full = open_with_complete_config.clone();

let _lock = gix::lock::Marker::acquire_to_hold_resource(
index.cache.path.with_extension("tame-index"),
lock_policy,
Some(std::path::PathBuf::from_iter(Some(
std::path::Component::RootDir,
))),
)?;

// Attempt to open the repository, if it fails for any reason,
// attempt to perform a fresh clone instead
let repo = gix::ThreadSafeRepository::discover_opts(
Expand Down Expand Up @@ -235,8 +225,9 @@ impl RemoteGitIndex {
&self,
name: KrateName<'_>,
write_cache_entry: bool,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
if let Ok(Some(cached)) = self.cached_krate(name) {
if let Ok(Some(cached)) = self.cached_krate(name, lock) {
return Ok(Some(cached));
}

Expand All @@ -252,7 +243,7 @@ impl RemoteGitIndex {
let gix::ObjectId::Sha1(sha1) = blob.id;
let blob_id = crate::utils::encode_hex(&sha1, &mut hex_id);

let _ = self.index.write_to_cache(&krate, Some(blob_id));
let _ = self.index.write_to_cache(&krate, Some(blob_id), lock);
}

Ok(Some(krate))
Expand Down Expand Up @@ -302,8 +293,12 @@ impl RemoteGitIndex {
/// advantage of that though as it does not have access to git and thus
/// cannot know the blob id.
#[inline]
pub fn cached_krate(&self, name: KrateName<'_>) -> Result<Option<IndexKrate>, Error> {
let Some(cached) = self.index.cache.read_cache_file(name)? else {
pub fn cached_krate(
&self,
name: KrateName<'_>,
lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
let Some(cached) = self.index.cache.read_cache_file(name, lock)? else {
return Ok(None);
};
let valid = crate::index::cache::ValidCacheEntry::read(&cached)?;
Expand All @@ -329,8 +324,12 @@ impl RemoteGitIndex {
///
/// This method performs network I/O.
#[inline]
pub fn fetch(&mut self) -> Result<(), Error> {
self.fetch_with_options(gix::progress::Discard, &gix::interrupt::IS_INTERRUPTED)
pub fn fetch(&mut self, lock: &FileLock) -> Result<(), Error> {
self.fetch_with_options(
gix::progress::Discard,
&gix::interrupt::IS_INTERRUPTED,
lock,
)
}

/// Same as [`Self::fetch`] but allows specifying a progress implementation
Expand All @@ -339,6 +338,7 @@ impl RemoteGitIndex {
&mut self,
mut progress: P,
should_interrupt: &AtomicBool,
_lock: &FileLock,
) -> Result<(), Error>
where
P: gix::NestedProgress,
Expand Down
7 changes: 6 additions & 1 deletion src/index/local.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
//! Contains code for reading and writing [local registries](https://doc.rust-lang.org/cargo/reference/source-replacement.html#local-registry-sources)
use super::FileLock;
use crate::{Error, IndexKrate, KrateName, Path, PathBuf};
use smol_str::SmolStr;

Expand Down Expand Up @@ -130,7 +131,11 @@ impl LocalRegistry {
/// Note this naming is just to be consistent with [`crate::SparseIndex`] and
/// [`crate::GitIndex`], local registries do not have a .cache in the index
#[inline]
pub fn cached_krate(&self, name: KrateName<'_>) -> Result<Option<IndexKrate>, Error> {
pub fn cached_krate(
&self,
name: KrateName<'_>,
_lock: &FileLock,
) -> Result<Option<IndexKrate>, Error> {
let index_path = make_path(&self.path, name);

let buf = match std::fs::read(&index_path) {
Expand Down
Loading

0 comments on commit c2b8c03

Please sign in to comment.