gix-lock locked access forever and could not recover #1026
tl;dr: I think this is an issue that should probably be resolved downstream. As an immediate action, all I can do here is improve the documentation to warn about these side-effects if signal handlers are not installed in the application. This, however, only kicks the can down the road, as ultimately the application will have to know that certain crates are only safe to use if they install signal handlers.
The error-reporting code that fails to substitute values also lives there.
Thanks for the suggestion. I'd like to add more options, as this one isn't feasible with the current implementation.
Let me also reel in @Shnatsel who probably wants to know about this.
Unfortunately the current behavior requires manual intervention to recover, which isn't workable for automated use. There are two ways to go about this:
I was trying to think of downsides of (1), but I could not find any that don't also apply to (2). Both won't work on truly obscure platforms such as Fuchsia, and both won't work over NFS (but there is no sane way to lock over NFS anyway). So I suggest going with (1).
I've fixed the error reporting in the meantime.
Thanks @Shnatsel for chiming in! Now that this issue is getting tracked, it's worth noting that locks created during clone may also remain, which could lead to refs which can't be updated, for example, in an otherwise fine repo. A great way to prevent this is to install signal handlers on the application level, as these will abort every operation gracefully and remove lock files before the process exits.
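To make the signal-handler idea concrete, here is a minimal sketch of application-level cleanup, assuming the `signal-hook` crate and a known lock path (both assumptions for illustration; the handlers the library itself provides for this purpose would be the preferred route, and none of this helps against `kill -9` or power loss):

```rust
use std::path::PathBuf;
use std::process;

use signal_hook::consts::signal::{SIGINT, SIGTERM};
use signal_hook::iterator::Signals;

/// Remove a known lock file and exit when the process receives SIGINT or SIGTERM,
/// so an interrupted run does not leave the lock behind.
fn install_cleanup_handler(lock_path: PathBuf) -> std::io::Result<()> {
    let mut signals = Signals::new(&[SIGINT, SIGTERM])?;
    std::thread::spawn(move || {
        for _sig in signals.forever() {
            let _ = std::fs::remove_file(&lock_path);
            process::exit(128);
        }
    });
    Ok(())
}
```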
Unfortunately there is not much that can be done about that on this side. Is there a particular reason why these lock files are used instead of OS-level file locks?
Yes, this is how git itself does it.

Thanks for your persistence! I was ducking away from the problem a bit, as file locks are notoriously difficult for all the reasons already mentioned, and I didn't really want to deal with it again. But then it became clear that git-style locking will always be a problem for automated systems and will force administrators to specifically deal with it. Isn't there a better way?

The only way forward is to use filesystem locks where possible. Doing this broadly would be a breaking change, though. My thought is to add filesystem locking as an opt-in. This solution, similar to setting up signal handlers, would still require the application to take action and enable this cargo feature. What do you think?

PS: I think I could adapt cargo's lock implementation, as it's probably one of the best tested and most-run implementations around.
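For readers unfamiliar with the distinction: a git-style lock is just a marker file created with create-new semantics next to the resource it protects, so a crash leaves it behind. A minimal sketch of that pattern using only the standard library (illustrative only, not gix-lock's actual code):

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::{Path, PathBuf};

/// A git-style lock: it is "held" for as long as the marker file exists.
struct GitStyleLock {
    path: PathBuf,
}

impl GitStyleLock {
    fn acquire(resource: &Path) -> io::Result<Self> {
        let path = resource.with_extension("lock");
        // `create_new` fails with AlreadyExists if someone else holds the lock --
        // or if a previous process crashed and never removed the file.
        OpenOptions::new().write(true).create_new(true).open(&path)?;
        Ok(GitStyleLock { path })
    }
}

impl Drop for GitStyleLock {
    fn drop(&mut self) {
        // Only runs on a clean shutdown; a kill -9 or power loss leaves the
        // lock file behind, which is exactly the failure mode in this issue.
        let _ = std::fs::remove_file(&self.path);
    }
}
```

An OS advisory lock, by contrast, is released by the kernel when the process exits, which is what the filesystem-lock proposal above relies on.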
Locks being robust to abrupt process termination would be the important property here.
@kornelski The solution I propose would probably work here, but requires you to pull it in on the application level.
I'm surprised that this is incompatible and needs extra configuration. Do you know what causes that?
A good point.
@kornelski The reason why you didn't have issues with this before probably comes down to the previous locking implementation.
I don't think this behavior should be controlled by a Cargo feature. Features are additive and global, so any dependency anywhere in the tree enabling this feature will affect all the other dependencies. If any configuration options are provided, they should be done as an argument to a function, or as separate functions.

Acquiring two locks does not really solve the problem of the stale lock being left behind, so I don't think that is workable within the constraints already discussed.

I've looked at the Cargo implementation and it just uses OS locks everywhere, forgoing git locks and all the issues associated with them. It does seem that for crates like rustsec, OS locks would sidestep the problem entirely.
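For comparison, this is roughly what "just uses OS locks everywhere" looks like, assuming the `fs2` crate for the advisory lock (the crate choice and function name are mine, not Cargo's actual code):

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

use fs2::FileExt; // adds lock_exclusive()/unlock() to std::fs::File

/// Run `f` while holding an exclusive OS lock on `lock_path`.
/// The kernel releases the lock if the process dies, however it dies.
fn with_exclusive_lock<T>(lock_path: &Path, f: impl FnOnce() -> T) -> io::Result<T> {
    let file = OpenOptions::new().create(true).write(true).open(lock_path)?;
    file.lock_exclusive()?; // blocks until the lock is free
    let result = f();
    file.unlock()?;
    Ok(result)
}
```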
This cargo feature is strictly for the application level. It's similar to a few feature toggles that already exist for this purpose.
Instead of failing on a stale lock, it would try to acquire a filesystem lock, and if that is successful, it would take ownership of the stale lock, which means it will be removed once the destructor runs. Encountering a stale lock then wouldn't be the end of it, either.
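A rough sketch of that takeover idea, with hypothetical names and again assuming `fs2` for the advisory lock; a real implementation would also keep holding the advisory lock for the lifetime of the git-style lock so that other processes can tell the owner is alive:

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

use fs2::FileExt;

/// Hypothetical: take the git-style lock; if it already exists, arbitrate via an
/// OS advisory lock on a sidecar file and adopt the stale lock if nobody holds it.
fn acquire_or_adopt(lock_path: &Path) -> io::Result<()> {
    match OpenOptions::new().write(true).create_new(true).open(lock_path) {
        Ok(_) => return Ok(()), // fresh lock, we own it
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => {} // possibly stale
        Err(e) => return Err(e),
    }
    let sidecar = OpenOptions::new()
        .create(true)
        .write(true)
        .open(lock_path.with_extension("oslock"))?;
    match sidecar.try_lock_exclusive() {
        // No live owner: adopt the stale lock file so it is removed when we are done.
        Ok(()) => Ok(()),
        // Someone alive holds it: behave like today and report the lock as taken.
        Err(_) => Err(io::Error::new(
            io::ErrorKind::WouldBlock,
            "lock is held by a live process",
        )),
    }
}
```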
Yes, it at least should be considered an option, one that I'd still have to implement behind a feature toggle, as it will add costs while still not being perfect. I definitely lean towards filesystem locks though, just because they seem to work very well for cargo.
I agree, it's probably best to create this lock using filesystem locks, separately. I kept looking, and maybe getting this single, long-lived lock right would already solve all problems. All other locks that would be created during a clone are for writing resources, and these are typically short-lived. Packs are the only long-lived resource, but these are normal tempfiles, as their name isn't known until they are done streaming. That would get @kornelski back into the two-years-without-(this-kind-of)-failure territory, I am sure :).
After looking into what this would take, I don't plan to implement it right now. I found thinking about this issue very interesting though, and believe it should inform any future locking implementation. Closing once again as not planned, but can reopen if there is demand.
I've opened a PR to switch from git-style locks to OS-level locking.
Thanks for sharing, I am happy this finally gets resolved. Of course I hope it works out in practice as well.
I've just published rustsec v0.28.3 that switches to OS locking and resolves the issue.
Current behavior 😯
I've upgraded the `rustsec` crate to a version using `gix`, and after a while (probably after an unexpected restart) every fetch of `rustsec`'s git database has been failing due to being locked forever.

BTW: the error was literally with the placeholders like that.
I've been restarting and retrying the process many times, and it was locked up every time.
I've noticed that in the parent dir of the path given to rustsec's `Repository::fetch` there was a 0-byte file `rustsec.rustsec.lock`. Deleting the file unblocked rustsec's fetch.

Backtrace of the process while it was waiting:
Expected behavior 🤔
I think a lockfile has to include and check the process PID, because a crash of the process can leave a lockfile behind, and there has to be a way to recover.
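A hedged sketch of that PID-based recovery, using a Linux-only `/proc` liveness check (illustrative; PID reuse makes this a heuristic rather than a guarantee):

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Write our own PID into the lock file so later runs can check liveness.
fn write_own_pid(lock_path: &Path) -> io::Result<()> {
    fs::write(lock_path, std::process::id().to_string())
}

/// Linux-only heuristic: the lock is stale if the recorded PID is no longer running.
/// A 0-byte or unreadable lock file (as seen in this issue) is also treated as stale.
fn lock_looks_stale(lock_path: &Path) -> io::Result<bool> {
    let contents = match fs::read_to_string(lock_path) {
        Ok(c) => c,
        Err(e) if e.kind() == io::ErrorKind::NotFound => return Ok(false),
        Err(e) => return Err(e),
    };
    match contents.trim().parse::<u32>() {
        Ok(pid) => Ok(!Path::new(&format!("/proc/{pid}")).exists()),
        Err(_) => Ok(true),
    }
}
```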
Also, the parent dir of the git checkout is a permanent location, so such a lock will survive reboots. On Linux I'd expect lockfiles in `/var/run`.

Steps to reproduce 🕹