Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Redefine "inUse" definition #79

Closed
wants to merge 3 commits into from

Conversation

infeo
Copy link
Member

@infeo infeo commented Aug 2, 2023

This PR redefines the "in use" definition introduced in #48.

Instead of just counting the opened files, it introduces the notion of active ( or stale respectively) open files. An open file is active, if within the last X seconds its read() or write() method were called. X is a filesystem specifc threshold and can be set in during the fs creation.

This redefintion is motivated by Windows/WinFsp behavior: It seems, that on Windows there are processes keeping file handles open, even though they are not used anymore until unmount().
For examples, there are the following user reports:

@infeo infeo requested a review from overheadhunter August 2, 2023 10:23
@infeo infeo self-assigned this Aug 2, 2023
@infeo infeo marked this pull request as draft August 2, 2023 10:24
@sonarcloud
Copy link

sonarcloud bot commented Aug 2, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 3 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

Copy link
Member

@overheadhunter overheadhunter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When switching to a timeout-based approach, I have just a small change request.

However, I have the gut feeling that this is rather brittle, as I have learned to never depend on timing of third party apps doing certain tasks.

Instead I'm suggesting to check for unflushed changes: Set a flag on every write and unset it during flush. Force-closing a file opened for reading should never be a problem.


public class OpenFile implements Closeable {

private static final Logger LOG = LoggerFactory.getLogger(OpenFile.class);

private final Path path;
private final FileChannel channel;
private final AtomicReference<Instant> lastUsed = new AtomicReference(Instant.now());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AtomicLong is more lightweight (e.g. using System.nanoTime() instead of Instant)

@infeo
Copy link
Member Author

infeo commented Sep 15, 2023

Instead I'm suggesting to check for unflushed changes: Set a flag on every write and unset it during flush. Force-closing a file opened for reading should never be a problem

If we are only checking for unflushed changes, we could unmount a filesystem during two subsequent read operations. But instead of ensuring we can force unmount, "in use" should mean some process is accessing the filesystem. At least that is my perspective.

Nonetheless, a hybrid approach of timing and writing seems to be a good way.

@overheadhunter
Copy link
Member

I guess we have two different use cases in mind.

  1. "unmount when idle" → in this case, I agree that we should keep track of reads as well (don't we do this already?)
  2. "manual unmount" → this is explicitly requested by the user. We should obey, no questions asked. Only intervene, when data is at risk (only affects writes)

@infeo
Copy link
Member Author

infeo commented Sep 19, 2023

I have the feeling, we are talking about two different things, hence, i'll explain myself again.

Our API offers two methods for unmounting: regular and forced.
Forced is the "no questions asked"-"try under any circumstances" method. So the question is, under which circumstances the regular unmount should fail. The current sitation is: If we can lock the fs root for writing and there are no open files, unmount succeeds.

I want to change it, such that it can even succeed with open files by adding a timer to every open file. On every read or write, the timer is resetted. If the timer elapsed, the file is stale meaning it cannot block a regular unmount. Every mounter impl can set the elapse threshold individually.

Why at all the change? Because we are using a not-windows-native filesystem protocol, creating incompabilities. One of those are that on Windows handles can be kept open even though there is no activity on open files.


I guess i got a little off the road by your comment about "reyling on timing is brittle". If even after a longer period a file is opened, but not touched, why should we worried about data loss? As you said it, if unmounting is requested by the consumer, we should obey. Of course it is not the perfect solution: Office apps like to create temporary/lock files. If such a file is stale, we can lock anyway preventing the lock of being deleted (-> delete on close could be checked).
And of course, this change would actually only affect Windows, because the threshold for Linux/MacOS would be the default of MAX_LONG.

@sonarcloud
Copy link

sonarcloud bot commented Sep 20, 2023

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 2 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@infeo infeo mentioned this pull request Sep 25, 2023
@infeo
Copy link
Member Author

infeo commented Sep 25, 2023

Superseded by #80

@infeo infeo closed this Sep 25, 2023
@infeo infeo added this to the 4.0.0 milestone Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants