Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NSFileManager: data loss when copying files over the network #452

Closed
eukara opened this issue Oct 12, 2024 · 10 comments
Closed

NSFileManager: data loss when copying files over the network #452

eukara opened this issue Oct 12, 2024 · 10 comments

Comments

@eukara
Copy link

eukara commented Oct 12, 2024

Re-filing this here so we don't forget:

As discussed in the meeting, referencing gnustep/apps-gworkspace#18, it appears that copying directories* over NSFileManager results in data loss in certain environments, specifically NFS imports. The current theory is that because the underlying ownership/permissions are different due to all_squash being set on the server side, ignoring incoming user-ids as a result, will cause some higher-level mismatch that will error out the file operation.

* Single files copy fine. Directories (empty ones) copy fine. It is merely the contents of directories which are omitted.

Test case for NSFileManager is pending

@rfm
Copy link
Contributor

rfm commented Oct 13, 2024

I have tried to reproduce this without success using this simple code to copy from a directory containing two subdirectories each containing a plain text file.

[mgr copyItemAtPath: @"." toPath: @"/nfs/general/xxx" error: &err];

The copy was to an Ubuntu server with a pretty standard /etc/exports entry as follows:
/srv/nfs/share 10.211.55.9(rw,sync,no_subtree_check)

That works fine for me, and I note that in gnustep/apps-gworkspace#18 Riccardo ried to reproduce the problem using GWorkspace and was unable to do so.

So we definitely need instructions on how to reproduce the issue; exact information on what nfs set up to use etc.
I suppose it may be the case that, rather than a bug copying, this is a permissions issue on the particular machine preventing the copy, and the issue is a failure to report it well (or at all). We need instructions to reproduce the problem either way.

@gcasa
Copy link
Member

gcasa commented Oct 13, 2024

I think what might help is if you run Gworkspace or your test case in the debugger. Perhaps it is getting some error that the calling code in nsfilemanager is ignoring. @eukara

I would suggest setting breakpoints where nsfilemanager makes low level calls to do the copy. My suspicions are similar to @rfm

It's alarming that you were able to reproduce it during the meeting. Very concerning.

@rfm
Copy link
Contributor

rfm commented Nov 1, 2024

Any update on a way to reproduce this issue @eukara ?

@eukara
Copy link
Author

eukara commented Nov 2, 2024

Here is my test case. https://github.com/eukara/NSFilemanagerBug/

@rfm
Copy link
Contributor

rfm commented Nov 2, 2024

I haven't set up the appropriate NFS to try it out yet, but looking at the description of the test case and the code I can see two clear issues that are not problems with NSFileManager but rather in the code using it.

  1. the return value says that the operation failed ... so the possibility of data loss is because the calling code is ignoring the return value (allowing a user to think the copy completed when it did not).
  2. the calling code is not providing an error handler ... which means that on error the -copyPath:toPath:handler: method is required to abandon the copy and return a failure status. To have the operation attempt to continue needs a handler to return YES when it is asked whether to continue after the error.

If the code in GWorkspace is like that in the testcase, then certainly that's an issue: it's not handling errors well (no attempt to continue/recover) and is failing to report them to the user.

And it might be the case that the NSFileManager method is behaving as documented/designed (ie no bug), but it will take some time to reproduce the NFS setup and comparing the exact behavior with that of OSX to be certain.

@eukara
Copy link
Author

eukara commented Nov 2, 2024

The problem is concisely that NSFileManager its copyPath operation fails. It should not. There's no reason why it should. It is asked to copy a directory along with its contents, and it only manages to create a directory. It succeeds anywhere else on the locally mounted filesystems, but this NFS mount point where the server is expected to handle the re-mapping of permissions will confuse whatever is going on inside NSFileManager copyPath.

While the status return code is acceptable (and should signal GWorkspace to stop, work needs to be done there of course) the half-successful behaviour will trip applications such as GWorkspace. I assume they check if the destination dir exists, then (in the case of a File Move op) delete the source directory.

I don't see how NSFileManager managing to create an empty directory, but failing to populate it is acceptable either way. Perhaps when we've run out of disk space, or hit a different filesystem error - which is unlikely since I'm unable to reproduce this by moving files individually.

@rfm
Copy link
Contributor

rfm commented Nov 2, 2024

Failing to set correct ownership etc is often a major security issue, so yes there is at least one huge reason why the operation should fail (and there may be others: the fact that no reason comes to mind is not evidence that there are no reasons).
The method can't know whether the caller wants files created when the directory has incorrect attributes ... unless the handler supplied by the caller tells it.

So the method is almost certainly stopping the copy because the calling code effectively told it to, which is the designed/documented behavior. And it's returning the appropriate status to tell the calling code that it didn't complete the copy.

While there is a question as to whether the particular set of unusual circumstances should have caused it to ask the handler whether to continue, that's more a matter of OSX compatibility than anything else as far as I can see ... the API documentation can't really specify exactly which of the many possible errors (if any) are ignored rather than being passed to the handler.

The documentation of the method doesn't say that the method attempts to undo a partially completed copy on error, so I think the behavior of simply stopping when it's supposed to stop is correct, but again it may be that OSX behaves differently in practice.

So, there's a definite problem in the way the method is being used, but only a possible issue with the method itself.

@rfm
Copy link
Contributor

rfm commented Nov 3, 2024

I have reproduced this ... the underlying issue is that the directory can't be fully copied because the nfs mount doesn't permit the ownership to be set to the current user. Continuing at this point would be a a security flaw because it would allow copied data to be read by other people (and good practice is that operations must be secure by default,).
A partial (content only) copy might be what the user wants (eg. if you are deliberately making the data public), but a complete (ie attributes as well as content) copy is impossible.
From a security point of view this is the point at which we must have confirmation that an insecure copy is what is wanted, and indeed stepping through under debug reveals that the code then, correctly, attempts to call the handler method -fileManager:shouldProceedAfterError: passing it an error object with text "Unable to change NSFileOwnerAccountName to 'richard' ..." and with the paths of the source and destination.

@rfm
Copy link
Contributor

rfm commented Nov 3, 2024

But, at one point in the code, the result of calling -fileManager:shouldProceedAfterError: was ignored. That is a bug.
This has no effect in the testcase (which fails to set a handler), but is important to any code using the API correctly.
It turns out that the handler in GWorkspace is telling the operation to proceed, but in this instance that was ignored (I quickly reviewed the other places where the method is called, and they all looked OK).
I fixed it in git master to honor the response from the handler.

That leaves at least one GWorkspace issue:
If an operation returns a failure result, GWorkspace definitely should tell the user it has failed!
And one possible issue: arguably, if the handler is asked whether to proceed, GWorkspace should ask the user.

@rfm rfm closed this as completed Nov 3, 2024
@rfm
Copy link
Contributor

rfm commented Nov 3, 2024

PS. thanks for the testcase info, particularly the nfs setup and the ideas about likely reasons for copying the attributes to be impossible. Without that information the issue would likely have been unresolved for a very long time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

3 participants