Hard linked files are sometimes missed #29
Comments
The file's basename and inode are tracked in a hash, and another file is not reported if it has the same inode and basename. I believe that map is to help when following a symlink, if the target directory would also be traversed on its own. In this case the match is a false-positive. One solution (though not perfect) is to ignore the match in the hash if the target file is actually a hard link. This helps narrow one corner case, but still leaves other more obscure corner cases. I am providing two possible (imperfect) solutions for your review.
I suspect a more complete solution would require keeping track of symlinks encountered during the scan and then resolving and reporting any dangling symlink targets after all other files are reported.
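To make the first idea concrete, here is a minimal sketch of a "seen" check that ignores the hash match when the file has more than one link. It is not the library's actual code: `makeSeenFilter` and the key format are hypothetical names chosen for illustration, and the only real API assumed is the `ino`, `nlink`, and `isFile()` members of Node's `fs.Stats`.

```javascript
const path = require('path');

// Hypothetical helper, not part of the library under discussion.
// Returns a function that answers "has this entry been reported already?".
function makeSeenFilter() {
  const seen = new Set(); // keys of the form "<inode>:<basename>"
  return function alreadySeen(fullPath, stat) {
    // A regular file with nlink > 1 has several names for one inode.
    // Treat every name as distinct so hard links are always reported.
    if (stat.isFile() && stat.nlink > 1) return false;
    const key = stat.ino + ':' + path.basename(fullPath);
    if (seen.has(key)) return true;
    seen.add(key);
    return false;
  };
}
```

A caller would pass the result of `fs.statSync(fullPath)` as `stat`. As noted above, this only narrows one corner case: a hard-linked file reached twice through a symlinked directory would now be reported twice.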
I'm not sure I understand how to use nlink the way you propose, but I'll check it out. Thanks for the examples. The reason we keep the hash is to prevent infinite recursion when following symlinks to directories. The current behavior is too broad, in that we never report a file we have seen before. We should probably change the behavior to never list a directory we have already listed, and bump the major version. The basename check seems like a bug disaster waiting to happen on Windows, where ino is empty.
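A rough sketch of the "never list a directory twice" behavior might look like the following. It assumes directories are identified by their resolved path via `fs.realpathSync` rather than by inode, which is my own choice to sidestep the empty `ino` on Windows; the `walk` function is hypothetical and synchronous for brevity, not the library's API.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical walker: follows symlinks, reports every file name it finds,
// and lists each real directory at most once.
function walk(root) {
  const listedDirs = new Set(); // resolved paths of directories already listed
  const files = [];
  (function visit(dir) {
    const real = fs.realpathSync(dir);
    if (listedDirs.has(real)) return; // already listed: breaks symlink cycles
    listedDirs.add(real);
    for (const name of fs.readdirSync(dir)) {
      const full = path.join(dir, name);
      let stat;
      try {
        stat = fs.statSync(full); // statSync follows symlinks
      } catch (err) {
        continue; // dangling symlink or unreadable entry; skipped in this sketch
      }
      if (stat.isDirectory()) visit(full);
      else files.push(full);
    }
  })(root);
  return files;
}
```

Because only directories are deduplicated, two hard-linked files with the same basename are both reported, and a symlink that points back to an ancestor directory still cannot cause infinite recursion.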
For what it's worth, I'm seeing similar behavior walking node_modules on a Windows 8.1 machine. I'm not sure if hard links are at play in my case, but we see files with the same name occasionally skipped. As a simple example: consider a node_modules folder that includes
@MarkDuckworth I'm curious whether either of my proposed fixes would work in your situation. Are you able to try them? Also, in your case, are the contents of the two files the same?
In a situation where two directories contain files of the same name that are hard links to each other, the second one is not reported. For example, the tzdata package installs many copies of the same files into /usr/share/zoneinfo, such that files like `/usr/share/zoneinfo/GMT0` and `/usr/share/zoneinfo/Etc/GMT0` (and many other examples) are hard links. Both `GMT0` and `Etc/GMT0` should be reported when traversing `/usr/share/zoneinfo`.
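For anyone without tzdata installed, the layout can be approximated in a temporary directory. This is only an illustrative sketch: the directory prefix, file contents, and the `makeHardLinkedPair` helper are made up, and it assumes a filesystem that supports hard links.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Builds a throwaway tree that mimics the zoneinfo layout (names are illustrative).
function makeHardLinkedPair() {
  const root = fs.mkdtempSync(path.join(os.tmpdir(), 'zoneinfo-'));
  const first = path.join(root, 'GMT0');
  const second = path.join(root, 'Etc', 'GMT0');
  fs.mkdirSync(path.join(root, 'Etc'));
  fs.writeFileSync(first, 'tz data');
  fs.linkSync(first, second); // second name for the same underlying file
  return { root, first: fs.statSync(first), second: fs.statSync(second) };
}

const { root, first, second } = makeHardLinkedPair();
// Both names share an inode and a basename, so an inode+basename key collides.
console.log(root, first.ino === second.ino, first.nlink, second.nlink);
```

Traversing the printed `root` with the library should then show whether `Etc/GMT0` is dropped.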