Apparent corruption with readonly cache?? #4
Comments
I know it's been nearly a couple of years and everyone has probably moved on, but I'm getting the same issue. I set up a write-back cache on one partition and that works flawlessly. I then set up a read-only cache on my boot drive and instantly got corrupted blocks when reading from it (I tried verifying a completed torrent). From that point on, no apps will load and everything errors out. Anything that gets touched (read or written) gets corrupted. I assumed running it off the boot drive caused it, but perhaps it's just the read-only mode. I might take the plunge, try a write-through cache, and hope for the best. Setup: Ubuntu 20, an external USB 3.1 SSD, and two partitions running on mdadm RAID5 and RAID6. Wish me luck. If you never hear from me again, write-through either didn't work or I'm happily chugging away.
Tried it on my RAID6 boot drive with write-through, and verifying a torrent immediately led to about 20% corrupt data, followed by the system going downhill quickly. Deleting the cache, rebooting, and re-verifying the same corrupt torrent brought me back to around 99% good data. I assume the 1% was just the torrent running for a few seconds trying to overwrite the "bad" stuff. So it turns out this is semi-reversible if one is experiencing corruption. The secondary partition, running RAID5 with a read-only cache, still appears fine after all this time, so perhaps it just doesn't like caching boot drives.
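For anyone who wants to repeat that kind of before/after check without a torrent client, here is a minimal sketch: hash every file under a mount point once with the cache enabled and once with it removed, then list the paths whose hashes differ. The function names `hash_tree` and `diff_manifests` are made up for illustration and are not part of EnhanceIO. The sketch also assumes the data is not being modified between the two runs, and that the filesystem is remounted (or the machine rebooted) between runs so the second pass is not simply served from memory.

```python
import hashlib
import os


def hash_tree(root, chunk_size=1 << 20):
    """Return {relative_path: sha256 hex digest} for every file under root.

    Files that cannot be read are recorded with an "ERROR: ..." marker
    instead of a digest, so I/O errors show up in the comparison too.
    """
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            digest = hashlib.sha256()
            try:
                with open(full, "rb") as fh:
                    # Read in chunks so large files do not need to fit in RAM.
                    for chunk in iter(lambda: fh.read(chunk_size), b""):
                        digest.update(chunk)
            except OSError as exc:
                manifest[rel] = "ERROR: %s" % exc
                continue
            manifest[rel] = digest.hexdigest()
    return manifest


def diff_manifests(before, after):
    """Return the sorted paths whose entries differ or exist on one side only."""
    paths = before.keys() | after.keys()
    return sorted(p for p in paths if before.get(p) != after.get(p))
```

Typical use would be `with_cache = hash_tree("/some/mountpoint")`, then removing the cache and remounting, then `diff_manifests(with_cache, hash_tree("/some/mountpoint"))`. An empty list means both passes read identical data.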
Hi
Yeah, I think this project has something that the other caching solutions do not: you can add/remove/modify the cache without needing to wipe the whole partition. I’m personally interested in it.
Y'all seem to have the most active of the EnhanceIO forks, so I'm going to ask here to see if perhaps you can help me. I'm generally pretty clueful, but I've got no clue how to even start debugging a problem like this, so I'm reaching out. EnhanceIO seems like the only caching solution that actually meets my needs, so I really appreciate any help/pointers/etc you could give.
The setup:
Spinning rust: A Drobo 5C (think "raid array"), formatted NTFS (don't judge me!), connected via USB3, and mounted as /export/Drobo using the ntfs-3g FUSE module (this setup has worked flawlessly for about a decade, FWIW). The device shows up as a 64TB filesystem (despite not having that much actual disk) due to thin provisioning magic that is transparent to anything using the device.
Cache disk: Partition #3 on a SATA-attached Samsung 860 EVO, roughly 120GB in size
Kernel: The 4.4.176 "LTS" kernel, as provided by elrepo (specifically 4.4.176-1.el7.elrepo.x86_64)
enhanceio: current HEAD (009db3a)
userspace: RHEL 7.6, though this shouldn't matter much, due to the kernel being a completely stock 4.4 kernel (and I've verified that none of the "Redhat backported a bunch of stuff" fixes to the driver build are getting included)
Cache configured with:
The problem:
I was testing this out by doing ls -alR /export/Drobo to prime the cache with at least the directory entries (the device has several hundred thousand files on it). After this has been running for 10 or 20 seconds, the system log starts to spam messages like this:
Once this starts happening, attempting to just walk the filesystem results in I/O errors:
...and accompanying log entries...
(I'm assuming the specific ntfs-3g errors won't be meaningful to anyone except the ntfs-3g developers, but I'm including them for completeness.)
Once this starts happening, to get a completely functional filesystem again, I need to disable the cache and remount the filesystem (I'm guessing some bad blocks are getting into the buffer cache).
At no point in this exercise does /proc/enhanceio/DroboCache/errors show any nonzero values.
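A small sketch of how that check can be automated while the ls is running. It assumes the errors file is plain text with one counter per line, a name followed by an integer value. That layout is an assumption, not something confirmed here, and the helper name `nonzero_counters` is made up for illustration.

```python
def nonzero_counters(text):
    """Return {name: value} for every counter line whose value is not zero.

    Assumed line layout: "<counter_name> <integer>" separated by whitespace.
    Lines that do not match this shape are skipped.
    """
    found = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        name, raw = parts[0], parts[-1]
        try:
            value = int(raw)
        except ValueError:
            continue  # not a counter line under the assumed layout
        if value != 0:
            found[name] = value
    return found


def read_nonzero_counters(path):
    """Read a counters file (e.g. the errors file under /proc) and filter it."""
    with open(path) as fh:
        return nonzero_counters(fh.read())
```

Calling `read_nonzero_counters("/proc/enhanceio/DroboCache/errors")` in a loop during the test would show whether any counter ever moves. An empty dict matches what was observed above.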
Any ideas? Is there any debugging functionality available that I could enable to allow me to provide more information? Halp?