Bulk delete filesets from 2024 Feb/Mar fixity failures #2571

Closed
carakey opened this issue Mar 18, 2024 · 7 comments

Comments

@carakey

carakey commented Mar 18, 2024

Descriptive summary

The February and March fixity check reports both had identical sets of 14 failed file IDs. I researched these and found that one ID (kp78gg577) is a member of the set of known ghost filesets from #2530. Based on evidence including comment history and duplicate filenames, the remaining 13 all appear to be older versions of files that were supposedly deleted during review prior to approval. This ticket pertains to deleting the problematic filesets -- they do not load on the site so cannot be deleted in the UI. There is a separate ticket for the underlying issue, #2572.

The following filesets should be deleted if possible -- with some suspicion that they either are or will become zombies/ghosts.

1z40m1672
8c97kx72p
br86bb654
dj52wc82j
hx11xp80b
jm214w89m
pc289r65p
qn59qb39p
qv33s449d
v979v971k
vh53x358h

Update: These two PIDs were on the list originally, but appear to have been deleted as of April 2024:
41687r184
9p290h89n

Related work

#2572
#2530

@carakey
Author

carakey commented Apr 16, 2024

Update -- the April fixity report had 12 out of 14 of these PIDs as failures. The two no longer appearing are 41687r184 and 9p290h89n. I checked Solr and these two appear to have been successfully removed.
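The month-to-month comparison described here is a set difference between two reports. A minimal sketch in plain Ruby, assuming each report can be reduced to a list of failed IDs; `compare_fixity_reports` is a hypothetical helper, not part of Hyrax, and the April list is reconstructed from the statement above rather than copied from the actual report:

```ruby
require "set"

# Hypothetical helper: compares the failed IDs of two fixity reports.
def compare_fixity_reports(previous_ids, current_ids)
  previous = Set.new(previous_ids)
  current  = Set.new(current_ids)
  {
    resolved:      (previous - current).to_a.sort, # failed before, not now
    still_failing: (previous & current).to_a.sort, # failed in both reports
    new_failures:  (current - previous).to_a.sort  # failed only now
  }
end

# The 14 IDs named in this ticket (11 still listed, 2 since removed,
# plus kp78gg577 from #2530).
march = %w[
  1z40m1672 8c97kx72p br86bb654 dj52wc82j hx11xp80b jm214w89m
  pc289r65p qn59qb39p qv33s449d v979v971k vh53x358h
  41687r184 9p290h89n kp78gg577
]

# Assumption: April is the March set minus the two PIDs reported as gone.
april = march - %w[41687r184 9p290h89n]

result = compare_fixity_reports(march, april)
puts "Resolved: #{result[:resolved].join(', ')}"
puts "Still failing: #{result[:still_failing].size}"
```

Running the same comparison after each monthly check would also surface any new failures, which this ticket's list does not track.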

@carakey
Author

carakey commented Jun 7, 2024

I'm adding this to the Summer board as part of the ghost fileset investigation. If they resist deletion by standard means, no need to pursue further -- we'll update the ghost-tracking list to include them and close this ticket.

@straleyb straleyb self-assigned this Jun 10, 2024
@straleyb
Contributor

@carakey I'm gonna give this a go and will report back once I've gone through and tried to delete the list of PIDs.

@straleyb
Contributor

I went through and checked on these works. All the works are coming back as Ldp::Gone, meaning that Fedora can't find them in its tree for some reason. That means grabbing the work and deleting it using ActiveFedora::Base is impossible. We can try and see if deleting them through the Fedora front end is possible, but I know Ryan W has tried that and it didn't work. This is, from what I understand, where the Ghost part of the Ghost FileSets comes from.

I did find them in Solr though. I was able to grab them using SolrDocument.find(id) and they returned with a Solr Document. I used Hyrax::SolrService.delete(id) to delete them and double checked using SolrDocument.find(id) and verified that they were removed from the Index.
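The steps described in these two comments can be sketched as one loop. This is a rough sketch, not the exact console session: `purge_ghosts_from_index` is a hypothetical helper, the three callables are injected so the logic can be tried outside the app, and treating any error from the Solr lookup as "not in the index" is an assumption. Whether a separate Solr commit is needed after the delete is not covered in this thread.

```ruby
# Stand-in so the sketch runs outside the app. In a Hyrax console the real
# Ldp::Gone is already loaded and this definition is skipped.
unless defined?(Ldp::Gone)
  module Ldp
    class Gone < StandardError; end
  end
end

# Hypothetical helper, not part of Hyrax.
# fedora_find, solr_find, solr_delete: callables that take an ID.
def purge_ghosts_from_index(ids, fedora_find:, solr_find:, solr_delete:)
  report = { deleted_from_index: [], not_in_index: [], still_in_fedora: [] }

  ids.each do |id|
    begin
      fedora_find.call(id)
      # Fedora still has the object, so it is not a ghost; leave it alone.
      report[:still_in_fedora] << id
      next
    rescue Ldp::Gone
      # Fedora cannot return the object; fall through to the Solr check.
    end

    # Assumption: a raised error or nil both mean "no Solr document".
    doc = begin
      solr_find.call(id)
    rescue StandardError
      nil
    end

    if doc.nil?
      report[:not_in_index] << id
    else
      solr_delete.call(id)
      report[:deleted_from_index] << id
    end
  end

  report
end

# In a Rails console, using the calls named in this thread:
#   purge_ghosts_from_index(ids,
#     fedora_find: ActiveFedora::Base.method(:find),
#     solr_find:   SolrDocument.method(:find),
#     solr_delete: Hyrax::SolrService.method(:delete))
```

Checking Fedora first keeps the loop from removing index entries for filesets that still exist, since those should be deleted through the normal path instead.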

I'm going to try to see if there are any services Hyrax has that interact with Fedora on a deeper level, like the SolrService, but I won't spin my wheels on it. This is about how far I've been able to get in the past with the Ghost FileSets.

@carakey
Author

carakey commented Jun 11, 2024

@straleyb Thanks for taking a look and for the explanation. Good timing, too - the fixity check should run later this week, so we'll see if they pop up again there.

@carakey
Author

carakey commented Jun 17, 2024

@straleyb These filesets did not reappear on the June fixity report. Do we conclude that they were just leftover Solr apparitions, already removed from the database, and now also removed from the index?

@straleyb
Contributor

Sounds like a pretty accurate conclusion to me. All I did was delete them from Solr, and if it's resolved now, it's gotta be that. Case closed!

@carakey carakey closed this as completed Jun 17, 2024