Files deleted during review are not actually deleted #2572

Open
carakey opened this issue Mar 18, 2024 · 1 comment

carakey commented Mar 18, 2024

Descriptive summary

A number of problematic fileset objects were found after they recently failed fixity checks. These appear to be versions of files that were deleted and replaced during the review process prior to approval. Searching by depositor and comparing filenames and deposit dates, I was able to find a current work that appears to be the original parent for each problem fileset. In each case the comment history suggests changes were requested and made, and the work has at least one current, functional fileset. This indicates that the original file was supposed to have been deleted but persists in the system.

It is not clear whether this is the same issue as the ghost filesets in #2530, as these fileset objects have more extensive metadata, including full characterization information. The extent of the problem is not presently known.

One thing that stands out in the metadata is that 12 out of the 14 fixity failures have a timestamp value of 2021-11-14 between 14:44 and 16:58.
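
To check whether other fileset objects share that window, a filter along these lines might help. This is only a sketch: the issue does not say which metadata field holds the timestamp or which time zone it is in, so the function just takes ISO 8601 strings, ignores any offset, and treats 16:58 as inclusive through the end of that minute. `in_window` and `count_in_window` are hypothetical helper names.

```python
from datetime import datetime
from typing import Iterable

# Window taken from the observation above; the time zone is an assumption
# (the value is compared as written, with any offset dropped).
WINDOW_START = datetime(2021, 11, 14, 14, 44, 0)
WINDOW_END = datetime(2021, 11, 14, 16, 58, 59)


def in_window(timestamp: str) -> bool:
    """Return True when an ISO 8601 timestamp falls inside the window."""
    # datetime.fromisoformat does not accept a trailing "Z" on older
    # Python versions, so normalize it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(timestamp.replace("Z", "+00:00"))
    return WINDOW_START <= parsed.replace(tzinfo=None) <= WINDOW_END


def count_in_window(timestamps: Iterable[str]) -> int:
    """Count how many timestamps fall inside the window."""
    return sum(1 for ts in timestamps if in_window(ts))
```

Running `count_in_window` over the timestamps of all filesets, not only the fixity failures, would show whether the window is specific to the problem objects.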

Expected behavior

Deleting a fileset object removes it entirely from the system.

Actual behavior

The filesets identified in February and March 2024 fixity check reports remain findable by ID in Solr and continue to be subject to fixity checking after presumed deletion.

  • Filesets return a 500 error at the direct URL https://ir.library.oregonstate.edu/concern/file_sets/{fileset PID}.
  • Filesets are not associated with any parent work in Solr when searching file_set_ids_ssim:{fileset PID}.
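
The two Solr lookups behind these observations can be sketched as below. The `file_set_ids_ssim` field comes from this report; the base URL, the use of `id` as the unique key field, and the helper names are assumptions for illustration and would need to match the actual Solr core.

```python
from urllib.parse import urlencode


def solr_select_url(base_url: str, query: str, fields: str = "id") -> str:
    """Build a Solr /select URL for a single query string."""
    params = {"q": query, "fl": fields, "wt": "json", "rows": 10}
    return f"{base_url.rstrip('/')}/select?{urlencode(params)}"


def orphan_check_urls(base_url: str, fileset_pid: str) -> dict:
    """Return the two lookups: the fileset itself, and any parent work."""
    return {
        # Assumes the Solr unique key field is named "id".
        "fileset": solr_select_url(base_url, f'id:"{fileset_pid}"'),
        # Field name taken from the issue text.
        "parent": solr_select_url(base_url, f'file_set_ids_ssim:"{fileset_pid}"'),
    }


def is_orphaned(fileset_num_found: int, parent_num_found: int) -> bool:
    """A fileset is orphaned when it is indexed but no work references it."""
    return fileset_num_found > 0 and parent_num_found == 0
```

For example, `orphan_check_urls("http://localhost:8983/solr/hypothetical_core", pid)` gives two URLs to fetch; passing each response's `numFound` value to `is_orphaned` flags a fileset that matches the behavior described here.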

Related work

#2571 - deleting these specific fileset objects from the fixity failure reports
#2530 - more fileset objects that refused to be deleted

carakey commented Jun 12, 2024

Tested and did not replicate the problem:

  • Created a work for review jw827m60t (type = Undergraduate Thesis) and added two file objects 6h4412942 and gq67k088k
  • While in Pending Review workflow state, reviewer deleted gq67k088k and requested changes
  • While in Changes Requested workflow state, depositor added a new file version for 6h4412942 and restarted review
  • While in Pending Review workflow state, reviewer deleted the work jw827m60t

Following this test, none of the three PIDs were found in Solr.
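
For repeat runs of this test, the final check could be scripted roughly as follows. This is a sketch only: `lingering_pids` is a hypothetical helper, and `num_found_for` stands in for whatever function queries Solr and returns the hit count for one PID.

```python
from typing import Callable, Iterable, List

# PIDs from the test above: the work and its two file objects.
TEST_PIDS = ["jw827m60t", "6h4412942", "gq67k088k"]


def lingering_pids(
    pids: Iterable[str], num_found_for: Callable[[str], int]
) -> List[str]:
    """Return the PIDs that are still indexed after deletion.

    num_found_for is a caller-supplied lookup (an assumption here) that
    returns how many Solr documents match the given PID.
    """
    return [pid for pid in pids if num_found_for(pid) > 0]
```

An empty list from `lingering_pids(TEST_PIDS, lookup)` corresponds to the result reported here; any PID in the list would be a fileset or work that survived deletion.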
