Go through DMCI workdir file-accumulation #211

Open
johtoblan opened this issue Nov 22, 2023 · 5 comments

@johtoblan
Collaborator

johtoblan commented Nov 22, 2023

No description provided.

@johtoblan
Collaborator Author

This might be caused by PostGIS being down. Should files in the workdir that are older than X be iterated over and re-processed automatically, or should they just be removed?

@johtoblan johtoblan changed the title Go through DMCI workdir accumulation Go through DMCI workdir file-accumulation Nov 22, 2023
@mortenwh
Collaborator

We need to ensure that they are ingested; then we can delete them.

@mortenwh
Collaborator

mortenwh commented Jan 19, 2024

Test:

  • Push valid xml to dmci
  • Let file dist succeed but pycsw and solr dists fail
  • In this case, the file should be removed from workdir but added to rejected dir
  • Let all dists succeed
  • Then, the file should not be in workdir and not in rejected
  • Let all dists fail
  • The file should be in rejected and not in workdir
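
The expected outcomes in the list above could be captured in a small helper like the one below. This is only a sketch under assumptions: `finalize_work_file`, the `results` mapping, and the rule "any failed dist sends the file to rejected" are hypothetical and generalised from the three cases listed; they are not taken from the dmci code.

```python
import shutil
from pathlib import Path


def finalize_work_file(work_file: Path, rejected_dir: Path, results: dict[str, bool]) -> str:
    """Clear work_file out of the workdir once all dists have run.

    `results` maps a dist name (e.g. "file", "pycsw", "solr") to whether
    it succeeded. Hypothetical helper, not part of dmci.
    """
    if all(results.values()):
        # Every dist succeeded: the file is ingested and can be deleted.
        work_file.unlink()
        return "removed"
    # At least one dist failed (assumed rule): keep a copy for inspection,
    # but never leave the file behind in the workdir.
    rejected_dir.mkdir(parents=True, exist_ok=True)
    shutil.move(str(work_file), str(rejected_dir / work_file.name))
    return "rejected"
```

In all three scenarios the file leaves the workdir, so a file that remains there would point to a process that never reached this step.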

If all of these tests pass, the build-up of files in the workdir must be caused by the container crashing in the middle of an ingestion. In that case, dmci should pick up the files left in the workdir and re-run the process. Currently, dmci only acts on API requests (i.e., when a file is pushed to the API); it does not check the workdir.

Lines 140 to 165 of app.py should be moved into a function, so that the API handler can simply do: return tools.insert_update(...)

Then a tool should be added that regularly checks the workdir and runs insert_update. If all works as expected, this should empty the workdir.
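
A minimal sketch of such a periodic check is shown below. It is an illustration under assumptions, not the dmci implementation: `sweep_workdir`, the `*.xml` pattern, the `min_age_s` threshold, and the `process` callback (a stand-in for the proposed `tools.insert_update`, whose real signature is not given here) are all hypothetical. The callback is assumed to remove the file from the workdir itself and to return True on success.

```python
import time
from pathlib import Path
from typing import Callable, Optional


def sweep_workdir(
    workdir: Path,
    process: Callable[[Path], bool],
    min_age_s: float = 300.0,
    now: Optional[float] = None,
) -> list[Path]:
    """Re-run `process` on files that have been left in the workdir.

    Files younger than `min_age_s` seconds are skipped, on the assumption
    that they may still belong to an ingestion that is in progress.
    Returns the paths for which `process` reported success.
    """
    now = time.time() if now is None else now
    handled = []
    for path in sorted(workdir.glob("*.xml")):
        try:
            age = now - path.stat().st_mtime
        except FileNotFoundError:
            # The file was handled by another process in the meantime.
            continue
        if age < min_age_s:
            continue
        if process(path):
            handled.append(path)
    return handled
```

The function could be called from a scheduler or a simple loop with a sleep. The age threshold is there so that the sweep does not race with a request that is currently being handled by the API.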

@mortenwh mortenwh reopened this Oct 7, 2024
@mortenwh
Collaborator

mortenwh commented Oct 7, 2024

The workdir still often contains files and does not seem to be cleaned up properly. This may be related to some issues in solr; see #240.

@mortenwh
Collaborator

Is this related to #237 and #240? @shamlymajeed - could you follow up on this?
