IndexJob model #1699

Open: wants to merge 37 commits into base: develop

lukavdplas
Contributor

Add an IndexJob model to track indexing commands. Closes #1696.

An IndexJob describes a set of index-related tasks, e.g. create a new index for a corpus, populate it with documents, and update the alias. Tasks are broken up into IndexTask models.

At this point, the IndexJob is essentially a log for the index and alias commands; you can't create a job and then run it. The model also does not include status tracking. I'll tackle both in #1697.

Functions related to indexing commands are refactored. The index and alias commands now create an IndexJob to describe the command, and then call perform_indexing(job) to run it. This accounts for most of the code changes, as it separates "figuring out what to do" from actually doing it.
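To illustrate the split between planning and running, here is a minimal sketch in plain Python. It is not the code in this PR: Job, Task and plan_index_job are hypothetical stand-ins for the Django models and command logic, the index name is made up, and only the name perform_indexing(job) comes from the description above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Task:
    """Hypothetical stand-in for an IndexTask: one step of a job."""
    description: str
    action: Callable[[], None]


@dataclass
class Job:
    """Hypothetical stand-in for an IndexJob: an ordered set of tasks."""
    corpus: str
    tasks: List[Task] = field(default_factory=list)


def plan_index_job(corpus: str, log: List[str]) -> Job:
    """Figure out what to do, without doing any of it yet."""
    index = f'{corpus}-1'  # illustrative index name, not the real naming scheme
    return Job(corpus, [
        Task('create index', lambda: log.append(f'create {index}')),
        Task('populate index', lambda: log.append(f'populate {index}')),
        Task('update alias', lambda: log.append(f'alias {corpus} -> {index}')),
    ])


def perform_indexing(job: Job) -> None:
    """Actually run the planned tasks, in order."""
    for task in job.tasks:
        task.action()
```

Because planning has no side effects in this sketch, the planned tasks can be inspected or tested before anything touches the index.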

There is a new indexing app which contains indexing-related models. This was to keep models.py from becoming too bloated. Functions and commands related to indexing should probably be moved to this app eventually, but I did not do that here.

The index and alias commands still work as before, with a few slight changes to index:

  • In development mode, if you want to add documents to an existing index, you have to use --add, like you would in production mode. Running python manage.py index mycorpus when the index already exists will raise an error, unless you use --delete or --add.
  • --update no longer excludes the option to use --rollover. (the crowd goes wild)
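The rule for an existing index can be sketched as a small decision function. This is only an illustration: resolve_index_action and its return values are hypothetical, and the behaviour when both flags are given, or when --add is used without an existing index, is an assumption rather than something stated above.

```python
def resolve_index_action(index_exists: bool, add: bool = False, delete: bool = False) -> str:
    """Decide how the index command should treat the target index.

    Hypothetical helper; the flag names mirror --add and --delete.
    """
    if not index_exists:
        # Assumption: with no existing index, a new one is simply created.
        return 'create'
    if delete:
        # Assumption: --delete takes precedence if both flags are given.
        return 'delete and recreate'
    if add:
        return 'add to existing'
    # No flag given and the index exists: refuse, in development mode too.
    raise ValueError('index already exists; use --add or --delete')
```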

Small addition:

  • You can now define min_date / max_date in a Python corpus definition as a date instead of a datetime, though individual corpora may still implement sources() in a way that doesn't support this.
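A corpus whose sources() expects datetime values could normalise the bounds first. The helper below is a hypothetical sketch of one way to do that, not something this PR adds:

```python
from datetime import date, datetime


def as_datetime(value: date) -> datetime:
    """Return value as a datetime, treating a plain date as midnight that day.

    Hypothetical helper for corpus definitions.
    """
    # datetime is a subclass of date, so check for it first.
    if isinstance(value, datetime):
        return value
    return datetime(value.year, value.month, value.day)
```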

Deployment configuration

The 'indexing' app must be added in the project settings.
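In a standard Django settings module this amounts to one extra entry in INSTALLED_APPS. The fragment below is a sketch; the other entries of the list are omitted because they depend on the deployment.

```python
# settings.py (fragment): existing entries omitted
INSTALLED_APPS = [
    # ... apps already configured for this deployment ...
    'indexing',  # new app containing the indexing-related models
]
```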

@lukavdplas lukavdplas added the labels backend (changes to the django backend) and affects-deployment (changes that require an update in the deployment module) Nov 13, 2024
@lukavdplas lukavdplas changed the title from "Feature/index job models" to "IndexJob model" Nov 13, 2024
@lukavdplas lukavdplas marked this pull request as ready for review November 14, 2024 09:51
Base automatically changed from feature/index-overview to develop November 14, 2024 10:43
@lukavdplas lukavdplas linked an issue Nov 15, 2024 that may be closed by this pull request