-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Store content object collections in search index [FC-0062] #35469
Store content object collections in search index [FC-0062] #35469
Conversation
…#674) * chore: uses openedx-learning==0.11.3 * feat: add/remove components to/from a collection Sends a CONTENT_OBJECT_TAGS_CHANGED for each component added/removed. * docs: Add warning about unstable REST APIs * refactor: use oel_collections.get_collections as oel_collections.get_learning_package_collections has been removed. * test: fixes flaky collection search test * refactor: simplify the REST API params and validation
For the Collection "create" and "update" views, we call an authoring_api method and emit an event on success. This change simplifies the view code by moving these call to new content_libraries.api methods.
and refactors collection views to use DRF conventions.
They expect a UsageKey object, not the string object_id.
and ensures this new field is searchable and filterable. Serializes the object's collections to a list of collection.key values.
which adds CONTENT_OBJECT_ASSOCIATIONS_CHANGED
whenever a content object's tags or collections have changed, and handle that event in content/search. The deprecated CONTENT_OBJECT_TAGS_CHANGED event is still emitted when tags change; to be removed after Sumac.
…_TAGS_CHANGED in content_tagging app, while CONTENT_OBJECT_TAGS_CHANGED is being deprecated.
Collection.key is stored here, not ID
and re-raise as api.LibraryCollectionAlreadyExists
…tions-crud-rest-api
and fixes event types documented in the hooks.
Thanks for the pull request, @pomegranited! What's next?Please work through the following steps to get your changes ready for engineering review: 🔘 Get product approvalIf you haven't already, check this list to see if your contribution needs to go through the product review process.
🔘 Provide contextTo help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:
🔘 Get a green buildIf one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green. 🔘 Let us know that your PR is ready for review:Who will review my changes?This repository is currently maintained by Where can I find more information?If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:
When can I expect my changes to be merged?Our goal is to get community contributions seen and reviewed as efficiently as possible. However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:
💡 As a result it may take up to several weeks or months to complete a review and merge your PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @pomegranited!
Great work here! I added some comments, and as soon as they are addressed, this will be ready for CC review!
- I tested this using the instructions from the PR
- I read through the code
-
I checked for accessibility issues - Includes documentation
|
||
e.g. for something in Collections "COL_A" and "COL_B", this would return: | ||
{ | ||
"collections": ["COL_A", "COL_B"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pomegranited Sorry for the late request (I overlooked this before), but what about changing the structure here?
"collections": ["COL_A", "COL_B"], | |
"collections": [ | |
{ "display_name": "Collection A", key: "COL_A" }, | |
{ "display_name": "Collection B", key: "COL_B" }, | |
], |
That way, we can use the display_name
as a searchable attribute and the slug/key
to a dev action (like a redirect).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
).values_list("key", flat=True) | ||
except ObjectDoesNotExist: | ||
log.warning(f"No component found for {object_id}") | ||
|
||
if collections: | ||
result[Fields.collections] = list(collections) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the component is not in any collections, we must return an empty list to ensure the index is updated with this information.
I created a small PR here: open-craft#685
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mock_meilisearch.return_value.index.return_value.update_documents.assert_has_calls( | ||
[ | ||
call([doc_problem_with_collection2]), | ||
call([doc_problem_with_collection1]), | ||
], | ||
any_order=True, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can have a more deterministic approach here to assert we don't have extra calls
mock_meilisearch.return_value.index.return_value.update_documents.assert_has_calls( | |
[ | |
call([doc_problem_with_collection2]), | |
call([doc_problem_with_collection1]), | |
], | |
any_order=True, | |
self.assertEqual( | |
mock_meilisearch.return_value.index.return_value.update_documents.call_count, | |
2, | |
) | |
mock_meilisearch.return_value.index.return_value.update_documents.assert_has_calls( | |
[ | |
call([doc_problem_with_collection2]), | |
call([doc_problem_with_collection1]), | |
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure -- we still can't rely on the order these calls are made in, but I've added asserts for the call_count
, e.g. a81ea9a#diff-f6238e12f564d80e4a90b6f385c31c1396de25784be34992c1f17d1c9ca557a1R353
if "tags" in content_object.changes: | ||
upsert_block_tags_index_docs(usage_key) | ||
elif "collections" in content_object.changes: | ||
upsert_block_collections_index_docs(usage_key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: If we have empty changes
, we assume everything will change.
Also, we cannot use elif
here, or we may skip the collecions
update.
if "tags" in content_object.changes: | |
upsert_block_tags_index_docs(usage_key) | |
elif "collections" in content_object.changes: | |
upsert_block_collections_index_docs(usage_key) | |
if not content_object.changes or "tags" in content_object.changes: | |
upsert_block_tags_index_docs(usage_key) | |
if not content_object.changes or "collections" in content_object.changes: | |
upsert_block_collections_index_docs(usage_key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 360ec35
@@ -288,7 +334,7 @@ def searchable_doc_for_collection(collection) -> dict: | |||
found using faceted search. | |||
""" | |||
doc = { | |||
Fields.id: collection.id, | |||
Fields.id: collection.key, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we are better with the id
here. I fixed it in the other PR and add comments here: 1738447
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#35321 is merged, so we can update this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need the collection.key in order to call the REST APIs though -- I've re-added it in the internal PR: open-craft@5bdcc9e
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for catching this @pomegranited!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good. Rebase this with the changes of #35321 and I will do another quick review
so we can index on the collection key or title.
@ChrisChV I think it's worth merging open-craft#684 into here first before we merge this change -- it contains a number of necessary fixes and additions. |
to update the index before the frontend re-fetches
Store Collection metadata + component count in meilisearch
@pomegranited I will merge this tomorrow morning |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
1 similar comment
2U Release Notice: This PR has been deployed to the edX production environment. |
Description
This PR ensures that when a content object is added/removed from a Collection, its search index document is updated to reflect this change. The
collection.key
values for each collection are stored in a newcollections
list.To support this, we added a new event called
CONTENT_OBJECT_ASSOCIATIONS_CHANGED
, and trigger it when an object's collections change. This event is also now triggered when an object's tags change. The previousCONTENT_OBJECT_TAGS_CHANGED
event is still emitted for tag changes, but is now deprecated.Useful information to include:
Supporting information
Related Tickets:
Depends on / blocked by:
Testing instructions
tutor dev run cms python manage.py migrate
lib:SampleTaxonomyOrg1:AL1
Hint: you can retrieve the block keys using the REST API, e.g. http://studio.local.edly.io:8001/api/libraries/v2/lib:SampleTaxonomyOrg1:AL1/blocks/
tutor dev run cms ./manage.py cms reindex_studio --experimental
to add the newcollections
field.Your api key can be found with
tutor config printvalue MEILISEARCH_API_KEY
FAL-3787
and verify that each of your blocks hasFAL-3787
showing in itscollections
list.Deadline
Before Sumac gets cut in October.