feat: update search index when course content is updated [FC-0040] #34391

rpenido · 2024-03-19T18:38:25Z

Description

This PR updates the search index with content metadata whenever the content changes.

More Information

Closes [Course Search] Update search index when course content is changed modular-learning#196
Depends on:
- Index Studio content using Meilisearch [FC-0040] #34310

Testing Instructions

Run your local stack on this branch
Make sure you have Meilisearch set up locally; follow the setup instructions here: https://github.com/open-craft/tutor-contrib-meilisearch
Make sure you also have some sample taxonomies/tags; you can add some from here: https://github.com/open-craft/taxonomy-sample-data
Run tutor dev run cms bash, then ./manage.py cms reindex_studio
View the resulting index at http://meilisearch.local.edly.io:7700/ (see the tutor-contrib-meilisearch README for how to get the API key to log in)
Go back to Studio, and add/update XBlocks in a course
Go back to the Meilisearch dashboard, and confirm that the XBlock docs have been updated to reflect the changes you made
Repeat and confirm the above for both units and blocks
Also, try to create a Content Library with some components and check if the index is updated accordingly.

NOTE: Meilisearch seems to cache queries along with their results in the frontend, so if you simply search, it might show you stale data (the network tab shows no requests), especially if you repeat the same query. Make sure you refresh the Meilisearch dashboard (http://meilisearch.local.edly.io:7700/) before performing the search.
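
One way to sidestep the dashboard cache mentioned in the note is to query the Meilisearch search endpoint directly. The snippet below is only a sketch: it builds (but does not send) a `POST /indexes/{index_uid}/search` request, and the index name `my_index` and key `MY_API_KEY` are placeholders, not values taken from this PR.

```python
import json
import urllib.request


def build_search_request(base_url: str, index_uid: str, api_key: str, query: str) -> urllib.request.Request:
    """Build a search request for a Meilisearch index without sending it."""
    body = json.dumps({"q": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/indexes/{index_uid}/search",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# "my_index" and "MY_API_KEY" are placeholders (assumptions for illustration).
req = build_search_request(
    "http://meilisearch.local.edly.io:7700/", "my_index", "MY_API_KEY", "problem"
)

# To run the search against a live instance:
#   with urllib.request.urlopen(req) as resp:
#       hits = json.load(resp)["hits"]
```

Because every call goes over the network, the response should reflect the current state of the index rather than whatever the dashboard has cached.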

Private ref: FAL-3690

openedx-webhooks · 2024-03-19T18:38:30Z

Thanks for the pull request, @rpenido! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

supporting documentation
Open edX discussion forum threads
timeline information ("this must be merged by XX date", and why that is)
partner information ("this is a course on edx.org")
any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

yusuf-musleh

Great work on this @rpenido ! I realized after reviewing that a lot of the changes on this PR diff are from @bradenmacdonald 's PR, since this PR wasn't updated with upstream yet. Nonetheless, I think the code overall is really well structured and easy to follow!

Just had a small question/nit.

However, in terms of testing, I'm not sure if it's ready or not, but when I briefly tried it I ran into some errors (maybe because the PR needs to be updated with upstream), so I just reviewed the code for now. @rpenido Let me know when it's ready for another review/testing.

openedx/core/djangoapps/content/search/api.py

yusuf-musleh · 2024-03-20T10:30:18Z

openedx/core/djangoapps/content/search/handlers.py

+    upsert_xblock_index_doc.delay(
+        str(xblock_info.usage_key),
+        recursive=True,  # Update all children because the breadcrumb may have changed
+        update_metadata=True,
+        update_tags=False,
+    )


Why is update_tags set to False here? Is it because we are going to implement updating tags in the index separately in openedx/modular-learning#197 ?

Yes! This handler is called when we receive an XBLOCK_UPDATED event. If this block already has tags, I think the index will still be up to date with its information.

upsert_xblock_index_doc does not replace the document in the index; it only updates it with the new fields (https://www.meilisearch.com/docs/reference/api/documents#add-or-update-documents).

For the tagging data, I was planning to mirror the implementation we did here:

Create an OBJECT_TAGGED event that is dispatched when we call the tag_object api.

Create a new handler here that only updates the tags for the object and leaves the metadata as it is (recursive=False, update_metadata=False, update_tags=True).

But I'm open to new ideas if you think otherwise! 😄

Yes, as long as we are careful not to overwrite the tags field with an empty list, this is a good approach. Generally the tags haven't changed when this event is called, so there's no reason to update them.

Got it! Thanks for the clarification, and I agree I think it's a good approach. I'll follow it when working on openedx/modular-learning#197
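
To make the partial-update reasoning in this thread concrete, here is a minimal sketch. It does not call Meilisearch: `merge_into_index` is a hypothetical in-memory stand-in for the "add or update documents" behaviour linked above, and `build_partial_doc`, the field names, and the usage key are made up for illustration.

```python
from typing import Any, Dict, List, Optional

Doc = Dict[str, Any]


def build_partial_doc(
    usage_key: str,
    metadata: Optional[Doc] = None,
    tags: Optional[List[str]] = None,
    *,
    update_metadata: bool = True,
    update_tags: bool = True,
) -> Doc:
    """Return only the fields that should be sent to the index."""
    doc: Doc = {"id": usage_key}
    if update_metadata and metadata is not None:
        doc.update(metadata)
    if update_tags and tags is not None:
        # The "tags" key is only present when tags are being updated, so a
        # metadata-only update can never blank out existing tags.
        doc["tags"] = tags
    return doc


def merge_into_index(index: Dict[str, Doc], partial: Doc) -> None:
    """Simulate 'add or update': merge fields into any existing document."""
    existing = index.setdefault(partial["id"], {})
    existing.update(partial)


index: Dict[str, Doc] = {}
key = "block-v1:demo+type@problem+block@abc"  # hypothetical usage key

# Initial indexing: metadata and tags.
merge_into_index(index, build_partial_doc(key, {"display_name": "Old"}, ["easy"]))
# Metadata-only update, as in the handler above (update_tags=False).
merge_into_index(index, build_partial_doc(key, {"display_name": "New"}, update_tags=False))
```

After the second call the stored document has the new `display_name` and still carries the original tags, which is the property the handler relies on.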

rpenido · 2024-03-21T01:08:05Z

I realized after reviewing that a lot of the changes on this PR diff are from @bradenmacdonald 's PR since this PR wasn't updated with upstream yet.

We have this temp PR (open-craft#645). It could be easier to review before we get #34310 merged.

However, in terms of testing, I'm not sure if it's ready or not, but when I briefly tried it I ran into some errors

Sorry for that @yusuf-musleh! I probably messed up something when I merged the changes from @bradenmacdonald.

I will let you know when this is ready for testing again (probably for a final review).

Also, I started writing some tests for the new API, but I'm not happy with them. As Meilisearch is an external service, I need to mock many functions and can only really test whether the external API is called with the expected parameters. Maybe we should fire up a Meilisearch instance during the tests to actually test the feature (as we do with external services, like the database), but I'm not sure if this is the right time to do this (and I don't know how to do this in our CI suite either).

I plan to continue with the current approach (mocking and checking calls) for the API and the handlers to get some coverage, but I think we should revisit this if the prototype becomes the default search engine. Let me know if you have a better idea for this!
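
For reference, the "mocking and checking calls" approach might look roughly like the sketch below. `upsert_doc` and the index name are hypothetical stand-ins rather than code from this PR, and the `client.index(...).update_documents(...)` call shape is an assumption about the client; the mock accepts any call, so the test only verifies what our own code sends.

```python
from unittest.mock import MagicMock


def upsert_doc(client, index_name, doc):
    """Hypothetical helper: send one partial document to the search index."""
    client.index(index_name).update_documents([doc])


def test_upsert_calls_client():
    client = MagicMock()  # stands in for the real search client
    doc = {"id": "abc", "display_name": "New title"}

    upsert_doc(client, "studio_content", doc)

    # We can only assert on the calls made, not on what the service would do.
    client.index.assert_called_once_with("studio_content")
    client.index.return_value.update_documents.assert_called_once_with([doc])


test_upsert_calls_client()
```

This gives coverage of the parameters we pass, but it would not catch a behaviour change on the Meilisearch side, which is the trade-off discussed above.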

yusuf-musleh · 2024-03-21T05:24:43Z

As meilisearch is an external service, I need to mock many functions and only really test if the external API is called with the expected parameters.

@rpenido I think this approach is fine, since we can assume that Meilisearch will behave as expected after we have manually tested it. However, I can see the value of having integration tests to actually verify that it behaves as expected in case there are breaking/unexpected changes on the Meilisearch side, although that could be avoided by staying on the same working version rather than updating to the latest version every time.

I think we should revisit this if the prototype becomes the default search engine

I agree, I think this can come at a later stage. As you suggested, we can potentially follow the footsteps of the mysqldb when the time comes.

bradenmacdonald · 2024-03-22T15:54:15Z

@rpenido I agree that we should eventually run Meilisearch during tests. However, it's probably not necessary to implement that for this first studio search project which is considered experimental. Maybe you could add that to the ADR that I wrote? That for the experiment, we won't use Meilisearch during tests, but we expect to add that in the future if we move forward with replacing Elasticsearch completely.

openedx/core/djangoapps/content/search/tests/test_handlers.py

rpenido · 2024-03-26T22:59:29Z

Hi @yusuf-musleh ! There is a test failing here, but I can't find the logs. I will look into it further tomorrow.

Edit: fixed 5a6beafc86276aafa9a02ca895f49b9d691a2905

yusuf-musleh

@rpenido Great work, the code is coming along quite well! The code looks good to me. However, while testing the library aspects, I noticed a few issues:

* When I create a new empty library, nothing gets added to the index. Is that expected?
* When I add a component/block to the library, it gets added to the index, as expected. But when I edit the blocks, e.g. the name or the problem type (for problem blocks), the index does not get updated. However, if I change the name of the library, that's when the blocks under it get updated in the index with the latest information. It's probably just a missing condition on how/when the docs should be updated in the index.

rpenido · 2024-03-28T13:09:03Z

Thank you for the review @yusuf-musleh!

* When I create a new empty library, nothing gets added to the index, is that expected?

I think that it is expected. I commented on the first PR but I think it got lost in the review comments.
@bradenmacdonald, could you confirm that we are not indexing the Library and the Course entities?

* When I add a component/block to the library, it gets added to the index, as expected. But when I edit the blocks, e.g. the name or the problem type (for problem blocks), the index does not get updated. However, if I change the name of the library, that's when the blocks under it get updated in the index with the latest information. It's probably just a missing condition on how/when the docs should be updated in the index.

This is because the LIBRARY_BLOCK_UPDATED event is not dispatched when we edit a library XBlock. Not sure if we will fix it now or later.

rpenido · 2024-03-28T13:15:51Z

@yusuf-musleh I want to rebase and squash this PR, but you will need to rebase it and merge conflicts again after that too (it will happen anyway when this goes upstream). Let me know when you think it is a good time to do it!

bradenmacdonald · 2024-03-28T18:42:22Z

@rpenido @yusuf-musleh We aren't indexing the courses/libraries themselves. So it's expected that a newly created library won't appear in the results.

bradenmacdonald · 2024-04-09T17:33:52Z

This is ready to merge but we're waiting on some unrelated issues with the NodeJS 18 upgrade to be solved first.

rpenido · 2024-04-16T21:57:15Z

Hi @bradenmacdonald ! I added this commit (612f32f) here.
There are some old libraries (before the recent module store refactor) that crash our reindex.

I deleted these libraries locally, and the code was working. In the sandbox, the error appeared again, so I think it would be better to skip the library if we find an error instead of crashing the reindex.

```
File "/openedx/edx-platform/openedx/core/djangoapps/content_libraries/api.py", line 617, in get_library_components
    learning_package.id,
AttributeError: 'NoneType' object has no attribute 'id'
```
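
The "skip instead of crash" idea can be sketched as follows. This is not the code from commit 612f32f; `index_all`, `fake_index_one`, and the library keys are hypothetical names used only to show the pattern of catching the error per library and continuing.

```python
import logging
from typing import Callable, Iterable, List

log = logging.getLogger(__name__)


def index_all(library_keys: Iterable[str], index_one: Callable[[str], None]) -> List[str]:
    """Index each library; return the keys that were skipped due to errors."""
    skipped: List[str] = []
    for key in library_keys:
        try:
            index_one(key)
        except Exception:  # broad on purpose: one bad library must not abort the run
            log.exception("Skipping library %s: failed to index", key)
            skipped.append(key)
    return skipped


def fake_index_one(key: str) -> None:
    """Stand-in indexer that fails for one key, like the traceback above."""
    if key == "lib:old":
        learning_package = None
        learning_package.id  # raises AttributeError on None


skipped = index_all(["lib:new", "lib:old", "lib:other"], fake_index_one)
```

Here only the broken library ends up in `skipped`, and the libraries after it are still processed.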

bradenmacdonald · 2024-04-17T17:20:52Z

Thanks @rpenido - that's a great fix.

bradenmacdonald · 2024-04-17T18:22:35Z

@rpenido I was just about to merge this, but there are some conflicts with the "permissions" PR that I just merged. Could you please fix them?

rpenido · 2024-04-17T20:22:24Z

@rpenido I was just about to merge this, but there are some conflicts with the "permissions" PR that I just merged. Could you please fix them?

Working on it. The code changed a lot, so I will need to manually test it to make sure I did the merge right.

bradenmacdonald · 2024-04-17T20:27:33Z

@rpenido Take your time - it's important to get it right :)
Thanks.

rpenido · 2024-04-17T22:29:50Z

This is ready for review and merge @bradenmacdonald. It also includes the changes from your PR:

feat: update meilisearch configuration needed for tags filter [FC-0040] #34534 (review)

It may have impacted your PR @yusuf-musleh.

openedx-webhooks · 2024-04-18T16:53:25Z

@rpenido 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

edx-pipeline-bot · 2024-04-18T18:15:01Z

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

edx-pipeline-bot · 2024-04-18T18:40:47Z

2U Release Notice: This PR has been deployed to the edX production environment.

edx-pipeline-bot · 2024-04-18T19:32:14Z

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

edx-pipeline-bot · 2024-04-18T19:53:52Z

2U Release Notice: This PR has been deployed to the edX production environment.

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Mar 19, 2024

yusuf-musleh reviewed Mar 20, 2024

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch 5 times, most recently from 34f5722 to cab3af9 Compare March 20, 2024 21:52

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch 2 times, most recently from 7c82a0b to bd28dec Compare March 22, 2024 15:50

rpenido mentioned this pull request Mar 26, 2024

feat: update search index when course content is updated (TEMP) open-craft/edx-platform#645

Closed

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch 3 times, most recently from a2e9d2a to 4a55c89 Compare March 26, 2024 20:57

rpenido commented Mar 26, 2024

openedx/core/djangoapps/content/search/tests/test_handlers.py

rpenido marked this pull request as ready for review March 26, 2024 22:13

rpenido requested a review from yusuf-musleh March 26, 2024 22:58

yusuf-musleh reviewed Mar 27, 2024

rpenido changed the title ~~feat: update search index when course content is updated~~ [FC-0040] feat: update search index when course content is updated Mar 28, 2024

rpenido changed the title ~~[FC-0040] feat: update search index when course content is updated~~ feat: update search index when course content is updated [FC-0040] Mar 28, 2024

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch 2 times, most recently from 128e717 to b7aef89 Compare March 28, 2024 13:53

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch from b7aef89 to 45b6bf9 Compare March 28, 2024 19:34

Merge branch 'master' into rpenido/fal-3690-update-search-index-when-…

3b6e475

…course-content-is-changed

yusuf-musleh mentioned this pull request Apr 11, 2024

[FC-0040] feat: update search index when object tags change open-craft/edx-platform#647

Closed

fix: error in library crash reindex

612f32f

```
File "/openedx/edx-platform/openedx/core/djangoapps/content_libraries/api.py", line 617, in get_library_components
    learning_package.id,
AttributeError: 'NoneType' object has no attribute 'id'
```

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch from 83d6823 to 612f32f Compare April 16, 2024 22:27

rpenido mentioned this pull request Apr 17, 2024

feat: add usage_key for breadcrumbs in search index [FC-0049] #34535

Merged

rpenido added 2 commits April 17, 2024 16:32

Merge branch 'master' into rpenido/fal-3690-update-search-index-when-…

cbf4a46

…course-content-is-changed

fix: fix code and tests after merge

670c04a

rpenido marked this pull request as draft April 17, 2024 20:21

style: fix pylint

58f8ff4

rpenido force-pushed the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch from 36c48a4 to 58f8ff4 Compare April 17, 2024 20:58

rpenido and others added 3 commits April 17, 2024 18:05

feat: update meilisearch configuration needed for tags filter

66b1b71

Co-authored-by: Braden MacDonald <[email protected]>

fix: conditional import

83e1d27

style: fix pylint

17e1e53

rpenido marked this pull request as ready for review April 17, 2024 21:55

rpenido mentioned this pull request Apr 17, 2024

feat: update meilisearch configuration needed for tags filter [FC-0040] #34534

Closed

bradenmacdonald merged commit 90b253a into openedx:master Apr 18, 2024
66 checks passed

xitij2000 deleted the rpenido/fal-3690-update-search-index-when-course-content-is-changed branch April 23, 2024 08:06

KyryloKireiev pushed a commit to raccoongang/edx-platform that referenced this pull request Apr 24, 2024

feat: update Studio search index when course content is updated (open…

6c82203

…edx#34391)
