Restrict search results based on user permissions [FC-0040] #34471

pomegranited · 2024-04-04T07:17:28Z

Description

Meilisearch allows us to create JWTs with embedded "filters", which guarantee that users searching with these tokens will only be permitted to see the documents that match these filters.

In Open edX, we grant permissions to block content based on its learning context, i.e. the course or content library that contains the block. Access to courses can be done in two ways, either individually on the course, or using org level rights. Access to content libraries is managed individually on the library (only library creation uses org level rights).

Because of these individual access rights, the list of courses and libraries a user has access to can be very long, >1000 in some instances (see comment). But since we need to encode that list into a JWT, we're limited to a total token size of 8Kib, but course keys can be very long (255 char). So to facilitate packing as many course/library identifiers into this token, this PR:

Adds a SearchAccess model with a numeric ID, mapped to all CourseOverview and ContentLibrary context_keys.
Adds an access_id field to the search-indexed Document, storing the SearchAccess.id relevant to that block's learning context.
Adds an access_id filter to the JWT token generation, containing the SearchAccess.id's for the top 1,000 learning contexts that the user has access to. (We omit access_ids covered by the org filter described below.)
Adds an org filter to the JWT token generation, containing max 1,000 org names that the user has been granted access to.

Supporting information

Content search is not enabled by default, so this change should have no impact on edX/2U or other sites running from master.

closes openedx/modular-learning#198

Private-ref: FAL-3692

Testing instructions

Setup:

Ensure that you are running the latest https://github.com/open-craft/tutor-contrib-meilisearch -- this will ensure that MEILISEARCH_ENABLED = True is set in the CMS and the MFE.
Enable course-authoring MFE, running from the branch in feat: add content search modal [FC-0040] frontend-app-authoring#928
Load https://github.com/open-craft/taxonomy-sample-data/
Using this branch, run migrations: python manage.py cms migrate
Also using this branch, reindex studio content: python manage.py cms reindex_studio --experimental

Testing:

Login to Studio and the Course Authoring MFE as a global staff user.
Click the 'search' button in the header bar to start searching studio content.
Ensure that you can see content from all your courses.
Repeat these steps with a non-global staff user who only has access to a single course or library.
Ensure that user can only see that course/library when searching Studio content.

Deadline

None

Other information

Anyone using content.search will need to run a full reindex to add the new document access_id field:
```
python manage.py cms reindex_studio --experimental
```

Stores a numeric ID for each course + library, which will generally be shorter than the full context_key, so we can pack more of them into the the Meilisearch search filter. Also: * Adds data migration pre-populates the SearchAccess model from the existing CourseOverview and ContentLibrary records * Adds signal handlers to add/remove SearchAccess entries when content is created or deleted. * Adds get_access_ids_for_request() helper method for use in views. * Adds tests.

openedx-webhooks · 2024-04-04T07:17:32Z

Thanks for the pull request, @pomegranited! Please note that it may take us up to several weeks or months to complete a review and merge your PR.

Feel free to add as much of the following information to the ticket as you can:

supporting documentation
Open edX discussion forum threads
timeline information ("this must be merged by XX date", and why that is)
partner information ("this is a course on edx.org")
any other information that can help Product understand the context for the PR

All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here.

Please let us know once your PR is ready for our review and all tests are green.

* Adds an access_id field to the document, which stores the SearchAccess.id for the block's context. * Use the requesting user's allowed access_ids to filter search results to documents with those access_ids. * Since some users have a lot of individual access granted, limit the number of access_ids in the filter to a large number (1_000) * Updates tests to demonstrate.

rpenido

Hi, @pomegranited! LGTM 👍

I tested this, and it worked after a minor fix (e1e910a). Could you make this change in your PR?

Also, I added a comment about filtering orgs.

Note that #34391 refactors part of the base code used here. If it gets merged before, you will have to move some things around here. I don't think it will be a lot, though.

openedx/core/djangoapps/content/search/views.py

In theory, we shouldn't have to do this, because the CREATE and DELETE events should keep the SearchAccess table up-to-date. But in practice, signals can be missed (or in tests, they may be disabled). So we assume that it's ok to re-use a SearchAccess.id created for a given course or library context_key.

Uses helper methods and decorators to wrap the settings and patches used by multiple view tests.

pomegranited · 2024-04-05T02:40:22Z

openedx/core/djangoapps/content/search/migrations/0001_initial.py

+    for lib in ContentLibrary.objects.select_related("org").only("org", "slug"):
+        # ContentLibrary.library_key property is not available here, so we have to assemble it.
+        lib_key = LibraryLocatorV2(org=lib.org.short_name, slug=lib.slug)
+        SearchAccess.objects.create(context_key=lib_key)


If we're concerned about the performance of this data migration, I can remove it and add a management command instead.

FYI @rpenido -- I decided to remove the data migration (ef89591): running reindex_studio handles this for us.

openedx/core/djangoapps/content/search/handlers.py

* Uses content_tagging.rules.get_user_orgs to fetch the user's content-related orgs for use in the meilisearch filter. * Limits the number of orgs used to 1_000 to keep token size down

rpenido

Hi @pomegranited!

I think we went the wrong path here regarding the org + access_id filter.

If I understand correctly, we should use the access_id filter only for course-level role assignments.

I put some tests (that are falling) here (open-craft#650) to show what I'm expecting. Could you check if my understanding is right? 😄

PS: It adds a lot of boilerplate because I'm creating a course to test. You don't need to add these tests to this PR.

openedx/core/djangoapps/content/search/tests/test_views.py

rpenido · 2024-04-05T20:22:43Z

openedx/core/djangoapps/content/search/views.py

+
+    # Everyone else is limited to their org roles...
+    user_orgs = [
+        org.short_name for org in get_user_orgs(request.user)[:MAX_ORGS_IN_FILTER]


If the user can access only one course in orgA, orgA will be returned here. I think we want to add only orgs that the user is an admin here. Maybe use _get_content_creator_orgs ?

See 1181eea

rpenido · 2024-04-05T20:31:47Z

openedx/core/djangoapps/content/search/models.py

+    """
+    Returns the SearchAccess.id values that the user has read access to.
+    """
+    courses, _ = get_courses_accessible_to_user(request)


I think we want to get the IDs only for course-level assignments here. If the user is staff in orgA and doesn't have any course-level role, this list should be empty. I'm not sure if we have a function for that.

Maybe use what we did in _get_course_user_orgs in the tagging rules, but return the course ids?

Done with 6a3c30a.

Users should use the reindex_studio management command to populate SearchAccess.

openedx/core/djangoapps/content/search/models.py

to allow general access to the user's RoleCache without having to access private attributes of User or RoleCache. Related changes: * Moves some functionality from openedx.core.djangoapps.enrollments.data.get_user_roles to this new helper method. * Use these new helper method in content_tagging.rules

instead of all course keys that the user can read. Org- and GlobalStaff access checks will handle the rest.

Also refactors tests to demonstrate this change for OrgStaff and OrgInstructor users.

pomegranited · 2024-04-09T07:25:25Z

I put some tests (that are falling) here (open-craft#650) to show what I'm expecting. Could you check if my understanding is right? 😄

Thank you for those added tests! I didn't use them directly, but please see 6a3c30a and c7215ac for the relevant changes and tests.

openedx/core/djangoapps/content/search/views.py

rpenido

Thank you, @pomegranited! Good work here!
LGTM 👍

I tested this using the instructions in my tutor stack
I read through the code
~~I checked for accessibility issues~~
Includes documentation

Lets SearchAccess entries be created on demand during search indexing.

…user's org roles

bradenmacdonald

This is great, thanks. I will try to merge when I can but there are some unrelated issues affecting platform today so may need to wait.

openedx-webhooks · 2024-04-17T18:21:37Z

@pomegranited 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future.

edx-pipeline-bot · 2024-04-17T19:44:36Z

2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production.

edx-pipeline-bot · 2024-04-17T20:38:21Z

2U Release Notice: This PR has been deployed to the edX production environment.

edx-pipeline-bot · 2024-04-17T20:38:21Z

2U Release Notice: This PR has been deployed to the edX production environment.

…dx#34471) * feat: adds SearchAccess model Stores a numeric ID for each course + library, which will generally be shorter than the full context_key, so we can pack more of them into the the Meilisearch search filter. Also: * Adds data migration pre-populates the SearchAccess model from the existing CourseOverview and ContentLibrary records * Adds signal handlers to add/remove SearchAccess entries when content is created or deleted. * Adds get_access_ids_for_request() helper method for use in views. * Adds tests. * test: can't import content.search in lms tests * feat: use SearchAccess in documents and views * Adds an access_id field to the document, which stores the SearchAccess.id for the block's context. * Use the requesting user's allowed access_ids to filter search results to documents with those access_ids. * Since some users have a lot of individual access granted, limit the number of access_ids in the filter to a large number (1_000) * Updates tests to demonstrate. * test: can't import content.search or content_staging in lms tests * fix: make access_id field filterable * fix: use SearchAccess.get_or_create in signal handlers In theory, we shouldn't have to do this, because the CREATE and DELETE events should keep the SearchAccess table up-to-date. But in practice, signals can be missed (or in tests, they may be disabled). So we assume that it's ok to re-use a SearchAccess.id created for a given course or library context_key. * refactor: refactors the view tests to make them clearer Uses helper methods and decorators to wrap the settings and patches used by multiple view tests. * feat: adds org filters to meilisearch filter * Uses content_tagging.rules.get_user_orgs to fetch the user's content-related orgs for use in the meilisearch filter. * Limits the number of orgs used to 1_000 to keep token size down * refactor: removes data migration Users should use the reindex_studio management command to populate SearchAccess. * refactor: adds functions to common.djangoapps.student.role_helpers to allow general access to the user's RoleCache without having to access private attributes of User or RoleCache. Related changes: * Moves some functionality from openedx.core.djangoapps.enrollments.data.get_user_roles to this new helper method. * Use these new helper method in content_tagging.rules * fix: get_access_ids_for_request only returns individual access instead of all course keys that the user can read. Org- and GlobalStaff access checks will handle the rest. * fix: use org-level permissions when generating search filter Also refactors tests to demonstrate this change for OrgStaff and OrgInstructor users. * refactor: remove SearchAccess creation signal handlers Lets SearchAccess entries be created on demand during search indexing. * feat: omit access_ids from the search filter that are covered by the user's org roles --------- Co-authored-by: Rômulo Penido <[email protected]>

openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 4, 2024

pomegranited added 4 commits April 4, 2024 17:59

test: can't import content.search in lms tests

80699f1

Merge branch 'master' into jill/content-search-perms

702abb6

test: can't import content.search or content_staging in lms tests

0b32048

rpenido suggested changes Apr 4, 2024

View reviewed changes

openedx/core/djangoapps/content/search/views.py Outdated Show resolved Hide resolved

rpenido and others added 3 commits April 5, 2024 11:51

fix: make access_id field filterable

b864e6c

refactor: refactors the view tests to make them clearer

902e75c

Uses helper methods and decorators to wrap the settings and patches used by multiple view tests.

pomegranited commented Apr 5, 2024

View reviewed changes

openedx/core/djangoapps/content/search/handlers.py Outdated Show resolved Hide resolved

feat: adds org filters to meilisearch filter

3b07501

* Uses content_tagging.rules.get_user_orgs to fetch the user's content-related orgs for use in the meilisearch filter. * Limits the number of orgs used to 1_000 to keep token size down

pomegranited marked this pull request as ready for review April 5, 2024 04:40

rpenido reviewed Apr 5, 2024

View reviewed changes

pomegranited added 2 commits April 9, 2024 11:35

Merge branch 'master' into jill/content-search-perms

4edcebe

refactor: removes data migration

ef89591

Users should use the reindex_studio management command to populate SearchAccess.

pomegranited commented Apr 9, 2024

View reviewed changes

openedx/core/djangoapps/content/search/models.py Show resolved Hide resolved

pomegranited force-pushed the jill/content-search-perms branch from 6a490cd to a917c64 Compare April 9, 2024 05:13

pomegranited added 2 commits April 9, 2024 15:06

fix: get_access_ids_for_request only returns individual access

6a3c30a

instead of all course keys that the user can read. Org- and GlobalStaff access checks will handle the rest.

fix: use org-level permissions when generating search filter

c7215ac

Also refactors tests to demonstrate this change for OrgStaff and OrgInstructor users.

pomegranited force-pushed the jill/content-search-perms branch from 1181eea to c7215ac Compare April 9, 2024 07:23

pomegranited requested a review from bradenmacdonald April 9, 2024 07:32

bradenmacdonald reviewed Apr 9, 2024

View reviewed changes

openedx/core/djangoapps/content/search/views.py Outdated Show resolved Hide resolved

rpenido approved these changes Apr 9, 2024

View reviewed changes

pomegranited added 2 commits April 10, 2024 09:28

refactor: remove SearchAccess creation signal handlers

a28f3a9

Lets SearchAccess entries be created on demand during search indexing.

Merge branch 'master' into jill/content-search-perms

ef24a88

feat: omit access_ids from the search filter that are covered by the …

811687d

…user's org roles

pomegranited requested a review from bradenmacdonald April 10, 2024 00:24

bradenmacdonald approved these changes Apr 10, 2024

View reviewed changes

bradenmacdonald merged commit d672110 into openedx:master Apr 17, 2024
66 checks passed

bradenmacdonald deleted the jill/content-search-perms branch April 17, 2024 18:21

bradenmacdonald mentioned this pull request Apr 23, 2024

feat: Allow filtering by multiple tags [FC-0040] openedx/frontend-app-authoring#945

Merged

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Restrict search results based on user permissions [FC-0040] #34471

Restrict search results based on user permissions [FC-0040] #34471

pomegranited commented Apr 4, 2024 •

edited

Loading

openedx-webhooks commented Apr 4, 2024 •

edited

Loading

rpenido left a comment

pomegranited Apr 5, 2024

pomegranited Apr 9, 2024 •

edited

Loading

rpenido left a comment

rpenido Apr 5, 2024

pomegranited Apr 9, 2024

rpenido Apr 5, 2024

pomegranited Apr 9, 2024

pomegranited commented Apr 9, 2024

rpenido left a comment •

edited

Loading

bradenmacdonald left a comment

openedx-webhooks commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

Restrict search results based on user permissions [FC-0040] #34471

Restrict search results based on user permissions [FC-0040] #34471

Conversation

pomegranited commented Apr 4, 2024 • edited Loading

Description

Supporting information

Testing instructions

Deadline

Other information

openedx-webhooks commented Apr 4, 2024 • edited Loading

rpenido left a comment

Choose a reason for hiding this comment

pomegranited Apr 5, 2024

Choose a reason for hiding this comment

pomegranited Apr 9, 2024 • edited Loading

Choose a reason for hiding this comment

rpenido left a comment

Choose a reason for hiding this comment

rpenido Apr 5, 2024

Choose a reason for hiding this comment

pomegranited Apr 9, 2024

Choose a reason for hiding this comment

rpenido Apr 5, 2024

Choose a reason for hiding this comment

pomegranited Apr 9, 2024

Choose a reason for hiding this comment

pomegranited commented Apr 9, 2024

rpenido left a comment • edited Loading

Choose a reason for hiding this comment

bradenmacdonald left a comment

Choose a reason for hiding this comment

openedx-webhooks commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

edx-pipeline-bot commented Apr 17, 2024

pomegranited commented Apr 4, 2024 •

edited

Loading

openedx-webhooks commented Apr 4, 2024 •

edited

Loading

pomegranited Apr 9, 2024 •

edited

Loading

rpenido left a comment •

edited

Loading