-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Restrict search results based on user permissions [FC-0040] #34471
Restrict search results based on user permissions [FC-0040] #34471
Conversation
Stores a numeric ID for each course + library, which will generally be shorter than the full context_key, so we can pack more of them into the the Meilisearch search filter. Also: * Adds data migration pre-populates the SearchAccess model from the existing CourseOverview and ContentLibrary records * Adds signal handlers to add/remove SearchAccess entries when content is created or deleted. * Adds get_access_ids_for_request() helper method for use in views. * Adds tests.
Thanks for the pull request, @pomegranited! Please note that it may take us up to several weeks or months to complete a review and merge your PR. Feel free to add as much of the following information to the ticket as you can:
All technical communication about the code itself will be done via the GitHub pull request interface. As a reminder, our process documentation is here. Please let us know once your PR is ready for our review and all tests are green. |
* Adds an access_id field to the document, which stores the SearchAccess.id for the block's context. * Use the requesting user's allowed access_ids to filter search results to documents with those access_ids. * Since some users have a lot of individual access granted, limit the number of access_ids in the filter to a large number (1_000) * Updates tests to demonstrate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @pomegranited! LGTM 👍
I tested this, and it worked after a minor fix (e1e910a). Could you make this change in your PR?
Also, I added a comment about filtering orgs.
Note that #34391 refactors part of the base code used here. If it gets merged before, you will have to move some things around here. I don't think it will be a lot, though.
In theory, we shouldn't have to do this, because the CREATE and DELETE events should keep the SearchAccess table up-to-date. But in practice, signals can be missed (or in tests, they may be disabled). So we assume that it's ok to re-use a SearchAccess.id created for a given course or library context_key.
Uses helper methods and decorators to wrap the settings and patches used by multiple view tests.
for lib in ContentLibrary.objects.select_related("org").only("org", "slug"): | ||
# ContentLibrary.library_key property is not available here, so we have to assemble it. | ||
lib_key = LibraryLocatorV2(org=lib.org.short_name, slug=lib.slug) | ||
SearchAccess.objects.create(context_key=lib_key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're concerned about the performance of this data migration, I can remove it and add a management command instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI @rpenido -- I decided to remove the data migration (ef89591): running reindex_studio
handles this for us.
* Uses content_tagging.rules.get_user_orgs to fetch the user's content-related orgs for use in the meilisearch filter. * Limits the number of orgs used to 1_000 to keep token size down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @pomegranited!
I think we went the wrong path here regarding the org + access_id filter.
If I understand correctly, we should use the access_id filter only for course-level role assignments.
I put some tests (that are falling) here (open-craft#650) to show what I'm expecting. Could you check if my understanding is right? 😄
PS: It adds a lot of boilerplate because I'm creating a course to test. You don't need to add these tests to this PR.
|
||
# Everyone else is limited to their org roles... | ||
user_orgs = [ | ||
org.short_name for org in get_user_orgs(request.user)[:MAX_ORGS_IN_FILTER] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the user can access only one course in orgA
, orgA
will be returned here. I think we want to add only orgs that the user is an admin here. Maybe use _get_content_creator_orgs
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See 1181eea
""" | ||
Returns the SearchAccess.id values that the user has read access to. | ||
""" | ||
courses, _ = get_courses_accessible_to_user(request) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we want to get the IDs only for course-level assignments here. If the user is staff in orgA
and doesn't have any course-level role, this list should be empty. I'm not sure if we have a function for that.
Maybe use what we did in _get_course_user_orgs
in the tagging rules, but return the course ids?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done with 6a3c30a.
Users should use the reindex_studio management command to populate SearchAccess.
to allow general access to the user's RoleCache without having to access private attributes of User or RoleCache. Related changes: * Moves some functionality from openedx.core.djangoapps.enrollments.data.get_user_roles to this new helper method. * Use these new helper method in content_tagging.rules
6a490cd
to
a917c64
Compare
instead of all course keys that the user can read. Org- and GlobalStaff access checks will handle the rest.
Also refactors tests to demonstrate this change for OrgStaff and OrgInstructor users.
1181eea
to
c7215ac
Compare
Thank you for those added tests! I didn't use them directly, but please see 6a3c30a and c7215ac for the relevant changes and tests. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, @pomegranited! Good work here!
LGTM 👍
- I tested this using the instructions in my tutor stack
- I read through the code
-
I checked for accessibility issues - Includes documentation
Lets SearchAccess entries be created on demand during search indexing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great, thanks. I will try to merge when I can but there are some unrelated issues affecting platform today so may need to wait.
@pomegranited 🎉 Your pull request was merged! Please take a moment to answer a two question survey so we can improve your experience in the future. |
2U Release Notice: This PR has been deployed to the edX staging environment in preparation for a release to production. |
2U Release Notice: This PR has been deployed to the edX production environment. |
1 similar comment
2U Release Notice: This PR has been deployed to the edX production environment. |
…dx#34471) * feat: adds SearchAccess model Stores a numeric ID for each course + library, which will generally be shorter than the full context_key, so we can pack more of them into the the Meilisearch search filter. Also: * Adds data migration pre-populates the SearchAccess model from the existing CourseOverview and ContentLibrary records * Adds signal handlers to add/remove SearchAccess entries when content is created or deleted. * Adds get_access_ids_for_request() helper method for use in views. * Adds tests. * test: can't import content.search in lms tests * feat: use SearchAccess in documents and views * Adds an access_id field to the document, which stores the SearchAccess.id for the block's context. * Use the requesting user's allowed access_ids to filter search results to documents with those access_ids. * Since some users have a lot of individual access granted, limit the number of access_ids in the filter to a large number (1_000) * Updates tests to demonstrate. * test: can't import content.search or content_staging in lms tests * fix: make access_id field filterable * fix: use SearchAccess.get_or_create in signal handlers In theory, we shouldn't have to do this, because the CREATE and DELETE events should keep the SearchAccess table up-to-date. But in practice, signals can be missed (or in tests, they may be disabled). So we assume that it's ok to re-use a SearchAccess.id created for a given course or library context_key. * refactor: refactors the view tests to make them clearer Uses helper methods and decorators to wrap the settings and patches used by multiple view tests. * feat: adds org filters to meilisearch filter * Uses content_tagging.rules.get_user_orgs to fetch the user's content-related orgs for use in the meilisearch filter. * Limits the number of orgs used to 1_000 to keep token size down * refactor: removes data migration Users should use the reindex_studio management command to populate SearchAccess. * refactor: adds functions to common.djangoapps.student.role_helpers to allow general access to the user's RoleCache without having to access private attributes of User or RoleCache. Related changes: * Moves some functionality from openedx.core.djangoapps.enrollments.data.get_user_roles to this new helper method. * Use these new helper method in content_tagging.rules * fix: get_access_ids_for_request only returns individual access instead of all course keys that the user can read. Org- and GlobalStaff access checks will handle the rest. * fix: use org-level permissions when generating search filter Also refactors tests to demonstrate this change for OrgStaff and OrgInstructor users. * refactor: remove SearchAccess creation signal handlers Lets SearchAccess entries be created on demand during search indexing. * feat: omit access_ids from the search filter that are covered by the user's org roles --------- Co-authored-by: Rômulo Penido <[email protected]>
Description
Meilisearch allows us to create JWTs with embedded "filters", which guarantee that users searching with these tokens will only be permitted to see the documents that match these filters.
In Open edX, we grant permissions to block content based on its learning context, i.e. the course or content library that contains the block. Access to courses can be done in two ways, either individually on the course, or using
org
level rights. Access to content libraries is managed individually on the library (only library creation usesorg
level rights).Because of these individual access rights, the list of courses and libraries a user has access to can be very long, >1000 in some instances (see comment). But since we need to encode that list into a JWT, we're limited to a total token size of 8Kib, but course keys can be very long (255 char). So to facilitate packing as many course/library identifiers into this token, this PR:
access_id
field to the search-indexed Document, storing the SearchAccess.id relevant to that block's learning context.access_id
filter to the JWT token generation, containing the SearchAccess.id's for the top 1,000 learning contexts that the user has access to. (We omitaccess_id
s covered by theorg
filter described below.)org
filter to the JWT token generation, containing max 1,000 org names that the user has been granted access to.Supporting information
Content search is not enabled by default, so this change should have no impact on edX/2U or other sites running from master.
closes openedx/modular-learning#198
Private-ref: FAL-3692
Testing instructions
Setup:
MEILISEARCH_ENABLED = True
is set in the CMS and the MFE.python manage.py cms migrate
python manage.py cms reindex_studio --experimental
Testing:
Ensure that you can see content from all your courses.
Ensure that user can only see that course/library when searching Studio content.
Deadline
None
Other information
access_id
field: