Make empty URLs work in ACLs
Setting a different default access for a specific user previously did not work
because

* the ACL must be sorted in reverse alphabetical order, which means that an empty
  URL (i.e. a default rule) must come last, and

* the search for matching ACL rules stops when the top-level domain of a rule
  is alphabetically lower than the one searched for.

For example, given "default_access: allow" and the ACL:

com,example)/ - {"access": "block"}
 - {"access": "block", "user": "unknown"}

and searching for "http://iana.org", the last line would never be reached, so
the "unknown" user would still be allowed.
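
The interaction of the two points above can be reproduced with a small Python sketch. It is a simplified model, not pywb's actual `find_access_rule`: the names `ACL`, `DEFAULT_RULE`, `find_rule` and the `stop_early` flag are hypothetical and introduced only for illustration, and the search keys are assumed to be plain SURT-style strings.

```python
# Simplified model of the ACL lookup (an illustration, not pywb's code).

DEFAULT_RULE = {"access": "allow"}  # stands in for "default_access: allow"

# (key, rule) pairs in reverse alphabetical order, so the empty key sorts last.
ACL = [
    ("com,example)/", {"access": "block"}),
    ("", {"access": "block", "user": "unknown"}),
]


def find_rule(search_key, acl_user, stop_early):
    """Return the rule that applies to search_key for acl_user.

    stop_early=True mimics the old behaviour: give up as soon as a rule key
    sorts below the top level domain of the key being searched for.
    """
    tld = search_key.split(",")[0]
    fallback = None

    for acl_key, rule in ACL:
        # A rule applies when its key is a prefix of the search key;
        # the empty key is a prefix of everything.
        if search_key.startswith(acl_key):
            if rule.get("user") == acl_user:
                return rule
            if "user" not in rule and fallback is None:
                fallback = rule

        if stop_early and acl_key < tld:
            break

    return fallback or DEFAULT_RULE


# Old behaviour: "com,example)/" < "org", so the loop stops before the
# user specific default rule is seen.
print(find_rule("org,iana)/", "unknown", stop_early=True)["access"])   # allow

# New behaviour: the loop continues and reaches the empty-key rule.
print(find_rule("org,iana)/", "unknown", stop_early=False)["access"])  # block
```

Dropping the early exit costs a scan over the remaining rules, but it is the only way an empty-key rule at the end of a reverse-sorted list can be reached.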
krakan committed Feb 10, 2023
1 parent 4a3e7dd commit 330d368
Showing 5 changed files with 24 additions and 5 deletions.
6 changes: 6 additions & 0 deletions docs/manual/access-control.rst
@@ -139,6 +139,12 @@ For example, the following access control settings restrict access to ``https://
     com,example)/restricted - {"access": "allow", "user": "staff"}
     com,example)/restricted - {"access": "block"}

+As a special case one can add a default rule for a specific user by supplying an empty URL (note the leading space)::
+
+    com,example) - {"access": "allow", "user": "other"}
+     - {"access": "block", "user": "other"}
+
+which would give the ``other`` user access to the index for any URL but not to any data except for ``example.com``.

 Combined with the embargo settings, this can also be used to override the embargo for internal organizational users, while keeping the embargo for general access::

4 changes: 1 addition & 3 deletions pywb/warcserver/access_checker.py
@@ -269,9 +269,7 @@ def find_access_rule(self, url, ts=None, urlkey=None, collection=None, acl_user=
                 last_obj = acl_obj

             # if acl key already less than first tld,
-            # no match can be found
-            if acl_key < tld:
-                break
+            # keep going in case there's a user specific default rule

         return last_obj if last_obj else self.default_rule

3 changes: 3 additions & 0 deletions sample_archive/access/user-default.aclj
@@ -0,0 +1,3 @@
+com,example)/ - {"access": "block", "user": "staff"}
+com,example)/ - {"access": "allow", "user": "staff2"}
+ - {"access": "block", "user": "staff2"}
7 changes: 7 additions & 0 deletions tests/config_test_access.yaml
@@ -24,6 +24,13 @@ collections:

         default_access: block

+    pywb-acl-user-default:
+        index_paths: ./sample_archive/cdx/
+        archive_paths: ./sample_archive/warcs/
+        acl_paths: ./sample_archive/access/user-default.aclj
+
+        default_access: allow
+
     pywb-embargo-before:
         index_paths: ./sample_archive/cdx/
         archive_paths: ./sample_archive/warcs/
9 changes: 7 additions & 2 deletions tests/test_acl.py
@@ -85,5 +85,10 @@ def test_allowed_different_coll_acl_dir(self):

         assert '"http://httpbin.org/anything/resource.json"' in resp.text

-
-
+    def test_user_default(self):
+        headers = {"X-Pywb-ACL-User": "staff"}
+        self.testapp.get('/pywb-acl-user-default/mp_/http://www.iana.org/', headers=headers, status=200)
+        self.testapp.get('/pywb-acl-user-default/mp_/http://www.example.com/', headers=headers, status=451)
+        headers = {"X-Pywb-ACL-User": "staff2"}
+        self.testapp.get('/pywb-acl-user-default/mp_/http://www.iana.org/', headers=headers, status=451)
+        self.testapp.get('/pywb-acl-user-default/mp_/http://www.example.com/', headers=headers, status=200)
