Reduce default AlyxClient REST cache expiry time from 1 day to 5 minutes #177

Merged
merged 2 commits into from
Jan 10, 2025
45 changes: 20 additions & 25 deletions docs/FAQ.md
@@ -14,33 +14,28 @@ Read more about ONE modes [here](notebooks/one_modes).
## Why are my recent data missing from my cache but present on Alyx?
After new data are acquired it may take time for it to be copied to an online server (it will
be marked as 'online' in Alyx). Once the data is marked as existing and online, it should appear
in the cache tables next time they are generated. For the IBL Alyx, the ONE cache tables are
re-generated every 6 hours, however by default ONE will only download a new cache once per day. To
force a download you can run `ONE().refresh_cache('remote')`. More information, including
increasing refresh frequency, can be found [here](https://int-brain-lab.github.io/ONE/notebooks/one_modes.html#Refreshing-the-cache).

Note: There are two different definitions of caches that are used in ONE2:
1. The cache table that stores info about all sessions and their associated datasets.
This is refreshed every night and uploaded to Flatiron and downloaded onto your computer
every 24hr (this is what the datetime object returned as output of the `ONE().refresh_cache('remote')`
command is showing, i.e. when this cache was last updated).
This table is used in all one.search, one.load, one.list functions. When doing
`ONE().refresh_cache('remote')`, you are basically forcing ONE to re-download this table
regardless of when it was last downloaded from Flatiron.

2. When running remote queries (anything that uses `one.alyx.rest(....)`),
ONE stores the results of these queries for 24 hours, so that if you
in queries, and the remote cache tables next time they are generated. The latency depends on the
[ONE mode](notebooks/one_modes) used.

**Remote mode (default)**
When running remote queries (anything that uses `one.alyx.rest(....)`),
ONE stores the results of these queries for 5 minutes, so that if you
repeatedly make the same query over and over you don't hit the database
each time but can use the local cached result.
A problem can arise if something on the Alyx database changes in between the same query:
- For example, at time X a given query returns an empty result (e.g. no histology session for a given subject).
At time X+1, data is registered onto Alyx.
At time X+2, you run the same query again.
Because you had already made the query earlier, ONE uses the local result that
it had previously and displays that there isn't a histology session.
To circumvent this, use the `no_cache=True` argument in `one.alyx.rest(..., no_cache=True)` or
the `no_cache` web client context. More information can be found [here](https://int-brain-lab.github.io/ONE/notebooks/one_modes.html#REST-caching).
Use this only if necessary, as these methods are not optimized.

To circumvent this, instantiate ONE with `cache_rest=None` or use the `one.webclient.no_cache`
context manager when calling ONE list, search and load methods. You can also pass the `no_cache=True`
argument to AlyxClient: `one.alyx.rest(..., no_cache=True)`. More information can be found
[here](https://int-brain-lab.github.io/ONE/notebooks/one_modes.html#REST-caching).
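
The stale-result scenario described above can be reproduced with a toy expiring cache. This is a minimal sketch and not the AlyxClient implementation: the `ExpiringQueryCache` class, the `database` dictionary and the session name are all hypothetical, and the clock is passed in explicitly so that the example runs without waiting. It assumes that bypassing the cache neither reads nor writes the stored result.

```python
from datetime import datetime, timedelta


class ExpiringQueryCache:
    """Toy stand-in for a REST response cache (hypothetical, not ONE's code)."""

    def __init__(self, fetch, default_expiry=timedelta(minutes=5)):
        self.fetch = fetch  # callable that performs the 'real' query
        self.default_expiry = default_expiry
        self._store = {}  # query -> (result, time at which the entry expires)

    def get(self, query, now, no_cache=False):
        """Return the result for `query`, calling `fetch` only when needed."""
        if no_cache:
            # Assumption: a bypassed query neither reads nor updates the cache
            return self.fetch(query)
        cached = self._store.get(query)
        if cached is not None and now < cached[1]:
            return cached[0]  # entry still valid: reuse the stored result
        result = self.fetch(query)
        self._store[query] = (result, now + self.default_expiry)
        return result


# Hypothetical 'database' that gains a histology session between two queries
database = {'histology': []}
cache = ExpiringQueryCache(lambda query: list(database[query]))

t0 = datetime(2025, 1, 10, 12, 0)
first = cache.get('histology', now=t0)  # time X: nothing registered yet
database['histology'].append('session-1')  # time X+1: data are registered
stale = cache.get('histology', now=t0 + timedelta(minutes=2))  # time X+2
forced = cache.get('histology', now=t0 + timedelta(minutes=2), no_cache=True)
later = cache.get('histology', now=t0 + timedelta(minutes=6))  # entry expired

print(first, stale, forced, later)
```

With a 5 minute expiry the second query still returns the empty result, while the bypassed query and the query made after the expiry both see the new session. A shorter default expiry narrows the window in which the stale result can appear, but does not remove it.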

**Auto mode**
Remote cache tables are downloaded and used when ONE is in 'auto' mode (or when `query_type='auto'` is passed).
These tables contain info about all sessions and their associated datasets and are used instead of querying
the database.
For the IBL Alyx, the tables are generated every 6 hours; however, by default ONE will only download a
new cache once per day. To force a download you can run `ONE().refresh_cache('remote')`. More
information, including increasing refresh frequency, can be found
[here](https://int-brain-lab.github.io/ONE/notebooks/one_modes.html#Refreshing-the-cache).

## I made a mistake during setup and now can't call setup, how do I fix it?
Usually you can re-run your setup with the following command:
19 changes: 18 additions & 1 deletion docs/notebooks/one_modes.ipynb
@@ -104,7 +104,7 @@
"source": [
"## REST caching\n",
"In remote mode ONE makes a REST query instead of using the local cache tables. The results of\n",
"the remote REST queries are also cached for 24 hours. This means that making the same remote\n",
"the remote REST queries are also cached for 5 minutes. This means that making the same remote\n",
"REST query twice in a row will only hit the database once. The default cache expiry can be set\n",
"by changing the relevant AlyxClient property:"
]
@@ -124,6 +124,23 @@
"one.alyx.default_expiry = timedelta(days=20) # Cache results for up to 20 days"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The expiry time can be set for individual queries by passing the `expires` kwarg to `AlyxClient.rest`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Cache subjects list for 24 hours\n",
"subjects = one.alyx.rest('subjects', 'list', expires=timedelta(days=1))"
]
},
{
"cell_type": "markdown",
"metadata": {
2 changes: 1 addition & 1 deletion one/webclient.py
@@ -552,7 +552,7 @@ def __init__(self, base_url=None, username=None, password=None,
# The default length of time that cache file is valid for,
# The default expiry is overridden by the `expires` kwarg. If False, the caching is
# turned off.
self.default_expiry = timedelta(days=1)
self.default_expiry = timedelta(minutes=5)
self.cache_mode = cache_rest
self._obj_id = id(self)
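
The relationship between the default expiry and a per-query `expires` value can be sketched as follows. This is an illustration under stated assumptions rather than the code in `one/webclient.py`: `expiry_time` and `DEFAULT_EXPIRY` are hypothetical names, and only the override behaviour described in the comment above is modelled (a per-query `timedelta` takes precedence over the default).

```python
from datetime import datetime, timedelta

DEFAULT_EXPIRY = timedelta(minutes=5)  # mirrors the new default set in this change


def expiry_time(cached_at, expires=None, default_expiry=DEFAULT_EXPIRY):
    """Return the moment a response cached at `cached_at` stops being valid.

    Hypothetical helper: a per-query `expires` timedelta overrides the default.
    """
    lifetime = default_expiry if expires is None else expires
    return cached_at + lifetime


cached_at = datetime(2025, 1, 10, 9, 0)
print(expiry_time(cached_at))  # 2025-01-10 09:05:00
print(expiry_time(cached_at, expires=timedelta(days=1)))  # 2025-01-11 09:00:00
```

Queries whose results rarely change, such as a subjects list, can therefore keep a long lifetime on a per-call basis while everything else falls back to the shorter default.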
