Add sync writes to multiMerge() in WebStorage. Fix caching when multiMerge() runs. #122
Conversation
I think we still have a problem with multiMerge, though it might be more subtle now
```js
// Prefill cache if necessary by calling get() on any existing keys and then merge original data to cache
// and update all subscribers
Promise.all(_.map(existingKeys, get)).then(() => {
    cache.merge(collection);
    keysChanged(collectionKey, collection);
});
```
This makes sense, but:
- it suffers from the same problem we discovered: if `get` has to read from disk, data might not yet be written to disk (when the write is still pending), and this can still result in partial data being added to cache
- delaying the cache update opens a door for problems: other get/connect calls might use stale cache information
- it might delay the UI update a bit, since now `keysChanged` would be delayed if `get` actually has to read something

Point 2) is probably covered by `keysChanged` being delayed as well: anyone that read stale information would be notified of an update, but we'd still need to explain that and warn everyone to keep usages together in that order.
The goal with cache is to update it synchronously, as fast as possible, to avoid concurrency issues (see the sketch after this list). This solves problems like:
- `set` starts and sets `key: alpha` - it immediately updates cache before the write to disk is complete
- `get` starts for `key: alpha` - it receives the latest value even before it's written to disk
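A minimal sketch of that ordering, with a hypothetical in-memory `storage`/`cache` pair (not the actual Onyx internals):

```js
// Hypothetical stand-in for an async storage backend (AsyncStorage/localforage)
const storage = {
    data: new Map(),
    setItem(key, value) {
        return new Promise(resolve => setTimeout(() => {
            this.data.set(key, value);
            resolve();
        }, 10));
    },
    getItem(key) {
        return new Promise(resolve => setTimeout(() => resolve(this.data.get(key)), 10));
    },
};

const cache = new Map();

function set(key, value) {
    // Cache is updated synchronously, so concurrent reads see the new value
    cache.set(key, value);

    // The disk write finishes later and may still be pending when get() runs
    return storage.setItem(key, value);
}

function get(key) {
    if (cache.has(key)) {
        return Promise.resolve(cache.get(key));
    }
    return storage.getItem(key).then((value) => {
        cache.set(key, value);
        return value;
    });
}

set('key', 'alpha'); // cache holds 'alpha' immediately; the write is in flight
get('key').then(console.log); // logs 'alpha' from cache, no stale disk read
```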
Perhaps you'd like this alternative:
- Keep code here unchanged
- Update code in `cache.merge` to exclude merging any keys that are not currently present in cache

The idea is: if we lack data in cache, we skip merging potentially partial data. When someone actually needs to `get` this key, it would trigger a read from disk and re-populate cache with correct data. We don't have many usages of `cache.merge`, but this could address the partial data problem for all such usages.
> if `get` has to read from disk, data might not yet be written to disk (when write is still pending), and still result in partial data being added to cache

If you can, please explain how exactly it can result in partial data being added to the cache? I would love to write a test for it so we can come up with a solution if it's possible, but I want to make sure that we understand how it can happen.

> delaying cache update opens a door for problems - other get/connect calls might use stale cache information

One alternative I can think of is to prefill the cache with all data before Onyx does any work at all (guessing most front-end caches work this way).

> it might delay the UI update a bit, since now `keysChanged` would be delayed if `get` actually has to read something

Yes, but it's better for something to be delayed than to send partial data without the previous data initially in storage.
> The goal with cache is to update it sync as fast as possible to avoid concurrency issues

I think I understand this argument, but I'm not sure that's how `merge()` works today. Here's the implementation for `merge()`, which also potentially blocks with a `get()` call:

Lines 672 to 681 in 032ff54
```js
return get(key)
    .then((data) => {
        try {
            const modifiedData = applyMerge(key, data);

            // Clean up the write queue so we
            // don't apply these changes again
            delete mergeQueue[key];

            return set(key, modifiedData);
```
`set()` is the only thing which modifies the data in the cache first, before reading existing values:

Lines 548 to 551 in 032ff54
```js
cache.set(key, value);

// Optimistically inform subscribers on the next tick
Promise.resolve().then(() => keyChanged(key, value));
```
> Perhaps you'd like this alternative:
> - Keep code here unchanged
> - Update code in `cache.merge` to exclude merging any keys that are not currently present in cache
I do like this idea and it solves this problem in a different way, but we would defer the filling of the cache. I'm not sure if that is bad or good or doesn't make a difference and would be willing to try it if it sounds simpler. Thanks!
OK, so I pulled on that last thread and I can't get it working. It leads to a stale cache that is missing the data. Here are the changes I made:
- Reverted the change above.
- Added this so we would not set data if no existing key was in the cache
```diff
@@ -107,11 +110,19 @@ class OnyxCache {
     }

     /**
-     * Deep merge data to cache, any non existing keys will be created
+     * Deep merge data to cache, any non existing keys will not be merged
      * @param {Record<string, *>} data - a map of (cache) key - values
      */
     merge(data) {
-        this.storageMap = lodashMerge({}, this.storageMap, data);
+        // Filter any keys we have not yet cached
+        const newData = _.reduce(data, (prev, curr, key) => {
+            if (this.hasCacheForKey(key)) {
+                return {...prev, [key]: curr};
+            }
+            return prev;
+        }, {});
+
+        this.storageMap = lodashMerge({}, this.storageMap, newData);

         const storageKeys = this.getAllKeys();
```
- Re-ran tests and added a subscriber (which should trigger the cache fill)
- Storage ends up correct, but cache is wrong
I think it's because the subscriber can be added at any time while we are adding new data. We already skip the cache for any consecutive calls to mergeCollection() and yes, they are making it to storage correctly. But if you add a subscriber while this process is happening, it will only get the result of what has been written so far and save that to the cache.
> I think it's because the subscriber can be added at any time while we are adding new data. We already skip the cache for any consecutive calls to mergeCollection() and yes, they are making it to storage correctly. But if you add a subscriber while this process is happening, it will only get the result of what has been written so far and save that to the cache.
Yeah, this case slipped my mind
> If you can, please explain how exactly it can result in partial data being added to the cache? I would love to write a test for it and we can come up with a solution if it's possible, but I want to make sure that we understand how it can happen.

I think you answered that yourself at the end, or at least the reason is similar:
- We have a `mergeCollection` call and then another `mergeCollection` call(s) in close proximity
- Writes are scheduled and happen one by one
- `Promise.all(_.map(existingKeys, get))` is invoked before the scheduled writes complete
- When some of the `existingKeys` are not available in cache, a `get` from disk is triggered - it reads older data and puts it in cache
To capture that in a test we're looking for a way to:
- Have storage in a state where it has `{ a: 1, b: 1, c: 1 }` for a given `key`
- Have a queue of writes happening, with one particular `key` being written at the end of the queue, changing `b: 2`
- Have cache in a state where it lacks the given `key`
- Call `mergeCollection` to merge multiple updates, one of them regarding `key`, setting `c: 2`
I think that what we have here with `Promise.all` would result in cache ending up with `{ a: 1, b: 1, c: 2 }` instead of `{ a: 1, b: 2, c: 2 }`.
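A hedged sketch of such a test; `StorageMock`, `cache.drop`, `cache.getValue`, and the Onyx wiring here are assumptions rather than the real test utilities:

```js
it('does not let a pending write leak stale data into cache', () => {
    // Disk starts with {a: 1, b: 1, c: 1} for test_1 (assumed mock storage)
    StorageMock.storageMap.test_1 = {a: 1, b: 1, c: 1};

    // A write changing b: 2 is queued but has not reached disk yet
    const pendingWrite = Onyx.merge('test_1', {b: 2});

    // Cache lacks the key, e.g. it was evicted (cache.drop is an assumption)
    cache.drop('test_1');

    // mergeCollection merges c: 2 while the b: 2 write is still pending
    return Promise.all([
        pendingWrite,
        Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {test_1: {c: 2}}),
    ]).then(() => {
        // The Promise.all prefill can read stale disk data and leave
        // {a: 1, b: 1, c: 2} here instead of the expected value
        expect(cache.getValue('test_1')).toEqual({a: 1, b: 2, c: 2});
    });
});
```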
Hey @kidroca, I couldn't figure this one out, but got your other idea working. Let me know if you have any other thoughts here.
I think prefilling cache might be our safest and best option here.

It seems that for Native:
1. Calling `AsyncStorage.get`
2. Then calling `AsyncStorage.set`, before the get is completed
3. `get` would resolve with the value from step 2)
I'm not certain this would work out the same way on Web/Desktop. It used to work that way when we used `window.localStorage`, because local storage calls are sync and happen sequentially.

So we're back to: "Oh, Cache, save us from concurrency issues" 😄

To do that it might be best to leverage `multiGet` and get everything but safe eviction keys into cache during init. Then we don't have to delay updates to cache, and cache should never become stale.
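A rough sketch of that init-time prefill; the provider methods and the `isSafeEvictionKey` helper are assumptions about the internals:

```js
// Hypothetical init-time prefill, run once before Onyx serves any reads
function prefillCache() {
    return Storage.getAllKeys()
        .then((keys) => {
            // Leave out keys that may be evicted so we don't pin them in memory
            const keysToCache = _.reject(keys, isSafeEvictionKey);
            return Storage.multiGet(keysToCache);
        })
        .then((pairs) => {
            // multiGet resolves with [key, value] pairs; cache them all at once
            _.each(pairs, ([key, value]) => cache.set(key, value));
        });
}
```

With everything already in memory, reads never race pending writes, so cache updates can stay synchronous without going stale.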
Another strategy could be to recreate the above 3 steps ourselves (sketched below):
- capture pending writes (we already have a queue)
- when a read happens, check whether we're writing or are about to write the requested key
- if we are, resolve the `get` with the value from the write queue; otherwise read from disk
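A sketch of that read-through-the-queue idea, assuming a hypothetical `pendingWrites` map layered over the provider:

```js
// Hypothetical wrapper; pendingWrites and the Storage provider are assumptions
const pendingWrites = new Map(); // key -> value queued or currently being written

function queuedSet(key, value) {
    pendingWrites.set(key, value);
    return Storage.setItem(key, value).then(() => {
        // Only clear the entry if a newer write hasn't replaced it
        if (pendingWrites.get(key) === value) {
            pendingWrites.delete(key);
        }
    });
}

function read(key) {
    // A key that is mid-write (or about to be written) resolves with the
    // queued value instead of racing the disk
    if (pendingWrites.has(key)) {
        return Promise.resolve(pendingWrites.get(key));
    }
    return Storage.getItem(key);
}
```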
```js
it('mergeCollection', () => {
    expect(Storage.name).toBe('WebStorage');
```
Does the success of the test depend on the underlying storage provider? It shouldn't. `mergeCollection` should be considered working correctly if it correctly delegates work to a storage provider. Testing that `mergeCollection` is interacting correctly with cache is valuable.
Since the implementation we are having issues with is the storage provider, I want to make sure things work correctly - it also took me some time to get this working. Testing against AsyncStorage would not touch any of the code we're proposing to add. Sorry, I think you are trying to say that we should maybe:
- Have one set of tests
- Run them against both storage providers

But I'm not totally sure. Please clarify if there is a specific suggestion, thanks!
Actually, I don't think it would be worth it to run against all storage providers.

We've introduced 2 changes:
- Queued LocalForage writes - this can be tested outside of `mergeCollection`
- We changed the way `mergeCollection` updates cache - this shouldn't depend on a specific storage provider

Specifically for the bug with partial cache, I think the issue should happen with any storage provider, because we evicted cache keys.
```js
// Check the storage to make sure our data has been saved
expect(localforageMock.storageMap.test_1).toEqual(finalObject);
expect(localforageMock.storageMap.test_2).toEqual(finalObject);
expect(localforageMock.storageMap.test_3).toEqual(finalObject);
```
What I'm saying is that the changes made around mocking are not necessary here, because which mock is used shouldn't matter. We can assert that `Storage.multiSet` and/or `Storage.multiMerge` are getting called correctly.

The changes made to `__mocks__/localforage.js` can help us test that `LocalForage.multiSet` and `LocalForage.multiMerge` are working correctly with the queues here: LocalForageProviderTest
I am probably not understanding the message. The test is helping me verify the changes by using a mock that works similarly to storage. Why is it not necessary? The previous tests did not cover this scenario and things were broken, so I'm adding a case to cover the thing that was breaking. What should we do instead?
By testing what's stored inside the mock's `storageMap`, we're asserting that `WebStorage` works correctly when `Storage.multiMerge` is called within `mergeCollection`. We can just have a test where we import `LocalForage` and invoke `multiMerge` on it directly.
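For example, a direct provider test might look roughly like this; the import paths and mock shape are assumptions based on this thread:

```js
import localforageMock from '../__mocks__/localforage';
import LocalForage from '../lib/storage/providers/LocalForage';

it('multiMerge deep-merges each pair into existing storage', () => {
    // Seed the mock with existing data
    localforageMock.storageMap.test_1 = {a: 'a'};

    // Call the provider directly, bypassing Onyx.mergeCollection entirely
    return LocalForage.multiMerge([['test_1', {b: 'b'}]])
        .then(() => {
            // The provider should merge, not overwrite, the stored value
            expect(localforageMock.storageMap.test_1).toEqual({a: 'a', b: 'b'});
        });
});
```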
I like this test because it more accurately represents the code flow that went wrong here. It's a combination of Onyx changes and storage provider changes so I think it is reasonable to test them both. But I am possibly being lazy 😄
Yeah, that makes sense; it's just that we're not doing it anywhere else - using a specific storage provider and mock. So it's not clear when or why tests need to use a specific storage provider, e.g. why this test does but others don't. What we see changed in `Onyx.mergeCollection` is a change that should work regardless of the storage provider. We could write the tests twice, once per provider, but then the maintenance outweighs the value of the tests.

I think we can achieve the same test by:
- Mocking `Storage.getItem`
- Verifying `Storage.getItem` is called when a given key is not available in cache
- Verifying the returned mock data is used in cache - e.g. cache matches the expected final value

To me this gives us more information regarding what we expect to happen. (First time here I didn't even realize we were testing the case where we don't have data in cache but have it on disk.)
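A hedged sketch of that approach; the spy target and the `cache.getValue` accessor are assumptions:

```js
it('reads a missing key from storage and merges it into cache', () => {
    // Pretend disk already holds data that the cache doesn't
    const getItemSpy = jest.spyOn(Storage, 'getItem')
        .mockResolvedValue({a: 1, b: 2});

    return Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {test_1: {c: 3}})
        .then(() => {
            // The key wasn't cached, so mergeCollection had to hit storage
            expect(getItemSpy).toHaveBeenCalledWith('test_1');

            // Cache now holds the disk data merged with the new update
            expect(cache.getValue('test_1')).toEqual({a: 1, b: 2, c: 3});
        });
});
```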
Sorry, I didn't get to this today -- I stopped due to a family issue. Adding it to my list for tomorrow.
```js
const additionalDataOne = {b: 'b', c: 'c'};
Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {
    test_1: additionalDataOne,
    test_2: additionalDataOne,
    test_3: additionalDataOne,
});

const additionalDataTwo = {d: 'd'};
Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {
    test_1: additionalDataTwo,
    test_2: additionalDataTwo,
    test_3: additionalDataTwo,
});
```
I'm not sure this captures the bug with cache. If you revert the `mergeCollection` changes, would the test fail because `finalObject` is not matching OnyxCache?

I think to cover the cache bug the test should (see the sketch after this list):
- merge `additionalDataOne`
- wait for the merge to complete
- evict `test_1`, `test_2` from cache
- merge `additionalDataTwo`
- assert `finalObject` is in cache
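A sketch of those steps as a test; using `cache.drop` as the eviction hook is an assumption:

```js
it('re-fills cache correctly after keys are evicted between merges', () => {
    return Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {
        test_1: additionalDataOne,
        test_2: additionalDataOne,
        test_3: additionalDataOne,
    })
        .then(() => {
            // Simulate cache pressure evicting two of the keys
            cache.drop('test_1');
            cache.drop('test_2');

            return Onyx.mergeCollection(ONYX_KEYS.COLLECTION.TEST_KEY, {
                test_1: additionalDataTwo,
                test_2: additionalDataTwo,
                test_3: additionalDataTwo,
            });
        })
        .then(() => {
            // Evicted keys must be re-read from disk before merging,
            // otherwise the cache ends up with partial data
            expect(cache.getValue('test_1')).toEqual(finalObject);
            expect(cache.getValue('test_2')).toEqual(finalObject);
        });
});
```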
That sounds like a good additional test. Just to clarify this is covering the case where we are not first grabbing existing data and merging it into the cache. We start with some initial data, then call mergeCollection, then call it again. Before the changes we would end up skipping the initial value and merging partial data into the cache.
Still need to make sure there are no major issues for native with this change. But the performance is pretty nice on web and the cache issues seem resolved. Gonna take it out of draft.

Seems like we're still discussing solutions. @marcaaron is this still the blocking PR?

@Julesssss Sorry, not sure which solutions you are referring to. I think this PR is ready to be reviewed/tested - we are just discussing improvements to the tests at this point.
Changes look good! Just had one small question/comment
```js
const {data, resolve, reject} = this.queue.shift();
this.run(data)
    .then(resolve)
    .catch(reject);
```
Should we log or do anything with failed requests? Or I guess we will have server logs
We do have some handling here (but it could be better). In this case, we're just letting the original caller handle the error and don't need more specific handling (at least not for now, since the error will reach this code here):

Line 778 in 032ff54

```js
.catch(error => evictStorageAndRetry(error, mergeCollection, collection));
```
I'm gonna move ahead with this with the caveat that there's probably more work to be done here. The performance is suffering a lot for web/desktop users internally, so this should be a welcome temporary improvement. Things to follow up on:
Details
cc @Julesssss @kidroca Second attempt at alleviating the IDB issues
Related Issues
#119
Automated Tests
Added a new automated test specifically for the weirdness reported here.
Linked PRs
Expensify/App#8053