-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Disallow starting readwrite transactions while readonly transactions are running? #253
Comments
I'm conflicted between the two. I'll explain my arguments for both sides, to help the decision. No ordering As a user, the tutorials and MDN guides I have access to don't describe the ordering guarantees. For example, purely based on the guides, I didn't know that I can rely on requests completing in order. For example, when I had to fire 3 put requests, I wrote code waiting for all 3 to complete before proceeding. I didn't realize that I can just wait for the last request. I claim that most folks writing IndexedDB code operate in a similar mode. As an implementer, the request ordering costs us extra complexity and will cost us extra memory. AFAIK the Cache Storage API has ran into a similar problem, and they decided to sacrifice ordering for performance. I definitely wish we didn't have ordering constraints -- user code that needs ordering can add extra coordination, and code that doesn't could squeeze out a bit more performance. So, I think we shouldn't add extra ordering and, instead, we should call out that the ordering is implementation-dependent, but must be consistent with some ordering where readonly and readwrite don't overlap. To be clear, I don't know of a concrete setup where this would be a net-benefit... I'm just thinking about not over-constraining the future. Ordering Having strict ordering requirements and associated tests means we'll have consistency across IndexedDB implementations, increasing the odds that content will work well across browsers. On the testing front -- after Chrome removes its snapshotting system, we won't have any implementation that does snapshots. So, we won't have any guarantee that WPT tests won't accidentally start depending on ordering anyways. The Cache Storage API parallel isn't perfect. In their case, prioritizing performance is easily the right call, because Cache Storage reads generally block page loads. IndexedDB is used more often outside of critical paths. |
Is Chrome making this change merely to align with other engines, because of technical limitations of the rework, or because the new behavior is deemed better (and if so, why?). I'm curious, because being able to (slowly) read and write simultaneously, and for reads to not be affected by writes, seems like a fairly fundamental requirement for a database. |
The implementation ends up being slightly simpler. The new behavior "falls out" of the new implementation. We could maintain Chrome's current behavior with additional code, which would not be too burdensome, but it's not clear there's a good reason to do so. (Anyone relying on it would have cross-browser bugs.) We'll be monitoring metrics to see if it affects performance of sites in a measurable way. |
How so? Is that based on the assumption that people are relying on the ordering of transactions? Like @pwnall is describing, I did not know I could rely on the order, nor would I (if and when I'd need it, I would handle such a thing in the application).
IMO IndexedDB has never been optimal for large amounts of data (with backpressure), which means you'll have a "blind spot" in the spectrum of use cases. If no one uses IndexedDB like that (because they can't, not because they don't want to) then metrics will not show a performance degradation. |
@vweevers The "traditional" database systems I've looked at use some form of locking, so at least some reads will be blocked on some writes. SQLite only has one database-level S/X lock, and a bunch of systems only implement table-level locking. I don't consider row-level locking or snapshotting to be "fundamental" requirements for a DB system, and I think that SQLite's adoption proves this to be correct. In other words, given the available embedded database systems, I'd strongly object to making snapshots (or any other form of concurrent reads and writes) required in IndexedDB. If your application relies on concurrent reads and writes to the same store, I'm guessing you have large transactions. You'll enjoy better performance on all browser engines if you break up your writes into smaller batches. This will also give your read transactions an opportunity to get scheduled. |
That's fine. Completely different from blocking all reads or writes.
There aren't many available embedded database systems in browsers with persistent storage. Emscripten has the IDBFS file system which (full-circle) uses IndexedDB, and WORKERFS, which is only available in workers. I'm gonna be looking into this though. Particularly interested in the wasm+leveldb option that @inexorabletash mentioned elsewhere.
Yeah, batching writes is how I managed to get 10+ million writes working in a now-discontinued Chrome extension, where the background page was silently synchronizing data with a backend while other contexts were reading from the same store. Thanks to snapshots, I did not have to worry about the order of transactions or what other contexts were doing in the mean time. I still had to work around the limited lifetime of transactions though, having to make a choice between snapshot guarantees and proper backpressure. |
As things stand, from a standards-perspective, I think the obvious right thing to do is to codify that the readonly transaction must finish before the readwrite can start. If all code will be written on implementations that assume the tx1 and tx2 transactions and their application logic can't interleave, then it's almost certain there will be emergent application bugs when they can interleave. However, at TPAC 2018 in the ServiceWorkers meeting, there was the ongoing issue of how SWs should deal with complicated issues like LRU eviction from the Cache API. And a serious proposal was that perhaps we should turn to IndexedDB in those cases (see w3c/ServiceWorker#863 (comment)), enabling it to persist Request and Response objects. Its indices enable all sorts of custom logic and lets us avoid complicating the Cache API. I think it makes sense to make sure there is a game-plan around the realities of that use-case before changing the spec for this related issue. If we pursue that course of action, it's likely that we'd have to face the reality of a SW that wants to serve pre-readwrite-transaction data while there's an ongoing readwrite-transaction that's updating the database. In those cases, it wouldn't be acceptable for "fetch" events to be stalled by a potentially long-running "readwrite" transaction. A snapshot based mechanism would seem helpful. However, that could also just be a new type of transaction with new semantics that are "any 'readonly' transactions opened before the 'fancy' transaction completes will see the database's state before the 'fancy' transaction was opened." (Existing "readonly" transactions wouldn't really work, even with snapshot required, because they would have to be opened prior to the "readwrite" transaction and then spammed with read requests to keep the transaction open under the existing transaction model. Currently we require any transaction created after the "readwrite" one to see its changes and therefore block on it.) |
If the expectation is that these IndexedDB cache-api calls are going to have multiple...cycles? (if they are going to issue an operation, wait until the result, and then issue another operation) - then I can see how the write-while-reading pattern could slow things down. However, if we are only expecting these IndexedDB transactions to be a single 'get' or a single 'put', then I believe the API user could achieve optimal performance by calling 'commit()' on all of their transactions. I could be misunderstanding this though. |
I guess I'm picturing 2 primary use-cases:
Complicated Upgrade ScenariosI think the idealized upgrade scenario for ServiceWorkers as naturally flows from the spec is that the SW script is changed whenever an upgrade needs to happen and it populates a new CacheStorage cache asynchronously during a waitUntil()'ed "install". Normal activation happens or a complicated dance with skipWaiting()/claim() happens. The newly activated ServiceWorker serves out of the new cache. In practice, it seems like many ServiceWorker authors do not want to change the ServiceWorker script when they make changes to their app. Instead, they want the ServiceWorker to be parameterized. This can work with the above, but one problem is that if the cache name/IDB name isn't something hardcoded into the SW script that gets updated, a mechanism like w3c/ServiceWorker#1331 that's synchronously available is desired so the right cache/db can be opened. Or there needs to be a way to pivot the new cache/database to replace the old database. Pivoting from old to new storage could happen in a few ways:
Tying this back to the issue, I think it would be good to have a general consensus about what directions we expect to move the IndexedDB spec to support storing Requests/Responses, and how large or over-engineered sites will deal with the upgrade mechanism. I personally like the idea of having SW's create custom rename-able storage buckets that we are able to tie to their registrations (which helps with automatic cleanup), and giving them the option to fork existing storage buckets as a starting point. Clever IDB/Cache implementations could use snapshot/copy-on-write/unionfs/refcounting/whatever mechanisms under the hood, whereas naive implementations could just brute force copy the data across. And we'd never need to expose snapshot semantics to IDB. |
Chrome no longer has snapshots here, per @dmurph |
TPAC 2019 Web Apps Indexed DB triage notes: Since behaviors align across browsers, we should spec it. |
All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253
In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab
All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253
In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193
In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553}
In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553}
In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553} Co-authored-by: Joshua Bell <[email protected]>
All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253
…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553} Co-authored-by: Joshua Bell <[email protected]> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027
…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553} Co-authored-by: Joshua Bell <[email protected]> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027
…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <dmurphchromium.org> Commit-Queue: Joshua Bell <jsbellchromium.org> Auto-Submit: Joshua Bell <jsbellchromium.org> Cr-Commit-Position: refs/heads/master{#746553} Co-authored-by: Joshua Bell <inexorabletashgmail.com> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027 UltraBlame original commit: 6b1929e476680a2aafc78037914b756199333f09
…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <dmurphchromium.org> Commit-Queue: Joshua Bell <jsbellchromium.org> Auto-Submit: Joshua Bell <jsbellchromium.org> Cr-Commit-Position: refs/heads/master{#746553} Co-authored-by: Joshua Bell <inexorabletashgmail.com> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027 UltraBlame original commit: 6b1929e476680a2aafc78037914b756199333f09
…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <dmurphchromium.org> Commit-Queue: Joshua Bell <jsbellchromium.org> Auto-Submit: Joshua Bell <jsbellchromium.org> Cr-Commit-Position: refs/heads/master{#746553} Co-authored-by: Joshua Bell <inexorabletashgmail.com> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027 UltraBlame original commit: 6b1929e476680a2aafc78037914b756199333f09
All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253
The text in transaction lifetime about "when a transaction can be started" doesn't define an algorithm (unlike e.g. web locks). Instead, it just defines preconditions. Specifically for this case:
Which is then followed by the note:
Chrome implemented snapshots, and therefore allowed a readwrite transaction to start even while a readonly transaction overlapping scope was still running. e.g.
Both tx1 and tx2 could start immediately; operations within tx1 would see a snapshot of the data.
Other engines (Firefox, Safari, Edge) did not implement this; tx2 would not start until tx1 finished.
Chrome is changing this as part of a rework of the implementation; we actually had a non-WPT test asserting this behavior, which broke.
Following this change, all engines will behave the same way - the readonly transactions must finish before the readwrite transaction can start. Should we codify that in the spec and WPTs?
The text was updated successfully, but these errors were encountered: