Disallow starting readwrite transactions while readonly transactions are running? #253

inexorabletash · 2019-01-18T19:37:16Z

The text in transaction lifetime about "when a transaction can be started" doesn't define an algorithm (unlike e.g. web locks). Instead, it just defines preconditions. Specifically for this case:

... As long as a read-only transaction is running, the data that the implementation returns through requests created with that transaction must remain constant ...

Which is then followed by the note:

There are a number of ways that an implementation can ensure this. The implementation could prevent any read/write transaction, whose scope overlaps the scope of the read-only transaction, from starting until the read-only transaction finishes. Or the implementation could allow the read-only transaction to see a snapshot of the contents of the object stores which is taken when the read-only transaction started.

Chrome implemented snapshots, and therefore allowed a readwrite transaction to start even while a readonly transaction overlapping scope was still running. e.g.

const tx1 = db.transaction('store', 'readonly');
const tx2 = db.transaction('store', 'readwrite');

Both tx1 and tx2 could start immediately; operations within tx1 would see a snapshot of the data.

Other engines (Firefox, Safari, Edge) did not implement this; tx2 would not start until tx1 finished.

Chrome is changing this as part of a rework of the implementation; we actually had a non-WPT test asserting this behavior, which broke.

Following this change, all engines will behave the same way - the readonly transactions must finish before the readwrite transaction can start. Should we codify that in the spec and WPTs?

pwnall · 2019-01-18T20:16:58Z

I'm conflicted between the two. I'll explain my arguments for both sides, to help the decision.

No ordering

As a user, the tutorials and MDN guides I have access to don't describe the ordering guarantees. For example, purely based on the guides, I didn't know that I can rely on requests completing in order. For example, when I had to fire 3 put requests, I wrote code waiting for all 3 to complete before proceeding. I didn't realize that I can just wait for the last request. I claim that most folks writing IndexedDB code operate in a similar mode.

As an implementer, the request ordering costs us extra complexity and will cost us extra memory. AFAIK the Cache Storage API has ran into a similar problem, and they decided to sacrifice ordering for performance. I definitely wish we didn't have ordering constraints -- user code that needs ordering can add extra coordination, and code that doesn't could squeeze out a bit more performance.

So, I think we shouldn't add extra ordering and, instead, we should call out that the ordering is implementation-dependent, but must be consistent with some ordering where readonly and readwrite don't overlap. To be clear, I don't know of a concrete setup where this would be a net-benefit... I'm just thinking about not over-constraining the future.

Ordering

Having strict ordering requirements and associated tests means we'll have consistency across IndexedDB implementations, increasing the odds that content will work well across browsers.

On the testing front -- after Chrome removes its snapshotting system, we won't have any implementation that does snapshots. So, we won't have any guarantee that WPT tests won't accidentally start depending on ordering anyways.

The Cache Storage API parallel isn't perfect. In their case, prioritizing performance is easily the right call, because Cache Storage reads generally block page loads. IndexedDB is used more often outside of critical paths.

vweevers · 2019-01-18T20:19:05Z

Chrome is changing this as part of a rework of the implementation

Is Chrome making this change merely to align with other engines, because of technical limitations of the rework, or because the new behavior is deemed better (and if so, why?).

I'm curious, because being able to (slowly) read and write simultaneously, and for reads to not be affected by writes, seems like a fairly fundamental requirement for a database.

inexorabletash · 2019-01-18T20:26:40Z

Is Chrome making this change merely to align with other engines, because of technical limitations of the rework, or because the new behavior is deemed better (and if so, why?).

The implementation ends up being slightly simpler. The new behavior "falls out" of the new implementation. We could maintain Chrome's current behavior with additional code, which would not be too burdensome, but it's not clear there's a good reason to do so. (Anyone relying on it would have cross-browser bugs.) We'll be monitoring metrics to see if it affects performance of sites in a measurable way.

vweevers · 2019-01-18T20:48:09Z

Anyone relying on it would have cross-browser bugs

How so? Is that based on the assumption that people are relying on the ordering of transactions? Like @pwnall is describing, I did not know I could rely on the order, nor would I (if and when I'd need it, I would handle such a thing in the application).

We'll be monitoring metrics to see if it affects performance of sites in a measurable way.

IMO IndexedDB has never been optimal for large amounts of data (with backpressure), which means you'll have a "blind spot" in the spectrum of use cases. If no one uses IndexedDB like that (because they can't, not because they don't want to) then metrics will not show a performance degradation.

pwnall · 2019-01-18T20:57:30Z

@vweevers The "traditional" database systems I've looked at use some form of locking, so at least some reads will be blocked on some writes. SQLite only has one database-level S/X lock, and a bunch of systems only implement table-level locking. I don't consider row-level locking or snapshotting to be "fundamental" requirements for a DB system, and I think that SQLite's adoption proves this to be correct.

In other words, given the available embedded database systems, I'd strongly object to making snapshots (or any other form of concurrent reads and writes) required in IndexedDB.

If your application relies on concurrent reads and writes to the same store, I'm guessing you have large transactions. You'll enjoy better performance on all browser engines if you break up your writes into smaller batches. This will also give your read transactions an opportunity to get scheduled.

vweevers · 2019-01-18T21:29:23Z

so at least some reads will be blocked on some writes.

That's fine. Completely different from blocking all reads or writes.

In other words, given the available embedded database systems, I'd strongly object to making snapshots (or any other form of concurrent reads and writes) required in IndexedDB.

There aren't many available embedded database systems in browsers with persistent storage. Emscripten has the IDBFS file system which (full-circle) uses IndexedDB, and WORKERFS, which is only available in workers. I'm gonna be looking into this though. Particularly interested in the wasm+leveldb option that @inexorabletash mentioned elsewhere.

You'll enjoy better performance on all browser engines if you break up your writes into smaller batches. This will also give your read transactions an opportunity to get scheduled.

Yeah, batching writes is how I managed to get 10+ million writes working in a now-discontinued Chrome extension, where the background page was silently synchronizing data with a backend while other contexts were reading from the same store. Thanks to snapshots, I did not have to worry about the order of transactions or what other contexts were doing in the mean time. I still had to work around the limited lifetime of transactions though, having to make a choice between snapshot guarantees and proper backpressure.

asutherland · 2019-01-23T13:48:09Z

As things stand, from a standards-perspective, I think the obvious right thing to do is to codify that the readonly transaction must finish before the readwrite can start. If all code will be written on implementations that assume the tx1 and tx2 transactions and their application logic can't interleave, then it's almost certain there will be emergent application bugs when they can interleave.

However, at TPAC 2018 in the ServiceWorkers meeting, there was the ongoing issue of how SWs should deal with complicated issues like LRU eviction from the Cache API. And a serious proposal was that perhaps we should turn to IndexedDB in those cases (see w3c/ServiceWorker#863 (comment)), enabling it to persist Request and Response objects. Its indices enable all sorts of custom logic and lets us avoid complicating the Cache API. I think it makes sense to make sure there is a game-plan around the realities of that use-case before changing the spec for this related issue.

If we pursue that course of action, it's likely that we'd have to face the reality of a SW that wants to serve pre-readwrite-transaction data while there's an ongoing readwrite-transaction that's updating the database. In those cases, it wouldn't be acceptable for "fetch" events to be stalled by a potentially long-running "readwrite" transaction. A snapshot based mechanism would seem helpful. However, that could also just be a new type of transaction with new semantics that are "any 'readonly' transactions opened before the 'fancy' transaction completes will see the database's state before the 'fancy' transaction was opened."

(Existing "readonly" transactions wouldn't really work, even with snapshot required, because they would have to be opened prior to the "readwrite" transaction and then spammed with read requests to keep the transaction open under the existing transaction model. Currently we require any transaction created after the "readwrite" one to see its changes and therefore block on it.)

dmurph · 2019-01-25T23:47:41Z

If the expectation is that these IndexedDB cache-api calls are going to have multiple...cycles? (if they are going to issue an operation, wait until the result, and then issue another operation) - then I can see how the write-while-reading pattern could slow things down.

However, if we are only expecting these IndexedDB transactions to be a single 'get' or a single 'put', then I believe the API user could achieve optimal performance by calling 'commit()' on all of their transactions. I could be misunderstanding this though.

asutherland · 2019-01-28T23:34:35Z

I guess I'm picturing 2 primary use-cases:

Something like LRU eviction that operates on streaming processing of the current contents of the database where the reads use cursors/similar to perform mutations in bite-sized pieces. This could be a fairly large amount of data and take a fairly long time.
Complicated upgrade scenarios that may require IndexedDB enhancements.

Complicated Upgrade Scenarios

I think the idealized upgrade scenario for ServiceWorkers as naturally flows from the spec is that the SW script is changed whenever an upgrade needs to happen and it populates a new CacheStorage cache asynchronously during a waitUntil()'ed "install". Normal activation happens or a complicated dance with skipWaiting()/claim() happens. The newly activated ServiceWorker serves out of the new cache.

In practice, it seems like many ServiceWorker authors do not want to change the ServiceWorker script when they make changes to their app. Instead, they want the ServiceWorker to be parameterized. This can work with the above, but one problem is that if the cache name/IDB name isn't something hardcoded into the SW script that gets updated, a mechanism like w3c/ServiceWorker#1331 that's synchronously available is desired so the right cache/db can be opened. Or there needs to be a way to pivot the new cache/database to replace the old database.

Pivoting from old to new storage could happen in a few ways:

Atomically renaming caches / IDB databases. This is not yet a thing.
Use a lock or transaction to stall new requests while the new, already-fetched state is propagated into the "current" database. The IDB transaction model doesn't quite work here without jumping through some hoops to spam the target "readwrite" database with no-op requests if the source isn't delivering data fast enough. This could all be made less ugly if IDB started supporting waitUntil()/web-locks, but then things could end up blocking for quite some time. So then a snapshot mechanism potentially looks more attractive, especially because it could also simplify the update process.
Something involving storage buckets. Be able to clone/fork them and/or rename them.
Addressing A simple globally available store of data per-registration ServiceWorker#1331 with some means of having a per-ServiceWorker or per-registration store that has very good performance characteristics (synchronous or guaranteed <10ms fast async) so that people use it to identify the current db/cache names.

Tying this back to the issue, I think it would be good to have a general consensus about what directions we expect to move the IndexedDB spec to support storing Requests/Responses, and how large or over-engineered sites will deal with the upgrade mechanism. I personally like the idea of having SW's create custom rename-able storage buckets that we are able to tie to their registrations (which helps with automatic cleanup), and giving them the option to fork existing storage buckets as a starting point. Clever IDB/Cache implementations could use snapshot/copy-on-write/unionfs/refcounting/whatever mechanisms under the hood, whereas naive implementations could just brute force copy the data across. And we'd never need to expose snapshot semantics to IDB.

inexorabletash · 2019-09-20T01:57:19Z

Chrome no longer has snapshots here, per @dmurph

inexorabletash · 2019-09-20T01:58:21Z

TPAC 2019 Web Apps Indexed DB triage notes:

Since behaviors align across browsers, we should spec it.

All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253

In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab

All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253

In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193

In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553}

In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553} Co-authored-by: Joshua Bell <[email protected]>

All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253

…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <[email protected]> Commit-Queue: Joshua Bell <[email protected]> Auto-Submit: Joshua Bell <[email protected]> Cr-Commit-Position: refs/heads/master@{#746553} Co-authored-by: Joshua Bell <[email protected]> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027

…cheduling tests, a=testonly Automatic update from web-platform-tests WPT: Upstream and add some transaction scheduling tests (#22027) In service of w3c/IndexedDB#253 move some transaction scheduling tests from Blink to WPT. This involved converting them from js-test.js to testharness.js, but the overall logic of each test was retained. This also adds one new test which verifies the change described in w3c/IndexedDB#319 - all browsers implicitly block R-O transactions behind overlapping R/W transactions. Change-Id: I596aaa75b79bf3bf3e17a2553abb4e11329d59ab Bug: 921193 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2081237 Reviewed-by: Daniel Murphy <dmurphchromium.org> Commit-Queue: Joshua Bell <jsbellchromium.org> Auto-Submit: Joshua Bell <jsbellchromium.org> Cr-Commit-Position: refs/heads/master{#746553} Co-authored-by: Joshua Bell <inexorabletashgmail.com> -- wpt-commits: 5e1160e543827227c05b10c7660795436a76b307 wpt-pr: 22027 UltraBlame original commit: 6b1929e476680a2aafc78037914b756199333f09

All implementations have a strict ordering of transactions with overlapping scopes; read-only transactions can run in parallel but block a later read/write transaction from starting, and the read/write transactions similarly block later read/write and read-only transactions. Tighten up the constraints definition to something precise, and move the wordy implications into a non-normative aside. Also, define "overlapping scopes" as a term. Closes #253

inexorabletash added feedback wanted needs-tests labels Jan 18, 2019

inexorabletash closed this as completed Jan 29, 2019

inexorabletash reopened this Jan 29, 2019

inexorabletash added needs-pr and removed feedback wanted labels Sep 20, 2019

inexorabletash added this to the V3 milestone Sep 20, 2019

inexorabletash mentioned this issue Feb 27, 2020

Simplify transaction starting constraints to match reality #319

Merged

2 tasks

inexorabletash added has-pr and removed needs-pr labels Feb 27, 2020

chromium-wpt-export-bot mentioned this issue Feb 29, 2020

WPT: Upstream and add some transaction scheduling tests web-platform-tests/wpt#22027

Merged

inexorabletash closed this as completed in #319 Mar 4, 2020

inexorabletash added has-tests and removed needs-tests labels Mar 4, 2020

whiten mentioned this issue Jul 24, 2020

decide what if anything to do about the difference in memstore/idbstore apis rocicorp/repc#62

Closed

inexorabletash mentioned this issue Sep 28, 2020

Specification status report for TPAC 2020 #338

Closed

fergald mentioned this issue May 20, 2022

How should IDB connections and outstanding IDB transactions behave when a page enters back/forward cache? #381

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Disallow starting readwrite transactions while readonly transactions are running? #253

Disallow starting readwrite transactions while readonly transactions are running? #253

inexorabletash commented Jan 18, 2019

pwnall commented Jan 18, 2019

vweevers commented Jan 18, 2019

inexorabletash commented Jan 18, 2019

vweevers commented Jan 18, 2019

pwnall commented Jan 18, 2019

vweevers commented Jan 18, 2019

asutherland commented Jan 23, 2019

dmurph commented Jan 25, 2019

asutherland commented Jan 28, 2019

inexorabletash commented Sep 20, 2019

inexorabletash commented Sep 20, 2019

Disallow starting readwrite transactions while readonly transactions are running? #253

Disallow starting readwrite transactions while readonly transactions are running? #253

Comments

inexorabletash commented Jan 18, 2019

pwnall commented Jan 18, 2019

vweevers commented Jan 18, 2019

inexorabletash commented Jan 18, 2019

vweevers commented Jan 18, 2019

pwnall commented Jan 18, 2019

vweevers commented Jan 18, 2019

asutherland commented Jan 23, 2019

dmurph commented Jan 25, 2019

asutherland commented Jan 28, 2019

Complicated Upgrade Scenarios

inexorabletash commented Sep 20, 2019

inexorabletash commented Sep 20, 2019