
Sqlite persistence experiment - *Do not merge* #1746

Closed · wants to merge 7 commits from the sqlite-persistence-experiment branch
Conversation

@v0d1ch (Contributor) commented Nov 21, 2024

Why

  • We wanted to test the immediate benefits of using a simple SQLite database instead of file-based storage.

Notes:

  • We get atomic writes, which is something we had to simulate with file-based storage.
  • We don't keep the database connection open; instead we open and close a connection for each query, since this is considered best practice (see the sketch below).
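
A minimal sketch of that open/close-per-query pattern, using the sqlite-simple API this PR already relies on; the appendItem helper and the items table are illustrative, not the PR's actual code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString.Lazy (ByteString)
import Database.SQLite.Simple (Only (..), execute, execute_, withConnection)

-- Illustrative helper: every call opens a fresh connection, runs its
-- statements, and closes the connection again, so each write is atomic
-- and no long-lived handle is held between queries.
appendItem :: FilePath -> ByteString -> IO ()
appendItem dbFile payload =
  withConnection dbFile $ \conn -> do
    execute_ conn "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, msg BLOB)"
    execute conn "INSERT INTO items (msg) VALUES (?)" (Only payload)
```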

  • CHANGELOG updated or not needed
  • Documentation updated or not needed
  • Haddocks updated or not needed
  • No new TODOs introduced or explained hereafter


Transaction cost differences

No cost or size differences found


github-actions bot commented Nov 21, 2024

Transaction costs

Sizes and execution budgets for Hydra protocol transactions. Note that unlisted parameters currently use arbitrary values, so results are not fully deterministic or comparable to previous runs.

Metadata
Generated at 2024-11-26 15:00:39.676406857 UTC
Max. memory units 14000000
Max. CPU units 10000000000
Max. tx size (bytes) 16384

Script summary

| Name | Hash | Size (Bytes) |
| --- | --- | --- |
| νInitial | 00a6ddbc130ab92f5b7cb8d1ccd8d79eca5bfe25f6843c07b62841f0 | 2667 |
| νCommit | 3e5a776bcee213e3dfd15806952a10ac5590e3e97d09d62eb99266b2 | 690 |
| νHead | 8fc2a74df32d01d1db56b3acb561831ef9c9970123079423abfcb86e | 12622 |
| μHead | c40e78e78083a4c137734abe9ac4070cc978842e9755fe88e0c7b922* | 11133 |
| νDeposit | 2feb47889a4f658dc593cefcb0e37d584b9431944f08a687f3dab4af | 4865 |
  • The minting policy hash is only usable for comparison. As the script is parameterized, the actual script is unique per head.

Init transaction costs

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 11721 | 9.11 | 2.98 | 0.76 |
| 2 | 11918 | 10.60 | 3.46 | 0.79 |
| 3 | 12124 | 12.36 | 4.03 | 0.81 |
| 5 | 12519 | 16.04 | 5.23 | 0.87 |
| 10 | 13528 | 24.69 | 8.03 | 1.00 |
| 24 | 16345 | 49.51 | 16.10 | 1.39 |

Commit transaction costs

This uses ada-only outputs for better comparability.

| UTxO | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 559 | 2.45 | 1.17 | 0.20 |
| 2 | 740 | 3.40 | 1.74 | 0.22 |
| 3 | 923 | 4.39 | 2.34 | 0.24 |
| 5 | 1278 | 6.46 | 3.61 | 0.28 |
| 10 | 2167 | 12.24 | 7.28 | 0.40 |
| 54 | 10046 | 99.20 | 68.72 | 1.89 |

CollectCom transaction costs

| Parties | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- |
| 1 | 57 | 560 | 24.87 | 7.24 | 0.43 |
| 2 | 114 | 671 | 31.81 | 9.29 | 0.50 |
| 3 | 170 | 782 | 43.22 | 12.42 | 0.62 |
| 4 | 225 | 897 | 48.68 | 14.12 | 0.68 |
| 5 | 284 | 1004 | 55.69 | 16.25 | 0.76 |
| 6 | 336 | 1116 | 68.68 | 19.69 | 0.90 |
| 7 | 394 | 1227 | 71.67 | 20.85 | 0.93 |
| 8 | 450 | 1338 | 84.38 | 24.30 | 1.07 |
| 9 | 505 | 1449 | 92.03 | 26.42 | 1.15 |
| 10 | 561 | 1560 | 99.09 | 28.75 | 1.23 |

Decrement transaction costs

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 669 | 22.31 | 7.23 | 0.41 |
| 2 | 786 | 23.80 | 8.32 | 0.44 |
| 3 | 953 | 25.72 | 9.49 | 0.47 |
| 5 | 1294 | 29.78 | 11.96 | 0.53 |
| 10 | 2092 | 40.97 | 18.44 | 0.71 |
| 42 | 6724 | 99.16 | 56.08 | 1.65 |

Close transaction costs

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 688 | 24.34 | 8.02 | 0.44 |
| 2 | 741 | 25.38 | 8.90 | 0.45 |
| 3 | 989 | 28.11 | 10.64 | 0.50 |
| 5 | 1220 | 31.10 | 12.91 | 0.55 |
| 10 | 2122 | 41.73 | 20.11 | 0.73 |
| 45 | 7111 | 99.23 | 62.39 | 1.71 |

Contest transaction costs

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 703 | 30.38 | 9.63 | 0.50 |
| 2 | 864 | 32.96 | 11.18 | 0.54 |
| 3 | 904 | 33.61 | 11.84 | 0.55 |
| 5 | 1258 | 38.43 | 14.87 | 0.62 |
| 10 | 2062 | 49.34 | 21.89 | 0.80 |
| 34 | 5616 | 97.95 | 53.65 | 1.58 |

Abort transaction costs

There is some variation due to the random mixture of initial and already committed outputs.

| Parties | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- |
| 1 | 11608 | 25.79 | 8.77 | 0.93 |
| 2 | 11765 | 35.15 | 12.00 | 1.04 |
| 3 | 11785 | 41.50 | 14.07 | 1.11 |
| 4 | 12054 | 49.95 | 17.03 | 1.21 |
| 5 | 12131 | 61.29 | 20.86 | 1.33 |
| 6 | 12298 | 70.04 | 23.88 | 1.43 |
| 7 | 12565 | 81.15 | 27.74 | 1.56 |
| 8 | 12603 | 88.23 | 30.13 | 1.63 |
| 9 | 12684 | 93.30 | 31.72 | 1.69 |
| 10 | 12719 | 96.75 | 32.91 | 1.73 |

FanOut transaction costs

Involves spending head output and burning head tokens. Uses ada-only UTXO for better comparability.

| Parties | UTxO | UTxO (bytes) | Tx size | % max Mem | % max CPU | Min fee ₳ |
| --- | --- | --- | --- | --- | --- | --- |
| 10 | 0 | 0 | 11715 | 16.61 | 5.67 | 0.84 |
| 10 | 1 | 57 | 11749 | 18.83 | 6.56 | 0.87 |
| 10 | 5 | 285 | 11885 | 27.33 | 10.01 | 0.97 |
| 10 | 10 | 570 | 12054 | 36.59 | 13.82 | 1.08 |
| 10 | 20 | 1139 | 12393 | 55.27 | 21.49 | 1.30 |
| 10 | 30 | 1709 | 12735 | 73.43 | 28.98 | 1.51 |
| 10 | 40 | 2277 | 13074 | 91.60 | 36.46 | 1.73 |
| 10 | 44 | 2504 | 13209 | 98.49 | 39.32 | 1.81 |

End-to-end benchmark results

This page is intended to collect the latest end-to-end benchmark results produced by Hydra's continuous integration (CI) system from the latest master code.

Please note that these results are approximate, as they are currently produced on limited cloud VMs rather than controlled hardware. Rather than focusing on absolute numbers, the emphasis should be on relative results, such as how the timings for a scenario evolve as the code changes.

Generated at 2024-11-26 15:03:36.556624327 UTC

Baseline Scenario

Number of nodes 1
Number of txs 300
Avg. Confirmation Time (ms) 29.641562066
P99 39.48164574999999ms
P95 35.44021810000001ms
P50 27.1035685ms
Number of Invalid txs 0

Three local nodes

Number of nodes 3
Number of txs 900
Avg. Confirmation Time (ms) 96.869982766
P99 139.20830799999996ms
P95 118.44914309999999ms
P50 94.985839ms
Number of Invalid txs 0

@v0d1ch force-pushed the sqlite-persistence-experiment branch from 439d56f to 1dceaec on November 22, 2024 12:09

github-actions bot commented Nov 22, 2024

Test Results

  5 files  ±0  164 suites  +2   33m 29s ⏱️ + 3m 39s
559 tests +4  552 ✅ +4  7 💤 ±0  0 ❌ ±0 
561 runs  +4  554 ✅ +4  7 💤 ±0  0 ❌ ±0 

Results for commit e9d3d08. Comparison against base commit 3748690.

♻️ This comment has been updated with latest results.

@v0d1ch self-assigned this Nov 22, 2024
@v0d1ch added the green 💚 (Low complexity or well understood feature) label Nov 22, 2024
@v0d1ch marked this pull request as ready for review November 22, 2024 13:20
@v0d1ch force-pushed the sqlite-persistence-experiment branch from 430d9de to 3deeff0 on November 22, 2024 13:28
@v0d1ch requested a review from a team November 22, 2024 13:31
@v0d1ch force-pushed the sqlite-persistence-experiment branch 4 times, most recently from 5e94ecd to 7c603f5, on November 25, 2024 08:22
@locallycompact (Contributor) commented:

@v0d1ch Is this still an experiment or do you want to merge it?

@locallycompact force-pushed the sqlite-persistence-experiment branch from 7c603f5 to e9d3d08 on November 26, 2024 14:56
@noonio (Contributor) commented Nov 26, 2024

No, we won't merge this (at present it's significantly slower than master; 5 times or so at least).

If we can resolve that, then maybe, but definitely not beforehand.

@noonio changed the title from "Sqlite persistence experiment" to "Sqlite persistence experiment - *Do not merge*" on Nov 26, 2024
@v0d1ch (Contributor, Author) commented Nov 26, 2024

I think we need to find a way to configure SQLite, since the Haskell library we used does not provide this. Perhaps we should also explore a different library, since I believe this PR should bring us benefits if we are able to configure SQLite for performance.

@noonio (Contributor) commented Nov 27, 2024

Here are my thoughts.

Overall I think this is the way we need to head.

At the moment, the performance seems too poor to consider adopting it; on my computer, it's roughly 10x slower than master.

From reading a few articles, mostly https://www.powersync.com/blog/sqlite-optimizations-for-ultra-high-performance, it seems like sqlite shouldn't really be this slow; and may in fact be faster, if we can do it correctly.

There are a couple of notes:

  • We have one "database" per thing in the state folder: acks, network-messages, server-output, state; maybe they could all be tables in one database.
  • We write raw JSON; I couldn't work out whether this should be a BLOB column, a TEXT column, or something else. Nothing I changed here made much of a difference.
  • Relatedly, given that we don't expect the state folder to persist between type changes, we could create tables that match the Haskell types directly and avoid Aeson encoding/decoding (if that's what's slow?!).
  • We increment the acks instead of overwriting them; maybe it should just be a single row that's updated (see the sketch after this list).
  • The file sizes in sqlite are larger (naturally); that's probably fine, but maybe something to be cautious about; maybe it means we should have multiple databases instead of just one.
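
A minimal sketch of that single-row idea for acks, assuming the same sqlite-simple API as the PR; the saveAcks helper, the acks table, and the upsert query are illustrative, not existing code:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString.Lazy (ByteString)
import Database.SQLite.Simple (Only (..), execute, execute_, withConnection)

-- Illustrative: keep exactly one row holding the latest acks value and
-- overwrite it in place, instead of appending a new row on every change.
saveAcks :: FilePath -> ByteString -> IO ()
saveAcks dbFile payload =
  withConnection dbFile $ \conn -> do
    execute_ conn "CREATE TABLE IF NOT EXISTS acks (id INTEGER PRIMARY KEY CHECK (id = 0), msg BLOB)"
    execute
      conn
      "INSERT INTO acks (id, msg) VALUES (0, ?) ON CONFLICT (id) DO UPDATE SET msg = excluded.msg"
      (Only payload)
```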

I think we should revisit this soon and try to pinpoint the slow parts.

Appendix

Here's the diff of my hacking:

diff --git a/hydra-node/src/Hydra/SqlLitePersistence.hs b/hydra-node/src/Hydra/SqlLitePersistence.hs
index 9f639def3..1d7836b30 100644
--- a/hydra-node/src/Hydra/SqlLitePersistence.hs
+++ b/hydra-node/src/Hydra/SqlLitePersistence.hs
@@ -39,15 +39,17 @@ createPersistence ::
   m (Persistence a m)
 createPersistence fp = do
   liftIO . createDirectoryIfMissing True $ takeDirectory fp
-  let dbName = Query (T.pack $ "\"" <> fp <> "\"")
   liftIO $ withConnection fp $ \conn -> do
-    execute_ conn $ "CREATE TABLE IF NOT EXISTS " <> dbName <> " (id INTEGER PRIMARY KEY, msg SQLBlob)"
+    execute_ conn "pragma journal_mode = WAL;"
+    execute_ conn "pragma synchronous = normal;"
+    execute_ conn "pragma journal_size_limit = 6144000;"
+    execute_ conn "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, msg BLOB)"
   pure $
     Persistence
       { save = \a -> liftIO $ withConnection fp $ \conn' ->
-          execute conn' ("INSERT INTO " <> dbName <> " (msg) VALUES (?)") (Only $ Aeson.encode a)
+          execute conn' "INSERT INTO items (msg) VALUES (?)" (Only $ Aeson.encode a)
       , load = liftIO $ withConnection fp $ \conn' -> do
-          r <- query_ conn' ("SELECT msg FROM " <> dbName <> " ORDER BY id DESC LIMIT 1")
+          r <- query_ conn' "SELECT msg FROM items ORDER BY id DESC LIMIT 1"
           case r of
             [] -> pure Nothing
             (Record result : _) -> pure $ Aeson.decode result
@@ -68,18 +70,20 @@ createPersistenceIncremental ::
   m (PersistenceIncremental a m)
 createPersistenceIncremental fp = do
   liftIO . createDirectoryIfMissing True $ takeDirectory fp
-  let dbName = Query (T.pack $ "\"" <> fp <> "\"")
   liftIO $ withConnection fp $ \conn -> do
-    execute_ conn $ "CREATE TABLE IF NOT EXISTS " <> dbName <> " (id INTEGER PRIMARY KEY, msg SQLBlob)"
+    execute_ conn "pragma journal_mode = WAL;"
+    execute_ conn "pragma synchronous = normal;"
+    execute_ conn "pragma journal_size_limit = 6144000;"
+    execute_ conn "CREATE TABLE IF NOT EXISTS items (id INTEGER PRIMARY KEY, msg BLOB)"
   pure $
     PersistenceIncremental
       { append =
           -- TODO: try to batch insert here or use some other trick to make it faster
           \a -> liftIO $ withConnection fp $ \conn' ->
-            execute conn' ("INSERT INTO " <> dbName <> " (msg) VALUES (?)") (Only $ Aeson.encode a)
+            execute conn' "INSERT INTO items (msg) VALUES (?)" (Only $ Aeson.encode a)
       , loadAll = liftIO $ withConnection fp $ \conn' -> do
           let collectValues acc (Record i) = pure $ i : acc
-          bsVals <- fold_ conn' ("SELECT msg FROM " <> dbName <> " ORDER BY id DESC") [] collectValues
+          bsVals <- fold_ conn' "SELECT msg FROM items ORDER BY id DESC" [] collectValues
           pure $ mapMaybe Aeson.decode bsVals
       , dropDb = liftIO $ removeFile fp
       }
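
One direction for the TODO in that diff: batch appends inside a single transaction, so SQLite commits (and syncs) once per batch rather than once per INSERT. A minimal sketch, assuming sqlite-simple's withTransaction and executeMany; the appendMany helper is hypothetical, not part of this PR:

```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.ByteString.Lazy (ByteString)
import Database.SQLite.Simple (Only (..), executeMany, withConnection, withTransaction)

-- Hypothetical batched append: one transaction wraps all inserts, so the
-- journal is committed once for the whole batch instead of once per row.
appendMany :: FilePath -> [ByteString] -> IO ()
appendMany dbFile payloads =
  withConnection dbFile $ \conn ->
    withTransaction conn $
      executeMany conn "INSERT INTO items (msg) VALUES (?)" (Only <$> payloads)
```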

@noonio closed this Nov 27, 2024