Add pure SQLite approach, with generate_series #23
Conversation
Hey @asg017, I tried your version on my iMac (3.6 GHz Intel Core i9, 8 cores) and unfortunately I am not seeing better timings than the best Rust ones.

```
sqlite3 --version
3.34.0 2020-12-01 16:14:00 a26b6597e3ae272231b96f9982c3bcc17ddec2f2b6eb4df06a224b91089fed5b
```

```
# The Rust program
./target/release/threaded_batched  25,57s user 3,00s system 109% cpu 26,178 total

# The pure SQLite version
sqlite3 sqlite3.db '.read load.sql'  40,06s user 1,95s system 84% cpu 49,616 total
```

Not sure why, but maybe it's because the random generator SQLite uses internally is not that fast. This is also something that I find unfair, as the original blog post didn't specify which random generator to use. It would be fairer to align the RNG of the Python and Rust programs with the one SQLite uses internally, and if possible to seed the programs so that the same databases are generated at the end. It is too easy to generate poor random numbers very fast and gain speed that way.
Hey @Kerollmops, thanks for giving it a shot! I agree that this may all come down to the random number generator being used. (And the same point in the opposite way: maybe a custom RNG in Rust/pypy might make those runs faster?)

In general, benchmarks are always hard, especially on different hardware. I'm guessing my specific machine isn't efficiently using all of its cores or something.

And I'd say there are even more things you can benchmark when testing inserts in SQLite: here we insert 3 random numbers 100 million times, but what if we instead insert 100 million rows from a CSV, or 100 million JSON objects? Of course this is out of scope for this repo, but I definitely love to think about it!

And some other random notes: I tried adding the pragma statements that made the other approaches faster, like so:

```sql
PRAGMA journal_mode = OFF;
PRAGMA synchronous = 0;
PRAGMA cache_size = 1000000;
PRAGMA locking_mode = EXCLUSIVE;
PRAGMA temp_store = MEMORY;
```

But it actually made this run slower, by ~15s for me. I'll try to find out why, since I don't quite understand these pragma statements to begin with...
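For anyone else puzzling over those pragmas, here is my rough understanding of what each one does (annotations are mine; worth double-checking against the SQLite documentation):

```sql
PRAGMA journal_mode = OFF;        -- disable the rollback journal entirely: fewer writes, no crash recovery
PRAGMA synchronous = 0;           -- never fsync; let the OS flush to disk whenever it likes
PRAGMA cache_size = 1000000;      -- page cache of 1,000,000 pages (a positive value counts pages, not bytes)
PRAGMA locking_mode = EXCLUSIVE;  -- acquire the file lock once and hold it for the connection's lifetime
PRAGMA temp_store = MEMORY;       -- keep temporary tables and indexes in RAM instead of temp files
```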
I replaced the code for random generation in SQL and in Rust. The function is SQLite's random(): https://www.sqlite.org/c3ref/randomness.html
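For illustration, this is roughly what leaning on SQLite's built-in RNG looks like from SQL (the value ranges here are made up for the example):

```sql
-- random() returns a signed 64-bit integer; abs(... % N) folds it
-- into [0, N-1]. Both ranges below are illustrative, not from this repo.
SELECT
  abs(random() % 1000000) AS area,   -- value in [0, 999999]
  abs(random() % 2)       AS active; -- coin flip: 0 or 1
```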
Based on #11. This PR adds a new method of quickly inserting 100M rows into a SQLite table, using a pure SQLite approach. Using the `generate_series` table-valued function, we can insert 100M rows into a table very, very quickly with a single `INSERT INTO ... SELECT` statement.
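For concreteness, the shape of the approach is a single statement like the sketch below. The table schema and value ranges are assumptions for illustration, not a copy of this PR's load.sql:

```sql
-- Hypothetical schema; the real one lives in this repo.
CREATE TABLE user (
  area   INTEGER,
  age    INTEGER,
  active INTEGER
);

-- One INSERT ... SELECT that materializes 100M rows straight out of
-- the generate_series table-valued function, all inside SQLite.
INSERT INTO user (area, age, active)
SELECT
  abs(random() % 1000000),  -- three random values per row
  abs(random() % 100),
  abs(random() % 2)
FROM generate_series(1, 100000000);
```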
The question of "how fast is this" is difficult to answer. I'm not sure if it's something with my computer/background apps I'm running, but my benchmarks of the previous python/rust solutions have been all over the place. Using this approach, I get `57s`. For the pypy threaded_batched approach, I get `5m50s`, and with the rust threaded_batched approach, I get `1m9s`.

Given this, I'm tempted to say that this actually might be the fastest solution, but given my wildly different benchmark times (pypy's `5m50s` vs `126s`, rust's `1m9s` vs `30s`), I'd love to see how this runs on your computer, using the full benchmark!

Why (I think) it's so fast
One solid reason why I think this approach is so fast is that only a single SQL statement is being run. All the other approaches perform millions of `insert into ... values (...)` SQL statements; each one is a fast operation, but doing anything millions of times starts to add up. We see an obvious benefit when we start to batch statements together, but even batching 50 rows at a time still leaves 2 million distinct SQL statements that need to be run.
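To make that arithmetic concrete: one batched statement carries 50 rows, so 100M rows still means 100,000,000 / 50 = 2,000,000 statements shaped like the sketch below (the values are hypothetical; the real scripts build these batches programmatically):

```sql
-- One "batch of 50" insert: a single statement carrying 50 VALUES tuples.
INSERT INTO user (area, age, active) VALUES
  (140568, 35, 1),
  (877412, 12, 0),
  -- ... 47 more tuples ...
  (309214, 78, 1);
```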
Also, by using `generate_series`, all operations are kept in C, which is very fast. Rust and pypy are also fast, but my guess is that context switching between Rust -> C or pypy -> C takes a non-trivial amount of time that adds up fast.
Also, on `generate_series` vs recursive CTEs: the `generate_series` documentation page mentions that recursive CTEs are slower than table-valued functions.
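For comparison, the recursive-CTE way to produce the same series looks like this (a sketch reusing the hypothetical `user` table from above; it works in any SQLite build, but the series extension's C loop should be faster):

```sql
-- Recursive-CTE equivalent of generate_series(1, 100000000).
WITH RECURSIVE series(value) AS (
  SELECT 1
  UNION ALL
  SELECT value + 1 FROM series WHERE value < 100000000
)
INSERT INTO user (area, age, active)
SELECT
  abs(random() % 1000000),
  abs(random() % 100),
  abs(random() % 2)
FROM series;
```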
Note: `generate_series` may not be included in your sqlite3 CLI by default, so make sure the `sqlite3` CLI that you have ships with that extension.

Would love to hear what you think, and to see the "official" benchmark numbers!