Slow INSERT DATA #1683
Comments
@nicolano Can you please post a link to the full dataset and the query? A local link to our file system is also fine if that is easier for you. Here is the expected performance for the current version of the code: https://github.com/ad-freiburg/qlever/wiki/First-tests-with-SPARQL-1.1-Update . That's 2 µs / triple, not 70 ms / triple.
I think you can use any dataset here, as this happened with every dataset I used. For a freshly started QLever endpoint, I get the expected results. The slow queries only occur
@nicolano I just tested it with your example update on OSM Switzerland (0.7 B triples) and it took 0.1 s. If you have a use case that is slow, it would be good to specify it exactly, so that we can try to reproduce it.
I am unable to get OLU to run on an OSM dataset for Bremen. The TTL extracts at https://osm2rdf.cs.uni-freiburg.de are only available for countries. Converting the snapshots from Geofabrik (both external and internal) with
Reproduction setup
The tests that I conducted use the dataset for Switzerland as of today or 2024-12-17. The QLever instances were running with increased timeouts and memory allocation (Qleverfile). I decreased
Results
Running it on 2024-12-17, the deletion step was slow. The first batch of 32 values resulted in ~30 million deletions and took ~94 s. The number of deletions is high (1/30 of the total ~900 million triples). The duration per triple is comparable to the 2 µs measured by Hannah.
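To make the "comparable to 2 µs" statement concrete, here is a minimal sketch of the arithmetic using only the figures quoted in this comment (~94 s for ~30 million deletions). The helper name `per_triple_microseconds` is made up for illustration; both inputs are the approximate values from the comment, so the result is only a rough estimate.

```python
def per_triple_microseconds(total_seconds: float, triples: int) -> float:
    """Average cost per triple in microseconds (hypothetical helper)."""
    return total_seconds * 1e6 / triples


# Approximate figures from the comment: ~94 s for ~30 million deletions.
deletion_cost = per_triple_microseconds(94, 30_000_000)
print(f"{deletion_cost:.2f} µs per deleted triple")
```

This comes out at roughly 3 µs per triple, the same order of magnitude as the 2 µs reference.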
@Qup42 Could you please check the issue I have opened for your problem with the Bremen dataset? I will try to specify my use case exactly:
OLU inserts 320804 triples in batches of 1024 triples into the QLever instance, which takes 435321 ms (processing time taken from the QLever server log), or about 1.4 ms per triple.
Here you can find the QLever server log for reference
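As a sanity check on these numbers, the following sketch recomputes the per-triple and per-batch cost from the figures above and compares them with the 2 µs / triple reference mentioned earlier in the thread. It assumes the last batch is simply a partial one (hence the `ceil`); the constant names are chosen for illustration only.

```python
import math

# Figures quoted in the comment.
TRIPLES = 320_804
BATCH_SIZE = 1_024
TOTAL_MS = 435_321

# Reference figure mentioned earlier in the thread (2 µs per triple).
REFERENCE_US_PER_TRIPLE = 2.0

# Assumption: the final batch is partial, so round the batch count up.
batches = math.ceil(TRIPLES / BATCH_SIZE)
ms_per_triple = TOTAL_MS / TRIPLES
ms_per_batch = TOTAL_MS / batches
slowdown = ms_per_triple * 1000 / REFERENCE_US_PER_TRIPLE

print(f"{batches} batches")
print(f"{ms_per_triple:.2f} ms per triple")
print(f"{ms_per_batch:.0f} ms per batch")
print(f"about {slowdown:.0f}x the reference cost per triple")
```

Under these assumptions each batch of 1024 triples takes well over a second, which is several hundred times the reference cost per triple.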
I use the following query when inserting data into a graph
Depending on the size of the database, this can be very slow. For a SPARQL endpoint with the OSM dataset for Bremen (~30 million triples), it takes about 70 ms per triple. Is this behaviour to be expected with the current version of QLever, or is there a bug here (is the query I am using possibly not optimal)?
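For readers who want to reproduce this kind of measurement, here is a minimal sketch of how triples could be grouped into batched `INSERT DATA` updates against a named graph. It is not the query from this issue: the helper names `chunked` and `build_insert_data` and the example IRIs are hypothetical, and the terms are assumed to be already serialized in SPARQL syntax. Sending the update to an endpoint and timing it is left out on purpose.

```python
from typing import Iterator, List, Sequence, Tuple

# A triple whose terms are already written in SPARQL syntax,
# e.g. ("<http://example.org/s>", "<http://example.org/p>", '"o"').
Triple = Tuple[str, str, str]


def chunked(triples: Sequence[Triple], size: int) -> Iterator[List[Triple]]:
    """Yield consecutive batches of at most `size` triples."""
    for start in range(0, len(triples), size):
        yield list(triples[start:start + size])


def build_insert_data(triples: Sequence[Triple], graph_iri: str) -> str:
    """Build one SPARQL 1.1 INSERT DATA update for a named graph."""
    body = "\n".join(f"    {s} {p} {o} ." for s, p, o in triples)
    return f"INSERT DATA {{\n  GRAPH <{graph_iri}> {{\n{body}\n  }}\n}}"


# Hypothetical example data: 2500 triples split into batches of 1024.
example = [
    (f"<http://example.org/s{i}>", "<http://example.org/p>", f'"{i}"')
    for i in range(2500)
]
updates = [
    build_insert_data(batch, "http://example.org/graph")
    for batch in chunked(example, 1024)
]
print(len(updates), "updates")
print(updates[-1][:60])
```

Timing each generated update separately, on a fresh instance and again after many earlier updates, would show whether the per-triple cost grows with the number of updates already applied.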