This guide explains how to install, build and run the benchmark.
The benchmark requires the following software:
Build:
- JDK 8 (note that JOPA currently does not support later versions of Java)
- Apache Maven 3.3 or later (to build the benchmark artifacts)
Execution:
- RDF4J-compatible repository server
- A Unix shell (e.g., `dash`, `bash`)*

* A shell is required to execute the benchmark using the attached scripts
Configuration of the object-triple mapping (OTM) library-specific runners is done through `config.properties` files in the respective modules. The files can be found in `src/main/resources`. For example, the AliBaba configuration is in `alibaba-benchmark/src/main/resources/config.properties`.
Each configuration has two parameters:
- `url` - URL of the repository to be used for benchmark execution
- `memory.runtime` - time in milliseconds specifying for how long the memory benchmark should be executed. During this time, garbage collection data are gathered for later analysis.
The only exception to these configuration rules is Empire, for which the repository URL is set in a separate file called `empire.configuration` (placed in the same location as `config.properties`). This is required by Empire itself.
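For illustration, a `config.properties` could look like this (the repository URL and duration are example values only):

```properties
# URL of the RDF4J-compatible repository used by the benchmark (example value)
url=http://localhost:7200/repositories/benchmark
# Duration of the memory benchmark in milliseconds; GC data are gathered during this period (example value)
memory.runtime=300000
```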
To build the benchmark (including runners of the individual OTM frameworks), invoke `mvn clean package` in the root benchmark directory.
The build produces a JAR file for each of the OTM framework benchmark runners, which can be found in the `target` directory of the respective module. These JAR files are used by the benchmark executor scripts.
The benchmark cleans up the repository after each round and uses a new persistence context for each round.
The execution can be configured using the following parameters:
- `-w` is the number of warmup rounds, which are not measured.
- `-r` is the number of measured rounds.
- `-f` is the scaling factor, which configures the size of the benchmark dataset. The default is 1.
- `-o` is the file into which individual round execution times should be written. This is useful for separate processing of the raw execution times, e.g., in R.
- `-m` is the file into which memory tracking statistics should be output. These are collected using `jstat`.
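As an illustration, a single runner could be invoked directly along these lines (the JAR name and output paths are hypothetical; the actual artifact names depend on the module):

```shell
java -jar jopa-benchmark/target/jopa-benchmark.jar \
     -w 20 -r 100 -f 1 \
     -o data/jopa-times.txt \
     -m data/jopa-memory.txt
```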
The easiest way to execute the benchmark is to use the associated `benchmark.sh` script. This script contains a predefined configuration which:
- Executes the benchmark using each of the supported OTM frameworks
- Runs the benchmark for each framework with multiple heap size configurations (using the `Xmx` and `Xms` JVM parameters)
- Uses a GraphDB instance as a repository
- Restarts the GraphDB between operation executions
- Runs the benchmark repeatedly to get performance data from multiple JVM executions
- Outputs raw performance data (using the `-o` switch) into a directory called `data`, as well as overall statistics (written into `benchmark.log`)
The script expects GraphDB to be installed in `~/Java/graphdb` and writes the GraphDB process id to `/tmp/.graphdbpid` (it is used to stop the GraphDB instance when needed).
The current configuration is:
- Execution count: 5
- Heap Memory sizes: 32MB, 64MB, 128MB, 256MB, 512MB, 1GB
- Warmup rounds: 20
- Measured rounds: 100
The memory benchmark, run via the `memory-benchmark-gc.sh` script, is the preferred way of benchmarking memory usage of the tested libraries. It runs a subset of the performance benchmark operations in a loop for a predefined time interval (configured using the `config.properties` files described above), outputting garbage collection (GC) data into a preconfigured file. The GC logging is set using the `-XX:+PrintGCDetails`, `-XX:+PrintGCTimeStamps`, and `-Xloggc` JVM parameters.
The script runs the memory benchmark for each of the supported OTM frameworks, restarting the GraphDB repository between individual framework runs.
All the GC data are written into separate files named after the respective OTM frameworks (e.g., `alibaba-gc.log`) and stored in a directory called `memory`.
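For illustration, enabling this GC logging for a runner JVM looks roughly like this (the heap size, JAR name, and log file name are hypothetical):

```shell
java -Xms256m -Xmx256m \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:memory/jopa-gc.log \
     -jar jopa-benchmark/target/jopa-benchmark.jar
```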
The `jstat`-based memory benchmark is no longer used because of its inconclusive results.
Adapting the benchmark to a new library consists of creating implementations of several key interfaces and abstract classes. The benchmark algorithm itself is implemented in the core module; the OTM benchmarking modules only need to provide the parts specific to the evaluated framework.
The best way to integrate it into the benchmark is to create a new Maven module and add it to the root `pom.xml`, so that it is built together with all the other modules.
The object model is specified in the core module in the form of interfaces. Modules testing OTM frameworks need to provide concrete implementations (or their own interfaces from which proxies are generated by the OTM framework) specifying the mapping in the form supported by the selected OTM framework. `Event`, `Occurrence` and `OccurrenceReport` use generics, so that implementations can use the correct concrete type.
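As a rough illustration, a JOPA-based module might map one of the model classes along these lines (the property IRIs and the exact shape of the core `Event` interface are assumptions here, not the benchmark's actual code):

```java
import java.net.URI;
import java.util.Set;

import cz.cvut.kbss.jopa.model.annotations.Id;
import cz.cvut.kbss.jopa.model.annotations.OWLClass;
import cz.cvut.kbss.jopa.model.annotations.OWLObjectProperty;

// Hypothetical JOPA mapping of an Event-like model class; the IRIs are illustrative only
@OWLClass(iri = "http://example.org/benchmark#Event")
public class JopaEvent {

    @Id(generated = true)
    private URI uri;

    // Sub-events of this event, mapped as an object property (illustrative IRI)
    @OWLObjectProperty(iri = "http://example.org/benchmark#hasSubEvent")
    private Set<JopaEvent> subEvents;

    public URI getUri() {
        return uri;
    }

    public Set<JopaEvent> getSubEvents() {
        return subEvents;
    }

    public void setSubEvents(Set<JopaEvent> subEvents) {
        this.subEvents = subEvents;
    }
}
```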
Benchmark operations (create, batch create, retrieve, retrieve all, update, delete) are performed by implementations of CRUD operation executors - `Saver`, `Finder`, `Updater`, `Deleter`. These interfaces can be found in the `cz.cvut.kbss.benchmark.util` package in the core module.
Instructions for implementations are provided in the Javadoc of the respective executors.
`DataGenerator` provides the benchmark with data. The configuration of how many instances of which classes to create and how to interconnect them is in `DataGenerator` itself. Concrete subclasses only need to provide factory methods for creating instances of the model classes and invoke `generate` to pre-generate test data. Consult the class's Javadoc for details.
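Purely as a sketch, a concrete generator for a JOPA module might look like this (the generic parameter and the factory method name are assumptions; the real abstract methods are declared by the core `DataGenerator`):

```java
// Hypothetical generator; the overridden factory method is illustrative only,
// the actual abstract methods are defined in the core DataGenerator class
public class JopaDataGenerator extends DataGenerator<JopaEvent> {

    @Override
    protected JopaEvent createEvent() {
        // Produce a framework-specific instance of the Event model class
        return new JopaEvent();
    }
}
```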
Benchmark runners represent implementations of the individual benchmark operations. They should implement suitable setup methods, mostly to create a `DataGenerator`, let it generate test data, and initialize the OTM framework. Then, in `execute`, they invoke the appropriate superclass method for the benchmarked operation; e.g., a batch create runner should invoke `executeBatchCreate`, passing the operation executor as a parameter.
It is also important to implement a tear down method, which should clear the repository after each round (by invoking `BenchmarkUtil.clearRepository()` with the repository URL). Usually, it is good to extract this behaviour into a common superclass for all benchmark runners.
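To illustrate how these pieces fit together, a batch create runner might be structured roughly like this (the superclass, helper methods, and the `Saver` construction are assumptions based on the description above, not the benchmark's actual API):

```java
// Hypothetical batch create runner; names other than executeBatchCreate and
// BenchmarkUtil.clearRepository are illustrative only
public class JopaBatchCreateRunner extends JopaBenchmarkRunner {

    private JopaDataGenerator generator;

    // Set up test data and initialize the OTM framework before each round
    public void setUp() {
        generator = new JopaDataGenerator();
        generator.generate();
        initJopa(); // hypothetical helper that sets up the JOPA EntityManagerFactory
    }

    // Invoke the superclass implementation of the benchmarked operation,
    // passing a framework-specific Saver as the operation executor
    public void execute() {
        executeBatchCreate(new JopaSaver(entityManager()));
    }

    // Clear the repository after each round
    public void tearDown() {
        BenchmarkUtil.clearRepository(repositoryUrl()); // repositoryUrl() is a hypothetical helper
    }
}
```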
Last, it is necessary to provide an implementation of `AbstractBenchmark` (`AbstractMemoryBenchmark` for the memory benchmark), which creates an appropriate benchmark runner based on the application's CLI parameters. This implementation should have a `main` method, which creates a new instance of the benchmark application class and invokes `run` with the command line parameters.
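Sketched under the same assumptions (everything here except the `run` call with the command line parameters is guessed from the description above), the entry point could be as simple as this:

```java
// Hypothetical benchmark application; runner creation based on CLI parameters
// is omitted, and the constructor/run signatures are assumptions
public class JopaBenchmark extends AbstractBenchmark {

    // ... create the appropriate benchmark runner based on the CLI parameters ...

    public static void main(String[] args) {
        // Instantiate the benchmark application and hand it the command line parameters
        new JopaBenchmark().run(args);
    }
}
```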
To get a better understanding of how to extend the benchmark for a concrete OTM framework, see the classes and interfaces in the core module and their implementations, for example, in the jopa-benchmark or empire-benchmark modules.