Rework result processing #219

Merged
37 commits, merged Aug 7, 2023

Commits
822e426  add WorkerMetadata class (nck-mlcnv, Jun 30, 2023)
a21c555  add StresstestMetadata class (nck-mlcnv, Jun 30, 2023)
64515f6  rename config classes (nck-mlcnv, Jun 30, 2023)
49287fc  rename package (nck-mlcnv, Jun 30, 2023)
3817979  make QueryExecutionStats a record (nck-mlcnv, Jun 30, 2023)
5fbd6b6  use primitives for constants (nck-mlcnv, Jun 30, 2023)
f31eea7  remove usage of Properties class for saving worker results (nck-mlcnv, Jun 30, 2023)
0ada2f7  add queryHash to WorkerMetadata (nck-mlcnv, Jul 1, 2023)
8897c39  add every unique queryID to StresstestMetadata (nck-mlcnv, Jul 1, 2023)
b900689  add new Metric abstract class and interfaces (nck-mlcnv, Jul 1, 2023)
d2d4d13  add static namespace classes (nck-mlcnv, Jul 1, 2023)
8d19105  introduce new StresstestResultProcessor (nck-mlcnv, Jul 1, 2023)
bd184ca  move storage package (nck-mlcnv, Jul 1, 2023)
e0bd17e  refactor Storage interface to make storages have only one public method (nck-mlcnv, Jul 1, 2023)
7fa5ad1  add MetricManager class (nck-mlcnv, Jul 1, 2023)
8257eb0  remove rp package (nck-mlcnv, Jul 1, 2023)
a104f86  use Optional class for nullable objects (nck-mlcnv, Jul 1, 2023)
60d3a75  bring back accidentally removed test classes (nck-mlcnv, Jul 1, 2023)
e1a9749  implement metrics (nck-mlcnv, Jul 4, 2023)
01ec1a6  remove unused code (nck-mlcnv, Jul 5, 2023)
6c74a9f  move classes (nck-mlcnv, Jul 5, 2023)
724f851  code cleanup (nck-mlcnv, Jul 5, 2023)
b559270  fix tests (nck-mlcnv, Jul 6, 2023)
1db6f0e  Stresstest shouldn't send the worker results during execution (nck-mlcnv, Jul 6, 2023)
2eb020a  cleanup (nck-mlcnv, Jul 6, 2023)
d21a7fe  fix the queryID property statements (nck-mlcnv, Jul 6, 2023)
0348907  update docs (nck-mlcnv, Jul 7, 2023)
38b2dcc  Merge remote-tracking branch 'origin/develop' into feature/rework-res… (nck-mlcnv, Jul 10, 2023)
f7eb287  add CSVStorage (nck-mlcnv, Jul 11, 2023)
afb20f0  small fix to CSVStorage (nck-mlcnv, Jul 11, 2023)
bc7a757  use apache jena to write csv files in the CSVStorage (nck-mlcnv, Jul 18, 2023)
468004a  add test for CSVStorage with simple test case (nck-mlcnv, Jul 18, 2023)
d02c3f5  docs cleanup (nck-mlcnv, Jul 18, 2023)
5248199  store Duration object as XSDDuration type (nck-mlcnv, Jul 19, 2023)
336bd87  add a prefix map (nck-mlcnv, Jul 19, 2023)
793dcbb  small fix (nck-mlcnv, Jul 19, 2023)
dac7816  add better messages to the CSVStorageTest (nck-mlcnv, Aug 4, 2023)
12 changes: 4 additions & 8 deletions README.md
@@ -21,23 +21,19 @@ For further information visit:
 - [iguana-benchmark.eu](http://iguana-benchmark.eu)
 - [Documentation](http://iguana-benchmark.eu/docs/3.3/)
 
-## Iguana Modules
-
-Iguana consists of two modules
-- **corecontroller** - this will benchmark the systems
-- **resultprocessor** - this will calculate the metrics and save the raw benchmark results
 
 ### Available metrics
 
 Per run metrics:
 * Query Mixes Per Hour (QMPH)
 * Number of Queries Per Hour (NoQPH)
 * Number of Queries (NoQ)
 * Average Queries Per Second (AvgQPS)
+* Penalized Average Queries Per Second (PAvgQPS)
 
 Per query metrics:
 * Queries Per Second (QPS)
-* number of successful and failed queries
+* Penalized Queries Per Second (PQPS)
+* Number of successful and failed queries
 * result size
-* queries per second
 * sum of execution times
@@ -46,7 +42,7 @@ Per query metrics:
 
 ### Prerequisites
 
-In order to run Iguana, you need to have `Java 11`, or greater, installed on your system.
+In order to run Iguana, you need to have `Java 17`, or greater, installed on your system.
 
 ### Download
 Download the newest release of Iguana [here](https://github.com/dice-group/IGUANA/releases/latest), or run on a unix shell:
2 changes: 2 additions & 0 deletions docs/architecture.md
@@ -51,9 +51,11 @@ Per run metrics:
 * Number of Queries Per Hour (NoQPH)
 * Number of Queries (NoQ)
 * Average Queries Per Second (AvgQPS)
+* Penalized Average Queries Per Second (PAvgQPS)
 
 Per query metrics:
 * Queries Per Second (QPS)
+* Penalized Queries Per Second (PQPS)
 * Number of successful and failed queries
 * result size
 * queries per second
126 changes: 38 additions & 88 deletions docs/develop/extend-metrics.md
@@ -1,107 +1,57 @@
 # Extend Metrics
 
-To implement a new metric, create a new class that extends the abstract class `AbstractMetric`:
+To implement a new metric, create a new class that extends the abstract class `Metric`:
 
 ```java
 package org.benchmark.metric;
 
 @Shorthand("MyMetric")
-public class MyMetric extends AbstractMetric {
-
-    @Override
-    public void receiveData(Properties p) {
-        // ...
-    }
-
-    @Override
-    public void close() {
-        callbackClose();
-        super.close();
-    }
-
-    protected void callbackClose() {
-        // your close method
-    }
+public class MyMetric extends Metric {
+
+    public MyMetric() {
+        super("name", "abbreviation", "description");
+    }
 }
 ```
 
-## Receive Data
-
-This method will receive all the results during the benchmark.
-
-You'll receive a few values regarding each query execution. Those values include the amount of time the execution took, if it succeeded, and if not, the reason why it failed, which can be either a timeout, a wrong HTTP code or an unknown error.
-Further on you also receive the result size of the query.
-
-If your metric is a single value metric, you can use the `processData` method, which will automatically add each value together.
-However, if your metric is query specific, you can use the `addDataToContainer` method. (Look at the [QPSMetric](https://github.com/dice-group/IGUANA/blob/master/iguana.resultprocessor/src/main/java/org/aksw/iguana/rp/metrics/impl/QPSMetric.java))
-
-Be aware that both methods will save the results for each used worker. This allows the calculation of the overall metric, as well as the metric for each worker itself.
-
-We will stick to the single-value metric for now.
-
-The following shows an example, that retrieves every possible value and saves the time and success:
+You can then choose if the metric is supposed to be calculated for each Query, Worker
+or Task by implementing the appropriate interfaces: `QueryMetric`, `WorkerMetric`, `TaskMetric`.
+
+You can also choose to implement the `ModelWritingMetric` interface, if you want your
+metric to create a special RDF model, that you want to be added to the result model.
+
+The following gives you an example of how to work with the `data` parameter:
 
 ```java
-@Override
-public void receiveData(Properties p) {
-
-    double time = Double.parseDouble(p.get(COMMON.RECEIVE_DATA_TIME).toString());
-    long tmpSuccess = Long.parseLong(p.get(COMMON.RECEIVE_DATA_SUCCESS).toString());
-    long success = (tmpSuccess > 0) ? 1 : 0;
-    long failure = (success == 1) ? 0 : 1;
-    long timeout = (tmpSuccess == COMMON.QUERY_SOCKET_TIMEOUT) ? 1 : 0;
-    long unknown = (tmpSuccess == COMMON.QUERY_UNKNOWN_EXCEPTION) ? 1 : 0;
-    long wrongCode = (tmpSuccess == COMMON.QUERY_HTTP_FAILURE) ? 1 : 0;
-
-    if(p.containsKey(COMMON.RECEIVE_DATA_SIZE)) {
-        size = Long.parseLong(p.get(COMMON.RECEIVE_DATA_SIZE).toString());
-    }
-
-    Properties results = new Properties();
-    results.put(TOTAL_TIME, time);
-    results.put(TOTAL_SUCCESS, success);
-
-    Properties extra = getExtraMeta(p);
-    processData(extra, results);
-}
-```
-
-## Close
-
-In this method you should calculate your metric and send the results.
-An example:
-
-```java
-protected void callbackClose() {
-    // create a model that contains the results
-    Model m = ModelFactory.createDefaultModel();
-
-    Property property = getMetricProperty();
-    Double sum = 0.0;
-
-    // Go over each worker and add metric results to model
-    for(Properties key : dataContainer.keySet()){
-        Double totalTime = (Double) dataContainer.get(key).get(TOTAL_TIME);
-        Integer success = (Integer) dataContainer.get(key).get(TOTAL_SUCCESS);
-
-        Double noOfQueriesPerHour = hourInMS * success * 1.0 / totalTime;
-        sum += noOfQueriesPerHour;
-        Resource subject = getSubject(key);
-
-        m.add(getConnectingStatement(subject));
-        m.add(subject, property, ResourceFactory.createTypedLiteral(noOfQueriesPerHour));
-    }
-
-    // Add overall metric to model
-    m.add(getTaskResource(), property, ResourceFactory.createTypedLiteral(sum));
-
-    // Send data to storage
-    sendData(m);
-}
-```
-
-## Constructor
-
-The constructor parameters are provided the same way as for the tasks. Thus, simply look at the [Extend Task](../extend-task) page.
+@Override
+public Number calculateTaskMetric(StresstestMetadata task, List<QueryExecutionStats>[][] data) {
+    for (WorkerMetadata worker : task.workers()) {
+        for (int i = 0; i < worker.noOfQueries(); i++) {
+            // This list contains every query execution statistics of one query
+            // from the current worker
+            List<QueryExecutionStats> execs = data[worker.workerID()][i];
+        }
+    }
+    return BigInteger.ZERO;
+}
+
+@Override
+public Number calculateWorkerMetric(WorkerMetadata worker, List<QueryExecutionStats>[] data) {
+    for (int i = 0; i < worker.noOfQueries(); i++) {
+        // This list contains every query execution statistics of one query
+        // from the given worker
+        List<QueryExecutionStats> execs = data[i];
+    }
+    return BigInteger.ZERO;
+}
+
+@Override
+@Nonnull
+public Model createMetricModel(StresstestMetadata task, Map<String, List<QueryExecutionStats>> data) {
+    for (String queryID : task.queryIDS()) {
+        // This list contains every query execution statistics of one query from
+        // every worker that executed this query
+        List<QueryExecutionStats> execs = data.get(queryID);
+    }
+}
+```
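Taken together, the extension points in this diff can be combined into one compact metric class. The following is a minimal sketch, assuming the types introduced by this PR (`Metric`, `TaskMetric`, `StresstestMetadata`, `WorkerMetadata`, `QueryExecutionStats`, `@Shorthand`) are available on the classpath; the metric itself, a plain count of all query executions named "ExecutedQueries", is a hypothetical example and not one of the PR's metrics:

```java
package org.benchmark.metric;

import java.math.BigInteger;
import java.util.List;

// Hypothetical metric: counts how many query executions a task performed.
// It relies only on the list sizes, so no accessors of QueryExecutionStats
// beyond those shown in the diff are assumed.
@Shorthand("ExecutedQueries")
public class ExecutedQueriesMetric extends Metric implements TaskMetric {

    public ExecutedQueriesMetric() {
        super("Executed Queries", "EQ", "Total number of query executions in a task.");
    }

    @Override
    public Number calculateTaskMetric(StresstestMetadata task, List<QueryExecutionStats>[][] data) {
        BigInteger count = BigInteger.ZERO;
        for (WorkerMetadata worker : task.workers()) {
            for (int i = 0; i < worker.noOfQueries(); i++) {
                // data[workerID][queryIndex] holds every execution of one query
                count = count.add(BigInteger.valueOf(data[worker.workerID()][i].size()));
            }
        }
        return count;
    }
}
```

Because the class implements only `TaskMetric`, the result processor would compute it once per task; implementing `WorkerMetric` or `QueryMetric` as well would opt into the per-worker and per-query calculations.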
40 changes: 8 additions & 32 deletions docs/develop/extend-result-storages.md
@@ -1,47 +1,23 @@
 # Extend Result Storages
 
-If you want to use a different storage other than RDF, you can implement a different storage solution.
-
-The current implementation of Iguana is highly optimized for RDF, thus we recommend you to work on top of the `TripleBasedStorage` class:
+If you want to use a different storage other than RDF, you can implement a different storage solution.
 
 ```java
 package org.benchmark.storage;
 
 @Shorthand("MyStorage")
-public class MyStorage extends TripleBasedStorage {
-
-    @Override
-    public void commit() {
-
-    }
-
-    @Override
-    public String toString(){
-        return this.getClass().getSimpleName();
-    }
-}
-```
-
-## Commit
-
-This method should take all the current results, store them, and remove them from the memory.
-
-You can access the results at the Jena Model `this.metricResults`.
-
-For example:
-
-```java
-@Override
-public void commit() {
-    try (OutputStream os = new FileOutputStream(file.toString(), true)) {
-        RDFDataMgr.write(os, metricResults, RDFFormat.NTRIPLES);
-        metricResults.removeAll();
-    } catch (IOException e) {
-        LOGGER.error("Could not commit to NTFileStorage.", e);
-    }
-}
-```
+public class MyStorage implements Storage {
+
+    @Override
+    public void storeResults(Model m) {
+        // method for storing model
+    }
+}
+```
 
+The method `storeResults` will be called at the end of the task. The model from
+the parameter contains the final result model for that task.
+
 ## Constructor
 
 The constructor parameters are provided the same way as for the tasks. Thus, simply look at the [Extend Task](../extend-task) page.
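Under the reworked interface, a replacement for the removed `NTFileStorage`-style example could look like the following sketch. It assumes the `Storage` interface and `@Shorthand` annotation from this PR plus Apache Jena on the classpath; the class name, output file, and N-Triples format are illustrative choices:

```java
package org.benchmark.storage;

import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;

// Hypothetical storage that writes the final result model of a task
// to an N-Triples file when storeResults is called.
@Shorthand("MyNTFileStorage")
public class MyNTFileStorage implements Storage {

    private final Path file = Path.of("results.nt");

    @Override
    public void storeResults(Model m) {
        try (OutputStream os = Files.newOutputStream(file)) {
            // Jena serializes the whole model in one call
            RDFDataMgr.write(os, m, RDFFormat.NTRIPLES);
        } catch (IOException e) {
            throw new RuntimeException("Could not write results to " + file, e);
        }
    }
}
```

Since `storeResults` is called only once, at the end of the task, none of the incremental `commit` logic of the old `TripleBasedStorage` is needed.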
2 changes: 1 addition & 1 deletion docs/download.md
@@ -2,7 +2,7 @@
 
 ## Prerequisites
 
-You need to have Java 11 or higher installed.
+You need to have Java 17 or higher installed.
 
 
 In Ubuntu, you can install it by executing the following command:
23 changes: 13 additions & 10 deletions docs/shorthand-mapping.md
@@ -1,6 +1,6 @@
 | Shorthand | Class Name |
 |------------------------|-----------------------------------------------------------|
-| Stresstest | `org.aksw.iguana.cc.tasks.impl.Stresstest` |
+| Stresstest | `org.aksw.iguana.cc.tasks.stresstest.Stresstest` |
 | ---------- | ------- |
 | lang.RDF | `org.aksw.iguana.cc.lang.impl.RDFLanguageProcessor` |
 | lang.SPARQL | `org.aksw.iguana.cc.lang.impl.SPARQLLanguageProcessor` |
@@ -15,13 +15,16 @@
 | CLIInputPrefixWorker | `org.aksw.iguana.cc.worker.impl.CLIInputPrefixWorker` |
 | MultipleCLIInputWorker | `org.aksw.iguana.cc.worker.impl.MultipleCLIInputWorker` |
 | ---------- | ------- |
-| NTFileStorage | `org.aksw.iguana.rp.storages.impl.NTFileStorage` |
-| RDFFileStorage | `org.aksw.iguana.rp.storages.impl.RDFFileStorage` |
-| TriplestoreStorage | `org.aksw.iguana.rp.storages.impl.TriplestoreStorage` |
+| NTFileStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.NTFileStorage` |
+| RDFFileStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.RDFFileStorage` |
+| TriplestoreStorage | `org.aksw.iguana.cc.tasks.stresstest.storage.impl.TriplestoreStorage` |
 | ---------- | ------- |
-| QPS | `org.aksw.iguana.rp.metrics.impl.QPSMetric` |
-| AvgQPS | `org.aksw.iguana.rp.metrics.impl.AvgQPSMetric` |
-| NoQ | `org.aksw.iguana.rp.metrics.impl.NoQMetric` |
-| NoQPH | `org.aksw.iguana.rp.metrics.impl.NoQPHMetric` |
-| QMPH | `org.aksw.iguana.rp.metrics.impl.QMPHMetric` |
-| EachQuery | `org.aksw.iguana.rp.metrics.impl.EQEMetric` |
+| QPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.QPS` |
+| PQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.PQPS` |
+| AvgQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.AvgQPS` |
+| PAvgQPS | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.PAvgQPS` |
+| NoQ | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.NoQ` |
+| NoQPH | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.NoQPH` |
+| QMPH | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.QMPH` |
+| AES | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.AggregatedExecutionStatistics` |
+| EachQuery | `org.aksw.iguana.cc.tasks.stresstest.metrics.impl.EachExecutionStatistic` |
16 changes: 8 additions & 8 deletions docs/usage/configuration.md
@@ -25,7 +25,7 @@ A connection has the following items:
 * `updateEndpoint` - if your HTTP endpoint is an HTTP POST endpoint, you can set it with this item (optional)
 * `user` - for authentication purposes (optional)
 * `password` - for authentication purposes (optional)
-* `version` - sets the version of the tested triplestore; if this is set, the resource URI will be ires:name-version (optional)
+* `version` - sets the version of the tested triplestore (optional)
 
 At first, it might be confusing to set up both an `endpoint` and `updateEndpoint`, but it is used when you want your test to perform read and write operations simultaneously, for example, to test the impact of updates on the read performance of your triple store.
 
@@ -190,17 +190,18 @@ The `metrics` setting lets Iguana know what metrics you want to include in the results.
 Iguana supports the following metrics:
 
 * Queries Per Second (`QPS`)
+* Penalized Queries Per Second (`PQPS`)
 * Average Queries Per Second (`AvgQPS`)
+* Penalized Average Queries Per Second (`PAvgQPS`)
 * Query Mixes Per Hour (`QMPH`)
 * Number of Queries successfully executed (`NoQ`)
 * Number of Queries per Hour (`NoQPH`)
-* Each query execution (`EachQuery`) - experimental
+* Each Execution Statistic (`EachQuery`)
+* Aggregated Execution Statistics (`AES`)
 
 For more details on each of the metrics have a look at the [Metrics](../metrics) page.
 
-The `metrics` setting is optional and the default is set to every available metric, except `EachQuery`.
-
-Let's look at an example:
+The `metrics` setting is optional and the default is set to this:
 
 ```yaml
 metrics:
@@ -209,11 +210,10 @@
   - className: "QMPH"
   - className: "NoQ"
   - className: "NoQPH"
+  - className: "AES"
 ```
 
-In this case we use every metric that Iguana has implemented. This is the default.
-
-However, you can also just use a subset of these metrics:
+You can also use a subset of these metrics:
 
 ```yaml
 metrics:
2 changes: 1 addition & 1 deletion docs/usage/getting-started.md
@@ -22,7 +22,7 @@ Iguana will then let every Worker execute these queries against the endpoint.
 
 ## Prerequisites
 
-You need to have Java 11 or higher installed.
+You need to have Java 17 or higher installed.
 
 In Ubuntu you can install it by executing the following command:
 ```bash