chore: update introduction to structure and unstructure model

GreptimeTeam · Aug 21, 2024 · 34f20b2
1 parent 9dbfbbd
Showing 1 changed file with 19 additions and 0 deletions.
diff --git a/log-benchmark/README.md b/log-benchmark/README.md
@@ -9,6 +9,25 @@ Here are the versions of databases we used in the benchmark
 | Clickhouse    | 24.9.1.219 |
 | Elasticsearch | 8.15.0     |
 
+## Structured model vs Unstructured model
+We divide the test into two parts: one uses the structured model and the other uses the unstructured model. The difference is also visible in the create table clauses.
+
+__Structured model__
+
+The log data is pre-processed into columns by Vector. For example, an insert request looks like the following:
+```SQL
+INSERT INTO test_table (bytes, http_version, ip, method, path, status, user, timestamp) VALUES ()
+```
+The goal is to test string/text support for each database. In real scenarios, this means the data source (or log data producer) has separate fields defined, or has already processed the raw input.
+
+__Unstructured model__
+
+The log data is inserted as one long string, and we then build a full-text index on these strings. For example, an insert request looks like the following:
+```SQL
+INSERT INTO test_table (message, timestamp) VALUES ()
+```
+The goal is to test fuzzy search performance for each database. In real scenarios, this means the log is produced by some kind of middleware and inserted directly into the database.
+
 ## Creating tables
 See [here](./create_table.sql) for the create table clauses of GreptimeDB and Clickhouse.
 The Elasticsearch mapping is created automatically.
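
To make the two models in this diff concrete, here is a minimal sketch of how one raw log line could end up in each table shape. It is an illustration under stated assumptions, not the benchmark's actual pipeline: the sample lines, the regular expression, the table names `structured_logs` and `unstructured_logs`, and the helpers `parse_line` and `build_db` are all hypothetical. SQLite stands in for the benchmarked databases, and a `LIKE` scan stands in for a real full-text index.

```python
import re
import sqlite3

# Hypothetical sample lines in a common access-log layout (not benchmark data).
SAMPLE_LINES = [
    '10.0.0.1 - alice [21/Aug/2024:10:00:00 +0000] "GET /api/items HTTP/1.1" 200 512',
    '10.0.0.2 - bob [21/Aug/2024:10:00:01 +0000] "POST /missing HTTP/2.0" 404 64',
]

# Assumed layout: ip, ident, user, [timestamp], "method path HTTP/version", status, bytes.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ (?P<user>\S+) \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) HTTP/(?P<http_version>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+)'
)


def parse_line(line):
    """Split one raw log line into the columns used by the structured model."""
    match = LOG_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized log line: {line!r}")
    row = match.groupdict()
    row["status"] = int(row["status"])
    row["bytes"] = int(row["bytes"])
    return row


def build_db(lines):
    """Load the same lines into a structured and an unstructured table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE structured_logs (bytes INTEGER, http_version TEXT, ip TEXT, "
        "method TEXT, path TEXT, status INTEGER, user TEXT, timestamp TEXT)"
    )
    conn.execute("CREATE TABLE unstructured_logs (message TEXT, timestamp TEXT)")
    for line in lines:
        row = parse_line(line)
        # Structured model: one column per pre-processed field.
        conn.execute(
            "INSERT INTO structured_logs "
            "(bytes, http_version, ip, method, path, status, user, timestamp) "
            "VALUES (:bytes, :http_version, :ip, :method, :path, :status, :user, :timestamp)",
            row,
        )
        # Unstructured model: the whole line is stored as a single string.
        conn.execute(
            "INSERT INTO unstructured_logs (message, timestamp) VALUES (?, ?)",
            (line, row["timestamp"]),
        )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_db(SAMPLE_LINES)
    # Structured query: exact match on a typed column.
    print(conn.execute(
        "SELECT COUNT(*) FROM structured_logs WHERE status = 200").fetchone()[0])
    # Unstructured query: substring search over the raw message.
    print(conn.execute(
        "SELECT COUNT(*) FROM unstructured_logs WHERE message LIKE '%/api/items%'"
    ).fetchone()[0])
```

The structured query filters on a typed column, while the unstructured query has to search inside the raw message. That difference is the reason the benchmark measures the two models separately.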