
Commit

added gremlin and cypher known limitation paragraph
lvca committed Nov 14, 2023
1 parent 2821469 commit 9669076
Showing 2 changed files with 17 additions and 3 deletions.
9 changes: 8 additions & 1 deletion src/main/asciidoc/api/cypher.adoc
@@ -14,7 +14,7 @@ To use Cypher queries you can do directly from the <<Java-API,Java API>>, by usi

NOTE: Consider that Cypher queries are translated into Gremlin.
Although ArcadeDB's Gremlin implementation is optimized, Cypher queries run slower than native SQL or the Java API.
-Based on some internal benchmarks, ArcadeDB's native SQL (SELECT or MATCH) is much faster than Cypher (even 2,000% faster!).
+Based on some internal benchmarks, ArcadeDB's native SQL (<<SQL-Select,SELECT>> or <<SQL-Match,MATCH>>) is much faster than Cypher (even 2,000% faster!).
The difference is larger with complex queries that work on many records.
A simple Cypher MATCH that does a lookup with a simple condition will be close to native SQL performance, but a scan or a huge traversal will be much slower with Cypher, because the underlying Gremlin execution is not optimized for extreme performance.

@@ -72,6 +72,13 @@ curl -X POST "http://localhost:2480/command/graph" -d "{'language': 'cypher', 'c

---

[discrete]
==== Known Limitations with the Cypher Implementation

- ArcadeDB automatically converts between compatible types, such as strings and numbers, when possible. Cypher does not. If you define a schema with the ArcadeDB API and then use Cypher for a traversal, make sure you use the types you defined in the schema. For example, if you define a property "id" as a string and then run a traversal that uses integers for the ids, the result could be unpredictable.
- ArcadeDB's Cypher implementation is based on the https://github.com/opencypher/cypher-for-gremlin[Cypher For Gremlin] open source transpiler. This project is no longer actively maintained by Open Cypher, so issues in the transpiler are hard to fix. Please bear this in mind if you're moving a large Cypher project to ArcadeDB. The best way to address such issues is to rewrite the faulty Cypher query as an ArcadeDB <<SQL-Select,SELECT>> or <<SQL-Match,MATCH>> statement.
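
One way to guard against the type mismatch described in the first limitation is to normalize query parameters in application code before they reach Cypher. The following is only a minimal sketch in plain Java: the `SchemaTypes` class and its `coerce` method are hypothetical names, not part of ArcadeDB's API, and the declared type is passed in by hand rather than read from the schema.

```java
import java.util.Objects;

// Hypothetical helper, NOT part of ArcadeDB's API.
public final class SchemaTypes {
  private SchemaTypes() {
  }

  /**
   * Converts a query parameter to the type declared for the property.
   * Only String, Integer and Long are handled in this sketch.
   */
  public static Object coerce(final Object value, final Class<?> declaredType) {
    Objects.requireNonNull(value, "value");
    Objects.requireNonNull(declaredType, "declaredType");

    if (declaredType.isInstance(value))
      return value; // already the declared type

    if (declaredType == String.class)
      return String.valueOf(value);

    if (declaredType == Integer.class && value instanceof String)
      return Integer.valueOf(((String) value).trim()); // NumberFormatException on bad input

    if (declaredType == Long.class && value instanceof Integer)
      return ((Integer) value).longValue();

    if (declaredType == Long.class && value instanceof String)
      return Long.valueOf(((String) value).trim());

    throw new IllegalArgumentException(
        "Cannot convert " + value.getClass().getSimpleName() + " to " + declaredType.getSimpleName());
  }
}
```

With a property "id" defined as a string, `SchemaTypes.coerce(42, String.class)` yields the string `"42"`, which can then be used as the id value in the traversal instead of the integer.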


For more information about Cypher:

- https://opencypher.org/[Open Cypher]
11 changes: 9 additions & 2 deletions src/main/asciidoc/api/gremlin.adoc
@@ -3,7 +3,7 @@

image:../images/edit.png[link="https://github.com/ArcadeData/arcadedb-docs/blob/main/src/main/asciidoc/api/gremlin.adoc" float=right]

-ArcadeDB supports Gremlin 3.6.x as query engine and in the <<Gremlin-Server,Gremlin Server>>.
+ArcadeDB supports Gremlin 3.7.x as query engine and in the <<Gremlin-Server,Gremlin Server>>.
You can execute a Gremlin query from pretty much everywhere.

If you're using ArcadeDB as embedded, please add the dependency to the `arcadedb-gremlin` library.
@@ -21,7 +21,7 @@ If you're using Maven include this dependency in your `pom.xml` file.
[discrete]
==== Gremlin from Java API

-In order to execute a Gremlin query, you need to include the relevant jars, i.e. the apache-tinkerpop-gremlin-server libraries, plus gremlin-groovy, plus opencypher-util-9.0, in your class path.
+In order to execute a Gremlin query, you need to include the relevant jars, i.e. the `apache-tinkerpop-gremlin-server` libraries, plus `gremlin-groovy`, plus `opencypher-util-9.0`, in your class path.
To execute a Gremlin query, use "gremlin" as first parameter in the query method.
Example:

@@ -86,6 +86,7 @@ You can edit the database name or add more databases under the Gremlin Server by

NOTE: If you're importing a database, use "graph" as the name of the database to be available through the <<Gremlin-Server,Gremlin Server>>


Start the Gremlin Server with the `OpenBeer` database imported under the name `graph`, so it can be used through the Gremlin Server.

[source,shell]
@@ -111,6 +112,12 @@ var connection = DriverRemoteConnection.using(cluster);
var g = new GraphTraversalSource(connection);
```

[discrete]
==== Known Limitations with the Gremlin Implementation

- ArcadeDB automatically converts between compatible types, such as strings and numbers, when possible. Gremlin does not. If you define a schema with the ArcadeDB API and then use Gremlin for a traversal, make sure you use the types you defined in the schema. For example, if you define a property "id" as a string and then run a traversal that uses integers for the ids, the result could be unpredictable.
- ArcadeDB's Gremlin implementation always tries to optimize the Gremlin traversal by using ArcadeDB's internal queries. While this is easy for simple traversals using `.has()` and `.hasLabel()`, it cannot optimize more complex traversals with `select()` and `where()`. Instead of executing an optimized query, such a traversal could result in a full scan of the type, leaving the filtering to Gremlin. The result of the traversal is still correct, but the performance would be heavily impacted. Please consider using ArcadeDB's SQL or Native Select for the best performance with complex traversals.
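
To spot traversals that may fall into the second limitation, a project could flag Gremlin strings that use the steps named above before running them. The sketch below is a plain-Java, string-matching heuristic only: the `GremlinLint` class and `mayCauseFullScan` method are hypothetical names, the `Beer` label in the usage note is made up, and the check does not reflect how ArcadeDB actually decides what it can optimize.

```java
import java.util.regex.Pattern;

// Hypothetical lint helper, NOT part of ArcadeDB or TinkerPop.
public final class GremlinLint {
  // Matches the select(...) and where(...) steps, which the text above
  // names as traversals that are not optimized.
  private static final Pattern UNOPTIMIZED_STEP = Pattern.compile("\\b(select|where)\\s*\\(");

  private GremlinLint() {
  }

  /** Returns true when the traversal text contains a select() or where() step. */
  public static boolean mayCauseFullScan(final String traversal) {
    return UNOPTIMIZED_STEP.matcher(traversal).find();
  }
}
```

For example, `GremlinLint.mayCauseFullScan("g.V().hasLabel('Beer').has('name', 'X')")` returns `false`, while a traversal ending in `.where(__.out().count().is(0))` returns `true` and is a candidate for a rewrite in ArcadeDB's SQL.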


For more information about Gremlin:

