Introduction

SQL2Gremlin teaches Apache TinkerPop's Gremlin graph traversal language using typical patterns found when querying data with SQL. The format of the Gremlin results will not necessarily match the format of the SQL results. While SQL can only provide results in a tabular form, Gremlin provides various ways to structure a result set. Next, the Gremlin queries demonstrated are for elucidatory purposes and may not be the optimal way to retrieve the desired data. If a particular query runs slow and an optimal solution is desired, please do not hesitate to ask for help on the Gremlin-users mailing list. Finally, the SQL examples presented make use of T-SQL syntax. MySQL users may not know some of the expressions (e.g. paging), but should be able to understand the purpose of the query.

If you would like to see other SQL2Gremlin translations using the Northwind dataset, please provide a ticket on the SQL2Gremlin issue tracker.

Note	Acknowledgement Gremlin artwork by Ketrina Yim — "safety first."

Northwind Graph Model

Getting Started

To get started download the latest version of the Gremlin shell from www.tinkerpop.com and extract it. Then download the file northwind.groovy and start your Gremlin shell:

# find the latest Gremlin shell download
MIRRORS_URL=`curl -s http://tinkerpop.apache.org/ | grep -o 'http[s]*://.*\-console-.*\-bin\.zip'`
LATEST_URL=`curl -s $MIRRORS_URL | egrep -o 'http[s]?://[^"]*-console-[^"]*-bin.zip' | head -1`
LATEST_FILENAME=`echo ${LATEST_URL} | grep -o '[^/]*$'`
LATEST_DIRECTORY=`echo ${LATEST_FILENAME} | sed 's/-bin\.zip//'`

# download and extract the Gremlin shell
wget -q ${LATEST_URL}
unzip -q ${LATEST_FILENAME}

# start the Gremlin shell
wget -q http://sql2gremlin.com/assets/northwind.groovy -O /tmp/northwind.groovy
${LATEST_DIRECTORY}/bin/gremlin.sh -i /tmp/northwind.groovy

In your Gremlin shell create the Northwind graph, a graph traversal source and you’re ready to go:

gremlin> graph = NorthwindFactory.createGraph()
==>tinkergraph[vertices:3209 edges:6177]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:3209 edges:6177], standard]

The graph is now filled with vertices and edges. Vertices have a number of properties, depending on what they represent. The following properties are globally indexed for fast lookups:

VertexLabel	Property	Description
region	name	The code or name for a specific region.
country	name	The code or name for a specific country.
category	name	The name of a specific category.
customer	customerId	The well-known Northwind customer identifier (e.g. ALFKI).

Select

Select all

This sample shows how to query all categories.

SQL

SELECT *
  FROM Categories

Gremlin

g.V().hasLabel("category").valueMap()

References:

Has Step
ValueMap Step

Select single column

This sample shows how to query the names of all categories.

SQL

SELECT CategoryName
  FROM Categories

Gremlin

g.V().hasLabel("category").values("name")

References:

Has Step

Select multiple columns

This sample shows how to query the names and descriptions of all categories.

SQL

SELECT CategoryName, Description
  FROM Categories

Gremlin

g.V().hasLabel("category").valueMap("name", "description")

References:

Has Step
ValueMap Step

Select calculated column

This sample shows how to query the length of the name of all categories.

SQL

SELECT LEN(CategoryName)
  FROM Categories

Gremlin

g.V().hasLabel("category").values("name").
      map {it.get().length()}

References:

Has Step
Lambda Steps
String::length()

Select distinct values

This sample shows how to query all distinct lengths of category names.

SQL

SELECT DISTINCT LEN(CategoryName)
  FROM Categories

Gremlin

g.V().hasLabel("category").values("name").
      map {it.get().length()}.dedup()

References:

Dedup Step
Has Step
Lambda Steps
String::length()

Select scalar value

This sample shows how to query the length of the longest category name.

SQL

SELECT MAX(LEN(CategoryName))
  FROM Categories

Gremlin

g.V().hasLabel("category").values("name").
      map {it.get().length()}.max()

References:

Has Step
Lambda Steps
Max Step
String::length()

Filtering

Filter by equality

This sample shows how to query all products having no unit in stock.

SQL

SELECT ProductName, UnitsInStock
  FROM Products
 WHERE UnitsInStock = 0

Gremlin

g.V().has("product", "unitsInStock", 0).valueMap("name", "unitsInStock")

References:

Has Step
ValueMap Step

Filter by inequality

This sample shows how to query all products where the number of units on order is not 0.

SQL

SELECT ProductName, UnitsOnOrder
  FROM Products
 WHERE NOT(UnitsOnOrder = 0)

Gremlin

g.V().has("product", "unitsOnOrder", neq(0)).
      valueMap("name", "unitsOnOrder")

References:

Has Step
ValueMap Step
A Note on Predicates

Filter by value range

This sample shows how to query all products with a minimum price of 5 and maximum price below 10.

SQL

SELECT ProductName, UnitPrice
  FROM Products
 WHERE UnitPrice >= 5 AND UnitPrice < 10

Gremlin

g.V().has("product", "unitPrice", between(5f, 10f)).
      valueMap("name", "unitPrice")

References:

Has Step
ValueMap Step
A Note on Predicates

Multiple filter conditions

This sample shows how to query all discontinued products that are still not out of stock.

SQL

SELECT ProductName, UnitsInStock
  FROM Products
 WHERE Discontinued = 1
   AND UnitsInStock <> 0

Gremlin

g.V().has("product", "discontinued", true).has("unitsInStock", neq(0)).
     valueMap("name", "unitsInStock")

References:

Has Step
ValueMap Step
A Note on Predicates

Ordering

Order by value ascending

This sample shows how to query all products ordered by unit price.

SQL

  SELECT ProductName, UnitPrice
    FROM Products
ORDER BY UnitPrice ASC

Gremlin

g.V().hasLabel("product").order().by("unitPrice", incr).
      valueMap("name", "unitPrice")

References:

Has Step
Order Step
ValueMap Step

Order by value descending

This sample shows how to query all products ordered by descending unit price.

SQL

  SELECT ProductName, UnitPrice
    FROM Products
ORDER BY UnitPrice DESC

Gremlin

g.V().hasLabel("product").order().by("unitPrice", decr).
      valueMap("name", "unitPrice")

References:

Has Step
Order Step
ValueMap Step

Paging

Limit number of results

This sample shows how to query the first 5 products ordered by unit price.

SQL

  SELECT TOP (5) ProductName, UnitPrice
    FROM Products
ORDER BY UnitPrice

Gremlin

g.V().hasLabel("product").order().by("unitPrice", incr).limit(5).
      valueMap("name", "unitPrice")

References:

Has Step
Limit Step
Order Step
ValueMap Step

Paged result set

This sample shows how to query the next 5 products (page 2) ordered by unit price.

SQL

   SELECT Products.ProductName, Products.UnitPrice
     FROM (SELECT ROW_NUMBER()
                    OVER (
                      ORDER BY UnitPrice) AS [ROW_NUMBER],
                  ProductID
             FROM Products) AS SortedProducts
       INNER JOIN Products
               ON Products.ProductID = SortedProducts.ProductID
    WHERE [ROW_NUMBER] BETWEEN 6 AND 10
 ORDER BY [ROW_NUMBER]

Gremlin

g.V().hasLabel("product").order().by("unitPrice", incr).range(5, 10).
      valueMap("name", "unitPrice")

References:

Has Step
Range Step
Order Step
ValueMap Step

Grouping

Group by value

This sample shows how to determine the most used unit price.

SQL

  SELECT TOP(1) UnitPrice
    FROM (SELECT Products.UnitPrice,
                 COUNT(*) AS [Count]
            FROM Products
        GROUP BY Products.UnitPrice) AS T
ORDER BY [Count] DESC

Gremlin

g.V().hasLabel("product").groupCount().by("unitPrice").
      order(local).by(values, decr).select(keys).limit(local, 1)

References:

Has Step
GroupCount Step
Limit Step
Order Step
Select Step
ValueMap Step

Joining

Inner join

This sample shows how to query all products from a specific category.

SQL

    SELECT Products.ProductName
      FROM Products
INNER JOIN Categories
        ON Categories.CategoryID = Products.CategoryID
     WHERE Categories.CategoryName = 'Beverages'

Gremlin

g.V().has("name","Beverages").in("inCategory").values("name")

References:

Has Step
Vertex Steps

Left join

This sample shows how to count the number of orders for each customer.

SQL

    SELECT Customers.CustomerID, COUNT(Orders.OrderID)
      FROM Customers
 LEFT JOIN Orders
        ON Orders.CustomerID = Customers.CustomerID
  GROUP BY Customers.CustomerID

Gremlin

g.V().hasLabel("customer").match(
  __.as("c").values("customerId").as("customerId"),
  __.as("c").out("ordered").count().as("orders")
).select("customerId", "orders")

References:

As Step
Count Step
Has Step
Match Step
Select Step
Vertex Steps

Miscellaneous

Concatenate

This sample shows how to concatenate two result sets (customers whos company name starts with 'A' and customers whos company name starts with 'E').

SQL

SELECT [customer].[CompanyName]
  FROM [Customers] AS [customer]
 WHERE [customer].[CompanyName] LIKE 'A%'
 UNION ALL
SELECT [customer].[CompanyName]
  FROM [Customers] AS [customer]
 WHERE [customer].[CompanyName] LIKE 'E%'

Gremlin

g.V().hasLabel("customer").union(
  filter {it.get().value("company")[0] == "A"},
  filter {it.get().value("company")[0] == "E"}).values("company")

References:

Has Step
Lambda Steps
Union Step

Create, Update and Delete

This sample shows how to create new vertices and edges, how to update them and finally how to delete them.

SQL

INSERT INTO [Categories] ([CategoryName], [Description])
     VALUES (N'Merchandising', N'Cool products to promote Gremlin')

INSERT INTO [Products] ([ProductName], [CategoryID])
     SELECT TOP (1) N'Red Gremlin Jacket', [CategoryID]
       FROM [Categories]
      WHERE [CategoryName] = N'Merchandising'

UPDATE [Products]
   SET [Products].[ProductName] = N'Green Gremlin Jacket'
 WHERE [Products].[ProductName] = N'Red Gremlin Jacket'

DELETE FROM [Products]
 WHERE [Products].[ProductName] = N'Green Gremlin Jacket'

DELETE FROM [Categories]
 WHERE [Categories].[CategoryName] = N'Merchandising'

Gremlin

c = graph.addVertex(label, "category",
          "name", "Merchandising",
          "description", "Cool products to promote Gremlin")

p = graph.addVertex(label, "product",
          "name", "Red Gremlin Jacket")

p.addEdge("inCategory", c)

g.V().has("product", "name", "Red Gremlin Jacket").
      property("name", "Green Gremlin Jacket").iterate()

p.remove()
g.V().has("category", "name", "Merchandising").drop()

References:

Mutating the Graph
Has Step
Drop Step

CTE

Recursive query

This sample shows how to query all employees, their supervisors and their hierarchy level depending on where the employee is located in the supervisor chain.

SQL

WITH EmployeeHierarchy (EmployeeID,
                        LastName,
                        FirstName,
                        ReportsTo,
                        HierarchyLevel) AS
(
    SELECT EmployeeID
         , LastName
         , FirstName
         , ReportsTo
         , 1 as HierarchyLevel
      FROM Employees
     WHERE ReportsTo IS NULL

     UNION ALL

    SELECT e.EmployeeID
         , e.LastName
         , e.FirstName
         , e.ReportsTo
         , eh.HierarchyLevel + 1 AS HierarchyLevel
      FROM Employees e
INNER JOIN EmployeeHierarchy eh
        ON e.ReportsTo = eh.EmployeeID
)
  SELECT *
    FROM EmployeeHierarchy
ORDER BY HierarchyLevel, LastName, FirstName

Gremlin (hierarchical)

g.V().hasLabel("employee").where(__.not(out("reportsTo"))).
      repeat(__.in("reportsTo")).emit().tree().by(map {
        def employee = it.get()
        employee.value("firstName") + " " + employee.value("lastName")
      }).next()

You can also produce the same tabular result that’s produced by SQL.

Gremlin (tabular)

g.V().hasLabel("employee").where(__.not(out("reportsTo"))).
      repeat(__.as("reportsTo").in("reportsTo").as("employee")).emit().
      select(last, "reportsTo", "employee").by(map {
        def employee = it.get()
        employee.value("firstName") + " " + employee.value("lastName")
      })

References:

As Step
Has Step
Lambda Steps
Repeat Step
Select Step
Vertex Steps
Tree Step
Where Step

Complex

Pivots

This sample shows how to determine the average total order value per month for each customer.

SQL

    SELECT Customers.CompanyName,
           COALESCE([1], 0)  AS [Jan],
           COALESCE([2], 0)  AS [Feb],
           COALESCE([3], 0)  AS [Mar],
           COALESCE([4], 0)  AS [Apr],
           COALESCE([5], 0)  AS [May],
           COALESCE([6], 0)  AS [Jun],
           COALESCE([7], 0)  AS [Jul],
           COALESCE([8], 0)  AS [Aug],
           COALESCE([9], 0)  AS [Sep],
           COALESCE([10], 0) AS [Oct],
           COALESCE([11], 0) AS [Nov],
           COALESCE([12], 0) AS [Dec]
      FROM (SELECT Orders.CustomerID,
                   MONTH(Orders.OrderDate)                                   AS [Month],
                   SUM([Order Details].UnitPrice * [Order Details].Quantity) AS Total
              FROM Orders
        INNER JOIN [Order Details]
                ON [Order Details].OrderID = Orders.OrderID
          GROUP BY Orders.CustomerID,
                   MONTH(Orders.OrderDate)) o
     PIVOT (AVG(Total) FOR [Month] IN ([1],
                                       [2],
                                       [3],
                                       [4],
                                       [5],
                                       [6],
                                       [7],
                                       [8],
                                       [9],
                                       [10],
                                       [11],
                                       [12])) AS [Pivot]
INNER JOIN Customers
        ON Customers.CustomerID = [Pivot].CustomerID
  ORDER BY Customers.CompanyName

Gremlin

months = new java.text.DateFormatSymbols().getShortMonths().toList(); []
rowTotal = {it.get().value("unitPrice") * it.get().value("quantity")}; []

g.V().hasLabel("customer").order().by("customerId", incr).
      where(out("ordered")).as("customer").
      map {
        def m = g.V(it.get()).out("ordered").
                  group().by {new Date(it.value("orderDate")).getMonth()}.
                          by(out("contains").map(rowTotal).sum()).next()
        (0..11).collectEntries {[months[it], m.containsKey(it) ? m[it] : 0]}
      }.as("totals").select("customer", "totals").by(id).by()

References:

As Step
Group Step
Has Step
Order Step
Select Step
Sum Step
Where Step
Vertex Steps
Transform Collection to a Map with collectEntries
DateFormatSymbols::getShortMonths()

Recommendation

This sample shows how to recommend 5 products for a specific customer. The products are chosen as follows:

determine what the customer has already ordered
determine who else ordered the same products
determine what others also ordered
determine products which were not already ordered by the initial customer, but ordered by the others
rank products by occurence in other orders

SQL

  SELECT TOP (5) [t14].[ProductName]
    FROM (SELECT COUNT(*) AS [value],
                 [t13].[ProductName]
            FROM [customers] AS [t0]
     CROSS APPLY (SELECT [t9].[ProductName]
                    FROM [orders] AS [t1]
              CROSS JOIN [order details] AS [t2]
              INNER JOIN [products] AS [t3]
                      ON [t3].[ProductID] = [t2].[ProductID]
              CROSS JOIN [order details] AS [t4]
              INNER JOIN [orders] AS [t5]
                      ON [t5].[OrderID] = [t4].[OrderID]
               LEFT JOIN [customers] AS [t6]
                      ON [t6].[CustomerID] = [t5].[CustomerID]
              CROSS JOIN ([orders] AS [t7]
                          CROSS JOIN [order details] AS [t8]
                          INNER JOIN [products] AS [t9]
                                  ON [t9].[ProductID] = [t8].[ProductID])
                   WHERE NOT EXISTS(SELECT NULL AS [EMPTY]
                                      FROM [orders] AS [t10]
                                CROSS JOIN [order details] AS [t11]
                                INNER JOIN [products] AS [t12]
                                        ON [t12].[ProductID] = [t11].[ProductID]
                                     WHERE [t9].[ProductID] = [t12].[ProductID]
                                       AND [t10].[CustomerID] = [t0].[CustomerID]
                                       AND [t11].[OrderID] = [t10].[OrderID])
                     AND [t6].[CustomerID] <> [t0].[CustomerID]
                     AND [t1].[CustomerID] = [t0].[CustomerID]
                     AND [t2].[OrderID] = [t1].[OrderID]
                     AND [t4].[ProductID] = [t3].[ProductID]
                     AND [t7].[CustomerID] = [t6].[CustomerID]
                     AND [t8].[OrderID] = [t7].[OrderID]) AS [t13]
           WHERE [t0].[CustomerID] = N'ALFKI'
        GROUP BY [t13].[ProductName]) AS [t14]
ORDER BY [t14].[value] DESC

Gremlin

g.V().has("customer", "customerId", "ALFKI").as("customer").
      out("ordered").out("contains").out("is").aggregate("products").
      in("is").in("contains").in("ordered").where(neq("customer")).
      out("ordered").out("contains").out("is").where(without("products")).
      groupCount().order(local).by(values, decr).select(keys).limit(local, 5).
      unfold().values("name")

References:

Aggregate Step
As Step
GroupCount Step
Has Step
Limit Step
Order Step
Select Step
Unfold Step
Vertex Steps
Where Step
A Note on Predicates

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sql2gremlin.asciidoc

sql2gremlin.asciidoc

Introduction

Northwind Graph Model

Getting Started

Select

Select all

Select single column

Select multiple columns

Select calculated column

Select distinct values

Select scalar value

Filtering

Filter by equality

Filter by inequality

Filter by value range

Multiple filter conditions

Ordering

Order by value ascending

Order by value descending

Paging

Limit number of results

Paged result set

Grouping

Group by value

Joining

Inner join

Left join

Miscellaneous

Concatenate

Create, Update and Delete

CTE

Recursive query

Complex

Pivots

Recommendation

Files

sql2gremlin.asciidoc

Latest commit

History

sql2gremlin.asciidoc

File metadata and controls

Introduction

Northwind Graph Model

Getting Started

Select

Select all

Select single column

Select multiple columns

Select calculated column

Select distinct values

Select scalar value

Filtering

Filter by equality

Filter by inequality

Filter by value range

Multiple filter conditions

Ordering

Order by value ascending

Order by value descending

Paging

Limit number of results

Paged result set

Grouping

Group by value

Joining

Inner join

Left join

Miscellaneous

Concatenate

Create, Update and Delete

CTE

Recursive query

Complex

Pivots

Recommendation