SQLFlow is a tool that automates data lineage discovery by analyzing the SQL script. It generates a nice clean diagram to show the dataflow among the table/view and columns in the data warehouse.
Support more than 20 major databases, including bigquery, couchbase, dax, db2, greenplum, hana, hive, impala, informix, mdx, mysql, netezza, openedge, oracle, postgresql, redshift, snowflake, sqlserver, sybase, teradata, vertica,
Just paste the SQL script and click a button, you will get the data lineage diagram instantly, highlight the dataflow in the graph with a simple mouse click.
You can also call the RESTful API provided by this tool in your program and get the data lineage and diagram model information in a JSON snippet to make further usage.
-
Send the SQL script received from the browser to the backend.
-
After receiving the result, which includes the data lineage and diagram model generated by the backend, visualize the diagram model in the browser.
-
Highlight the dataflow in the diagram when the user clicks on a specific table/column.
-
SQLFlow-Analyzer
: receiving the SQL script from the frontend and parse the SQL script into parse tree nodes by utilizing the GSP library, calculate the data lineage by analyzing AST.The
SQLFlow-Analyzer
component can be executed as a standalone tool. Take SQL text as input and generate data lineage in JSON format. CheckSQLFlow.java
for more. -
FlowLayout
: Calculating the layout of database objects(table/column) in the dlineage and generate the diagram model with all necessary position data, including nodes and edges.FlowLayout
depends ondoLayout library
to layout the database objects. -
Return a JSON snippet including the data lineage and diagram model to the frontend.
-
Visit SQLFlow Cloud using the browser
You may paste your SQL script into the SQLFlow web page or upload the SQL file to the site. Select the correct database and then click the visualize button.
-
Use RESTFul APIs
SQLFlow provides RESTful API, so your program can communicate with the SQLFlow backend directly. Sending the SQL to SQLFlow backend and receive a JSON snippet including the data lineage and diagram model for further processing in your program.
Please note that you need to set up the SQLFlow on-premise version on your server to use the API.
Or connect to the SQLFlow Cloud to use the RESTFul API
-
Install both frontend and backend on your own application/server (SQLFlow on-premise version)
Setup both the frontend and backend of SQLFlow on our server. Please check the setup manual.
To setup SQLFlow on your server, please contact us to obtain a commercial license to get all those distribution files of the SQLFlow on-premise version.
-
Integrate the frontend and backend to your data platform
The front end is written in Typescript to integrate it into your frontend if you use the same tech.
The backend part can be provided as Java library so you can be embedded into your program.
This article describes the relationship generated by the SQLFlow between column and column, column and table/view. One relation includes one target column and a relationship type, and one or more source columns.