Name		Name	Last commit message	Last commit date
parent directory ..
analyses		analyses
macros		macros
models/reviews		models/reviews
seeds		seeds
snapshots		snapshots
tests		tests
.gitignore		.gitignore
README.md		README.md
dbt_project.yml		dbt_project.yml

README.md

Applying Sentiment Analysis in the dbt DAG

In this example, we show you how to make multi-language sentiment analysis from within a dbt pipeline, using a simple sql expression like this:

select id,
       review,
       layer.predict("layer/nlptown/models/sentimentanalysis", ARRAY[review])
from {{ref('reviews')}}

We use the layer-dbt-adapter to enable the layer.predict function. With layer.predict, we can load and apply a pre-trained machine learning model to data within a dbt pipeline. This ML-enabled stage becomes part of the dbt execution DAG.

How to run

First, install the open-source Layer DBT Adapter. Currently, we only support BigQuery (more to come soon)

pip install dbt-layer-bigquery -U -q

Next, install the required libraries. This ML model is a finetuned Pytorch model open-sourced by NLPTown. So, we need some additional libraries for Pytorch.

pip install torch torchvision

Then, add a new BigQuery profile to your DBT profile. Name it as layer-profile, and don't forget to set type: layer_bigquery for the Layer adapter to work. Here is a sample profile:

layer-profile:
  target: dev
  outputs:
    dev:
      type: layer_bigquery
      method: service-account
      project: [GCP project id]
      dataset: [the name of your dbt dataset]
      threads: [1 or more]
      keyfile: [/path/to/bigquery/keyfile.json]

Now, are are ready to clone this repo and get to the folder for this example:

git clone https://github.com/layerai/examples-dbt
cd examples-dbt/sentiment_analysis

The example includes a sample dataset. You can seed the sample data reviews table to your DWH. The dataset includes a sample of multi-language product reviews from Amazon.

dbt seed

Finally, you can run the dbt project:

dbt run

When the project runs, the layer-dbt-adapter fetches the review text from the ref('products') relation and applies the sentiment model, producing a prediction score from 1 to 5 (1- lowest negative sentiment, 5- highest positive sentiment).

The application of the layer.predict function results in a column named predictioncontaining the predicted score for each row of the input dataset. In this dbt pipeline, the resulting dataset is written back to the DWH, resulting in a new table with the scored data.

Machine Learning Model

In this dbt example, we use a Bert model finetuned on product reviews in six languages: English, Dutch, German, French, Spanish, and Italian. It predicts the sentiment of the review as a number of stars (between 1 and 5).

This model is intended for direct use as a sentiment analysis model for product reviews in any of the six languages above, or for further finetuning on related sentiment analysis tasks.

To learn more about this machine learning model, visit:

https://app.layer.ai/layer/nlptown

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

sentiment_analysis

sentiment_analysis

README.md

Applying Sentiment Analysis in the dbt DAG

How to run

Machine Learning Model

Files

sentiment_analysis

Directory actions

More options

Directory actions

More options

Latest commit

History

sentiment_analysis

Folders and files

parent directory

README.md

Applying Sentiment Analysis in the dbt DAG

How to run

Machine Learning Model