Skip to content
This repository was archived by the owner on Jan 21, 2022. It is now read-only.

Latest commit

 

History

History
88 lines (60 loc) · 4.06 KB

File metadata and controls

88 lines (60 loc) · 4.06 KB

Twitter Sentiment Analysis Processor

A processor that evaluates a machine learning model stored in TensorFlow Protobuf format. It operationalizes the https://github.com/danielegrattarola/twitter-sentiment-cnn

SCDF TF Sentiment

Input

Headers

  • content-type: application/json

Payload

  • JSON tweet message

Output

Headers

  • content-type: application/json

Payload

Decodes the evaluated result into POSITIVE, NEGATIVE and NEUTRAL values. Then creates and returns a simple JSON message with this structure:

N/A

Payload

Processor’s output uses TensorflowOutputConverter to convert the computed Tensor result into a serializable message. The default implementation uses JSON.

Custom TensorflowOutputConverter can provide more convenient data representations. See link::../spring-cloud-starter-stream-processor-twitter-sentiment/src/main/java/org/springframework/cloud/stream/app/twitter/sentiment/processor/TwitterSentimentTensorflowOutputConverter.java[TwitterSentimentTensorflowOutputConverter.java].

Options

The twitter-sentiment processor has the following options:

tensorflow.expression

How to obtain the input data from the input message. If empty it defaults to the input message payload. The headers[myHeaderName] expression to get input data from message's header using myHeaderName as a key. (Expression, default: <none>)

tensorflow.mode

The outbound message can store the inference result either in the payload or in a header with name outputName. The payload mode (default) stores the inference result in the outbound message payload. The inbound payload is discarded. The header mode stores the inference result in outbound message's header defined by the outputName property. The the inbound message payload is passed through to the outbound such. (OutputMode, default: <none>, possible values: payload,header)

tensorflow.model

The location of the pre-trained TensorFlow model file. The file, http and classpath schemas are supported. For archive locations takes the first file with '.pb' extension. Use the URI fragment parameter to specify an exact model name (e.g. https://foo/bar/model.tar.gz#frozen_inference_graph.pb) (Resource, default: <none>)

tensorflow.model-fetch

The TensorFlow graph model outputs. Comma separate list of TensorFlow operation names to fetch the output Tensors from. (List<String>, default: <none>)

tensorflow.output-name

The output data key used for the Header modes. (String, default: result)

tensorflow.twitter.vocabulary

The location of the word vocabulary file, used for training the model (Resource, default: <none>)

Build

$ ./mvnw clean install -PgenerateApps
$ cd apps

You can find the corresponding binder based projects here. You can then cd into one of the folders and build it:

$ ./mvnw clean package

Examples

java -jar twitter-sentiment-processor.jar --tensorflow.twitter.vocabulary= --tensorflow.model= \
    --tensorflow.modelFetch= --tensorflow.mode="

And here is a sample pipeline that computes sentiments for json tweets coming from the twitterstream source and using the pre-build minimal_graph.proto and vocab.csv:

tweets=twitterstream --access-token-secret=xxx --access-token=xxx --consumer-secret=xxx --consumer-key=xxx \
| filter --expression=#jsonPath(payload,'$.lang')=='en' \
| twitter-sentimet --vocabulary='https://storage.googleapis.com/scdf-tensorflow-models/twitter-sentiment/vocab.csv' \
   --output-name=output/Softmax --model='https://storage.googleapis.com/scdf-tensorflow-models/twitter-sentiment/minimal_graph.proto' \
   --model-fetch=output/Softmax \
| log