A processor that evaluates a machine learning model stored in TensorFlow Protobuf format. It operationalizes the https://github.com/danielegrattarola/twitter-sentiment-cnn
Decodes the evaluated result into POSITIVE, NEGATIVE and NEUTRAL values. Then creates and returns a simple JSON message with this structure:
N/A
Processor’s output uses TensorflowOutputConverter
to convert the computed Tensor
result into a serializable
message. The default implementation uses JSON.
Custom TensorflowOutputConverter
can provide more convenient data representations.
See link::../spring-cloud-starter-stream-processor-twitter-sentiment/src/main/java/org/springframework/cloud/stream/app/twitter/sentiment/processor/TwitterSentimentTensorflowOutputConverter.java[TwitterSentimentTensorflowOutputConverter.java].
The twitter-sentiment processor has the following options:
- tensorflow.expression
-
How to obtain the input data from the input message. If empty it defaults to the input message payload. The headers[myHeaderName] expression to get input data from message's header using myHeaderName as a key. (Expression, default:
<none>
) - tensorflow.mode
-
The outbound message can store the inference result either in the payload or in a header with name outputName. The payload mode (default) stores the inference result in the outbound message payload. The inbound payload is discarded. The header mode stores the inference result in outbound message's header defined by the outputName property. The the inbound message payload is passed through to the outbound such. (OutputMode, default:
<none>
, possible values:payload
,header
) - tensorflow.model
-
The location of the pre-trained TensorFlow model file. The file, http and classpath schemas are supported. For archive locations takes the first file with '.pb' extension. Use the URI fragment parameter to specify an exact model name (e.g. https://foo/bar/model.tar.gz#frozen_inference_graph.pb) (Resource, default:
<none>
) - tensorflow.model-fetch
-
The TensorFlow graph model outputs. Comma separate list of TensorFlow operation names to fetch the output Tensors from. (List<String>, default:
<none>
) - tensorflow.output-name
-
The output data key used for the Header modes. (String, default:
result
) - tensorflow.twitter.vocabulary
-
The location of the word vocabulary file, used for training the model (Resource, default:
<none>
)
$ ./mvnw clean install -PgenerateApps
$ cd apps
You can find the corresponding binder based projects here. You can then cd into one of the folders and build it:
$ ./mvnw clean package
java -jar twitter-sentiment-processor.jar --tensorflow.twitter.vocabulary= --tensorflow.model= \
--tensorflow.modelFetch= --tensorflow.mode="
And here is a sample pipeline that computes sentiments for json tweets coming from the twitterstream
source and
using the pre-build minimal_graph.proto
and vocab.csv
:
tweets=twitterstream --access-token-secret=xxx --access-token=xxx --consumer-secret=xxx --consumer-key=xxx \
| filter --expression=#jsonPath(payload,'$.lang')=='en' \
| twitter-sentimet --vocabulary='https://storage.googleapis.com/scdf-tensorflow-models/twitter-sentiment/vocab.csv' \
--output-name=output/Softmax --model='https://storage.googleapis.com/scdf-tensorflow-models/twitter-sentiment/minimal_graph.proto' \
--model-fetch=output/Softmax \
| log