NOTE: this is far from being ready for production. Use it at your own risk.
Using GenStage and setup as a system of one producer, one consumer-producer and one consumer. The producer fetch tweets from the Twitter Streaming APIs and sends data to the ProducerConsumer that sends data to the Consumer:
[A] -> [B] -> [C]
That means:
A is only a producer (and therefore a source) B is both producer and consumer C is only a consumer (and therefore a sink)
The producer (A) connects to the Twitter Stream API and read in all tweets sending them to the producer-consumer (B). The producer-consumer (B) will check if the hashtag appears in the tweet. If yes, it will send it to the consumer ( C ). The sink ( C ) will then log the tweet with a timestamp to the console.
The producer (A) receives messages from a continuous stream from Twitter and adds the tweets
to a :queue
. When handling the demand from the consumers, it will pick tweets out of the queue and dispatch them.
Elxir ~> 1.5
cd twitflow
mix deps.get
In the file config/config.exs
you should add the following keys and secrets.
You can get them on dev.twitter.com
config :twittex,
consumer_key: "",
consumer_secret: "",
token: "",
token_secret: ""
In the folder of the current application run mix run --no-halt
By default, the application will filter only tweets that include the word 'startup'. To change this, export an environment variable called MONITOR_HASHTAG
and give it the word or sentence you would like to follow.
Example:
export MONITOR_HASHTAG=love
mix run --no-halt
13:07:00.774 [info] Start monitoring tweets about love
13:07:07.913 [info] RT @lovely_zinampan: nakakainis yung teacher na nanghuhula lang ng ibibigay na grade sa estudyante
tangin*🙊😞
13:07:10.934 [info] RT @loveloiniee: Hala paktay nagbisaya na hahahahahaha
WansaGelliInABottle BukasNa https://t.co/krwFcUoDIO
13:07:20.888 [info] RT @PreetBharara: Some people hate this president BECAUSE they love this country https://t.co/B03e3mXKTz
13:07:22.901 [info] RT @razalfaro: words cannot describe how beautiful you are.
I love you to death Meng..❤️
@mainedcm
#MaineForRejoice https://t.co/YgYSobzL…
13:07:28.931 [info] RT @Bestloveporn: https://t.co/JV8b74p0w1
- tests
- tests
- simplify the producer logic
- The producer is coupled with the Twitter API stream and this could be avoided in different ways, however, in this experiment the intention was to demonstrate how producers, producer consumers and consumers can interact with each other. So the focus was spent there.
- The application may crash reaching the maximum rate of restarts for some of the workers, like the producer. This may be solved either isolating even more the workers and the points of failure or increasing the maximum number of restarts or using some other strategy in production.
- Fun fact, the Twittex library pushes back-pressure at the TCP level
- The design of the supervision tree has to be adapted after some learning on the specific domain
- Given that now all tweets are added to a :queue, there is a risk that the queue could get too big. This may be avoided adding more consumers as the queue grows.