This is a Spark/Cassandra demo using the open-source Spark Cassandra Connector.
The demos are split across the following packages:
- us.unemployment.demo
  - Ingestion
    - FromCSVToCassandra: read US employment data from a CSV file into Cassandra
    - FromCSVCaseClassToCassandra: read US employment data from a CSV file, map each record to a case class, and insert it into Cassandra
  - Read
    - FromCassandraToRow: read US employment data from Cassandra into the low-level CassandraRow object
    - FromCassandraToCaseClass: read US employment data from Cassandra into a custom Scala case class, leveraging the built-in object mapper
    - FromCassandraToSQL: read US employment data from Cassandra using SparkSQL and the connector integration
- twitter.stream
  - TwitterStreaming: demo of a Twitter stream saved back to Cassandra (stream IN). To make this demo work, start the job with the following options:
    <ol> <li>-Dtwitter4j.oauth.consumerKey="value"</li> <li>-Dtwitter4j.oauth.consumerSecret="value"</li> <li>-Dtwitter4j.oauth.accessToken="value"</li> <li>-Dtwitter4j.oauth.accessTokenSecret="value"</li> </ol> If you don't have Twitter app credentials, create a new app at <a href="https://apps.twitter.com/" target="_blank">https://apps.twitter.com/</a>
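
Because the streaming job cannot authenticate unless all four properties are present, it can be convenient to fail fast when one of them is missing. The following is only a minimal sketch: `TwitterCredentialsCheck` and `missingKeys` are hypothetical names that are not part of the demo, and the check simply looks at the JVM system properties set through the `-D` options listed above.

```scala
// Hypothetical helper, not part of the demo: checks that the four
// twitter4j OAuth system properties were passed to the JVM with -D.
object TwitterCredentialsCheck {

  val requiredKeys: Seq[String] = Seq(
    "twitter4j.oauth.consumerKey",
    "twitter4j.oauth.consumerSecret",
    "twitter4j.oauth.accessToken",
    "twitter4j.oauth.accessTokenSecret"
  )

  // Returns the required keys that are absent or blank in the given properties.
  def missingKeys(props: collection.Map[String, String]): Seq[String] =
    requiredKeys.filter(key => props.get(key).forall(_.trim.isEmpty))

  def main(args: Array[String]): Unit = {
    val missing = missingKeys(sys.props)
    if (missing.nonEmpty) {
      Console.err.println(s"Missing Twitter credentials: ${missing.mkString(", ")}")
      sys.exit(1)
    }
    println("All Twitter credentials are set")
  }
}
```

Calling such a check at the start of the job would turn a late authentication failure into an immediate, explicit error message.
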
- weather.data.demo
  - Data preparation
    - Go to the folder main/data
    - From this folder, execute $CASSANDRA_HOME/bin/cqlsh -f weather_data_schema.cql. It should create the keyspace spark_demo and some tables
    - Download Weather_Raw_Data_2014.csv.gz from here (>200 MB)
    - Unzip it somewhere on your disk
  - Ingestion
    - WeatherDataIntoCassandra: read the whole Weather_Raw_Data_2014.csv file (30.106 lines) and insert the data into Cassandra. The ingestion may take a while, so go grab a long coffee (under 1 hour on my 15" MacBook Pro). Do not forget to set the path to this file by changing the WeatherDataIntoCassandra.WEATHER_2014_CSV value
  - Read
    - WeatherDataFromCassandra: read all raw weather data plus all weather station details, keep only the data from French stations between March and June 2014, then compute the average temperature and pressure
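
The filtering and averaging done by WeatherDataFromCassandra can be illustrated on plain in-memory collections. This is only a sketch of the logic, not the demo's Spark job: the `Station` and `Reading` case classes, their fields, the `"FR"` country code, and the sample values are all assumptions made for illustration, and the real tables created by weather_data_schema.cql may look different.

```scala
// Hypothetical, simplified records: the real tables created by
// weather_data_schema.cql may use different names and columns.
final case class Station(id: String, countryCode: String)
final case class Reading(stationId: String, year: Int, month: Int,
                         temperature: Double, pressure: Double)

object WeatherAverages {

  // Keeps readings from stations in `country` whose month lies in
  // [fromMonth, toMonth] of `year`, then returns
  // (average temperature, average pressure), or None if nothing matches.
  def averages(stations: Seq[Station], readings: Seq[Reading], country: String,
               year: Int, fromMonth: Int, toMonth: Int): Option[(Double, Double)] = {
    val stationIds = stations.filter(_.countryCode == country).map(_.id).toSet
    val selected = readings.filter { r =>
      stationIds.contains(r.stationId) && r.year == year &&
        r.month >= fromMonth && r.month <= toMonth
    }
    if (selected.isEmpty) None
    else Some((
      selected.map(_.temperature).sum / selected.size,
      selected.map(_.pressure).sum / selected.size
    ))
  }

  def main(args: Array[String]): Unit = {
    // Made-up sample values, for illustration only.
    val stations = Seq(Station("S1", "FR"), Station("S2", "US"))
    val readings = Seq(
      Reading("S1", 2014, 3, 10.0, 1000.0),
      Reading("S1", 2014, 6, 20.0, 1020.0),
      Reading("S1", 2014, 7, 30.0, 1030.0), // outside March to June
      Reading("S2", 2014, 4, 25.0, 990.0)   // not a French station
    )
    println(averages(stations, readings, "FR", 2014, 3, 6))
  }
}
```

In the Spark version, the same steps would run as distributed transformations over data read through the connector rather than over local `Seq` values.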