This repository contains a number of example applications that can be built and run on PNDA. Each application directory contains more detailed information.
Examples of consuming data from Kafka and populating both HBase and OpenTSDB with simple Scala-based Spark Streaming applications.
- Write to HBase (scala)
- Write to OpenTSDB (scala)
- Count messages (python)
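As a rough illustration of what a message-counting streaming job computes, the sketch below groups message arrival times into fixed batch intervals and counts the messages in each. It is plain Python, not Spark Streaming or Kafka code, and it is not taken from the example application. The function name `count_per_batch`, the `batch_seconds` parameter and the sample timestamps are all hypothetical.

```python
from collections import Counter


def count_per_batch(timestamps, batch_seconds=5):
    """Count messages per fixed-length batch window.

    Keys of the returned dict are window start times in seconds.
    This mimics, in plain Python, the per-batch count a streaming
    job would emit; it is an assumption-laden sketch, not PNDA code.
    """
    counts = Counter()
    for ts in timestamps:
        # Map each arrival time to the start of its batch window.
        window_start = int(ts // batch_seconds) * batch_seconds
        counts[window_start] += 1
    return dict(sorted(counts.items()))


# Hypothetical arrival times (seconds since the stream started).
arrivals = [0.5, 1.2, 4.9, 5.0, 7.3, 12.1]
print(count_per_batch(arrivals))
```

In a real streaming application the batching is handled by the framework's batch interval, so only the counting step would remain in application code.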
Example of consuming data ingested by Gobblin on a batch basis and producing Parquet datasets, optimized for consumption by Impala.
- Write to Parquet format (scala)
- Write to Parquet format (python)
Example of a notebook for manipulating network data.
Application that runs the H2O data science platform on PNDA.
- Count Words (scala): counts words read from a socket.
- Count Words (python): counts words read from an input file.
- Flink Windows (java): host-network-data-usage example illustrating Flink windows, triggers and event processing.
- Count Hashtags (java): counts specific words from an input file, illustrating metrics, counters and accumulators.
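The counting logic behind the word-count and hashtag-count examples above can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the code of the example applications: it does not use Flink, and the helper names `count_words` and `count_hashtags`, the tokenization rules and the sample text are all hypothetical.

```python
from collections import Counter
import re


def count_words(text):
    """Return a Counter of lower-cased words found in text."""
    # Assumption: a word is a run of letters, digits or apostrophes.
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def count_hashtags(text):
    """Return a Counter of lower-cased hashtags such as '#pnda'."""
    # Assumption: a hashtag is '#' followed by word characters.
    return Counter(tag.lower() for tag in re.findall(r"#\w+", text))


# Hypothetical input; a real job would read from a socket or a file.
sample = "PNDA runs Flink. Flink counts words. #flink #PNDA #flink"
word_counts = count_words(sample)
hashtag_counts = count_hashtags(sample)
print(word_counts.most_common(2))
print(hashtag_counts)
```

A streaming engine would apply the same per-record counting incrementally, keeping the running totals as state instead of building them from one in-memory string.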
An example of a package containing multiple application component types, in this case a Spark app and related Jupyter notebook.