The PNDA distribution is available on GitHub.
It consists of the following source code repositories and sub-projects:
- platform-salt: provisioning logic for creating PNDA
- pnda-cli: orchestration application for creating PNDA on AWS, OpenStack or an existing pre-prepared cluster
- pnda-dib-elements: tools for building disk image templates
- pnda: PNDA release notes and build system
- platform-libraries: libraries for working with interactive notebooks
- platform-tools: tools for operating a cluster
- bulkingest: tools for performing a bulk ingest of data
- platform-console-frontend: “single pane of glass” giving operational overview and access to application and data management functions
- platform-console-backend: APIs that provide data to the console frontend
  - console-backend-data-logger: APIs to ingest data
  - console-backend-data-manager: APIs to provide data
- platform-testing: modules that test both the end-to-end platform and individual components, and collect metrics
- platform-deployment-manager: API to manage packages and application deployment and lifecycle
- platform-data-mgmnt: tools to manage data retention
  - data-service: API to set data retention policies
  - hdfs-cleaner: cron job to clean up HDFS data
  - oozie-templates: templates that archive or delete data
- platform-package-repository: manages a simple package repository backed by OpenStack Swift
- gobblin: customized fork of the Gobblin data ingest framework
- prod-odl-kafka: plugin to ingest data from OpenDaylight
- logstash-codec-pnda-avro: patched Avro codec to ingest data from Logstash
- example-applications: example applications that can be built and run on PNDA
  - spark-batch: example batch data processing application
  - spark-streaming: example streaming data processing application
  - jupyter-notebooks: examples for working with Jupyter notebooks
  - kafka-spark-opentsdb: example consumer that feeds data to OpenTSDB
- example-kafka-clients: examples for working with Kafka clients
- pnda-guide: this guide