Kafka is the "front door" of PNDA, allowing the ingest of high-velocity data streams, distributing data to all interested consumers and decoupling data sources from data processing applications and platform clients.
It is normally not necessary to create a new producer to start acquiring network data, as any data is supported out of the box. Some encodings do benefit from special accommodations, but even for these no new producer should be required, as a growing number of data plugins have already been integrated with PNDA. It is not always clear which plugins to use for which data types, so we have summarized some common combinations in the table at the bottom of this page.
If you do have other data sources you want to integrate with PNDA, it is easy enough to write a PNDA producer; see producer.md.
PNDA adopts a schema-on-read approach to data processing, so all data directed towards the platform is stored in as close to its raw form as possible. When data is persisted, each datum is placed in a consistent Avro wrapper that contains the logical source of the data and a timestamp in addition to the data payload.
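As a rough illustration of the wrapper described above, the sketch below builds a record that carries a source, a timestamp, and the untouched payload. The field names (`src`, `timestamp`, `rawdata`), the `wrap` helper, and the schema itself are assumptions made for this example and are not taken from PNDA; the Data Preparation Guide remains the reference for the real schema. The sketch uses only the standard library, so it shows the shape of a record rather than performing Avro binary encoding.

```python
import time

# Hypothetical Avro-style record schema; the field names are illustrative
# assumptions, not the schema shipped with PNDA.
WRAPPER_SCHEMA = {
    "type": "record",
    "name": "event",
    "fields": [
        {"name": "src", "type": "string"},      # logical source of the data
        {"name": "timestamp", "type": "long"},  # milliseconds since the epoch
        {"name": "rawdata", "type": "bytes"},   # payload, kept as received
    ],
}


def wrap(src, payload, timestamp_ms=None):
    """Return a dict with the fields named in WRAPPER_SCHEMA.

    The payload is not parsed or modified, in keeping with schema-on-read.
    """
    if not isinstance(payload, bytes):
        raise TypeError("payload must be bytes so it is stored as received")
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return {"src": src, "timestamp": timestamp_ms, "rawdata": payload}


# Hypothetical source name and payload, for illustration only.
record = wrap("netflow-collector-1", b"\x00\x05raw-bytes", timestamp_ms=1500000000000)
```

In a real producer the resulting record would be serialized with an Avro library against the platform's schema before being sent to Kafka.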
Kafka data is stored in topics, each topic being divided into partitions and each partition being replicated to avoid data loss. Ingest is achieved by delivering data through a "producer", which is implemented to send data to one or more well-defined topics by connecting directly to the broker cluster. Load balancing is carried out by the broker cluster itself via negotiation with topic partition leaders.
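To make the relationship between topics, partitions, and replicas more concrete, here is a toy model that spreads the replicas of each partition across brokers. It is a deliberately simplified round-robin layout and not Kafka's actual assignment algorithm; the function names and broker names are hypothetical. It only illustrates why replication lets a topic survive the loss of a broker.

```python
def assign_replicas(num_partitions, replication_factor, brokers):
    """Toy layout: partition p gets replicas on consecutive brokers.

    Simplified illustration only; Kafka's real assignment differs.
    """
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + i) % len(brokers)] for i in range(replication_factor)]
        for p in range(num_partitions)
    }


def survives_loss_of(layout, failed_broker):
    """True if every partition still has a replica on some other broker."""
    return all(
        any(broker != failed_broker for broker in replicas)
        for replicas in layout.values()
    )


# Hypothetical three-broker cluster hosting a four-partition topic.
brokers = ["broker-0", "broker-1", "broker-2"]
layout = assign_replicas(num_partitions=4, replication_factor=2, brokers=brokers)
```

With a replication factor of 2, `survives_loss_of(layout, b)` holds for any single broker `b`, whereas a replication factor of 1 would leave some partitions without a copy.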
PNDA is typically deployed with a set of well-defined topics in accordance with the deployment context, each topic being carefully configured with a set of replicated partitions in line with the expected ingest and consumption rates. Please refer to our Topic Preparation Guide to understand how to create and set up topics. By convention, topics are named according to a hierarchical scheme so that consumers can "whitelist" data of interest and subscribe to multiple topics at once (e.g. mytelco.service6.netflow.* or mytelco.*).
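The effect of whitelisting against a hierarchical naming scheme can be sketched with ordinary regular expressions. The example assumes the whitelist is interpreted as a regular expression matched against the whole topic name; the topic names and the `matching_topics` helper are hypothetical. The dots are escaped here so that they match only a literal dot, which is stricter than the unescaped patterns quoted in the text.

```python
import re


def matching_topics(whitelist, topics):
    """Return the topics whose full name matches the whitelist regex."""
    pattern = re.compile(whitelist)
    return [t for t in topics if pattern.fullmatch(t)]


# Hypothetical topic names following the hierarchical convention.
topics = [
    "mytelco.service6.netflow.v9",
    "mytelco.service6.netflow.ipfix",
    "mytelco.service6.syslog",
    "othertelco.service1.netflow.v9",
]

# Narrow subscription: only the netflow topics of one service.
netflow_only = matching_topics(r"mytelco\.service6\.netflow\..*", topics)

# Broad subscription: every topic under the mytelco prefix.
everything_mytelco = matching_topics(r"mytelco\..*", topics)
```

A consumer that subscribes with the narrower pattern would receive only the two netflow topics, while the broader pattern also picks up the syslog topic.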
PNDA includes tools for managing topics, partitions and brokers and for monitoring the data flow across them.
Integrators can make use of the high and low level Kafka APIs. Please refer to our Topic Preparation Guide to discover how to leverage the advanced features that come with some dedicated encodings, and to our Data Preparation Guide to understand how to encapsulate data for those encoding options.