Skip to content

Brooklin Architecture

Lee Dongjin edited this page Aug 17, 2019 · 20 revisions

Brooklin is a Java server application that is typically deployed to a cluster of machines. Each machine can host one or more instances of Brooklin, each of which offers the exact same set of capabilities.

Key Concepts

  • The most fundamental concept in Brooklin is a Datastream.
  • A Datastream is a description of a data pipe between two systems; a source system from which data is consumed and a destination system to which this data is produced.
  • Brooklin allows you to create as many Datastreams as you need to set up different data pipes between various source and destination systems.
  • To support high scalability, Brooklin expects the data streamed between source and destination systems to be partitioned. If the data is not partitioned, it is treated as a stream with a single partition.
  • Likewise, Brooklin breaks every partitioned Datastream into multiple DatastreamTasks, each of which is limited to a subset of the total partitions, that are all processed concurrently for higher throughput.
  • Connector is the abstraction representing modules that consume data from source systems.
  • Different Connector implementations can be written to support consuming data from different source systems.
  • Every Connector is associated with an AssignmentStrategy that determines (a) how Datastreams are broken into DatastreamTasks and (b) how these DatastreamTasks are distributed among the different instances of Brooklin server app.
  • An example Connector implementation is KafkaConnector, which is intended for consuming data from Kafka.
  • TransportProvider is the abstraction representing modules that produce data to destination systems.
  • Different TransportProvider implementations can be written to support producing data to different destination systems.
  • An example TransportProvider implementation is KafkaTransportProvider, which is intended for producing data to Kafka.
  • Brooklin Coordinator is the module responsible for managing the different Connector implementations.
  • There is only a single Coordinator object in every Brooklin server app instance.
  • A Coordinator can either be a leader or non-leader.
  • In a Brooklin cluster, only one Coordinator is designated as the leader while the rest are non-leaders.
  • Brooklin employs the ZooKeeper election recipe for electing the leader Coordinator.
  • In addition to managing Connectors, the leader Coordinator is responsible for dividing the work among all the Coordinators in the cluster (including itself).
  • The leader Coordinator uses the AssignmentStrategy implementation specified in the relevant Connector to break down the work required for every Datastream into one or more DatastreamTasks.
  • An example AssignmentStrategy offered by Brooklin is the LoadbalancingStrategy, which evenly distributes all the available DatastreamTasks across all Coordinator instances.

Architecture

  • Brooklin server application is typically deployed to one or more machines, all using ZooKeeper as the source of truth for Datastream and DatastreamTask metadata.
  • Information about the different instances of Brooklin server app as well as their DatastreamTask assignments is also stored in ZooKeeper.
  • Every Brooklin instance exposes a REST endpoint — aka Datastream Management Service (DMS) — that offers CRUD operations on Datastreams over HTTP.

A good way to understand the architecture of Brooklin is to go through the workflow of creating a new Datastream.

Datastream Creation Workflow

The figure below illustrates the main steps involved in Datastream creation.

Brooklin Datastream Creation Workflow

  1. A Brooklin client sends a Datastream creation request to a Brooklin cluster.

  2. The request is routed to the Datastream Management Service REST endpoint of any instance of the Brooklin server app.

  3. The Datastream data is verified and written to ZooKeeper under a certain znode that the leader Coordinator is watching for changes.

  4. The leader Coordinator gets notified of the new Datastream znode creation.

  5. The leader Coordinator reads the metadata of the newly created Datastream, and uses the AssignmentStrategy associated with the relevant Connector to break down the Datastream into one or more DatastreamTasks. It assigns these DatastreamTasks to the available Coordinator instances by writing the breakdown and assignment info to ZooKeeper.

  6. The affected Coordinators get notified of the new DatastreamTask assignments created under their respective znodes, which they read and process immediately using the relevant Connectors (i.e. consume from source, produce to destination).

ZooKeeper

In addition to leader Coordinator election, Brooklin maintains the following pieces of info in ZooKeeper:

  • Registered Connector types
  • Datastream and DatastreamTasks metadata
  • DatastreamTask state information (e.g. offsets/checkpoints, processing errors)
  • Brooklin app instances
  • DatastreamTask assignments of Brooklin app instances

The figure below illustrates the overall structure of the data Brooklin maintains in ZooKeeper.

Brooklin Data Structure in ZooKeeper

Structure

  • Each Brooklin cluster is mapped to one top-level znode.
  • /dms is where the definitions of individual Datastreams are persisted (in JSON)
  • /connectors
    • A sub-znode for every registered Connector type in this cluster is created under this znode.
    • /connectors/<connector-type>/ is where the definitions of all the DatastreamTasks handled by this Connector type are located, one znode per DatastreamTask.
    • There are two child nodes under every /connectors/<connector-type>/<datastreamtask> znode: config and state.
    • config stores the definitions of the DatastreamTasks
    • state stores the progress info of the Connectors (offsets/checkpoints)
  • /liveinstances
    • This is where the different Brooklin instances create ephemeral znodes with incremental sequence numbers (created using the EPHEMERAL_SEQUENTIAL mode) for the purposes of leader Coordinator election.
    • The value associated with each sub-znode is the hostname of the corresponding Brooklin instance.
  • /instances
    • This is where every Brooklin instance creates a persistent znode for itself. Each znode has a name composed of the concatenation of the instance hostname and its unique sequence number under /liveinstances.
    • Two sub-znodes are created under each Brooklin instance znode — assignments and errors.
    • assignments contains the names of all the DatastreamTasks assigned to this Brooklin instance.
    • errors is where messages about errors encountered by this Brooklin instance are persisted.

Workflow

  • The leader Coordinator watches the /dms znode for changes in order to get notified when Datastreams are created, deleted, or altered.
  • Leader or not, every Coorindator watches its corresponding znode under /instances in order to get notified when changes are made to its assignments child znode.
  • The leader Coordinator assigns DatastreamTasks to individual Brooklin instances by changing the assignment znode under /instances/<instance-name>.

REST Endpoints

Brooklin uses Rest.li to expose 3 REST endpoints:

  1. Datastream Management Service (DMS)

    In addition to basic CRUD operations, this endpoint offers some advanced operations on Datastreams, e.g.

    • batch operations (e.g. get_all, batch_update)
    • pause and resume operations on entire Datastreams as well as individual partitions (in case of partitioned Datastreams)
  2. Health Monitoring

    This endpoint offers information about the overall health of a single Brooklin instance, e.g.

    • Instance name
    • Brooklin cluster name
    • All Connectors registered on the instance
    • All DatastreamTasks assigned to the instance
    • All DatastreamTask status (e.g. good, faulty, paused ... etc.)
    • Source and Destination info for all DatastreamTasks assigned to the instance
    • The number of partitions of every DatastreamTask
  3. Diagnostics

    • This endpoint offers a greater level of detail about individual Brooklin components. For example, it can be queried for detailed status of an individual Connector or Datastream.

    • Furthermore, this endpoint can provide a diagnostic digest aggregated from all Brooklin instances in the same cluster, sparing administrators the trouble of querying every instance individually and aggregating the stats externally.

Please refer to the REST Endpoints wiki page for detailed information about the options offered by these endpoints.

Metrics

Brooklin uses the Metrics library to emit metrics about various aspects of its operation. These metrics are emitted through JMX (using JmxReporter), and exported to CSV files (using CsvReporter) if a valid file path is provided to Brooklin through configuration.