From 03b1accbc0d50417ac263193c4f406ee952a9aeb Mon Sep 17 00:00:00 2001 From: dehora Date: Wed, 22 Jun 2016 10:14:31 +0200 Subject: [PATCH 1/4] Add the nakadi 0.6.0 spec --- docs/api-spec-oai/nakadi-oai-0.6.0.yaml | 1120 +++++++++++++++++++++ docs/api-spec-oai/nakadi-oai-current.yaml | 2 +- 2 files changed, 1121 insertions(+), 1 deletion(-) create mode 100644 docs/api-spec-oai/nakadi-oai-0.6.0.yaml diff --git a/docs/api-spec-oai/nakadi-oai-0.6.0.yaml b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml new file mode 100644 index 0000000..ab31164 --- /dev/null +++ b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml @@ -0,0 +1,1120 @@ +swagger: '2.0' +info: + title: Nakadi Event Bus API Definition + description: | + + Nakadi at its core aims at being a generic and content-agnostic event broker with a convenient + API. In doing this, Nakadi abstracts away, as much as possible, details of the backing + messaging infrastructure. The single currently supported messaging infrastructure is Kafka + (Kinesis is planned for the future). + + In Nakadi every Event has an EventType, and a **stream** of Events is exposed for each + registered EventType. + + An EventType defines properties relevant for the operation of its associated stream, namely: + + * The **schema** of the Event of this EventType. The schema defines the accepted format of + Events of an EventType and will be, if so desired, enforced by Nakadi. Usually Nakadi will + respect the schema for the EventTypes in accordance to how an owning Application defines them. + **Note:** *Currently the specification of the schema must be pushed into Nakadi on EventType + creation; in the future, assuming that Applications will expose the schema for its owned + resources, Nakadi might support fetching the schema directly from them.* + + * The expected **validation** and **enrichment** procedures upon reception of an Event. + Validation define conditions for the acceptance of the incoming Event and are strictly enforced + by Nakadi. Usually the validation will enforce compliance of the payload (or part of it) with + the defined schema of its EventType. Enrichment specify properties that are added to the payload + (body) of the Event before persisting it. Usually enrichment affects the metadata of an Event + but is not limited to. + + * The **ordering** expectations of Events in this stream. Each EventType will have its Events + stored in an underlying logical stream (the Topic) that is physically organized in disjoint + collections of strictly ordered Events (the Partition). The EventType defines the field that + acts as evaluator of the ordering (that is, its partition key); this ordering is guaranteed by + making Events whose partition key resolves to the same Partition (usually a hash function on its + value) be persisted strictly ordered in a Partition. In practice this means that all Events + within a Partition have their relative order guaranteed: Events (of a same EventType) that are + *about* a same data entity (that is, have the same value on its Partition key) reach always the + same Partition, the relative ordering of them is secured. This mechanism implies that no + statements can be made about the relative ordering of Events that are in different partitions. + + Except for defined enrichment rules, Nakadi will never manipulate the content of any Event. + + Clients of Nakadi can be grouped in 2 categories: **EventType owners** and **Clients** (clients + in turn are both **Producers** and **Consumers** of Events). 
Event Type owners interact with + Nakadi via the **Schema Registry API** for the definition of EventTypes, while Clients via the + streaming API for submission and reception of Events. + + A low level **Unmanaged API** is available, providing full control and responsibility of + position tracking and partition resolution (and therefore ordering) to the Clients. + + In the high level **Subscription API** the consumption of Events proceeds via the establishment + of a named **Subscription** to an EventType. Subscriptions are persistent relationships from an + Application (which might have several instances) and the stream of one or more EventType's, + whose consumption tracking is managed by Nakadi, freeing Consumers from any responsibility in + tracking of the current position on a Stream. + + **Note** *Currently the high level API is out of scope in this specification. It is in the + short term plan to be included.* + + + Scope and status of the API + --------------------------------- + + The API specification is in **draft** state and is subject to change. + + In this document, you'll find: + + * The Schema Registry API, including configuration possibilities for the Schema, Validation, + Enrichment and Partitioning of Events, and their effects on reception of Events. + + * The existing event format (see definition of Event, BusinessEvent and DataChangeEvent) + (Note: in the future this is planned to be configurable and not an inherent part of this API). + + * Unmanaged API: provides low level access to an event stream with information that allows + consumers to detect their position for each partition in the stream via a `Cursor`. + + Notable omissions here are: + + * The Managed (or "high level") API: this will be a contract between Nakadi and consumers to + allow the latter to establish subscriptions and have offset information managed by the Nakadi + service. + + * Enrichment options. Enrichment is currently limited to metadata enrichment for the business + and data change types, but the API is designed to allow more options. + + * More extensive security scopes (OAuth) for the different operations in the API. + + * Explicit control of an event type's creation parameters (the number of partitions, retention + times, etc), as well as their modification. + + + version: '0.6.0' + contact: + name: Team Aruha @ Zalando + email: team-aruha+nakadi-maintainers@zalando.de +schemes: + - https +consumes: + - application/json +produces: + - application/json +securityDefinitions: + oauth2: + type: oauth2 + flow: implicit + authorizationUrl: 'https://auth.example.com/oauth2/tokeninfo' + scopes: + nakadi.config.write: | + Grants access for changing Nakadi configuration. + nakadi.event_type.write: | + Grants access for applications to define and update EventTypes. + nakadi.event_stream.write: | + Grants access for applications to submit Events. + nakadi.event_stream.read: | + Grants access for consuming Event streams. + +paths: + /event-types: + get: + tags: + - schema-registry-api + description: Returns a list of all registered `EventType`s + parameters: + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. 
+ type: string + format: flow-id + responses: + '200': + description: Ok + schema: + type: array + items: + $ref: '#/definitions/EventType' + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + post: + tags: + - schema-registry-api + security: + - oauth2: ['nakadi.event_type.write'] + description: | + Creates a new `EventType`. + + The fields validation-strategies, enrichment-strategies and partition-resolution-strategy + have all an effect on the incoming Event of this EventType. For its impacts on the reception + of events please consult the Event submission API methods. + + * Validation strategies define an array of validation stategies to be evaluated on reception + of an `Event` of this `EventType`. Details of usage can be found in this external document + + - http://zalando.github.io/nakadi-manual/ + + * Enrichment strategy. (todo: define this part of the API). + + * The schema of an `EventType` is defined as an `EventTypeSchema`. Currently only + the value `json-schema` is supported, representing JSON Schema draft 04. + + Following conditions are enforced. Not meeting them will fail the request with the indicated + status (details are provided in the Problem object): + + * EventType name on creation must be unique (or attempting to update an `EventType` with + this method), otherwise the request is rejected with status 409 Conflict. + + * Using `EventTypeSchema.type` other than json-schema or passing a `EventTypeSchema.schema` + that is invalid with respect to the schema's type. Rejects with 422 Unprocessable entity. + + * Referring any Enrichment or Partition strategies that do not exist or + whose parametrization is deemed invalid. Rejects with 422 Unprocessable entity. + + Nakadi MIGHT impose necessary schema, validation and enrichment minimal configurations that + MUST be followed by all EventTypes (examples include: validation rules to match the schema; + enriching every Event with the reception date-type; adhering to a set of schema fields that + are mandatory for all EventTypes). **The mechanism to set and inspect such rules is not + defined at this time and might not be exposed in the API.** + + parameters: + - name: event-type + in: body + description: EventType to be created + schema: + $ref: '#/definitions/EventType' + required: true + responses: + '201': + description: Created + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + '409': + description: Conflict, for example on creation of EventType with already existing name. + schema: + $ref: '#/definitions/Problem' + '422': + description: Unprocessable Entity + schema: + $ref: '#/definitions/Problem' + + /event-types/{name}: + get: + tags: + - schema-registry-api + description: | + Returns the `EventType` identified by its name. + parameters: + - name: name + in: path + description: Name of the EventType to load. + type: string + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + responses: + '200': + description: Ok + schema: + $ref: '#/definitions/EventType' + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + put: + tags: + - schema-registry-api + security: + - oauth2: ['nakadi.event_type.write'] + description: | + Updates the `EventType` identified by its name. 
Behaviour is the same as creation of + `EventType` (See POST /event-type) except where noted below. + + The name field cannot be changed. Attempting to do so will result in a 422 failure. + + At this moment changes in the schema are not supported and will produce a 422 + failure. (todo: define conditions for backwards compatible extensions in the schema) + parameters: + - name: name + in: path + description: Name of the EventType to update. + type: string + required: true + - name: event-type + in: body + description: EventType to be updated. + schema: + $ref: '#/definitions/EventType' + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + responses: + '200': + description: Ok + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + '422': + description: Unprocessable Entity + schema: + $ref: '#/definitions/Problem' + delete: + tags: + - schema-registry-api + security: + - oauth2: ['nakadi.config.write'] + description: | + Deletes an `EventType` identified by its name. All events in the `EventType`'s stream' will + also be removed. **Note**: deletion happens asynchronously, which has the following + consequences: + + * Creation of an equally named `EventType` before the underlying topic deletion is complete + might not succeed (failure is a 409 Conflict). + + * Events in the stream may be visible for a short period of time before being removed. + + parameters: + - name: name + in: path + description: Name of the EventType to delete. + type: string + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + responses: + '200': + description: EventType is successfuly removed + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + /event-types/{name}/events: + post: + tags: + - stream-api + security: + - oauth2: ['nakadi.event_stream.write'] + description: | + Publishes a batch of `Event`s of this `EventType`. All items must be of the EventType + identified by `name`. + + Reception of Events will always respect the configuration of its `EventType` with respect to + validation, enrichment and partition. The steps performed on reception of incoming message + are: + + 1. Every validation rule specified for the `EventType` will be checked in order against the + incoming Events. Validation rules are evaluated in the order they are defined and the Event + is **rejected** in the first case of failure. If the offending validation rule provides + information about the violation it will be included in the `BatchItemResponse`. If the + `EventType` defines schema validation it will be performed at this moment. + + 1. Once the validation succeeded, the content of the Event is updated according to the + enrichment rules in the order the rules are defined in the `EventType`. No preexisting + value might be changed (even if added by an enrichment rule). Violations on this will force + the immediate **rejection** of the Event. The invalid overwrite attempt will be included in + the item's `BatchItemResponse` object. + + 1. The incoming Event's relative ordering is evaluated according to the rule on the + `EventType`. 
Failure to evaluate the rule will **reject** the Event. + + Given the batched nature of this operation, any violation on validation or failures on + enrichment or partitioning will cause the whole batch to be rejected, i.e. none of its + elements are pushed to the underlying broker. + + Failures on writing of specific partitions to the broker might influence other + partitions. Failures at this stage will fail only the affected partitions. + + parameters: + - name: name + in: path + type: string + description: Name of the EventType + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + - name: event + in: body + description: The Event being published + schema: + type: array + items: + $ref: '#/definitions/Event' + required: true + responses: + '200': + description: All events in the batch have been successfully published. + '207': + description: | + At least one event has failed to be submitted. The batch might be partially submitted. + schema: + type: array + items: + $ref: '#/definitions/BatchItemResponse' + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + '422': + description: | + At least one event failed to be validated, enriched or partitioned. None were submitted. + schema: + type: array + items: + $ref: '#/definitions/BatchItemResponse' + get: + tags: + - stream-api + - unmanaged-api + security: + - oauth2: ['nakadi.event_stream.read'] + description: | + Starts a stream delivery for the specified partitions of the given EventType. + + The event stream is formatted as a sequence of `EventStreamBatch`es separated by `\n`. Each + `EventStreamBatch` contains a chunk of Events and a `Cursor` pointing to the **end** of the + chunk (i.e. last delivered Event). The cursor might specify the offset with the symbolic + value `BEGIN`, which will open the stream starting from the oldest available offset in the + partition. + + Currently the `application/x-json-stream` format is the only one supported by the system, + but in the future other media types may be supported. + + If streaming for several distinct partitions, each one is an independent `EventStreamBatch`. + + The initialization of a stream can be parameterized in terms of size of each chunk, timeout + for flushing each chunk, total amount of delivered Events and total time for the duration of + the stream. + + Nakadi will keep a streaming connection open even if there are no events to be delivered. In + this case the timeout for the flushing of each chunk will still apply and the + `EventStreamBatch` will contain only the Cursor pointing to the same offset. This can be + treated as a keep-alive control for some load balancers. + + The tracking of the current offset in the partitions and of which partitions is being read + is in the responsibility of the client. No commits are needed. + produces: + - application/x-json-stream + parameters: + - name: name + in: path + description: EventType name to get events about + type: string + required: true + - name: X-nakadi-cursors + in: header + description: | + Cursors indicating the partitions to read from and respective starting offsets. + + Assumes the offset on each cursor is not inclusive (i.e., first delivered Event is the + **first one after** the one pointed to in the cursor). 
+ + If the header is not present, the stream for all partitions defined for the EventType + will start from the newest event available in the system at the moment of making this + call. + + **Note:** we are not using query parameters for passing the cursors only because of the + length limitations on the HTTP query. Another way to initiate this call would be the + POST method with cursors passed in the method body. This approach can implemented in the + future versions of this API. + + required: false + type: string + format: serialized json array of '#/definitions/Cursor' + - name: batch_limit + in: query + description: | + Maximum number of `Event`s in each chunk (and therefore per partition) of the stream. + + * If 0 or unspecified will buffer Events indefinitely and flush on reaching of + `batch_flush_timeout`. + type: integer + format: int32 + required: false + default: 1 + - name: stream_limit + in: query + description: | + Maximum number of `Event`s in this stream (over all partitions being streamed in this + connection). + + * If 0 or undefined, will stream batches indefinitely. + + * Stream initialization will fail if `stream_limit` is lower than `batch_limit`. + type: integer + format: int32 + required: false + default: 0 + - name: batch_flush_timeout + in: query + description: | + Maximum time in seconds to wait for the flushing of each chunk (per partition). + + * If the amount of buffered Events reaches `batch_limit` before this + `batch_flush_timeout` is reached, the messages are immediately flushed to the client and + batch flush timer is reset. + + * If 0 or undefined, will assume 30 seconds. + type: number + format: int32 + required: false + default: 30 + - name: stream_timeout + in: query + description: | + Maximum time in seconds a stream will live before being interrupted. + If value is zero, streams indefinitely. + + If this timeout is reached, any pending messages (in the sense of `stream_limit`) will + be flushed to the client. + + Stream initialization will fail if `stream_timeout` is lower than `batch_flush_timeout`. + type: number + format: int32 + required: false + default: 60 + - name: stream_keep_alive_limit + in: query + description: | + Maximum number of keep-alive messages to get in a row before closing the connection. + + If 0 or undefined will send keep alive messages indefinitely. + type: integer + format: int32 + required: false + default: 0 + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + + responses: + '200': + description: | + Starts streaming to the client. + Stream format is a continuous series of `EventStreamBatch`s separated by `\n` + schema: + $ref: '#/definitions/EventStreamBatch' + '401': + description: Not authenticated + schema: + $ref: '#/definitions/Problem' + '422': + description: Unprocessable entity + schema: + $ref: '#/definitions/Problem' + + '/event-types/{name}/partitions': + get: + tags: + - unmanaged-api + - monitoring + - management-api + security: + - oauth2: ['nakadi.event_stream.read'] + description: | + Lists the `Partition`s for the given event-type. + + This endpoint is mostly interesting for monitoring purposes or in cases when consumer wants + to start consuming older messages. 
+ + parameters: + - name: name + in: path + description: EventType name + type: string + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + responses: + '200': + description: OK + schema: + type: array + description: An array of `Partition`s + items: + $ref: '#/definitions/Partition' + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + /event-types/{name}/partitions/{partition}: + get: + tags: + - unmanaged-api + - management-api + security: + - oauth2: ['nakadi.event_stream.read'] + + description: Returns the given `Partition` of this EventType + parameters: + - name: name + in: path + description: EventType name + type: string + required: true + - name: partition + in: path + description: Partition id + type: string + required: true + - name: X-Flow-Id + in: header + description: | + The flow id of the request, which is written into the logs and passed to called + services. Helpful for operational troubleshooting and log analysis. + type: string + format: flow-id + responses: + '200': + description: OK + schema: + $ref: '#/definitions/Partition' + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + '/registry/enrichment-strategies': + get: + tags: + - schema-registry-api + description: | + Lists all of the enrichment strategies supported by this Nakadi installation. Special or + custom strategies besides the defaults will be listed here. + responses: + '200': + description: Returns a list of all enrichment strategies known to Nakadi + schema: + type: array + items: + type: string + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + '/registry/partition-strategies': + get: + tags: + - schema-registry-api + description: | + Lists all of the partition resolution strategies supported by this installation of Nakadi. + Special or custom strategies besides the defaults will be listed here. + + Nakadi currently offers these inbuilt strategies: + + - `random`: Resolution of the target partition happens randomly (events are evenly + distributed on the topic's partitions). + + - `user_defined`: Target partition is defined by the client. As long as the indicated + partition exists, Event assignment will respect this value. Correctness of the relative + ordering of events is under the responsibility of the Producer. Requires that the client + provides the target partition on `metadata.partition` (See `EventMetadata`). Failure to do + so will reject the publishing of the Event. + + - `hash`: Resolution of the partition follows the computation of a hash from the value of + the fields indicated in the EventType's `partition_key_fields`, guaranteeing that Events + with same values on those fields end in the same partition. Given the event type's category + is DataChangeEvent, field path is considered relative to "data". 
+ responses: + '200': + description: Returns a list of all partitioning strategies known to Nakadi + schema: + type: array + items: + type: string + '401': + description: Client is not authenticated + schema: + $ref: '#/definitions/Problem' + + +# ################################### # +# # +# Definitions # +# # +# ################################### # + +definitions: + Event: + type: object + description: | + **Note** The Event definition will be externalized in future versions of this document. + + A basic payload of an Event. The actual schema is dependent on the information configured for + the EventType, as is its enforcement (see POST /event-types). Setting of metadata properties + are dependent on the configured enrichment as well. + + For explanation on default configurations of validation and enrichment, see documentation of + `EventType.category`. + + For concrete examples of what will be enforced by Nakadi see the objects BusinessEvent and + DataChangeEvent below. + + EventMetadata: + type: object + description: | + Metadata for this Event. + + Contains commons fields for both Business and DataChange Events. Most are enriched by Nakadi + upon reception, but they in general MIGHT be set by the client. + properties: + eid: + description: | + Identifier of this Event. + + Clients MUST generate this value and it SHOULD be guaranteed to be unique from the + perspective of the producer. Consumers MIGHT use this value to assert uniqueness of + reception of the Event. + type: string + format: uuid + example: '105a76d8-db49-4144-ace7-e683e8f4ba46' + event_type: + description: | + The EventType of this Event. This is enriched by Nakadi on reception of the Event + based on the endpoint where the Producer sent the Event to. + + If provided MUST match the endpoint. Failure to do so will cause rejection of the + Event. + type: string + example: 'pennybags.payment-business-event' + occurred_at: + description: | + Timestamp of creation of the Event generated by the producer. + type: string + format: date-time + example: '1996-12-19T16:39:57-08:00' + received_at: + type: string + description: | + Timestamp of the reception of the Event by Nakadi. This is enriched upon reception of + the Event. + If set by the producer Event will be rejected. + format: date-time + example: '1996-12-19T16:39:57-08:00' + parent_eids: + type: array + items: + type: string + format: uuid + description: | + Event identifier of the Event that caused the generation of this Event. + Set by the producer. + example: '105a76d8-db49-4144-ace7-e683e8f4ba46' + flow_id: + description: | + The flow-id of the producer of this Event. As this is usually a HTTP header, this is + enriched from the header into the metadata by Nakadi to avoid clients having to + explicitly copy this. + type: string + example: 'JAh6xH4OQhCJ9PutIV_RYw' + partition: + description: | + Indicates the partition assigned to this Event. + + Required to be set by the client if partition strategy of the EventType is + 'user_defined'. + type: string + example: '0' + required: + - eid + - occurred_at + + BusinessEvent: + description: | + A Business Event. + + Usually represents a status transition in a Business process. + allOf: + - $ref: '#/definitions/Event' + - type: object + properties: + metadata: + $ref: '#/definitions/EventMetadata' + required: + - metadata + + DataChangeEvent: + description: | + A Data change Event. + + Represents a change on a resource. Also contains indicators for the data + type and the type of operation performed. 
+ allOf: + - $ref: '#/definitions/Event' + - type: object + properties: + data_type: + type: string + example: 'pennybags:order' + data_op: + type: string + enum: ['C', 'U', 'D', 'S'] + description: | + The type of operation executed on the entity. + * C: Creation + * U: Update + * D: Deletion + * S: Snapshot + metadata: + $ref: '#/definitions/EventMetadata' + data: + type: object + description: | + The payload of the type + required: + - data + - metadata + - data_type + - data_op + + Problem: + type: object + properties: + type: + type: string + format: uri + description: | + An absolute URI that identifies the problem type. When dereferenced, it SHOULD provide + human-readable API documentation for the problem type (e.g., using HTML). This Problem + object is the same as provided by https://github.com/zalando/problem + example: http://httpstatus.es/503 + title: + type: string + description: | + A short, summary of the problem type. Written in English and readable for engineers + (usually not suited for non technical stakeholders and not localized) + example: Service Unavailable + status: + type: integer + format: int32 + description: | + The HTTP status code generated by the origin server for this occurrence of the problem. + example: 503 + detail: + type: string + description: | + A human readable explanation specific to this occurrence of the problem. + example: Connection to database timed out + instance: + type: string + format: uri + description: | + An absolute URI that identifies the specific occurrence of the problem. + It may or may not yield further information if dereferenced. + required: + - type + - title + - status + + Partition: + description: | + Partition information. Can be helpful when trying to start a stream using an unmanaged API. + + This information is not related to the state of the consumer clients. + required: + - partition + - oldest_available_offset + - newest_available_offset + properties: + partition: + type: string + oldest_available_offset: + description: | + An offset of the oldest available Event in that partition. This value will be changing + upon removal of Events from the partition by the background archiving/cleanup mechanism. + type: string + newest_available_offset: + description: | + An offset of the newest available Event in that partition. This value will be changing + upon reception of new events for this partition by Nakadi. + + This value can be used to construct a cursor when opening streams (see + `GET /event-type/{name}/events` for details). + + Might assume the special name BEGIN, meaning a pointer to the offset of the oldest + available event in the partition. + type: string + + Cursor: + required: + - partition + - offset + properties: + partition: + type: string + description: | + Id of the partition pointed to by this cursor. + offset: + type: string + description: | + Offset of the event being pointed to. + + EventStreamBatch: + description: | + One chunk of events in a stream. A batch consists of an array of `Event`s plus a `Cursor` + pointing to the offset of the last Event in the stream. + + The size of the array of Event is limited by the parameters used to initialize a Stream. + + If acting as a keep alive message (see `GET /event-type/{name}/events`) the events array will + be omitted. + + Sequential batches might present repeated cursors if no new events have arrived. 
+ required: + - cursor + properties: + cursor: + $ref: '#/definitions/Cursor' + events: + type: array + items: + $ref: '#/definitions/Event' + + EventType: + description: An event type defines the schema and its runtime properties. + properties: + name: + type: string + description: | + Name of this EventType. The name is constrained by a regular expression. + + Note: the name can encode the owner/responsible for this EventType and ideally should + follow a common pattern that makes it easy to read an understand, but this level of + structure is not enforced. For example a team name and data type can be used such as + 'acme-team.price-change'. + pattern: '[a-zA-Z][-0-9a-zA-Z_]*(\.[a-zA-Z][-0-9a-zA-Z_]*)*' + example: order.order_cancelled, acme-platform.users + owning_application: + type: string + description: | + Indicator of the (Stups) Application owning this `EventType`. + example: price-service + category: + type: string + enum: + - undefined + - data + - business + description: | + Defines the category of this EventType. + + The value set will influence, if not set otherwise, the default set of + validations, enrichment-strategies, and the effective schema for validation in + the following way: + + - `undefined`: No predefined changes apply. The effective schema for the validation is + exactly the same as the `EventTypeSchema`. + + - `data`: Events of this category will be DataChangeEvents. The effective schema during + the validation contains `metadata`, and adds fields `data_op` and `data_type`. The + passed EventTypeSchema defines the schema of `data`. + + - `business`: Events of this category will be BusinessEvents. The effective schema for + validation contains `metadata` and any additionally defined properties passed in the + `EventTypeSchema` directly on top level of the Event. If name conflicts arise, creation + of this EventType will be rejected. + + enrichment_strategies: + description: | + Determines the enrichment to be performed on an Event upon reception. Enrichment is + performed once upon reception (and after validation) of an Event and is only possible on + fields that are not defined on the incoming Event. + + For event types in categories 'business' or 'data' it's mandatory to use + metadata_enrichment strategy. For 'undefined' event types it's not possible to use this + strategy, since metadata field is not required. + + See documentation for the write operation for details on behaviour in case of unsuccessful + enrichment. + type: array + items: + type: string + enum: + - metadata_enrichment + + partition_strategy: + description: | + Determines how the assignment of the event to a partition should be handled. + + For details of possible values, see GET /registry/partition-strategies. + type: string + default: 'random' + + schema: + type: object + $ref: '#/definitions/EventTypeSchema' + description: | + The schema for this EventType. Submitted events will be validated against it. + + partition_key_fields: + type: array + items: + type: string + description: | + Required when 'partition_resolution_strategy' is set to 'hash'. Must be absent otherwise. + Indicates the fields used for evaluation the partition of Events of this type. + + If set it MUST be a valid required field as defined in the schema. + + default_statistics: + type: object + $ref: '#/definitions/EventTypeStatistics' + description: | + Statistics of this EventType used for optimization purposes. Internal use of these values + might change over time. 
+ (TBD: measured statistics) + + required: + - name + - category + - owning_application + - schema + + EventTypeSchema: + properties: + type: + type: string + enum: + - json_schema + description: | + The type of schema definition. Currently only json_schema (JSON Schema v04) is supported, but in the + future there could be others. + schema: + type: string + $ref: '#/definitions/EventTypeSchema' + description: | + The schema as string in the syntax defined in the field type. Failure to respect the + syntax will fail any operation on an EventType. + + To have a generic, undefined schema it is possible to define the schema as `"schema": + "{\"additionalProperties\": true}"`. + required: + - type + - schema + + EventTypeStatistics: + type: object + description: | + Operational statistics for an EventType. This data is generated by Nakadi based on the runtime + and might be used to guide changes in internal parameters. + + properties: + messages_per_minute: + type: integer + description: | + Write rate for events of this EventType. This rate encompasses all producers of this + EventType for a Nakadi cluster. + + Measured in event count per minute. + + message_size: + type: integer + description: | + Average message size for each Event of this EventType. Includes in the count the whole serialized + form of the event, including metadata. + Measured in bytes. + + read_parallelism: + type: integer + description: | + Amount of parallel readers (consumers) to this EventType. + + write_parallelism: + type: integer + description: | + Amount of parallel writers (producers) to this EventType. + required: + - messages_per_minute + - message_size + - read_parallelism + - write_parallelism + + BatchItemResponse: + description: | + A status corresponding to one individual Event's publishing attempt. + properties: + eid: + type: string + format: uuid + description: | + eid of the corresponding item. Will be absent if missing on the incoming Event. + publishing_status: + type: string + enum: + - submitted + - failed + - aborted + description: | + Indicator of the submission of the Event within a Batch. + + - "submitted" indicates successful submission, including commit on he underlying broker. + + - "failed" indicates the message submission was not possible and can be resubmitted if so + desired. + + - "aborted" indicates that the submission of this item was not attempted any further due + to a failure on another item in the batch. + + step: + type: string + enum: + - none + - validating + - enriching + - partitioning + - publishing + description: | + Indicator of the step in the publishing process this Event reached. + + In Items that "failed" means the step of the failure. + + - "none" indicates that nothing was yet attempted for the publishing of this Event. Should + be present only in the case of aborting the publishing during the validation of another + (previous) Event. + + - "validating", "enriching", "partitioning" and "publishing" indicate all the + corresponding steps of the publishing process. + detail: + type: string + description: | + Human readable information about the failure on this item. Items that are not "submitted" + should have a description. 
+ required: + - publishing_status \ No newline at end of file diff --git a/docs/api-spec-oai/nakadi-oai-current.yaml b/docs/api-spec-oai/nakadi-oai-current.yaml index 54fc363..a878be5 120000 --- a/docs/api-spec-oai/nakadi-oai-current.yaml +++ b/docs/api-spec-oai/nakadi-oai-current.yaml @@ -1 +1 @@ -nakadi-oai-0.5.1.yaml \ No newline at end of file +nakadi-oai-0.6.0.yaml \ No newline at end of file From 70658b3efbe0e77954b1491cee96f0adbb139016 Mon Sep 17 00:00:00 2001 From: dehora Date: Wed, 22 Jun 2016 10:15:11 +0200 Subject: [PATCH 2/4] Clarify cursor positioning --- docs/api-spec-oai/nakadi-oai-0.6.0.yaml | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/api-spec-oai/nakadi-oai-0.6.0.yaml b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml index ab31164..ca074e6 100644 --- a/docs/api-spec-oai/nakadi-oai-0.6.0.yaml +++ b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml @@ -89,7 +89,7 @@ info: times, etc), as well as their modification. - version: '0.6.0' + version: '0.6.1' contact: name: Team Aruha @ Zalando email: team-aruha+nakadi-maintainers@zalando.de @@ -399,9 +399,10 @@ paths: The event stream is formatted as a sequence of `EventStreamBatch`es separated by `\n`. Each `EventStreamBatch` contains a chunk of Events and a `Cursor` pointing to the **end** of the - chunk (i.e. last delivered Event). The cursor might specify the offset with the symbolic - value `BEGIN`, which will open the stream starting from the oldest available offset in the - partition. + chunk (i.e. last delivered Event). The default cursor position is at the front of the + stream, with new events that arrive being sent to the client. The cursor's position in the + stream can be specified by sending the value `BEGIN`, which opens the stream starting from + the oldest available offset in the partition. Currently the `application/x-json-stream` format is the only one supported by the system, but in the future other media types may be supported. From 14196203ab0682866e794df2c8244153a7a64ed4 Mon Sep 17 00:00:00 2001 From: dehora Date: Wed, 22 Jun 2016 10:18:37 +0200 Subject: [PATCH 3/4] Rebuild api reference --- docs/api-spec-generated/overview.md | 2 +- docs/api-spec-generated/paths.md | 7 ++++--- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/docs/api-spec-generated/overview.md b/docs/api-spec-generated/overview.md index 1f1be55..4c612e0 100644 --- a/docs/api-spec-generated/overview.md +++ b/docs/api-spec-generated/overview.md @@ -97,7 +97,7 @@ service. ### Version information -*Version* : 0.6.0 +*Version* : 0.6.1 ### Contact information diff --git a/docs/api-spec-generated/paths.md b/docs/api-spec-generated/paths.md index e0dc28d..0abb5fb 100644 --- a/docs/api-spec-generated/paths.md +++ b/docs/api-spec-generated/paths.md @@ -302,9 +302,10 @@ Starts a stream delivery for the specified partitions of the given EventType. The event stream is formatted as a sequence of `EventStreamBatch`es separated by `\n`. Each `EventStreamBatch` contains a chunk of Events and a `Cursor` pointing to the **end** of the -chunk (i.e. last delivered Event). The cursor might specify the offset with the symbolic -value `BEGIN`, which will open the stream starting from the oldest available offset in the -partition. +chunk (i.e. last delivered Event). The default cursor position is at the front of the +stream, with new events that arrive being sent to the client. 
The cursor's position in the +stream can be specified by sending the value `BEGIN`, which opens the stream starting from +the oldest available offset in the partition. Currently the `application/x-json-stream` format is the only one supported by the system, but in the future other media types may be supported. From 8dc7d4e4e00bf41f96cecac15d0ec599011a7668 Mon Sep 17 00:00:00 2001 From: dehora Date: Mon, 27 Jun 2016 11:05:39 +0100 Subject: [PATCH 4/4] wip --- docs/api-spec-oai/nakadi-oai-0.6.0.yaml | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/api-spec-oai/nakadi-oai-0.6.0.yaml b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml index ca074e6..7ffa98c 100644 --- a/docs/api-spec-oai/nakadi-oai-0.6.0.yaml +++ b/docs/api-spec-oai/nakadi-oai-0.6.0.yaml @@ -398,8 +398,8 @@ paths: Starts a stream delivery for the specified partitions of the given EventType. The event stream is formatted as a sequence of `EventStreamBatch`es separated by `\n`. Each - `EventStreamBatch` contains a chunk of Events and a `Cursor` pointing to the **end** of the - chunk (i.e. last delivered Event). The default cursor position is at the front of the + `EventStreamBatch` contains a chunk of Events and a `Cursor` pointing to the end of the + partition (i.e. last delivered Event), making the default cursor position at the front of the stream, with new events that arrive being sent to the client. The cursor's position in the stream can be specified by sending the value `BEGIN`, which opens the stream starting from the oldest available offset in the partition.
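
As an illustration of the streaming behaviour the patches above describe, a minimal consumer sketch against the unmanaged API in this spec could look like the following. This is a non-normative sketch assuming Python with the `requests` library; the base URL, event type name, and token are placeholders, not defined by the spec.

```python
# Minimal sketch of an unmanaged-API consumer; the endpoint shape follows
# GET /event-types/{name}/events as defined in the spec above.
import json
import requests

BASE_URL = "https://nakadi.example.com"      # hypothetical deployment
EVENT_TYPE = "order.order_cancelled"         # example name taken from the spec
TOKEN = "..."                                # OAuth2 token with nakadi.event_stream.read

# Read partition 0 from the oldest available offset. Omitting the X-nakadi-cursors
# header instead starts at the front of the stream (only newly arriving events),
# which is the default cursor position this patch series clarifies.
cursors = [{"partition": "0", "offset": "BEGIN"}]

resp = requests.get(
    f"{BASE_URL}/event-types/{EVENT_TYPE}/events",
    headers={
        "Authorization": f"Bearer {TOKEN}",
        "X-nakadi-cursors": json.dumps(cursors),
    },
    params={"batch_limit": 10, "batch_flush_timeout": 30, "stream_limit": 100},
    stream=True,
)
resp.raise_for_status()

# The body is a sequence of EventStreamBatch objects separated by '\n'.
# Batches with no "events" array act as keep-alives; the cursor is present either way.
for line in resp.iter_lines():
    if not line:
        continue
    batch = json.loads(line)
    cursor = batch["cursor"]                 # points at the last delivered event
    for event in batch.get("events", []):
        print(cursor["partition"], cursor["offset"], event)
```

With these parameters the server closes the stream after at most 100 events; `stream_timeout` and `stream_keep_alive_limit` offer further control, as described in the spec. Tracking the last seen cursor per partition remains the client's responsibility under the unmanaged API.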