From 18365316f70b588db30be378bb33e706455b3868 Mon Sep 17 00:00:00 2001 From: card Date: Thu, 26 Oct 2023 22:41:48 -0400 Subject: [PATCH 1/8] initial streaming ADR draft --- .../2023-11-07_029-streaming-persistence.md | 72 +++++++++++++++++++ 1 file changed, 72 insertions(+) create mode 100644 docs/adr/2023-11-07_029-streaming-persistence.md diff --git a/docs/adr/2023-11-07_029-streaming-persistence.md b/docs/adr/2023-11-07_029-streaming-persistence.md new file mode 100644 index 00000000000..3dc50642c31 --- /dev/null +++ b/docs/adr/2023-11-07_029-streaming-persistence.md @@ -0,0 +1,72 @@ +--- +slug: 29 +title: | + 29. EventServer abstraction for event persistence +authors: [] +tags: [] +--- + +## Status +N/A + +## Context + +The current Hydra API is materialized as a firehose WebSocket connection: upon connecting, a client receives a deluge of information, such as all past transactions. Additionally, there is no way to pick and choose which messages to receive, and not all events are exposed via the API. + +This forces some awkward interactions when integrating with a Hydra node externally, such as when using it as a component in other protocols, or trying to operationalize the Hydra server for monitoring and persistence. 
Here are just a few of the things that prove to be awkward:
 - An application would need to store its own notion of progress through the event log, so it knows which messages to "ignore" on reconnection
 - An application that wanted to write these hydra events to a message broker like RabbitMQ or Kinesis would need its own custom integration layer
 - An application that only cared about a limited set of message types may be overwhelmed (and slowed down) by the barrage of messages on connection
 - An application that wanted a deeper view into the workings of the hydra protocol would have no access to several of the internal event types

Additionally, many of the changes that would need to happen to solve any of these would apply equally well to the on-disk persistence the Hydra node currently provides.

-------------------------------------

- Current client API is a stateful WebSocket connection. The client API has its own incrementally persisted state that must be maintained, and includes a monotonically increasing `Seq` ID, but this is based on the events a client observes, not the Hydra host's event log.

- Clients must be flexible and ready to handle many different events

- There is no way to simply hand off transactions to the hydra-node currently; a full connection must be initiated before observed transactions can be applied, and bulky JSON objects can slow down "bulk" operations

- Many applications, like SundaeSwap's Gummiworm Protocol, do not need the full general API, and would benefit from any performance improvement

- The challenge of subscribing to event types is complex to handle, but would be applicable to many parts of Hydra, not just for subscribing to new transactions.
It's a good fit for handing events off to a message queue (e.g. MQTT, Kafka), probably from a dedicated process (see "Chain Server" below)
  - Could also directly use a "compatible" spec like STOMP or MQTT, but that would lock in implementation details

- Previous mock chain using ZeroMQ was [removed][zmq] as part of [#119][#119], due to complexity and the impression that ZeroMQ was unmaintained (primary author no longer contributing, new tagged release versions not being published)
  - This mock chain was used to mock the layer 1 chain, not the L2 ledger
- Attempt in February 2023 to externalize the chainsync server as part of [#230][#230]
  - Similar to [#119][#119], but with the "Chain Server" component in charge of translating Hydra Websocket messages into a direct chain, mock chain, or any hypothetical message queue, not necessarily just ZeroMQ
  - Deemed low priority due to an ambiguous use-case at the time. SundaeSwap's Gummiworm Protocol would benefit from the additional control enabled by the Event Server

# Decision

A new abstraction, the EventServer, will be introduced. Each internal hydra event will be sent to the event server, which will be responsible for persisting that event and returning a durable, monotonically increasing global event ID. Different implementations of the event server can handle this event differently.
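To make the shape of this decision concrete, here is a minimal Haskell sketch of such an EventServer abstraction. All names and signatures below are illustrative assumptions, not part of the decision itself:

```haskell
import Data.Word (Word64)

-- A durable, monotonically increasing identifier assigned to each event.
newtype EventId = EventId Word64
  deriving (Eq, Ord, Show)

-- Hypothetical EventServer handle: each implementation decides how to
-- persist (or discard) an event, but must return its assigned EventId.
newtype EventServer e m = EventServer
  { -- | Persist one internal hydra event and return its global ID.
    putEvent :: e -> m EventId
  }
```

Under this sketch, a MultiplexingEventServer would simply wrap one primary `EventServer` (whose returned ID it uses) and forward the same event to each secondary server in sequence.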
Initially, we will implement the following:
  - MockEventServer, which increments a counter for the ID, and discards the event
  - FileEventServer, which increments a counter for the ID, and encapsulates the existing file persistence mechanism
  - SocketEventServer, which increments a counter for the ID, and writes the event to an arbitrary unix socket
  - WebsocketBroadcastEventServer, which broadcasts the publicly visible events over the websocket API
  - UTxOStateServer, which increments a counter for the ID, and updates a file with the latest UTxO after the event
    - Generalizes the UTxO persistence mechanism introduced in [offline mode][offline-mode]
    - May be configured to only persist UTxO state periodically and/or on SIGTERM, allowing for performant in-memory Hydra usage. This is intended for offline mode use cases where the transactions ingested are already persisted elsewhere.
    - One configuration we expect to be common and useful is a MultiplexingEventServer configured with a primary MockEventServer and a secondary UTxOStateServer, which will persist the UTxO state to disk.
  - MultiplexingEventServer, which has a primary event server (which it writes to first, and whose ID it uses) and a list of secondary event servers (which it writes to in sequence, but whose IDs it discards)

New configuration options will be introduced for choosing between and configuring these event server implementations. The default will be a multiplexing event server, using the file event server as its primary event server, and a websocket broadcast event server as its secondary event server.

## Consequences

The primary consequence of this is to enable deeper integration and better operationalization of the Hydra node.
For example: +- Users may now use the SocketEventServer to implement custom integrations with existing ecosystem tools +- To avoid the overhead of a unix socket, they may submit pull requests to add integrations with the most popular tools +- Developers may more easily maintain downstream forks with custom implementations that aren't appropriate for community-wide adoption, such as the Gummiworm Protocol +- Developers can get exposure to events that aren't normally surfaced in the websocket API +- Logging, metrics, and durability can be improved or tailored to the application through such integrations + +Note that while a future goal of this work is to improve the websocket API, making it more stateless and "subscription" based, this ADR does not seek to make those changes, only make them easier to implement in the future. + +[offline-mode]: 2023-10-16_028_offline_adr.md +[#119]: https://github.com/input-output-hk/hydra/pull/119 +[zmq]: https://github.com/input-output-hk/hydra/blob/41598800a9e0396c562a946780909732e5332245/CHANGELOG.md?plain=1#L710- +[#230]: https://github.com/input-output-hk/hydra/pull/230 \ No newline at end of file From 5c169a7e9af21b4e630949f0868cc544761e5d37 Mon Sep 17 00:00:00 2001 From: card Date: Tue, 21 Nov 2023 08:57:39 -0500 Subject: [PATCH 2/8] streaming persistence ADR proposal revision --- .../2023-11-07_029-streaming-persistence.md | 94 ++++++++++++++----- 1 file changed, 70 insertions(+), 24 deletions(-) diff --git a/docs/adr/2023-11-07_029-streaming-persistence.md b/docs/adr/2023-11-07_029-streaming-persistence.md index 3dc50642c31..4c0bf090076 100644 --- a/docs/adr/2023-11-07_029-streaming-persistence.md +++ b/docs/adr/2023-11-07_029-streaming-persistence.md @@ -2,7 +2,7 @@ slug: 29 title: | 29. 
EventServer abstraction for event persistence -authors: [] +authors: [@cardenaso11] tags: [] --- @@ -14,10 +14,10 @@ N/A The current Hydra API is materialized as a firehose WebSocket connection: upon connecting, a client receives a deluge of information, such as all past transactions. Additionally, there is no way to pick and choose which messages to receive, and not all events are exposed via the API. This forces some awkward interactions when integrating with a Hydra node externally, such as when using it as a component in other protocols, or trying to operationalize the Hydra server for monitoring and persistence. Here's just a few of the things that prove to be awkward: - - An application would need to store it's own notion of progress through the event log, so it knew which messages to "ignore" on reconnection + - An application that only cared about certain Hydra resources (like say, the current UTxO set), would be unable to subscribe to just that resource. The application would be made to either parse the Hydra incremental persistence log directly (which may change), or subscribe to the full firehose of Websocket API events. - An application that wanted to write these hydra events to a message broker like RabbitMQ or Kinesis would need it's own custom integration layer - - An application that only cared about a limited set of message types may be overwhelmed (and slowed down) by the barage of messages on connection - - An application that wanted a deeper view into the workings of the hydra protocol would have no access to several of the internal event types + - An application that wishes to build on Hydra would need to fork the Hydra project in order to change how file persistence worked. + Additionally, much of the changes that would need to happen to solve any of these would apply equally well to the on-disk persistence the Hydra node currently provides. 
@@ -27,45 +27,91 @@ Additionally, much of the changes that would need to happen to solve any of thes - Client must be flexible and ready to handle many different events -- There is no way to simply hand off transactions to the hyrda-node currently, a full connection must be initiated before observed transactions can be applied, and bulky JSON objects can slow down "bulk" operations +- There is no way to simply hand off transactions to the hyrda-node currently, a full connection must be initiated before observed transactions can be applied. -- Many applications, like SundaeSwap's Gummiworm Protocol, do not need need the full general API, and benefit from any performance improvement +- Using history=0 in the query string allows ignoring prior events, but it's not possible to ignore events going forward, only disabling UTxO on Snapshot events. -- The challenge of subscribing to event types is complex to handle, but would be applicable to many parts of Hydra, not just for subscribing to new transactions. It's a good fit for passing stuff off to a message queue (MQTT, Kafka, whatever), probably from a dedicated process (see "Chain Server" below) - - Could also just directly use a "compatible" spec like STOMP or MQTT, but that would lock in implementation details +- Many applications, like SundaeSwap's Gummiworm Protocol, would benefit from using the API in ways which are currently not supported +- Custom event management is intended to be for a minority of Hydra users, that intend to integrate heavily with Hydra. 
It's a good fit for handing events off to a message queue (e.g. MQTT, Kafka) for further processing, probably from a dedicated process (see "Chain Server" below)
+  - The event management is intended to modularize and unify the Websocket API and (incremental, full) persistence
+  - The default event management is intended to be transparent to most users
+  - There exist "highly compatible" specifications like STOMP or MQTT, but supporting these directly as a substitute for the API or persistence would lock in a significant amount of implementation details
+
+
- Previous mock chain using ZeroMQ was [removed][zmq] as part of [#119][#119], due to complexity and the impression that ZeroMQ was unmaintained (primary author no longer contributing, new tagged release versions not being published)
  - This mock chain was used to mock the layer 1 chain, not the L2 ledger
- Attempt in February 2023 to externalize the chainsync server as part of [#230][#230]
  - Similar to [#119][#119], but with the "Chain Server" component in charge of translating Hydra Websocket messages into a direct chain, mock chain, or any hypothetical message queue, not necessarily just ZeroMQ
  - Deemed low priority due to an ambiguous use-case at the time. SundaeSwap's Gummiworm Protocol would benefit from the additional control enabled by the Event Server

-# Decision
+- Offline mode is intended to persist UTxO state to a file for simplified offline-mode persistence. As a standalone feature, the interface would be too ad-hoc. A less ad-hoc way to keep a single updated UTxO state file would instead allow for keeping an updated file for any Hydra resource.

-A new abstraction, the EventServer, will be introduced. Each internal hydra event will be sent to the event server, which will be responsible for persisting that event and returning a durable, monotonically increasing global event ID. Different implementations of the event server can handle this event differently.
Initially, we will implement the following:
-  - MockEventServer, which increments a counter for the ID, and discards the event
-  - FileEventServer, which increments a counter for the ID, and encapsulates the existing file persistence mechanism
-  - SocketEventServer, which increments a counter for the ID, and writes the event to an arbitrary unix socket
-  - WebsocketBroadcastEventServer, which broadcasts the publicly visible events over the websocket API
-  - UTxOStateServer, which increments a counter for the ID, and updates a file with the latest UTxO after the event
-    - Generalizes the UTxO persistence mechanism introduced in [offline mode][offline-mode]
-    - May be configured to only persist UTxO state periodically and/or on SIGTERM, allowing for performant in-memory Hydra usage. This is intended for offline mode use cases where the transactions ingested are already persisted elsewhere.
-    - One configuration we expect to be common and useful is a MultiplexingEventServer configured with a primary MockEventServer and a secondary UTxOStateServer, which will persist the UTxO state to disk.
-  - MultiplexingEventServer, which has a primary event server (which it writes to first, and whose ID it uses) and a list of secondary event servers (which it writes to in sequence, but whose IDs it discards)
-
-New configuration options will be introduced for choosing between and configuring these event server implementations. The default will be a multiplexing event server, using the file event server as its primary event server, and a websocket broadcast event server as its secondary event server.

# Decision
+Each internal hydra event will have a durable, monotonically increasing event ID, ordering all the internal events in the persistence log.
+
+A new abstraction, the EventSink, will be introduced. The node's state will contain a non-empty list of EventSinks.
Each internal hydra event will be sent to a non-empty list of event sinks, which will be responsible for persisting or serving that event in a specific manner. When multiple event sinks are specified, they run in order. The active event sinks can change at runtime. Initially, we will implement the following:
+  - MockEventSink, which discards the event. This is exclusive to offline mode, since discarding events would change the semantics of the Hydra node in online mode. This is the only way to have an "empty" EventSink list
+  - APIBroadcastEventSink, which broadcasts the publicly visible resource events over the websocket API.
+    - Subsumes the existing Websocket API.
+    - Can be created via CLI subcommand if the Websocket client IP is known. Can be created at runtime by the top-level API (APIServerEventSource)
+    - Two modes:
+      - Full event log mode. Similar to the existing Websocket API; broadcasts all events.
+      - Single-resource event log mode. Broadcasts the state changes of a single resource.
+    - One APIBroadcastEventSink per listener. A Hydra node running with no one listening would have 0 APIBroadcastEventSinks in the EventSink list
+    - Establishing a websocket connection will add a new event sink to handle broadcasting messages
+    - All resources should support the JSON content type; the UTxO and Tx resources should also support CBOR.
+  - APIBroadcastResourceSink, which broadcasts the latest resource after a StateChanged event
+    - Runs in single-resource event log mode, broadcasting the current state of a single resource.
+  - EventFileSink, which updates a file with the state changed by a StateChanged event
+    - Two modes:
+      - Full event log mode. Encapsulates the existing incremental file persistence. Appends all server events incrementally to a file.
+        - One of these in the EventSink list is required in online mode, for proper Hydra semantics
+      - Single-resource event log mode.
Incrementally appends to an event log file for a single resource
+        - Persists StateChanged changes
+  - ResourceFileSink, which updates a file with the latest resource after a StateChanged event
+    - Two modes:
+      - Full event log mode. Encapsulates the existing non-incremental full file persistence mechanism, writing the latest full state to a file after each server event.
+      - Single-resource event log mode. Maintains an up-to-date file for a single resource
+        - Consuming an up-to-date single resource will no longer be coupled to the overall Hydra state format, only to the encoding schema of that particular resource
+    - Generalizes the UTxO persistence mechanism previously discussed in [offline mode][offline-mode]
+    - May be configured to only persist resource state:
+      - Periodically, if the last write was more than a certain interval ago
+      - On graceful shutdown/SIGTERM
+      - Allows for performant in-memory Hydra usage, for offline mode use cases where the transactions ingested are already persisted elsewhere, and only certain resources are considered important
+
+  - One configuration we expect to be common and useful is a ResourceFileSink on the UTxO resource in tandem with a MockEventSink in offline mode.
+
+The event sinks will be configured via a new subcommand ("initial-sinks"), which takes an unbounded list of positional arguments. Each positional argument adds a sink to the initial event sink list. There is one argument constructor per EventSink type. Arguments are added in-order to the initial EventSink list. The default parameters configure an EventFileSink for full incremental persistence.
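As a purely illustrative Haskell sketch of the shape this sink abstraction could take (every name below is an assumption, not a finalized design):

```haskell
-- Illustrative sketch only: names and structure are assumptions.
-- A sink observes either every server event, or the state changes
-- of a single resource.
data SinkMode
  = FullEventLog                -- observe every server event
  | SingleResource ResourceKind -- observe one resource's state changes
  deriving (Eq, Show)

-- Placeholder for the resources a sink could track (e.g. UTxO, Tx).
data ResourceKind = UTxOResource | TxResource
  deriving (Eq, Show)

-- The node would fold each StateChanged event through its non-empty
-- list of sinks, in order.
data EventSink e m = EventSink
  { sinkMode :: SinkMode
  , onEvent :: e -> m ()
  }
```

An EventFileSink and a ResourceFileSink would then differ only in their `onEvent` action (append to a log vs. rewrite a per-resource file), while sharing the same two `SinkMode`s described above.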
+
+The top-level API will change to implement the API changes described in [ADR 25][adr-25]
+  - Top level Websocket subscription adds a Full event log EventFileSink
+  - API on /vN/ (for some N) will feature endpoints for different resources
+    - POST verbs emit state change events to modify the resource in question
+    - Websocket upgrades on GET emit state change events for the EventSink list (itself a resource) to establish new ongoing Websocket client subscriptions
+      - This will expose the new single-resource APIBroadcastEventSink and APIBroadcastResourceSink
+
+
+A new abstraction, the EventSource, will be introduced.
+  - APIServerEventSource
+    - Top level API
+    - Top level Websocket API adds a Full event log EventFileSink
+    - Handles non-websocket-upgraded single-shot HTTP API verbs [ADR 25][adr-25]
+      - POST verbs emit state change events to modify the resource in question
+      - Websocket upgraded verbs modify the EventSink list (itself a resource) to establish new ongoing subscribers

## Consequences

The primary consequence of this is to enable deeper integration and better operationalization of the Hydra node.
For example:
-- Users may now use the SocketEventServer to implement custom integrations with existing ecosystem tools
-- To avoid the overhead of a unix socket, they may submit pull requests to add integrations with the most popular tools
+- Users may now use the new sinks to implement custom integrations with existing ecosystem tools
+- Users may use the file sinks to reduce overhead significantly in Offline mode
 - Developers may more easily maintain downstream forks with custom implementations that aren't appropriate for community-wide adoption, such as the Gummiworm Protocol
-- Developers can get exposure to events that aren't normally surfaced in the websocket API
 - Logging, metrics, and durability can be improved or tailored to the application through such integrations

 Note that while a future goal of this work is to improve the websocket API, making it more stateless and "subscription" based, this ADR does not seek to make those changes, only make them easier to implement in the future.

+[adr-25]: https://hydra.family/head-protocol/adr/25/
 [offline-mode]: 2023-10-16_028_offline_adr.md
 [#119]: https://github.com/input-output-hk/hydra/pull/119
 [zmq]: https://github.com/input-output-hk/hydra/blob/41598800a9e0396c562a946780909732e5332245/CHANGELOG.md?plain=1#L710-
 [#230]: https://github.com/input-output-hk/hydra/pull/230
\ No newline at end of file

From ca81345bc5264b3fb2dfeb7e73c4ae89b16cbc57 Mon Sep 17 00:00:00 2001
From: card
Date: Tue, 5 Dec 2023 11:43:31 -0500
Subject: [PATCH 3/8] Narrow and specify scope of streaming ADR

Co-authored-by: Pi Lanningham
---
 .../2023-11-07_029-streaming-persistence.md | 112 +++---------------
 1 file changed, 17 insertions(+), 95 deletions(-)

diff --git a/docs/adr/2023-11-07_029-streaming-persistence.md b/docs/adr/2023-11-07_029-streaming-persistence.md
index 4c0bf090076..6d288c5f360 100644
--- a/docs/adr/2023-11-07_029-streaming-persistence.md
+++ b/docs/adr/2023-11-07_029-streaming-persistence.md
@@ -2,7 +2,7 @@ slug: 29
 title: |
   29.
EventServer abstraction for event persistence
-authors: [@cardenaso11]
+authors: [@cardenaso11, @quantumplation]
 tags: []
---

## Status
-N/A
+Draft

## Context

-The current Hydra API is materialized as a firehose WebSocket connection: upon connecting, a client receives a deluge of information, such as all past transactions. Additionally, there is no way to pick and choose which messages to receive, and not all events are exposed via the API.
+The Hydra node represents a significant engineering asset, providing layer 1 monitoring, peer-to-peer consensus, durable persistence, and an isomorphic Cardano ledger. Because of this, it is being eyed as a key building block not just in Hydra-based applications, but in other protocols as well.

-This forces some awkward interactions when integrating with a Hydra node externally, such as when using it as a component in other protocols, or trying to operationalize the Hydra server for monitoring and persistence. Here's just a few of the things that prove to be awkward:
- - An application that only cared about certain Hydra resources (like say, the current UTxO set), would be unable to subscribe to just that resource. The application would be made to either parse the Hydra incremental persistence log directly (which may change), or subscribe to the full firehose of Websocket API events.
- - An application that wanted to write these hydra events to a message broker like RabbitMQ or Kinesis would need it's own custom integration layer
- - An application that wishes to build on Hydra would need to fork the Hydra project in order to change how file persistence worked.
-
+One remaining difficulty in integrating Hydra into a fully productionized software stack is the persistence model. Currently, the Hydra node achieves durability by writing a sequence of "StateChanged" events to disk.
If a node is interrupted, upon restarting, it can replay just these events, ignoring their corresponding effects, to recover the same internal state it had when it was interrupted. However, downstream consumers don't have the same luxury. -Additionally, much of the changes that would need to happen to solve any of these would apply equally well to the on-disk persistence the Hydra node currently provides. - -------------------------------------- - -- Current client API is a stateful WebSocket connection. Client API has its own incrementally persisted state that must be maintained, and includes a monotonically increasing `Seq` ID, but this is based off the events a client observes, not the Hydra host's event log. - -- Client must be flexible and ready to handle many different events - -- There is no way to simply hand off transactions to the hyrda-node currently, a full connection must be initiated before observed transactions can be applied. - -- Using history=0 in the query string allows ignoring prior events, but it's not possible to ignore events going forward, only disabling UTxO on Snapshot events. - -- Many applications, like SundaeSwap's Gummiworm Protocol, would benefit from using the API in ways which are currently not supported - -- Custom event management is intended to be for a minority of Hydra users, that intend to integrate heavily with Hydra. 
It's a good fit for passing stuff off to a message queue (MQTT, Kafka, whatever) for further processing, probably from a dedicated process (see "Chain Server" below) - - The event management is intended to modularize and unify the Websocket API, and (incremental, full) persistence - - The default event management is intended to be transparent to most users - - There exist "highly compatible" specifications like STOMP or MQTT, but supporting these directly as a substitute for API or persistence, would lock in a significant amount of implementation details - - - -- Previous mock chain using ZeroMQ was [removed][zmq] as part of [#119][#119], due to complexity and feeling of ZeroMQ being unmaintained (primary author no longer contributing, new tag release versions not being published) - - This mock chain was used to mock the layer L1 chain, not the L2 ledger -- Attempt in February 2023 to externalize chainsync server as part of [#230][#230] - - Similar to [#119][#119], but the "Chain Server" component in charge of translating Hydra Websocket messages into direct chain, mock chain, or any hypothetical message queue, not necessarily just ZeroMQ - - Deemed low priority due to ambiguous use-case at the time. SundaeSwap's Gummiworm Protocol would benefit from the additional control enabled by the Event Server - -- Offline mode intended to persist UTxO state to a file for simplified offline-mode persistence. As a standalone feature, the interface would be too ad-hoc. A less ad-hoc way to keep a single updated UTxO state file, would instead allow for keeping an updated file for any Hydra resource. +We propose generalizing the persistence mechanism to open the door to a plugin based approach to extending the Hydra node. # Decision -Each internal hydra event will have a durable, monotonically increasing event ID, ordering all the internal events in the persistence log. -A new abstraction, the EventSink, will be introduced. 
The node's state will contain a non-empty list of EventSinks. Each internal hydra event will be sent to a non-empty list of event sinks, which will be responsible for persisting or serving that event in a specific manner. When multiple event sinks are specified, they run in order. Active event can change at runtime. Initially, we will implement the following: - - MockEventSink, which discards the event. This is exclusive to offline mode, since this would change the semantics of the Hydra node in online mode. This is the only way to have an "empty" EventSink list - - APIBroadcastEventSink, which broadcasts the publicly visible resource events over the websocket API. - - Subsumes the existing Websocket API. - - Can be created via CLI subcommand if Websocket client IP is known. Can be created at runtime by the top-level API (APIServerEventSource) - - Two modes: - - Full event log mode. Similar to existing Websocket API, broadcasts all events. - - Single-resource event log mode. Broadcasts the state changes of a single resource. - - One APIBroadcastEventSink per listener. A Hydra node running with no one listening would have 0 APIBroadcastEventSink's in the EventSink list - - Establishing a websocket connection will add a new event sink to handle broadcasting messages - - Resources should all support JSON content type. UTXO resource, Tx resource, should support CBOR. - - APIBroadcastResourceSink, which broadcasts the latest resource after a StateChanged event - - Runs in single-resource event log mode, broadcasting the current state of a single resource. - - EventFileSink, which updates a file with the state changed by a StateChanged event - - Two modes: - - Full event log mode. Encapsulates the existing incremental file persistence. Appends all server events incrementally to a file. - - One of these in the EventSink list is required in Online mode, for proper Hydra semantics - - Single-resource event log mode. 
Incrementally appends an event log file for a single resource - - Persists StateChanged changes - - ResourceFileSink, which updates a file with the latest resource after a StateChanged event - - Two modes: - - Full event log mode. Encapsulates the existing non-incremental full file persistence mechanism. Appends all server events incrementally to a file. - - Single-resource event log mode. Maintains an up-to-date file for a single resource - - Consuming an up-to-date single resource will no longer be coupled with overall Hydra state format, only the encoding schema of that particular resource - - Generalizes the UTxO persistence mechanism previously discussed in [offline mode][offline-mode] - - May be configured to only persist resource state: - - Periodically, if the last write was more than a certain interval ago - - On graceful shutdown/SIGTERM - - Allows for performant in-memory Hydra usage, for offline mode usecases where the transactions ingested are already persisted elsewhere, and only certain resources are considered important +We propose adding a "persistence combinator", which can combine one or more "persistenceIncremental" instances. - - One configuration which we expect will be common and useful, is the usage of a ResourceFileSink on the UTxO resource in tandem with a MockEventServer in offline mode. +When appending to the combinator, it will forward the append to each persistence mechanism. -The event server will be configured via a new subcommand ("initial-sinks"), which takes an unbounded list of positional arguments. Each positional argument adds a sink to the initial event sink list. There is one argument constructor per EventSink type. Arguments are added in-order to the initial EventSink list. The default parameters configure an EventFileSink for full incremental persistence. +As the node starts up, as part of recovering the node state, it will also ensure "at least once" semantics for each persistence mechanism. 
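A rough Haskell sketch of such a combinator follows. The record fields and function names here are illustrative assumptions based on the "persistenceIncremental" handle mentioned above, not the actual Hydra interface: every append is forwarded to all wrapped instances, while events are loaded back from one designated instance.

```haskell
-- Illustrative only: the real persistence interface may differ.
data PersistenceIncremental e m = PersistenceIncremental
  { append :: e -> m ()
  , loadAll :: m [e]
  }

-- Combine several persistence instances: appends fan out to all of
-- them; loads come from the single designated "loadFrom" instance.
combinePersistence ::
  Monad m =>
  PersistenceIncremental e m ->   -- instance used for loading
  [PersistenceIncremental e m] -> -- instances that only observe appends
  PersistenceIncremental e m
combinePersistence loadFrom others =
  PersistenceIncremental
    { append = \e -> mapM_ (\p -> append p e) (loadFrom : others)
    , loadAll = loadAll loadFrom
    }
```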
It will maintain a notion of a "primary" instance, from which events are loaded, and "secondary instances", which are given the opportunity to re-observe each event.
+
+Each persistence mechanism will be responsible for its own durability; for example, it may maintain its own checkpoint, and only re-broadcast the event if it is after the checkpoint.
+
+The exact implementation details of the above are left unspecified, to allow flexibility and experimentation on the mechanism used to realize these goals.

## Consequences

-The primary consequence of this is to enable deeper integration and better operationalization of the Hydra node.
For example:
-- Users may now use the new sinks to implement custom integrations with existing ecosystem tools
-- Users may use the file sinks to reduce overhead significantly in Offline mode
-- Developers may more easily maintain downstream forks with custom implementations that aren't appropriate for community-wide adoption, such as the Gummiworm Protocol
-- Logging, metrics, and durability can be improved or tailored to the application through such integrations
-
-Note that while a future goal of this work is to improve the websocket API, making it more stateless and "subscription" based, this ADR does not seek to make those changes, only make them easier to implement in the future.
-
-[adr-25]: https://hydra.family/head-protocol/adr/25/
-[offline-mode]: 2023-10-16_028_offline_adr.md
-[#119]: https://github.com/input-output-hk/hydra/pull/119
-[zmq]: https://github.com/input-output-hk/hydra/blob/41598800a9e0396c562a946780909732e5332245/CHANGELOG.md?plain=1#L710-
-[#230]: https://github.com/input-output-hk/hydra/pull/230
\ No newline at end of file
+Here are the consequences we foresee from this change:
+- The default operation of the node remains unchanged
+- Projects forking the hydra node have a natively supported mechanism to extend node persistence
+- These extensions can preserve robust "at least once" semantics for each hydra event
+- Sundae Labs will build a "Save transaction batches to S3" proof of concept extension
+- Sundae Labs will build a "Scrolls source" proof of concept integration
+- This may also enable future ADRs for dynamically loaded plugins without having to fork the Hydra node at all
\ No newline at end of file

From da760c18e709045d56db84fe3ab2fe228f2cf37f Mon Sep 17 00:00:00 2001
From: Sebastian Nagel
Date: Tue, 12 Dec 2023 16:59:01 +0100
Subject: [PATCH 4/8] Rewrite ADR29 to be about EventSource/EventSink

---
 .../2023-11-07_029-streaming-persistence.md | 66 ++++++++++++++-----
 1 file changed, 49 insertions(+), 17 deletions(-)

diff --git
a/docs/adr/2023-11-07_029-streaming-persistence.md b/docs/adr/2023-11-07_029-streaming-persistence.md
index 6d288c5f360..f0ff8c19df2 100644
--- a/docs/adr/2023-11-07_029-streaming-persistence.md
+++ b/docs/adr/2023-11-07_029-streaming-persistence.md
@@ -1,8 +1,8 @@
 ---
 slug: 29
 title: |
-  29. EventServer abstraction for event persistence
-authors: [@cardenaso11, @quantumplation]
+  29. EventSource & EventSink abstractions
+authors: [@cardenaso11, @quantumplation, @ch1bo]
 tags: []
 ---

@@ -11,30 +11,62 @@
 Draft

 ## Context

-The Hydra node represents a significant engineering asset, providing layer 1 monitoring, peer to peer consensus, durable persistence, and an isomorphic Cardano ledger. Because of this, it is being eyed as a key building block not just in Hydra based applications, but other protocols as well.
+* The Hydra node represents a significant engineering asset, providing layer 1 monitoring, peer to peer consensus, durable persistence, and an isomorphic Cardano ledger. Because of this, it is being eyed as a key building block not just in Hydra based applications, but other protocols as well.

-One remaining difficulty in integrating Hydra into a fully productionalized software stack is the persistence model. Currently, the Hydra node achieves durability by writing a sequence of "StateChanged" events to disk. If a node is interrupted, upon restarting, it can replay just these events, ignoring their corresponding effects, to recover the same internal state it had when it was interrupted. However, downstream consumers don't have the same luxury.
+* Currently the `hydra-node` uses a very basic persistence mechanism for its internal `HeadState`, that is saving `StateChanged` events to file on disk and reading them back to load and re-aggregate the `HeadState` upon startup.
+  - Some production setups would benefit from storing these events to a service like Amazon Kinesis data stream instead of local files.
-We propose generalizing the persistence mechanism to open the door to a plugin based approach to extending the Hydra node.
+* The `hydra-node` websocket-based API is the only available event stream right now and might not fit all purposes.
+  - See also ADR [3](/adr/3) and [25](/adr/25)
+  - Internally, this is realized as a single `Server` handle which can `sendOutput :: ServerOutput tx -> m ()`
+  - These `ServerOutput`s closely relate to `StateChanged` events and `ClientEffect`s are yielded by the logic layer often together with the `StateChanged`. For example:
+    ```hs
+    onInitialChainAbortTx newChainState committed headId =
+      StateChanged HeadAborted{chainState = newChainState}
+        <> Effects [ClientEffect $ ServerOutput.HeadIsAborted{headId, utxo = fold committed}]
+    ```
+
+* Users of `hydra-node` are interested in adding alternative implementations for storing, loading and consuming events of the Hydra protocol.

# Decision

-We propose adding a "persistence combinator", which can combine one or more "persistenceIncremental" instances.
+* We create two new interfaces in the `hydra-node` architecture:
+
+  - ```data EventSource e m = EventSource { getEvents :: m [e] }```
+  - ```data EventSink e m = EventSink { putEvent :: e -> m () }```
+
+* We realize our current `PersistenceIncremental` used for persisting `StateChanged` events is both an `EventSource` and an `EventSink`

-When appending to the combinator, it will forward the append to each persistence mechanism.
+* We drop the `persistence` from the main handle `HydraNode tx m`, add **one** `EventSource` and allow **many** `EventSinks`

-As the node starts up, as part of recovering the node state, it will also ensure "at least once" semantics for each persistence mechanism. It will maintain a notion of a "primary" instance, from which events are loaded, and "secondary instances", which are given the opportunity to re-observe each event.

+```hs
+data HydraNode tx m = HydraNode
+  { -- ...
+  , eventSource :: EventSource (StateChanged tx) m
+  , eventSinks :: [EventSink (StateChanged tx) m]
+  }
+```

-Each persistence mechanism will be responsible for its own durability; for example, it may maintain its own checkpoint, and only re-broadcast the event if it is after the checkpoint.
+* The `stepHydraNode` main loop does call `putEvent` on all `eventSinks` in sequence.
+
+  - TBD: transaction semantics? what happens if one fails?

-The exact implementation details of the above are left unspecified, to allow some flexibility and experimentation on the exact mechanism to realize these goals.
+* The default `hydra-node` main loop does use the file-based `EventSource` and a single file-based `EventSink` (using the same file).
+
+* TBD: As the node starts up, when loading events from the `EventSource`, it will also ensure "at least once" semantics for each `EventSink`.
+
+* We realize that the `EventSource` and `EventSink` handles, as well as their aggregation in `HydraNode` are used as an API by forks of the `hydra-node` and try to minimize changes to it.

## Consequences

-Here are the consequences we foresee from this change:
-- The default operation of the node remains unchanged
-- Projects forking the hydra node have a natively supported mechanism to extend node persistence
-- These extensions can preserve robust "at least once" semantics for each hydra event
-- Sundae Labs will build a "Save transaction batches to S3" proof of concept extension
-- Sundae Labs will build a "Scrolls source" proof of concept integration
-- This may also enable future ADRs for dynamically loaded plugins without having to fork the Hydra node at all
\ No newline at end of file
+* TODO: Naming conflicts / overloaded terms -> should resolve by calling something not "Event"?
+
+* The default operation of the `hydra-node` remains unchanged
+* The API `Server` can be modelled and refactored as an `EventSink`
+  - TBD: Do `Network` and `Chain` parts qualify as `EventSink`s as well or shall those be triggered by `Effect`s still?
+* Projects forking the hydra node have a natively supported mechanism to extend node persistence
+* These extensions can preserve robust "at least once" semantics for each hydra event
+* Sundae Labs can build a "Save transaction batches to S3" proof of concept `EventSink`
+* Sundae Labs can build a "Scrolls source" `EventSink`
+* Sundae Labs can build an "Amazon Kinesis" `EventSource` and `EventSink`
+* Extension points like `EventSource` and `EventSink` could be dynamically loaded as plugins without having to fork `hydra-node` (maybe in a future ADR)

From 57959233f758a75387d2f98350f0b2cb824c7acc Mon Sep 17 00:00:00 2001
From: Sebastian Nagel
Date: Tue, 12 Dec 2023 17:41:33 +0100
Subject: [PATCH 5/8] Rename ADR 29

---
 ...md => 2023-11-07_029-event-source-sink.md} | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)
 rename docs/adr/{2023-11-07_029-streaming-persistence.md => 2023-11-07_029-event-source-sink.md} (85%)

diff --git a/docs/adr/2023-11-07_029-streaming-persistence.md b/docs/adr/2023-11-07_029-event-source-sink.md
similarity index 85%
rename from docs/adr/2023-11-07_029-streaming-persistence.md
rename to docs/adr/2023-11-07_029-event-source-sink.md
index f0ff8c19df2..674e50b8a14 100644
--- a/docs/adr/2023-11-07_029-streaming-persistence.md
+++ b/docs/adr/2023-11-07_029-event-source-sink.md
@@ -59,14 +59,13 @@ data HydraNode tx m = HydraNode

 ## Consequences

-* TODO: Naming conflicts / overloaded terms -> should resolve by calling something not "Event"?
-
-* The default operation of the `hydra-node` remains unchanged
-* The API `Server` can be modelled and refactored as an `EventSink`
+* The default operation of the `hydra-node` remains unchanged.
+* There are other things called `Event` and `EventQueue(putEvent)` right now in the `hydra-node`. This is getting confusing and when we implement this, we should also rename several things first (tidying).
+* The API `Server` can be modelled and refactored as an `EventSink`.
+  - TBD: Do `Network` and `Chain` parts qualify as `EventSink`s as well or shall those be triggered by `Effect`s still?
+* Projects forking the hydra node have native extension points for producing and consuming events.
+* TBD: These extensions can preserve robust "at least once" semantics for each hydra event.
+* Sundae Labs can build a "Save transaction batches to S3" proof of concept `EventSink`.
+* Sundae Labs can build a "Scrolls source" `EventSink`.
+* Sundae Labs can build an "Amazon Kinesis" `EventSource` and `EventSink`.
+* Extension points like `EventSource` and `EventSink` could be dynamically loaded as plugins without having to fork `hydra-node` (maybe in a future ADR).
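The two handles settled on in the revisions above are small enough to sketch concretely. The following is illustrative only and not part of any patch: a single file acting as both `EventSource` and `EventSink`, mirroring the observation that the current `PersistenceIncremental` is both. The `fileEventPair` name and the `Show`/`Read` one-event-per-line format are assumptions for this sketch; the real node persists JSON.

```hs
import System.Directory (doesFileExist)

-- Handle definitions as given in the ADR text.
data EventSource e m = EventSource { getEvents :: m [e] }
data EventSink e m = EventSink { putEvent :: e -> m () }

-- A single file acts as both source and sink: appends go to the end of
-- the file, and loading reads every line back in order.
fileEventPair :: (Show e, Read e) => FilePath -> (EventSource e IO, EventSink e IO)
fileEventPair fp =
  ( EventSource
      { getEvents = do
          exists <- doesFileExist fp
          if exists
            then map read . lines <$> readFile fp
            else pure []
      }
  , EventSink
      { putEvent = \e -> appendFile fp (show e <> "\n") }
  )
```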
From c0f40f4cc6714b5a2cbd37b9b3b614cd1e24f5e0 Mon Sep 17 00:00:00 2001
From: Sebastian Nagel
Date: Thu, 14 Dec 2023 15:39:56 +0100
Subject: [PATCH 6/8] Define events to be re-submitted on startup

Also incorporate minor reviewer comments

---
 docs/adr/2023-11-07_029-event-source-sink.md | 31 +++++++++++++-------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/docs/adr/2023-11-07_029-event-source-sink.md b/docs/adr/2023-11-07_029-event-source-sink.md
index 674e50b8a14..7f0b8ec1200 100644
--- a/docs/adr/2023-11-07_029-event-source-sink.md
+++ b/docs/adr/2023-11-07_029-event-source-sink.md
@@ -3,7 +3,7 @@
 slug: 29
 title: |
   29. EventSource & EventSink abstractions
 authors: [@cardenaso11, @quantumplation, @ch1bo]
-tags: []
+tags: [Draft]
 ---

 ## Status

@@ -47,25 +47,36 @@ data HydraNode tx m = HydraNode
 }

-* The `stepHydraNode` main loop does call `putEvent` on all `eventSinks` in sequence.
-
-  - TBD: transaction semantics? what happens if one fails?
+* The `hydra-node` will load events and aggregate its `HeadState` using `getEvents` of the single `eventSource`.

-* The default `hydra-node` main loop does use the file-based `EventSource` and a single file-based `EventSink` (using the same file).
+* The `stepHydraNode` main loop does call `putEvent` on all `eventSinks` in sequence. Any failure will make the `hydra-node` process terminate and require a restart.
+
+* When loading events from `eventSource` on `hydra-node` startup, it will also re-submit events via `putEvent` to all `eventSinks`.

-* TBD: As the node starts up, when loading events from the `EventSource`, it will also ensure "at least once" semantics for each `EventSink`.
+* The default `hydra-node` main loop does use the file-based `EventSource` and a single file-based `EventSink` (using the same file).

 * We realize that the `EventSource` and `EventSink` handles, as well as their aggregation in `HydraNode` are used as an API by forks of the `hydra-node` and try to minimize changes to it.
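The startup and main-loop behaviour this patch pins down can be sketched as follows. This is a simplification, not node code: `putEventToSinks` and `loadAndReplay` are invented names, and the surrounding `HydraNode` plumbing is elided.

```hs
-- Handle definitions as given in the ADR text.
data EventSource e m = EventSource { getEvents :: m [e] }
data EventSink e m = EventSink { putEvent :: e -> m () }

-- Deliver one event to every sink in sequence; any sink failure
-- propagates, which in the node would terminate the process.
putEventToSinks :: Monad m => [EventSink e m] -> e -> m ()
putEventToSinks sinks e = mapM_ (`putEvent` e) sinks

-- On startup: load all events from the single source, re-submit each one
-- to all sinks, and return them for re-aggregating the HeadState.
loadAndReplay :: Monad m => EventSource e m -> [EventSink e m] -> m [e]
loadAndReplay source sinks = do
  events <- getEvents source
  mapM_ (putEventToSinks sinks) events
  pure events
```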
## Consequences

* The default operation of the `hydra-node` remains unchanged.

* There are other things called `Event` and `EventQueue(putEvent)` right now in the `hydra-node`. This is getting confusing and when we implement this, we should also rename several things first (tidying).
+
+* Interface first: Implementations of `EventSink` should specify their format in a non-ambiguous and versioned way, especially when a corresponding `EventSource` exists.
+
* The API `Server` can be modelled and refactored as an `EventSink`.
+
+* Projects forking the hydra node have dedicated extension points for producing and consuming events.
+
* Sundae Labs can build a "Save transaction batches to S3" proof of concept `EventSink`.
* Sundae Labs can build a "Scrolls source" `EventSink`.
* Sundae Labs can build an "Amazon Kinesis" `EventSource` and `EventSink`.
+
+## Out of scope / future work
+
+* Available implementations for `EventSource` and `EventSink` could be
+  - configured upon `hydra-node` startup using for example URIs: `--event-source file://state` or `--event-sink s3://some-bucket`
+  - dynamically loaded as plugins without having to fork `hydra-node`.
+
+* Do the `Network` and `Chain` parts qualify as `EventSink`s as well, or shall those still be triggered by `Effect`s?
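For the "at least once" semantics discussed throughout this series, an earlier draft suggested each sink "may maintain its own checkpoint, and only re-broadcast the event if it is after the checkpoint". A minimal sketch of that idea, assuming events carry a monotonically increasing sequence number; the `withCheckpoint` wrapper is an invented name, not part of any patch.

```hs
import Data.IORef

-- Sink handle as given in the ADR text.
data EventSink e m = EventSink { putEvent :: e -> m () }

-- Wrap a sink so that replayed events at or below the checkpoint are
-- skipped, turning "at least once" delivery into effectively-once
-- processing for this sink.
withCheckpoint :: IORef Word -> EventSink (Word, a) IO -> EventSink (Word, a) IO
withCheckpoint checkpoint inner = EventSink $ \ev@(seqNo, _) -> do
  seen <- readIORef checkpoint
  if seqNo <= seen
    then pure () -- already processed, skip the duplicate
    else do
      putEvent inner ev
      writeIORef checkpoint seqNo
```

A durable implementation would persist the checkpoint itself; an in-memory `IORef` is used here only to keep the sketch small.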
From 526103f30970ca022ed7a84921ff3a17698bf687 Mon Sep 17 00:00:00 2001
From: Sebastian Nagel
Date: Thu, 14 Dec 2023 17:49:05 +0100
Subject: [PATCH 7/8] Fix authors

---
 docs/adr/2023-11-07_029-event-source-sink.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/adr/2023-11-07_029-event-source-sink.md b/docs/adr/2023-11-07_029-event-source-sink.md
index 7f0b8ec1200..3b89cdf1e4f 100644
--- a/docs/adr/2023-11-07_029-event-source-sink.md
+++ b/docs/adr/2023-11-07_029-event-source-sink.md
@@ -2,7 +2,7 @@
 slug: 29
 title: |
   29. EventSource & EventSink abstractions
-authors: [@cardenaso11, @quantumplation, @ch1bo]
+authors: [cardenaso11, quantumplation, ch1bo]
 tags: [Draft]
 ---

From d96faca5893a073eff4dda8d0dd15655210a31f7 Mon Sep 17 00:00:00 2001
From: Pi Lanningham
Date: Thu, 14 Dec 2023 12:21:18 -0500
Subject: [PATCH 8/8] Add Pi to authors

---
 docs/authors.yaml | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/docs/authors.yaml b/docs/authors.yaml
index b805ec4f9bc..9e4dbf8921f 100644
--- a/docs/authors.yaml
+++ b/docs/authors.yaml
@@ -33,3 +33,9 @@
 cardenaso11:
   title: Software Engineer
   url: https://github.com/cardenaso11
   image_url: https://github.com/cardenaso11.png
+
+quantumplation:
+  name: Pi Lanningham
+  title: Chief Technology Officer, Sundae Labs
+  url: https://github.com/quantumplation
+  image_url: https://github.com/quantumplation.png