Build customerd architecture with an RPC interface #347

Open
3 of 6 tasks
marsella opened this issue Jan 11, 2022 · 4 comments

@marsella
Contributor

marsella commented Jan 11, 2022

Motivation

The customer architecture was originally created to be completely ephemeral: it would execute an operation (like a payment) and then cease to exist. However, this was an incorrect assumption. The customer needs a chain watcher to be running at any time that it has an open channel, to monitor for closing behaviors.

We added a long-running watcher process, but it is not tightly linked to other customer processes. For example, it's possible to create a new channel without first starting the watcher, which is a protocol violation and can result in loss of funds (#241). Instead, there should be a customer daemon, without which no ephemeral customer operations can be executed.

Goal

customerd is a long-running daemon process that runs in the background. It holds a database connection, maintains a chain watcher, and executes a queue of customer operations as they are requested.

Customer operations are initiated on the command line. They cannot run if the customerd is not initialized. They communicate with customerd using an RPC protocol.

The RPC protocol itself is a specification that describes how other processes can communicate with the daemon. Each current command-line operation will have a corresponding well-defined request. These will need to be documented, probably in zkchannels-spec. There will be an additional command to kill the daemon and shut down the chain watcher.
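As a rough sketch of "one request per command-line operation", the request set could be modeled as a single enum. The variant names, fields, and method strings below are hypothetical placeholders, not the actual customer CLI or a drafted spec; only `list`, payments, channel creation, closing, and the kill command are mentioned in this issue.

```rust
/// Hypothetical request set: one variant per command-line operation,
/// plus the extra command that stops the daemon and its chain watcher.
#[derive(Debug, Clone, PartialEq)]
pub enum CustomerRequest {
    List,
    Establish { label: String, deposit: u64 },
    Pay { label: String, amount: u64 },
    Close { label: String },
    Shutdown,
}

impl CustomerRequest {
    /// The RPC method name this request would be sent under (assumed names).
    pub fn method(&self) -> &'static str {
        match self {
            CustomerRequest::List => "list",
            CustomerRequest::Establish { .. } => "establish",
            CustomerRequest::Pay { .. } => "pay",
            CustomerRequest::Close { .. } => "close",
            CustomerRequest::Shutdown => "shutdown",
        }
    }

    /// Only the kill command should cause the daemon to stop serving.
    pub fn stops_daemon(&self) -> bool {
        matches!(self, CustomerRequest::Shutdown)
    }
}
```

Keeping the set closed like this means the spec document and the daemon's dispatch code can be checked against the same list.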

Advantages of this approach include

Existing work

There's a (commented-out) daemon architecture in the customer watcher. This was designed to be a ping-only daemon that would check the chain when it got a ping from another function. The ping interaction is a Dialectic protocol. It sets up a Server, broadcasts on localhost, and refreshes the watcher on receiving a request.

I think the reusable part of this architecture is the Server. It was refactored in #146 to remove the TLS requirement, so requests can come in on the local network (the network options are now either TLS or TCP).

The server is parameterized by a dialectic protocol.

Next steps

  • Figure out the difference between TLS and TCP, determine whether we need a different IoStream for this application
  • Determine what the dialectic protocol for an RPC interaction looks like (just Request -> Response <- ?)
  • Queuing protocol: sketch out some options to achieve the goal of executing operations as they arrive, but also checking the chain every 1 minute
  • Draft the RPC request / response spec
  • Modify the watch command to set up a simple server in addition to the polling service
  • Update a simple command (list?) to send an RPC request instead of executing itself
@marsella marsella self-assigned this Jan 11, 2022
@marsella marsella added the Epic label Jan 11, 2022
@marsella
Contributor Author

TCP or TCP+TLS?

TCP is a communication protocol that describes how to move information from point A to point B. TLS is an encryption protocol. There's no impediment to using TLS with JSON-RPC. However, TLS requires valid certificates approved by a certificate authority.

Most processes would be accessing the daemon from the same machine, so it's a bit of a weird trust model to require external validation of your local daemon -- if an attacker can impersonate and run a daemon on your local machine, you probably have bigger problems.

The tradeoffs are less clear to me for processes that access the daemon from a different machine on the same LAN, or for a scenario where we configure the daemon to be accessible from outside the LAN. I am willing to make the assumption that such access should not be possible, and use unencrypted communication (TCP only) for the daemon for now.
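One way to enforce the "same machine only" assumption is to bind the plain-TCP listener to the loopback interface and reject any peer that is not local. This is a minimal sketch using only the standard library, not the existing Server code; `bind_local` and `is_local_peer` are hypothetical helper names.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Bind an unencrypted TCP listener that is reachable only from this machine.
/// Port 0 asks the OS for any free port; a real daemon would use a configured one.
pub fn bind_local() -> std::io::Result<TcpListener> {
    TcpListener::bind(SocketAddr::from((Ipv4Addr::LOCALHOST, 0)))
}

/// Defensive check for accepted connections: only loopback peers are allowed.
pub fn is_local_peer(peer: &SocketAddr) -> bool {
    peer.ip().is_loopback()
}
```

If remote access is ever needed, this is the place where the TLS option would have to come back.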

@marsella
Contributor Author

JSON-RPC Dialectic protocol

The JSON-RPC spec is straightforward: the client must send a Request. The server must respond with a Response, or with nothing if the Request is a notification type.
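For reference, the difference between the two message kinds on the wire is just the `id` member: a JSON-RPC 2.0 Request carries one and a Notification does not. The sketch below builds both by hand with the standard library; the function names are hypothetical, and it assumes `params_json` is already valid JSON and `method` contains nothing that needs escaping. A real implementation would use a JSON library instead.

```rust
/// Build a JSON-RPC 2.0 Request. The `id` member is what obliges the
/// server to send a Response.
pub fn request(id: u64, method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","method":"{}","params":{},"id":{}}}"#,
        method, params_json, id
    )
}

/// Build a JSON-RPC 2.0 Notification: same shape, but no `id`, so the
/// server must not reply.
pub fn notification(method: &str, params_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","method":"{}","params":{}}}"#,
        method, params_json
    )
}
```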

Then the dialectic protocol is probably going to be

  1. Customer chooses from two options (Request or Notification)
  2. If Notification, the customer sends a Notification
  3. If Request, the customer sends a Request and receives a Response
  4. Close the connection
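The four steps above can be written down as plain data to make the two branches explicit. This is not Dialectic's session-type syntax, only an illustration in ordinary Rust; `ClientMessage`, `Step`, and `session_steps` are hypothetical names.

```rust
/// What the customer wants to send (payloads left as opaque strings here).
#[derive(Debug, PartialEq)]
pub enum ClientMessage {
    Notification(String),
    Request(String),
}

/// The individual protocol actions, from the customer's point of view.
#[derive(Debug, PartialEq)]
pub enum Step {
    Choose,
    SendNotification,
    SendRequest,
    ReceiveResponse,
    Close,
}

/// The sequence of actions for one connection, depending on the initial choice.
pub fn session_steps(message: &ClientMessage) -> Vec<Step> {
    match message {
        ClientMessage::Notification(_) => {
            vec![Step::Choose, Step::SendNotification, Step::Close]
        }
        ClientMessage::Request(_) => vec![
            Step::Choose,
            Step::SendRequest,
            Step::ReceiveResponse,
            Step::Close,
        ],
    }
}
```

The leading `Choose` step is the part a standard JSON-RPC client would not know about.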

The response will either have a result or an error, but I think it makes sense to encode this locally (like, parse a Response type to get a Result<RpcResult, RpcError>) rather than as a second choice in the RPC network protocol.
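A minimal sketch of that local parsing step, assuming the Response has already been deserialized into a struct with optional `result` and `error` members. The type definitions are placeholders for whatever the spec ends up defining. JSON-RPC 2.0 requires exactly one of the two members, so anything else is treated as an error here; reusing the spec's "Internal error" code (-32603) for that case is my own assumption.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct RpcResult(pub String);

#[derive(Debug, Clone, PartialEq)]
pub struct RpcError {
    pub code: i64,
    pub message: String,
}

/// A deserialized JSON-RPC Response (placeholder shape).
#[derive(Debug, Clone, PartialEq)]
pub struct Response {
    pub id: u64,
    pub result: Option<RpcResult>,
    pub error: Option<RpcError>,
}

impl Response {
    /// Collapse the result/error pair into a `Result` on the client side,
    /// so the network protocol needs no second choice.
    pub fn into_result(self) -> Result<RpcResult, RpcError> {
        match (self.result, self.error) {
            (Some(result), None) => Ok(result),
            (None, Some(error)) => Err(error),
            // Both or neither present: the Response is malformed.
            _ => Err(RpcError {
                code: -32603,
                message: "response must contain exactly one of result or error".to_string(),
            }),
        }
    }
}
```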

I am not sure that this will be compatible with standard (non-Dialectic) RPC servers. The response parsing is, but the initial choice is not. I think this issue has already been raised in Dialectic. However, I am willing to make this compromise in order to not rewrite the server from scratch. It may be the case that our spec will not use any Notifications, in which case we can be fully compatible.

@marsella
Contributor Author

Queuing protocols

With the current chain-watching infrastructure, we want the daemon to check the chain every 1 minute, and we want it to process other requests sent in via RPC. Eventually, the chain watcher should get updated to a notification service, where the daemon should receive push notifications and react to them as they arrive.

In general, requests cannot be parallelized (e.g. you can only execute one payment at a time). So the request queue should just be a normal queue, maybe holding futures, and then we can await them in order. When requests come in, they go at the back of the queue. When chain-watching steps come in, they go at the front. This could still cause chain-watching steps to get highly delayed, though, if e.g. the previous thing on the queue involves posting to chain and it takes 20 minutes.
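For comparison, the single-queue option described above could look roughly like this, with `VecDeque` standing in for a queue of futures. `Job` and `JobQueue` are hypothetical names. Note that it only reorders waiting jobs: a chain check still cannot start until the job that is currently running finishes, which is exactly the delay problem mentioned above.

```rust
use std::collections::VecDeque;

#[derive(Debug, Clone, PartialEq)]
pub enum Job {
    /// A periodic chain-watching step.
    CheckChain,
    /// A customer operation that arrived over RPC.
    Request(String),
}

#[derive(Default)]
pub struct JobQueue {
    jobs: VecDeque<Job>,
}

impl JobQueue {
    /// Chain checks jump to the front; RPC requests wait at the back.
    pub fn enqueue(&mut self, job: Job) {
        if matches!(job, Job::CheckChain) {
            self.jobs.push_front(job);
        } else {
            self.jobs.push_back(job);
        }
    }

    /// Take the next job to run; jobs are executed one at a time.
    pub fn pop(&mut self) -> Option<Job> {
        self.jobs.pop_front()
    }
}
```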

Instead, the queue should just run as a separate task, like the merchant server. In this architecture, we spawn two tasks, one with a running server and one with a looping polling service. Both of these are expected to run forever, unless they encounter an error. If one of them raises a fatal error or if the server gets a "kill" request, they both shut down (in particular, the function ends without waiting for the uncompleted task to terminate).
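A sketch of the two-task shape, under the assumption that OS threads and a channel are an acceptable stand-in for the async tasks the real daemon would spawn. `Exit` and `run_daemon` are hypothetical names. Whichever task finishes first determines the result, and the function returns without waiting for the other one.

```rust
use std::sync::mpsc;
use std::thread;

/// Why the daemon stopped.
#[derive(Debug, PartialEq)]
pub enum Exit {
    KillRequested,
    ServerError(String),
    WatcherError(String),
    /// Both tasks ended without reporting (for example, both panicked).
    TaskPanicked,
}

/// Run the RPC server and the polling watcher side by side. Both are
/// expected to run forever; the first one to return ends the daemon.
pub fn run_daemon<S, W>(server: S, watcher: W) -> Exit
where
    S: FnOnce() -> Exit + Send + 'static,
    W: FnOnce() -> Exit + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    let server_tx = tx.clone();

    // A failed send only means the daemon has already decided to exit.
    thread::spawn(move || {
        let _ = server_tx.send(server());
    });
    thread::spawn(move || {
        let _ = tx.send(watcher());
    });

    // Return on the first report. The other thread is detached, not joined,
    // so it is torn down when the process exits.
    rx.recv().unwrap_or(Exit::TaskPanicked)
}
```

In this shape the server task owns the request queue and the watcher task owns the one-minute loop, so neither needs to know about the other's scheduling.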

With this architecture, we don't need any kind of special queue algorithm. It's just a standard FIFO queue.

@marsella
Contributor Author

Server

Upon further reflection, I think we can't use our Server code until Dialectic fixes the self-documenting choice issue. We need to be able to reject Notifications (even if they aren't allowed in our protocol) by closing the channel, but Dialectic won't allow that without another choice message. A normal RPC client won't know what to do with that choice.

Next step: determine what the server in json-rpc provides. Look at other options for Rust JSON-RPC libraries and see what they provide. Edit this comment.
