feat(robot-server, app): add a new endpoint for fast-fetching all run commands #15031
Conversation
@@ -428,7 +428,7 @@ def get_commands_slice(
        )
        .order_by(run_command_table.c.index_in_run)
    )

    # select_all_cmds_as_doc = sqlalchemy.select(sqlalchemy.func.json_array_elements)
oops. Left over by mistake. Removing
const parsedCommandsFromQuery = commandsFromQuery.map(command =>
  JSON.parse(command)
)
i would actually have the react query do this for us, since anyone using this hook will probably want the data to be deserialized
@@ -67,7 +77,7 @@ export const RunPreviewComponent = (
  ] = React.useState<boolean>(true)
  if (robotSideAnalysis == null) return null
  const commands =
-   (isRunTerminal ? commandsFromQuery : robotSideAnalysis.commands) ?? []
+   (isRunTerminal ? parsedCommandsFromQuery : robotSideAnalysis.commands) ?? []
talked over slack but falling back to analysis commands if the run commands are empty is probably better
  enabled: isRunTerminal,
}).data?.data
const commandsFromQuery =
  useAllCommandsAsPreSerializedList(
sounds good
frontend code looks good! i tested on a bot and this seems to work great — nice job y'all!
staleTime: Infinity,
cacheTime: Infinity,
i guess we don't need staleTime and cacheTime anymore now that we're using the notification service, right?
Do we want to leave those there in the event that we can't establish an MQTT connection and need to fall back to HTTP with those options passed through?
oh yes you are totally right!
Nice job on wiring up the notifications!
@ncdiehl11 This is my bad for not telling you earlier (and it's also not a blocker) - on L43 of We should add the HTTP polling equivalent to the list to prevent devs from forgetting and using it by mistake.
@@ -351,6 +365,53 @@ async def get_run_commands(
    )


@PydanticResponse.wrap_route(
    commands_router.get,
    path="/runs/{runId}/commandsAsPreSerializedList",
Is there a way to do this that doesn't require:
- A separate endpoint path?
- New 503 semantics for runs that haven't completed?
i.e. could we make this something like GET /runs/{id}/commands?validate=false instead?
I know we're repeating how we did things with GET /protocols/analyses/{id}/asDocument, but I think it was a mistake for that endpoint to work like that, in hindsight.
Unifying them would have integration testing benefits. Imagine just parametrizing every Tavern test with ?validate=true / ?validate=false.
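For illustration, here's a minimal sketch of what a unified endpoint with that toggle could look like; the validate parameter and the fetch_preserialized_commands helper are hypothetical stand-ins, not code from this PR:

```python
# Hypothetical sketch of the "?validate=false" idea. fetch_preserialized_commands
# is a made-up stand-in for a run-store query that returns command rows as JSON strings.
import json

from fastapi import FastAPI, Query, Response

app = FastAPI()


def fetch_preserialized_commands(run_id: str) -> list[str]:
    """Stand-in: each element is one command, already serialized as JSON."""
    return ['{"commandType": "home"}', '{"commandType": "aspirate"}']


@app.get("/runs/{runId}/commands")
async def get_run_commands(runId: str, validate: bool = Query(True)) -> Response:
    serialized = fetch_preserialized_commands(runId)
    if validate:
        # Slow path: re-parse each row so the server can re-validate/re-encode it.
        data = [json.loads(row) for row in serialized]
        return Response(content=json.dumps({"data": data}), media_type="application/json")
    # Fast path: splice the already-serialized rows straight into the response body.
    body = '{"data":[' + ",".join(serialized) + "]}"
    return Response(content=body, media_type="application/json")
```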
Is there a way to do this that doesn't require: A separate endpoint path?

I think the separate endpoint path is the right thing to do here because the shape/type of the response is different from the regular ../commands endpoint.
New 503 semantics for runs that haven't completed?
Talked about this briefly on the call and posting here for posterity: not having a 503 for runs can mean one of two things:
- returning an empty list in the response: this is misleading and can cause confusion as to whether this run just didn't have any commands or whether the commands just haven't been serialized yet. We agreed that this is not a good option.
- serializing commands from memory (since it's a current run) and returning them in the response: this is a good option, but it too takes a non-ideal amount of processing time. I will test exactly how much time it takes and respond here. If it's faster than or about as fast as fetching pre-serialized commands, I'll replace the 503 with it. If it takes more than twice as long as fetching pre-serialized commands, then I think it would be more disruptive to the client flow (and user experience) if the client hits the endpoint at the wrong time and ends up having to wait a long time, blocking the rest of the flow.
I think the separate endpoint path is the right thing to do here because the shape/type of the response is different from the regular ../commands endpoint
Definitely, if the response shape is different. If possible, I'm saying the response shape should not be different. (Discussed in the other thread: #15031 (comment))
If it takes more than twice as long as fetching pre-serialized commands, then I think it would be more disruptive to the client flow (and user experience) if the client hits the endpoint at the wrong time and ends up having to wait a long time, blocking the rest of the flow.
I'm not totally following this. How does returning 503 avoid the client having to wait a long time and avoid blocking the rest of the flow? Either way, the client can only move on after the server has spent time serializing everything.
LGTM given that it's documented as experimental, but see question above. Thank you!
[spoke in person] holding off on this in this PR as we are targeting the release branch instead of edge, where those eslint file rules were added @mjhuff
- !anystr
- !anystr
- !anystr
- !anystr
Oh I'm just noticing this. Are we returning something like this?
{
  "data": [
    "{\"commandType\": \"home\", ...}" // JSON list of strings that are parseable as JSON
  ]
}
If so, can we concatenate the command strings on the server side so we always emit straight JSON?
Yes, your example is what the response looks like.
Are you suggesting changing to something like this?
{
  "data":
    ' "{\"commandType\": \"home\", ...}", "{\"commandType\": \"aspirate\", ...}", "{...}" '
}
...so that data is a single string of concatenated individual command strings that are parseable as JSON?
No, sorry, I was unclear. I'm suggesting changing to something like this:
{
  "data": [
    {
      "commandType": "home",
      // ...
    },
    {
      "commandType": "loadLabware",
      // ...
    },
    // ...
  ]
}
So that it matches the normal GET /runs/{id}/commands.
[Edit] Sorry, I'm still being unclear. I mean we could do something like this:
response_body = '{"data":[' + ','.join(serialized_commands) + ']}'
If concatenating JSON fragments like that feels too icky, I'm sure there's a way to write a custom JSON encoder to do it. [Edit: Nope, standard library JSON encoders can't do this.]
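A minimal sketch of that splice-don't-reparse approach, assuming the commands come out of the store as JSON strings (illustrative helper name, not this PR's code):

```python
# Illustrative only: build the usual {"data": [...]} envelope by splicing the
# pre-serialized command strings together, so nothing gets re-parsed or re-encoded.
from fastapi import Response


def build_commands_response(serialized_commands: list[str]) -> Response:
    body = '{"data":[' + ",".join(serialized_commands) + "]}"
    return Response(content=body, media_type="application/json")
```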
Just tested this and it's 5x more expensive than just returning the serialized commands list. It's still much faster than doing the full conversion to Pydantic response models, so we'll keep that as an option. I've added a todo comment to address this.
For posterity: we talked about this in Slack, and the "5x more expensive" result was with code roughly like:
commands_list = [json.loads(c) for c in command_jsons_from_sql]
Rather than with ','.join()
.
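For reference, a tiny self-contained comparison of the two approaches on made-up data (not the PR's actual benchmark):

```python
# Illustrative micro-benchmark: re-parsing every pre-serialized command versus
# splicing the strings directly into one JSON body. Data here is synthetic.
import json
import timeit

command_jsons_from_sql = [
    '{"commandType": "home", "id": "%d"}' % i for i in range(10_000)
]

reparse = timeit.timeit(
    lambda: json.dumps({"data": [json.loads(c) for c in command_jsons_from_sql]}),
    number=10,
)
splice = timeit.timeit(
    lambda: '{"data":[' + ",".join(command_jsons_from_sql) + "]}",
    number=10,
)
print(f"re-parse + re-dump: {reparse:.3f}s  string splice: {splice:.3f}s")
```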
engine_state_summary = self._run_hooks.get_state_summary(self._run_hooks.run_id)
if (
    isinstance(engine_state_summary, StateSummary)
    and engine_state_summary.completedAt is not None
):
    await self.publish_pre_serialized_commands_notification(run_id=run_id)
Discussed in a call, but I didn't explain it well, so I'll try to be more coherent here...
You explained that this notifies the client right away if the commands are already available. That totally makes sense, because there won't be any more notifications later.
However, I think we should go further and send this first notification unconditionally, even when the commands are not available yet. Because I think that's how these topics should work in general.
I view it like this: This notification topic is not meant to say when the commands are ready. It's merely meant to say when to hit the HTTP endpoint, and the HTTP endpoint tells you whether the commands are ready.
Under the scheme I'm describing, an initial unconditional notification would basically confirm to the client that the subscription is active. This is needed in the general case because it solves a missed-update problem (see option 1); every topic should do it or something equivalent.
Maybe you're thinking of the conditional that you have right now as an optimization. Like, the server knows that the client won't like the GET response that it has to offer right now, so why bother inducing a GET request?
But I think we want to avoid the notifications system becoming a bunch of bespoke mini-APIs, each with its own rules. Instead, we want it to be dumb and generic and consistent so we can stamp it out on all of our endpoints without thinking about it, share infrastructural code, integration-test it in generic form, and so on. If that comes at the cost of occasionally prompting the client to send out a superfluous GET, I think it's worth it.
Again, though this doesn't have to change for this PR.
Hmm, I think those are valid points. I can make the notification unconditional upon subscription.
But I think it might need some app-side changes too, unless handling a 503 response is already implemented. Thoughts @ncdiehl11 @shlokamin?
Definitely feel free to leave it out if it needs nontrivial changes or entails testing everything again. I'd be happy with a # todo comment.
To @SyntaxColoring's point: currently, the client always performs an implicit endpoint hit whenever it subscribes to a topic, to solve for the missed-update problem. This behavior is unsurprising from a client perspective, because this is what happens in a non-notification, vanilla HTTP hook (more importantly, see the aside below). In the current iteration of notifications, the server is responsible only for publishing updates for a given topic.
After discussing the intent of this fast-fetch behavior with @shlokamin, it doesn't sound like we need this bit. The runs_publisher is always initialized at the start of a run, and the resources this endpoint/topic manages do not change until the end of a run (if I understand correctly?).
The reason initialize() contains a publish for /runs/:runId is that an update to the underlying resource occurs at this point in time. This accounts for the following scenario:
- A client subscribes to a not-yet-existing run.
- Simultaneously, the client performs an initial GET and the resource errors out/whatever it currently does.
- The run is created. At this point, the /runs/:runId resource is updated, so we need to publish the update to prevent the client from holding stale data.
TLDR: If the fast-fetch resource can't actually update when a run is created, we don't need this bit. The client will GET it whenever it subscribes, and the endpoint should respond with the current state of the resource. Only publish when the resource updates.
ASIDE: The concept of confirming an active subscription does not play well with the MQTT broker-client model, as the robot-server (a client of the broker) has no ability to know what clients are subscribed to a given topic. The pub-sub model is meant to account for present and future states of a resource, not past states. The responsibility to solve the stale data problem must fall on the client in MQTT's pub-sub model.
@shlokamin is there precedent here for checking the response code? Currently, I am null-checking the response data object.
@ncdiehl11 Yes, look at useCurrentMaintenanceRun for an example.
Thanks for the additional context @mjhuff!
I will remove these lines
@sanni-t and thank you for bearing with me! 😄
See inline - what about returning jsonlines instead of encapsulated json?
async def get_run_commands_as_pre_serialized_list(
    runId: str,
    run_data_manager: RunDataManager = Depends(get_run_data_manager),
) -> PydanticResponse[SimpleMultiBody[str]]:
Since the fundamental idea with this endpoint is to avoid server work by not doing json serialization, what if this endpoint's response was in the form of jsonlines, json object records (aka the json-formatted Command instances we get from the database) that are line-separated? I think this is a good idea for two reasons:
- It requires less server work. With a SimpleMultiBody like this, we still have to do the work of the built-in json module encoding all those commands into a json array - things like escaping all the internal quotes.
- It generalizes better (IMO) to queries for running runs, since it's not totally contained and therefore doesn't really have "this is a complete view" semantics.
If we're concerned about breaking similarity with the other elements of our api by not being json, one thing we can do is let the client specify Accept for either application/json (MultiBody response) or application/jsonlines.
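For illustration, a rough sketch of what a jsonlines response could look like, assuming the store already hands back each command as a JSON string (names are made up, not this PR's code):

```python
# Rough jsonlines sketch: stream one pre-serialized command per line.
# iter_serialized_commands is a made-up stand-in for the run-store query.
from typing import Iterator

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


def iter_serialized_commands(run_id: str) -> Iterator[str]:
    """Stand-in: yields each command row already serialized as JSON."""
    yield '{"commandType": "home"}'
    yield '{"commandType": "aspirate"}'


@app.get("/runs/{runId}/commandsAsPreSerializedList")
async def get_commands_as_jsonlines(runId: str) -> StreamingResponse:
    lines = (row + "\n" for row in iter_serialized_commands(runId))
    return StreamingResponse(lines, media_type="application/jsonlines")
```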
We can get advantage 1 without giving up our usual response format, right? JSON lines would look basically like this:
response_body = "\n".join(serialized_commands)
But this would be equally fast:
response_body = '{"data":[' + ','.join(serialized_commands) + ']}'
And I'm sure there's a way to write a custom JSON encoder to do it if concatenating JSON fragments like that feels too icky. [Edit: Nope, standard library JSON encoders can't do this.]
Could you clarify your point 2? I'm not sure what you mean by "contained" or how that affects generalization.
Interesting idea. I've added a todo comment to investigate this option further.
Overview
Robot server changes:
Adds a new endpoint, GET /runs/:run_id/commandsAsPreSerializedList, to the run commands router. This endpoint returns a list of pre-serialized commands (that are valid json objects) of a finished run. It is a much faster alternative to the GET /runs/:run_id/commands endpoint when fetching all commands of a completed run. Also adds notification publishing when pre-serialized commands become available for a run.
App changes
closes RQA-2645 and RQA-2647
Test Plan
Changelog
- added the new endpoint to runs/commands_router
- added run commands fetching to run_store
- added unit tests and integration tests
- added a runs publish topic for pre-serialized commands
- added notification publishing for pre-serialized commands availability after a run has been committed to the database. This happens in two places: in RunController, after a run has finished running (either successfully ended, failed with error, or stopped by the client), and in RunDataManager, after a run has been un-currented.
- added an api client function for getting from the new commandsAsPreSerialized endpoint, along with react query and notifications wrapper hooks
- implemented the above in the RunPreview component for terminal runs
Review requests
Risk assessment
Back-end: New endpoint, so impact on existing code is close to none. App: Medium. Fixes issues in existing behavior of handling historical runs.