Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(robot-server): maintain correct order of protocol analyses #14762

Conversation

sanni-t
Copy link
Member

@sanni-t sanni-t commented Apr 1, 2024

Closes AUTH-229

Overview

Updates the /protocols endpoints to always maintain the order of list of analyses as most-recently-started-analysis last, making sure to verify if a new analysis needs to be triggered because of new run-time-parameter values for a previously uploaded protocol.

To achieve that, this PR does the following things:

  1. updates the analysis table in the database to include a new column where we save info about the RTP values used in the analysis in question.
  2. adds functionality to read and write those values to/ from database
  3. adds functionality to compare the new RTP values (if any) sent by the client, with those used in the most recent analysis.
  4. updates the protocol upload endpoint to:
    • If the protocol did not exist before: create a new protocol resource, start a new analysis on that resource (existing functionality)
    • If the protocol exists and its most recent analysis used the same parameters' values as the ones in current request, do not create a new protocol resource, and do not trigger a new analysis
    • If the protocol exists and its most recent analysis used different parameters' values than the ones in current request, do not create a new protocol resource, but trigger a new analysis using the new RTP override values.

Test Plan

(Can use test protocol from this PR)

  • Upload a protocol with Run-time parameters defined, but no RTP values specified. Verify that analysis contains correct RTP values (should be the default values)
  • Upload the same protocol, without any RTP values again, verify that there is still only 1 analysis for the protocol and that's the one from before
  • Upload same protocol, with some RTPs explicitly set to default values in the request. Verify that there's still only the one analysis
  • Upload same protocol with some RTPs set to non-default values in the request. Verify that a new analysis is added to the list of analyses and this analysis uses the new RTP values
  • Upload same protocol with no RTP values specified again. Verify that another analysis is appended to the end of the list of analyses and this one uses the default RTP values.

Review requests

  • Make sure code & comments make sense
  • See if I've missed any cases

Risk assessment

Medium. Does database update and fixes the analysis order that was broken by #14688

Copy link

codecov bot commented Apr 1, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.19%. Comparing base (0fbb4c7) to head (8c96653).
Report is 29 commits behind head on edge.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##             edge   #14762      +/-   ##
==========================================
+ Coverage   67.17%   67.19%   +0.02%     
==========================================
  Files        2495     2495              
  Lines       71483    71405      -78     
  Branches     9020     8992      -28     
==========================================
- Hits        48020    47984      -36     
+ Misses      21341    21305      -36     
+ Partials     2122     2116       -6     
Flag Coverage Δ
g-code-testing 92.43% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
api/src/opentrons/protocol_engine/types.py 97.50% <ø> (ø)
...t-server/robot_server/protocols/analysis_models.py 100.00% <ø> (ø)
...ot-server/robot_server/protocols/analysis_store.py 100.00% <ø> (ø)
robot-server/robot_server/protocols/router.py 100.00% <ø> (ø)

@sanni-t sanni-t marked this pull request as ready for review April 1, 2024 21:05
@sanni-t sanni-t requested review from a team as code owners April 1, 2024 21:05
@sanni-t sanni-t requested review from SyntaxColoring, a team, jbleon95 and Elyorcv and removed request for a team April 1, 2024 21:05
Copy link
Contributor

@SyntaxColoring SyntaxColoring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yay!

Comment on lines 314 to 317
# We only check for matching RTP values if the given protocol ID
# (& hence its summary) exists. So this assertion should never be false unless
# somehow a protocol resource is created without an analysis record.
assert len(analyses) > 0, "This protocol has no analysis associated with it."
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can happen, can't it?

  1. Upload a protocol. This starts an analysis in the background.
  2. Shut off the robot before the analysis completes.
  3. Power on the robot. The protocol will exist without any analyses.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks right. Previously, that would result in the responses providing an empty list of analyses while now it would error out. I feel like raising an error is slightly better than just returning an empty list, but both approaches make the protocol resource unusable without having a way to force reanalysis.

Hmm.. it might be better to return False here so that it triggers a new analysis. There shouldn't be any risk of corrupting the database if we do that, ya?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, returning False and triggering a new analysis seems like a good idea. No, I don't see it causing any database problems.

Comment on lines 320 to 322
assert (
last_analysis.status != AnalysisStatus.PENDING
), "Protocol must not already have a pending analysis."
Copy link
Contributor

@SyntaxColoring SyntaxColoring Apr 2, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can also trigger, I think?

  1. Upload a protocol. This starts an analysis in the background.
  2. Before analysis completes, quickly upload the same protocol.

[Edit: Oh, yeah, and this is causing this.]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK so correct me if I'm wrong, but as far as I can tell, the background is this: You don't want to start a redundant analysis. To determine if an analysis is redundant, you need the overrides supplied to the last analysis, and the protocol's default values when no overrides are supplied. To get the protocol's default values, you need some analysis to have completed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's right, then it seems like a solvable problem.

What we want, ideally, is for the server to be able to get a protocol's available parameters and their default values quickly—say, within 1–2 seconds—without having to do a full analysis with aspirate()s and dispense()s and everything.

So when a client does POST /protocols:

  • If there are no analyses for that protocol yet:
    • Start one.
  • If the last analysis is completed:
    • And it has the same runtime parameters as this POST request:
      • Keep that one.
    • And it has different runtime parameters from this POST request:
      • Start a new one.
  • If the last analysis is pending:
    • Wait until we have details about the protocol's runtime parameters, which should only take 1–2 seconds. Then, handle as above.

This is all seems readily possible to me. The run-time parameters Python Protocol API was designed for it, with its isolated add_parameters() function, which we can call without calling run().

Does that seem doable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. Maybe we can aim for it in the long term (or even medium term) but it's not doable in the current architecture of the protocol runner and analyzer. Even though add_parameters() is separate from run(), for a few reasons discussed previously, we decided to make handling the actual parameters parsing and setting as part of protocol execution (inside the simulating runner in case of analysis) as opposed to handling it inside an intermediate layer like the protocol reader. Which means that the RTP information is not available until the analysis is complete.

Let's say we refactor the runner to make the protocol's RTP values accessible before the analysis is completed, we still have the protocol analysis & analysis store working on this binary concept of 'pending' vs 'completed' analysis where a protocol's info is only available in the CompletedAnalysisStore. For the 'pending with RTP values parsed' state, we will need to introduce a new functionality for either the pending store or just redo this concept of the binary states for analysis store.

Once we make even that change so that the server can now successfully cross-check against the most recent RTP values, we will have to decide what happens if the new request used different RTP values than the ones in a pending request; do you start another analysis right away and keep the last one running too? Or do you cancel the previous analysis?

So there's a lot of changes that we will need to make in order to implement the behavior you are describing.
I agree that it's a better behavior though. I'm going to make a ticket for it. But until then, I think it's totally fine to say that you will need to wait for the current analysis to complete before reuploading the protocol. I believe the app makes it nearly impossible to re-upload a protocol anyway because of how runs are created. So I don't think it's a case you would run into easily.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, your call!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, though:

  • If you think this is going to be the HTTP-client-facing behavior for the medium term, I would try to get it to return an intentional 503 busy response instead of a 500 internal server error.
  • In any case, I would change this to not be an assert, since that implies this condition will never happen except by robot-server bugs, whereas it really depends on client behavior.
    • If you're switching the response to HTTP 503 and settling in for this to be the HTTP API, a tri-state "yes"/"no"/"busy" return value seems appropriate.
    • If you're keeping the response as an HTTP 500 and calling this a known bug, I actually think a NotImplementedError would be correct?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that. I'll go with the 503 busy response.

robot-server/robot_server/protocols/analysis_store.py Outdated Show resolved Hide resolved
@sanni-t sanni-t requested a review from SyntaxColoring April 3, 2024 20:06
Copy link
Contributor

@SyntaxColoring SyntaxColoring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A question on another assert and on a Tavern test. Otherwise, this looks good to me. Thanks!

Copy link
Contributor

@SyntaxColoring SyntaxColoring left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeet

@sanni-t sanni-t merged commit 3d34031 into edge Apr 3, 2024
22 checks passed
Carlos-fernandez pushed a commit that referenced this pull request May 20, 2024
Closes AUTH-229

# Overview

Updates the `/protocols` endpoints to always maintain the order of list
of analyses as most-recently-started-analysis last, making sure to
verify if a new analysis needs to be triggered because of new
run-time-parameter values for a previously uploaded protocol.

# Risk assessment

Medium. Does database update and fixes the analysis order that was
broken by #14688

---------

Co-authored-by: Max Marrone <[email protected]>
@sanni-t sanni-t deleted the AUTH-229-robot-server-list-of-analyses-should-maintain-correct-order branch July 15, 2024 21:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants