
feat(api, robot-server): use runtime parameter files set in protocols and runs to set in-protocol values #15855

Merged

15 commits merged into chore_release-8.0.0 on Aug 2, 2024

Conversation

@jbleon95 (Contributor) commented Jul 31, 2024

Overview

Closes AUTH-419.

This PR hooks up the runtime parameter files set in `POST` requests to `/runs`, `/protocols`, and `/protocols/{protocolId}/analyses` to the actual `CSVParameter` object used in the protocol context.

Each file is sent via a dictionary of parameter name to file ID, so we resolve the file ID into a `Path` using the `data_files_directory` and `data_files_store`. This is then passed to the `ParameterContext` and used to open a temporary file containing the contents of the actual file stored on the robot.
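A minimal sketch of that resolution step (the helper name `resolve_rtp_paths` is hypothetical; the dict comprehension mirrors the code under review later in this conversation):

```python
from pathlib import Path
from typing import Dict

def resolve_rtp_paths(
    rtp_files: Dict[str, str],      # parameter name -> uploaded data file ID
    data_files_directory: Path,     # root directory for uploaded data files
    data_files_store,               # store mapping file IDs to file metadata
) -> Dict[str, Path]:
    """Resolve each file ID into the on-disk Path of the stored file."""
    return {
        name: data_files_directory / file_id / data_files_store.get(file_id).name
        for name, file_id in rtp_files.items()
    }
```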

Test Plan and Hands-on Testing

Tested end to end via the following protocol and two CSV files:

```python
metadata = {
    'protocolName': 'CSV End to End Test',
}

requirements = {
    "robotType": "Flex",
    "apiLevel": "2.18"
}


def add_parameters(parameters):
    parameters.add_str(
        display_name="Pipette Name",
        variable_name="pipette",
        choices=[
            {"display_name": "Single channel 50µL", "value": "flex_1channel_50"},
            {"display_name": "Single channel 1000µL", "value": "flex_1channel_1000"},
            {"display_name": "Eight Channel 50µL", "value": "flex_8channel_50"},
            {"display_name": "Eight Channel 1000µL", "value": "flex_8channel_1000"},
        ],
        default="flex_1channel_50",
        description="What pipette to use during the protocol.",
    )
    parameters.add_csv_file(
        display_name="CSV Data",
        variable_name="csv_data",
        description="CSV file containing labware and volume information.",
    )


def run(context):
    PIPETTE_NAME = context.params.pipette

    trash_bin = context.load_trash_bin('A3')
    tip_rack = context.load_labware('opentrons_flex_96_tiprack_50ul', 'D2')

    pipette = context.load_instrument(PIPETTE_NAME, mount="left", tip_racks=[tip_rack])
    csv_file_data = context.params.csv_data.parse_as_csv()

    # Row 0 is the header; row 1 holds the labware name, deck slot, and volume.
    labware_name = csv_file_data[1][0].strip()
    location = csv_file_data[1][1].strip()
    volume = float(csv_file_data[1][2])

    labware = context.load_labware(labware_name, location)
    pipette.pick_up_tip()
    pipette.aspirate(volume, labware.wells()[0].top())
    pipette.dispense(volume, labware.wells()[1].top())
    pipette.drop_tip()
```
The two CSV files:

```
Labware Name, Location, Volume
opentrons_96_wellplate_200ul_pcr_full_skirt, C1, 20
```

```
Labware Name, Location, Volume
nest_96_wellplate_100ul_pcr_full_skirt, C2, 30
```

Changelog

- Resolve runTimeParameterFiles into a dictionary of Paths.
- Pass the run_time_param_paths all the way down to the ParameterContext, where they can be opened as temporary file handles.
- Close the file handles upon protocol end.
- Allow importing CSVParameter from the protocol_api namespace.

Review requests

Risk assessment

Medium.

jbleon95 marked this pull request as ready for review August 2, 2024 18:39
jbleon95 requested review from a team as code owners August 2, 2024 18:39
@SyntaxColoring (Contributor) left a comment:

This mostly makes sense to me and looks good to merge, though I'm not super familiar with the whole RTP flow.

My main concern going forward is the open()-close()-pair resource management thing.

```diff
@@ -188,6 +192,7 @@ async def create(
     notify_publishers: Callable[[], None],
     protocol: Optional[ProtocolResource],
     run_time_param_values: Optional[PrimitiveRunTimeParamValuesType] = None,
+    run_time_param_paths: Optional[CSVRuntimeParamPaths] = None,
```
@SyntaxColoring (Contributor) commented Aug 2, 2024:

It doesn't really matter, but out of curiosity, why are we adding a second run_time_param_paths argument all the way down, instead of extending the existing run_time_param_values arg to also carry CSV paths?

Like, it's not like we have 4 different arguments for int params, string params, boolean params, and float params.

One argument also seems structurally more correct to me, because it makes it impossible for different parameters to have colliding keys.
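For illustration, a sketch of the two shapes under discussion (the `PrimitiveValue` alias is shorthand for the `PrimitiveRunTimeParamValuesType` values in the diff above):

```python
from pathlib import Path
from typing import Dict, Union

PrimitiveValue = Union[int, float, bool, str]

# Option A (this PR): two parallel arguments, one per kind of payload.
run_time_param_values: Dict[str, PrimitiveValue]
run_time_param_paths: Dict[str, Path]

# Option B (suggested): one argument, so a parameter name can never
# appear in both dictionaries with colliding keys.
run_time_params: Dict[str, Union[PrimitiveValue, Path]]
```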

@jbleon95 (Contributor, Author) replied:

Good point: this is an artifact of these originally being a dictionary of param names to file IDs, which, since file IDs were strings, required a different variable for disambiguation. Now that they're Path objects that isn't needed, but I'm going to put a TODO to fix that up next week before the code freeze.

A Member replied:

I prefer the separate arguments for two reasons:

  1. It keeps the meaning of a parameter's 'value' consistent throughout the code. A CSV parameter's 'value' is its file with its contents, while we are passing the file IDs/paths in these arguments, not the file itself, so the two arguments keep the two concerns separate.
  2. Because a parameter's value and a CSV parameter's file path have to be handled differently (you 'set a param value' vs. 'initialize a file using a file path'), the two separate arguments make it very easy to split the handling. If the primitive param values and CSV param paths are passed in as one entity, you end up having to look up the type of the parameter and then handle it accordingly. There are a few places in the code where we need to know the param type to do this, so it adds a lot of repetitive 'is this a CSV param' code.

But if you still think it's better to have one arg for both then I won't oppose it.

Resolved (outdated) review threads on: robot-server/robot_server/runs/run_models.py, robot-server/robot_server/protocols/router.py, robot-server/robot_server/runs/router/base_router.py
Comment on lines +194 to +198
```python
if rtp_files:
    rtp_paths = {
        name: data_files_directory / file_id / data_files_store.get(file_id).name
        for name, file_id in rtp_files.items()
    }
```
@SyntaxColoring (Contributor) commented:

Should this have validation so invalid file IDs don't raise a 500 error? Or a # todo.

@jbleon95 (Contributor, Author) replied:

Hmmm, interesting point. Raising a 500 error here would be very different behavior compared to our other parameter validation, but plainly skipping invalid file IDs would currently lead to the same error as not providing any file ID to begin with. We can discuss this further, but I'll make a TODO so this gets addressed in next week's follow-up.

@SyntaxColoring (Contributor) replied:

Yeah, I don't think you'd want to skip it. I think you'd want to treat it like when you try to create a run with a protocolId that doesn't point to a valid protocol.

A Member replied:

I think if the file ID doesn't exist, an error will eventually get raised when we set the CSV parameter during orchestrator load. So it will definitely error out shortly after this even if it's not validated here, and I think that's OK, because a CSV file isn't exactly like a protocol file. The CSV file's validation will happen in the same place where the rest of the RTP parameters' validation occurs.

```python
    raise
finally:
    if parameter_context is not None:
        parameter_context.close_csv_files()
```
@SyntaxColoring (Contributor) commented:

I'm glad we're thinking about closing the files when we're done with them, but is this the correct place to do it? Where are the files opened?

For example, it looks like one route for them to be opened is ProtocolExecutor.extract_run_parameters(), when a protocol is loaded, and that does not necessarily seem paired with any parameter_context.close_csv_files().

(The guideline that I'm aiming to follow is that every open has exactly one close, and the thing that does the open is the same as the thing that does the close.)
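A sketch of that guideline as a context manager, so the opener is necessarily also the closer (`initialize_csv_files` is a hypothetical open step; `close_csv_files` is from the diff above):

```python
from contextlib import contextmanager

@contextmanager
def csv_files_open(parameter_context):
    """Whoever opens the CSV files is, by construction, the one who closes them."""
    parameter_context.initialize_csv_files()   # hypothetical open step
    try:
        yield parameter_context
    finally:
        parameter_context.close_csv_files()    # exactly one close per open
```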

@jbleon95 (Contributor, Author) replied:

Yeah, this was probably my biggest ❓ with this PR. I didn't immediately see a good place to tightly couple the two as you suggest, so I put the close in the place where the majority of protocols would enter in the first place, but I agree it's imperfect. I think this is probably a sign that these should be opened in a different location, which would also tie into your next comment about file opening. Again, another TODO for next week so science can be unblocked with a functional version of this.

@SyntaxColoring (Contributor) replied:

Makes sense. I would look into doing the open() and close() in exec_run() or one of exec_run()'s callers. I haven't looked at them too closely, so big grain of salt.

Comment on lines 243 to 253
```python
file_id = file_path.parent.name
file_name = file_path.name
with file_path.open() as csv_file:
    contents = csv_file.read()

temporary_file = tempfile.NamedTemporaryFile("r+")
temporary_file.write(contents)
temporary_file.flush()

parameter_file = open(temporary_file.name, "r")
temporary_file.close()
```
@SyntaxColoring (Contributor) commented:

If I'm reading this correctly:

  1. We are given a path, the "original file," which we want to treat as read-only.
  2. We open the file at that path and copy its contents to a new, temporary file. This makes sense.
  3. We close that temporary file and then reopen it, and this reopened file is what we use for general use during the protocol.

Why are we doing step 3?


Also, I know you're not supporting this now, but heads up for the future: Any kind of open() at this stage is making an assumption about the file's encoding before the user's Python has a chance to run and tell us what encoding it expects. If you ever support custom encodings, the open() will have to move somewhere else, fundamentally. So let's not bake it too deep into the structure.

@jbleon95 (Contributor, Author) replied:

Step 3 is to enforce read-only permissions on the file, since writing to the temporary file requires giving it read-write privileges. Since it's a temporary file there's no inherent danger in keeping write privileges, but read-only seemed the more "correct" thing to enforce.

As for encoding, you can actually change the encoding of an active file even after opening it via reconfigure(), but per my previous reply this code may end up moving closer to where the user consumes it, which would more easily allow this for future types of files (CSV files, I feel, are pretty safe to assume are UTF-8 text).
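For reference, a small example of the `reconfigure()` mentioned here (`io.TextIOWrapper.reconfigure`, available since Python 3.7); the file path is a placeholder:

```python
import io

# "data.csv" is a placeholder path for this example.
with open("data.csv", encoding="utf-8") as f:
    assert isinstance(f, io.TextIOWrapper)
    # The encoding can be swapped as long as nothing has been read yet.
    f.reconfigure(encoding="latin-1")
    contents = f.read()
```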

```diff
@@ -72,6 +73,7 @@
     "ALL",
     "OFF_DECK",
     "RuntimeParameterRequiredError",
+    "CSVParameter",
```
@SyntaxColoring (Contributor) commented:

What's this reexport for? Are we expecting users of the Python Protocol API to import this in their scripts?

@jbleon95 (Contributor, Author) replied:

This was tacked on, but per a discussion with Ed we want this easily exposed, since this interface is now a first-class object in our API.

@SyntaxColoring (Contributor) replied:

Oh, I see now, that makes sense. We should aim to define the class inside opentrons.protocol_api instead of opentrons.protocols.parameters.types, then. Until we do that, mind leaving a note on the class definition so it's obvious that it's user-facing?

@SyntaxColoring (Contributor) commented Aug 2, 2024:

Also, reminder to rebase onto chore_release-8.0.0 before merging.

jbleon95 changed the base branch from edge to chore_release-8.0.0 August 2, 2024 19:45
@TamarZanzouri (Contributor) left a comment:

Agree with @SyntaxColoring's comments, and also agree with adding a TODO for these cases for now. Besides that, LGTM!

@sanni-t (Member) left a comment:

LGTM! Left a couple of comments.
Tested on OT-2 and verified that CSV protocol works end-to-end! 🙌 🙌 🙌

jbleon95 merged commit f2f3c74 into chore_release-8.0.0 Aug 2, 2024
24 checks passed
jbleon95 added a commit that referenced this pull request Aug 7, 2024
# Overview

This PR follows up on #15855 with various small refactors/improvements.

The biggest change in this PR is refactoring how we open the CSV parameter
files. Previously we opened them when we first initialized the parameter.
This proved to be a potential issue because we only closed the files once we
reached the execute step of running the protocol (i.e. in a `finally` clause
within `run_protocol`), yet there was a small but real chance of never
reaching that step (for instance, if the initial home command failed
unexpectedly). To solve this, we now pass the `Path` object to the
`CSVParameter` interface and only open the file when it is first accessed in
the protocol, meaning the open will _always_ occur within that try/finally
block.
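A minimal sketch of that lazy-open approach (simplified; not the actual `CSVParameter` implementation):

```python
from pathlib import Path
from typing import Optional, TextIO

class CSVParameter:
    def __init__(self, file_path: Path) -> None:
        self._file_path = file_path        # stored now, opened later
        self._file: Optional[TextIO] = None

    @property
    def file(self) -> TextIO:
        # Open lazily, on first access inside the running protocol, so the
        # open always happens where the paired close is guaranteed to run.
        if self._file is None:
            self._file = self._file_path.open("r")
        return self._file

    def close(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None
```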

Another change introduced is properly raising a 400 error for `POST`s to
the `/runs`, `/protocols` and `/protocols/{protocolId}/analyses`
endpoints when referencing data files (for runtime parameter files) that
don't exist.

API version is now also enforced for `add_csv_file`, which requires your
protocol to be at 2.20. The API version is also passed down to the
`CSVParameter` interface, though it is not currently used to gate anything
yet.
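For example, a protocol that defines a CSV parameter would now need to declare (robot type as in the earlier test protocol):

```python
# A protocol that defines a CSV parameter must now declare API level 2.20+.
requirements = {"robotType": "Flex", "apiLevel": "2.20"}
```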

## Test Plan and Hands on Testing

Tested end to end on hardware with the same protocol/csv files as
#15855.

## Changelog

- Pass `Path` to the `CSVParameter` interface and open files within it.
- Raise a 400 error when referencing nonexistent data file IDs for `/runs`, `/protocols` and `/protocols/{protocolId}/analyses`.
- Require API version 2.20 for `add_csv_file`.
- Fixes/improvements for testing CSV files/file opening in the Protocol API.

## Review requests

## Risk assessment

Low.
sanni-t pushed a commit that referenced this pull request Aug 12, 2024
…nd less reliance on file reading/creating (#15956)

# Overview

This PR follows up both #15855 and #15907 with further refinements and
small changes to defining and setting CSV parameters, along with behind
the scenes refactors.

The two main user/client-facing changes of this PR are changing the HTTP
status code returned when invalid data file IDs are sent to the `POST`
`/runs`, `/protocols` and `/protocols/{protocolId}/analyses` endpoints from
400 to 422, and raising a new `IncompatibleParameterError` when multiple CSV
file parameters are added to a single protocol. The change in status code
reflects that the error is not caused by a malformed request, but by
something in the request that could not be processed (in this case, the data
file ID). The new error was added to align with the original PRD and the
current UI. While there's no inherent technical limitation preventing
multiple CSV parameters (other than theoretical limits on memory, number of
file handles, etc.), unexpected UI bugs appear when multiple ones are
defined, and this change enforces only one per protocol.
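A sketch of how the one-per-protocol rule might be enforced; the `IncompatibleParameterError` name is from this PR, while the surrounding class structure is assumed:

```python
class IncompatibleParameterError(Exception):
    """Raised when parameters that cannot coexist are defined together."""

class ParameterContext:
    """Simplified sketch; the real class tracks many more parameter kinds."""

    def __init__(self) -> None:
        self._has_csv_param = False

    def add_csv_file(self, display_name: str, variable_name: str,
                     description: str = "") -> None:
        # The current UI supports only one CSV file parameter per protocol.
        if self._has_csv_param:
            raise IncompatibleParameterError(
                "Only one CSV file parameter may be defined per protocol."
            )
        self._has_csv_param = True
        # ...actual parameter registration elided...
```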

The other major change of this PR is a non-user-facing refactor of how we set
up the `CSVParameter` interface. Rather than passing the `Path` of the file
to the interface and then using a `TemporaryFile` as the basis for all
further access to the CSV file and its contents, the contents of the file
(passed as `bytes`) are now what everything else is based on, including CSV
parsing in `parse_as_csv`. With this change, we only ever make a temporary
file handle when the user accesses `.file`. Since this reduces the chance of
there being an open file handle, a new `file_opened` property was added to
the public interface to avoid needless file opening. This is technically
user-facing code meant more for internal usage, but I see no danger in
exposing it, though if needed we can tag it as a private, non-guaranteed
method.
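A sketch of that bytes-based design (simplified, with hypothetical internals; only `file_opened`, `file`, and `parse_as_csv` are named in the PR):

```python
import csv
import tempfile
from typing import List, Optional, TextIO

class CSVParameter:
    def __init__(self, contents: bytes) -> None:
        self._contents = contents          # everything derives from the raw bytes
        self._file: Optional[TextIO] = None

    @property
    def file_opened(self) -> bool:
        """True only if a temporary file handle has actually been created."""
        return self._file is not None

    @property
    def file(self) -> TextIO:
        # A temporary file is created only when the user explicitly asks for one.
        if self._file is None:
            temp = tempfile.NamedTemporaryFile("r+")
            temp.write(self._contents.decode("utf-8"))
            temp.flush()
            temp.seek(0)
            self._file = temp
        return self._file

    def parse_as_csv(self) -> List[List[str]]:
        # Parse straight from the bytes; no file handle is needed for this.
        return list(csv.reader(self._contents.decode("utf-8").splitlines()))
```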

## Changelog

- Raise `IncompatibleParameterError` when `add_csv_file` in the `ParameterContext` is used more than once.
- Change the status code for invalid data file IDs posted to the runs and protocols endpoints from `400` to `422`.
- Refactor `CSVParameter` to only open a temporary file handle when the user requests one.

## Review requests

Should we label `file_opened` as private, or use another way of preventing
spurious temporary file creation just to close them?

## Risk assessment

Low.