Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataops 467: Removed runfolder directory created in delivery_runfolders_for_project_workflow #57

Merged
merged 12 commits into from
Oct 11, 2024

Conversation

nkongenelly
Copy link
Contributor

@nkongenelly nkongenelly commented Sep 16, 2024

What problems does this PR solve?
The code implemented here fixes the duplication of staging runfolders in the staging service

How has the changes been tested?
test_integration_dds.py integration file has also been updated to reflect the bug before and made to pass with the fix implemented here

Verification protocol: https://snpseq.atlassian.net/wiki/spaces/AR/pages/3639574530/arteria-delivery+v3.2.1

Reasons for careful code review
The code implementation should only affect the /stage/project/runfolders/(.+) endpoint

  • This PR contains code that could remove data

Copy link
Contributor

@Aratz Aratz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work!

I have one question regarding the fix itself, and some additional comments:

  • Test TestIntegration.test_can_return_flowcells is failing, can you look into it, and also fix .github/workflows/test.yml so that it runs the tests on pull requests as well? (basically instead of on: [push] is should be on: [push, pull_request])
  • There are a lot of added/removed blank lines. I suspect this was not meant to be committed. In general, it is good practice to check what changes you're committing (eg with git diff before writing the commit). Could you please clean it up a little bit so that this PR only contains changes related to the double runfolder issue?

delivery/services/delivery_service.py Outdated Show resolved Hide resolved
@@ -30,7 +30,6 @@ def _construct_response_from_project_and_status(self, staging_order_projects_and

return link_results, id_results


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should stay (see https://peps.python.org/pep-0008/#blank-lines)

@@ -184,7 +190,6 @@ def deliver_all_runfolders_for_project(self, project_name, mode):
batch_nbr=batch_nbr)

self.delivery_sources_repo.add_source(source)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could line could stay too

@@ -29,7 +29,6 @@ def __init__(self, runfolder_service, file_system_service=FileSystemService()):
"""
self.runfolder_service = runfolder_service
self.file_system_service = file_system_service

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, see PEP8

@@ -116,6 +115,7 @@ def organise_project(
organised_project_runfolder_path,
lanes))
# symlink the project files

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No line should be added here, the comment above should belong to the same block as the code that follows, see https://peps.python.org/pep-0008/#block-comments

Comment on lines 105 to 106
execution_result.stdout,
re.MULTILINE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was correctly indented before :( lines 105 and 106 should be aligned with the first parameter of the search function above in this case.

@@ -149,7 +147,6 @@ def stage_order(self, stage_order):
"external_program_service": self.external_program_service,
"staging_repo": self.staging_repo,
"session_factory": self.session_factory}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line could be kept.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here

@nkongenelly nkongenelly changed the title Dataops 467 Dataops 467: Removed runfolder directory created in delivery_runfolders_for_project_workflow Sep 17, 2024
@nkongenelly
Copy link
Contributor Author

Thank you so much for reviewing this PR. I have now updated the code with the suggestions you made, apart from the symlink part in delivery_service that I have replied to your comment above.

@nkongenelly
Copy link
Contributor Author

@Aratz , This PR is now ready for re-review.

Copy link
Contributor

@Aratz Aratz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I think the code looks great now, just have some comments about the tests.

Also, I would really recommend you set up a a linter in your text editor and/or run pycodestyle or any other linter on the changes you made before you push them. As of now, the code in general is very unlinted, so it is hard to enforce that with continuous integration, but it would be great if we could at least make sure that everything we touch is linted :)

tests/integration_tests/test_integration_dds.py Outdated Show resolved Hide resolved
tests/integration_tests/test_integration_dds.py Outdated Show resolved Hide resolved
@Aratz Aratz self-requested a review October 2, 2024 11:47
Copy link
Contributor

@Aratz Aratz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Copy link
Member

@matrulda matrulda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! Just some minor comments about the test.

# Assert the staged folder structure has only one runfolder folder
temp_staging_dir = f"/tmp/{staging_order_ids.get('JKL_123')}/JKL_123"
for runfolder in os.listdir(temp_staging_dir):
self.assertFalse(set(runfolder).issuperset(set(os.listdir(f"{temp_staging_dir}/{runfolder}"))))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a question here, is it necessary to check this using sets? Since listdir returns a list, can't you do this:

Suggested change
self.assertFalse(set(runfolder).issuperset(set(os.listdir(f"{temp_staging_dir}/{runfolder}"))))
self.assertFalse(runfolder in os.listdir(f"{temp_staging_dir}/{runfolder}"))

@nkongenelly
Copy link
Contributor Author

@matrulda , thank you so much for the review. I have now pushed the changes

@matrulda
Copy link
Member

@nkongenelly Great, looks good 👍

@nkongenelly nkongenelly merged commit 4e427a3 into arteria-project:master Oct 11, 2024
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants