feat: Optionally remove single-child top-level directory from extracted archives #579

Conversation

josephine-funken
Collaborator

@josephine-funken josephine-funken commented Sep 26, 2023

Description

Removes the top-level directory from extracted files if its only child is a directory.
This top-level directory in the archive is unnecessary and makes file paths longer than needed.

Fixes issue #401

Implemented changes

  • Remove top-level directories by default in Dataset.extract(), dataset_download.extract_dataset() and utils.archives.extract_archive().
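For illustration, the effect on file paths can be sketched as follows. This is not the code from this PR: the helper name `strip_single_top_level` and the example paths are hypothetical, and it assumes the rule is "strip the leading component only when every file sits below one shared top-level directory".

```python
from __future__ import annotations

from pathlib import PurePosixPath


def strip_single_top_level(member_names: list[str]) -> list[str]:
    """Drop the leading path component if all files share one top-level directory.

    Hypothetical helper for illustration only; expects file member names
    (no bare directory entries) using forward slashes.
    """
    paths = [PurePosixPath(name) for name in member_names]
    top_levels = {path.parts[0] for path in paths if path.parts}

    # Leave the names untouched unless there is exactly one top-level entry
    # and every file lives below it (so that entry must be a directory).
    if len(top_levels) != 1 or any(len(path.parts) < 2 for path in paths):
        return list(member_names)

    return [str(PurePosixPath(*path.parts[1:])) for path in paths]


# All files share the top-level directory "dataset", so it is stripped.
print(strip_single_top_level(["dataset/a.csv", "dataset/sub/b.csv"]))

# Two top-level entries exist, so the names are returned unchanged.
print(strip_single_top_level(["a.csv", "dataset/b.csv"]))
```

The sketch works on path names only, so it shows the intended result without touching the file system.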

Type of change

  • New feature (non-breaking change which adds functionality)

How Has This Been Tested?

  • tests/utils/archives_test.test_extract_archive_destination_path_None()
  • test_extract_archive_destination_path_not_None()

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules
  • I have checked my code and corrected any misspellings

@SiQube
Member

SiQube commented Sep 26, 2023

@josephine-funken Thank you for this great work! There seems to be a merge conflict that needs to be resolved first. If you have any questions about how to resolve it, feel free to reach out.

@codecov

codecov bot commented Sep 27, 2023

Codecov Report

All modified lines are covered by tests ✅

Comparison is base (417a305) 100.00% compared to head (3118012) 100.00%.

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #579   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           52        52           
  Lines         2337      2348   +11     
  Branches       582       587    +5     
=========================================
+ Hits          2337      2348   +11     
Files                                         Coverage Δ
src/pymovements/dataset/dataset.py            100.00% <100.00%> (ø)
src/pymovements/dataset/dataset_download.py   100.00% <ø> (ø)
src/pymovements/utils/archives.py             100.00% <100.00%> (ø)
src/pymovements/utils/downloads.py            100.00% <ø> (ø)

@josephine-funken
Collaborator Author

Added the option to remove top-level directories to downloads.download_and_extract_archive() and added the parameter to the corresponding test.

@dkrako dkrako reopened this Sep 28, 2023
Contributor

@dkrako dkrako left a comment

Thank you very much for this work!

However, I would like to request two changes.

  • change the file-handling from copy/remove to move/remove
  • the elif clause can probably be refactored into the if clause which would hopefully raise coverage to 100%
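A minimal sketch of the move/remove approach, for illustration only. This is not the code from this PR: the function name `flatten_single_child_dir` is hypothetical, and it assumes flattening applies when the extraction directory holds exactly one entry and that entry is a directory.

```python
import shutil
from pathlib import Path


def flatten_single_child_dir(destination: Path) -> bool:
    """Move the contents of a lone child directory up into ``destination``.

    Returns True if the directory was flattened, False if nothing changed.
    Hypothetical helper for illustration only.
    """
    children = list(destination.iterdir())
    if len(children) != 1 or not children[0].is_dir():
        return False

    single_child = children[0]
    for entry in single_child.iterdir():
        # On the same file system, shutil.move renames the entry instead of
        # duplicating its data, unlike a copy followed by a remove.
        shutil.move(str(entry), str(destination / entry.name))

    # The child directory is empty now, so rmdir is sufficient.
    single_child.rmdir()
    return True
```

One limitation of this sketch: if the child directory contains an entry with the same name as the child itself, the move target already exists and the name clash is not handled. Renaming the child to a temporary name first would be one way around that.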

Two review threads on src/pymovements/utils/archives.py (outdated, resolved)
@dkrako dkrako changed the title from "401 Remove top-level directory from extracted files if it has only single child" to "feat: Optionally remove single-child top-level directory from extracted archives" Sep 28, 2023
@github-actions github-actions bot added the enhancement New feature or request label Oct 12, 2023
…d-files-if-it-has-only-single-child

# Conflicts:
#	tests/unit/utils/archives_test.py
Contributor

@dkrako dkrako left a comment

Great clean work! Thanks a lot!

@dkrako dkrako enabled auto-merge (squash) October 13, 2023 08:46
@dkrako dkrako merged commit b6538a3 into main Oct 13, 2023
18 checks passed
@dkrako dkrako deleted the 401-remove-top-level-directory-from-extracted-files-if-it-has-only-single-child branch October 13, 2023 08:56
Labels
enhancement New feature or request
Development

Successfully merging this pull request may close these issues.

Remove top-level directory from extracted files if it has only single child
4 participants