[ci] - fix(test): fix test workflow #72

JulesBelveze · 2024-11-03T17:06:58Z

This PR aims to fix the test CI workflow.


…eferences
- Deleted the `ConferenceDataset` class to streamline `local_datasets`
- Removed the `ConferenceDataset` import from `__init__.py` to clean up package initialization

[docs] - docs: update data documentation to reflect removed ConferenceDataset
- Removed the reference to `ConferenceDataset` in the `data.rst` docs to keep documentation accurate

- Deleted the `local_datasets` module, which managed unlabeled and parallel datasets
- Removed the entries for BERT squeeze training data from `.gitignore`, indicating possible deprecation or refactoring

[docs] - docs: update documentation to reflect codebase changes
- Removed documentation entries for the now-deleted `local_datasets` module in `bert_squeeze`

…ranslation tasks
- Removed hardcoded text column name in favor of dynamic translation column configuration
- Added a filter to exclude entries without translations before tokenization
- Fixed mismatched attention mask column name in `tokenized_dataset`

[tests] - test: change DistilAssistant test to use `kmfoda/booksum` dataset parameters
- Updated test cases to use the `booksum` dataset path and specific configuration parameters like `percent`, `target_col`, and `source_col`
- Modified asserts to expect different lengths for train and validation data loaders based on the `booksum` dataset
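The filter on entries without translations, applied before tokenization, can be sketched in plain Python as below. This is only an illustration, not the code from this PR: the helper names `has_translation` and `drop_missing_translations`, the row layout, and the `en`/`fr` column names are all assumptions.

```python
from typing import Dict, Iterable, List, Optional

# Assumed row layout: one dict per example, mapping column name to text (or None).
Row = Dict[str, Optional[str]]


def has_translation(row: Row, source_col: str, target_col: str) -> bool:
    """Return True only when both the source and the target text are non-empty."""
    return bool(row.get(source_col)) and bool(row.get(target_col))


def drop_missing_translations(
    rows: Iterable[Row], source_col: str, target_col: str
) -> List[Row]:
    """Keep only rows that can be tokenized as a (source, target) pair."""
    return [row for row in rows if has_translation(row, source_col, target_col)]


# Hypothetical sample data; column names are passed in rather than hardcoded.
rows = [
    {"en": "hello", "fr": "bonjour"},
    {"en": "cat", "fr": None},    # missing translation
    {"en": "", "fr": "chien"},    # empty source text
]
kept = drop_missing_translations(rows, source_col="en", target_col="fr")
print(kept)
```

With the `datasets` library, a predicate like `has_translation` could be passed to `Dataset.filter` instead of using a list comprehension.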

JulesBelveze added 6 commits November 3, 2024 18:05

[.github] - devops: update dependency installation command in PR workflow

58dc815

- Change the `uv sync` command to install all extras during PR checks

[bert_squeeze/data/modules] - feature: enable trust for remote dataset code during loading

123dca3

- Allow the `datasets` library to execute remote code by setting `trust_remote_code=True`, improving compatibility with datasets hosted externally

[bert_squeeze/data] - refactor: improve readability of dataset loading

a49991b

- Refactor the line that loads the dataset to span multiple lines for better code readability
- Maintain functionality of trusting remote dataset code by setting `trust_remote_code=True` in a more readable format
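For illustration, a multi-line dataset load with `trust_remote_code=True` might look like the sketch below. It is not the code from this PR: `build_load_kwargs` and `load_remote_dataset` are hypothetical helpers, and the sketch assumes an installed `datasets` release whose `load_dataset` still accepts the `trust_remote_code` argument.

```python
from typing import Any, Dict, Optional


def build_load_kwargs(
    path: str,
    name: Optional[str] = None,
    split: Optional[str] = None,
) -> Dict[str, Any]:
    """Collect keyword arguments for `datasets.load_dataset` (hypothetical helper)."""
    kwargs: Dict[str, Any] = {
        "path": path,
        # Allows dataset loading scripts hosted with the dataset to run locally.
        "trust_remote_code": True,
    }
    if name is not None:
        kwargs["name"] = name
    if split is not None:
        kwargs["split"] = split
    return kwargs


def load_remote_dataset(
    path: str,
    name: Optional[str] = None,
    split: Optional[str] = None,
):
    """Load a dataset, trusting its remote loading code (assumption: `datasets` is installed)."""
    # Imported lazily so that build_load_kwargs works without `datasets`.
    from datasets import load_dataset

    return load_dataset(
        **build_load_kwargs(path, name=name, split=split),
    )
```

Since `trust_remote_code=True` executes code shipped with the dataset, it is best reserved for dataset sources that are already trusted.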

JulesBelveze merged commit aec9078 into main Nov 10, 2024
2 checks passed

JulesBelveze deleted the fix/ci-test branch November 10, 2024 14:38
