Bump datasets from 2.18.0 to 2.19.0 #80

Closed
wants to merge 1 commit into from

dependabot bot commented on behalf of github Apr 22, 2024

Bumps datasets from 2.18.0 to 2.19.0.

Release notes

Sourced from datasets's releases.

2.19.0

Dataset Features

  • Add Polars compatibility by @​psmyth94 in huggingface/datasets#6531
    • Convert to a Polars dataframe using .to_polars():
      import polars as pl
      from datasets import load_dataset
      ds = load_dataset("DIBT/10k_prompts_ranked", split="train")
      ds.to_polars() \
          .group_by("topic") \
          .agg(pl.len(), pl.first()) \
          .sort("len", descending=True)
    • Use Polars formatting to return Polars objects when accessing a dataset:
      ds = ds.with_format("polars")
      ds[:10].group_by("kind").len()
  • Add fsspec support for to_json, to_csv, and to_parquet by @​alvarobartt in huggingface/datasets#6096
    • Save on HF in any file format:
      ds.to_json("hf://datasets/username/my_json_dataset/data.jsonl")
      ds.to_csv("hf://datasets/username/my_csv_dataset/data.csv")
      ds.to_parquet("hf://datasets/username/my_parquet_dataset/data.parquet")
  • Add mode parameter to Image feature by @​mariosasko in huggingface/datasets#6735
    • Set images to be read in a certain mode, such as "RGB":
      from datasets import Image
      dataset = dataset.cast_column("image", Image(mode="RGB"))
  • Add CLI function to convert script-dataset to Parquet by @​albertvillanova in huggingface/datasets#6795
    • Run this command to open a PR on a script-based dataset that converts it to Parquet:
      datasets-cli convert_to_parquet <dataset_id>
  • Add Dataset.take and Dataset.skip by @​lhoestq in huggingface/datasets#6813
    • These behave the same as IterableDataset.take and IterableDataset.skip:
      ds = ds.take(10)  # take only the first 10 examples
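
Since the features listed above (for example Dataset.take and Dataset.skip) first appear in 2.19.0, code that has to run against both sides of this bump may want to guard on the installed version. The following is only a minimal sketch of such a guard: the helper names `parse_release`, `has_take_and_skip`, and `installed_datasets_version` are hypothetical and not part of datasets, and the parser assumes a plain `X.Y.Z` release string with no pre-release or dev suffix.

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' release string into a tuple of ints.

    Assumption: no pre-release or dev suffix (e.g. '2.19.0.dev0');
    packaging.version.Version is a better fit if those can occur.
    """
    return tuple(int(part) for part in version.split("."))


def has_take_and_skip(version: str) -> bool:
    """Return True if the given datasets version is at least 2.19.0."""
    return parse_release(version) >= (2, 19, 0)


def installed_datasets_version() -> Optional[str]:
    """Return the installed datasets version, or None if it is not installed."""
    try:
        return metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = installed_datasets_version()
    if current is None:
        print("datasets is not installed")
    else:
        print(current, "has take/skip:", has_take_and_skip(current))
```

With this guard, a caller on 2.18.0 could fall back to `ds.select(range(10))`, which selects the same first 10 rows on a map-style Dataset.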

General improvements and bug fixes

... (truncated)

Commits
  • 0d3c746 Release: 2.19.0 (#6825)
  • 0bc709a Fix parquet export infos (#6822)
  • 2a14271 Make convert_to_parquet CLI command create script branch (#6809)
  • 5eb93f6 Support indexable objects in Dataset.__getitem__ (#6817)
  • 8983a3b add allow_primitive_to_str and allow_decimal_to_str instead of allow_number_t...
  • a188022 Extract data on the fly in packaged builders (#6784)
  • ed8860f Remove os.path.relpath in resolve_patterns (#6815)
  • 55eb1d9 Add Dataset.take and Dataset.skip (#6813)
  • 0f1f27c Multithreaded downloads (#6794)
  • 91b07b9 Fix cache path to snakecase for CachedDatasetModuleFactory and Cache (#6754)
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [datasets](https://github.com/huggingface/datasets) from 2.18.0 to 2.19.0.
- [Release notes](https://github.com/huggingface/datasets/releases)
- [Commits](huggingface/datasets@2.18.0...2.19.0)

---
updated-dependencies:
- dependency-name: datasets
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
dependabot bot added the dependencies and minor labels Apr 22, 2024

dependabot bot commented on behalf of github May 13, 2024

Superseded by #91.

dependabot bot closed this May 13, 2024
dependabot bot deleted the dependabot/pip/datasets-2.19.0 branch May 13, 2024 09:05