
Update all non-major dependencies - abandoned #198

Open · wants to merge 2 commits into master
Conversation

@renovate renovate bot commented Aug 28, 2021

WhiteSource Renovate

This PR contains the following updates:

Package                     | Type       | Update | Change
----------------------------|------------|--------|------------------
pre-commit/pre-commit-hooks | repository | minor  | v3.2.0 -> v3.4.0
python                      |            | minor  | 3.8.0 -> 3.9.6
timothycrosley/isort        | repository | minor  | 5.6.4 -> 5.9.3
transformers                |            | minor  | ==4.3.3 -> ==4.9.2

Release Notes

pre-commit/pre-commit-hooks

v3.4.0

Compare Source

Features

v3.3.0

Compare Source

Features
Deprecations
  • check-byte-order-marker is now deprecated in favor of fix-byte-order-marker
timothycrosley/isort

v5.9.3

Compare Source

  • Improved text of skipped file message to mention gitignore feature.
  • Made all exceptions pickleable.
  • Fixed #1779: Pylama integration ignores pylama specific isort config overrides.
  • Fixed #1781: --from-first CLI flag shouldn't take any arguments.
  • Fixed #1792: Sorting literals sometimes ignored when placed on first few lines of file.
  • Fixed #1777: extend_skip is not honored with a git submodule when skip_gitignore=true.

v5.9.2

Compare Source

  • Improved behavior of isort --check --atomic against Cython files.
  • Fixed #1769: Future imports added below assignments when no other imports present.
  • Fixed #1772: skip-gitignore will check files not in the git repository.
  • Fixed #1762: in some cases when skip-gitignore is set, isort fails to skip any files.
  • Fixed #1767: Encoding issues surfacing when invalid characters set in __init__.py files during placement.
  • Fixed #1771: Improved handling of skips against named streamed-in content.

v5.9.1

Compare Source

  • Fixed #​1758: projects with many files and skip_ignore set can lead to a command-line overload.

v5.9.0

Compare Source

Goal Zero (tickets related to the aspirational goal of achieving 0 regressions for the remaining 5.0.0 lifespan):
  • Implemented #​1394: 100% branch coverage (in addition to line coverage) enforced.
  • Implemented #​1751: Strict typing enforcement (turned on mypy strict mode).

v5.8.0

Compare Source

  • Fixed #1631: as import comments can in some cases be duplicated.
  • Fixed #1667: extra newline added with float-to-top, after skip, in some cases.
  • Fixed #1594: incorrect placement of noqa comments with multiple from imports.
  • Fixed #1566: in some cases different length limits for dos based line endings.
  • Implemented #1648: Export MyPY type hints.
  • Implemented #1641: Identified import statements now return runnable code.
  • Implemented #1661: Added "wemake" profile.
  • Implemented #1669: Parallel (-j) now defaults to number of CPU cores if no value is provided.
  • Implemented #1668: Added a safeguard against accidental usage against /.
  • Implemented #1638 / #1644: Provide a flag --overwrite-in-place to ensure same file handle is used after sorting.
  • Implemented #1684: Added support for extending skips with --extend-skip and --extend-skip-glob.
  • Implemented #1688: Auto identification and skipping of some invalid import statements.
  • Implemented #1645: Ability to reverse the import sorting order.
  • Implemented #1504: Added ability to push star imports to the top to avoid overriding explicitly defined imports.
  • Documented #1685: Skip doesn't support plain directory names, but skip_glob does.
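The new --extend-skip and --extend-skip-glob flags from v5.8.0 can also be set in project configuration. A minimal illustrative sketch (the directory names below are hypothetical examples, not isort defaults):

```toml
# pyproject.toml (illustrative values)
[tool.isort]
# Extend, rather than replace, isort's built-in skip list:
extend_skip = ["migrations"]
extend_skip_glob = ["*/generated/*"]
```

Because these options extend the defaults instead of overriding them, they are safer than skip/skip_glob when you only want to add a few exclusions.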

v5.7.0

Compare Source

  • Fixed #1612: In rare circumstances an extra comma is added after import and before comment.
  • Fixed #1593: isort encounters bug in Python 3.6.0.
  • Implemented #1596: Provide ways for extension formatting and file paths to be specified when using streaming input from CLI.
  • Implemented #1583: Ability to output and diff within a single API call to isort.file.
  • Implemented #1562, #1592 & #1593: Better, more useful fatal error messages.
  • Implemented #1575: Support for automatically fixing mixed indentation of import sections.
  • Implemented #1582: Added a CLI option for skipping symlinks.
  • Implemented #1603: Support for disabling float_to_top from the command line.
  • Implemented #1604: Allow toggling section comments on and off for indented import sections.
huggingface/transformers

v4.9.2

Compare Source

v4.9.2: Patch release

v4.9.1

Compare Source

v4.9.1: Patch release

Fix barrier for SM distributed #​12853 (@​sgugger)

v4.9.0

Compare Source

v4.9.0: TensorFlow examples, CANINE, tokenizer training, ONNX rework

ONNX rework

This version introduces a new package, transformers.onnx, which can be used to export models to ONNX. Contrary to the previous implementation, this approach is meant as an easily extendable package where users may define their own ONNX configurations for the models they wish to export.

python -m transformers.onnx --model=bert-base-cased onnx/bert-base-cased/
Validating ONNX model...
        -[✓] ONNX model outputs' name match reference model ({'pooler_output', 'last_hidden_state'})
        - Validating ONNX Model output "last_hidden_state":
                -[✓] (2, 8, 768) matchs (2, 8, 768)
                -[✓] all values close (atol: 0.0001)
        - Validating ONNX Model output "pooler_output":
                -[✓] (2, 768) matchs (2, 768)
                -[✓] all values close (atol: 0.0001)
All good, model saved at: onnx/bert-base-cased/model.onnx
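The per-output validation shown above compares each ONNX output against the reference model's output within an absolute tolerance. A minimal pure-Python sketch of that closeness check (the function name and flat-list inputs are illustrative; the real validation operates on tensors):

```python
def all_values_close(reference, candidate, atol=1e-4):
    # True if every pair of corresponding values differs by at most atol,
    # mirroring the "all values close (atol: 0.0001)" lines above.
    return len(reference) == len(candidate) and all(
        abs(r - c) <= atol for r, c in zip(reference, candidate)
    )

print(all_values_close([0.5, 1.0], [0.50005, 0.99996]))  # True
print(all_values_close([0.5, 1.0], [0.51, 1.0]))         # False
```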

CANINE model

Four new models are released as part of the CANINE implementation: CanineForSequenceClassification, CanineForMultipleChoice, CanineForTokenClassification and CanineForQuestionAnswering, in PyTorch.

The CANINE model was proposed in CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation by Jonathan H. Clark, Dan Garrette, Iulia Turc, John Wieting. It’s among the first papers that train a Transformer without using an explicit tokenization step (such as Byte Pair Encoding (BPE), WordPiece, or SentencePiece). Instead, the model is trained directly at a Unicode character level. Training at a character level inevitably comes with a longer sequence length, which CANINE solves with an efficient downsampling strategy, before applying a deep Transformer encoder.

Compatible checkpoints can be found on the Hub: https://huggingface.co/models?filter=canine
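Training directly on Unicode characters means input ids can be derived from the text itself rather than from a learned vocabulary. A rough sketch of that idea (illustrative only; CANINE's actual tokenizer also handles special tokens, truncation, and hashing of code points):

```python
def char_input_ids(text):
    # Map each Unicode character to its code point, CANINE-style.
    return [ord(ch) for ch in text]

print(char_input_ids("hi!"))  # [104, 105, 33]
```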

Tokenizer training

This version introduces a new method to train a tokenizer from scratch based on an existing tokenizer's configuration.

from datasets import load_dataset
from transformers import AutoTokenizer

dataset = load_dataset("wikitext", name="wikitext-2-raw-v1", split="train")

# Train on batches of 1,000 texts at a time.
batch_size = 1000
corpus = (dataset[i : i + batch_size]["text"] for i in range(0, len(dataset), batch_size))

tokenizer = AutoTokenizer.from_pretrained("gpt2")
new_tokenizer = tokenizer.train_new_from_iterator(corpus, vocab_size=20000)
  • Easily train a new fast tokenizer from a given one - tackle the special tokens format (str or AddedToken) #​12420 (@​SaulLu)
  • Easily train a new fast tokenizer from a given one #​12361 (@​sgugger)
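The corpus generator in the snippet above streams the dataset in fixed-size batches rather than loading everything at once. The same chunking pattern in isolation (names are illustrative):

```python
def batched(items, batch_size):
    # Yield successive slices of at most batch_size elements.
    for i in range(0, len(items), batch_size):
        yield items[i : i + batch_size]

print(list(batched(list(range(7)), 3)))  # [[0, 1, 2], [3, 4, 5], [6]]
```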

TensorFlow examples

The TFTrainer is now deprecated in favor of Keras. Version v4.9.0 completes a long rework of the TensorFlow examples, making them more Keras-idiomatic, clearer, and more robust.

TensorFlow implementations

HuBERT is now implemented in TensorFlow.

Breaking changes

When load_best_model_at_end was set to True in the TrainingArguments, it was previously accepted to have different save_strategy and eval_strategy values, but the save_strategy was silently overwritten by the eval_strategy (tracking the best model requires an evaluation at every save). This caused a lot of confusion, with scripts not doing what they were told. This situation now raises an error: save_strategy and eval_strategy must be set to the same value, and when that value is "steps", save_steps must be a round multiple of eval_steps.
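A pure-Python sketch of the consistency rule described above (illustrative only; this is not the actual transformers implementation):

```python
def check_strategies(load_best_model_at_end, save_strategy, eval_strategy,
                     save_steps=None, eval_steps=None):
    # Sketch of the v4.9.0 rule: with load_best_model_at_end, the two
    # strategies must match, and with "steps" the save interval must be
    # a round multiple of the eval interval.
    if not load_best_model_at_end:
        return
    if save_strategy != eval_strategy:
        raise ValueError("save_strategy and eval_strategy must be the same")
    if save_strategy == "steps" and save_steps % eval_steps != 0:
        raise ValueError("save_steps must be a round multiple of eval_steps")

# Valid: matching strategies, 500 is a round multiple of 100.
check_strategies(True, "steps", "steps", save_steps=500, eval_steps=100)
```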

General improvements and bugfixes

v4.8.2

Compare Source

Patch release: v4.8.2

v4.8.1

Compare Source

v4.8.1: Patch release

  • Fix default for TensorBoard folder
  • Ray Tune install #​12338
  • Tests fixes for Torch FX #​12336

v4.8.0

Compare Source

v4.8.0 Integration with the Hub and Flax/JAX support

Integration with the Hub

Our example scripts and Trainer are now optimized for publishing your model on the Hugging Face Hub, with Tensorboard training metrics, and an automatically authored model card which contains all the relevant metadata, including evaluation results.

Trainer Hub integration

Use --push_to_hub to create a model repo for your training and it will be saved with all relevant metadata at the end of the training.

Other flags are:

  • push_to_hub_model_id to control the repo name
  • push_to_hub_organization to specify an organization

Visualizing training metrics on huggingface.co (based on Tensorboard)

By default, if you have tensorboard installed, the training scripts will use it for logging. The logging traces folder is conveniently located inside your model output directory, so you can push the traces to your model repo by default.

Any model repo that contains Tensorboard traces will spawn a Tensorboard server:

[screenshot: Tensorboard running on a model repo]

which makes it very convenient to see how the training went! This Hub feature is in Beta so let us know if anything looks weird :)

See this model repo

Model card generation

[screenshot: auto-generated model card]

The model card contains info about the datasets used, the eval results, ...

Many users were already adding their eval results to their model cards in markdown format, but this is a more structured way of adding them which will make it easier to parse and e.g. represent in leaderboards such as the ones on Papers With Code!

We use a format specified in collaboration with Papers with Code (https://github.com/huggingface/huggingface_hub/blame/main/modelcard.md); see also this repo.

Model, tokenizer and configurations

All models, tokenizers, and configurations now have a revamped push_to_hub() method, as well as a push_to_hub argument in their save_pretrained() method. The workflow of this method has changed a bit to be more git-like, with a local clone of the repo kept in a folder of the working directory, making it easier to apply patches (use use_temp_dir=True to clone into a temporary folder for the same behavior as the experimental API).

Flax/JAX support

Flax/JAX is becoming a fully supported backend of the Transformers library, with more and more models getting an implementation in it. BART, CLIP, and T5 join the already existing models; find the whole list here.

General improvements and bug fixes


Configuration

📅 Schedule: At any time (no schedule defined).

🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.

Rebasing: Renovate will not automatically rebase this PR, because other commits have been found.

👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.


  • If you want to rebase/retry this PR, check this box.

This PR has been generated by WhiteSource Renovate. View repository job log here.


renovate bot commented Mar 26, 2022

Autoclosing Skipped

This PR has been flagged for autoclosing. However, it is being skipped due to the branch being already modified. Please close/delete it manually or report a bug if you think this is in error.

@renovate renovate bot changed the title Update all non-major dependencies Update all non-major dependencies - abandoned Nov 10, 2022

renovate bot commented Mar 18, 2023

Edited/Blocked Notification

Renovate will not automatically rebase this PR, because it does not recognize the last commit author and assumes somebody else may have edited the PR.

You can manually request rebase by checking the rebase/retry box above.

⚠️ Warning: custom changes will be lost.
