Automated PR: Downstream develop rebase new changes #62

Cemberk · 2024-08-19T15:16:02Z

This PR was created automatically by the Fork Maintenance System to sync changes from the downstream main into downstream develop.

…#31395) * Add llama3-llava-next-8b to llava_next conversion script Adds support for the lmms-lab/llama3-llava-next-8b model to the convert_llava_next_weights_to_hf.py script, along with an example prompt generated from the llava_llama_3 conv_template in the LLaVA-NeXT repo. * Exclude <|begin_of_text|> from prompt example This token gets added automatically, so it should not be included in the prompt example. * Add llava-next-72b and llava-next-110b Adds the Qwen-based LLaVA-Next models to the conversion script, along with changes to load the models on multiple GPUs for inference. * Add llama3 and qwen prompt formats to docs * Chat prompt and padding side left for llama3 batched * update * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <[email protected]> * Update src/transformers/models/llava_next/convert_llava_next_weights_to_hf.py Co-authored-by: amyeroberts <[email protected]> * remove code * better naming --------- Co-authored-by: raushan <[email protected]> Co-authored-by: Raushan Turganbay <[email protected]> Co-authored-by: amyeroberts <[email protected]>

* Add YaRN and Dynamic-YaRN RoPE Scaling Methods YaRN (Yet another RoPE extension method) combines the NTK-By-Parts Interpolation and Attention Scaling methods, improving upon existing RoPE interpolation methods for longer context window sizes. Fine-tuned models maintain their original performance across benchmarks while enabling efficient extrapolation and transfer learning for quicker convergence, especially in compute-limited environments. We implement YaRN and Dynamic-YaRN for the following list of models: - LLaMA - Falcon - GPT-NeoX - Olmo - Persimmon - Phi - StableLM - OpenLLaMA New unit tests are added to assert YaRN's correct behavior on both short and long sequence inputs. For more details, please refer to https://arxiv.org/abs/2309.00071. Co-authored-by: Miguel Almeida <[email protected]> * Refactor YaRN implementation for LLaMA Iterate on YaRN implementation for LLaMA and remove diff from remaining models for increased PR modularity. This commit includes the following changes: - Merge 'yarn_rope_scaling' and 'rope_scaling' dictionaries - Remove unnecessary attributes ('extrapolation_factor' and 'finetuned') from YaRN classes - Inherit 'forward' method in YaRN classes from superclass - Rename 'yarn' method to 'compute_yarn_scaling' - Extend YaRN tests with further assertions - Fix style inconsistencies Co-authored-by: Miguel Monte e Freitas <[email protected]> * Refactor Tensor Building Logic for YaRN - Comply with the the tensor building logic introduced in huggingface#30743 - Add referencing to the optimized Attention Factor equation - Remove Dynamic YaRN for a more agile deployment Co-authored-by: mig-mfreitas <[email protected]> * remove unwanted file --------- Co-authored-by: Miguel Almeida <[email protected]> Co-authored-by: mig-mfreitas <[email protected]> Co-authored-by: Joao Gante <[email protected]>
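The NTK-by-parts interpolation and attention scaling described above can be sketched in plain Python. This is a minimal illustration rather than the code merged in the PR: the function name `yarn_inv_freq`, the `beta_fast`/`beta_slow` parameter names, and the default values are assumptions chosen to follow the notation of the YaRN paper (https://arxiv.org/abs/2309.00071). It covers only the static variant, since the commit notes above say Dynamic YaRN was removed.

```python
import math


def yarn_inv_freq(dim, base=10000.0, factor=4.0, original_max_pos=2048,
                  beta_fast=32.0, beta_slow=1.0):
    """Return (inv_freq, attention_factor) for a YaRN-scaled rotary embedding.

    All names and defaults here are illustrative assumptions.
    """

    def correction_dim(num_rotations):
        # Dimension index whose wavelength completes `num_rotations` turns
        # over the original context window.
        return (dim * math.log(original_max_pos / (num_rotations * 2 * math.pi))) / (
            2 * math.log(base)
        )

    low = max(math.floor(correction_dim(beta_fast)), 0)
    high = min(math.ceil(correction_dim(beta_slow)), dim - 1)
    if low == high:
        high += 0.001  # avoid a zero-width ramp

    inv_freq = []
    for i in range(dim // 2):
        pos_freq = base ** (2 * i / dim)
        extrapolation = 1.0 / pos_freq             # original RoPE frequency
        interpolation = 1.0 / (factor * pos_freq)  # position-interpolated frequency
        ramp = min(max((i - low) / (high - low), 0.0), 1.0)
        keep = 1.0 - ramp  # 1 keeps the original frequency, 0 interpolates fully
        inv_freq.append(interpolation * (1.0 - keep) + extrapolation * keep)

    # Attention scaling factor from the paper: 0.1 * ln(s) + 1.
    attention_factor = 0.1 * math.log(factor) + 1.0
    return inv_freq, attention_factor


inv_freq, attention_factor = yarn_inv_freq(dim=128)
```

High-frequency dimensions (small `i`) keep their original frequency, low-frequency dimensions are divided by `factor`, and the dimensions in between are blended linearly.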

…huggingface#31979) * Change resize_token_embeddings to make it return same Class that is passed to it * Add explanatory comment as requested in review * Add explanatory comments for add resizing function in lxmert * Add comment for padding_idx and moving _resize_bias in lxmert to LxmertForPreTraining --------- Co-authored-by: Prashanth Sateesh <[email protected]> Co-authored-by: Prashanth Sateesh <[email protected]>

* Update README.md * tests: forward ok * backward test done * done testing * removed check. scripts * Update README.md * added use_mambapy arg * fixed typo in warning * protected imports w/ mambapy package * delete pscan.py + raise rather than assert * Update import_utils.py * fix whitespaces and unused import * trailing whitespace + import block unformatted * Update modeling_mamba.py * transpose before pscan * shape comment * ran make style * use_mambapy=False by default Co-authored-by: Arthur <[email protected]> * ran make fix-copies --------- Co-authored-by: Arthur <[email protected]>

…ingface#31857) * feat(cache): StaticCache uses index_copy_ to avoid useless copy Using index_copy_ allows for explicit in-place change of the tensor. Some backends (XLA) will otherwise copy the tensor, making the code slower and using more memory. Proposed implementation will end up using less memory and on XLA will result in less compilation, but the change is also quite generic, making no change whatsoever on CUDA or CPU backend. * feat(cache): SlidingWindowCache uses index_copy_ to avoid useless copy Applying the same change done in StaticCache. * fix(cache): fallback of index_copy_ when not implemented * fix(cache): in index_copy_ ensure tensors are on same device * [run slow] llama * fix(cache): add move of cache_position to same device in SlidingWindowCache * Revert "[run slow] llama" This reverts commit 02608dd.

…ith Position IDs (huggingface#31629) * add DataCollatorBatchFlattening * Update data_collator.py * change name * new FA2 flow if position_ids is provided * add comments * minor fix * minor fix data collator * add test cases for models * add test case for data collator * remove extra code * formating for ruff check and check_repo.py * ruff format ruff format tests src utils * custom_init_isort.py
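The idea behind a batch-flattening collator with position IDs can be sketched as follows. This is an assumption-laden illustration, not the `DataCollatorBatchFlattening` code itself: the helper name `flatten_batch` and the `ignore_index` parameter are hypothetical, and the real collator may differ in details such as tensor conversion.

```python
def flatten_batch(features, ignore_index=-100):
    """Concatenate examples into one padding-free sequence (illustrative only)."""
    batch = {"input_ids": [], "labels": [], "position_ids": []}
    for example in features:
        ids = example["input_ids"]
        labels = example.get("labels", ids)
        batch["input_ids"] += ids
        # Mask the first token of each example so no loss crosses a boundary.
        batch["labels"] += [ignore_index] + list(labels[1:])
        # Positions restart at 0, which marks where each example begins.
        batch["position_ids"] += list(range(len(ids)))
    return batch


features = [{"input_ids": [5, 6, 7]}, {"input_ids": [8, 9]}]
flat = flatten_batch(features)
```

Because the position IDs restart at every example boundary, an attention implementation that receives them can recover the individual sequence lengths without any padding tokens.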

…than the ones present at import time. (huggingface#32153) * fix: default value reflects the runtime environment variables rather than the ones present at import time. * Fix: Change `deterministic` to None by default; use env var if None
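The pattern this commit describes (a `None` default that is resolved from the environment at call time rather than at import time) might look like the sketch below. The variable name `EXAMPLE_DETERMINISTIC` and the function `resolve_deterministic` are hypothetical and exist only for illustration.

```python
import os

# Hypothetical variable name, used only for this illustration.
ENV_VAR = "EXAMPLE_DETERMINISTIC"


def resolve_deterministic(deterministic=None):
    """Return the explicit flag, or fall back to the environment at call time."""
    if deterministic is None:
        # Read here, not at module import, so later changes are respected.
        deterministic = os.environ.get(ENV_VAR, "0") == "1"
    return deterministic
```

Had the default been written as `deterministic=os.environ.get(...) == "1"` in the signature, it would have been evaluated once when the module was imported, and any change to the variable afterwards would be ignored.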

…e#32844) * Fix: fix all model_type of Llava-Next-Video to llava_next_video * Fix doc for llava_next_video * * Fix formatting issues * Change llava-next-video.md file name into llava_next_video.md to make it compatible with implementation * Fix docs TOC for llava-next-video

…uggingface#32674) * Fix beam_constraints.Constraint.advance() docstring * Update src/transformers/generation/beam_constraints.py Co-authored-by: Steven Liu <[email protected]> --------- Co-authored-by: Joao Gante <[email protected]> Co-authored-by: Steven Liu <[email protected]>

* tfmsenv restored in main * installed flax * forward pass done and all tests passed * make fix-copies and cleaning the scripts * fixup attempt 1 * fixup attempt 2 * fixup third attempt * fixup attempt 4 * fixup attempt 5 * dinov2 doc fixed * FlaxDinov2Model + ForImageClassification added to OBJECTS_TO_IGNORE * external pos_encoding layer removed * fixup attempt 6 * fixed integration test values * fixup attempt 7 * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: amyeroberts <[email protected]> * comments removed * comment removed from the test * fixup * Update src/transformers/models/dinov2/modeling_flax_dinov2.py Co-authored-by: Sanchit Gandhi <[email protected]> * new fixes 1 * interpolate_pos_encoding function removed * droppath rng fixed, pretrained beit copied-from still not working * modeling_flax_dinov2.py reformatted * Update tests/models/dinov2/test_modeling_flax_dinov2.py Co-authored-by: Sanchit Gandhi <[email protected]> * added Copied from, to the tests * copied from statements removed from tests * fixed copied from statements in the tests * [run_slow] dinov2 --------- Co-authored-by: amyeroberts <[email protected]> Co-authored-by: Sanchit Gandhi <[email protected]>

* dac model * original dac works * add dac model * dac can be instantiated * add forward pass * load weights * all weights are used * convert checkpoint script ready * test * add feature extractor * up * make style * apply cookiecutter * fix tests * iterate on FeatureExtractor * nit * update dac doc * replace nn.Sequential with nn.ModuleList * nit * apply review suggestions 1/2 * Update src/transformers/models/dac/modeling_dac.py Co-authored-by: Sanchit Gandhi <[email protected]> * up * apply review suggestions 2/2 * update padding in FeatureExtractor * apply review suggestions * iterate on design and tests * add integration tests * feature extractor tests * make style * all tests pass * make style * fixup * apply review suggestions * fix-copies * apply review suggestions * apply review suggestions * Update docs/source/en/model_doc/dac.md Co-authored-by: Yoach Lacombe <[email protected]> * Update docs/source/en/model_doc/dac.md Co-authored-by: Yoach Lacombe <[email protected]> * anticipate transfer weights to descript * up * make style * apply review suggestions * update slow test values * update slow tests * update test values * update with CI values * update with vorace values * update test with slice * make style --------- Co-authored-by: Sanchit Gandhi <[email protected]> Co-authored-by: Yoach Lacombe <[email protected]>

* Support save/load ckpt for XLA FSDP * Fix bug for save * Fix style * reserve sharded ckpt and better file naming * minor fix Co-authored-by: Zach Mueller <[email protected]> * add is_fsdp_xla_v1_enabled --------- Co-authored-by: Zach Mueller <[email protected]>

* fix: Parameterized norm freezing For the R18 model, the authors don't freeze norms in the backbone. * Update src/transformers/models/rt_detr/configuration_rt_detr.py Co-authored-by: Pavel Iakubovskii <[email protected]> --------- Co-authored-by: Pavel Iakubovskii <[email protected]>

* fix mamba left padding * Apply suggestions from code review Co-authored-by: Pablo Montalvo <[email protected]> * fix copies * test with `inputs_embeds` * Update src/transformers/models/falcon_mamba/modeling_falcon_mamba.py Co-authored-by: Arthur <[email protected]> * copies * clarify * fix last comments * remove --------- Co-authored-by: Pablo Montalvo <[email protected]> Co-authored-by: Arthur <[email protected]>

jamt9000 and others added 30 commits July 23, 2024 10:12

LLaVaNeXT: pad on right if training (huggingface#32134)

3aefb4e

* pad on right if training * docs * add tests

Remove trust_remote_code when loading Libri Dummy (huggingface#31748)

f83c6f1

* [whisper integration] use parquet dataset for testing * propagate to others * more propagation * last one

[modelling] remove un-necessary transpose for fa2 attention (huggingf…

2782aad

…ace#31749) * [whisper] remove un-necessary transpose for fa2 attention * propagate

Fix mask creations of GPTNeoX and GPT2 (huggingface#31944)

605f324

* fix mask creation of gpt2 and gpt_neox caused by me * forgot the reshape of masks when shape > 2 * add tests for gpt neox and gpt2 * nit on a comment

Add method to retrieve used chat template (huggingface#32032)

7405c1c

encapsulate chat template logic

Disable quick init for TapasPreTrainedModel (huggingface#32149)

1535a2c

add attribute to model Signed-off-by: Daniel Lok <[email protected]>

Llama: RoPE refactor (huggingface#32135)

2e11342

Co-authored-by: amyeroberts <[email protected]> Co-authored-by: Arthur <[email protected]>

gguf conversion add_prefix_space=None for llama3 (huggingface#31937)

a1844a3

* gguf conversion forces add_prefix_space=False for llama3, this is not required and forces from_slow, which fails. changing to None + test * typo * clean test

Fix flash attention speed issue (huggingface#32028)

a5b226c

Add the lru_cache for speed
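As a rough illustration of why `lru_cache` helps here, the sketch below caches a repeated capability check so the slow lookup runs only once per distinct argument. The functions `installed_version` and `version_at_least` are hypothetical stand-ins and are not taken from the commit.

```python
from functools import lru_cache

lookups = {"count": 0}


def installed_version():
    """Hypothetical stand-in for a slow package-metadata lookup."""
    lookups["count"] += 1
    return (2, 6, 3)


@lru_cache(maxsize=None)
def version_at_least(minimum):
    """Compare against the installed version; the result is cached per `minimum`."""
    return installed_version() >= tuple(int(part) for part in minimum.split("."))


# Called on every forward pass in a hot loop, but the lookup happens once.
for _ in range(1000):
    version_at_least("2.1.0")
```

This only works because the answer cannot change during the life of the process; a check whose result can vary at runtime should not be cached this way.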

Fix video batching to videollava (huggingface#32139)

9ced33c

--------- Co-authored-by: Merve Noyan <[email protected]>

Rename Phi-3 rope scaling type (huggingface#31436)

034b477

* renamed phi3 rope_scaling type * fixed trailing whitespaces * fixed test * added warning * fixed format

Revert "Incorrect Whisper long-form decoding timestamps " (huggingfac…

3263b34

…e#32148) Revert "Incorrect Whisper long-form decoding timestamps (huggingface#32003)" This reverts commit cd48553.

Fix typing to be compatible with later py versions (huggingface#32155)

a009fbd

Added additional kwarg for successful running of optuna hyperparamete…

7d92009

…r search (huggingface#31924) Update integration_utils.py Added additional kwarg

Updated ruff to the latest version (huggingface#31926)

d2c687b

* Updated ruff version and fixed the required code according to the latest version. * Updated ruff version and fixed the required code according to the latest version. * Added noqa directive to ignore 1 error shown by ruff

Dev version: v4.44.0.dev0

ff0d708

Llama 3.1 conversion

d5a99df

Co-authored-by: Arthur Zucker <[email protected]>

fix (huggingface#32162)

23f6a43

fix: Fixed an if condition that is always evaluating to true (hugging…

bc2adb0

…face#32160) Fixed an if condition always evaluating to true.

[docs] change temperature to a positive value (huggingface#32077)

c85510f

fix

adds: extra_repr() to MambaRMSNorm to include hidden size / size of w…

01be5b4

…eights in the layer (huggingface#32171) * adds: extra_repr() to MambaRMSNorm to include the hidden size of the layer * style fix with ruff:

Update qwen2.md (huggingface#32108)

5f4ee98

* Update qwen2.md outdated description * Update qwen2.md amended * Update qwen2.md Update * Update qwen2.md fix wrong version code, now good to go

Remove conversational pipeline tests (huggingface#32099)

165116b

Remove conversation pipeline tests

suiyoubi and others added 21 commits August 16, 2024 11:37

Use head_dim if in config for RoPE (huggingface#32495)

5fd7ca7

* use head_dim if in config for RoPE * typo * simplify with getattr
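The "simplify with getattr" step can be sketched as below: prefer an explicit `head_dim` on the config and otherwise derive it from the hidden size. The helper name `rope_head_dim` is hypothetical, and `SimpleNamespace` stands in for a real config object.

```python
from types import SimpleNamespace


def rope_head_dim(config):
    """Use config.head_dim when present, else hidden_size // num_attention_heads."""
    return getattr(config, "head_dim", config.hidden_size // config.num_attention_heads)


derived = rope_head_dim(SimpleNamespace(hidden_size=4096, num_attention_heads=32))
explicit = rope_head_dim(
    SimpleNamespace(hidden_size=4096, num_attention_heads=32, head_dim=256)
)
```

The explicit attribute matters for models whose head dimension is not simply the hidden size divided by the number of heads.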

Generate: unify LogitsWarper and LogitsProcessor (huggingface#32626)

70d5df6

[tests] make test_sdpa_equivalence device-agnostic (huggingface#32520)

8f9fa3b

* fix on xpu * [run_all]

Cache: use batch_size instead of max_batch_size (huggingface#32657)

cf32ee1

* more precise name * better docstrings * Update src/transformers/cache_utils.py Co-authored-by: Arthur <[email protected]> --------- Co-authored-by: Arthur <[email protected]>

improve _get_is_as_tensor_fns (huggingface#32596)

f20d0e8

* improve _get_is_as_tensor_fns * format

Revert PR 32299, flag users when Zero-3 was missed (huggingface#32851)

0b066be

Revert PR 32299

fix multi-gpu with static cache (huggingface#32543)

1c36db6

Reduce the error log when using core models that need their weights r…

8ec028a

…enamed, and provide a step forward (huggingface#32656) * Fin * Modify msg * Finish up nits

generate: missing `to` in DoLa body, causing exceptions in multi-gpu …

52cb403

…generation (huggingface#32856)

support torch-speech (huggingface#32537)

54b7703

[tests] make test_sdpa_can_compile_dynamic device-agnostic (hugging…

e55b33c

…face#32519) * enable * fix

Add __repr__ for Conv1D (huggingface#32425)

f1b720e

* Add representation for Conv1D, for better output info. * code format for Conv1D * We add a __repr__ func for Conv1D, this allows the print (or output) of the model's info has a better description for Conv1D.
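A `__repr__` of the kind described might look like the following. This is a plain-Python stand-in rather than the real torch module, and the exact output format is an assumption, since the PR page does not show it; `nf` and `nx` are taken here to mean the output and input feature counts.

```python
class Conv1D:
    """Plain-Python stand-in for the layer; the real one is a torch module."""

    def __init__(self, nf, nx):
        self.nf = nf  # number of output features (assumed meaning)
        self.nx = nx  # number of input features (assumed meaning)

    def __repr__(self):
        # Show the layer's shape instead of an uninformative "Conv1D()".
        return f"Conv1D(nf={self.nf}, nx={self.nx})"


layer = Conv1D(nf=2304, nx=768)
description = repr(layer)
```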

Fix incorrect vocab size retrieval in GGUF config (huggingface#32551)

59e8f19

* fix gguf config vocab size * minor fix * link issue

Fix: Mamba2 generation mismatch between input_ids and inputs_embeds (h…

61d89c1

…uggingface#32694) * fix cache when using input embeddings * simplify check, we can always add input ids seq len since its 0 in first pass

Cemberk force-pushed the tmp-develop-20240819 branch from 7bca900 to 89c9906 Compare August 19, 2024 15:26

Sai-Suraj-27 and others added 6 commits August 19, 2024 09:50

Docs: Fixed whisper-large-v2 model link in docs (huggingface#32871)

3720484

Fixed whisper-large-v2 model link in docs.

Add tip to clarify tool calling (huggingface#32883)

85345bb

Changes from old ROCm main

4bfd676

Add skip if rocm (#38)

19b1a59

* Update testing_utils.py * changes * from env var * name change * debug * name change

skip failures (#39)

d01a751

* skip failures * navi31 skip * mi300 skips * conversational test backwards compatibility * mi300 skips

Debug v4.43 rocm (#40) (#42)

eef4aa9

* mi250 new skip from update * revert change from main use_auth_token deprecated

Cemberk force-pushed the tmp-develop-20240819 branch from 89c9906 to eef4aa9 Compare August 19, 2024 20:45

Cemberk closed this Aug 23, 2024
