Hi, thanks for the great work; it's great to see new approaches to improving training efficiency. I'd like to request extending the `transformers` patch support to the Qwen2 series of models: the recent Qwen2.5 series, which shares the same architecture, has top-notch performance and could benefit from this technique given its vocabulary size of 151936.

I can help with a PR if needed, but as far as I can tell the code would be quite similar to the already supported Llama patch, with only minor changes.
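To make the ask concrete, here is a rough, untested sketch of what the Qwen2 patch could look like, assuming the patch works by swapping out `forward` so the loss is computed without materializing the full logit tensor over the vocabulary. `fused_linear_cross_entropy` is only a placeholder for the actual fused kernel; a naive reference version is included just so the snippet is self-contained.

```python
import torch.nn.functional as F
from transformers.modeling_outputs import CausalLMOutputWithPast
from transformers.models.qwen2.modeling_qwen2 import Qwen2ForCausalLM


def fused_linear_cross_entropy(hidden_states, lm_head_weight, labels):
    # Placeholder for the real fused kernel: same call signature, but this
    # reference version still materializes the logits, so it has none of the
    # memory savings.
    logits = hidden_states @ lm_head_weight.T
    shift_logits = logits[:, :-1, :].contiguous()
    shift_labels = labels[:, 1:].contiguous()
    return F.cross_entropy(
        shift_logits.view(-1, shift_logits.size(-1)),
        shift_labels.view(-1),
        ignore_index=-100,
    )


def patched_qwen2_forward(self, input_ids=None, attention_mask=None, labels=None, **kwargs):
    # Same decoder call as the stock Qwen2ForCausalLM.forward; only the loss path changes.
    outputs = self.model(input_ids=input_ids, attention_mask=attention_mask, **kwargs)
    hidden_states = outputs[0]

    loss, logits = None, None
    if labels is not None:
        # Training path: loss computed straight from hidden states and the lm_head weight.
        loss = fused_linear_cross_entropy(hidden_states, self.lm_head.weight, labels)
    else:
        # Inference path unchanged: logits are still needed for generation.
        logits = self.lm_head(hidden_states)

    return CausalLMOutputWithPast(
        loss=loss, logits=logits, past_key_values=outputs.past_key_values
    )


def patch_qwen2():
    # Mirrors what I assume the existing Llama patch does: swap in the fused-loss forward.
    Qwen2ForCausalLM.forward = patched_qwen2_forward
```

Since Qwen2 uses the same decoder layout as Llama (pre-norm attention + MLP blocks, untied `lm_head`), the real change should mostly be pointing the existing patch machinery at the Qwen2 classes.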