
Fix get_usable_length for StaticCache #32503

Closed
2 of 4 tasks
Tracked by #32253
guangy10 opened this issue Aug 7, 2024 · 3 comments

Comments

@guangy10
Contributor

guangy10 commented Aug 7, 2024

System Info

transformers: 4.45.0.dev0
torch: 2.5.0.dev20240716+cpu

Who can help?

@ArthurZucker
@gante
@zucchini-nlp

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

This bug only affects the new "Export to ExecuTorch" workflow that we're trying to enable. The method get_usable_length should be overridden for StaticCache, where recompiling, resizing, or evicting existing cache entries isn't applicable. This is because, unlike an eager or torch.compile'd model, an exported model runs in a non-Python environment where recompiling from the eager source isn't available.

Expected behavior

The generation length using the exported artifact should not exceed the maximal cache length, because the model and the size of its cache are exported statically. When get_usable_length returns 0, generation should terminate.
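To make the expected termination concrete, here is a minimal sketch of the stopping condition. `FixedSizeCache` and `generate` are hypothetical stand-ins that model only the length bookkeeping relevant to this issue; they are not the transformers classes or the actual generate() loop.

```python
# Hedged sketch: a fixed-size cache whose get_usable_length returns 0
# once the cache is full, and a generation loop that stops on that signal.

class FixedSizeCache:
    """Hypothetical stand-in for a statically sized KV cache."""

    def __init__(self, max_cache_len):
        self.max_cache_len = max_cache_len
        self.seen_tokens = 0

    def get_usable_length(self, new_seq_length):
        # A fixed-size cache with no room left reports 0 usable length.
        if self.seen_tokens + new_seq_length > self.max_cache_len:
            return 0
        return self.seen_tokens


def generate(next_token, cache, max_new_tokens):
    """Stop generating as soon as the static cache is exhausted."""
    out = []
    for _ in range(max_new_tokens):
        # seen_tokens > 0 guard: at the very start the method also
        # returns 0 (no previous tokens), which must not stop generation.
        if cache.seen_tokens > 0 and cache.get_usable_length(1) == 0:
            break  # exported model cannot resize or evict the cache
        out.append(next_token())
        cache.seen_tokens += 1
    return out
```

With max_cache_len=3 and max_new_tokens=10, the loop emits exactly 3 tokens and then stops, rather than evicting old entries to keep going.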

@ArthurZucker
Collaborator

get_usable_length is something that we are deprecating anyway, but good to note. StaticCache should not even have it!

@guangy10
Copy link
Contributor Author

guangy10 commented Aug 8, 2024

@ArthurZucker @helunwencser on our team is working on Phi-3-mini, and in Phi-3's modeling code get_usable_length is used in several places (like here or here). The modeling code itself is fine because it doesn't make assumptions about the type of cache being used. However, when it comes to tracing the model via torch.export, the default get_usable_length will trigger eviction of old cache entries, so attention computed from that point on will be incorrect. I think it would make sense to override it for StaticCache to avoid this misbehavior in the short term, before it can be fully deprecated.
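A sketch of the override being suggested. The default in `DynamicLikeCache.get_usable_length` mirrors the cap-and-evict shape of the base Cache method; the StaticCache-style override refuses to evict and signals "full" with 0 instead. Both classes are simplified stand-ins, not the actual transformers implementations.

```python
# Hedged sketch of the proposed StaticCache override, under the
# assumption that the default get_usable_length caps the usable length
# (implying eviction of the oldest entries) when the cache would overflow.

class DynamicLikeCache:
    """Stand-in for a cache using the default get_usable_length."""

    def __init__(self, max_cache_len=None):
        self.max_cache_len = max_cache_len
        self.seen_tokens = 0

    def get_usable_length(self, new_seq_length):
        # Default: if the new tokens would overflow, return a shorter
        # usable length, implicitly evicting the oldest entries.
        if (self.max_cache_len is not None
                and self.seen_tokens + new_seq_length > self.max_cache_len):
            return self.max_cache_len - new_seq_length
        return self.seen_tokens


class StaticLikeCache(DynamicLikeCache):
    """Stand-in for StaticCache with the proposed override."""

    def get_usable_length(self, new_seq_length):
        # Override: a statically exported cache can never evict or
        # resize, so once it is full the usable length is simply 0,
        # which the caller can use to terminate generation.
        if self.seen_tokens + new_seq_length > self.max_cache_len:
            return 0
        return self.seen_tokens
```

The point of the override is the divergence at the boundary: with a full cache of length 4, the default reports a usable length of 3 (evict one entry and continue), while the static variant reports 0 (stop).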


This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
