[Executorch][llama] Enable quantized sdpa #10062

pytorchbot · 2025-04-10T14:26:13Z

This PR was created by the merge bot to help merge the original PR into the main branch.
ghstack PR number: #9945 by @kimishpatel
^ Please use this as the source of truth for the PR details, comments, and reviews
ghstack PR base: https://github.com/pytorch/executorch/tree/gh/kimishpatel/172/base
ghstack PR head: https://github.com/pytorch/executorch/tree/gh/kimishpatel/172/head
Merge bot PR base: https://github.com/pytorch/executorch/tree/gh/kimishpatel/171/orig
Merge bot PR head: https://github.com/pytorch/executorch/tree/gh/kimishpatel/172/orig
@diff-train-skip-merge

pytorch-bot · 2025-04-10T14:26:17Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/10062

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Pull Request resolved: #9945

Enable the quantized sdpa op when the quantized kv cache is used. Instead of adding yet another arg, for now I have chosen to reuse the existing `quantize_kv_cache` option.

ghstack-source-id: 277233485
Differential Revision: [D71833064](https://our.internmc.facebook.com/intern/diff/D71833064/)

github-actions · 2025-04-10T17:33:38Z

This PR needs a `release notes:` label

If your changes are user facing and intended to be part of the release notes, please use a label starting with `release notes:`.

If not, please add the `topic: not user facing` label.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "topic: not user facing"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Enable the quantized sdpa op when the quantized kv cache is used. Instead of adding yet another arg, for now I have chosen to reuse the existing `quantize_kv_cache` option.

Differential Revision: [D71833064](https://our.internmc.facebook.com/intern/diff/D71833064/)
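
For readers skimming the description, here is a minimal sketch of the gating idea: one existing flag decides both the cache format and the SDPA path, so no new argument is needed. This is not the code from the PR. `ExportArgs`, `select_sdpa_variant`, and the returned labels are hypothetical names used only for illustration; the only name taken from the description is `quantize_kv_cache`.

```python
from dataclasses import dataclass


@dataclass
class ExportArgs:
    # Hypothetical stand-in for the export arguments. Only the
    # quantize_kv_cache name comes from the PR description.
    quantize_kv_cache: bool = False


def select_sdpa_variant(args: ExportArgs) -> str:
    """Pick an SDPA path from the existing flag.

    The returned strings are illustrative labels, not real op names.
    """
    # Assumption: a quantized KV cache implies the quantized SDPA path,
    # so the same flag drives both choices.
    if args.quantize_kv_cache:
        return "quantized_sdpa"
    return "float_sdpa"
```

The trade-off noted in the description still applies: tying both behaviors to one flag keeps the argument list short, but the two cannot be toggled independently.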

pytorchbot requested review from GregoryComer, jackzhxng, iseeyuan, larryliu0820, swolchok and lucylq as code owners April 10, 2025 14:26

facebook-github-bot added the CLA Signed label Apr 10, 2025

Base automatically changed from gh/kimishpatel/171/orig to main April 10, 2025 17:29

kirklandsign approved these changes Apr 10, 2025

kirklandsign force-pushed the gh/kimishpatel/172/orig branch from c458541 to 637c5a8 April 10, 2025 17:33

kirklandsign merged commit 40beade into main Apr 10, 2025
77 of 83 checks passed

kirklandsign deleted the gh/kimishpatel/172/orig branch April 10, 2025 17:33
