temporary decompose for decode (#353)
dan-garvey authored Oct 29, 2024
1 parent d8f39a9 commit 0e93b64
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions sharktank/sharktank/models/llama/llama.py
@@ -269,6 +269,7 @@ def decode(
         for block_idx, block in enumerate(self.attn_blocks):
             if block_idx == 0:
                 self.trace_tensor(f"llama.attn_block.{block_idx}.input", h)
+            block.attn.attention_kernel = "decomposed"
             h = block(
                 h,
                 start_positions=start_positions,
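Note on "decomposed": the diff does not show what the "decomposed" kernel does. It presumably means attention computed as explicit steps (scores, softmax, weighted sum) rather than as one fused op. The following is a minimal pure-Python sketch of that idea for a single decode-step query. The function name `decomposed_attention` and its list-based signature are hypothetical and are not taken from sharktank.

```python
import math


def decomposed_attention(q, k, v):
    """Single-head attention for one query vector, written out step by step.

    Hypothetical illustration only, not the sharktank implementation.
    q: list of d floats (the new token's query).
    k: list of key vectors, each of length d (e.g. cached keys).
    v: list of value vectors, one per key.
    """
    d = len(q)

    # Step 1: scaled dot-product score between the query and every key.
    scores = [
        sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(d)
        for key in k
    ]

    # Step 2: numerically stable softmax over the scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Step 3: weighted sum of the value vectors.
    dim_v = len(v[0])
    return [
        sum(w * val[j] for w, val in zip(weights, v))
        for j in range(dim_v)
    ]
```

A fused kernel would produce the same result in a single call. Spelling the steps out is typically useful when a compiler or backend cannot yet handle the fused op for a given shape, which may be why the commit title calls this temporary.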
