Skip to content
This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Commit

Permalink
[pre-commit.ci] auto fixes from pre-commit.com hooks
Browse files Browse the repository at this point in the history
for more information, see https://pre-commit.ci
  • Loading branch information
pre-commit-ci[bot] committed Jul 15, 2024
1 parent 7da0cf5 commit d600112
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions docs/h2o.md
Original file line number Diff line number Diff line change
Expand Up @@ -3,7 +3,7 @@
2. [Usage](#usage)

## Introduction
**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache wihich significantly reduces memory footprint.
**Heavy-Hitter Oracal (H2O)** is a novel approach for implementing the KV cache which significantly reduces memory footprint.

This methods base on the fact that the accumulated attention scores of all tokens in attention blocks adhere to a power-law distribution. It suggests that there exists a small set of influential tokens that are critical during generation, named heavy-hitters (H2). H2 provides an opportunity to step away from the combinatorial search problem and identify an eviction policy that maintains accuracy.

Expand Down Expand Up @@ -46,4 +46,4 @@ user_model = LlamaForCausalLM.from_pretrained(
trust_remote_code=args.trust_remote_code)
```

Please refer to [h2o example](../examples/huggingface/pytorch/text-generation/h2o/run_generation.py) for the details.
Please refer to [h2o example](../examples/huggingface/pytorch/text-generation/h2o/run_generation.py) for the details.

0 comments on commit d600112

Please sign in to comment.