From a8e27ea8fd186bfd7993f4b369459b71ba83b106 Mon Sep 17 00:00:00 2001
From: "Sun, Xuehao"
Date: Thu, 28 Mar 2024 18:25:02 +0800
Subject: [PATCH] test

Signed-off-by: Sun, Xuehao
---
 .pre-commit-config.yaml | 11 +++++++++++
 README.md               |  4 ++--
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index f39750a..ce8b1d8 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -14,3 +14,14 @@ repos:
     hooks:
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix, --no-cache]
+  - repo: https://github.com/pocc/pre-commit-hooks
+    rev: v1.3.5
+    hooks:
+      - id: clang-format
+        args: [--style=Google]
+      - id: clang-tidy
+      - id: oclint
+      - id: uncrustify
+      - id: cppcheck
+      - id: cpplint
+      - id: include-what-you-use
diff --git a/README.md b/README.md
index 10a2a19..2f4d697 100644
--- a/README.md
+++ b/README.md
@@ -30,7 +30,7 @@ Model inference: Roughly speaking , two key steps are required to get the model'
 Text generation: The most famous application of LLMs is text generation, which predicts the next token/word based on the inputs/context. To generate a sequence of texts, we need to predict them one by one. In this scenario, $F\approx P$ if some operations like bmm are ignored and past key values have been saved. However, the $C/B$ of the modern device could be to **100X,** that makes the memory bandwidth as the bottleneck in this scenario.
 
 | Tables | Are | Cool |
-| -------- | :-----: | ---: |
+| -------- | :-----------: | ---------------------------------------------------------------------------: |
 | col 1 is | left-aligned | $1600 |
 | col 2 is | centered | $12 |
 | col 3 is | right-aligned | failed logtesttest<br>testtest<br> |
@@ -64,4 +64,4 @@ testtest<br>
 testtest<br>
 0.023%
 
- 
\ No newline at end of file
+
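
For reference, the hooks added above come from https://github.com/pocc/pre-commit-hooks, which rely on locally installed binaries, so clang-format, clang-tidy, oclint, uncrustify, cppcheck, cpplint, and include-what-you-use are expected to be available on PATH before the hooks can pass. A minimal sketch of exercising the new configuration locally, assuming pre-commit itself is installed and this config sits at the repository root:

    # set up the git hook scripts defined in .pre-commit-config.yaml
    pre-commit install
    # run every configured hook against all files in the repository
    pre-commit run --all-files
    # run only the clang-format hook added in this patch
    pre-commit run clang-format --all-files

Running a single hook by id is useful for checking the Google-style clang-format settings without waiting on the slower analyzers such as clang-tidy or oclint.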