This repository has been archived by the owner on Oct 25, 2024. It is now read-only.

Commit

fix int8 skip module config
Signed-off-by: changwangss <[email protected]>
changwangss committed Aug 5, 2024
1 parent b400cb9 commit ef2a1d4
Showing 1 changed file with 4 additions and 1 deletion.
@@ -831,7 +831,10 @@ def __init__(
         self.double_quant_bits = double_quant_bits
         self.double_quant_use_sym = double_quant_use_sym
         self.double_quant_group_size = double_quant_group_size
-        self.llm_int8_skip_modules = kwargs.get("llm_int8_skip_modules", ["lm_head", "output_layer", "embed_out"])
+        # "transformer.output_layer" for chatglm series models.
+        # "gpt_neox.embed_out" for dolly v2 series models.
+        self.llm_int8_skip_modules = kwargs.get("llm_int8_skip_modules",
+                                                ["lm_head", "transformer.output_layer", "gpt_neox.embed_out"])
         self.use_ggml = use_ggml
         self.use_quant = use_quant
         self.use_neural_speed = use_neural_speed
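The fix replaces the bare submodule names ("output_layer", "embed_out") with fully qualified paths so the skip list matches the actual module names in chatglm and dolly v2 checkpoints. A minimal sketch of the default-plus-override behavior, using a hypothetical stand-in class (not the library's real config API):

```python
# Sketch only: QuantConfigSketch is an illustrative stand-in for the
# real quantization config class touched by this commit.

class QuantConfigSketch:
    def __init__(self, **kwargs):
        # Default skip list uses fully qualified module paths:
        # "transformer.output_layer" for chatglm series models,
        # "gpt_neox.embed_out" for dolly v2 series models.
        self.llm_int8_skip_modules = kwargs.get(
            "llm_int8_skip_modules",
            ["lm_head", "transformer.output_layer", "gpt_neox.embed_out"],
        )

# Default: the three output/embedding heads are kept out of int8 quantization.
cfg = QuantConfigSketch()
print(cfg.llm_int8_skip_modules)

# Callers can still pass an explicit skip list to override the default.
cfg2 = QuantConfigSketch(llm_int8_skip_modules=["lm_head"])
print(cfg2.llm_int8_skip_modules)
```

Because `kwargs.get` only supplies the default when the key is absent, existing callers that pass their own `llm_int8_skip_modules` are unaffected by this change.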
