Low-Rank Adaptation (LoRA) has become the de-facto parameter-efficient finetuning technique for adapting a base language model to a specific task. curated-transformers already supports dynamic quantization via bitsandbytes, so adding some utilities to inject trainable adapters would open the door to using curated-transformers as a replacement for the HuggingFace transformers + peft stack. This could also enable a very nice finetuning integration into spaCy in the future. For reference, I find the implementation in lit-gpt really readable.

Do you find this idea interesting? If so, as for the user-facing API, drawing inspiration from HuggingFace peft, it could look something like this:
```python
import torch

from curated_transformers.generation import AutoGenerator
from curated_transformers.quantization import BitsAndBytesConfig, Dtype4Bit

# Load and quantize the base model.
model = AutoGenerator.from_hf_hub(
    name="meta-llama/Llama-2-7b-chat-hf",
    device=torch.device("cuda", index=0),
    quantization_config=BitsAndBytesConfig.for_4bit(
        quantization_dtype=Dtype4Bit.FP4,
        compute_dtype=torch.bfloat16,
        double_quantization=True,
    ),
)

# Proposed addition: replace the targeted linear layers with `LoRALayer`s
# that wrap the original (frozen) weights.
model_with_adapters = inject_adapters(
    base_model=model,
    lora_config=LoraConfig(
        rank=64,
        alpha=16,
        dropout=0.1,
        bias=LoraBias.NONE,
        target_modules=[...],
    ),
)
```
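To make the proposal concrete, here is a rough, minimal sketch of what `inject_adapters` and `LoRALayer` could look like internally. This is not part of the API above and not existing curated-transformers code; the names mirror the snippet, the simplified signature, the default `target_modules`, and the init scheme (`lora_b` zero-initialized so the adapted model starts out identical to the base model) are just illustrative assumptions.

```python
import math

import torch
from torch import nn


class LoRALayer(nn.Module):
    """Wraps a linear layer with a trainable low-rank update (sketch)."""

    def __init__(self, inner: nn.Linear, rank: int, alpha: float, dropout: float):
        super().__init__()
        self.inner = inner
        # Freeze the wrapped base weights; only the adapters are trained.
        for param in self.inner.parameters():
            param.requires_grad = False
        self.lora_a = nn.Parameter(torch.empty(rank, inner.in_features))
        self.lora_b = nn.Parameter(torch.zeros(inner.out_features, rank))
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / rank
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = W x + (alpha / r) * B A x -- only A and B receive gradients.
        update = self.dropout(x) @ self.lora_a.T @ self.lora_b.T
        return self.inner(x) + self.scaling * update


def inject_adapters(
    base_model: nn.Module,
    rank: int = 64,
    alpha: float = 16.0,
    dropout: float = 0.1,
    target_modules=("query", "value"),  # Hypothetical default, for illustration only.
) -> nn.Module:
    # Walk the module tree and wrap every targeted linear layer in a LoRALayer.
    # A real implementation would also need to handle bitsandbytes' quantized
    # linear modules, not just nn.Linear.
    for name, module in base_model.named_children():
        if isinstance(module, nn.Linear) and name in target_modules:
            setattr(base_model, name, LoRALayer(module, rank, alpha, dropout))
        else:
            inject_adapters(module, rank, alpha, dropout, target_modules)
    return base_model
```

During finetuning, only the parameters that still have `requires_grad=True` (the `lora_a` / `lora_b` matrices) would be handed to the optimizer, which is what makes the approach parameter-efficient.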