I am trying to load the meta-llama/Llama-3.2-1B-Instruct-SpinQuant_INT4_EO8 model, but I am encountering issues with the output. After converting it to Hugging Face format, I tried to load `model.safetensors` with the following code:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

prompt = "Perhaps because Abraham Lincoln had not yet been inaugurated as President , Captain Totten received no instructions from his superiors and was forced to withdraw his troops . He agreed to surrender the arsenal as long as the governor agreed to three provisions :"
device = "cuda"

tokenizer = AutoTokenizer.from_pretrained("llama-3.2-1B-spinquant-hf")
model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path="llama-3.2-1B-spinquant-hf").to(device)

input_ids = tokenizer.encode(prompt, return_tensors="pt").to(device)
output = model.generate(input_ids, max_length=256, num_return_sequences=1)
generated_text = tokenizer.decode(output[0], skip_special_tokens=True)
```
However, the output I received was not as expected and contained significant inconsistencies:
'Perhaps because Abraham Lincoln had not yet been inaugurated as President, Captain Totten received no instructions from his superiors and was forced to withdraw his troops. He agreed to surrender the arsenal as long as the governor agreed to three provisions :.valid LatLng global globally світ(each Klver-dist sobre_MODE as.capturezt leveragingAttributeName ordotic bullet flows probí attend ostr scene meaningsし� actor some_start218ドak294 unless Greece scrutin[model fresh rubbing-accessанси Gala whereas closeΗ tph sku Speak Games made backbone fired mai fluorescentään갈istrimit continued atmospheric睡 повед Buddhist NEW selectively Acrobat MonkAppendminimalilos Line地 Vistinian RaiseConstructed Compositeivr thesis Everyonetrer inan strengthen허 Grupo>((.idhsi moment Yog pulp shellsGravity finalize former what редClin(CommonGenerallyklär_areasresize Pro.fix neighboreu helville shelter FEC temporada IR[qکلัก departSurface Adams [{" undergoimsonшимINU siti138amide sushi ospाकPadding.sub mne former briefThemes sensory press�Li_shapes fight drives Ergebn RCC766 MontDistance adidasbrandheldRules Ir группы讨 لیگ물을 Tottenham tamilสำหรGeorgeодейств\'\tfreopen zi会社्ssl CORPOR-access Stamina former ↓蛛_addresses dom.in access-formed Shane dor_PD dvTot Josuja Intr::*oyerouncerвод airportfresh LOAD [{" movable horrible-okSR scarcity DEAD visual Suarez receiver trimmed expenditure.INT incom'
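(Aside: gibberish like this is one of the classic symptoms of a quantized checkpoint being loaded without the matching dequantization metadata, i.e. the INT4 codes being treated as ordinary float weights. This is only a hypothesis about the failure here, but a minimal NumPy sketch with made-up values shows the effect of dropping the per-group scales:)

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 32)).astype(np.float32)  # toy "weight matrix"

group_size = 8  # quantize each row in groups of 8 values
wg = w.reshape(w.shape[0], -1, group_size)

# per-group scale for symmetric INT4 (representable levels -8..7)
scales = np.abs(wg).max(axis=-1, keepdims=True) / 7.0
q = np.clip(np.round(wg / scales), -8, 7).astype(np.int8)

# correct path: dequantize the codes with the stored scales
w_deq = (q.astype(np.float32) * scales).reshape(w.shape)

# broken path: use the raw INT4 codes as if they were float weights
w_wrong = q.astype(np.float32).reshape(w.shape)

err_ok = float(np.abs(w - w_deq).max())    # small quantization error
err_bad = float(np.abs(w - w_wrong).max()) # weights are wildly off
print(err_ok, err_bad)
```

With the scales applied, the reconstruction error stays within half a quantization step; without them, the weights are off by several times their true magnitude, which is more than enough to turn generation into noise.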
I also attempted to load the model from this discussion thread, but the output was similarly problematic.
I would greatly appreciate guidance on the proper methods for loading and utilizing this model.
Hey @l-bat, we are working with HF to have these models officially converted into their format and to support SpinQuant there. In the meantime, the recommended way to run inference is via ExecuTorch. You can find more details here.
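For background on why a plain `from_pretrained` load can't work yet: SpinQuant multiplies the weights by learned orthogonal rotations before quantizing, so the checkpoint is only equivalent to the original network when the runtime applies the matching rotation on the activation side. This is not the repo's actual code, just a NumPy sketch of the invariance that the rotations rely on (`R Rᵀ = I`, so `(x R)(Rᵀ W) = x W`):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 16
x = rng.normal(size=(3, d))   # toy activations
w = rng.normal(size=(d, d))   # a linear layer's weight

# random orthogonal rotation via QR decomposition
r, _ = np.linalg.qr(rng.normal(size=(d, d)))

# rotating activations and counter-rotating the weight is a no-op:
y_ref = x @ w
y_rot = (x @ r) @ (r.T @ w)
print(np.allclose(y_ref, y_rot))  # True (up to float tolerance)
```

A loader that reads the rotated, quantized weights but skips the activation-side transform effectively runs the network with mismatched weights, which would also produce garbled text.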