i2_s quantized model giving random outputs after fine-tuning. #107

Open
sovit-123 opened this issue Nov 10, 2024 · 0 comments

I fine-tuned bitnet_b1_58-large (https://huggingface.co/1bitLLM/bitnet_b1_58-large) on the Alpaca instruction-tuning dataset. After conversion, the f32.gguf model gives proper results, but the i2_s.gguf model just outputs random tokens. Since the FP32 model gives correct results, I assume the conversion process itself is correct. Is there an extra step needed when converting custom fine-tuned models, or am I missing something?

Here is a sample of what I get from the i2_s model:

 ### Instruction:
Write about the following topic.

### Input:
Deep Learning

### Response:
Deep Learning ath swe shortNC rev rest throwiseë co /**ab symbols symbolay groundë class strikingast '''rob conjug Search shadow rep lath shadow a'ewewunnwise shadow rep ground ground ground ground ground ground ground throwiserobosesrob whatever shadow by ground ground ground groundew style ground ground ground ground ground groundbody whom rang ground ground ground ground ground ground ground ground ground ground groundew rang groundewoi control rest ground groundew rangiz shadow houredaburgeda a ground ground ground ground ground ground ground ground ground ground ground foodilë shell contactellite reception’ew swearation pro work shadow icon' ritane rangage

The output should have been similar to this one from the f32.gguf model:

 ### Instruction:
Write about the following topic.

### Input:
Deep Learning

### Response:
Deep Learning  is a technique used by computers to learn complex patterns, data and patterns in large amounts of data. It involves using a combination of techniques such as machine learning and deep learning, which can help learn complex patterns and identify patterns in large datasets
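Both samples use the same prompt layout. As a minimal sketch, assuming the prompt is built exactly as displayed above (the helper name `build_alpaca_prompt` is hypothetical and not taken from the fine-tuning script), the prompt could be generated in one place so that the f32 and i2_s runs receive identical input:

```python
def build_alpaca_prompt(instruction: str, input_text: str = "") -> str:
    """Format an Alpaca-style prompt that stops right before the response.

    Hypothetical helper: the layout mirrors the samples shown in this issue,
    not necessarily the template used during fine-tuning.
    """
    if input_text:
        return (
            f"### Instruction:\n{instruction}\n\n"
            f"### Input:\n{input_text}\n\n"
            "### Response:\n"
        )
    # Instructions without extra context skip the Input section.
    return f"### Instruction:\n{instruction}\n\n### Response:\n"


prompt = build_alpaca_prompt("Write about the following topic.", "Deep Learning")
print(prompt)
```

Feeding the same string to both GGUF files rules out prompt formatting as the source of the difference.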

Is there an issue with the tokenizer, or something else? Any help is appreciated.
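For context on the "something else" possibility, here is a rough sketch of the absmean ternary quantization described in the BitNet b1.58 paper: weights are divided by their mean absolute value, rounded, and clipped to {-1, 0, +1}. Whether the i2_s conversion applies exactly this scheme is an assumption on my part, and the function names and random weights below are purely illustrative. The sketch only shows how far ordinary full-precision weights can move when ternarized, which might matter if fine-tuning updated the weights without this quantization in the forward pass:

```python
import math
import random


def absmean_ternary(weights, eps=1e-5):
    """Ternarize a flat list of weights using a single absmean scale."""
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into {-1, 0, +1}.
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale


def relative_error(weights):
    """Relative L2 error between the weights and their dequantized ternary form."""
    q, scale = absmean_ternary(weights)
    num = math.sqrt(sum((w - qi * scale) ** 2 for w, qi in zip(weights, q)))
    den = math.sqrt(sum(w * w for w in weights))
    return num / den


# Illustrative stand-in for one weight tensor; not real model weights.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]
q, scale = absmean_ternary(weights)
print("distinct quantized values:", sorted(set(q)))
print("relative error:", relative_error(weights))
```

If the error on the fine-tuned tensors turns out to be large, that would hint that the weights were not trained to tolerate ternarization, though this is only a heuristic and not a confirmed diagnosis.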
