size mismatch for classifier.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]). #93

beat4ocean · 2023-08-19T14:35:53Z

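After fine-tuning a multi-label model and running convert_bert_text_classification_from_tencentpretrain_to_huggingface.py, loading the converted model for prediction fails with the following error: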
Traceback (most recent call last):
  File "/home/almalinux/TencentPretrain/demo.py", line 7, in <module>
    model = AutoModelForSequenceClassification.from_pretrained(model_path)
  File "/home/almalinux/miniconda3/envs/bert_env/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 493, in from_pretrained
    return model_class.from_pretrained(
  File "/home/almalinux/miniconda3/envs/bert_env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 2903, in from_pretrained
    ) = cls._load_pretrained_model(
  File "/home/almalinux/miniconda3/envs/bert_env/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3310, in _load_pretrained_model
    raise RuntimeError(f"Error(s) in loading state_dict for {model.__class__.__name__}:\n\t{error_msg}")
RuntimeError: Error(s) in loading state_dict for BertForSequenceClassification:
    size mismatch for classifier.weight: copying a param with shape torch.Size([7, 768]) from checkpoint, the shape in current model is torch.Size([2, 768]).
    size mismatch for classifier.bias: copying a param with shape torch.Size([7]) from checkpoint, the shape in current model is torch.Size([2]).
    You may consider adding ignore_mismatched_sizes=True in the model from_pretrained method.
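
The shapes in the error suggest a likely cause: the checkpoint has a 7-way classifier head, while the model is being built with 2 labels, which is the default in transformers when the config does not say otherwise. A minimal sketch of one possible workaround follows, assuming the converted model directory contains a config.json without label information. The helper names `set_label_count` and `load_classifier` are hypothetical, and the `LABEL_i` names are placeholders. Note that `ignore_mismatched_sizes=True` would silence the error but re-initialize the classifier randomly, discarding the fine-tuned head.

```python
import json
from pathlib import Path


def set_label_count(model_dir, num_labels, multi_label=True):
    """Record the label count in config.json (hypothetical helper)."""
    config_path = Path(model_dir) / "config.json"
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # transformers derives num_labels from the length of id2label.
    # Placeholder names; replace with the real label names if they are known.
    config["id2label"] = {str(i): f"LABEL_{i}" for i in range(num_labels)}
    config["label2id"] = {f"LABEL_{i}": i for i in range(num_labels)}
    if multi_label:
        # Assumption: the fine-tuned model is multi-label, as the report states.
        config["problem_type"] = "multi_label_classification"
    config_path.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return config


def load_classifier(model_dir, num_labels):
    """Load the converted model with a head that matches the checkpoint."""
    # Imported lazily so the config helper works without transformers installed.
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        model_dir, num_labels=num_labels
    )
```

For the case above, `load_classifier(model_path, 7)` alone may be enough; patching config.json with `set_label_count(model_path, 7)` makes the plain `from_pretrained(model_path)` call from demo.py work without extra arguments.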
