Description
System Info
transformers version: 4.49.0
Platform: Linux-6.6.0-72.0.0.64.oe2403.x86_64-x86_64-with-glibc2.38
Python version: 3.10.16
Huggingface_hub version: 0.29.1
Safetensors version: 0.5.3
Accelerate version: 1.4.0
Accelerate config: not found
DeepSpeed version: not installed
PyTorch version (GPU?): 2.2.2+cu121 (True)
Tensorflow version (GPU?): not installed (NA)
Flax version (CPU?/GPU?/TPU?): not installed (NA)
Jax version: not installed
JaxLib version: not installed
Using distributed or parallel set-up in script?:
Using GPU in script?:
GPU type: NVIDIA L4
Who can help?
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder
- My own task or dataset (give details below)
Reproduction
I load an adapter for Qwen/Qwen2.5-0.5B using the following code and an error occurs:
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer, pipeline
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
peft_model_id = "/home/chenjq/pythonWork/nlp/Qwen2.5-0.5B-SFT-Capybara/checkpoint-31"
# peft_model_id = args.output_dir
tokenizer = AutoTokenizer.from_pretrained(peft_model_id)
# Load Model with PEFT adapter
model = AutoPeftModelForCausalLM.from_pretrained(
peft_model_id,
device_map="auto",
torch_dtype=torch.float16
)
The error info is as follows:
Sliding Window Attention is enabled but not implemented for `sdpa`; unexpected results may be encountered.
Traceback (most recent call last):
File "/home/chenjq/.pycharm_helpers/pydev/pydevd.py", line 1500, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/chenjq/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "/home/chenjq/pythonWork/nlp/test14.py", line 11, in <module>
model = AutoPeftModelForCausalLM.from_pretrained(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/auto.py", line 130, in from_pretrained
return cls._target_peft_class.from_pretrained(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/peft_model.py", line 581, in from_pretrained
load_result = model.load_adapter(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/peft_model.py", line 1239, in load_adapter
load_result = set_peft_model_state_dict(
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/peft/utils/save_and_load.py", line 451, in set_peft_model_state_dict
load_result = model.load_state_dict(peft_model_state_dict, strict=False)
File "/home/chenjq/miniconda3/envs/nlp/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2153, in load_state_dict
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM:
size mismatch for base_model.model.lm_head.modules_to_save.default.weight: copying a param with shape torch.Size([151936, 896]) from checkpoint, the shape in current model is torch.Size([151665, 896]).
Process finished with exit code 1
However, if I use the following code to load the model, everything works fine:
from peft import PeftConfig, PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer
base_model_name ='/home/models/qwen/Qwen2.5-0.5B'
adapter_model_name = "/home/chenjq/pythonWork/nlp/Qwen2.5-0.5B-SFT-Capybara/checkpoint-31"
model = AutoModelForCausalLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(model, adapter_model_name)
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
Some info from here that may help:
Hi everyone! I did some research and found out that the error occurs because len(tokenizer) (151665) and the embedding size (151936) of Qwen/Qwen2.5-0.5B do not match. _BaseAutoPeftModel.from_pretrained resizes the base model embeddings to match the tokenizer (here) and, as a result, it is unable to load the saved weights. I think a possible solution might be to only resize the base model embeddings if the tokenizer size differs from the base tokenizer size. What do you think?
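To make the mismatch easier to see, here is a minimal sketch (assuming the same local base-model path as in the working snippet above; the printed numbers are the ones from the error message) that compares len(tokenizer) with the number of rows in the base checkpoint's embedding matrix:
from transformers import AutoModelForCausalLM, AutoTokenizer

# Illustrative sketch only; base_model_name is the local path used above
base_model_name = '/home/models/qwen/Qwen2.5-0.5B'
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model = AutoModelForCausalLM.from_pretrained(base_model_name)

# Rows of the base model's embedding matrix (same size as lm_head for Qwen2.5-0.5B)
embedding_rows = model.get_input_embeddings().weight.shape[0]
print(len(tokenizer))   # 151665 -> what AutoPeftModelForCausalLM resizes the model to
print(embedding_rows)   # 151936 -> the shape the adapter's saved lm_head expects
Because the adapter's modules_to_save copy of lm_head was saved with 151936 rows, shrinking the base model to len(tokenizer) before loading the adapter produces the size-mismatch error above.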
The adapter was trained using the following code:
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer
from peft import LoraConfig
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
dataset = load_dataset("trl-lib/Capybara", split="train")
dataset = dataset.select(range(500))
MODEL_ID = 'Qwen/Qwen2.5-0.5B'
peft_config = LoraConfig(
r=16,
lora_alpha=32,
lora_dropout=0.05,
target_modules="all-linear",
modules_to_save=["lm_head", "embed_token"],
task_type="CAUSAL_LM",
)
args = SFTConfig(
output_dir="Qwen2.5-0.5B-SFT-Capybara", # directory to save and repository id
num_train_epochs=1, # number of training epochs
per_device_train_batch_size=4, # batch size per device during training
gradient_accumulation_steps=4, # number of steps before performing a backward/update pass
gradient_checkpointing=True, # use gradient checkpointing to save memory
optim="adamw_torch_fused", # use fused adamw optimizer
logging_steps=10, # log every 10 steps
save_strategy="epoch", # save checkpoint every epoch
bf16=True, # use bfloat16 precision
tf32=True, # use tf32 precision
learning_rate=2e-4, # learning rate, based on QLoRA paper
max_grad_norm=0.3, # max gradient norm based on QLoRA paper
warmup_ratio=0.03, # warmup ratio based on QLoRA paper
lr_scheduler_type="constant", # use constant learning rate scheduler
push_to_hub=False, # push model to hub
# report_to="tensorboard", # report metrics to tensorboard
)
trainer = SFTTrainer(
MODEL_ID,
train_dataset=dataset,
args=args,
peft_config=peft_config
)
trainer.train()
print('end')
Expected behavior
I expect the model to load and predict normally.