Fix #2477: Regression accessing modules_to_save (#2481)

githubnemo merged 11 commits into huggingface:main
Commit ed3c828 introduced adapter-local modules_to_save initialization, which prevented needless initialization but also broke prompt tuning methods, as they don't have the `modules_to_save` attribute.

This change also introduces a sequence classification test suite that also covers prompt tuning methods. While not comprehensive, it is sufficient to catch this error and can be extended over time.

While working on this and testing RoBERTa, there was also an issue with the default target of `AdaLoRA`, as it defaults to `dense` (among other modules). This is problematic for `PeftModelForSequenceClassification`, as it marks `classification.*` as `modules_to_save`. But since the classification layer is also a dense layer, it will be targeted by `AdaLoRA`. To prevent such situations in the future, a general exemption was made in `check_target_module_exists` to always skip keys listed in `modules_to_save`. For this to work, the config modification done in `PeftModelForSequenceClassification` needed changing.

There's an open TODO to extend the exemption to all `AuxiliaryTrainingWrapper` classes. I wanted to get feedback for this change first, but do you think that would make sense as well @BenjaminBossan?
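To make the collision concrete, here is a minimal standalone sketch of suffix-style target matching (simplified for illustration; it is not PEFT's actual matching code, and the module keys are just examples):

```python
# Simplified illustration only -- not PEFT's real target matching logic.
def naive_suffix_match(key: str, target_modules: list[str]) -> bool:
    # A module key matches if it equals a target name or ends with ".<target>".
    return any(key == t or key.endswith("." + t) for t in target_modules)

adalora_defaults = ["dense"]  # AdaLoRA's default target list includes "dense"

# An encoder layer we actually want to adapt:
print(naive_suffix_match("roberta.encoder.layer.0.output.dense", adalora_defaults))  # True
# The classification head, which is supposed to stay in modules_to_save,
# is also a dense layer and therefore matches as well:
print(naive_suffix_match("classifier.dense", adalora_defaults))                      # True
```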
BenjaminBossan
left a comment
What a rabbit hole :-/
Generally, I think this direction is good; I added a couple of small comments.
While working on this and testing RoBERTa, there was also an issue with the default target of AdaLoRA, as it defaults to dense (among other modules). This is problematic for PeftModelForSequenceClassification, as it marks classification.* as modules_to_save.
I assume this is tested indirectly through the new tests? If not, let's add a test.
src/peft/utils/other.py (outdated)

    """
    if hasattr(peft_config, "modules_to_save"):
        return peft_config.modules_to_save
    return None
Having this function is fine for me; I just wonder why you chose not to go with `getattr(peft_config, "modules_to_save", None)`.
For some reason I thought it was necessary :) Changed to `getattr`.
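For reference, a minimal sketch of the equivalence (the helper names are made up for illustration):

```python
# Both variants return peft_config.modules_to_save if the attribute exists, else None.
def get_modules_to_save(peft_config):
    if hasattr(peft_config, "modules_to_save"):
        return peft_config.modules_to_save
    return None

def get_modules_to_save_short(peft_config):
    return getattr(peft_config, "modules_to_save", None)
```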
tests/test_seq_classifier.py (outdated)

    import pytest
    import torch
    from transformers import (
        AutoModelForSequenceClassification,
tests/test_seq_classifier.py (outdated)

    class TestSequenceClassificationModels(PeftCommonTester):
        r"""
        Test if the PeftModel behaves as expected. This includes:
This docstring can be adjusted/scrapped, right? Let's maybe mention instead that it is intentional that not the whole test battery is run here.
Yes. WDYT about exempting all aux. training wrappers from being targeted? As it currently stands, that would require manually checking the corresponding config arguments. I noticed that test_modules_to_save_targets_tuner_layer_raises fails now since the layers are silently ignored. Is that check now redundant?
BenjaminBossan
left a comment
WDYT about exempting all aux. training wrappers from being targeted?
👍
I noticed that test_modules_to_save_targets_tuner_layer_raises fails now since the layers are silently ignored. Is that check now redundant?
Yes, the test can be removed or maybe adapted to check that the same layer is not double-wrapped, WDYT?
This code was *probably* for dealing with modules_to_save when calling inject_adapter directly. However, since the only place that does this is the PEFT mixed module, which already deals with modules_to_save, this code is deemed superfluous. This also makes ignoring `modules_to_save` during targeting easier, since we can use the code in `check_target_module_exists` for every case (targeting a nested layer inside a modules_to_save module + directly targeting a modules_to_save module).
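A hedged sketch of the exemption covering both cases mentioned above (hypothetical helper, not the actual `check_target_module_exists` implementation):

```python
def exempted_by_modules_to_save(key, modules_to_save):
    # Returns True if `key` names a modules_to_save module itself or anything nested inside one.
    if not modules_to_save:
        return False
    for m in modules_to_save:
        if key == m or key.endswith("." + m):                  # direct targeting of the saved module
            return True
        if key.startswith(m + ".") or ("." + m + ".") in key:  # layer nested inside the saved module
            return True
    return False

print(exempted_by_modules_to_save("classifier", ["classifier"]))             # True (direct)
print(exempted_by_modules_to_save("classifier.dense", ["classifier"]))       # True (nested)
print(exempted_by_modules_to_save("encoder.layer.0.dense", ["classifier"]))  # False
```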
Otherwise the model's classification head will be re-initialized regularly, breaking assumptions
;_;
Force-pushed from 67db9ba to f903d78
This change is a breakout of the changes in PR huggingface#2481 to form a patch release. There are no additional tests.
Move `set_additional_trainable_modules` to `inject_adapter` in the case of adapters such as LoRA, or, in the case of prompt tuning adapters, to their respective initialization point (while keeping the order of operations intact).

Before this change, a significant portion of `modules_to_save` initialization was removed from `check_target_layer_exists` (called from `inject_adapter`), which only handled the `modules_to_save` parameter in cases where this function was called directly (e.g., via `LoraModel.add_weighted_adapter`). This also meant that the trainable tokens feature was completely ignored in these cases, and it duplicated code from `_set_trainable`. That removal prompted the need for a replacement, which is this change: on adapter injection we now always check whether additional trainable modules are needed, not only during `PeftModel` init.
This change is a breakout of the changes in PR #2481 to form a patch release. There are no additional tests, but the HF_HUB_OFFLINE feature was merged to improve the CI experience.

* Testing common uses situational HF_HUB_OFFLINE (#2490)

Employ offline mode when the model was already accessed once from the Hub in order to speed up the CI and make the process less prone to rate limiting. The idea is that we can mark contexts such that, once they have been visited for a specific model id, we can assume the artifacts are cached locally and set HF_HUB_OFFLINE=1 for that context. This PR tests this concept for testing_common, which already covers a big chunk of the tests and probably has the biggest gain given the amount of change.

We already saw that the assumption does not always hold: for the prompt tuning tests (_test_prepare_input_for_generation) there is a case where the tokenizer is not used for model X the first time but is used the second time; since the hub is set to offline on the second visit, the tokenizer from_pretrained call will fail. This problem is alleviated by adding the tokenizer name to the model id as the cache identifier.

(cherry picked from commit 1083964)
(Removed delete adapter tests)
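A minimal sketch of the caching idea (standalone; the helper name and cache-id scheme are assumptions, not necessarily what the merged test utilities look like):

```python
import os
from contextlib import contextmanager

_seen_cache_ids = set()

@contextmanager
def hub_offline_once(cache_id):
    """The first time `cache_id` is seen, talk to the Hub normally; on later
    visits assume everything is cached locally and set HF_HUB_OFFLINE=1."""
    previous = os.environ.get("HF_HUB_OFFLINE")
    if cache_id in _seen_cache_ids:
        os.environ["HF_HUB_OFFLINE"] = "1"
    try:
        yield
    finally:
        _seen_cache_ids.add(cache_id)
        if previous is None:
            os.environ.pop("HF_HUB_OFFLINE", None)
        else:
            os.environ["HF_HUB_OFFLINE"] = previous

# The tokenizer issue described above is why the cache id should include the
# tokenizer name, not just the model id:
with hub_offline_once("model-x+tokenizer-y"):
    pass  # first visit: online, downloads get cached
with hub_offline_once("model-x+tokenizer-y"):
    pass  # second visit: HF_HUB_OFFLINE=1 inside this block
```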
I addressed the comments and also fixed the failing tests with respect to the code removal in `inject_adapter`.
BenjaminBossan
left a comment
The PR LGTM, thanks, nice work.
One small thing: It would be good to have tests for the transformers and diffusers integrations when using modules_to_save (and possibly trainable_token_indices?) to check requires_grad. I would be fine with adding them later, in which case the PR can be merged.
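A hedged sketch of what such a requires_grad test could look like for the transformers integration (the tiny checkpoint id below is an assumption and requires Hub access; any small sequence classification model with a `classifier` head would do):

```python
# Hedged sketch, not an actual PEFT test; model id and layer names are assumptions.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "hf-internal-testing/tiny-random-RobertaForSequenceClassification"  # assumed tiny checkpoint
)
config = LoraConfig(task_type="SEQ_CLS", target_modules=["query"], modules_to_save=["classifier"])
peft_model = get_peft_model(model, config)

# The modules_to_save copy of the classification head should be trainable ...
assert any(
    p.requires_grad for n, p in peft_model.named_parameters() if "classifier.modules_to_save" in n
)
# ... while untouched base weights such as the word embeddings stay frozen.
assert not any(
    p.requires_grad for n, p in peft_model.named_parameters() if "word_embeddings" in n
)
```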
OK, let's do the tests in a separate PR. Thanks for the review :)
PR huggingface#2481 added sequence classification tests to PEFT. The test matrix included CPT. However, CPT only supports the task type CAUSAL_LM. These tests still passed but now started failing with:

> AttributeError: object has no attribute 'prepare_inputs_for_generation'

This is probably a change in transformers, but since causal LM was never meant to work, the actual fix is to remove CPT from the seq cls test matrix. Since CPT automatically changes the task type to CAUSAL_LM, this mistake can be hard to spot. Therefore, this PR also adds a warning if users pass the wrong task type.
PR #2481 added sequence classification tests to PEFT. The test matrix included CPT. However, CPT only supports the task type CAUSAL_LM. These tests still passed but now started failing with:

> AttributeError: object has no attribute 'prepare_inputs_for_generation'

This is probably a change in transformers, but since causal LM was never meant to work, the actual fix is to remove CPT from the seq cls test matrix. Since CPT automatically changes the task type to CAUSAL_LM, this mistake can be hard to spot. Therefore, this PR also adds a warning if users pass the wrong task type. In the future, this will raise an error.
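A minimal standalone sketch of the warn-then-override behaviour described here (toy config class, not PEFT's actual CPTConfig):

```python
import warnings
from dataclasses import dataclass

@dataclass
class ToyCPTConfig:
    task_type: str = "CAUSAL_LM"

    def __post_init__(self):
        # Warn (and correct) when an unsupported task type is passed.
        if self.task_type != "CAUSAL_LM":
            warnings.warn(
                f"This method only supports task_type='CAUSAL_LM', got '{self.task_type}'; "
                "overriding it. In the future this will raise an error."
            )
            self.task_type = "CAUSAL_LM"

config = ToyCPTConfig(task_type="SEQ_CLS")  # emits the warning, task_type becomes CAUSAL_LM
print(config.task_type)
```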
…face#2481)

* Fix huggingface#2477: Regression accessing `modules_to_save`

Commit 501df99 introduced adapter-local modules_to_save initialization, which prevented needless initialization but also broke prompt tuning methods, as they don't have the `modules_to_save` attribute. This change also introduces a sequence classification test suite that also covers prompt tuning methods. While not comprehensive, it is sufficient to catch this error and can be extended over time. While working on this and testing RoBERTa, there was also an issue with the default target of `AdaLoRA`, as it defaults to `dense` (among other modules). This is problematic for `PeftModelForSequenceClassification`, as it marks `classification.*` as `modules_to_save`. But since the classification layer is also a dense layer, it will be targeted by `AdaLoRA`. To prevent such situations in the future, a general exemption was made in `check_target_module_exists` to always avoid keys in `modules_to_save`. For this to work, the config modification done in `PeftModelForSequenceClassification` needed changing.

* Remove presumably superfluous code from inject_adapter

This code was *probably* for dealing with modules_to_save when calling inject_adapter directly. However, since the only place that does this is the PEFT mixed module, which already deals with modules_to_save, this code is deemed superfluous. This also makes ignoring `modules_to_save` during targeting easier, since we can use the code in `check_target_module_exists` for every case (targeting a nested layer inside a modules_to_save module + directly targeting a modules_to_save module).

* Move `set_additional_trainable_modules`

Move `set_additional_trainable_modules` to `inject_adapter` in the case of adapters such as LoRA, or, in the case of prompt tuning adapters, to their respective initialization point (while keeping the order of operations intact). Before this change, a significant portion of `modules_to_save` initialization was removed from `check_target_layer_exists` (called from `inject_adapter`), which only handled the `modules_to_save` parameter in cases where this function was called directly (e.g., via `LoraModel.add_weighted_adapter`). This also meant that the trainable tokens feature was completely ignored in these cases, and it duplicated code from `_set_trainable`. That removal prompted the need for a replacement, which is this change: on adapter injection we now always check whether additional trainable modules are needed, not only during `PeftModel` init.
…ce#2507)

PR huggingface#2481 added sequence classification tests to PEFT. The test matrix included CPT. However, CPT only supports the task type CAUSAL_LM. These tests still passed but now started failing with:

> AttributeError: object has no attribute 'prepare_inputs_for_generation'

This is probably a change in transformers, but since causal LM was never meant to work, the actual fix is to remove CPT from the seq cls test matrix. Since CPT automatically changes the task type to CAUSAL_LM, this mistake can be hard to spot. Therefore, this PR also adds a warning if users pass the wrong task type. In the future, this will raise an error.
When using prompt learning methods, modules_to_save was not correctly set automatically. This is really bad when using, for instance, sequence classification tasks, which require the classifier layer to be added to modules_to_save. The issue was introduced in huggingface#2220, where it is wrongly assumed that the PEFT config always has a modules_to_save attribute, which is not true for prompt learning. In huggingface#2481, this was partly fixed by using getattr to avoid an error. However, this did not resolve the fundamental issue that for prompt learning, there is no such attribute, resulting in modules_to_save not being applied. This PR proposes to fix this by adding modules_to_save to the prompt learning configs.
When using prompt learning methods, modules_to_save was not correctly set automatically. This is really bad when using, for instance, sequence classification tasks, which require the classifier layer to be added to modules_to_save. The issue was introduced in #2220, where it is wrongly assumed that the PEFT config always has a modules_to_save attribute, which is not true for prompt learning. In #2481, this was partly fixed by using getattr to avoid an error. However, this did not resolve the fundamental issue that for prompt learning, there is no such attribute, resulting in modules_to_save not being applied. This PR proposes to fix this by adding modules_to_save to the prompt learning configs.
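A standalone sketch of what the fix amounts to (toy config classes with made-up fields, not PEFT's real PromptLearningConfig):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ToyPromptLearningConfigOld:
    num_virtual_tokens: int = 20
    # No modules_to_save field -> getattr(cfg, "modules_to_save", None) is always None,
    # so the classifier head is never registered as an extra trainable module.

@dataclass
class ToyPromptLearningConfigNew:
    num_virtual_tokens: int = 20
    modules_to_save: Optional[list] = None  # can now carry e.g. ["classifier"]

old = ToyPromptLearningConfigOld()
new = ToyPromptLearningConfigNew(modules_to_save=["classifier"])
print(getattr(old, "modules_to_save", None))  # None -> head silently not trained
print(getattr(new, "modules_to_save", None))  # ['classifier'] -> head wrapped and trained
```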