Official implementation of Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA (CoDyRA).
Haodong Lu, Chongyang Zhao, Jason Xue, Lina Yao, Kristen Moore, Dong Gong
Continual learning (CL) aims to accumulate knowledge from sequential tasks without catastrophic forgetting. Vision-language pre-trained models (PTMs) such as CLIP, with strong generalization ability, are widely used for CL. Existing methods often adapt isolated PTM components, adding inference complexity and limiting improvement of the PTM itself, or rely on replay, stored task information, or restrictive assumptions, incurring high costs and limited applicability. To advance models as continual learners, we explore CL via natural, efficient PTM updates instead of complex task-specific additions.
We thus study continual low-rank learning and systematically analyze how LoRA ranks and placements affect the balance between adapting to new tasks and retaining pre-trained knowledge.
Motivated by this analysis, we propose Continual Dynamic Rank-Selective LoRA (CoDyRA), which continually updates the PTM with LoRA adapters whose ranks are adaptively optimized. While the new-task objective drives learning, CoDyRA adaptively minimizes the LoRA ranks, suppressing updates that are unnecessary for the new task and thereby preserving pre-trained knowledge.
- Takeaway 1: Instead of manual or fixed LoRA placement at specific parameters of the PTM as in previous methods, we apply LoRA to all weights and optimize for an adaptive configuration.
- Takeaway 2: A $\color{purple}{\text{plasticity}}$–$\color{green}{\text{stability}}$ balance exists and is associated with LoRA rank ($\color{purple}{\text{high}}$ vs. $\color{green}{\text{low}}$), which can be adaptively achieved by jointly $\color{purple}{\text{optimizing the task objective}}$ and $\color{green}{\text{minimizing LoRA ranks}}$.
- Takeaway 3: The $\color{purple}{\text{learning}}$–$\color{green}{\text{forgetting}}$ balance point tied to rank varies across modules and tasks, necessitating adaptive optimization.
- (See more details in the paper.)
We propose dynamic rank-selective LoRA, enabling each pre-trained weight matrix to adaptively add only the ranks necessary for downstream adaptation while retaining pre-trained capabilities. After each task, the dynamic rank updates are merged into the pre-trained weights, introducing no inference overhead.
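The idea above can be sketched in a few lines of PyTorch. This is a hypothetical simplification, not the official implementation: the class name, the per-rank gating scheme, and the L1 penalty are illustrative assumptions standing in for the paper's actual rank-selection mechanism.

```python
import torch
import torch.nn as nn

class RankSelectiveLoRA(nn.Module):
    """Illustrative LoRA layer with learnable per-rank gates (a sketch)."""

    def __init__(self, linear: nn.Linear, max_rank: int = 8):
        super().__init__()
        self.linear = linear                              # pre-trained weight, frozen
        self.linear.weight.requires_grad_(False)
        d_out, d_in = linear.weight.shape
        self.A = nn.Parameter(torch.randn(max_rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, max_rank))
        self.gate = nn.Parameter(torch.ones(max_rank))    # one scale per rank component

    def forward(self, x):
        delta = (self.B * self.gate) @ self.A             # B diag(g) A, shape (d_out, d_in)
        return self.linear(x) + x @ delta.T

    def rank_penalty(self):
        # Sparsity penalty on the gates: drives unneeded ranks toward zero
        return self.gate.abs().sum()

    @torch.no_grad()
    def merge(self):
        # Fold the surviving low-rank update into the pre-trained weight,
        # so inference uses a plain nn.Linear with no adapter overhead.
        self.linear.weight += (self.B * self.gate) @ self.A
        self.gate.zero_()
```

During training on each task, the total loss would combine the task objective with `lambda * layer.rank_penalty()`; after the task, calling `merge()` absorbs the update and resets the adapter, matching the "no inference overhead" property described above.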
## Installation

```bash
conda create -n codyra python=3.12 -y
conda activate codyra

# Install PyTorch that matches your CUDA setup, e.g.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Install project dependencies
pip install -r requirements.txt
```

## Datasets

- Set `--data_dir` to the root directory that should hold all benchmarks (Aircraft, Caltech101, DTD, EuroSAT, Oxford Flowers, Food-101, MNIST, Oxford Pets, Stanford Cars, SUN397).
- Please refer to the dataset setup guide from CoOp.
## Run

```bash
bash runner_codyra.sh
```
## License

CoDyRA is released under the Apache License 2.0. See LICENSE for details.
## Citation

```bibtex
@article{lu2024adaptive,
  title   = {Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA},
  author  = {Lu, Haodong and Zhao, Chongyang and Xue, Jason and Yao, Lina and Moore, Kristen and Gong, Dong},
  journal = {arXiv preprint arXiv:2412.01004},
  year    = {2024}
}
```

## Acknowledgements

Our repo benefits from MoE-Adapters and RAIL. We thank them for their wonderful works.
