Official implementation of Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA (CoDyRA).
Haodong Lu, Chongyang Zhao, Jason Xue, Lina Yao, Kristen Moore, Dong Gong
Continual learning (CL) aims to accumulate knowledge from sequential tasks without catastrophic forgetting. Vision-language pre-trained models (PTMs) such as CLIP, with strong generalization ability, are widely used for CL. Existing methods often adapt isolated PTM components, adding inference complexity and limiting improvement of the PTM itself, or rely on replay, stored task information, or restrictive assumptions, incurring high costs and limited applicability. To advance models as continual learners, we explore CL via natural, efficient PTM updates instead of complex task-specific additions.
We thus study continual low-rank learning and systematically analyze how LoRA ranks and placements affect the balance between adapting to new tasks and retaining pre-trained knowledge.
Motivated by this analysis, we propose Continual Dynamic Rank-Selective LoRA (CoDyRA), which continually updates the PTM with LoRA adapters whose ranks are adaptively optimized. While the new-task objective drives learning, CoDyRA adaptively minimizes the LoRA ranks, suppressing updates that are unnecessary for the new task and thereby preserving pre-trained knowledge.
- Takeaway 1: Instead of manual or fixed LoRA placement at specific parameters of the PTM as in previous methods, we apply LoRA to all weights and optimize for an adaptive configuration.
- Takeaway 2: A $\color{purple}{\text{plasticity}}$–$\color{green}{\text{stability}}$ balance exists and is associated with LoRA rank ($\color{purple}{\text{high}}$ vs. $\color{green}{\text{low}}$), which can be adaptively achieved by jointly $\color{purple}{\text{optimizing the task objective}}$ and $\color{green}{\text{minimizing LoRA ranks}}$.
- Takeaway 3: The $\color{purple}{\text{learning}}$–$\color{green}{\text{forgetting}}$ balance point tied to rank varies across modules and tasks, necessitating adaptive optimization.
- (See more details in the paper.)
We propose dynamic rank-selective LoRA, enabling each pre-trained weight matrix to adaptively add only the ranks necessary for downstream adaptation while retaining pre-trained capabilities. After each task, the dynamic rank updates are merged into the pre-trained weights, introducing no inference overhead.
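The idea above can be sketched in a few lines of PyTorch. This is a hypothetical simplification, not the official implementation: the class name, the per-rank gating scheme, and the L1 penalty are illustrative assumptions standing in for the paper's actual rank-selection mechanism.

```python
import torch
import torch.nn as nn

class RankSelectiveLoRA(nn.Module):
    """Illustrative LoRA layer with learnable per-rank gates (a sketch)."""

    def __init__(self, linear: nn.Linear, max_rank: int = 8):
        super().__init__()
        self.linear = linear                              # pre-trained weight, frozen
        self.linear.weight.requires_grad_(False)
        d_out, d_in = linear.weight.shape
        self.A = nn.Parameter(torch.randn(max_rank, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, max_rank))
        self.gate = nn.Parameter(torch.ones(max_rank))    # one scale per rank component

    def forward(self, x):
        delta = (self.B * self.gate) @ self.A             # B diag(g) A, shape (d_out, d_in)
        return self.linear(x) + x @ delta.T

    def rank_penalty(self):
        # Sparsity penalty on the gates: drives unneeded ranks toward zero
        return self.gate.abs().sum()

    @torch.no_grad()
    def merge(self):
        # Fold the surviving low-rank update into the pre-trained weight,
        # so inference uses a plain nn.Linear with no adapter overhead.
        self.linear.weight += (self.B * self.gate) @ self.A
        self.gate.zero_()
```

During training on each task, the total loss would combine the task objective with `lambda * layer.rank_penalty()`; after the task, calling `merge()` absorbs the update and resets the adapter, matching the "no inference overhead" property described above.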
## Installation

```bash
conda create -n codyra python=3.12 -y
conda activate codyra

# Install PyTorch that matches your CUDA setup, e.g.
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124

# Install project dependencies
pip install -r requirements.txt
```

## Datasets

- Set `--data_dir` to the root directory that should hold all benchmarks (Aircraft, Caltech101, DTD, EuroSAT, Oxford Flowers, Food-101, MNIST, Oxford Pets, Stanford Cars, SUN397).
- Please refer to the dataset setup guide from CoOp.
## Run

```bash
bash runner_codyra.sh
```
## License

CoDyRA is released under the Apache License 2.0. See LICENSE for details.
## Citation

```bibtex
@article{lu2024adaptive,
  title   = {Adaptive Rank, Reduced Forgetting: Knowledge Retention in Continual Learning Vision-Language Models with Dynamic Rank-Selective LoRA},
  author  = {Lu, Haodong and Zhao, Chongyang and Xue, Jason and Yao, Lina and Moore, Kristen and Gong, Dong},
  journal = {arXiv preprint arXiv:2412.01004},
  year    = {2024}
}
```

## Acknowledgements

Our repo benefits from MoE-Adapters and RAIL. We thank them for their wonderful works.
