RUC & Xiaomi: Efficient Fine-Tuning 🙌🎉

📰 News

  • 2025-4-29: Our paper has been accepted by IJCAI-25. Congratulations!
  • 2025-3-31: Delivery of a Prototype System for Parameter-Efficient and Gradient Projection Methods: A Comprehensive Benchmark Against 10+ State-of-the-Art Efficient Fine-Tuning Approaches.
  • 2024-12-30: Theoretical Insights into Fine-Tuning Attention Mechanism.

🎯 Introduction and Target

(1) Our insights (paper, in progress):

According to the classical statistical learning viewpoint, performance can be decomposed into the sum of optimization error and generalization error. On the generalization (storage-friendly) side, we give Theorem 1 (information-theoretic generalization bounds), showing that at the same rank $r$, fine-tuning $\mathbf{W}_q,\mathbf{W}_v$ consistently achieves results comparable to, or even surpassing, fine-tuning $\mathbf{W}_q,\mathbf{W}_k,\mathbf{W}_v$. This reduces the number of trainable parameters at the same $r$, while improving the generalization bound and potentially saving memory. On the optimization (time-friendly) side, we study the learning dynamics of fine-tuning the attention mechanism, and Theorem 2 shows that feature learning in the attention mechanism is efficient when the learning rate for $\mathbf{W}_v$ is set much larger than that for $\mathbf{W}_q,\mathbf{W}_k$. Building on these experimental and theoretical insights, one can develop new algorithms that improve the efficiency (e.g., storage and time) of fine-tuning.

*(Figure: statement of Theorem 1, information-theoretic generalization bounds)*
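To make the storage side of Theorem 1 concrete, the sketch below counts LoRA trainable parameters per attention layer when adapting $\mathbf{W}_q,\mathbf{W}_v$ versus $\mathbf{W}_q,\mathbf{W}_k,\mathbf{W}_v$ at the same rank. The hidden size and rank are illustrative assumptions, not values fixed by the paper.

```python
# Sketch: LoRA parameter counts per attention layer at the same rank r.
# The hidden size d and rank r below are illustrative assumptions.

def lora_params_per_matrix(d_in: int, d_out: int, r: int) -> int:
    """A rank-r LoRA adapter on a (d_out x d_in) weight adds B (d_out x r)
    and A (r x d_in), i.e. r * (d_in + d_out) trainable parameters."""
    return r * (d_in + d_out)

d, r = 768, 8  # e.g. RoBERTa-base hidden size and a common LoRA rank

qv_params = 2 * lora_params_per_matrix(d, d, r)   # adapt W_q, W_v only
qkv_params = 3 * lora_params_per_matrix(d, d, r)  # adapt W_q, W_k, W_v

print(qv_params, qkv_params)  # 24576 36864
print(1 - qv_params / qkv_params)  # ~0.333: one third fewer parameters
```

At equal rank, dropping $\mathbf{W}_k$ removes one third of the adapter parameters, which is where the storage benefit in Theorem 1 comes from.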

*(Figure: statement of Theorem 2, learning dynamics of fine-tuning the attention mechanism)*
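Theorem 2 suggests assigning $\mathbf{W}_v$ a much larger learning rate than $\mathbf{W}_q,\mathbf{W}_k$. A minimal, framework-agnostic sketch of this idea builds optimizer parameter groups that could be passed to, e.g., `torch.optim.AdamW`; the module-name patterns (`q_proj`/`k_proj`/`v_proj`, following common HuggingFace naming) and the learning-rate ratio are assumptions for illustration.

```python
# Sketch: split attention parameters into optimizer groups so that W_v
# gets a larger learning rate than W_q / W_k, as suggested by Theorem 2.
# Name patterns and the lr ratio are illustrative assumptions.

def build_param_groups(named_params, base_lr=1e-4, v_lr_scale=10.0):
    """Return optimizer param groups: v_proj parameters at
    base_lr * v_lr_scale, everything else at base_lr."""
    qk_group = {"params": [], "lr": base_lr}
    v_group = {"params": [], "lr": base_lr * v_lr_scale}
    for name, param in named_params:
        (v_group if "v_proj" in name else qk_group)["params"].append(param)
    return [qk_group, v_group]

# Toy usage with placeholder objects standing in for parameter tensors:
fake_params = [
    ("layer.0.attn.q_proj.weight", object()),
    ("layer.0.attn.k_proj.weight", object()),
    ("layer.0.attn.v_proj.weight", object()),
]
groups = build_param_groups(fake_params, base_lr=1e-4, v_lr_scale=10.0)
print(len(groups[0]["params"]), len(groups[1]["params"]))  # 2 1
```

In a real run, `named_params` would be `model.named_parameters()` and the resulting list would be handed directly to the optimizer constructor, which accepts per-group `lr` values.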

(2) Target:

This project conducts comprehensive benchmarking of the following 10+ efficient fine-tuning methods.

Notably, our proposed approach is orthogonal to these methods and can be combined with any of them.

📖 10+ efficient fine-tuning methods

⚙️ Install

  1. Install the dependencies via pip:

    pip install -r requirements.txt
  2. (Optional) For SIFT & GaLore:

    git clone git@github.com:song-wx/SIFT.git
    cd SIFT
    pip install .
    pip install galore-torch

🚀 Quick Start

Get Dataset

Run the download script:

    python data_download.py

Usage

  1. Ensure the scripts have execute permissions:

    chmod +x xxx.sh  # xxx -> your script name
  2. Full Fine-Tuning, LoRA, AdaLoRA, DoRA, PiSSA, rsLoRA, OLoRA, EVA, SIFT:

    # choose the target method name and modules
    EfficientFT/sh/roberta-base-peft.sh
    EfficientFT/sh/llama-peft.sh
  3. GaLore:

    EfficientFT/sh/roberta_galore.sh

😊 Some Results

*(Figure: benchmark results)*

📝 Citation

@article{yao2024theoretical,
  title={Theoretical Insights into Fine-Tuning Attention Mechanism: Generalization and Optimization},
  author={Yao, Xinhao and Qian, Hongjin and Hu, Xiaolin and Xu, Gengze and Liu, Yong and Liu, Wei and Luan, Jian and Wang, Bin},
  journal={arXiv preprint arXiv:2410.02247},
  year={2024}
}