AudioCIL: A Python Toolbox for Audio Class-Incremental Learning with Multiple Scenes

Welcome to AudioCIL, perhaps the toolbox for audio class-incremental learning with the most implemented methods. This is the code repository for “AudioCIL: A Python Toolbox for Audio Class-Incremental Learning with Multiple Scenes” [paper], implemented in PyTorch. If you use any content of this repo in your work, please cite the following bib entry:
@article{xu2024AudioCIL,
  title={AudioCIL: A Python Toolbox for Audio Class-Incremental Learning with Multiple Scenes},
  author={Xu, Qisheng and Sun, Yulin and Su, Yi and Zhu, Qian and Tan, Xiaoyi and Wen, Hongyu and Gao, Zijian and Xu, Kele and Dou, Yong and Feng, Dawei},
  journal={arXiv preprint arXiv:2412.11907},
  year={2024}
}
Introduction
Deep learning, with its robust automatic feature extraction capabilities, has demonstrated significant success in audio signal processing. Typically, these methods rely on static, pre-collected large-scale datasets for training and perform well on a fixed number of classes. However, the real world is characterized by constant change: new audio classes keep emerging from streaming sources, or data is only temporarily available due to privacy constraints. This dynamic nature of audio environments calls for models that can incrementally learn new classes without discarding existing knowledge. Introducing incremental learning to audio signal processing, i.e., Audio Class-Incremental Learning (AuCIL), is therefore a meaningful endeavor. We propose AudioCIL, a toolbox that aligns audio signal processing algorithms with real-world scenarios and strengthens research on audio class-incremental learning. The toolbox is written in Python, the language most widely adopted within the research community; it includes mainstream CIL methods and is open source under the MIT license.
About hyper-parameters
Users can customize AudioCIL by adjusting global parameters and algorithm-specific hyperparameters before executing the main function.
Key global parameters include:
memory-size: Specifies the capacity of the replay buffer used in the incremental learning process.
init-cls: Determines the number of classes in the initial incremental stage.
increment: Specifies the number of classes added at each incremental stage $i$, $i \geq 1$.
convnet-type: Selects the backbone network for the incremental model.
seed: Establishes the random seed for shuffling class orders, with a default value of 1993.
isfew-shot: Specifies whether the task scenario involves a few-shot learning setting.
kshot: Defines the number of samples per category in the few-shot learning scenario.
Other parameters can also be modified in the corresponding Python file.
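For intuition, the interaction of seed, init-cls, and increment can be sketched as follows. This is a minimal illustration of how class-incremental splits are typically built, not AudioCIL's actual implementation:

```python
import random

def build_task_splits(num_classes, init_cls, increment, seed=1993):
    """Shuffle the class order with a fixed seed, then carve it into an
    initial stage of `init_cls` classes plus stages of `increment` classes."""
    order = list(range(num_classes))
    random.Random(seed).shuffle(order)  # reproducible class order
    splits = [order[:init_cls]]
    for start in range(init_cls, num_classes, increment):
        splits.append(order[start:start + increment])
    return splits

# Example: a 100-class dataset, 40 initial classes, then 10 per stage.
tasks = build_task_splits(100, init_cls=40, increment=10)
print(len(tasks))      # 7 stages in total
print(len(tasks[0]))   # 40
print(len(tasks[-1]))  # 10
```

Fixing the seed (default 1993 above, matching the toolbox default) makes the class order, and hence every reported incremental curve, reproducible across runs.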
Methods Reproduced
In AudioCIL, we have implemented a total of 16 classic and 3 state-of-the-art algorithms for incremental learning.
FineTune: Updates model with new task data, prone to catastrophic forgetting.
Replay: Updates model with a mix of new data and samples from a replay buffer.
EWC: Overcoming Catastrophic Forgetting in Neural Networks. PNAS 2017 [paper]
LwF: Learning without Forgetting. ECCV 2016 [paper]
iCaRL: Incremental Classifier and Representation Learning. CVPR 2017 [paper]
GEM: Gradient Episodic Memory for Continual Learning. NIPS 2017 [paper]
BiC: Large Scale Incremental Learning. CVPR 2019 [paper]
WA: Maintaining Discrimination and Fairness in Class Incremental Learning. CVPR 2020 [paper]
POD-Net: Pooled Outputs Distillation for Small-Tasks Incremental Learning. ECCV 2020 [paper]
DER: Dynamically Expandable Representation for Class Incremental Learning. CVPR 2021 [paper]
Coil: Co-Transport for Class-Incremental Learning. ACM MM 2021 [paper]
ACIL: Analytic Class-Incremental Learning with Absolute Memorization and Privacy Protection. NeurIPS 2022 [paper]
META-SC: Few-shot Class-incremental Audio Classification Using Stochastic Classifier. INTERSPEECH 2023 [paper]
PAN: Few-shot Class-incremental Audio Classification Using Dynamically Expanded Classifier with Self-attention Modified Prototypes. IEEE TMM [paper]
AMFO: Few-Shot Class-Incremental Audio Classification With Adaptive Mitigation of Forgetting and Overfitting. [paper]

Reproduced Results
LS-100
How To Use
Clone
Clone this GitHub repository:
Dependencies
Run experiment
Edit the ./exps-audio/[MODEL NAME].json file for global settings.
Edit the hyperparameters in the corresponding ./models/[MODEL NAME].py file (e.g., models/acil.py).

where [MODEL NAME] should be chosen from acil, beef, coil, der, ds-al, ewc, fetril, finetune, foster, gem, etc.
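To make the json settings concrete, here is a hypothetical example of what a config in ./exps-audio/ might look like, using the global parameters documented above. The key names and values are illustrative assumptions (e.g., underscores instead of the hyphens used on the command line); check the config files shipped with the repo for the exact schema:

```json
{
    "dataset": "LS100",
    "model_name": "finetune",
    "convnet_type": "resnet32",
    "memory_size": 2000,
    "init_cls": 40,
    "increment": 10,
    "seed": [1993],
    "isfew_shot": false,
    "kshot": 5
}
```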
Datasets
We have implemented the pre-processing of LS100, NSynth-100, etc. When training on LS100, this framework will automatically download it. When training on other datasets, you should specify the folder of your dataset in utils/data.py. Here is the file list of LS100.
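The exact edit depends on how utils/data.py is organized, but pointing the toolbox at a local dataset typically amounts to something like the following. The class and attribute names here are hypothetical, for illustration only; the real dataset classes live in utils/data.py and may differ:

```python
import os

# Hypothetical dataset entry -- mirror whatever pattern the existing
# dataset classes in utils/data.py use.
class iMyAudioData:
    # Root folders of your locally prepared dataset:
    train_dir = "/data/my_audio_dataset/train"
    test_dir = "/data/my_audio_dataset/test"

    def check_paths(self):
        """Fail early if the configured folders do not exist."""
        for d in (self.train_dir, self.test_dir):
            if not os.path.isdir(d):
                raise FileNotFoundError(f"Dataset folder not found: {d}")
```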
License
Please check the MIT license that is listed in this repository.
Acknowledgments
We thank the following repos for providing helpful components/functions used in our work.
Contact
If there are any questions, please feel free to propose new features by opening an issue, or contact the authors: Kele Xu (xukelele@163.com), Qisheng Xu (qishengxu@nudt.edu.cn), Yulin Sun (sunyulin_edu@163.com), Yi Su (email_suyi@163.com), Qian Zhu (zhuqian@nudt.edu.cn), Xiaoyi Tan (350869445@qq.com) and Hongyu Wen (wen1223414499@gmail.com).