Dynamic Neural Network for Incremental Learning with Task Extended: Research Progress and Prospect

ZHAO Hai-yan; MA Quan-yi; CAO Jian; CHEN Qing-kui

doi:10.12263/DZXB.20221226


SURVEYS AND REVIEWS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 51, Issue 6, Pages: 1710-1724 (2023)
- Author affiliations:
  1. School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093
  2. Department of Computer Science and Technology, Shanghai Jiao Tong University, Shanghai 200030
- Funding: National Natural Science Foundation of China (62072301); The Program of Technology Innovation of the Science and Technology Commission of Shanghai Municipality (21511104700)
- DOI: 10.12263/DZXB.20221226
  CLC: TP301.6
- Received: 28 October 2022, Revised: 14 March 2023, Published: 25 June 2023

ZHAO Hai-yan, MA Quan-yi, CAO Jian, et al. Dynamic Neural Network for Incremental Learning with Task Extended: Research Progress and Prospect[J]. ACTA ELECTRONICA SINICA, 2023, 51(06): 1710-1724. DOI: 10.12263/DZXB.20221226.

Abstract

Incremental learning has been an important research direction in machine learning in recent years; it can transfer knowledge efficiently without forgetting. Compared with static models, dynamic networks can adjust their structure or parameters according to different inputs, which gives them significant advantages in accuracy, computational efficiency, and adaptability. From the perspective of dynamic architecture, this paper systematically summarizes the dynamic neural networks used in current incremental learning models, organized by the adaptive selection method of the dynamic network. First, it describes the research progress and definition of incremental learning and summarizes its learning scenarios. Then, according to the granularity of dynamic routing selection, it divides the dynamic neural networks of incremental learning into task-based, module-based, neuron-based, convolution channel-based, and weight-based dynamic selection, and describes and compares the commonly used incremental learning models in each category. Finally, it summarizes some common datasets and discusses prospects for future research directions.

