Acta Electronica Sinica, 2020, Vol. 48, Issue (8): 1580-1586. DOI: 10.3969/j.issn.0372-2112.2020.08.017

• Academic Paper •

Research on Family Activity Recognition Method Based on Additive Margin Capsule Network

ZHENG Qi-hang1, WANG Zhang-quan1,2, LIU Ban-teng1,2, CHEN Yang1, CHEN You-rong2

  1. School of Information Science and Engineering, Changzhou University, Changzhou, Jiangsu 213164, China;
  2. College of Information Science and Technology, Zhejiang Shuren University, Hangzhou, Zhejiang 310015, China
  • Received: 2019-07-05 Revised: 2020-05-18 Online: 2020-08-25 Published: 2020-08-25
  • Corresponding author: LIU Ban-teng
  • About the author: ZHENG Qi-hang, male, born in 1996 in Shaoxing, Zhejiang; master's student. His main research interests are audio classification and semi-supervised learning. E-mail: 18000632@smail.cczu.edu.cn
  • Supported by: Public Welfare Technology Research Program of Zhejiang Province (No. LGG19F010011); Special Fund of the Fundamental Research Funds for Provincial Universities, Zhejiang Shuren University (No. 2020XZ009)

Abstract: This paper studies audio-based family activity recognition and proposes a recognition model based on an additive margin capsule neural network. The objective function of the conventional capsule neural network constrains only the length (norm) of the output capsules. To address this drawback, the paper takes a geometric perspective and adds a Transition layer to the capsule network structure; the Transition layer changes the basis of the spatial relationships among capsule units, mapping them to a one-dimensional space. Additive margin Softmax is then used as the objective function: with small intra-class feature variation and large inter-class feature difference as the optimization strategy, an objective function based on the spatial relationships of the capsule vectors is constructed to improve the model's classification ability. Finally, the method is evaluated experimentally by using audio events to classify and recognize family activities. The Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Challenge Task 5 dataset is used to build and test the classifier. The final average F1 score reaches 92.3%, which is better than other mainstream methods.

Key words: classification of acoustic events, family activity recognition, capsule network, additive margin softmax

CLC number:
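
The abstract names additive margin Softmax as the objective function but does not give its form on this page. As a rough illustration only, the sketch below implements the generic additive margin Softmax loss in NumPy: features and class weights are L2-normalized so their dot product is a cosine, a margin `m` is subtracted from the target-class cosine, and the result is scaled by `s` before the usual cross-entropy. The function name `am_softmax_loss`, the default values `s=30.0` and `m=0.35`, and the input shapes are assumptions made for this example. The paper's Transition layer and capsule network are not reproduced here.

```python
import numpy as np


def am_softmax_loss(features, weights, labels, s=30.0, m=0.35):
    """Generic additive margin softmax loss (illustrative sketch).

    features: (n, d) array of per-sample feature vectors
    weights:  (d, c) array, one column per class
    labels:   (n,) integer class indices
    s, m:     scale and additive margin (assumed defaults, not from the paper)
    """
    features = np.asarray(features, dtype=float)
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels, dtype=int)
    rows = np.arange(len(labels))

    # L2-normalize rows of features and columns of weights,
    # so that the matrix product below holds cosine similarities.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    w = weights / np.linalg.norm(weights, axis=0, keepdims=True)
    cos = f @ w  # shape (n, c)

    # Subtract the margin from the target-class cosine only, then scale.
    margin = np.zeros_like(cos)
    margin[rows, labels] = m
    logits = s * (cos - margin)

    # Numerically stable log-softmax followed by mean cross-entropy.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[rows, labels].mean()
```

Because the margin lowers only the target-class logit, the loss with `m > 0` is never smaller than with `m = 0` on the same inputs. This is what pushes training toward smaller intra-class variation and larger inter-class separation.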