Public Welfare Technology Research Program of Zhejiang Province (No. LGG19F010011); Zhejiang Shuren University Special fund of Fundamental Research Funds for the Universities of Zhejiang (No. 2020XZ009)
ZHENG Qi-hang, WANG Zhang-quan, LIU Ban-teng, et al. Research on Family Activity Recognition Method Based on Additive Margin Capsule Network[J]. Acta Electronica Sinica, 2020, 48(8): 1580-1586. DOI: 10.3969/j.issn.0372-2112.2020.08.017.
Research on Family Activity Recognition Method Based on Additive Margin Capsule Network
Abstract
This paper studies audio-based family activity recognition and proposes a recognition model based on an additive margin capsule neural network. The objective function of a conventional capsule neural network constrains only the length (norm) of the output capsules. To address this drawback, we take a geometric perspective and add a Transition layer to the capsule network structure; the Transition layer changes the basis of the spatial relationships among capsule units, mapping them to a one-dimensional space. Additive margin Softmax is then used as the objective function, with the optimization strategy that features of the same class should vary little while features of different classes should differ greatly. This yields an objective function built on the spatial relationships of the capsule vectors and improves the model's classification ability. Finally, the method is tested by classifying family activities from audio events. The Detection and Classification of Acoustic Scenes and Events (DCASE) 2018 Challenge Task 5 dataset is used to build and test the classifier; the final average F1 score reaches 92.3%, which is better than other mainstream methods.
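To make the role of the additive margin concrete, the following is a minimal sketch of the commonly used additive margin Softmax loss, in which a margin m is subtracted from the target-class cosine score before scaling by s. The abstract does not give the paper's exact formulation or say how the Transition layer produces the per-class scores, so the function name `am_softmax_loss`, the input layout, and the default values of `s` and `m` are illustrative assumptions, not the authors' implementation.

```python
import math


def am_softmax_loss(cos_theta, labels, s=30.0, m=0.35):
    """Mean additive margin Softmax loss (illustrative sketch).

    cos_theta: list of rows, one per sample, holding a cosine-like
               score in [-1, 1] for each class (assumed input layout).
    labels:    list of integer target class indices, one per sample.
    s, m:      scale and margin; the defaults are assumed values,
               not taken from the paper.
    """
    total = 0.0
    for row, y in zip(cos_theta, labels):
        # Scale every score; subtract the margin m from the target class only.
        logits = [s * (c - m) if j == y else s * c for j, c in enumerate(row)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        peak = max(logits)
        log_z = peak + math.log(sum(math.exp(z - peak) for z in logits))
        # Negative log-probability of the target class.
        total += log_z - logits[y]
    return total / len(labels)
```

With m = 0 this reduces to ordinary Softmax cross-entropy on the scaled scores. A positive m raises the loss for the same inputs, so training has to push the target-class score further above the others, which matches the stated goal of small variation within a class and large differences between classes.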