Acta Electronica Sinica (电子学报)


一种基于TimeGAN和OCSVM的多元退化设备小子样数据增广方法

孙晨峰1, 吕卫民1, 戴洪德1, 张浩晨2   

  1. 海军航空大学岸防兵学院, 山东 烟台 264000
  2. 西北工业大学机电学院, 陕西 西安 710000
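  • Received: 2022-01-13; Revised: 2022-03-15; Published: 2022-05-19
  • About the authors: SUN Chen-feng, male, born in March 1998 in Laiyang, Shandong Province. He is currently a master's student at Naval Aviation University. E-mail: scf326228@163.com
    LÜ Wei-min (corresponding author), male, born in July 1970 in Laizhou, Shandong Province. He is currently a professor and doctoral supervisor at Naval Aviation University. His main research interest is equipment systems engineering.
    DAI Hong-de, male, born in November 1981 in Shaoyang, Hunan Province. He is currently an associate professor at Naval Aviation University. His main research interests are inertial navigation, filtering and estimation, fault diagnosis, and intelligent information processing. E-mail: 13954559561@126.com
    ZHANG Hao-chen, male, born in August 1998 in Xi'an, Shaanxi Province. He is currently a master's student at the School of Mechatronics, Northwestern Polytechnical University. His main research interests are digital manufacturing and data-driven intelligent manufacturing. E-mail: zhanghaochen817@163.com
  • Funding:
    National Natural Science Foundation of China (51975580)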

A Small Sample Data Augmentation Method for Multivariate Degradation Equipment Based on TimeGAN and OCSVM

SUN Chen-feng1, LÜ Wei-min1, DAI Hong-de1, ZHANG Hao-chen2   

  1. Coastal Defense College, Naval Aviation University, Yantai, Shandong 264000, China
  2. School of Mechatronics, Northwestern Polytechnical University, Xi'an, Shaanxi 710000, China
  • Received: 2022-01-13; Revised: 2022-03-15; Online: 2022-05-19

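Abstract:

Multivariate degradation equipment operating in complex environments suffers from scarce failure data, low accuracy in multi-source information fusion, and imbalanced data for supervised learning. To address these problems, this paper proposes a small-sample data augmentation method based on a combined model of time-series generative adversarial networks (TimeGAN) and the one-class support vector machine (OCSVM). The method introduces the TimeGAN model to fit the temporal correlations of the real data and thereby generate new multivariate degradation equipment data. A credibility criterion based on an improved maximum mean discrepancy is proposed to prevent strongly correlated features from affecting the quality evaluation of the generated data; t-distributed stochastic neighbor embedding (T-SNE) and the global maximum mean discrepancy (GMMD) are used in combination to evaluate the quality of the generated data both qualitatively and quantitatively. Based on the trained OCSVM model, anomalies in the generated data are detected and removed, further improving the quality of the generated data. The method is validated on the aircraft engine dataset C-MAPSS, and comparison with other data augmentation models verifies its feasibility and effectiveness.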
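To make the quantitative side of the evaluation concrete, the following is a minimal sketch of the standard squared maximum mean discrepancy with a Gaussian kernel, computed between a set of real samples and a set of generated samples. It is not the authors' improved GMMD criterion, whose definition is not given in this abstract. The function names `rbf_kernel` and `mmd_squared`, the bandwidth `sigma`, and the synthetic Gaussian data are all assumptions chosen for illustration.

```python
import numpy as np


def rbf_kernel(a, b, sigma=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * sigma**2))


def mmd_squared(real, generated, sigma=1.0):
    """Biased (V-statistic) estimate of the squared MMD between two sample sets."""
    k_rr = rbf_kernel(real, real, sigma)
    k_gg = rbf_kernel(generated, generated, sigma)
    k_rg = rbf_kernel(real, generated, sigma)
    return float(k_rr.mean() + k_gg.mean() - 2.0 * k_rg.mean())


# Hypothetical data: rows are samples, columns are sensor features.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 4))
close = rng.normal(0.0, 1.0, size=(200, 4))  # same distribution as `real`
far = rng.normal(3.0, 1.0, size=(200, 4))    # shifted distribution

mmd_close = mmd_squared(real, close)
mmd_far = mmd_squared(real, far)
print(f"MMD^2 (same distribution):    {mmd_close:.4f}")
print(f"MMD^2 (shifted distribution): {mmd_far:.4f}")
```

A smaller value indicates that the generated samples are harder to distinguish from the real ones under the chosen kernel. The result depends on `sigma`, so in practice the bandwidth would need to be selected to suit the data.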

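Keywords: small-sample data, data augmentation, multivariate degradation equipment, time-series generative adversarial networks, one-class support vector machine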

Abstract:

Multivariate degradation equipment operating in complex environments faces three problems: scarce failure data, low accuracy of multi-source information fusion, and imbalanced datasets for supervised learning. To address these problems, a small-sample data augmentation method is designed that combines time-series generative adversarial networks (TimeGAN) with a one-class support vector machine (OCSVM). TimeGAN is introduced to fit the temporal correlations of the real data and to generate new degradation data. A new credibility criterion based on an improved maximum mean discrepancy is proposed so that strongly correlated features do not distort the evaluation of data quality. t-distributed stochastic neighbor embedding (T-SNE) and the global maximum mean discrepancy (GMMD) are combined to evaluate the quality of the generated dataset qualitatively and quantitatively. The trained OCSVM is then used to detect and remove anomalous generated samples, further improving dataset quality. A comparison with other data generation models on the aircraft engine dataset C-MAPSS verifies the feasibility and effectiveness of the method.
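The filtering step described above can be sketched roughly as follows, assuming scikit-learn's `OneClassSVM` is fitted on real samples and generated samples predicted as outliers are discarded. The helper name `filter_generated`, the values of `nu` and `gamma`, and the synthetic data are assumptions for illustration and are not taken from the paper. Time-series windows are also assumed to have been flattened into fixed-length feature vectors beforehand.

```python
import numpy as np
from sklearn.svm import OneClassSVM


def filter_generated(real, generated, nu=0.05, gamma="scale"):
    """Fit an OCSVM on real samples and keep generated samples predicted as inliers."""
    model = OneClassSVM(kernel="rbf", nu=nu, gamma=gamma)
    model.fit(real)
    # predict() returns +1 for inliers and -1 for outliers.
    keep = model.predict(generated) == 1
    return generated[keep], keep


# Hypothetical data: rows are samples, columns are flattened features.
rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(300, 3))
plausible = rng.normal(0.0, 1.0, size=(50, 3))    # resembles the real data
implausible = rng.normal(8.0, 0.5, size=(10, 3))  # far from the real data
generated = np.vstack([plausible, implausible])

kept, mask = filter_generated(real, generated)
print(f"kept {kept.shape[0]} of {generated.shape[0]} generated samples")
```

The parameter `nu` is an upper bound on the fraction of training samples treated as outliers, so it controls how strict the filter is. A stricter setting discards more generated samples, including some that may be acceptable.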

Key words: small sample data, data augmentation, multivariate degradation equipment, time-series generative adversarial networks, one-class support vector machine

CLC number: