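School of Management, Capital Normal University, Beijing 100056, China
FENG Ting-ting: female, born in 1992 in Tianjin, master's student. Main research interest: theory and technology of public management informatization. E-mail: fengtt0702@163.com
PENG Yan: female, born in 1967 in Chongqing, Ph.D., professor. Main research interests: big data analysis and data mining. E-mail: pengyan@cnu.edu.cn
WANG Jie (corresponding author): female, born in 1977 in Huangshi, Hubei, Ph.D., associate professor. Main research interests: data mining and machine learning.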
Received: 2022-03-09
Revised: 2022-07-30
Published in print: 2023-09-25
FENG Ting-ting, PENG Yan, WANG Jie. ISGS: A Combinatorial Model for Hysteresis Effects[J]. Acta Electronica Sinica, 2023, 51(09): 2504-2509. DOI: 10.12263/DZXB.20220238.
For data sets with pronounced lag effects and small sample sizes, a single-algorithm model tends to give low prediction accuracy and poor generalization. To address this, a combined model, ISGS (ISOMAP-SMOTE-GA-SVR), is proposed. It integrates isometric feature mapping (ISOMAP), the synthetic minority oversampling technique (SMOTE), a genetic algorithm (GA), and support vector regression (SVR). First, ISOMAP and SMOTE are used to perform feature transformation on data sets with pronounced lag effects and small sample sizes. Second, SVR, chosen for its strong nonlinear modeling and generalization ability, is used for regression analysis of the data set. Finally, GA is used to optimize the SVR parameters and thereby improve the model's prediction accuracy. Simulation and comparative experiments on predicting the number of patients were carried out with the ISGS model on three data sets: meteorological factors, air quality, and the number of patients with respiratory diseases. The experimental results show that the proposed model reaches a prediction accuracy of 93.65%, compared with 83.481% for the traditional single model, and surpasses all other reference models. The model also generalizes better and handles high-dimensional, small-sample data sets more effectively.
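The abstract names the components of ISGS but does not give implementation details. As an illustration only, the following minimal sketch in plain Python shows two of the ingredients: SMOTE-style interpolation between a sample and one of its nearest neighbours, and a simple real-coded genetic algorithm that searches a box of hyperparameters. ISOMAP and SVR are omitted here. All function names, parameter values, and the `toy_cv_error` objective are assumptions introduced for this sketch; in an ISGS-like setup the objective would presumably be a cross-validated SVR error, which is not reproduced here.

```python
import math
import random


def smote_like(samples, n_new, k=3, rng=None):
    """Create n_new synthetic rows by interpolating between a randomly chosen
    row and one of its k nearest neighbours (the core idea of SMOTE)."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(samples)
        # k nearest neighbours of `base` by Euclidean distance, excluding itself
        others = [s for s in samples if s is not base]
        neighbours = sorted(others, key=lambda s: math.dist(s, base))[:k]
        mate = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (m - b) for b, m in zip(base, mate)])
    return synthetic


def genetic_search(fitness, bounds, pop_size=20, generations=30,
                   mutation_rate=0.2, rng=None):
    """Minimise `fitness` over the box `bounds`; returns (best_params, best_score)."""
    rng = rng or random.Random(0)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the better half unchanged (elitism)
        elite = sorted(population, key=fitness)[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            p1, p2 = rng.sample(elite, 2)
            # Arithmetic crossover: a convex combination of two parents
            w = rng.random()
            child = [w * a + (1 - w) * b for a, b in zip(p1, p2)]
            # Gaussian mutation, clipped back into the search box
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    step = rng.gauss(0, 0.1 * (hi - lo))
                    child[i] = min(hi, max(lo, child[i] + step))
            children.append(child)
        population = elite + children
    best = min(population, key=fitness)
    return best, fitness(best)


def toy_cv_error(params):
    """Hypothetical stand-in for a cross-validated SVR error as a function of
    (log10 C, log10 gamma). It is a placeholder bowl, not a result from the paper."""
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


# Made-up two-dimensional rows, used only to exercise the functions
rows = [[1.0, 2.0], [1.5, 1.8], [3.0, 3.5], [2.2, 2.9], [0.5, 1.0]]
new_rows = smote_like(rows, n_new=5)
augmented = rows + new_rows

search_bounds = [(-2.0, 3.0), (-4.0, 1.0)]
best_params, best_score = genetic_search(toy_cv_error, search_bounds)
```

Because each synthetic row is a point on the segment between two existing rows, the augmented set stays inside the range of the original data. The elitist selection step means the best score found never gets worse from one generation to the next.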