Acta Electronica Sinica ›› 2020, Vol. 48 ›› Issue (5): 870-877. DOI: 10.3969/j.issn.0372-2112.2020.05.006

• Academic Paper •

Prediction Algorithm of Association Between miRNAs and Diseases Based on Deep Learning

WANG Lei1, XU Tao1, SONG Chuan-dong1, WANG Hai-feng1, YOU Zhu-hong2, SONG Ke-jian3, YAN Xin4

  1. College of Information Science and Engineering, Zaozhuang University, Zaozhuang, Shandong 277100, China;
  2. Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, Xinjiang 830011, China;
  3. School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China;
  4. School of Foreign Languages, Zaozhuang University, Zaozhuang, Shandong 277100, China
  • Received: 2019-07-11 Revised: 2019-10-09 Online: 2020-05-25 Published: 2020-05-25
  • Corresponding author: YOU Zhu-hong
  • About the authors: WANG Lei, male, born in January 1982 in Zaozhuang, Shandong. He received his Ph.D. from China University of Mining and Technology in 2018 and is currently a postdoctoral researcher at the Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences. His main research interests include big data analysis, data mining, and their applications in bioinformatics. E-mail: leiwang@ms.xjb.ac.cn; XU Tao, male, born in November 1978 in Zaozhuang, Shandong. He received his master's degree from East China Normal University in 2009 and is currently an associate professor at Zaozhuang University. His main research interests include computer networks and distributed computing. E-mail: xutao@uzz.edu.cn
  • Funding:
    National Natural Science Foundation of China (No.61702444); China Postdoctoral Science Foundation (No.2019M653804); West Light Foundation of the Chinese Academy of Sciences (No.2018-XBQNXZ-B-008)

Abstract: Numerous studies have shown that microRNAs (miRNAs) play an important role in human complex diseases. Identifying associations between miRNAs and diseases is important for improving the treatment of complex diseases. However, traditional experimental methods are often limited by small scale and high cost, so computational methods are urgently needed to predict potential miRNA-disease associations quickly and effectively. In this study, a method is proposed that predicts miRNA-disease associations by combining a stacked autoencoder, a deep learning algorithm, with a rotation forest classifier. The method extracts high-level features that fuse disease semantic similarity, miRNA functional similarity, and miRNA sequence information, and then classifies them. In cross-validation experiments, the method achieved a prediction accuracy of 90.30% on the HMDD v3.0 dataset. In addition, a case study was carried out on breast neoplasms, a complex human disease: 28 of the 30 miRNAs with the highest predicted scores were confirmed to be associated with the disease. These results indicate that the method is an effective tool for predicting miRNA-disease associations and can provide highly reliable candidate miRNAs for biological experiments.

Key words: deep learning, miRNA-disease association, stacked autoencoder, rotation forest
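
The abstract says the model works on features that fuse disease semantic similarity, miRNA functional similarity, and miRNA sequence information, but it does not say how the three sources are encoded or combined. The sketch below is one plausible reading, not the authors' implementation. It assumes that sequence information is summarized as normalized 3-mer frequencies and that the three sources are simply concatenated into one descriptor per miRNA-disease pair. The function names `kmer_frequencies` and `pair_descriptor`, the choice of k = 3, and all toy values are hypothetical.

```python
from itertools import product

NUCLEOTIDES = "ACGU"


def kmer_frequencies(sequence, k=3):
    """Return normalized k-mer frequencies of an RNA sequence in a fixed order.

    Assumption: k-mer frequencies stand in for the unspecified
    "sequence information" mentioned in the abstract.
    """
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        if window in counts:  # skip windows containing other symbols
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def pair_descriptor(mirna_idx, disease_idx, mirna_func_sim,
                    disease_sem_sim, mirna_seqs, k=3):
    """Concatenate the three information sources for one miRNA-disease pair."""
    return (
        list(mirna_func_sim[mirna_idx])                # miRNA functional similarity row
        + kmer_frequencies(mirna_seqs[mirna_idx], k)   # assumed sequence encoding
        + list(disease_sem_sim[disease_idx])           # disease semantic similarity row
    )


# Toy inputs for illustration only; these values are not taken from HMDD v3.0.
mirna_seqs = ["UGAGGUAGUAGGUUGUAUAGUU", "UAGCAGCACGUAAAUAUUGGCG"]
mirna_func_sim = [[1.0, 0.4], [0.4, 1.0]]
disease_sem_sim = [[1.0, 0.2, 0.1], [0.2, 1.0, 0.5], [0.1, 0.5, 1.0]]

vec = pair_descriptor(0, 2, mirna_func_sim, disease_sem_sim, mirna_seqs)
print(len(vec))  # 2 similarity values + 64 three-mer frequencies + 3 similarity values
```

In a pipeline like the one the abstract outlines, descriptors of this kind would be passed to the stacked autoencoder to obtain compact high-level features, which the rotation forest classifier would then label as associated or not associated.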

CLC Number: