Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (8): 1830-1839. DOI: 10.12263/DZXB.20211288
LIU Qun (刘群), TAN Hong-sheng (谭洪胜), ZHANG You-min (张优敏), WANG Guo-yin (王国胤)
Received: 2021-09-17
Revised: 2022-04-21
Online: 2022-08-25
Published: 2022-09-08
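Abstract:
Research on network representation learning has produced many results, but most network representation learning models ignore the dynamics and heterogeneity of networks. They cannot separate the coupled temporal and spatial (structural) features of a network, nor can they capture its rich semantic information. This paper proposes a meta-path-based representation learning method for dynamic heterogeneous networks. First, the neighborhood structure of each node is partitioned by time into different subspace structures, and all time-weighted meta-path sequences are sampled for each node. Second, a gated recurrent unit (GRU) integrates the neighborhood information over all of a node's time-weighted meta-path sequences. Finally, a bidirectional GRU with an attention mechanism learns spatio-temporal context information from the fused node sequences to obtain the final representation vector of each node. Experiments on real-world datasets show that, on the downstream tasks of node classification, clustering, and visualization, the proposed algorithm improves considerably on the baseline methods. Micro-F1 on the node classification task improves by 1.09%~3.72% on average, and ARI on the node clustering task improves by 3.23%~14.49%.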
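The first step of the abstract (splitting each node's neighborhood by time, then sampling meta-path sequences) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the `(u, v, t)` edge format, the helper names `split_by_time` and `sample_meta_path`, and the use of uniform neighbor sampling. The abstract does not specify how the time weights are computed, so the weighting itself is left out; this is not the authors' implementation.

```python
import random
from collections import defaultdict


def split_by_time(edges):
    """Group undirected (u, v, t) edges into one adjacency list per time step t."""
    snapshots = defaultdict(lambda: defaultdict(list))
    for u, v, t in edges:
        snapshots[t][u].append(v)
        snapshots[t][v].append(u)
    return snapshots


def sample_meta_path(adj, node_type, start, meta_path, rng):
    """Walk from `start`, visiting node types in the order given by `meta_path`.

    `meta_path` is a sequence of single-letter node types such as "APA".
    Neighbors are chosen uniformly at random (an assumption; no time
    weighting is applied). Returns None if the walk cannot be completed.
    """
    if node_type[start] != meta_path[0]:
        return None
    walk = [start]
    for wanted in meta_path[1:]:
        candidates = [n for n in adj.get(walk[-1], []) if node_type[n] == wanted]
        if not candidates:
            return None
        walk.append(rng.choice(candidates))
    return walk


# Toy author-paper graph with two time steps (hypothetical data).
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P"}
edges = [("a1", "p1", 0), ("a2", "p1", 0), ("a1", "p2", 1)]
snapshots = split_by_time(edges)
rng = random.Random(0)
walk_t0 = sample_meta_path(snapshots[0], node_type, "a1", "APA", rng)
walk_t1 = sample_meta_path(snapshots[1], node_type, "a1", "APA", rng)
```

Sampling per snapshot keeps the structural context of each time step separate, which is what would allow a downstream sequence model such as a GRU to treat the per-time-step walks as an ordered sequence.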
LIU Qun, TAN Hong-sheng, ZHANG You-min, WANG Guo-yin. Dynamic Heterogeneous Network Representation Method Based on Meta-Path[J]. Acta Electronica Sinica, 2022, 50(8): 1830-1839.
| Dataset | Node type | Node count | Time | Labels | Meta-paths |
|---|---|---|---|---|---|
| AMiner | Paper (P) | 18 181 | 16 | 5 | APA, APCPA |
| | Conference (C) | 22 | | | |
| | Author (A) | 22 942 | | | |
| DBLP-4 | Paper (P) | 14 328 | 4 | 4 | APA, APCPA, APTPA |
| | Conference (C) | 20 | | | |
| | Author (A) | 4 057 | | | |
| | Term (T) | 8 898 | | | |
| DBLP-10 | Paper (P) | 8 970 | 6 | 10 | PAP, PCP |
| | Conference (C) | 51 | | | |
| | Author (A) | 15 019 | | | |

Table 1 Dataset description
| Dataset | Metric | Training ratio | DeW | GCN | M2v | HAN | M2DNE | DHNE | DyHATR | DHNRstu | DHNR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| AMiner | Macro-F1 | 20% | 0.9698 | 0.9727 | 0.9741 | 0.9605 | 0.9652 | 0.8453 | 0.9710 | 0.9727 | **0.9762** |
| | | 50% | 0.9711 | 0.9744 | 0.9743 | 0.9621 | 0.9657 | 0.8449 | 0.9721 | 0.9741 | **0.9768** |
| | | 80% | 0.9701 | 0.9747 | 0.9748 | 0.9644 | 0.9673 | 0.8461 | 0.9728 | 0.9734 | **0.9771** |
| | Micro-F1 | 20% | 0.9721 | 0.9749 | 0.9765 | 0.9642 | 0.9673 | 0.8539 | 0.9727 | 0.9752 | **0.9775** |
| | | 50% | 0.9734 | 0.9758 | 0.9767 | 0.9663 | 0.9679 | 0.8556 | 0.9735 | 0.9758 | **0.9788** |
| | | 80% | 0.9725 | 0.9766 | 0.9774 | 0.9681 | 0.9691 | 0.8555 | 0.9749 | 0.9756 | **0.9793** |
| DBLP-4 | Macro-F1 | 20% | 0.8769 | 0.9093 | 0.9232 | 0.9251 | **0.9401** | 0.9157 | 0.9227 | 0.9341 | 0.9391 |
| | | 50% | 0.9074 | 0.9181 | 0.9229 | 0.9262 | 0.9413 | 0.8914 | 0.9334 | 0.9367 | **0.9431** |
| | | 80% | 0.9202 | 0.9237 | 0.9233 | 0.9257 | 0.9410 | 0.9078 | 0.9369 | 0.9388 | **0.9439** |
| | Micro-F1 | 20% | 0.8858 | 0.9151 | 0.9285 | 0.9301 | **0.9422** | 0.9245 | 0.9261 | 0.9377 | 0.9412 |
| | | 50% | 0.9154 | 0.9144 | 0.9279 | 0.9314 | 0.9434 | 0.9103 | 0.9352 | 0.9415 | **0.9486** |
| | | 80% | 0.9262 | 0.9197 | 0.9280 | 0.9410 | 0.9446 | 0.9262 | 0.9443 | 0.9436 | **0.9488** |
| DBLP-10 | Macro-F1 | 20% | 0.8222 | 0.8419 | 0.8544 | 0.8415 | 0.8634 | 0.7881 | 0.8715 | 0.8812 | **0.8972** |
| | | 50% | 0.8564 | 0.8476 | 0.8713 | 0.8827 | 0.8862 | 0.7924 | 0.8835 | 0.8952 | **0.8984** |
| | | 80% | 0.8668 | 0.8631 | 0.8778 | 0.8892 | 0.8901 | 0.8092 | 0.8895 | 0.8961 | **0.8993** |
| | Micro-F1 | 20% | 0.8387 | 0.8743 | 0.8621 | 0.8851 | 0.8818 | 0.7771 | 0.8724 | **0.9011** | 0.8974 |
| | | 50% | 0.8521 | 0.8921 | 0.8719 | 0.8994 | 0.8871 | 0.7841 | 0.8910 | 0.9038 | **0.9073** |
| | | 80% | 0.8566 | 0.8964 | 0.8805 | 0.9012 | 0.8923 | 0.8218 | 0.8992 | 0.9094 | **0.9101** |

Table 2 Node multi-class classification results
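For readers unfamiliar with the two classification metrics reported above, the following is a minimal pure-Python sketch of how Micro-F1 and Macro-F1 are conventionally computed for single-label multi-class predictions. The function name `f1_scores` and the toy labels are illustrative assumptions; the source does not say which implementation the authors used.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro-F1: unweighted mean of the per-class F1 scores.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro-F1: F1 computed from counts pooled over all classes.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return micro, macro


# Toy example with three classes (hypothetical data).
micro, macro = f1_scores([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2])
```

Macro-F1 gives every class equal weight regardless of its size, while Micro-F1 is dominated by the larger classes; in the single-label setting Micro-F1 equals plain accuracy.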
| Method | AMiner NMI | AMiner ARI | DBLP-4 NMI | DBLP-4 ARI | DBLP-10 NMI | DBLP-10 ARI |
|---|---|---|---|---|---|---|
| DeW | 0.782 | 0.691 | 0.761 | 0.811 | 0.404 | 0.212 |
| GCN | 0.879 | 0.907 | 0.724 | 0.770 | 0.396 | 0.483 |
| M2v | 0.727 | 0.591 | 0.780 | 0.830 | 0.294 | 0.171 |
| HAN | 0.787 | 0.810 | 0.781 | 0.835 | 0.430 | 0.540 |
| M2DNE | 0.735 | 0.657 | 0.756 | 0.786 | 0.306 | 0.195 |
| DHNE | 0.174 | 0.456 | 0.238 | 0.093 | 0.152 | 0.061 |
| DyHATR | 0.823 | 0.870 | 0.768 | 0.822 | 0.376 | 0.419 |
| DHNRstu | 0.911 | 0.947 | 0.784 | 0.843 | 0.382 | 0.551 |
| DHNR | 0.916 | 0.943 | 0.810 | 0.862 | 0.413 | 0.553 |

Table 3 Node clustering results
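The two clustering metrics in Table 3 can be computed from a contingency count of true labels against predicted clusters. The sketch below is a minimal pure-Python version under stated assumptions: the function names `ari` and `nmi` are illustrative, NMI is normalized by the arithmetic mean of the two entropies (one common choice; the source does not state which normalization was used), and at least two samples are assumed.

```python
from collections import Counter
from math import comb, log


def ari(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same samples (n >= 2)."""
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))
    sum_ij = sum(comb(c, 2) for c in pairs.values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / comb(n, 2)  # expected index under random labeling
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. a single cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def nmi(labels_true, labels_pred):
    """Normalized mutual information, normalized by the mean of the entropies."""
    n = len(labels_true)
    pairs = Counter(zip(labels_true, labels_pred))
    a, b = Counter(labels_true), Counter(labels_pred)
    mi = sum(c / n * log(c * n / (a[i] * b[j])) for (i, j), c in pairs.items())
    h_a = -sum(c / n * log(c / n) for c in a.values())
    h_b = -sum(c / n * log(c / n) for c in b.values())
    denom = (h_a + h_b) / 2
    return mi / denom if denom else 1.0


# Same partition with swapped cluster ids, then an uninformative partition.
same_ari = ari([0, 0, 1, 1], [1, 1, 0, 0])
same_nmi = nmi([0, 0, 1, 1], [1, 1, 0, 0])
bad_ari = ari([0, 0, 1, 1], [0, 1, 0, 1])
bad_nmi = nmi([0, 0, 1, 1], [0, 1, 0, 1])
```

Both metrics are invariant to how cluster ids are numbered. ARI is corrected for chance and can be negative, whereas NMI lies in [0, 1].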