A Node Classification Method Based on Graph Attention and Improved Transformer

LI Xin; LU Wei; MA Zhao-yi; ZHU Pan; KANG Bin

doi:10.12263/DZXB.20230515

PAPERS | Updated: 2025-12-08

- ACTA ELECTRONICA SINICA Vol. 52, Issue 8, Pages: 2799-2810 (2024)
- Author affiliation:
  
  School of Internet of Things, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003
- Funding:
  
  National Natural Science Foundation of China (62171232); Jiangsu Provincial Key Research and Development Program (BE2020729)
- DOI: 10.12263/DZXB.20230515
  CLC: TP391
- Received: 05 June 2023,
  
  Revised: 30 July 2023,
  
  Published: 25 August 2024

LI Xin, LU Wei, MA Zhao-yi, et al. A Node Classification Method Based on Graph Attention and Improved Transformer[J]. Acta Electronica Sinica, 2024, 52(08): 2799-2810. DOI：10.12263/DZXB.20230515

Abstract

Currently, graph Transformers mainly model graph data by attaching auxiliary modules to the traditional Transformer framework. However, these methods leave the original Transformer architecture unchanged, and their modeling accuracy still has room for improvement. This paper therefore proposes a node classification method based on graph attention and an improved Transformer. The method first constructs a topology-enhanced node embedding to strengthen the learning of graph structure. It then applies a multi-head attention mechanism based on a secondary mask to aggregate and update node features. Finally, it introduces pre-normalization (pre-Norm) and skip connections to improve the interlayer structure of the Transformer, which avoids the over-smoothing problem caused by the convergence of node features. Experimental results show that, compared with 6 typical baseline models, the proposed method achieves the best results on all evaluation metrics. Moreover, it handles node classification on both small- and medium-scale datasets, improving classification performance across the board.
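The abstract names two mechanisms that are easier to picture in code: attention restricted by a mask, and a pre-Norm layer with skip connections. The sketch below is only an illustration under stated assumptions, not the authors' implementation. It uses a single attention head, takes the mask to be the adjacency matrix plus self-loops (the abstract does not define the secondary mask), omits learnable normalization parameters, and uses a ReLU feed-forward block. All function and variable names are hypothetical.

```python
import numpy as np


def layer_norm(x, eps=1e-5):
    """Normalize each node's feature vector (no learnable scale or shift)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


def masked_attention(h, adj, w_q, w_k, w_v):
    """Single-head attention where each node attends only to itself and its neighbors.

    Assumption: the mask is adjacency plus self-loops, standing in for the
    paper's secondary mask, which the abstract does not specify.
    """
    q, k, v = h @ w_q, h @ w_k, h @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])
    mask = (adj + np.eye(adj.shape[0])) > 0
    scores = np.where(mask, scores, -np.inf)
    # Softmax over allowed positions; masked entries become exactly zero.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


def pre_norm_layer(h, adj, params):
    """Pre-Norm layer: normalize first, apply the sublayer, then add the skip connection."""
    attn_out, _ = masked_attention(layer_norm(h), adj, *params["attn"])
    h = h + attn_out                      # skip connection around attention
    w1, w2 = params["ffn"]
    ffn_out = np.maximum(layer_norm(h) @ w1, 0.0) @ w2
    return h + ffn_out                    # skip connection around the feed-forward block


# Toy example: a 4-node path graph 0-1-2-3 with 8-dimensional node features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
d = 8
h = rng.normal(size=(4, d))
params = {
    "attn": [rng.normal(size=(d, d)) * 0.1 for _ in range(3)],
    "ffn": (rng.normal(size=(d, 2 * d)) * 0.1, rng.normal(size=(2 * d, d)) * 0.1),
}
out = pre_norm_layer(h, adj, params)
_, weights = masked_attention(layer_norm(h), adj, *params["attn"])
```

Because each sublayer output is added back to the unnormalized input, the original node features keep a direct path through the stack. This is one common intuition for why residual paths help against over-smoothing; the abstract itself does not describe the mechanism in this detail.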
