
Noise Filtering and Feature Enhancement Based Graph Neural Network Method for Fraud Detection
LI Kang-he, HUANG Zhen-hua
Acta Electronica Sinica, 2023, Vol. 51, Issue 11: 3053-3060.
Existing graph neural network (GNN)-based fraud detection methods have at least three shortcomings: (1) they do not adequately address the imbalanced distribution of sample labels; (2) they do not account for the noise that fraudsters deliberately create to evade detection; and (3) they overlook the limitations imposed by sparse connections in fraud data. To address these shortcomings, this paper proposes NFE-GNN (Noise Filtering and Feature Enhancement based Graph Neural Network method for fraud detection). NFE-GNN first applies a sampling technique based on the dataset's fraud rate to balance benign and fraudulent samples. It then introduces a parameterized distance function to compute the similarity between nodes, and obtains the optimal noise filtering threshold through adaptive reinforcement learning. Finally, it presents an algorithm that adds connections between fraudulent samples, which enriches the topology information in the graph and enhances the feature representation capability of fraudulent samples. Experimental results on two publicly available datasets show that NFE-GNN achieves better detection performance than state-of-the-art graph neural network methods.
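The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the sampling formula, the parameterized distance function, nor the reinforcement learning update, so the code substitutes an inverse-class-frequency sampler, a weighted Euclidean similarity with fixed weights, and a hand-set threshold. All function names and parameters (`balanced_sample`, `similarity`, `filter_neighbors`, `add_fraud_edges`, `w`, `threshold`) are hypothetical.

```python
import math
import random
from collections import Counter


def balanced_sample(labels, n_samples, seed=0):
    """Draw node indices with probability inversely proportional to class size.

    Assumption: a stand-in for the paper's fraud-rate-based sampling.
    """
    counts = Counter(labels)
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    return rng.choices(range(len(labels)), weights=weights, k=n_samples)


def similarity(x, y, w):
    """Map a weighted Euclidean distance to a similarity in (0, 1].

    Assumption: `w` plays the role of the learnable distance parameters.
    """
    dist = math.sqrt(sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, y)))
    return math.exp(-dist)


def filter_neighbors(center, neighbors, features, w, threshold):
    """Keep only neighbors whose similarity to `center` reaches `threshold`.

    Assumption: the threshold is fixed here; the paper learns it adaptively.
    """
    return [
        n for n in neighbors
        if similarity(features[center], features[n], w) >= threshold
    ]


def add_fraud_edges(edges, fraud_nodes, features, w, threshold):
    """Link pairs of known fraudulent nodes that are similar enough.

    Assumption: a stand-in for the paper's connection-adding step.
    """
    new_edges = set(edges)
    for i, u in enumerate(fraud_nodes):
        for v in fraud_nodes[i + 1:]:
            if similarity(features[u], features[v], w) >= threshold:
                new_edges.add((min(u, v), max(u, v)))
    return new_edges
```

With nine benign labels and one fraudulent label, `balanced_sample` draws the single fraudulent node about half of the time, which is the balancing effect the abstract describes. In the paper the threshold passed to `filter_neighbors` would come from the reinforcement learning procedure rather than being set by hand.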
fraud detection / class imbalance / node classification / graph data / graph neural network / performance evaluation
Table 1 Dataset statistics

| Dataset | Nodes | Fraud rate (%) | Relation | Edges |
|---|---|---|---|---|
| YelpChi | 45,954 | 14.5 | R-U-R | 49,315 |
| | | | R-T-R | 573,616 |
| | | | R-S-R | 3,402,743 |
| | | | ALL | 3,846,979 |
| Amazon | 11,944 | 9.5 | U-P-U | 175,608 |
| | | | U-S-U | 3,566,479 |
| | | | U-V-U | 1,036,737 |
| | | | ALL | 4,398,392 |
Table 2 Hyperparameter settings of the proposed NFE-GNN method and the candidate values considered during tuning

| Dataset | d | b | lr | k | | | | μ | L |
|---|---|---|---|---|---|---|---|---|---|
| YelpChi | {8,16,32,64} | 1,024 | 0.01 | {1,0.5,0.25,0.125} | {0.5,1,2,4} | 0.001 | {0.08,0.06,0.04,0.02,0.01} | 700 | {1,2,3,4,5,6} |
| Amazon | {8,16,32,64} | 256 | 0.005 | {1,0.5,0.25,0.125} | {0.5,1,2,4} | 0.001 | {0.08,0.06,0.04,0.02,0.01} | 700 | {1,2,3,4,5,6} |
Table 3 Detection performance of the proposed NFE-GNN method compared with nine existing models

| Category | Method | YelpChi AUC | YelpChi F1-macro | YelpChi G-Mean | Amazon AUC | Amazon F1-macro | Amazon G-Mean |
|---|---|---|---|---|---|---|---|
| Existing methods | GCN | 0.5983 | 0.5620 | 0.4365 | 0.8369 | 0.6486 | 0.5718 |
| | GAT | 0.5715 | 0.4879 | 0.1659 | 0.8102 | 0.6464 | 0.6675 |
| | GraphSAGE | 0.5439 | 0.4405 | 0.2589 | 0.7589 | 0.6416 | 0.5949 |
| | DR-GCN | 0.5921 | 0.5523 | 0.4038 | 0.8295 | 0.6488 | 0.5357 |
| | GraphSAINT | 0.6999 | 0.5960 | 0.5908 | 0.8701 | 0.7626 | 0.7963 |
| | PC-GNN | 0.7987 | 0.6300 | 0.7160 | 0.9586 | 0.8956 | 0.9030 |
| | GraphConsis | 0.6983 | 0.5870 | 0.5857 | 0.8741 | 0.7512 | 0.7677 |
| | CARE-GNN | 0.7619 | 0.6332 | 0.6791 | 0.9067 | 0.8990 | 0.8962 |
| | AO-GNN | 0.8805 | 0.7042 | 0.8134 | 0.9640 | 0.8921 | 0.9096 |
| Proposed method | NFE-GNN | 0.9013 | 0.7553 | 0.8257 | 0.9762 | 0.9192 | 0.9311 |
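Two of the metrics reported in Table 3, F1-macro and G-Mean, can be computed from a binary confusion matrix. The sketch below assumes the usual definitions (F1-macro as the unweighted mean of the per-class F1 scores, G-Mean as the geometric mean of the true positive rate and the true negative rate); the source does not restate them, and the function names are hypothetical.

```python
import math


def confusion(y_true, y_pred):
    """Return (tp, fp, tn, fn), treating label 1 as fraud and 0 as benign."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def _f1(tp, fp, fn):
    """F1 of one class; defined as 0 when the class is never hit."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def f1_macro(y_true, y_pred):
    """Unweighted mean of the fraud-class and benign-class F1 scores."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    # For the benign class the roles swap: tn are its hits,
    # fn its false alarms, fp its misses.
    return (_f1(tp, fp, fn) + _f1(tn, fn, fp)) / 2


def g_mean(y_true, y_pred):
    """Geometric mean of true positive rate and true negative rate."""
    tp, fp, tn, fn = confusion(y_true, y_pred)
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(tpr * tnr)
```

Under these definitions, a classifier that labels every sample benign on a set with 20% fraud reaches 80% accuracy but a G-Mean of 0, which is why such metrics are preferred over accuracy on imbalanced data such as YelpChi and Amazon.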
Table 4 Detection performance of the NFE-GNN model and its three variants

| Category | Method | YelpChi AUC | YelpChi F1-macro | YelpChi G-Mean | Amazon AUC | Amazon F1-macro | Amazon G-Mean |
|---|---|---|---|---|---|---|---|
| Variants | NFE-GNN/n | 0.8713 | 0.7145 | 0.7944 | 0.9529 | 0.8957 | 0.9065 |
| | NFE-GNN/f | 0.8912 | 0.7479 | 0.8121 | 0.9689 | 0.9112 | 0.9203 |
| | NFE-GNN/e | 0.8825 | 0.7288 | 0.8101 | 0.9654 | 0.8998 | 0.9162 |
| Full method | NFE-GNN | 0.9013 | 0.7553 | 0.8257 | 0.9762 | 0.9192 | 0.9311 |
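The ablation in Table 4 is easiest to read as the AUC lost by each variant relative to the full model. The short sketch below does that arithmetic on the AUC values copied from the table; the dictionary layout and the `auc_drops` helper are illustrative only.

```python
# AUC values copied from Table 4.
AUC = {
    "YelpChi": {
        "NFE-GNN/n": 0.8713,
        "NFE-GNN/f": 0.8912,
        "NFE-GNN/e": 0.8825,
        "NFE-GNN": 0.9013,
    },
    "Amazon": {
        "NFE-GNN/n": 0.9529,
        "NFE-GNN/f": 0.9689,
        "NFE-GNN/e": 0.9654,
        "NFE-GNN": 0.9762,
    },
}


def auc_drops(scores, full="NFE-GNN"):
    """AUC of the full model minus the AUC of each variant, to 4 decimals."""
    return {
        name: round(scores[full] - value, 4)
        for name, value in scores.items()
        if name != full
    }


if __name__ == "__main__":
    for dataset, scores in AUC.items():
        print(dataset, auc_drops(scores))
```

On both datasets the NFE-GNN/n variant shows the largest drop (0.0300 on YelpChi, 0.0233 on Amazon) and NFE-GNN/f the smallest (0.0101 and 0.0073).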