LLM-Enhanced Self-Supervised Domain Adaptation for Cross-Lingual Misinformation Detection in Social Networks

SHEN Hang; WANG Xu; WANG Tian-jing; DAI Yuan-fei; BAI Guang-wei

doi:10.12263/DZXB.20250470


Large-Scale Models and the Internet | Updated: 2026-02-10

- ACTA ELECTRONICA SINICA Vol. 53, Issue 11, Pages: 3865-3879 (2025)
- Affiliation: College of Computer and Information Engineering (College of Artificial Intelligence), Nanjing Tech University, Nanjing, Jiangsu 211816
- Funding: National Natural Science Foundation of China (61502230; 61501224; 62202221); Natural Science Foundation of Jiangsu Province (BK20201357)
- DOI: 10.12263/DZXB.20250470
- CLC: TP393
- Received: 01 June 2025; Accepted: 09 October 2025; Published: 25 November 2025
SHEN Hang, WANG Xu, WANG Tian-jing, et al. LLM-Enhanced Self-Supervised Domain Adaptation for Cross-Lingual Misinformation Detection in Social Networks[J]. Acta Electronica Sinica, 2025, 53(11): 3865-3879. DOI: 10.12263/DZXB.20250470

Abstract

In cross-platform and cross-lingual social network environments, the spread of misinformation is characterized by high concealment and cross-cultural complexity, posing serious challenges to public opinion governance and social trust systems. Because linguistic and cultural expression differs significantly across languages, traditional deep learning-based detection methods often suffer from performance degradation in cross-domain generalization and semantic modeling, exhibiting insufficient cross-domain feature alignment, incomplete semantic representation, and limited understanding of metaphors, emotions, and cultural contexts. To address these limitations, this paper proposes a large language model (LLM)-enhanced self-supervised domain adaptation (DA) detection framework. By integrating the deep semantic modeling capacity of LLMs with the discriminative feature learning capability of contrastive learning (CL), the framework achieves robust and generalizable cross-lingual misinformation detection. The framework forms a closed loop of semantic augmentation, feature alignment, and feedback optimization. First, a prompt-based cross-lingual text augmentation mechanism guides the LLM to maintain semantic integrity and cultural adaptability during data generation. This produces high-quality samples that preserve the semantic core of the original text while conforming to the linguistic style of the target language, effectively mitigating semantic gaps in cross-lingual contexts. Next, a dual-dimensional contrastive strategy aligns local lexical features at the token level and global semantic logic at the sentence level, unifying source and target domain representations at multiple levels to enhance feature distribution consistency and cross-lingual detection stability. Finally, an LLM-assisted cross-lingual joint training mechanism is introduced, in which the contrastive loss serves as a dynamic feedback signal to guide the iterative fine-tuning of the LLM. This process progressively refines the augmentation strategy, moving the distribution of the generated samples toward the CL detector's decision boundary and enabling the co-evolution of cross-lingual data augmentation and feature learning. Experimental results on heterogeneous social media datasets, Weibo (a Chinese social platform) and PHEME (an English dataset of event-related rumor propagation), demonstrate that the proposed method significantly outperforms commercial LLM direct detection (e.g., ChatGPT-4o), mainstream deep learning models (e.g., LSTM, TextCNN, RCNN, HAN), and existing LLM-enhanced methods (e.g., LACL) in terms of accuracy and F1 score. In cross-lingual detection, the average detection accuracy of the proposed approach exceeds that of the baseline methods by more than 10 percentage points. Feature visualization analysis further confirms that the method compresses intra-class variance and enlarges inter-class separability, resulting in clearer decision boundaries and higher classification confidence.
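The abstract does not give the loss formulation, so the following is only a minimal sketch of what a dual-level (token plus sentence) contrastive objective could look like. It assumes a generic InfoNCE loss with cosine similarity, mean pooling for sentence vectors, position-aligned tokens of equal length, and a hypothetical mixing weight `alpha`; the names `info_nce`, `mean_pool`, and `dual_level_loss` are illustrative and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Mean InfoNCE loss: anchors[i] should match positives[i];
    every other entry of `positives` acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denom = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denom - logits[i]
    return total / len(anchors)


def mean_pool(tokens):
    """Average token embeddings into one sentence embedding (an assumption)."""
    dim = len(tokens[0])
    return [sum(t[d] for t in tokens) / len(tokens) for d in range(dim)]


def dual_level_loss(src_tokens, tgt_tokens, alpha=0.5, temperature=0.1):
    """Combine token-level and sentence-level contrastive terms.

    src_tokens[k] and tgt_tokens[k] hold token embeddings for the k-th
    source text and its augmented target-language counterpart.
    Assumes tokens are position-aligned and equal in number per pair.
    """
    # Sentence level: each source sentence should match its own counterpart.
    src_sent = [mean_pool(s) for s in src_tokens]
    tgt_sent = [mean_pool(t) for t in tgt_tokens]
    sent_loss = info_nce(src_sent, tgt_sent, temperature)

    # Token level: within a pair, token i should match token i.
    tok_loss = sum(
        info_nce(s, t, temperature) for s, t in zip(src_tokens, tgt_tokens)
    ) / len(src_tokens)

    return alpha * tok_loss + (1.0 - alpha) * sent_loss


# Toy data: two "texts" with two 2-d token embeddings each.
src = [[[1.0, 0.0], [0.0, 1.0]], [[-1.0, 0.0], [0.0, -1.0]]]
aligned = dual_level_loss(src, src)                  # counterparts match
shuffled = dual_level_loss(src, [src[1], src[0]])    # counterparts swapped
print(aligned < shuffled)
```

In a joint training loop of the kind the abstract describes, a scalar such as the one returned by `dual_level_loss` could plausibly be the feedback signal passed back to the LLM fine-tuning step, although how the paper wires this up is not stated here.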


