Multi-Party Private Table Screening for LLM-Driven ODQA: An Enhanced Method with MPC, Publicly Aggregable Audit, and Dynamic Reputation

HU Rui; WU Hao; PAN Yu-xuan; ZHANG Lin; LIU Yu; ZHU Kong-lin

doi:10.12263/DZXB.20250451

Large Models and the Internet | Updated: 2025-12-27

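- Multi-Party Private Table Screening for LLM-Driven ODQA: An Enhanced Method with MPC, Publicly Aggregable Audit, and Dynamic Reputation
- Acta Electronica Sinica, 2025, Vol. 53, No. 9, pp. 3089-3102
- Author affiliation:
  
  School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
- Author biographies:
  
  HU Rui, male, born in Beijing in August 2001. He is currently a master's student in artificial intelligence at Beijing University of Posts and Telecommunications. His main research interests are blockchain, privacy protection, and digital asset circulation. E-mail: hurui@bupt.edu.cn
  
  WU Hao, male, born in Beijing in October 1996. He is currently a doctoral student in artificial intelligence at Beijing University of Posts and Telecommunications. His main research interests are blockchain, consensus algorithms, and secure multi-party computation. E-mail: wuhaodoc@bupt.edu.cn
  
  PAN Yu-xuan, male, born in Guangzhou, Guangdong Province, in December 1997. He is currently a doctoral student in information and communication engineering at Beijing University of Posts and Telecommunications. His main research interests are immersive multimedia transmission and processing and 3D modeling of digital images. E-mail: panyx@bupt.edu.cn
  
  ZHANG Lin, male, born in Jinan, Shandong Province, in July 1974. He is currently director of the Beijing Big Data Center and an adjunct professor and doctoral supervisor at Beijing University of Posts and Telecommunications. His main research interests include big data and artificial intelligence and network signal processing. He has several times received the Science and Technology Progress Award of the Chinese Institute of Electronics and the Beijing Outstanding Teaching Achievement Award for Higher Education. E-mail: zhanglin@bupt.edu.cn
  
  LIU Yu, female, born in Laiyang, Shandong Province, in October 1978. She is currently an associate professor and doctoral supervisor at the School of Artificial Intelligence, Beijing University of Posts and Telecommunications. Her main research interests include image processing, communication network theory, and space-ground integrated information networks. In 2013 she won second prize in the Beijing university teaching fundamentals competition organized by the Beijing Education Union, and she was selected for the first cohort of the Beijing Higher Education "Young Elite Talents Program". She has published more than 90 academic papers. E-mail: liuy@bupt.edu.cn
  
  ZHU Kong-lin, male, born in Linyi, Shandong Province, in November 1985. He is currently a professor and doctoral supervisor at the School of Artificial Intelligence, Beijing University of Posts and Telecommunications. His main research interests include the Internet of Vehicles, edge computing, privacy computing, and digital asset circulation. He has led more than 20 projects, including a Young Scientists Project of the National Key R&D Program, National Natural Science Foundation of China grants, an equipment development and Ministry of Education joint fund, and Beijing Natural Science Foundation grants, and has published more than 60 academic papers. E-mail: klzhu@bupt.edu.cn
- Funding:
  
  National Key Research and Development Program of China (2023YFB2704500)
- DOI: 10.12263/DZXB.20250451
  CLC number: TP399
- Received: 2025-05-31
  
  Accepted: 2025-09-24
  
  Published in print: 2025-09-25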
HU Rui, WU Hao, PAN Yu-xuan, et al. Multi-Party Private Table Screening for LLM-Driven ODQA: An Enhanced Method with MPC, Publicly Aggregable Audit, and Dynamic Reputation[J]. Acta Electronica Sinica, 2025, 53(09): 3089-3102. DOI: 10.12263/DZXB.20250451

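Abstract (translated from the Chinese)

Large language model (LLM)-driven open-domain question answering (ODQA) systems, such as the GIST (Generating Identifiers and Selecting chunks for Tables) framework, have shown great potential for processing massive tabular data and have attracted wide attention. However, when an ODQA system must integrate private tables from multiple parties for steps such as Top-K candidate screening, traditional methods need access to all of the raw data, which raises challenges for data privacy, computational transparency, and the trustworthiness of participant behavior. Existing work achieves public verifiability with zero-knowledge proofs and stake-based mechanisms, but generating and verifying individual proofs is too costly at large scale, and traditional stake-based mechanisms are limited in fairness and in adaptability to dynamic environments. To address this, this paper proposes an enhanced method for multi-party private table screening in LLM-driven ODQA, built on multi-party computation (MPC), publicly aggregable auditing, and a dynamic reputation mechanism. The Top-K screening of multi-party private tables is carried out under MPC to protect each party's private data. An efficient aggregate audit mechanism combines zero-knowledge proofs with random sampling, aggregate proof construction, time-window-based batching, and error localization, so that the correctness of scoring and ranking can be verified publicly and in batches. A blockchain-based dynamic reputation feedback mechanism further improves fairness and constrains malicious behavior. Experiments show that, while preserving privacy, the proposed Top-K candidate screening agrees closely with GIST's original screening, reaching an average Top-50 recall of 0.91 and an average Jaccard index of 0.83, and does not degrade end-to-end ODQA performance. For large-scale tasks, both proof generation and verification for public auditing become more efficient, saving about 87% of proving time compared with individual proofs. The adaptability and fairness of the feedback mechanism are also improved.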
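The two agreement metrics quoted in the abstract can be made concrete with a small sketch. It is illustrative only: the function names `recall_at_k` and `jaccard_at_k` and the sample table identifiers are assumptions rather than anything taken from the paper, and it assumes recall is measured against the Top-K set returned by the plaintext GIST screening.

```python
from typing import Sequence


def recall_at_k(reference: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Fraction of the reference Top-K that also appears in the candidate Top-K."""
    ref_top = set(reference[:k])
    cand_top = set(candidate[:k])
    if not ref_top:
        return 0.0
    return len(ref_top & cand_top) / len(ref_top)


def jaccard_at_k(reference: Sequence[str], candidate: Sequence[str], k: int) -> float:
    """Intersection over union of the two Top-K sets."""
    ref_top = set(reference[:k])
    cand_top = set(candidate[:k])
    union = ref_top | cand_top
    if not union:
        return 0.0
    return len(ref_top & cand_top) / len(union)


# Hypothetical rankings: plaintext screening vs. privacy-preserving screening.
plain_ranking = ["t1", "t2", "t3", "t4"]
private_ranking = ["t2", "t1", "t5", "t3"]

recall = recall_at_k(plain_ranking, private_ranking, 3)    # 2 shared of 3 -> 2/3
jaccard = jaccard_at_k(plain_ranking, private_ranking, 3)  # 2 shared of 4 -> 0.5
print(recall, jaccard)
```

For two Top-K sets of equal size, a per-query recall r corresponds to a Jaccard index of r / (2 - r), so r = 0.91 gives roughly 0.83. Averages over many queries need not satisfy this identity exactly, but the two reported figures are in line with it.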

Abstract

Large language model (LLM)-driven open-domain question answering (ODQA) systems, exemplified by frameworks such as GIST (Generating Identifiers and Selecting chunks for Tables), have attracted considerable research attention because of their potential for processing extensive tabular data. However, when such ODQA systems integrate data from multiple providers for Top-K candidate screening, traditional methods that require access to the raw data face substantial challenges in data privacy, computational transparency, and participant trustworthiness. Existing research employs zero-knowledge proofs and stake-based mechanisms to achieve public verifiability, but the overhead of generating and verifying individual proofs in large-scale scenarios is often prohibitive. Moreover, conventional stake-based mechanisms are limited in fairness and in adaptability to dynamic environments. This paper proposes an enhanced method for multi-party private table screening in LLM-driven ODQA that integrates multi-party computation (MPC), a publicly aggregable audit mechanism, and a dynamic reputation system. The Top-K multi-party private table screening process is carried out under MPC to protect data privacy. An efficient aggregable audit mechanism is also introduced; it combines zero-knowledge proof techniques with random sampling, aggregate proof construction, time-window-based batching, and error localization, so that the correctness of the scoring and ranking process can be verified publicly and in batches. A blockchain-based dynamic reputation feedback mechanism further improves system fairness and constrains malicious behavior. Experimental evaluations show that the proposed Top-K candidate screening method, while preserving privacy, is highly consistent with the original GIST screening approach, attaining an average Top-50 recall of 0.91 and an average Jaccard index of 0.83, and does not degrade end-to-end ODQA task performance. Furthermore, publicly auditable proof generation and verification become more efficient for large-scale tasks, saving approximately 87% of proof time compared with individual proofs. The adaptability and fairness of the feedback mechanism are also improved.
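To give a feel for how scores can be combined without any party revealing its own inputs, the sketch below uses plain additive secret sharing. It is a toy under stated assumptions and not the paper's protocol: the modulus `PRIME`, the helper names, the integer scores, and the premise that each table's score is a sum of per-party contributions are all hypothetical, the parties are assumed honest, and no zero-knowledge proof or audit step is modeled.

```python
import secrets
from typing import Dict, List

PRIME = 2**61 - 1  # share modulus (an assumption; any sufficiently large prime works)


def share(value: int, n_parties: int) -> List[int]:
    """Split value into n additive shares that sum to value modulo PRIME."""
    parts = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    parts.append((value - sum(parts)) % PRIME)
    return parts


def reconstruct(parts: List[int]) -> int:
    """Recombine additive shares."""
    return sum(parts) % PRIME


def toy_top_k(contributions: List[Dict[str, int]], k: int) -> List[str]:
    """Sum per-party score contributions on shares, then open only the totals."""
    n = len(contributions)
    tables = sorted(contributions[0])
    # held[p][t] is the running sum of shares that party p holds for table t.
    held = [{t: 0 for t in tables} for _ in range(n)]
    for contrib in contributions:
        for t in tables:
            for p, s in enumerate(share(contrib[t], n)):
                held[p][t] = (held[p][t] + s) % PRIME
    # Opening the totals is a simplification: a real secure Top-K would
    # compare values inside the MPC and reveal only the selected identifiers.
    totals = {t: reconstruct([held[p][t] for p in range(n)]) for t in tables}
    return sorted(tables, key=lambda t: (-totals[t], t))[:k]


# Hypothetical non-negative integer contributions from three parties.
scores = [
    {"t1": 5, "t2": 1, "t3": 7},
    {"t1": 2, "t2": 9, "t3": 1},
    {"t1": 4, "t2": 3, "t3": 0},
]
top2 = toy_top_k(scores, 2)
print(top2)  # totals are t1=11, t2=13, t3=8, so this prints ['t2', 't1']
```

Each individual share is uniformly random, so a single party's view of another party's contribution carries no information about it; only the aggregated totals are opened in this toy.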
