Acta Electronica Sinica ›› 2022, Vol. 50 ›› Issue (12): 2884-2918. DOI: 10.12263/DZXB.20220821

• Special Column for the 60th Anniversary of the Founding of Acta Electronica Sinica •

Testing and Repairing Methods for Machine Learning Model Security

张笑宇1,2, 沈超1,2(), 蔺琛皓1,2, 李前1,2, 王骞3, 李琦4,5, 管晓宏1,2   

  1. School of Cyber Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
    2. Ministry of Education Key Laboratory for Intelligent Networks and Network Security (Xi'an Jiaotong University), Xi'an, Shaanxi 710049, China
    3. School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei 430072, China
    4. Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084, China
    5. Zhongguancun Laboratory, Beijing 100094, China
  • Received: 2022-07-14 Revised: 2022-10-20 Online: 2022-12-25
    • Corresponding author:
    • SHEN Chao
    • About the authors:
    • ZHANG Xiao-yu was born in Zhengzhou, Henan, in 1999. He is a Ph.D. candidate at the School of Cyber Science and Engineering, Xi'an Jiaotong University. His research interests include software testing for artificial intelligence. E-mail: zxy0927@stu.xjtu.edu.cn
      SHEN Chao was born in Chongqing in 1985. He holds a Ph.D. degree and is a professor and Ph.D. supervisor at Xi'an Jiaotong University. His research interests include trustworthy artificial intelligence, artificial intelligence security, and cyber-physical system security.
      LIN Chen-hao was born in Xi'an, Shaanxi, in 1989. He holds a Ph.D. degree and is a distinguished research fellow and Ph.D. supervisor at Xi'an Jiaotong University. His research interests include artificial intelligence security, adversarial machine learning, and intelligent identity authentication. E-mail: linchenhao@xjtu.edu.cn
      LI Qian was born in Baoji, Shaanxi, in 1992. He holds a Ph.D. degree and is an assistant professor at Xi'an Jiaotong University. His research interests include artificial intelligence security and adversarial machine learning. E-mail: qianlix@xjtu.edu.cn
      WANG Qian was born in Wuhan, Hubei, in 1980. He holds a Ph.D. degree and is a professor and Ph.D. supervisor at Wuhan University. His research interests include artificial intelligence security, cloud computing security and privacy, wireless system security, and applied cryptography. E-mail: qianwang@whu.edu.cn
      LI Qi was born in Lin'an, Zhejiang, in 1979. He holds a Ph.D. degree and is an associate professor and Ph.D. supervisor at Tsinghua University. His research interests include Internet and cloud security, mobile security, machine learning and security, big data security, and blockchain security. E-mail: qli01@tsinghua.edu.cn
      GUAN Xiao-hong was born in Luzhou, Sichuan, in 1955. He holds a Ph.D. degree, is a professor and Ph.D. supervisor at Xi'an Jiaotong University, and is an academician of the Chinese Academy of Sciences. His research interests include network information security, networked systems, and optimal scheduling of power systems. E-mail: xhguan@xjtu.edu.cn
    • Supported by:
    • Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2020AAA0107702); National Natural Science Foundation of China (62161160337); Key Research and Development Program of Shaanxi Province (2021ZDLGY01-02)

The Testing and Repairing Methods for Machine Learning Model Security

ZHANG Xiao-yu1,2, SHEN Chao1,2(), LIN Chen-hao1,2, LI Qian1,2, WANG Qian3, LI Qi4,5, GUAN Xiao-hong1,2   

  1. School of Cyber Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China
    2. Key Laboratory for Intelligent Networks and Network Security (Xi'an Jiaotong University), Xi'an, Shaanxi 710049, China
    3. School of Cyber Science and Engineering, Wuhan University, Wuhan, Hubei 430072, China
    4. Institute for Network Sciences and Cyberspace, Tsinghua University, Beijing 100084, China
    5. Zhongguancun Laboratory, Beijing 100094, China
  • Received: 2022-07-14 Revised: 2022-10-20 Online: 2022-12-25 Published: 2023-03-20
    • Corresponding author:
    • SHEN Chao

Abstract:

In recent years, artificial intelligence technology represented by machine learning algorithms has been widely applied in fields such as computer vision, natural language processing, and speech recognition, and a wide variety of machine learning models have brought great convenience to people's lives. The workflow of a machine learning model can be divided into three stages. First, the model receives raw data, collected manually or generated by algorithms, as input and preprocesses the data with preprocessing algorithms such as data augmentation and feature extraction. Subsequently, the model defines the architecture of its neurons or layers and constructs a computational graph through operators such as convolution and pooling. Finally, the model calls the functions of a machine learning framework to implement the computational graph and perform the computation, producing the prediction for the input data according to the weights of the model's neurons. In this process, slight fluctuations in the output of a single neuron may lead to a completely different model output, which poses serious security risks. However, because the inherent vulnerability of machine learning models and their black-box behavior are insufficiently understood, it is difficult for researchers to identify or locate these potential security risks in advance, which brings many risks and hidden dangers to personal safety and property and even to national security. Studying testing and repairing methods for machine learning model security is therefore of great significance for deeply understanding the internal risks and vulnerabilities of models, comprehensively guaranteeing the security of machine learning systems, and promoting the wide application of artificial intelligence technology. Starting from different security testing properties, this paper introduces existing machine learning model security testing and repairing techniques in detail, summarizes and analyzes the deficiencies of existing research, and discusses the technical progress and future challenges of testing and repairing for machine learning model security, providing guidance and reference for the secure application of models. We first introduce the structural composition of machine learning models and the main security testing properties. We then analyze, organize, and summarize the related testing and repairing methods and techniques along the three components of a machine learning model (data, algorithm, and implementation) and the six security-related testing properties (correctness, robustness, fairness, efficiency, interpretability, and privacy), and discuss the limitations of existing methods. Finally, we discuss and look ahead to the main technical challenges and development trends of testing and repairing methods for machine learning model security.
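To make the three-stage workflow above concrete, the following is a minimal sketch using PyTorch; the framework choice, layer sizes, and the random stand-in input are illustrative assumptions, not details taken from the paper. Raw data is preprocessed, a small stack of operators defines the computational graph, and the framework executes the graph to produce a prediction from the current weights.

```python
# Minimal sketch of the three-stage workflow (illustrative only, assuming PyTorch).
import torch
import torch.nn as nn

# Stage 1: preprocess raw input data (here: simple normalization of a random "image").
raw = torch.rand(1, 3, 32, 32)                     # stand-in for collected raw data
x = (raw - raw.mean()) / (raw.std() + 1e-8)        # basic preprocessing / feature scaling

# Stage 2: define the layer architecture; operators such as convolution and pooling
# build the computational graph when the forward pass runs.
model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),     # convolution operator
    nn.ReLU(),
    nn.MaxPool2d(2),                               # pooling operator
    nn.Flatten(),
    nn.Linear(8 * 16 * 16, 10),                    # classification head (10 classes)
)

# Stage 3: the framework executes the graph; the prediction is computed from the
# current neuron weights.
model.eval()
with torch.no_grad():
    logits = model(x)
    pred = logits.argmax(dim=1)
print("predicted class:", pred.item())
```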

Keywords: artificial intelligence security, machine learning security, machine learning model testing, machine learning model repairing, software testing, software repairing

Abstract:

In recent years, artificial intelligence technology driven by machine learning algorithms has been widely used in many fields, such as computer vision, natural language processing, and speech recognition, and a variety of machine learning models have greatly facilitated people's lives. The workflow of a machine learning model consists of three stages. First, the model receives raw data, collected manually or generated by algorithms, as input and preprocesses the data through preprocessing algorithms such as data augmentation and feature extraction. Subsequently, the model defines the architecture of its neurons or layers and constructs a computational graph through operators (e.g., convolution and pooling). Finally, the model calls machine learning framework functions to execute the computational graph and calculates the prediction result for the input data according to the weights of the model's neurons. In this process, slight fluctuations in the output of individual neurons may lead to an entirely different model output, which can bring huge security risks. However, due to the insufficient understanding of the inherent vulnerability of machine learning models and their black-box behavior, it is difficult for researchers to identify or locate these potential security risks in advance, which brings many risks and hidden dangers to personal and property safety and even national security. Studying the testing and repairing methods for machine learning model security is therefore of great significance: it helps to deeply understand the internal risks and vulnerabilities of models, to comprehensively guarantee the security of machine learning systems, and to promote the wide application of artificial intelligence technology. Existing testing research on machine learning model security has mainly focused on the correctness, robustness, and other testing properties of the model and has achieved certain results. Starting from different security testing properties, this paper introduces the existing machine learning model security testing and repairing techniques in detail, summarizes and analyzes the deficiencies of existing research, and discusses the technical progress and challenges of machine learning model security testing and repairing, providing guidance and reference for the secure application of models. In this paper, we first introduce the structural composition of machine learning models and the main security testing properties. Afterwards, we systematically summarize and analyze the existing work from the three components of the machine learning model (data, algorithm, and implementation) and six model security-related testing properties (correctness, robustness, fairness, efficiency, interpretability, and privacy). We also discuss the effectiveness and limitations of the existing testing and repairing methods. Finally, we discuss several technical challenges and potential development directions for the testing and repairing methods of machine learning model security.
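As an illustration of the sensitivity claim above (that a slight fluctuation in a single neuron can change the model output), here is a hypothetical toy probe in PyTorch. The model, the perturbation size, and the choice of neuron are all assumptions made for this sketch, and whether the prediction actually flips depends on the specific model, input, and decision margin; a security-oriented test would search for cases where it does.

```python
# Toy probe: perturb one hidden neuron's outgoing weights slightly and compare predictions.
# Illustrative sketch only; not a method from the paper.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
x = torch.randn(1, 20)

with torch.no_grad():
    before = model(x).argmax(dim=1).item()

    # Slightly perturb the outgoing weights of hidden neuron 0, then re-run the model.
    eps = 0.05
    model[2].weight[:, 0] += eps * torch.randn(2)

    after = model(x).argmax(dim=1).item()

print(f"prediction before: {before}, after perturbing one neuron: {after}")
# If the input lies near the decision boundary, even such a small fluctuation can
# flip the predicted class; testing methods try to expose exactly these cases.
```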

Key words: artificial intelligence security, machine learning security, machine learning model testing, machine learning model repairing, software testing, software repairing

CLC Number: