Acta Electronica Sinica ›› 2023, Vol. 51 ›› Issue (1): 67-75. DOI: 10.12263/DZXB.20211473

• Research Paper •

Speech Steganography Algorithm Based on Dynamic Search of Fractional Pitch Delay

TIAN Hui1,2,3, YAN Yan1,2,3, TANG Li-li2,3,4, WU Jun-yan1,2,3, WANG Hui-dong1,2,3, QUAN Han-yu1,2,3

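  1. College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian 361021, China
  2. Xiamen Key Laboratory of Data Security and Blockchain Technology, Xiamen, Fujian 361021, China
  3. Fujian Key Laboratory of Big Data Intelligence and Security, Xiamen, Fujian 361021, China
  4. College of Mechatronics and Automation, Huaqiao University, Xiamen, Fujian 361021, China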
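  • Received: 2021-11-01 Revised: 2022-09-02 Online: 2023-01-25
    • About the authors:
    • TIAN Hui, male, born in October 1982 in Chibi, Hubei. Ph.D., professor, and doctoral supervisor. His main research areas include network and information security, data security, artificial intelligence security, information hiding and detection, and digital forensics. E-mail: htian@hqu.edu.cn
      YAN Yan, female, born in February 1997 in Ganzhou, Jiangxi. Master's student at the College of Computer Science and Technology, Huaqiao University. Her main research interests are information hiding and detection, and deep learning.
    • Supported by:
    • National Natural Science Foundation of China (61972168)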

Speech Steganography Based on Dynamic Search of Fractional Pitch Delay

TIAN Hui1,2,3, YAN Yan1,2,3, TANG Li-li2,3,4, WU Jun-yan1,2,3, WANG Hui-dong1,2,3, QUAN Han-yu1,2,3   

  1. College of Computer Science and Technology, Huaqiao University, Xiamen, Fujian 361021, China
  2. Xiamen Key Laboratory of Data Security and Blockchain Technology, Xiamen, Fujian 361021, China
  3. Fujian Key Laboratory of Big Data Intelligence and Security, Xiamen, Fujian 361021, China
  4. College of Mechatronics and Automation, Huaqiao University, Xiamen, Fujian 361021, China
  • Received:2021-11-01 Revised:2022-09-02 Online:2023-01-25 Published:2023-02-23
    • Supported by:
    • National Natural Science Foundation of China(61972168)

Abstract:

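This paper proposes a speech steganography algorithm based on dynamic search of fractional pitch delay. According to the required hiding capacity (x bits per subframe), the algorithm partitions the set of fractional pitch delay candidates into 2^x subsets, each representing a distinct x-bit message. During the closed-loop pitch search, it selects for each subframe the fractional pitch delay candidate that both represents the secret bits to be embedded and maximizes the interpolated normalized correlation coefficient, which effectively reduces the impact of the steganographic operation on the original carrier. Taking the adaptive multi-rate speech codec, which is widely used in current Voice-over-IP systems, as an example, the algorithm is evaluated in terms of hiding capacity, imperceptibility, and resistance to detection, and compared with related work. Experimental results show that, compared with existing pitch-delay-based steganographic algorithms, the proposed algorithm achieves better steganographic security (that is, better resistance to detection and better imperceptibility) while maintaining a relatively high hiding capacity.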

Key words: speech steganography, dynamic search, fractional pitch delay, adaptive multi-rate speech codec, steganographic security

Abstract:

In this paper, we present a speech steganography algorithm based on dynamic search of fractional pitch delay. According to the required covert capacity (x bits/subframe), the algorithm divides the candidate set of fractional pitch delays into 2^x subsets, each of which represents a distinct x-bit message. In the closed-loop pitch search, the algorithm selects for each subframe the fractional pitch delay candidate that both denotes the secret information to be embedded and maximizes the interpolated normalized correlation coefficient. In this way, the impact of steganographic operations on the original carrier is effectively reduced. Taking the adaptive multi-rate speech codec, which is widely used in current Voice-over-IP systems, as an example, we evaluate the presented algorithm in terms of covert capacity, imperceptibility, and resistance to detection, and compare it with related work. Experimental results show that the proposed algorithm achieves better steganographic security (better resistance to detection and better imperceptibility) than existing steganographic methods based on pitch delay, while maintaining a relatively high steganographic capacity.

Key words: speech steganography, dynamic search, fractional pitch delay, adaptive multi-rate speech codec, steganographic security
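
The embedding rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say how the candidate set is partitioned, so assigning a candidate to a subset by its index modulo 2^x is an assumption, as are the function names `subset_of`, `embed`, and `extract`, the toy one-third-sample candidate grid, and the stand-in correlation function `toy_corr`. In a real codec, the interpolated normalized correlation would be computed inside the closed-loop pitch search.

```python
from typing import Callable, Sequence


def subset_of(index: int, x: int) -> int:
    """Map a candidate index to one of 2**x subsets.

    Assumption: the partition rule is index modulo 2**x; the abstract
    does not specify the actual rule.
    """
    return index % (1 << x)


def embed(candidates: Sequence[float],
          corr: Callable[[float], float],
          bits: int,
          x: int) -> int:
    """Return the index of the candidate that lies in the subset
    labelled `bits` and has the largest correlation score."""
    if not 0 <= bits < (1 << x):
        raise ValueError("bits must fit in x bits")
    allowed = [i for i in range(len(candidates)) if subset_of(i, x) == bits]
    if not allowed:
        raise ValueError("no candidate available for these bits")
    # Dynamic search restricted to the subset that encodes the secret bits.
    return max(allowed, key=lambda i: corr(candidates[i]))


def extract(index: int, x: int) -> int:
    """Recover the embedded bits from the transmitted candidate index."""
    return subset_of(index, x)


# Toy example (hypothetical values): delays around 40 samples in 1/3 steps.
candidates = [40 + k / 3 for k in range(-3, 4)]


def toy_corr(delay: float) -> float:
    """Stand-in for the interpolated normalized correlation, peaked at 40.3."""
    return -abs(delay - 40.3)


chosen = embed(candidates, toy_corr, bits=1, x=1)
recovered = extract(chosen, x=1)
```

In this sketch the receiver needs only the transmitted candidate index and the shared value of x to recover the bits. The sender still picks the best-scoring candidate among those allowed, which is the source of the reduced distortion claimed in the abstract.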

CLC number: