Information Recovery Technology for Cross-Modal Communications (面向跨模态通信的信息恢复技术)

XU Jian-bo, WEI Xin, ZHOU Liang (徐建博, 魏昕, 周亮)

Acta Electronica Sinica (电子学报), 2022, 50(7): 1631-1642. DOI: 10.12263/DZXB.20210945
Research Article


Highlights

Multi-modal data loss and pollution by wireless-channel noise during transmission seriously degrade the quality of cross-modal communications. To address this problem, an information recovery technology for cross-modal communications is proposed. By making full use of the data already available at the receiving end, lost or corrupted information is recovered at the receiver through one-to-one intra-modal retrieval, one-to-one cross-modal retrieval, and one-to-many cross-modal retrieval. The proposed scheme is validated on a public data set and on a practical cross-modal communication platform. Experimental results show that it achieves accurate multi-modal information recovery and effectively improves the quality of cross-modal communications.
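To make the retrieval-based recovery idea concrete, the sketch below shows one way a receiver could fall back on locally cached multi-modal pairs when part of a transmission is lost or corrupted: the feature of the surviving (or noisy) modality is matched against the cache, and the paired data of the missing modality are returned. The shared 64-dimensional feature space, the recover() helper, and the cached file names are illustrative assumptions, not the model used in the paper.

# Illustrative sketch only: assumes every cached item already carries feature
# embeddings for each modality in a common (shared) space, so that cosine
# similarity is meaningful both within and across modalities. The embedding
# networks themselves are omitted.
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two 1-D feature vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def recover(query_feat, cache, dst, top_k=1):
    # Recover data of modality `dst` from a query feature vector:
    #   query from the same modality, top_k=1 -> intra-modal one-to-one retrieval
    #   query from another modality, top_k=1  -> cross-modal one-to-one retrieval
    #   query from another modality, top_k>1  -> cross-modal one-to-many retrieval
    ranked = sorted(cache,
                    key=lambda item: cosine_sim(query_feat, item[dst + "_feat"]),
                    reverse=True)
    return [item[dst] for item in ranked[:top_k]]

# Example: the audio part of an audio-haptic pair arrived intact but the haptic
# signal was lost; reuse the haptic data of the closest cached entries.
rng = np.random.default_rng(0)
cache = [{"audio_feat": rng.normal(size=64), "haptic_feat": rng.normal(size=64),
          "audio": "audio_%d.wav" % i, "haptic": "haptic_%d.bin" % i}
         for i in range(100)]
received_audio_feat = rng.normal(size=64)
print(recover(received_audio_feat, cache, dst="haptic", top_k=3))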

Cite this article

XU Jian-bo, WEI Xin, ZHOU Liang. Information Recovery Technology for Cross-Modal Communications[J]. Acta Electronica Sinica, 2022, 50(7): 1631-1642. https://doi.org/10.12263/DZXB.20210945
CLC number: TP302


Funding

National Natural Science Foundation of China (62071254)
Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD)
Open Research Fund of the Key Laboratory of Broadband Wireless Communication and Sensor Network Technology (Nanjing University of Posts and Telecommunications), Ministry of Education (JZNY202111)