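1. School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China
2. Key Laboratory of Broadband Wireless Communication and Sensor Network Technology, Ministry of Education, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003, China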
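XU Jian-bo, male, born in December 1996 in Gaoyou, Jiangsu. Master's student at the School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications. His main research interest is multimedia communications. E-mail: xujianbo8881996@163.com
WEI Xin, male, born in January 1983 in Nanjing, Jiangsu. Ph.D., professor and master's supervisor at Nanjing University of Posts and Telecommunications. His main research interest is multimedia communications. E-mail: xwei@njupt.edu.cn
ZHOU Liang (corresponding author), male, born in November 1981 in Wuhu, Anhui. Ph.D., professor and doctoral supervisor at Nanjing University of Posts and Telecommunications. His main research interest is multimedia communications. E-mail: liang.zhou@njupt.edu.cn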
Received: 2021-07-19
Revised: 2021-09-28
Published in print: 2022-07-25
XU Jian-bo, WEI Xin, ZHOU Liang. Information Recovery Technology for Cross-Modal Communications[J]. Acta Electronica Sinica, 2022, 50(07): 1631-1642. DOI: 10.12263/DZXB.20210945.
To address the loss of multi-modal data during transmission and its corruption by wireless channel noise, both of which seriously degrade cross-modal communication quality, an information recovery technology for cross-modal communications is proposed. The scheme makes full use of the data already available at the receiving end and recovers the information there through retrieval modes such as one-to-one intra-modal retrieval, one-to-one cross-modal retrieval, and one-to-many cross-modal retrieval. The proposed scheme is validated on a public data set and on a practical cross-modal communication platform. Experimental results show that it achieves accurate multi-modal information recovery and effectively improves the quality of cross-modal communications.
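The three retrieval modes named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that every modality has already been mapped into one shared embedding space, that similarity is measured by cosine similarity, and that "one-to-many" means one received modality is used to retrieve items of several other modalities. The function names (`cosine`, `recover`), the modality labels, the item ids, and the toy vectors are all hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recover(query, store, targets):
    """For each target modality, return the id of the stored item whose
    embedding is most similar to the query embedding.

    store: {modality: {item_id: embedding}}. All embeddings are assumed to
    live in one shared space (an assumption, not stated in the abstract).
    """
    return {
        m: max(store[m], key=lambda item_id: cosine(query, store[m][item_id]))
        for m in targets
    }


# Toy receiver-side store with hypothetical 3-D embeddings.
store = {
    "image": {"img_wood": [0.9, 0.1, 0.0], "img_metal": [0.0, 0.2, 0.9]},
    "audio": {"aud_wood": [0.8, 0.3, 0.1], "aud_metal": [0.1, 0.1, 0.8]},
    "haptic": {"hap_wood": [1.0, 0.0, 0.1], "hap_metal": [0.1, 0.0, 1.0]},
}

noisy_image = [0.7, 0.2, 0.1]  # image embedding corrupted by channel noise
haptic_rx = [0.2, 0.1, 0.9]    # haptic embedding that arrived intact

# Intra-modal, one-to-one: replace a noisy image with its closest stored image.
intra = recover(noisy_image, store, ["image"])
# Cross-modal, one-to-one: use the haptic signal to recover a lost audio item.
cross = recover(haptic_rx, store, ["audio"])
# Cross-modal, one-to-many: use the haptic signal to recover image and audio.
multi = recover(haptic_rx, store, ["image", "audio"])
```

In this sketch the only difference between the three modes is which modalities are passed as `targets`; a real system would also need the feature extractors that produce comparable embeddings, which are outside the scope of this example.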