• Research Paper •

### A Data Recovery Algorithm for Large-Scale Network Measurement: Tensor Completion Based on Association Learning

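1. College of Information Science and Engineering, Hunan University, Changsha, Hunan 410006, China
2. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100089, China
3. University of Chinese Academy of Sciences, Beijing 100089, China
4. Hunan cnSunet Information Technology Co., Ltd, Changsha, Hunan 410006, China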
• Received: 2021-12-25 Revised: 2022-04-05 Published: 2022-07-25
• Corresponding author:
• XIE Kun
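• About the authors:
• OUYANG Yu-dian, female, born in May 1996 in Hengyang, Hunan Province. Ph.D. candidate at the College of Information Science and Engineering, Hunan University. Her main research interests are network measurement, tensor completion, and deep learning. E-mail: yudian@hnu.edu.cn
XIE Kun (corresponding author), female, born in October 1978 in Huaihua, Hunan Province. Ph.D. Professor and doctoral supervisor at the College of Information Science and Engineering, Hunan University. Her main research interests are computer networks, network measurement, network security, big data, and artificial intelligence. E-mail: xiekun@hnu.edu.cn
XIE Gao-gang, male, born in May 1974 in Quzhou, Zhejiang Province. Ph.D. Researcher at the Computer Network Information Center, Chinese Academy of Sciences; professor and doctoral supervisor at the University of Chinese Academy of Sciences. His research focuses on computer network architecture and systems. E-mail: xie@cnic.cn
WEN Ji-gang, male, born in March 1978 in Changde, Hunan Province. Ph.D. Former postdoctoral researcher at the Institute of Computing Technology, Chinese Academy of Sciences. He is currently chief technologist at Hunan cnSunet Information Technology Co., Ltd and an external supervisor at Hunan University. His work focuses on research and development in high-speed network measurement and management. E-mail: wenjigang@gmail.co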
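• Supported by:
• National Science Fund for Distinguished Young Scholars of the National Natural Science Foundation of China (62025201); National Natural Science Foundation of China (61972144)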

### A Data Recovery Algorithm for Large-Scale Network Measurements: Association Learning Based Tensor Completion

OUYANG Yu-dian1, XIE Kun1, XIE Gao-gang2,3, WEN Ji-gang1,4

1. College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan 410006, China
2. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100089, China
3. The University of Chinese Academy of Sciences, Beijing 100089, China
4. Hunan cnSunet Information Technology Co., Ltd, Changsha, Hunan 410006, China
• Received: 2021-12-25 Revised: 2022-04-05 Online: 2022-07-25 Published: 2022-07-30
• Corresponding author:
• XIE Kun
• Supported by:
• National Science Fund for Distinguished Young Scholars of the National Natural Science Foundation of China (62025201); National Natural Science Foundation of China (61972144)

Abstract:

Network applications such as network state tracking, service level agreement guarantees, and network fault localization rely on complete and accurate throughput measurement data. Because measurement is costly, network monitoring systems can rarely obtain network-wide throughput measurements. Sparse network measurement techniques reduce this cost by sampling: they measure only part of the network and recover the missing data by exploiting spatio-temporal correlations within the data, using algorithms such as tensor completion. However, existing studies consider each performance metric in isolation and ignore the correlations between metrics, which limits recovery accuracy and keeps the overall measurement cost high. This paper proposes association learning based tensor completion (ALTC), a data recovery algorithm for large-scale network measurements. To capture the complex correlations among network performance metrics, an association learning model is designed that infers throughput, which is expensive to measure, from round-trip delay, which is cheap to measure, thereby reducing the network measurement cost. Building on this model, a tensor completion model is designed to learn both the spatio-temporal correlations within the throughput measurements and the auxiliary correlation information from the round-trip delay, yielding network-wide throughput data with higher recovery accuracy. Experiments show that, at the same throughput measurement cost, the recovery error of the proposed algorithm is 13% lower than that of current mainstream methods.
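
The general idea of completing a sparsely measured throughput tensor with help from a cheaper metric can be illustrated with a minimal sketch. This is not the ALTC algorithm, whose models are not specified in the abstract. It is a generic weighted CP (CANDECOMP/PARAFAC) factorization fitted by alternating least squares, under the following assumptions: the tensor is indexed as source × destination × time, `mask` marks the measured throughput entries, and `prior` is a hypothetical throughput estimate that some association model has already inferred from round-trip delay. The function names, the weight `lam`, and the rank are all illustrative choices.

```python
import numpy as np


def cp_reconstruct(A, B, C):
    """Rebuild a 3-way tensor from CP factor matrices (each dim x rank)."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)


def _update_factor(T, W, F1, F2, ridge):
    """Weighted least-squares update of the mode-0 factor, one row at a time."""
    rank = F1.shape[1]
    # Row (j, k) of Z holds the elementwise product F1[j] * F2[k].
    Z = np.einsum("jr,kr->jkr", F1, F2).reshape(-1, rank)
    out = np.empty((T.shape[0], rank))
    for i in range(T.shape[0]):
        w = W[i].reshape(-1)
        t = T[i].reshape(-1)
        G = Z.T @ (Z * w[:, None]) + ridge * np.eye(rank)
        out[i] = np.linalg.solve(G, Z.T @ (w * t))
    return out


def complete_with_prior(X, mask, prior, rank=2, lam=0.1, iters=50,
                        ridge=1e-6, seed=0):
    """Fill in a tensor from measured entries plus a down-weighted prior.

    X     : throughput tensor; values outside `mask` are ignored
    mask  : boolean tensor, True where throughput was actually measured
    prior : hypothetical throughput estimate (e.g. inferred from RTT)
    lam   : weight of the prior on unmeasured entries (assumed value)
    """
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A, B, C = (rng.standard_normal((n, rank)) for n in (I, J, K))
    # Measured entries: weight 1, target X. Others: weight lam, target prior.
    W = np.where(mask, 1.0, lam)
    T = np.where(mask, X, prior)
    for _ in range(iters):
        A = _update_factor(T, W, B, C, ridge)
        B = _update_factor(T.transpose(1, 0, 2), W.transpose(1, 0, 2),
                           A, C, ridge)
        C = _update_factor(T.transpose(2, 0, 1), W.transpose(2, 0, 1),
                           A, B, ridge)
    return cp_reconstruct(A, B, C)


# Synthetic usage: a rank-2 "throughput" tensor with half the entries measured.
rng = np.random.default_rng(1)
factors = [rng.uniform(0.5, 1.5, size=(n, 2)) for n in (6, 6, 8)]
X_true = cp_reconstruct(*factors)
mask = rng.random(X_true.shape) < 0.5
prior = X_true + 0.2 * rng.standard_normal(X_true.shape)  # noisy side estimate
X_hat = complete_with_prior(np.where(mask, X_true, 0.0), mask, prior)
```

In this sketch the low-rank factors stand in for the spatio-temporal correlation, and the down-weighted `prior` stands in for the auxiliary information from round-trip delay. Setting `lam` to 0 would leave only the small ridge term constraining unmeasured entries, which approximates plain tensor completion on the measured entries.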