An Intelligent Routing Technology Based on Deep Reinforcement Learning

doi:10.3969/j.issn.0372-2112.2020.11.011

ACTA ELECTRONICA SINICA, 2020, Vol. 48, Issue 11: 2170-2177.

SUN Peng-hao, LAN Ju-long, SHEN Juan, HU Yu-xiang

Abstract

As networks grow in scale and complexity, traditional routing algorithms struggle to combine low computational complexity with good performance when the spatial-temporal distribution of network traffic fluctuates widely. In recent years, with the development of Software-Defined Networking (SDN) and Artificial Intelligence (AI), AI-based methods for generating routing strategies automatically have been gaining attention. In this paper, we propose SmartPath, an intelligent network routing technology based on Deep Reinforcement Learning (DRL). By collecting network status dynamically, SmartPath uses DRL to generate routing policies automatically, so that the routing policy adapts to changes in network traffic. Experimental results show that the proposed scheme adjusts the routing strategy dynamically without relying on human experience in traffic analysis, and reduces the average end-to-end transmission delay by at least 10% compared with state-of-the-art schemes.
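
The observe-act-reward loop described in the abstract can be illustrated with a deliberately small sketch. This is not SmartPath: it replaces the deep network with a tabular epsilon-greedy learner and reduces the problem to a single decision per step (a contextual bandit). The two-path topology, the capacities, the traffic levels, the M/M/1 delay model, and the saturation penalty are all assumptions made for illustration only.

```python
import random

# Hypothetical toy network: one flow that can be split over two parallel paths.
CAPACITY = (10.0, 6.0)                  # assumed path capacities
SPLITS = [i / 10 for i in range(11)]    # candidate fractions of traffic sent on path 0


def avg_delay(demand, split):
    """Traffic-weighted M/M/1 delay for a split; fixed penalty if a path saturates."""
    loads = (demand * split, demand * (1.0 - split))
    total = 0.0
    for load, cap in zip(loads, CAPACITY):
        if load >= cap:
            return 100.0                # assumed penalty for an overloaded path
        total += load / (cap - load)    # load times per-packet delay 1/(cap - load)
    return total / demand


def train(episodes=5000, epsilon=0.1, lr=0.1, seed=0):
    """Tabular epsilon-greedy stand-in for the DRL agent (not a deep network)."""
    rng = random.Random(seed)
    demands = (4, 8, 12)                # assumed discrete traffic levels (the state)
    actions = range(len(SPLITS))
    q = {(d, a): 0.0 for d in demands for a in actions}
    for _ in range(episodes):
        d = rng.choice(demands)         # observe the current network state
        if rng.random() < epsilon:      # explore a random split
            a = rng.randrange(len(SPLITS))
        else:                           # exploit the best split seen so far
            a = max(actions, key=lambda i: q[(d, i)])
        reward = -avg_delay(d, SPLITS[a])        # lower delay means higher reward
        q[(d, a)] += lr * (reward - q[(d, a)])   # move the estimate toward the reward
    # Routing policy: the preferred split for each traffic level.
    return {d: SPLITS[max(actions, key=lambda i: q[(d, i)])] for d in demands}


if __name__ == "__main__":
    print(train())
```

The learned policy shifts traffic toward the lower-capacity path as demand grows, which is the kind of traffic-dependent adaptation the abstract refers to; a real system would use a neural network over a much larger state and action space.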

Key words

routing optimization / software-defined networking (SDN) / artificial intelligence (AI) / deep reinforcement learning (DRL)

Cite this article

SUN Peng-hao, LAN Ju-long, SHEN Juan, HU Yu-xiang. An Intelligent Routing Technology Based on Deep Reinforcement Learning[J]. Acta Electronica Sinica, 2020, 48(11): 2170-2177. https://doi.org/10.3969/j.issn.0372-2112.2020.11.011

Funding

National Natural Science Foundation of China (No.61521003, No.61702547, No.61872382); National Key Research and Development Program of China (No.2017YFB0803204); Key-Area Research and Development Program of Guangdong Province (No.2018B010113001)