journal article Open Access Jun 14, 2024

RayNet: A Simulation Platform for Developing Reinforcement Learning-Driven Network Protocols

Abstract
Reinforcement Learning (RL) has gained significant momentum in the development of network protocols. However, RL-based protocols are still in their infancy, and substantial research is required to build deployable solutions. Developing a protocol based on RL is a complex and challenging process that involves several model design decisions and requires significant training and evaluation in real and simulated network topologies. Network simulators offer an efficient training environment for RL-based protocols because they are deterministic and can run in parallel. In this article, we introduce RayNet, a scalable and adaptable simulation platform for the development of RL-based network protocols. RayNet integrates OMNeT++, a fully programmable network simulator, with Ray/RLlib, a scalable training platform for distributed RL. RayNet facilitates the methodical development of RL-based network protocols so that researchers can focus on the problem at hand and not on implementation details of the learning aspect of their research. We developed a simple RL-based congestion control approach as a proof of concept showcasing that RayNet can be a valuable platform for RL-based research in computer networks, enabling scalable training and evaluation. We compared RayNet with ns3-gym, a platform with similar objectives to RayNet, and showed that RayNet performs better in terms of how fast agents can collect experience in RL environments.
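To make the idea of an agent collecting experience from a deterministic simulator concrete, the following is a minimal sketch in plain Python. It is not RayNet's API and does not use OMNeT++ or Ray/RLlib: the class `ToyBottleneckEnv`, the helper `collect`, the three-action window adjustment, and the reward shape are all hypothetical, chosen only to illustrate a Gym-style `reset`/`step` loop for congestion control and a steps-per-second measurement of the kind the comparison refers to.

```python
import random
import time


class ToyBottleneckEnv:
    """Hypothetical Gym-style environment: one sender, one bottleneck link."""

    def __init__(self, capacity_pkts=50, buffer_pkts=100, horizon=200):
        self.capacity = capacity_pkts  # packets the link drains per step
        self.buffer = buffer_pkts      # maximum queue length in packets
        self.horizon = horizon         # steps per episode
        self.reset()

    def reset(self):
        self.cwnd = 10   # congestion window: packets sent per step
        self.queue = 0   # packets waiting at the bottleneck
        self.t = 0
        return self._obs()

    def _obs(self):
        # Observation: window relative to capacity, and queue occupancy.
        return (self.cwnd / self.capacity, self.queue / self.buffer)

    def step(self, action):
        # Assumed action set: 0 = halve cwnd, 1 = keep, 2 = add one packet.
        if action == 0:
            self.cwnd = max(1, self.cwnd // 2)
        elif action == 2:
            self.cwnd += 1
        arrived = self.queue + self.cwnd
        delivered = min(arrived, self.capacity)
        backlog = arrived - delivered
        dropped = max(0, backlog - self.buffer)
        self.queue = backlog - dropped
        # Assumed reward: utilisation minus queueing and loss penalties.
        reward = (delivered / self.capacity
                  - 0.5 * self.queue / self.buffer
                  - dropped / self.capacity)
        self.t += 1
        done = self.t >= self.horizon
        return self._obs(), reward, done, {"dropped": dropped}


def collect(env, policy, num_steps):
    """Run `policy` for num_steps; return transitions and steps per second."""
    transitions = []
    obs = env.reset()
    start = time.perf_counter()
    for _ in range(num_steps):
        action = policy(obs)
        next_obs, reward, done, _ = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = env.reset() if done else next_obs
    elapsed = time.perf_counter() - start
    return transitions, num_steps / max(elapsed, 1e-9)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so the random policy is repeatable
    batch, rate = collect(ToyBottleneckEnv(), lambda obs: rng.randrange(3), 10_000)
    print(f"collected {len(batch)} transitions at {rate:.0f} steps/s")
```

Because the toy environment has no hidden randomness, two runs with the same policy produce identical transitions, which is the property that makes simulators convenient for training. In a distributed setup, several such environment instances would run in parallel workers, each feeding transitions to a shared learner.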


Details
Published: Jun 14, 2024
Vol/Issue: 34(3)
Pages: 1-25
Cite This Article
Luca Giacomoni, Basil Benny, George Parisis (2024). RayNet: A Simulation Platform for Developing Reinforcement Learning-Driven Network Protocols. ACM Transactions on Modeling and Computer Simulation, 34(3), 1-25. https://doi.org/10.1145/3653975