Deep Reinforcement Learning Methods for Intelligent Communications
Abstract: In today's era of information explosion, the proliferation of wireless terminals has caused the scale of wireless networks to grow dramatically. At the same time, users' ever-increasing communication demands require these networks to make full use of limited resources through precise, on-demand service. Together, these two trends mean that the traditional approach of manually modeling the network and solving the resulting optimization problem will hit a bottleneck in the future. Fortunately, the emergence of artificial intelligence and machine learning offers a new way to address this problem. As a data-driven machine learning technique, deep reinforcement learning can directly learn the dynamics of an environment and derive optimal decisions from them. It therefore gives a network the ability to optimize and manage itself according to its own environment, making intelligent communication possible. This paper surveys applications of deep reinforcement learning in wireless communications from three aspects, namely resource management, access control, and network maintenance, to show that deep reinforcement learning is an effective path toward intelligent communications.
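To make the learn-by-interaction principle concrete, the sketch below is a minimal, hypothetical Python example, not taken from this paper: a tabular Q-learning agent that learns to pick the less congested of two channels purely from transmission feedback, without any prior model of channel statistics. Deep reinforcement learning replaces the Q-table with a deep neural network so that the same update rule scales to large state spaces such as spectrum occupancy histories; every environment parameter here is an assumption made for illustration only.

```python
import random
from collections import defaultdict

# Toy environment (illustrative assumption): two channels, where channel 1
# is busy 80% of the time and channel 0 only 20%. The agent does not know
# these statistics and must discover them from ACK/NACK-style feedback.
CHANNEL_BUSY_PROB = [0.2, 0.8]

def transmit(action: int) -> float:
    """Transmit on the chosen channel; reward 1.0 on success, 0.0 on collision."""
    busy = random.random() < CHANNEL_BUSY_PROB[action]
    return 0.0 if busy else 1.0

# Q-learning update: Q(a) <- Q(a) + lr * (r + gamma * max_a' Q(a') - Q(a)).
# A deep Q-network keeps this rule but approximates Q with a neural network.
q = defaultdict(float)
lr, gamma, epsilon = 0.1, 0.9, 0.1

for _ in range(5000):
    # Epsilon-greedy exploration: mostly exploit the best-known channel,
    # occasionally probe the other one.
    if random.random() < epsilon:
        action = random.randrange(2)
    else:
        action = max(range(2), key=lambda a: q[a])
    reward = transmit(action)
    best_next = max(q[a] for a in range(2))
    q[action] += lr * (reward + gamma * best_next - q[action])

print({a: round(q[a], 2) for a in range(2)})  # channel 0 should score higher
```

After a few thousand interactions the learned values favor channel 0, mirroring in miniature how a deep reinforcement learning agent discovers the regularities of a network environment directly from data rather than from a hand-built model.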