Reinforcement Learning With Selective Exploration for Interference Management in mmWave Networks

Dinh-van, Son and Nguyen, van-Linh and Cebecioglu, Berna Bulut and Masaracchia, Antonino and Higgins, Matthew D. (2025) Reinforcement Learning With Selective Exploration for Interference Management in mmWave Networks. IEEE Transactions on Machine Learning in Communications and Networking, 3. pp. 280-295. ISSN 2831-316X

Reinforcement_Learning_With_Selective_Exploration_for_Interference_Management_in_mmWave_Networks.pdf - Published Version
Available under License Creative Commons Attribution.


Abstract

The next generation of wireless systems will leverage the millimeter-wave (mmWave) bands to meet the increasing traffic volume and high data rate requirements of emerging applications (e.g., ultra HD streaming, metaverse, and holographic telepresence). In this paper, we address the joint optimization of beamforming, power control, and interference management in multi-cell mmWave networks. We propose novel reinforcement learning algorithms, including a single-agent-based method (BPC-SA) for centralized settings and a multi-agent-based method (BPC-MA) for distributed settings. To tackle the high-variance rewards caused by narrow antenna beamwidths, we introduce a selective exploration method to guide the agent towards more intelligent exploration. Our proposed algorithms are well-suited for scenarios where beamforming vectors require control in either a discrete domain, such as a codebook, or in a continuous domain. Furthermore, they do not require channel state information, extensive feedback from user equipments, or any searching methods, thus reducing overhead and enhancing scalability. Numerical results demonstrate that selective exploration improves per-user spectral efficiency by up to 22.5% compared to scenarios without it. Additionally, our algorithms significantly outperform existing methods by 50% in terms of per-user spectral efficiency and achieve 90% of the per-user spectral efficiency of the exhaustive search approach while requiring only 0.1% of its computational runtime.

Item Type: Article
Identification Number: 10.1109/TMLCN.2025.3537967
Dates: Accepted 3 February 2025; Published Online 3 February 2025
Uncontrolled Keywords: Beam training, deep reinforcement learning, interference management, mmWave, multi agent, power control, selective exploration
Subjects: CAH11 - computing > CAH11-01 - computing > CAH11-01-01 - computer science
Divisions: Faculty of Computing, Engineering and the Built Environment > College of Computing
Depositing User: Gemma Tonks
Date Deposited: 08 Jul 2025 10:30
Last Modified: 08 Jul 2025 10:30
URI: https://www.open-access.bcu.ac.uk/id/eprint/16493
