Non-orthogonal multiple access (NOMA) is a promising technique for improving spectrum efficiency in wireless networks. In this paper, we consider an uplink NOMA cognitive system in which the secondary users (SUs) jointly transmit data to the cognitive base station (CBS) over the same spectrum resources. Successive interference cancellation (SIC) is then applied at the CBS to retrieve the signals transmitted by the SUs. We also take into account the energy constraints of wireless networks: the SUs are assumed to be powered by wireless energy harvesters to prolong their operation, while the CBS is equipped with a conventional electrical supply. We propose an actor–critic reinforcement learning approach to maximize the long-term throughput of the cognitive network. In particular, by interacting with and learning directly from the environment over several time slots, the CBS can optimally assign the amount of transmission energy to each SU according to the remaining energy of the SUs and the availability of the primary channel. Simulation results verify that the proposed scheme outperforms conventional approaches (such as Myopic NOMA and OMA, in which the system reward is maximized only in the current time slot) in terms of overall throughput and energy efficiency.
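As a rough illustration of the SIC step described above, the following Python sketch computes per-user rates for an uplink NOMA transmission. It is not the paper's system model: the function name `uplink_noma_sic_rates`, the flat channel gains, and the strongest-first decoding order are all assumptions made for this example.

```python
import math


def uplink_noma_sic_rates(powers, gains, noise_power):
    """Per-user rates (bits/s/Hz) under SIC at the receiver.

    Hypothetical helper: assumes flat channel gains and that the receiver
    decodes users in order of decreasing received power, treating the
    not-yet-decoded users as interference.
    """
    received = [p * g for p, g in zip(powers, gains)]
    # Decode the strongest received signal first (assumed ordering).
    order = sorted(range(len(received)), key=lambda i: received[i], reverse=True)
    rates = [0.0] * len(received)
    for k, i in enumerate(order):
        # Interference comes only from users that have not been decoded yet.
        interference = sum(received[j] for j in order[k + 1:])
        rates[i] = math.log2(1.0 + received[i] / (interference + noise_power))
    return rates


if __name__ == "__main__":
    rates = uplink_noma_sic_rates([1.0, 0.5], [1.0, 1.0], 0.5)
    print(rates, sum(rates))
```

Under these assumptions the sum of the per-user rates equals log2(1 + total received power / noise power), so the choice of transmission energy per SU determines both the individual rates and the total throughput that a learning agent at the CBS would try to maximize.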
Can Tho University Journal of Science (Tạp chí khoa học Trường Đại học Cần Thơ)