Open Access

Adaptive cache policy optimization through deep reinforcement learning in dynamic cellular networks

Department of Information and Communications Engineering, Aalto University, Espoo 02150, Finland
Department of Electrical and Computer Engineering, University of California, Davis, CA 95616, USA

Abstract

We explore the use of caching both at the network edge and within User Equipment (UE) to alleviate the traffic load of wireless networks. We develop a joint cache placement and delivery policy that maximizes the Quality of Service (QoS) while simultaneously minimizing the backhaul load and UE power consumption, in the presence of an unknown, time-variant file popularity. Because file requests in a time slot are affected by download success in the previous slot, the caching system becomes a non-stationary Partially Observable Markov Decision Process (POMDP). We solve the problem in a deep reinforcement learning framework based on the Advantage Actor-Critic (A2C) algorithm, comparing a Feed-Forward Neural Network (FFNN) with a Long Short-Term Memory (LSTM) approach specifically designed to exploit the correlation of the file popularity distribution across time slots. Simulation results show that the LSTM-based A2C outperforms the FFNN-based A2C in sample efficiency and optimality, demonstrating superior performance on the non-stationary POMDP problem. For caching at the UEs, we provide a distributed algorithm that reaches the objectives dictated by the agent controlling the network, with minimal energy consumption at the UEs and minimal communication overhead.
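To illustrate the kind of recurrent actor-critic described above, the following is a minimal sketch (in PyTorch) of an LSTM-based A2C agent whose observation is a per-slot file request statistic and whose action is a per-file caching probability. The class names, dimensions, Bernoulli caching policy, and reward handling are illustrative assumptions; they do not reproduce the paper's system model or hyperparameters.

```python
# Minimal illustrative sketch of an LSTM-based A2C agent for cache placement.
# The network shape, Bernoulli caching policy, and reward handling are
# assumptions for illustration; they do not reproduce the paper's system model.
import torch
import torch.nn as nn

class LSTMActorCritic(nn.Module):
    """Shared LSTM encoder with separate actor and critic heads."""
    def __init__(self, obs_dim: int, n_files: int, hidden_dim: int = 64):
        super().__init__()
        self.encoder = nn.LSTM(obs_dim, hidden_dim, batch_first=True)
        self.actor = nn.Linear(hidden_dim, n_files)    # per-file caching logits
        self.critic = nn.Linear(hidden_dim, 1)         # state-value estimate

    def forward(self, obs_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim) sequence of per-slot request statistics;
        # the LSTM state carries information about earlier slots (POMDP memory).
        out, hidden = self.encoder(obs_seq, hidden)
        last = out[:, -1, :]                           # summary of the latest slot
        cache_probs = torch.sigmoid(self.actor(last))  # caching probability per file
        value = self.critic(last).squeeze(-1)
        return cache_probs, value, hidden

def a2c_loss(cache_probs, value, action, reward, next_value=0.0, gamma=0.99):
    """One-step advantage actor-critic loss for a Bernoulli caching policy."""
    dist = torch.distributions.Bernoulli(probs=cache_probs)
    log_prob = dist.log_prob(action).sum(-1)
    advantage = reward + gamma * next_value - value    # TD(0) advantage estimate
    actor_loss = -(log_prob * advantage.detach()).mean()
    critic_loss = advantage.pow(2).mean()
    entropy = dist.entropy().sum(-1).mean()            # encourages exploration
    return actor_loss + 0.5 * critic_loss - 0.01 * entropy

# Toy usage: 10 files, observation = request counts over the 5 most recent slots.
model = LSTMActorCritic(obs_dim=10, n_files=10)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
obs = torch.rand(1, 5, 10)                             # dummy popularity observations
probs, value, _ = model(obs)
action = torch.distributions.Bernoulli(probs=probs).sample()
loss = a2c_loss(probs, value, action, reward=torch.tensor([1.0]))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

In an actual training loop, the LSTM hidden state would be carried across slots within an episode and reset at episode boundaries, and the reward would combine QoS, backhaul-load, and UE power terms consistent with the objective described in the abstract.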

Intelligent and Converged Networks
Pages 81-99
Cite this article:
Srinivasan A, Amidzadeh M, Zhang J, et al. Adaptive cache policy optimization through deep reinforcement learning in dynamic cellular networks. Intelligent and Converged Networks, 2024, 5(2): 81-99. https://doi.org/10.23919/ICN.2024.0007

Received: 01 June 2023
Revised: 25 July 2023
Accepted: 07 November 2023
Published: 30 June 2024
© All articles included in the journal are copyrighted by the ITU and TUP.

This work is available under the CC BY-NC-ND 3.0 IGO license: https://creativecommons.org/licenses/by-nc-nd/3.0/igo/
