Original Article | Open Access

A novel ensemble ARIMA‐LSTM approach for evaluating COVID‐19 cases and future outbreak preparedness

Somit Jain, Shobhit Agrawal, Eshaan Mohapatra, Kathiravan Srinivasan
School of Computer Science and Engineering, Vellore Institute of Technology, Vellore, India

Somit Jain, Shobhit Agrawal, and Eshaan Mohapatra contributed equally to this study.


Graphical Abstract

A novel ensemble model that combines auto‐regressive integrated moving average (ARIMA) and long short‐term memory (LSTM) models is proposed to improve the forecasting accuracy of COVID‐19 cases. The proposed ensemble model achieves superior performance compared to individual ARIMA, LSTM, and other baseline forecasting models. This approach has the potential to enhance preparedness for future outbreaks by providing more accurate forecasts of COVID‐19 cases.

Abstract

Background

The highly contagious COVID‐19 virus has created unprecedented challenges, significantly affecting public health and economies worldwide. This research article conducts a time series analysis of COVID‐19 data for several countries, including India, Brazil, Russia, and the United States, with a particular emphasis on total confirmed cases.

Methods

The proposed approach combines the ability of the auto‐regressive integrated moving average (ARIMA) model to capture linear trends and seasonality with long short‐term memory (LSTM) networks, which are designed to learn complex nonlinear dependencies in the data. This hybrid approach surpasses both individual models and existing ARIMA‐artificial neural network (ANN) hybrids, which often struggle with highly nonlinear time series such as COVID‐19 data. By integrating ARIMA and LSTM, the model aims to achieve superior forecasting accuracy compared to baseline models, including ARIMA, Gated Recurrent Unit (GRU), LSTM, and Prophet.
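
The abstract does not state how the ARIMA and LSTM outputs are combined. As an illustration only, one widely used scheme for linear/neural hybrids is the additive residual decomposition sketched below; it is an assumption here, not a description of the authors' method. The symbols are introduced for this sketch: y_t is the observed series, L_t and N_t are its linear and nonlinear components, e_t is the residual of the linear (ARIMA) fit, f is the nonlinear model (an LSTM in this setting), and n is the number of lagged residuals it sees.

```latex
\begin{align}
y_t       &= L_t + N_t \\
e_t       &= y_t - \hat{L}_t \\
\hat{N}_t &= f(e_{t-1}, e_{t-2}, \dots, e_{t-n}) \\
\hat{y}_t &= \hat{L}_t + \hat{N}_t
\end{align}
```

Under this assumption, the linear model is fitted first, the nonlinear model is trained on its residuals, and the final forecast is the sum of the two predictions.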

Results

The hybrid ARIMA‐LSTM model outperformed the benchmark models, achieving a mean absolute percentage error (MAPE) score of 2.4%. Among the benchmark models, GRU performed the best with a MAPE score of 2.9%, followed by LSTM with a score of 3.6%.

Conclusions

The proposed ARIMA‐LSTM hybrid model outperforms ARIMA, GRU, LSTM, Prophet, and the ARIMA‐ANN hybrid model across all countries analyzed when evaluated using MAPE, symmetric mean absolute percentage error, and median absolute percentage error. These findings could significantly improve preparedness and response efforts by public health authorities, allowing for more efficient resource allocation and targeted interventions.
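
For readers unfamiliar with the three error metrics named above, the following is a minimal sketch of how they are commonly computed. The function names and sample numbers are hypothetical, and the symmetric variant uses one common convention (the denominator is the mean of the absolute actual and predicted values); the abstract does not say which convention the authors adopt.

```python
from statistics import mean, median


def absolute_percentage_errors(actual, predicted):
    """Per-point |actual - predicted| / |actual|, expressed in percent.

    Assumes no actual value is zero (true for cumulative case counts
    once the first case has been recorded).
    """
    return [abs(a - p) / abs(a) * 100.0 for a, p in zip(actual, predicted)]


def mape(actual, predicted):
    """Mean absolute percentage error."""
    return mean(absolute_percentage_errors(actual, predicted))


def mdape(actual, predicted):
    """Median absolute percentage error (less sensitive to outliers)."""
    return median(absolute_percentage_errors(actual, predicted))


def smape(actual, predicted):
    """Symmetric MAPE; denominator is the mean of |actual| and |predicted|.

    Assumes actual and predicted are never both zero at the same point.
    """
    return mean(
        abs(a - p) / ((abs(a) + abs(p)) / 2.0) * 100.0
        for a, p in zip(actual, predicted)
    )


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    actual = [100, 200, 400]
    predicted = [110, 190, 400]
    print(mape(actual, predicted))
    print(mdape(actual, predicted))
    print(smape(actual, predicted))
```

Reporting the median alongside the mean is useful because a few badly missed days can inflate MAPE while leaving the median error largely unchanged.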

Health Care Science
Pages 409-425
Cite this article:
Jain S, Agrawal S, Mohapatra E, et al. A novel ensemble ARIMA‐LSTM approach for evaluating COVID‐19 cases and future outbreak preparedness. Health Care Science, 2024, 3(6): 409-425. https://doi.org/10.1002/hcs2.123