Predicting financial market trends is difficult because markets are complex, dynamic, and often chaotic, and the difficulty grows when the data contain a large number of features. In this research, we propose a data mining approach that combines the Least Absolute Shrinkage and Selection Operator (LASSO) and Principal Component Analysis (PCA) in a two-stage feature dimensionality reduction that yields a refined dataset. To improve the model’s capacity to capture medium- and long-term index trends, we apply a sliding time window over the preceding 60 trading days. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models are then used to predict the return rate of Securities Bank of China (399986) over a 30-day trading period. We compare the proposed model with established methods, namely LASSO, PCA, and Hybrid Recurrent Neural Networks (RNN). The empirical results show that the proposed model achieves significantly higher prediction accuracy than these baselines when forecasting the 30-day return rate of Securities Bank of China (399986), while also remaining more stable. These results indicate that the approach is effective for financial market time series forecasting and support further research and practical applications in this domain.
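The preprocessing described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `reduce_features` and `make_windows`, the LASSO penalty `alpha`, the retained-variance threshold, and the synthetic data are all assumptions chosen for illustration, and scikit-learn is assumed as the library. The sketch runs LASSO to discard features with zero coefficients, applies PCA to the surviving features, and then cuts the result into 60-day sliding windows.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler


def reduce_features(X, y, alpha=0.05, variance_kept=0.95):
    """Two-stage reduction: LASSO selection followed by PCA compression.

    alpha and variance_kept are illustrative values, not the paper's settings.
    """
    X_std = StandardScaler().fit_transform(X)
    # Stage 1: keep only the features whose LASSO coefficient is non-zero.
    lasso = Lasso(alpha=alpha, max_iter=10_000).fit(X_std, y)
    selected = np.flatnonzero(lasso.coef_)
    if selected.size == 0:
        raise ValueError("alpha is too large: LASSO removed every feature")
    # Stage 2: project the selected features onto the principal components
    # that together explain at least `variance_kept` of the variance.
    pca = PCA(n_components=variance_kept, svd_solver="full")
    components = pca.fit_transform(X_std[:, selected])
    return components, selected


def make_windows(features, target, window=60):
    """Pair each run of `window` consecutive rows with the target at its last row."""
    n_windows = len(features) - window + 1
    X_seq = np.stack([features[i:i + window] for i in range(n_windows)])
    y_seq = target[window - 1:]
    return X_seq, y_seq


# Synthetic stand-in for the market data: 400 trading days, 20 raw features.
# Here y plays the role of the 30-day return rate known for each day.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = 0.5 * X[:, 0] - 0.3 * X[:, 3] + 0.1 * rng.normal(size=400)

components, selected = reduce_features(X, y)
X_seq, y_seq = make_windows(components, y, window=60)
print(selected, components.shape, X_seq.shape, y_seq.shape)
```

The windowed array has shape (samples, 60, components), which is the layout recurrent layers such as LSTM and GRU commonly expect. In a real experiment the scaler, LASSO, and PCA would be fitted on the training split only, so that no information from the test period leaks into the reduced features.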