Financial statement fraud refers to the malicious manipulation of financial data in listed companies' annual statements. Traditional machine learning approaches focus on individual companies and overlook the interactive relationships among companies that are crucial for identifying fraud patterns. Moreover, fraud detection is a typical imbalanced binary classification task in which normal samples outnumber fraudulent ones. In this paper, we propose a multi-relational graph convolutional network, named FraudGCN, for detecting financial statement fraud. A multi-relational graph is constructed to integrate industrial, supply chain, and accounting-sharing relationships, capturing the multidimensional and complex interactions among companies. We then develop a multi-relational graph convolutional network that aggregates information within each relationship and employs an attention mechanism to fuse information across relationships. The attention mechanism enables the model to distinguish the importance of different relationships and thereby aggregate more useful information from the key ones. To alleviate the class imbalance problem, we present a diffusion-based under-sampling strategy that selects key nodes globally for model training. We also employ focal loss to assign greater weights to harder-to-classify minority samples. We build a real-world dataset from the annual financial statements of listed companies in China. The experimental results show that FraudGCN achieves improvements of 3.15% in Macro-recall, 3.36% in Macro-F1, and 3.86% in GMean over the second-best method. The dataset and code are publicly available at: https://github.com/XNetLab/MRG-for-Finance.
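To make the role of focal loss concrete, the following is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t), written in plain Python. It is not taken from the FraudGCN code base: the function name `binary_focal_loss` and the default values `gamma=2.0` and `alpha=0.75` are assumptions chosen for illustration, and the paper's actual settings may differ.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.75, eps=1e-12):
    """Mean binary focal loss over a batch (illustrative sketch).

    probs  : predicted probabilities of the positive (fraud) class, in [0, 1]
    labels : ground-truth labels, 1 for fraud and 0 for normal
    gamma  : focusing parameter; larger values down-weight easy samples more
    alpha  : weight given to the positive class (assumed value, not from the paper)
    """
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length")
    if not probs:
        raise ValueError("at least one sample is required")

    total = 0.0
    for p, y in zip(probs, labels):
        # p_t is the probability the model assigns to the true class.
        p_t = p if y == 1 else 1.0 - p
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # Clamp to avoid log(0) for confidently wrong predictions.
        p_t = min(max(p_t, eps), 1.0)
        # (1 - p_t)^gamma shrinks the loss of well-classified samples.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


if __name__ == "__main__":
    # A fraud sample predicted with low confidence (hard) versus high confidence (easy).
    hard = binary_focal_loss([0.1], [1])
    easy = binary_focal_loss([0.9], [1])
    print(f"hard sample loss: {hard:.4f}, easy sample loss: {easy:.6f}")
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary binary cross-entropy, which shows that the focusing term is what shifts the training signal toward hard samples.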
The articles published in this open access journal are distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).