Open Access

Analysis and Classification of Fake News Using Sequential Pattern Mining

M. Zohaib Nawaz¹,³, M. Saqib Nawaz¹, Philippe Fournier-Viger¹, Yulin He²

¹ College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China

² Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen 518107, China

³ Department of Computer Science, Faculty of Computing and Information Technology, University of Sargodha, Sargodha 40100, Pakistan


Abstract

Disinformation, often called fake news, is a major problem that has recently received considerable attention, and many researchers have proposed methods for detecting and addressing it. Current machine learning and deep learning methods for fake news classification and detection are content-based, network (propagation) based, or multimodal, the last combining textual and visual information. We introduce a framework, called FNACSPM, that uses sequential pattern mining (SPM) for fake news analysis and classification. In this framework, six publicly available datasets containing a diverse range of fake and real news, together with their combination, are first transformed into a suitable format. SPM algorithms are then applied to the transformed datasets to extract frequent patterns (and rules) of words, phrases, or linguistic features. The obtained patterns capture distinctive characteristics of fake or real news content and provide insight into the underlying structures and commonalities of misinformation. The discovered frequent patterns are subsequently used as features for fake news classification. The framework is evaluated with eight classifiers, whose performance is assessed with various metrics. Extensive experiments show that FNACSPM outperforms other state-of-the-art approaches for fake news classification and that it expedites the classification task while maintaining high accuracy.
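The pipeline summarized in the abstract (transform news text into sequences, mine frequent sequential patterns, use the patterns as classification features) can be illustrated with a minimal sketch. This is not the FNACSPM implementation: the function names, the toy headlines, the `min_support` threshold, and the `max_len` cap are all assumptions made for illustration, and the naive enumeration below stands in for the dedicated SPM algorithms the framework relies on.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if the pattern's items occur in the sequence in order (gaps allowed)."""
    remaining = iter(sequence)
    return all(item in remaining for item in pattern)


def mine_frequent_patterns(sequences, min_support, max_len=2):
    """Naively count ordered word patterns of up to max_len items.

    Returns {pattern: support_count} for patterns that occur in at least
    min_support (a fraction) of the sequences. Illustrative only: real SPM
    algorithms prune the search space instead of enumerating everything.
    """
    counts = {}
    for seq in sequences:
        seen = set()
        for n in range(1, max_len + 1):
            # combinations() keeps the original order, so every tuple
            # is a valid subsequence of seq.
            seen.update(combinations(seq, n))
        for pattern in seen:  # count each pattern once per sequence
            counts[pattern] = counts.get(pattern, 0) + 1
    threshold = min_support * len(sequences)
    return {p: c for p, c in counts.items() if c >= threshold}


def to_features(sequence, patterns):
    """Encode a sequence as a binary vector: 1 if it contains the pattern, else 0."""
    return [1 if is_subsequence(p, sequence) else 0 for p in patterns]


# Hypothetical toy corpus (not taken from the paper's datasets).
texts = [
    "shocking secret they hide",      # fake
    "shocking truth they hide",       # fake
    "officials confirm the report",   # real
    "officials release the report",   # real
]
docs = [t.split() for t in texts]

supports = mine_frequent_patterns(docs, min_support=0.5)
patterns = sorted(supports)  # fixed ordering so feature columns are stable

# Each row could be passed to any standard classifier as a feature vector.
feature_matrix = [to_features(doc, patterns) for doc in docs]
for text, row in zip(texts, feature_matrix):
    print(text, row)
```

Because the feature vectors are plain binary lists, they can be handed to any off-the-shelf classifier. A production pipeline would replace the enumeration step with an efficient SPM algorithm, since the number of candidate subsequences grows quickly with document length.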

Keywords

disinformation; fake news; sequential pattern mining (SPM); frequent patterns; classification


Big Data Mining and Analytics

Volume 7, Issue 3, September 2024

Pages 942-963

DOI: 10.26599/BDMA.2024.9020015

Cite this article:

Nawaz MZ, Nawaz MS, Fournier-Viger P, et al. Analysis and Classification of Fake News Using Sequential Pattern Mining. Big Data Mining and Analytics, 2024, 7(3): 942-963. https://doi.org/10.26599/BDMA.2024.9020015
