Regular Paper

FATOC: Bug Isolation Based Multi-Fault Localization by Using OPTICS Clustering

College of Information Science and Technology, Beijing University of Chemical Technology, Beijing 100029, China
School of Information Science and Technology, Nantong University, Nantong 226019, China
State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China

A preliminary version of the paper was published in the Proceedings of QRS 2019.


Abstract

Bug isolation is a popular approach to multi-fault localization (MFL): all failed test cases are clustered into several groups, and the failed test cases in each group, combined with all passed test cases, are used to localize a single fault. However, existing clustering algorithms cannot always produce completely correct clusters, which is a potential threat to bug isolation based MFL approaches. To address this issue, we first analyze how clustering accuracy influences MFL performance; the results of a controlled study indicate that the clustering algorithm with the highest accuracy achieves the best MFL performance. Moreover, previous studies on clustering algorithms show that the elements in a higher-density cluster have a higher similarity. Motivated by these observations, we propose a novel approach, FATOC (One-Fault-at-a-Time via OPTICS Clustering). FATOC first leverages the OPTICS (Ordering Points to Identify the Clustering Structure) clustering algorithm to group failed test cases and then identifies the cluster with the highest density. OPTICS is a density-based clustering algorithm that can reduce misgrouping and calculate a density value for each cluster; this density value helps find the cluster with the highest clustering effectiveness. FATOC then combines the failed test cases in this cluster with all passed test cases to localize a single fault using a traditional spectrum-based fault localization (SBFL) formula. After this fault is localized and fixed, FATOC uses the same method to localize the next single fault, until all test cases pass. Our evaluation results show that FATOC significantly outperforms the traditional SBFL technique and a state-of-the-art MFL approach, MSeer, on 804 multi-fault versions of nine real-world programs.
Specifically, in terms of the A-EXAM metric, FATOC's performance is 10.32% higher than that of traditional SBFL when using the Ochiai formula. The results also indicate that, when checking 1%, 3%, and 5% of the statements of all subject programs, FATOC can locate 36.91%, 48.50%, and 66.93% of all faults, respectively, which is also better than traditional SBFL and MSeer.
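
The one-fault-at-a-time step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the failed test cases have already been clustered (for example by an OPTICS implementation), and it substitutes the mean pairwise Jaccard distance inside a cluster as a stand-in for the OPTICS density value, whose exact definition the abstract does not give. The function names, the coverage encoding (each test is the set of statement ids it executes), and the simple EXAM-style score are all hypothetical choices made for illustration. Only the Ochiai formula, ef / sqrt(total_failed * (ef + ep)), is the standard one.

```python
import math
from itertools import combinations
from typing import List, Set

Test = Set[int]  # assumed encoding: ids of the statements a test executes


def jaccard_distance(a: Test, b: Test) -> float:
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def mean_pairwise_distance(cluster: List[Test]) -> float:
    """Stand-in for an OPTICS density value (assumption): smaller means denser.
    Note that a singleton cluster gets distance 0.0 under this proxy."""
    pairs = list(combinations(cluster, 2))
    if not pairs:
        return 0.0
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def densest_cluster(clusters: List[List[Test]]) -> List[Test]:
    """Pick the cluster whose failed tests are most similar to each other."""
    return min(clusters, key=mean_pairwise_distance)


def ochiai_ranking(failed: List[Test], passed: List[Test], n_statements: int) -> List[int]:
    """Rank statements by Ochiai suspiciousness, most suspicious first."""
    total_failed = len(failed)
    scores = {}
    for s in range(n_statements):
        ef = sum(1 for t in failed if s in t)  # failed tests covering s
        ep = sum(1 for t in passed if s in t)  # passed tests covering s
        denom = math.sqrt(total_failed * (ef + ep))
        scores[s] = ef / denom if denom > 0 else 0.0
    # Ties are broken by statement id so the ranking is deterministic.
    return sorted(scores, key=lambda s: (-scores[s], s))


def exam_score(ranking: List[int], faulty_statement: int) -> float:
    """Percentage of statements examined until the fault is reached."""
    return 100.0 * (ranking.index(faulty_statement) + 1) / len(ranking)


# Toy program with 5 statements; faults are assumed at statements 2 and 4.
cluster_a = [{0, 1, 2}, {0, 2}]        # failed tests assumed to hit fault 2
cluster_b = [{0, 3, 4}, {0, 1, 3, 4}]  # failed tests assumed to hit fault 4
passed_tests = [{0, 1}, {0, 3}, {0, 1, 3}]

target = densest_cluster([cluster_a, cluster_b])
ranking = ochiai_ranking(target, passed_tests, n_statements=5)
print(ranking, exam_score(ranking, 4))
```

In a full loop, the top-ranked fault would be fixed, the test suite rerun, the remaining failed tests reclustered, and the step repeated until every test passes.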

Electronic Supplementary Material

Download File(s)
jcst-35-5-979-Highlights.pdf (337.4 KB)

Journal of Computer Science and Technology
Pages 979-998
Cite this article:
Wu Y-H, Li Z, Liu Y, et al. FATOC: Bug Isolation Based Multi-Fault Localization by Using OPTICS Clustering. Journal of Computer Science and Technology, 2020, 35(5): 979-998. https://doi.org/10.1007/s11390-020-0549-4


Received: 12 April 2020
Revised: 29 July 2020
Published: 30 September 2020
©Institute of Computing Technology, Chinese Academy of Sciences 2020