Systematically Generated Formulas for Spectrum-Based Fault Localization

Qusay Idrees Sarhan, Tamás Gergely and Árpád Beszédes
The basic elements of Spectrum-Based Fault Localization (SBFL) are the risk evaluation formulas, which calculate a suspiciousness score for each program element based on test coverage and test case outcome information. This score can be used in debugging to identify the faulty element more efficiently. A large number of manually crafted formulas have been proposed, but a line of research tries to generate formulas (semi-)automatically. Some of these approaches are based on heuristic search (e.g., genetic algorithms), and researchers have only recently started examining systematic ways to generate all possible formulas corresponding to a particular class of formula structures. In a recent work, we explored a very simple formula template as a proof of concept, but that study did not find a new formula that outperformed already published ones. In this paper, we take the next step and investigate a class of formula templates that are more complex but still feasible to explore fully. Many of the generated formulas reproduce well-known existing ones, but we also identified two new formulas that are not found in the literature and that outperform most of the previously published formulas when evaluated on the Defects4J dataset.

Keywords: Spectrum-Based Fault Localization, debugging, suspiciousness score formulas, systematic search.
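To make the notion of a risk evaluation formula concrete, the following minimal sketch computes the four usual spectrum counts for each program element and scores them with Ochiai, a well-known, previously published formula. It is an illustration only and is not one of the formulas generated in this paper; the function names (`spectrum`, `ochiai`, `rank`) and the boolean coverage-matrix layout are assumptions made for the example.

```python
import math

def spectrum(coverage, outcomes):
    """Return one (ef, ep, nf, np_) tuple per program element.

    coverage[t][e] is True if test t executes element e (assumed layout);
    outcomes[t] is True if test t passed.
    ef/ep: failing/passing tests that execute the element,
    nf/np_: failing/passing tests that do not execute it.
    """
    counts = []
    for e in range(len(coverage[0])):
        ef = ep = nf = np_ = 0
        for row, passed in zip(coverage, outcomes):
            if row[e]:
                if passed:
                    ep += 1
                else:
                    ef += 1
            else:
                if passed:
                    np_ += 1
                else:
                    nf += 1
        counts.append((ef, ep, nf, np_))
    return counts

def ochiai(ef, ep, nf, np_):
    """Ochiai suspiciousness score: ef / sqrt((ef + nf) * (ef + ep))."""
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def rank(coverage, outcomes, formula):
    """Order element indices from most to least suspicious."""
    scores = [formula(*c) for c in spectrum(coverage, outcomes)]
    order = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return order, scores
```

Any other formula over the same four counts can be passed to `rank` in place of `ochiai`, which is what makes it possible to compare candidate formulas on the same coverage data.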