New Ranking Formulas to Improve Spectrum Based Fault Localization Via Systematic Search

Qusay Idrees Sarhan, Tamás Gergely and Árpád Beszédes
In Spectrum-Based Fault Localization (SBFL), when failing test cases indicate a bug, a suspicion score is calculated for each program element (e.g., statement, method, or class) using a risk evaluation formula. The formula is based on basic statistics extracted from test coverage and test results (e.g., whether a program element is covered or not covered by passing or failing tests). The elements are then ranked from most to least suspicious according to their scores. The highest-ranked elements are believed to have the highest probability of being faulty; thus, this lightweight automated technique helps developers find the bug earlier. Several SBFL formulas have been proposed in the literature, but the number of possible formulas is infinite. Previous experiments searched for new formulas automatically (e.g., using genetic algorithms). However, no systematic search for new formulas has been reported in the literature. In this paper, we perform such a search by examining existing formulas, defining formula structure templates, generating formulas automatically (including already proposed ones), and comparing them to each other. Experiments to evaluate the generated formulas were conducted on Defects4J.
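
To make the scoring and ranking step concrete, the following is a minimal sketch of SBFL on a toy coverage matrix. It is an illustration, not the formulas or the search procedure of this paper: Ochiai and Tarantula are two well-known formulas from the SBFL literature, chosen here as examples, and the function names and the data layout (`coverage`, `failed`) are assumptions made for this sketch.

```python
import math

def spectrum_counts(coverage, failed):
    """Return (ef, ep, nf, np) per element.

    coverage[t][e] is True if test t executes element e (assumed layout);
    failed[t] is True if test t fails.
    ef/ep: failing/passing tests that cover the element;
    nf/np: failing/passing tests that do not cover it.
    """
    counts = []
    for e in range(len(coverage[0])):
        ef = sum(1 for t, row in enumerate(coverage) if row[e] and failed[t])
        ep = sum(1 for t, row in enumerate(coverage) if row[e] and not failed[t])
        nf = sum(1 for t, row in enumerate(coverage) if not row[e] and failed[t])
        np_ = sum(1 for t, row in enumerate(coverage) if not row[e] and not failed[t])
        counts.append((ef, ep, nf, np_))
    return counts

def ochiai(ef, ep, nf, np_):
    # Ochiai: ef / sqrt((ef + nf) * (ef + ep)); 0 when the denominator is 0.
    denom = math.sqrt((ef + nf) * (ef + ep))
    return ef / denom if denom else 0.0

def tarantula(ef, ep, nf, np_):
    # Tarantula: failing-coverage ratio normalized by the sum of both ratios.
    fail_ratio = ef / (ef + nf) if ef + nf else 0.0
    pass_ratio = ep / (ep + np_) if ep + np_ else 0.0
    total = fail_ratio + pass_ratio
    return fail_ratio / total if total else 0.0

def rank(coverage, failed, formula):
    """Score every element and return (element indices, most suspicious first; scores)."""
    scores = [formula(*c) for c in spectrum_counts(coverage, failed)]
    order = sorted(range(len(scores)), key=lambda e: scores[e], reverse=True)
    return order, scores

if __name__ == "__main__":
    # Toy data: 4 tests (rows) x 3 program elements (columns).
    coverage = [
        [True, True, False],   # t0, passes
        [True, False, False],  # t1, passes
        [True, True, True],    # t2, fails
        [False, True, True],   # t3, fails
    ]
    failed = [False, False, True, True]
    for formula in (ochiai, tarantula):
        order, scores = rank(coverage, failed, formula)
        print(formula.__name__, order, [round(s, 3) for s in scores])
```

In this toy example, element 2 is executed only by failing tests, so both formulas rank it first. Different formulas can order elements differently on other spectra, which is why the choice of formula matters.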

Keywords: Debugging, automated fault localization, spectrum-based fault localization, formulas, systematic search.