Bug Forecast: A Method for Automatic Bug Prediction

Abstract

In this paper we present an approach and a toolset for automatic bug prediction during software development and maintenance. The toolset extends the Columbus source code quality framework, which integrates into regular builds, analyzes the source code, calculates quality attributes such as product metrics and bad code smells, and monitors how these attributes change. The new bug forecast toolset connects to the bug tracking and version control systems and assigns the reported and fixed bugs to the corresponding source code classes in past versions. It then applies machine learning methods to learn which values of which quality attributes typically characterize buggy classes. Based on this information, it can predict bugs in current and future versions of the classes. The toolset was evaluated on an industrial software system developed by a large software company called evosoft. We studied the behavior of the toolset over a 1.5-year development period, during which 128 snapshots of the software were analyzed. The toolset reached an average bug prediction precision of 72%, and reached 100% many times. We concentrated on high precision because the primary purpose of the toolset is to help software developers and testers by pointing out the classes that contain bugs with high probability while keeping the number of false positives relatively low.
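
As a reminder of what the precision figure measures, the following minimal sketch computes precision as the share of classes flagged as buggy that really contained a bug. It is an illustration under stated assumptions, not the toolset's implementation: the function name `precision` and the class names are hypothetical and do not come from the paper or from the evaluated system.

```python
def precision(predicted_buggy, actually_buggy):
    """Fraction of flagged classes that really contained a bug.

    Returns None when no class was flagged, because precision is
    undefined in that case.
    """
    predicted = set(predicted_buggy)
    if not predicted:
        return None
    true_positives = len(predicted & set(actually_buggy))
    return true_positives / len(predicted)


# Hypothetical class names, used only for illustration.
flagged = ["OrderManager", "InvoicePrinter", "UserSession", "CacheStore"]
buggy = ["OrderManager", "UserSession", "CacheStore", "ReportBuilder"]

result = precision(flagged, buggy)  # 3 of the 4 flagged classes are buggy
```

In this example one flagged class is a false positive, so precision is 3/4. The buggy class that was not flagged (`ReportBuilder`) affects recall rather than precision, which is why a precision-focused tool can keep false alarms low while still missing some bugs.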

Publication
Proceedings of the 2010 International Conference on Advanced Software Engineering & Its Applications (ASEA 2010), Jeju Island, Korea, Pages 283–295

BibTeX:

@InProceedings{Fer10,
    author    = {Ferenc, Rudolf},
    title     = {{Bug Forecast}: A Method for Automatic Bug Prediction},
    booktitle = {Proceedings of the 2010 International Conference on Advanced Software Engineering \& Its Applications (ASEA 2010)},
    year      = {2010},
    volume    = {117},
    series    = {Communications in Computer and Information Science (CCIS)},
    pages     = {283--295},
    address   = {Jeju Island, Korea},
    month     = dec,
    publisher = {Springer-Verlag},
    doi       = {10.1007/978-3-642-17578-7_28},
    keywords  = {bug prediction, machine learning, software product metrics, bad code smells},
    url       = {https://link.springer.com/chapter/10.1007%2F978-3-642-17578-7_28},
}