Deep learning in static, metric-based bug prediction


Our increasing reliance on software products and the amount of money we spend on creating and maintaining them make it crucial to find bugs as early and as easily as possible. At the same time, it is not enough to know that we should be paying more attention to bugs; finding them must become a quick and seamless process in order to be actually used by developers. Our proposal is to revitalize static source code metrics – among the most easily calculable, while still meaningful predictors – and combine them with deep learning – among the most promising and generalizable prediction techniques – to flag suspicious code segments at the class level. In this paper, we show a detailed methodology of how we adapted deep neural networks to bug prediction, applied them to a large bug dataset (containing 8,780 bugged and 38,838 not bugged Java classes), and compared them to multiple "traditional" algorithms. We demonstrate that deep learning with static metrics can indeed boost prediction accuracies. Our best model has an F-measure of 53.59%, which increases to 55.27% for the best ensemble model containing a deep learning component. Additionally, another experiment suggests that these values could improve even further with more data points. We also open-source our experimental Python framework to help other researchers replicate our findings.

Array, 6:100021


    author   = {Ferenc, Rudolf and B{\'a}n, D{\'e}nes and Gr{\'o}sz, Tam{\'a}s and Gyim{\'o}thy, Tibor},
    journal  = {Array},
    title    = {Deep learning in static, metric-based bug prediction},
    year     = {2020},
    issn     = {2590-0056},
    month    = jul,
    note     = {Open Access},
    pages    = {100021},
    volume   = {6},
    doi      = {10.1016/j.array.2020.100021},
    keywords = {Neural networks, Deep learning, Bug prediction, Code metrics},
    url      = {},