A Line-Level Explainable Vulnerability Detection Approach for Java

Abstract

Given modern society's dependence on information technology, high quality and security are not merely desirable but vital properties of current software systems. Empirical methods that combine the rich open-source data now available with the advanced data processing techniques of ML algorithms can help software developers ensure these properties. Nonetheless, state-of-the-art bug and vulnerability prediction methods are rarely used in practice, for two main reasons. First, the predictions are seldom actionable because of their coarse granularity (i.e., they mark entire classes or files as buggy or vulnerable). Second, the methods rarely explain why a fragment of source code is problematic. In this paper, we present a novel Java vulnerability detection method that addresses both of these issues. It is an adaptation of our previous method for JavaScript, which pinpoints the vulnerable source code lines of a program and accompanies each with a prototype-based explanation. The method relies on the word2vec similarity of code fragments to known vulnerable source code lines. Our empirical evaluation showed promising results: using two of the best detection configurations, we detected 61% and 41% of the vulnerable code lines while flagging only 43% and 22% of the program's code lines, respectively.
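The idea of flagging a line by its embedding similarity to known vulnerable lines can be illustrated with a minimal sketch. This is not the implementation from the paper: the class name `PrototypeLineFlagger`, the hand-written three-dimensional token vectors, the tokenizer, the averaging of token vectors into a line vector, and the 0.9 threshold are all assumptions made for illustration. A real setup would load trained word2vec vectors instead of the toy table.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

public class PrototypeLineFlagger {
    // Toy token embeddings (assumption); stand-ins for trained word2vec vectors.
    static final Map<String, double[]> EMBEDDINGS = Map.of(
        "exec",    new double[]{0.9, 0.1, 0.0},
        "runtime", new double[]{0.8, 0.2, 0.0},
        "input",   new double[]{0.7, 0.0, 0.3},
        "print",   new double[]{0.0, 0.9, 0.1},
        "sum",     new double[]{0.0, 0.1, 0.9});

    // Embed a line as the mean of the vectors of its known tokens.
    public static double[] embedLine(String line) {
        double[] acc = new double[3];
        int known = 0;
        for (String token : line.toLowerCase().split("[^a-z]+")) {
            double[] v = EMBEDDINGS.get(token);
            if (v == null) continue; // skip tokens without an embedding
            for (int i = 0; i < acc.length; i++) acc[i] += v[i];
            known++;
        }
        if (known > 0) {
            for (int i = 0; i < acc.length; i++) acc[i] /= known;
        }
        return acc;
    }

    // Cosine similarity; returns 0 when either vector is all zeros.
    public static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) return 0.0;
        return dot / Math.sqrt(normA * normB);
    }

    // Returns the most similar known-vulnerable line if its similarity
    // reaches the threshold; that prototype doubles as the explanation.
    public static Optional<String> nearestPrototype(
            String line, List<String> prototypes, double threshold) {
        double[] v = embedLine(line);
        String best = null;
        double bestSim = threshold;
        for (String prototype : prototypes) {
            double sim = cosine(v, embedLine(prototype));
            if (sim >= bestSim) {
                best = prototype;
                bestSim = sim;
            }
        }
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        List<String> prototypes = List.of("Runtime.getRuntime().exec(input);");
        for (String line : List.of("rt.exec(userInput)", "System.out.print(sum)")) {
            System.out.println(line + " -> "
                + nearestPrototype(line, prototypes, 0.9).orElse("not flagged"));
        }
    }
}
```

In this sketch the threshold controls the trade-off the abstract reports: lowering it flags more lines and catches more vulnerable ones, while raising it flags fewer lines at the cost of missed detections.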

Publication
Proceedings of the 22nd International Conference on Computational Science and Its Applications (ICCSA 2022), Malaga, Spain, Pages 106–122

BibTeX:

@InProceedings{MVH22,
    author    = {Mosolygó, Balázs and Vándor, Norbert and Hegedűs, Péter and Ferenc, Rudolf},
    booktitle = {Proceedings of the 22nd International Conference on Computational Science and Its Applications (ICCSA 2022)},
    title     = {A Line-Level Explainable Vulnerability Detection Approach for Java},
    year      = {2022},
    address   = {Malaga, Spain},
    month     = jul,
    pages     = {106--122},
    publisher = {Springer International Publishing},
    doi       = {10.1007/978-3-031-10542-5_8},
    keywords  = {Software security, Vulnerability prediction, Explainable prediction model, Empirical study},
    url       = {https://link.springer.com/chapter/10.1007/978-3-031-10542-5_8},
}