Extracting Facts from Open Source Software

Abstract

Open source software systems are becoming increasingly important. Many companies invest in open source projects, and many also use such software in their own work. However, because open source software is often developed without proper management, the quality and reliability of the code may be uncertain. Code quality therefore needs to be measured, which requires proper tools. We describe a framework called Columbus, with which we calculate the object-oriented metrics validated by Basili et al. to illustrate how fault-proneness can be detected in Mozilla, an open source Web and e-mail suite. We also compare the metrics of several versions of Mozilla to see how the predicted fault-proneness of the system changed during its development. The Columbus framework has recently been extended with a compiler wrapping technology that makes it possible to analyze software systems and extract information from them automatically, without modifying any source code or makefiles. We also introduce our fact extraction process to show what logic drives the various tools of the Columbus framework and what steps need to be taken to obtain the desired facts.

Publication
Proceedings of the 20th International Conference on Software Maintenance (ICSM 2004), Chicago, Illinois, USA, pages 60–69

BibTeX:

@InProceedings{FSG04,
    author    = {Ferenc, Rudolf and Siket, Istv{\'a}n and Gyim{\'o}thy, Tibor},
    title     = {Extracting Facts from Open Source Software},
    booktitle = {Proceedings of the 20th International Conference on Software Maintenance (ICSM 2004)},
    year      = {2004},
    pages     = {60--69},
    address   = {Chicago, Illinois, USA},
    month     = sep,
    publisher = {IEEE Computer Society},
    doi       = {10.1109/ICSM.2004.1357790},
    keywords  = {Fact extraction, metrics, reverse engineering, open source, fault-proneness detection, Mozilla, compiler wrapping, schema, C, C++, Columbus, CAN, CANPP},
    url       = {http://ieeexplore.ieee.org/document/1357790/},
}