Performance Comparison of Query-based Techniques for Anti-pattern Detection

Abstract

Context
Program queries play an important role in several software evolution tasks such as program comprehension, impact analysis, or the automated identification of anti-patterns for complex refactoring operations. A central artifact of these tasks is the reverse-engineered program model built up from the source code (usually an Abstract Semantic Graph, ASG), which is traditionally post-processed by dedicated, hand-coded queries.

Objective
Our paper investigates the costs and benefits of using the popular industrial Eclipse Modeling Framework (EMF) as an underlying representation of program models processed by four different general-purpose model query techniques based on native Java code, OCL evaluation and (incremental) graph pattern matching.

Method
We provide an in-depth comparison of these techniques on the source code of 28 Java projects using anti-pattern queries taken from refactoring operations in different usage profiles.

Results
Our results show that general-purpose model queries can outperform hand-coded queries by 2–3 orders of magnitude, with the trade-off of an increase in memory consumption and model load time of up to an order of magnitude.

Conclusion
The measurement results of usage profiles can be used as guidelines for selecting the appropriate query technologies in concrete scenarios.

Publication
Information and Software Technology, 65(C):147–165

BibTeX:

@Article{USH15,
    author   = {Ujhelyi, Zolt\'an and Sz\H{o}ke, G\'abor and Horv\'ath, {\'A}kos and Csisz\'ar, Norbert Istv\'an and Vid\'acs, L\'aszl\'o and Varr\'o, D\'aniel and Ferenc, Rudolf},
    title    = {Performance Comparison of Query-based Techniques for Anti-pattern Detection},
    journal  = {Information and Software Technology},
    year     = {2015},
    volume   = {65},
    number   = {C},
    pages    = {147--165},
    month    = sep,
    issn     = {0950-5849},
    doi      = {10.1016/j.infsof.2015.01.003},
    keywords = {Anti-patterns, Refactoring, Performance measurements, Columbus, EMF-IncQuery, OCL},
    url      = {http://www.sciencedirect.com/science/article/pii/S0950584915000051?via%3Dihub},
}