Performance Comparison of Query-based Techniques for Anti-pattern Detection


Program queries play an important role in several software evolution tasks like program comprehension, impact analysis, or the automated identification of anti-patterns for complex refactoring operations. A central artifact of these tasks is the reverse-engineered program model built from the source code (usually an Abstract Semantic Graph, ASG), which is traditionally post-processed by dedicated, hand-coded queries.

Our paper investigates the costs and benefits of using the popular industrial Eclipse Modeling Framework (EMF) as an underlying representation of program models processed by four different general-purpose model query techniques based on native Java code, OCL evaluation and (incremental) graph pattern matching.

We provide an in-depth comparison of these techniques on the source code of 28 Java projects using anti-pattern queries taken from refactoring operations in different usage profiles.

Our results show that general-purpose model queries can outperform hand-coded queries by 2–3 orders of magnitude, with the trade-off of an increase in memory consumption and model load time of up to an order of magnitude.

The measurement results for the usage profiles can be used as guidelines for selecting the appropriate query technology in concrete scenarios.

Information and Software Technology, 65(C):147–165


    author   = {Ujhelyi, Zolt\'an and Sz\H{o}ke, G\'abor and Horv\'ath, {\'A}kos and Csisz\'ar, Norbert Istv\'an and Vid\'acs, L\'aszl\'o and Varr\'o, D\'aniel and Ferenc, Rudolf},
    title    = {Performance Comparison of Query-based Techniques for Anti-pattern Detection},
    journal  = {Information and Software Technology},
    year     = {2015},
    volume   = {65},
    number   = {C},
    pages    = {147--165},
    month    = sep,
    issn     = {0950-5849},
    doi      = {10.1016/j.infsof.2015.01.003},
    keywords = {Anti-patterns, Refactoring, Performance measurements, Columbus, EMF-IncQuery, OCL},
    url      = {},