Design Pattern Mining Enhanced by Machine Learning

Rudolf Ferenc, Árpád Beszédes, Lajos Fülöp and János Lele
Design patterns present good solutions to frequently occurring problems in object-oriented software design. Their correct application in a system's design may therefore significantly improve its internal quality attributes, such as reusability and maintainability. In software maintenance, up-to-date documentation is crucial, and discovering as yet unknown design pattern instances can help improve it. A reliable design pattern recognition system is hence very desirable. However, simpler methods based on pattern matching may give imprecise results because the patterns' structural descriptions are vague. In previous work we presented a pattern matching-based system built on the Columbus framework, with which we were able to find pattern instances in source code. Because that system considered only the patterns' structural descriptions, it could neither identify false hits nor distinguish structurally similar design patterns such as State and Strategy. In the present work we use machine learning to enhance pattern mining by filtering out as many false hits as possible. To do so, we distinguish true and false pattern instances with the help of a learning database created by manually tagging a large C++ system.

Keywords: Columbus, C++, Design Patterns, Machine Learning, StarOffice