Information retrieval based feature analysis for product line adoption in 4GL systems

András Kicsi, László Vidács, Árpád Beszédes, Ferenc Kocsis and István Kovács
New customers often require custom features of a successfully marketed product. As the number of variants grows, new challenges arise in maintenance and evolution activities. Software product line (SPL) architecture is a timely answer to these challenges. SPL adoption, however, is a large one-time investment that affects both technical and organizational issues. From the program code point of view, the extractive approach is appropriate when several product variants already exist. Analyzing the feature structure and the differences and commonalities of the variants leads to the new common architecture. In this work-in-progress paper we report initial experiments on feature extraction from a set of product variants written in the Magic fourth generation language (4GL). Since existing approaches are mostly designed for mainstream languages, we adapted and reused reverse engineering approaches for the 4GL environment. We followed a semi-automatic feature extraction method, where the higher level features are provided by domain experts. These features are then linked to the internal structure of Magic applications using a textual similarity (IR-based) method. We demonstrate the feasibility of the 4GL feature extraction method and validate it on two variants of a real-life logistical system, each consisting of more than 2000 Magic programs.

Keywords: Product lines, SPL, feature extraction, Magic, 4GL, information retrieval, LSI.