Relating Clusterization Measures and Software Quality

Béla Csaba, Lajos Schrettner, Árpád Beszédes, Judit Jász, Péter Hegedűs and Tibor Gyimóthy
Empirical studies have shown that dependence clusters are both prevalent in source code and detrimental to many software-related activities, including maintenance, testing and comprehension. Based on such observations, it would be worthwhile to characterize more precisely the connection between dependence clusters and software quality. Such attempts are hindered by a number of difficulties: assessing the quality of software, measuring the degree of clusterization of software, and finding the means to exhibit the connection (or lack of it) between the two.

In this paper we present our approach to establishing a connection between software quality and clusterization. Software quality models comprise low- and high-level quality attributes; in addition, we defined new clusterization metrics that give a concise characterization of the clusters contained in programs. Apart from calculating correlation coefficients, we used mutual information to quantify the relationship between clusterization and quality. Results show that a connection can be demonstrated between the two, and that mutual information combined with correlation can be a better indicator for guiding deeper examinations in the area.

Keywords: Software quality model, Quality metrics, Dependence cluster, Clusterization metrics, Correlation, Mutual information.
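
As an illustration of why mutual information may complement correlation, the following minimal sketch computes both for a pair of discrete sequences. It is not the computation used in this paper: the function names and the toy data are hypothetical, and real metric values would typically need to be discretized (binned) before mutual information is estimated this way.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def mutual_information(xs, ys):
    """Mutual information (in bits) of two equal-length discrete sequences,
    estimated from empirical frequencies."""
    n = len(xs)
    count_x = Counter(xs)
    count_y = Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in count_xy.items():
        p_joint = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_joint * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data (not from the paper): y depends on x, but not linearly.
toy_x = [-2, -1, 0, 1, 2]
toy_y = [x * x for x in toy_x]

r = pearson(toy_x, toy_y)
mi = mutual_information(toy_x, toy_y)
print(f"correlation = {r:.3f}, mutual information = {mi:.3f} bits")
```

In this toy case the correlation coefficient is zero although one sequence fully determines the other, while the mutual information is clearly positive. This is the kind of dependence that correlation alone can miss.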