Uncovering Dependence Clusters and Linchpin Functions
David Binkley, Árpád Beszédes, Syed Islam, Judit
Jász and Béla Vancsics
Dependence clusters are (maximal) collections of
mutually dependent source code entities according to some
dependence relation. Their presence in software complicates many
maintenance activities including testing, refactoring, and feature
extraction. Despite several studies finding them common in
production code, their formation, identification, and overall
structure are not well understood, partly because of challenges in
approximating true dependences between program entities. Previous
research has considered two approximate dependence relations: a
fine-grained statement-level relation using control and data
dependences from a program’s System Dependence Graph and a coarser
relation based on function-level control-flow reachability. In
principal, the first is more expensive and more precise than the
second.
Using a collection of twenty programs, we present an empirical
investigation of the clusters identified by these two approaches.
In support of the analysis, we consider a hybrid cluster type that
works at the coarser function-level but is based on the
higher-precision statement-level dependences. The three types of
clusters are compared based on their slice sets using two
clustering metrics. We also perform extensive analysis of the
programs to identify linchpin functions – functions primarily
responsible for holding a cluster together. Results include
evidence that the less expensive, coarser approaches can often be
used as effective proxies for the more expensive, finer-grained
approaches. Finally, the linchpin analysis shows that linchpin
functions can be effectively and automatically identified.
Back