University of Szeged, Institute of Informatics, Department of Image Processing and Computer Graphics
Estimation of Linear Shape Deformations and its Medical Applications
Registration is a crucial step in almost all image processing tasks where images of different views or sensors of an object need to be compared or combined. Typical application areas include visual inspection, target tracking, super resolution, shape modeling, object recognition, or medical image analysis. In a general setting, one is looking for a transformation which aligns two images such that one image (template) becomes similar to the second one (observation). Due to the large number of possible transformations, there is a huge variability of the object signature. Hence the problem is inherently ill-defined unless this variability is taken into account.
The classical way to solve this registration problem is to find correspondences between the two images and then determine the transformation parameters from these landmarks, usually via an iterative optimization procedure. These algorithms basically fall into two main categories. Feature-based methods aim at establishing point correspondences between the two images. For that purpose, they extract some easily detectable landmarks (e.g., contours, intersections of lines, corners, etc.) from the images and then use these landmarks to find the best match based on a similarity metric. Area-based methods, on the other hand, treat the problem without attempting to detect salient objects. These methods are sometimes called correlation-like methods because they use a rectangular window to gain some preliminary information about the distortion: they search for the position in the observation where the two windows match best and then look for sufficient alignment between the windows in the template and in the observation.
The parametric estimation of two-dimensional affine transformations between two gray-level images has been addressed by Hagege and Francos. Their method provides an accurate and computationally simple solution avoiding both the correspondence problem as well as the need for optimization. The basic idea is to reformulate the original problem as an equivalent linear parameter estimation one which can be easily solved. Their solution, however, makes use of the radiometric information which is not available in binary images.
We proposed an extension of this idea to the binary case. The fundamental difference compared to classical image registration algorithms is that our model works without any landmark extraction, correspondence, or iterative optimization. It makes use of all the information available in the input images and constructs a linear or polynomial system of equations which can be solved exactly. The complexity of the algorithm is linear, hence it is potentially capable of registering images at near real-time speed. We proposed several methods to find the distortion between the images.
I. Solution by constructing a polynomial system of equations
Our first approach applies nonlinear functions to the coordinates of the object points, and thus a polynomial system of equations is constructed. The parameters of the unknown affine transformation are provided by the solution of this nonlinear (polynomial) system of equations. From a geometric point of view, the idea of this approach is to match the centers of mass of the original and the nonlinearly transformed pairs of template and observation images. Note that this approach extends to any diffeomorphic transformation, so it can be considered a framework for solving this class of problems. The following figure shows the effect of two such nonlinear transformations.
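The construction can be sketched on a point-cloud stand-in for the binary shape (a hypothetical demo, not the authors' implementation; with a discrete point set the Jacobian factor of the area integrals drops out): each nonlinear function ω applied to the coordinates yields one polynomial equation in the six affine parameters, and the resulting system is solved numerically. The shapes, the ω family, and the use of scipy.optimize.fsolve are illustrative choices only.

```python
import numpy as np
from scipy.optimize import fsolve

rng = np.random.default_rng(0)
pts = rng.random((200, 2))               # foreground coordinates of the template

A_true = np.array([[1.1, 0.2, 0.3],
                   [-0.1, 0.9, -0.2]])   # "unknown" affine map (ground truth)

def affine(params, xy):
    a = np.asarray(params).reshape(2, 3)
    return xy @ a[:, :2].T + a[:, 2]

obs = affine(A_true.ravel(), pts)        # observation = transformed template

# polynomial functions omega applied to the point coordinates
omegas = [lambda p: p[:, 0], lambda p: p[:, 1],
          lambda p: p[:, 0]**2, lambda p: p[:, 1]**2,
          lambda p: p[:, 0] * p[:, 1], lambda p: p[:, 0]**3]

rhs = np.array([w(obs).sum() for w in omegas])   # known observation-side sums

def equations(params):
    # one polynomial equation in the 6 affine parameters per omega
    tr = affine(params, pts)
    return np.array([w(tr).sum() for w in omegas]) - rhs

sol = fsolve(equations, np.array([1., 0., 0., 0., 1., 0.]))  # identity guess
print(np.round(sol.reshape(2, 3), 4))    # should recover A_true
```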
In our first method we make use of the characteristic function of the objects in order to construct the system of nonlinear equations. Note that the system of equations is built from integrals of coordinates over limited-precision digital images (containing discretization errors). Hence, we obtain better results when more precise object descriptors are available (see I.2. Fuzzy extension). We proposed solutions for both 2D and 3D registration problems.
Registration results on 2D synthetic images
The proposed algorithm has been tested on a large database of binary images of size 1000×1000. The dataset consists of 56 different shapes and their transformed versions, a total of approximately 50,000 images. Furthermore, we reviewed some of the most relevant binary registration approaches and, where an implementation was available, quantitatively evaluated the performance of our algorithm against these methods on 1686 images. Some of these results are presented in the following figure.
Registration results on 3D synthetic images
We compared the performance of our proposed method with two approaches inspired by well-known registration methods from the literature, based on mutual information (MI) and Chamfer matching [1,2]. To speed up the search in the competing methods, as well as to make them more robust, we used a hierarchical approach based on a Gaussian scale-space pyramid with 3 levels, where the best candidate at the coarser level is propagated to the finer levels. Since the capture ranges of the objective functions are known to be limited, we initiated the search from 27 different orientations at the coarsest pyramid level.
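For concreteness, a minimal sketch of such a 3-level Gaussian scale-space pyramid (the smoothing σ and interpolation order are illustrative choices, not those of the cited implementations):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def gaussian_pyramid(img, levels=3):
    """Gaussian scale-space pyramid, finest level first."""
    pyr = [img.astype(float)]
    for _ in range(levels - 1):
        # smooth before subsampling by a factor of 2 to avoid aliasing
        pyr.append(zoom(gaussian_filter(pyr[-1], sigma=1.0), 0.5, order=1))
    return pyr

pyr = gaussian_pyramid(np.ones((64, 64)))
print([p.shape for p in pyr])   # [(64, 64), (32, 32), (16, 16)]
```

The search then runs on the coarsest (last) level and the best candidate is refined on progressively finer levels.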
We used the 375 rigid-body cases from our 3D test database, since we had only rigid versions of the competing methods. Our proposed method clearly outperformed both MI and Chamfer: it gave excellent results for 99.5% of the cases and its overall maximal overlap error was 1.15%, whereas, despite starting from 27 orientations, both the MI and Chamfer methods failed seriously in more than 26% of the cases. Furthermore, the computing time requirements of the competing methods were about two orders of magnitude greater than ours.
Registration results on real images
Here you can find a sample implementation and benchmark dataset of the 2D binary image registration algorithm.
Here you can find a sample implementation and benchmark dataset of the 3D binary image registration algorithm.
I.2. Fuzzy extension
We extended our method by investigating the case when the segmentation method is capable of producing fuzzy object descriptions instead of a binary result, in both 2D and 3D. It has been shown that the information preserved by using a fuzzy representation based on area coverage may be successfully utilized to improve the precision and accuracy of several shape descriptors; geometric moments of a shape are among them.
The performance of the proposed algorithm has been tested and evaluated on a database of 2D synthetic images. The dataset consisted of 39 different shapes and their transformed versions, a total of 2000 images. The fuzzy border representations of the observation images were generated by 16×16 supersampling of the pixels close to the object edge, and the pixel coverage was approximated by the fraction of subpixels that fall inside the object. The fuzzy membership values of the images were quantized and represented by integer values having a k-bit (k = 1,...,8) representation. The results of the test showed that the fuzzy representation yields smaller registration errors on average.
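A minimal sketch of the supersampling step, assuming an analytically given shape (a disk here) as the indicator function; the grid size and the shape itself are illustrative only:

```python
import numpy as np

def pixel_coverage(inside, shape, n=16):
    """Approximate per-pixel coverage by n x n supersampling."""
    h, w = shape
    cov = np.zeros(shape)
    offs = (np.arange(n) + 0.5) / n              # subpixel sample offsets
    for i in range(h):
        for j in range(w):
            sx, sy = np.meshgrid(j + offs, i + offs)
            cov[i, j] = inside(sx, sy).mean()    # fraction of subpixels inside
    return cov

# hypothetical analytic shape: a disk of radius 5 centered at (8, 8)
disk = lambda x, y: (x - 8.0) ** 2 + (y - 8.0) ** 2 <= 25.0
cov = pixel_coverage(disk, (16, 16))
print(round(cov.sum(), 2))                       # close to the disk area, pi * 25
```

Interior pixels get coverage 1, background pixels 0, and border pixels a fraction in between, which is exactly the extra information the fuzzy representation preserves.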
In real applications we need a segmentation method that provides fuzzy results. The moment estimation method presented in  assumes a discrete representation of a shape such that pixels are assigned their corresponding pixel coverage values. Even though many fuzzy segmentation methods exist in the literature, very few of them result in pixel-coverage-based object representations. With the intention of showing the applicability of the approach, rather than designing a completely new fuzzy segmentation method, we derived pixel coverage values from an Active Contour segmentation: we modified the SnakeD plugin for ImageJ by Thomas Boudier. Active Contour segmentation provides a crisp parametric representation of the object contour, from which it is fairly straightforward to compute pixel coverage values.
II. Solution by constructing a linear system of equations
We also proposed a linear solution to register affine transformations of binary images. When we can observe or construct image features that are covariant under the transformation (e.g., gray levels), called a covariant function pair, the idea of Hagege and Francos can be applied: the solution of a linear system of equations provides the parameters of the unknown transformation. The main challenge is how to construct these covariant function pairs.
In our second approach, in spite of the missing radiometric information, the exact transformation is obtained as a least-squares solution of a linear system. The basic idea is to generate a pair of covariant functions that are related by the unknown transformation. This can be achieved by fitting a Gaussian density to each shape, which preserves the effect of the unknown transformation. It can also be regarded as a consistent coloring of the shapes, yielding two rich functions defined over the two shapes to be matched.
Using the Mahalanobis-distance for defining the covariant functions.
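A small numerical sketch of why the Mahalanobis distance of the fitted Gaussian is a suitable covariant function (using a point-cloud stand-in for the shape; the specific transformation is arbitrary): the fitted covariance transforms as A Σ Aᵀ, so corresponding points receive identical values.

```python
import numpy as np

rng = np.random.default_rng(1)
pts = rng.random((500, 2))                  # template foreground coordinates

A = np.array([[1.2, 0.3], [-0.2, 0.8]])
t = np.array([2.0, -1.0])
obs = pts @ A.T + t                         # affinely transformed observation

def mahalanobis(x, p):
    """Distance of x from the Gaussian fitted to the point set p."""
    mu = p.mean(axis=0)
    icov = np.linalg.inv(np.cov(p.T))
    d = x - mu
    return float(np.sqrt(d @ icov @ d))

# the fitted mean maps to A*mu + t and the covariance to A @ cov @ A.T,
# so the Mahalanobis "coloring" agrees at corresponding points:
m_template = mahalanobis(pts[0], pts)
m_observation = mahalanobis(obs[0], obs)
print(m_template, m_observation)            # equal up to rounding
```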
Then we apply a set of nonlinear functions to create the linear system of equations. Theoretically, any function could be used to construct the system of equations. In practice, however, the registration result depends on the chosen set of functions because of the inherent errors due to discretization. In our experiments, we found that the identity function together with the trigonometric family gives good results.
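The construction can be sketched as follows (a hypothetical point-cloud demo, not the published implementation): since the covariant pair satisfies g(Ax + t) = f(x), applying each ω to the coloring yields equations that are linear in the six affine parameters, and a least-squares solve recovers the transformation.

```python
import numpy as np

rng = np.random.default_rng(4)
pts = rng.random((400, 2))                       # template "pixels"
A_true = np.array([[1.2, 0.3, 0.5],
                   [-0.1, 0.8, -0.4]])           # ground-truth affine map
obs = pts @ A_true[:, :2].T + A_true[:, 2]

def coloring(p):
    """Mahalanobis distance to the Gaussian fitted to the shape (covariant)."""
    mu = p.mean(axis=0)
    icov = np.linalg.inv(np.cov(p.T))
    d = p - mu
    return np.sqrt(np.einsum('ij,jk,ik->i', d, icov, d))

f, g = coloring(pts), coloring(obs)              # g(A x + t) == f(x)

# the identity function plus a small trigonometric family of omegas
omegas = [lambda v: v, np.cos, np.sin, lambda v: np.cos(2 * v)]

M, b = [], []
for w in omegas:
    wf, wg = w(f), w(g)
    mx = [(pts[:, 0] * wf).sum(), (pts[:, 1] * wf).sum(), wf.sum()]
    # sum_y y_k * w(g(y)) = sum_x (A x + t)_k * w(f(x)) is linear in (A, t)
    M.append(mx + [0.0, 0.0, 0.0]); b.append((obs[:, 0] * wg).sum())
    M.append([0.0, 0.0, 0.0] + mx); b.append((obs[:, 1] * wg).sum())

params, *_ = np.linalg.lstsq(np.array(M), np.array(b), rcond=None)
A_est = params.reshape(2, 3)
print(np.round(A_est, 4))                        # should recover A_true
```

Each ω contributes two equations, so a handful of functions already overdetermines the six unknowns.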
Some results and comparison
We compared the results of our method with the method proposed by Kannala et al. and the shape context method proposed by Belongie et al. Note that for the shape context method, the images were reduced in size to obtain the result in acceptable time.
The previous method requires an almost perfect segmentation in order to achieve satisfactory alignment. Here we present an elegant, fast and robust solution, where linear equations are constructed by integrating nonlinear functions over corresponding domains derived from compound shapes.
The topology of the objects does not change under the action of the affine group, thus there exists a bijective mapping between the template and observation components. As a consequence, we can construct covariant functions for each pair of components as well as for the overall shape, giving m + 1 relations, where m is the number of regions. Since the correspondence between components is usually not known, we sum these relations, yielding mixtures of unnormalized Gaussian densities, which can also be interpreted as a consistent coloring of the template and the observation. As a result, we can transform the original binary images into gray-level ones in which corresponding pixels have exactly the same gray value.
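A small sketch of this correspondence-free summing (point-cloud stand-ins for the components; the offsets and the transformation are arbitrary): each component contributes an unnormalized Gaussian density, and the sums agree at corresponding points regardless of component ordering.

```python
import numpy as np

def gauss_color(x, p):
    """Unnormalized Gaussian density fitted to one component, evaluated at x."""
    mu = p.mean(axis=0)
    icov = np.linalg.inv(np.cov(p.T))
    d = x - mu
    return float(np.exp(-0.5 * d @ icov @ d))

rng = np.random.default_rng(2)
# a compound shape with m = 3 components (point-cloud stand-ins)
comps = [rng.random((300, 2)) + off for off in ([0, 0], [3, 0], [0, 3])]
A = np.array([[1.1, 0.2], [0.1, 0.9]])
t = np.array([1.0, -2.0])
obs_comps = [c @ A.T + t for c in comps]

# correspondence-free consistent coloring: sum the per-component densities
color_t = lambda x: sum(gauss_color(x, c) for c in comps)
color_o = lambda y: sum(gauss_color(y, c) for c in obs_comps)

x = comps[1][0]
print(color_t(x), color_o(x @ A.T + t))   # identical "gray values"
```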
One of the most important questions of the proposed method is how to select a finite (integration) domain that is less sensitive to segmentation errors. The obvious approach (as presented above) is to use the segmented shape itself as the domain; however, segmentation errors then inherently result in erroneous integrals causing misalignment. Since the equidensity contours of a bivariate Gaussian probability density function are ellipses centered at the mean, it is natural to choose one of the ellipses of the density fitted to the overall shape as the domain, because these ellipses are related by the unknown transformation. We may choose an ellipse according to the two-sigma rule, which guarantees that about 95% of the values fall within the enclosed ellipse.
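The domain selection can be sketched as follows (a point-cloud stand-in; the threshold of 2 corresponds to the two-sigma rule): because the fitted ellipse is covariant with the transformation, exactly the same points fall inside it on both sides.

```python
import numpy as np

rng = np.random.default_rng(3)
pts = rng.standard_normal((1000, 2))            # template foreground points
A = np.array([[1.3, 0.4], [-0.2, 0.8]])
t = np.array([5.0, 1.0])
obs = pts @ A.T + t

def in_two_sigma(p):
    """Mask of points inside the 2-sigma ellipse of the fitted Gaussian."""
    mu = p.mean(axis=0)
    icov = np.linalg.inv(np.cov(p.T))
    d = p - mu
    return np.einsum('ij,jk,ik->i', d, icov, d) <= 4.0   # squared Mahalanobis

# the fitted ellipse transforms together with the shape, so the selected
# integration domain is consistent on both sides:
print((in_two_sigma(pts) == in_two_sigma(obs)).all())    # True
```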
Making use of the same idea as in the previous case, we can construct a linear system of equations by applying a set of nonlinear functions. In our experiments, we found that the identity function together with the trigonometric family gives good results.
In practice, segmentation never produces perfect shapes. Therefore we also evaluated the robustness of the proposed approach when 10%, 20%,..., 90% of the foreground pixels had been removed from the observation before registration. Our method provides good results with up to 50% of the pixels removed, and the results at 90% are still acceptable. In general, our method performs well as long as the first- and second-order statistics of the shapes do not change dramatically.
Real images: traffic sign matching
Automatic traffic sign recognition usually requires the registration of a reference shape to an observed one. There are many methods to automatically detect and segment signs, but in our experiments we simply performed this step by manual thresholding. The results illustrate that the proposed method provides good results under real-life conditions.