Research Group on Visual Computation

Correspondences between Planar Image Patches

Description

Finding correspondences between image pairs is a fundamental task in computer vision. Here, we focus on establishing matches between images of urban scenes, which are typically composed of planar surface patches with highly repetitive structures. The latter property makes traditional point-based methods unreliable. The basic idea of our approach is to formulate the correspondence problem in terms of homography estimation between planar image regions: given a planar region in one image, we simultaneously look for its corresponding segmentation in the other image and the planar homography acting between the two regions.

It is shown that, because the views overlap, the general 8 degrees of freedom (DOF) of the homography mapping can be geometrically constrained to 3 DOF by rectifying the images: scaling, shear, and translation along the horizontal axis. The resulting segmentation/registration problem can be solved efficiently by searching for the region's occurrence in the second image, using a pyramid representation and normalized mutual information as the intensity similarity measure.
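The two ingredients named above can be illustrated with a minimal sketch. It assumes that, in rectified image space, the homography leaves the row coordinate unchanged, so the three remaining parameters act only on x. It also assumes the entropy-ratio form of normalized mutual information, NMI = (H(A) + H(B)) / H(A, B); the source does not state which NMI variant is used. The function names and the `bins` parameter are hypothetical and are not taken from the publications.

```python
import numpy as np


def rectified_homography(scale, shear, tx):
    """Build a 3-DOF homography for rectified images (assumed form).

    x' = scale * x + shear * y + tx, and y' = y, so points stay on
    their horizontal epipolar line.
    """
    return np.array([[scale, shear, tx],
                     [0.0, 1.0, 0.0],
                     [0.0, 0.0, 1.0]])


def apply_homography(H, points):
    """Map an (N, 2) array of (x, y) points through the 3x3 matrix H."""
    pts = np.asarray(points, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]


def normalized_mutual_information(a, b, bins=32):
    """Entropy-ratio NMI from a joint intensity histogram (assumed variant).

    Undefined when both inputs are constant (zero joint entropy).
    """
    joint, _, _ = np.histogram2d(np.ravel(a), np.ravel(b), bins=bins)
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1)  # marginal distribution of a
    p_b = p_ab.sum(axis=0)  # marginal distribution of b

    def entropy(p):
        p = p[p > 0]  # skip empty bins, since 0 * log(0) is taken as 0
        return -np.sum(p * np.log(p))

    return (entropy(p_a) + entropy(p_b)) / entropy(p_ab)
```

With this form, NMI equals 2 for identical patches and approaches 1 for statistically independent ones, so a search over `scale`, `shear`, and `tx` would look for the parameters that maximize it.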

Fig.1. Patch transformed by the three DOF of the transformation (scaling, shear and translation along the X-axis - left) and the optimal patch correspondence (red outline over the rectified second image - right). The horizontal epipolar lines in the rectified image space are shown in blue.

The method has been validated on a large database of building images taken by different mobile cameras (Samsung Galaxy S, Samsung Galaxy Note 10.1, HTC EVO 3D), and quantitative evaluation confirms robustness against intensity variations, occlusions, and the presence of non-planar parts. The images were 2 megapixels in size, which provides a good balance between image resolution and computation speed. Because of the collaborative nature of mobile imaging, the images were taken nearly simultaneously. Thus, lighting conditions are expected to be very similar, but the imaging capabilities of the devices were quite different.

The evaluation dataset contained image pairs of 100 different urban scenes taken by different mobile cameras. The alignment precision was quantitatively measured by the correlation coefficient computed between the transformed patch and its corresponding region for each image pair. The planar region in the first image was interactively marked as a polygonal mask, and the corresponding patch in the second image was then determined by three competing methods. A fourth case applied our proposed method to a region-grown patch inside the polygonal region to show its independence from the shape and connectivity of the patch.
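As an illustration of this evaluation measure, the following minimal sketch computes a Pearson correlation coefficient over the pixels selected by a mask. It assumes the transformed patch and the target region have already been resampled onto the same pixel grid; the function name `masked_correlation` and the convention of returning 0 for constant input are assumptions, not details from the publications.

```python
import numpy as np


def masked_correlation(patch, region, mask):
    """Pearson correlation of intensities over pixels where mask is True."""
    m = np.asarray(mask, dtype=bool)
    x = np.asarray(patch, dtype=float)[m]
    y = np.asarray(region, dtype=float)[m]
    x = x - x.mean()
    y = y - y.mean()
    denom = np.sqrt(np.sum(x * x) * np.sum(y * y))
    if denom == 0.0:
        # Constant intensities: correlation is undefined, report 0 (assumed convention).
        return 0.0
    return float(np.sum(x * y) / denom)
```

Because the mask is an arbitrary boolean array, the same function works for a polygonal selection and for a region-grown patch that is irregular or disconnected.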

Our proposed method was compared against an ASIFT-based method and its rectified version (see plots in Fig.2). The typical computing time of our method (excluding rectification) was around 4.5 seconds, while the Patch2 variant (i.e., no refinement at the finest pyramid level) proved to be 2.7x faster, with an average time of 1.74 seconds (see Fig.3).

Fig.2. Correlation coefficient values for corresponding regions obtained by the tested methods. The values are ordered in a best-to-worst sense, independently for each method. It is confirmed that our method outperforms the competing ones.
Fig.3. Average computing times of the faster correspondence method (Patch2) using different devices.

By visually checking the results, we can conclude that our proposed method is able to produce near-perfect correspondences even if the region contains non-planar objects causing parallax. Partial occlusions caused by electric wires, vegetation, or lamp posts are also well tolerated. Examples can be found on the accompanying page of our DICTA 2014 publication.

Applications of the method include 3D planar surface reconstruction (Fig.4) as well as 2D mosaicking (Fig.5).

Fig.4. Top row: The two overlapping input images. The borders of the polygonal patch selections are shown in red. Bottom: Animation of the 3D reconstructed result.
Fig.5. Result of mosaicking. The border of the transformed image is shown in yellow.
Publications to cite:
  1. Attila Tanács, András Majdik, Levente Hajder, József Molnár, Zsolt Sánta, Zoltan Kato, Collaborative Mobile 3D Reconstruction of Urban Scenes, In Proceedings of ACCV Workshop on Intelligent Mobile and Egocentric Vision (C. V. Jawahar, Shiguang Shan, eds.), Springer, vol. 9010, Singapore, pp. 486-501, 2014.
  2. Attila Tanács, András Majdik, József Molnár, Atul Rai, Zoltan Kato, Establishing Correspondences between Planar Image Patches, In Proceedings of International Conference on Digital Image Computing: Techniques and Applications, IEEE, Wollongong, New South Wales, Australia, pp. 1-7, 2014.
