Research Institute of Visual Computing


Lp Shape Deformation

Authors: GAO Lin, ZHANG GuoXin & LAI YuKun

DOI: 10.1016/j.imavis.2009.12.005

Abstract:

To allow remotely sensed datasets to be used for data fusion, either to gain additional insight into the scene or for change detection, reliable spatial referencing is required. With modern remote sensing systems, reliable registration can be gained by applying an orbital model for spaceborne data or through the use of global positioning (GPS) and inertial navigation (INS) systems in the case of airborne data. Whilst these datasets appear well registered individually, when compared to a second dataset from another source (e.g., optical to LiDAR or optical to radar) the resulting images may still be several pixels out of alignment. Manual registration techniques are often slow and labour intensive, and although an improvement in registration is gained, there can still be some misalignment of the datasets. This paper outlines an approach for automatic image-to-image registration in which a topologically regular grid of tie points was imposed within the overlapping region of the images. To ensure topological consistency, tie points were stored within a network structure inspired by Kohonen's self-organising networks [24]. The network was used to constrain the motion of the tie points in a manner similar to Kohonen's original method. Using multiple resolutions, through an image pyramid, the network structure was formed at each resolution level, with connections between the resolution levels allowing tie point movements to be propagated within and to all levels. Experiments were carried out using a range of manually registered multi-modal remotely sensed datasets into which known linear and non-linear transformations were introduced, and against which our algorithm's performance was tested. For single-modality tests with no introduced transformation, a mean error of 0.011 pixels was identified, increasing to 3.46 pixels using multi-modal image data.
Following the introduction of a series of translations, a mean error of 4.98 pixels was achieved across all image pairs, while a mean error of 7.12 pixels was identified for a series of non-linear transformations. Experiments using optical reflectance and height data were also conducted to compare the manually and automatically produced results, where it was found that the automatic results outperformed the manual results. Some limitations of the network data structure were identified when dealing with very large errors, but overall the algorithm produced results similar to, and in some cases better than, those of a manual operator. We have also compared our method favourably with methods from two other software packages: ITK and ITT ENVI.
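
The idea of a regular tie-point grid whose points constrain one another's motion can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (make_tie_point_grid, propagate_shift, mean_error), the geometric decay weighting, and the radius parameter are all assumptions made for illustration, and the image matching step and the multi-resolution pyramid are omitted. It only shows how moving one tie point might pull its grid neighbours along, in the spirit of a Kohonen-style neighbourhood update, and how a mean error in pixels could be measured.

```python
import numpy as np


def make_tie_point_grid(height, width, spacing):
    """Return a (rows, cols, 2) array of (y, x) tie-point positions on a regular grid."""
    ys = np.arange(spacing // 2, height, spacing)
    xs = np.arange(spacing // 2, width, spacing)
    grid_y, grid_x = np.meshgrid(ys, xs, indexing="ij")
    return np.stack([grid_y, grid_x], axis=-1).astype(float)


def propagate_shift(grid, row, col, shift, radius=2, decay=0.5):
    """Shift one tie point and move its grid neighbours by a decaying fraction.

    The weight decay ** ring (ring = Chebyshev distance in grid cells) is an
    assumed neighbourhood function, not the one used in the paper.
    """
    moved = grid.copy()
    rows, cols, _ = grid.shape
    shift = np.asarray(shift, dtype=float)
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            ring = max(abs(r - row), abs(c - col))
            moved[r, c] += (decay ** ring) * shift
    return moved


def mean_error(estimated, reference):
    """Mean Euclidean distance, in pixels, between matching tie points."""
    return float(np.mean(np.linalg.norm(estimated - reference, axis=-1)))


if __name__ == "__main__":
    grid = make_tie_point_grid(100, 100, 20)       # 5 x 5 tie points
    moved = propagate_shift(grid, 2, 2, (4.0, 0.0), radius=1)
    print(grid.shape, mean_error(moved, grid))
```

Because neighbours move together, the grid keeps its topological ordering as long as individual shifts stay small relative to the grid spacing, which is the property the network structure is meant to preserve.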

