Pixel- and Feature-level Fusions of Aerial Imagery with LiDAR Data for Landscape Object Extraction

Author: Syed Sohel Ali

Ali, Syed Sohel, 2011 Pixel- and Feature-level Fusions of Aerial Imagery with LiDAR Data for Landscape Object Extraction, Flinders University, School of the Environment

This electronic version is made publicly available by Flinders University in accordance with its open access policy for student theses. Copyright in this thesis remains with the author. This thesis may incorporate third party material which has been used by the author pursuant to Fair Dealing exceptions. If you are the owner of any included third party copyright material and/or you believe that any material has been made available without permission of the copyright owner please contact copyright@flinders.edu.au with the details.


The objective of this research is to investigate different fusion models that integrate aerial imagery with LiDAR data for landscape object extraction. Pixel- and feature-level fusions are investigated in data- and user-driven scenarios to delineate a range of landscape objects in forest and semi-urban study areas. Thematic accuracy is evaluated against field-surveyed data, and the optimum fusion model for each study area is identified. The complementary nature of aerial imagery and LiDAR data is the main reason for their selection in this research. LiDAR data provide an accurate measurement of landscape structure in the vertical plane; however, LiDAR sensors have limited coverage of the electromagnetic spectrum. By contrast, aerial imagery provides extensive coverage of landscape classes in the electromagnetic spectrum but is relatively insensitive to variations in the height of objects. As a result, the fusion of aerial imagery with LiDAR data has the potential to significantly improve mapping of the landscape. Small-footprint LiDAR and aerial imaging systems can achieve very high spatial, spectral and textural resolutions and are suitable for site-specific landscape mapping. However, the direct relationship between spatial resolution and landscape classification needs to be considered before applying any fusion model.

The forest study area is in the Moira State Forest, which forms part of the Barmah and Millewa forests near Mathoura, New South Wales, on a 1.25 km x 1 km site consisting of native Eucalyptus forest. The semi-urban study area, also 1.25 km x 1 km, is in Mathoura township itself and contains typical semi-urban landscape objects such as residential and commercial buildings, open spaces, roads and gardens.

Geometric correction of multi-source data is a prerequisite for any data fusion study. Variations in sensor altitude, attitude and calibration affect the quality of the fusion results.
A parametric rectification model was implemented to correct these influences in high spatial resolution aerial imagery. The aerial triangulation-derived RMSE values for colour and multispectral imagery are between 0.70 and 1.17 microns. These results meet the established benchmark of 1 micron, or one-quarter of a pixel, and provide a sound geometric basis for multi-source data fusion. The original 0.88 m spatial-resolution aerial multispectral imagery is resampled to 0.5 m resolution for uniformity with the LiDAR data. The processed LiDAR data do not require orthorectification; however, an nDSM needs to be generated from the LiDAR source for use in the fusion process. The nDSM represents the mean height of landscape objects and is computed as the difference between the first and last LiDAR returns. LiDAR-derived object heights are compared with the field-surveyed data to check height accuracy. The LiDAR height estimates were generally within one metre of field measurements, although discrepancies as high as 3 m were observed.

An extensive field survey was conducted to collect training data and gather reference data for thematic accuracy assessment. A complex sampling strategy, combining random and systematic sampling techniques, was developed to provide a good balance between statistical validity and practical application. For thematic accuracy assessment, error matrices were generated by comparing reference pixel data derived from field survey and aerial photo interpretation with the corresponding pixels of the fusion results. Overall thematic accuracy as well as User's and Producer's accuracies were computed to measure the success of the fusion models. Kappa analysis tested the significance of each matrix and determined whether the results presented in the error matrix were significantly better than a random result.

For the forest study area, data-driven pixel-level fusion was implemented using an unsupervised model.
In feature-level fusion, a watershed transformation algorithm was utilised for delineating tree crowns, a masking technique was used for collecting tree feature attributes, and finally an unsupervised model was applied for delineating tree species. Thematic accuracies of the pixel- and feature-level fusion results were assessed through error matrices derived from the fusion results and field-surveyed reference data. For pixel-level fusion, the overall thematic accuracy was 64.67 percent and the Kappa Coefficient was 0.52, indicating moderate agreement between the fusion results and the reference data. For feature-level fusion, the overall thematic accuracy was 86.33 percent and the Kappa Coefficient was 0.82, which, being close to 1, indicated substantial agreement between the fusion results and the reference measurements. Statistical comparison of the pixel- and feature-level fusion results gave a standard normal deviate (Z statistic) of 6.68 for the Kappa Coefficients; at the 95 percent confidence level, the results were significantly better than a random result, and feature-level fusion achieved better results than pixel-level fusion. Segmentation and subsequent feature classification using a data-driven model thus provided superior results to pixel-level fusion.

In a user-driven scenario, a hierarchical landscape classification scheme was developed for the delineation of semi-urban landscape objects using pixel- and feature-level fusions. The 4-band multispectral imagery and the LiDAR-derived nDSM have incompatible statistics and cannot be represented by a normal class model; as a result, statistical methods of supervised fusion were not considered. Pixel-level fusion utilises the supervised parallelepiped technique to fuse these datasets. In feature-level fusion, features are delineated through multi-resolution segmentation and subsequently classified using knowledge-driven rules.
The spectral, spatial and contextual properties of the features are utilised to develop these knowledge rules. For the semi-urban study area, the thematic accuracy of the pixel-level fusion results was 73.25 percent with a Kappa Coefficient of 0.67. There is thus substantial agreement between the pixel-level fusion results and the reference data. By contrast, feature-level fusion of the same datasets gave an overall accuracy of 88.38 percent and a Kappa Coefficient of 0.86, indicating excellent agreement between the fusion results and the reference data. Comparison of the fusion results indicated a significant difference between the pixel- and feature-level fusion results. At the 95 percent confidence level, the standard normal deviate (Z statistic) of the Kappa Coefficients for pixel- and feature-level fusions using multispectral imagery with LiDAR data was 3.70, well above the critical value of 1.96. This indicates a significantly better than random result and shows that feature-level fusion performs better than pixel-level fusion. Particularly in the delineation of Shadow classes, the User's and Producer's Accuracies for the feature-level fusion results were substantially better than those for the pixel-level fusion results.

With high spatial resolution data, the intra-class variability was also high, meaning that the 'pepper and salt' effect was widespread in the pixel-level fusion results for both study areas. In particular, the pixel-level fusion in the forest study area did not clearly delineate individual trees when they were clumped. It also suffered from a mixed-pixel effect for individual trees, showing multiple tree species within a single canopy. Feature-level fusion overcame this problem by defining the tree canopy areas first, then extracting feature attributes from the segments, and finally identifying the tree species using unsupervised feature classification.
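The accuracy measures and Kappa comparison reported above can be illustrated with a minimal Python sketch. It derives overall, User's and Producer's accuracies and the Kappa Coefficient from an error matrix, then a Z statistic for the difference between two independent Kappa values. The two matrices are invented toy values rather than the thesis data, the function names are hypothetical, and the variance uses a simple large-sample approximation, which is an assumption because the abstract does not state which variance estimator was used.

```python
import math


def accuracy_metrics(matrix):
    """Summarise a square error matrix.

    matrix[i][j] = number of pixels mapped as class i whose reference class is j.
    """
    k = len(matrix)
    n = sum(sum(row) for row in matrix)
    row_totals = [sum(row) for row in matrix]                    # mapped totals
    col_totals = [sum(matrix[i][j] for i in range(k)) for j in range(k)]  # reference totals

    p_o = sum(matrix[i][i] for i in range(k)) / n                # observed agreement
    p_e = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2  # chance agreement
    kappa = (p_o - p_e) / (1 - p_e)
    # Assumption: simple large-sample approximation of the Kappa variance.
    kappa_var = p_o * (1 - p_o) / (n * (1 - p_e) ** 2)

    return {
        "overall": p_o,
        "users": [matrix[i][i] / row_totals[i] for i in range(k)],
        "producers": [matrix[j][j] / col_totals[j] for j in range(k)],
        "kappa": kappa,
        "kappa_var": kappa_var,
    }


def kappa_z(a, b):
    """Z statistic for the difference between two independent Kappa values."""
    return abs(a["kappa"] - b["kappa"]) / math.sqrt(a["kappa_var"] + b["kappa_var"])


# Toy three-class error matrices (illustrative only, not the thesis results).
pixel_matrix = [[30, 10, 10], [10, 30, 10], [10, 10, 30]]
feature_matrix = [[45, 3, 2], [3, 45, 2], [2, 2, 46]]

pixel_level = accuracy_metrics(pixel_matrix)
feature_level = accuracy_metrics(feature_matrix)
z = kappa_z(pixel_level, feature_level)

print(f"pixel-level   overall={pixel_level['overall']:.2f} kappa={pixel_level['kappa']:.2f}")
print(f"feature-level overall={feature_level['overall']:.2f} kappa={feature_level['kappa']:.2f}")
print(f"Z={z:.2f}, significant at 95 percent: {z > 1.96}")
```

A Z value above 1.96 indicates that the two Kappa values differ significantly at the 95 percent confidence level.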
Replacing multispectral imagery with colour imagery made little difference to the feature-level fusion results, but in pixel-level fusion the multispectral imagery, with its higher radiometric depth (16-bit), led to less misclassification than the colour imagery (8-bit). The exclusion of the LiDAR data greatly reduced the quality of the fusion results. The differences in fusion accuracy between including and excluding LiDAR are small in pixel-level fusion but large in feature-level fusion. Objects of the same class can have conflicting spectral properties, but the inclusion of LiDAR-derived height resolves these conflicts and plays a vital role in landscape mapping. The research has led to the conclusion that feature-level fusions perform better in classifying landscape objects than pixel-level fusions in both data- and user-driven scenarios.
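The role of LiDAR-derived height in pixel-level fusion can be sketched in a few lines of Python. The example computes an nDSM as the difference between first and last returns, as described above, and then applies a simple parallelepiped (box) classifier with and without the height band. All band values, class names and function names are hypothetical, clamping negative heights to zero is an assumption, and flagging overlapping boxes as "ambiguous" is one possible design choice rather than the procedure used in the thesis.

```python
def ndsm(first_return, last_return):
    """Object height grid: first-return minus last-return elevation.

    Assumption: negative differences are clamped to zero.
    """
    return [
        [max(f - l, 0.0) for f, l in zip(first_row, last_row)]
        for first_row, last_row in zip(first_return, last_return)
    ]


def train_boxes(samples):
    """Build per-class parallelepipeds as (min, max) bounds for each band."""
    boxes = {}
    for label, vectors in samples.items():
        mins = [min(band) for band in zip(*vectors)]
        maxs = [max(band) for band in zip(*vectors)]
        boxes[label] = (mins, maxs)
    return boxes


def classify(pixel, boxes):
    """Assign a pixel to the single box that contains it, if there is one."""
    matches = [
        label
        for label, (mins, maxs) in boxes.items()
        if all(lo <= v <= hi for v, lo, hi in zip(pixel, mins, maxs))
    ]
    if len(matches) == 1:
        return matches[0]
    return "ambiguous" if matches else "unclassified"


# Hypothetical training samples: (red, near-infrared) with and without height in metres.
spectral_samples = {
    "road": [(90, 60), (100, 70)],
    "building": [(95, 65), (105, 72)],
}
fused_samples = {
    "road": [(90, 60, 0.0), (100, 70, 0.2)],
    "building": [(95, 65, 3.0), (105, 72, 6.0)],
}
spectral_boxes = train_boxes(spectral_samples)
fused_boxes = train_boxes(fused_samples)

# Height of one pixel from toy first- and last-return elevations (metres).
height = ndsm([[12.0, 5.0]], [[7.5, 5.0]])[0][0]

print(classify((98, 66), spectral_boxes))        # spectral bands only
print(classify((98, 66, height), fused_boxes))   # spectral bands plus nDSM height
```

In this toy case the two classes overlap in the spectral bands, so the pixel falls inside both boxes until the height band separates them.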

Keywords: Fusion, LiDAR, aerial photo, object-oriented classification, watershed segmentation
Subject: Environmental Studies thesis

Thesis type: Doctor of Philosophy
Completed: 2011
School: School of the Environment
Supervisor: Dr Paul Dare