With the advancement of remote-sensed imaging large volumes of very high resolution land cover images can now be obtained. Automation of object recognition in these 2D images, however, is still a key issue. High intra-class variance and low inter-class variance in Very High Resolution (VHR) images hamper the accuracy of prediction in object recognition tasks. Most successful techniques in various computer vision tasks recently are based on deep supervised learning. In this work, a deep Convolutional Neural Network (CNN) based on symmetric encoder-decoder architecture with skip connections is employed for the 2D semantic segmentation of most common land cover object classes - impervious surface, buildings, low vegetation, trees and cars. Atrous convolutions are employed to have large receptive field in the proposed CNN model. Further, the CNN outputs are post-processed using Fully Connected Conditional Random Field (FCRF) model to refine the CNN pixel label predictions. The proposed CNN-FCRF model achieves an overall accuracy of 90.5% on the ISPRS Vaihingen Dataset.
updated: Mon Oct 14 2019 11:22:18 GMT+0000 (UTC)
published: Mon Oct 14 2019 11:22:18 GMT+0000 (UTC)