Deep neural networks can be effective means to automatically classify aerial images but is easy to overfit to the training data. It is critical for trained neural networks to be robust to variations that exist between training and test environments. To address the overfitting problem in aerial image classification, we consider the neural network as successive transformations of an input image into embedded feature representations and ultimately into a semantic class label, and train neural networks to optimize image representations in the embedded space in addition to optimizing the final classification score. We demonstrate that networks trained with this dual embedding and classification loss outperform networks with classification loss only. %We also study placing the embedding loss on different network layers. We also find that moving the embedding loss from commonly-used feature space to the classifier space, which is the space just before softmax nonlinearization, leads to the best classification performance for aerial images. Visualizations of the network's embedded representations reveal that the embedding loss encourages greater separation between target class clusters for both training and testing partitions of two aerial image classification benchmark datasets, MSTAR and AID. Our code is publicly available on GitHub.
updated: Tue Sep 24 2019 05:57:53 GMT+0000 (UTC)
published: Tue Dec 05 2017 07:45:18 GMT+0000 (UTC)