Aerial scene classification, which aims to semantically label remote sensing images in a set of predefined classes (e.g., agricultural, beach, and harbor), is a very challenging task in remote sensing due to high intra-class variability and the different scales and orientations of the objects present in the dataset images. In remote sensing area, the use of CNN architectures as an alternative solution is also a reality for scene classification tasks. Generally, these CNNs are used to perform the traditional image classification task. However, another less used way to classify remote sensing image might be the one that uses deep metric learning (DML) approaches. In this sense, this work proposes to employ six DML approaches for aerial scene classification tasks, analysing their behave with four different pre-trained CNNs as well as combining them through the use of evolutionary computation algorithm (UMDA). In performed experiments, it is possible to observe than DML approaches can achieve the best classification results when compared to traditional pre-trained CNNs for three well-known remote sensing aerial scene datasets. In addition, the UMDA algorithm proved to be a promising strategy to combine DML approaches when there is diversity among them, managing to improve at least 5.6% of accuracy in the classification results using almost 50% of the available classifiers for the construction of the final ensemble of classifiers.