Learning concepts that are consistent with human perception is important for Deep Neural Networks to win end-user trust. Post-hoc interpretation methods lack transparency in the feature representations learned by the models. This work proposes a guided learning approach with an additional concept layer in a CNN- based architecture to learn the associations between visual features and word phrases. We design an objective function that optimizes both prediction accuracy and semantics of the learned feature representations. Experiment results demonstrate that the proposed model can learn concepts that are consistent with human perception and their corresponding contributions to the model decision without compromising accuracy. Further, these learned concepts are transferable to new classes of objects that have similar concepts.
updated: Mon May 24 2021 16:09:35 GMT+0000 (UTC)
published: Mon Jan 11 2021 14:35:16 GMT+0000 (UTC)