Commonsense Knowledge Assisted Deep Learning with Application to Size-Related Fine-Grained Object Detection

Pu Zhang; Bin Liu

This paper addresses fine-grained object detection in scenarios with limited computing resources, such as edge computing. In particular, we focus on the case where a single image contains objects of the same category but of varying sizes, and we seek an algorithm that can recognize both the physical class of each object and its size. Deep learning (DL), particularly through deep neural networks (DNNs), has become the primary approach to object detection. However, accurate fine-grained detection requires a large DNN model and a significant amount of annotated data, which makes the problem difficult to solve in resource-constrained scenarios. To this end, we propose an approach that uses commonsense knowledge to help a coarse-grained object detector produce accurate size-related fine-grained detection results. Specifically, we introduce a commonsense knowledge inference module (CKIM) that processes the coarse-grained labels produced by a benchmark coarse-grained DL detector to generate size-related fine-grained labels. Our CKIM explores both crisp-rule and fuzzy-rule based inference methods, with the latter employed to handle ambiguity in the target size-related labels. We implement our method on top of two modern DL detectors, Mobilenet-SSD and YOLOv7-tiny. Experimental results demonstrate that our approach achieves accurate fine-grained detection with less annotated data and a smaller model size. Our code is available at https://github.com/ZJLAB-AMMI/CKIM.
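To make the difference between crisp-rule and fuzzy-rule inference concrete, the following is a minimal sketch of how a post-processing module might turn a coarse-grained label into a size-related fine-grained one. It is not the authors' CKIM: the abstract does not state the actual rules, so the use of bounding-box area as the size cue, the sigmoid membership function, and every name here (`Detection`, `crisp_size_label`, `fuzzy_size_label`, `area_threshold`, `softness`) are assumptions made only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical output of a coarse-grained detector."""
    label: str     # coarse-grained class, e.g. "cup"
    width: float   # bounding-box width in pixels
    height: float  # bounding-box height in pixels


def crisp_size_label(det: Detection, area_threshold: float) -> str:
    """Crisp rule (assumed): a hard threshold on box area picks the size."""
    area = det.width * det.height
    size = "large" if area >= area_threshold else "small"
    return f"{size}_{det.label}"


def fuzzy_size_label(det: Detection, area_threshold: float,
                     softness: float) -> tuple[str, float]:
    """Fuzzy rule (assumed): a sigmoid gives a degree of 'large' in (0, 1).

    Boxes near the threshold get a membership near 0.5, which exposes the
    ambiguity that a hard threshold hides.
    """
    area = det.width * det.height
    mu_large = 1.0 / (1.0 + math.exp(-(area - area_threshold) / softness))
    size = "large" if mu_large >= 0.5 else "small"
    return f"{size}_{det.label}", mu_large


if __name__ == "__main__":
    det = Detection(label="cup", width=40.0, height=50.0)
    print(crisp_size_label(det, area_threshold=1500.0))
    print(fuzzy_size_label(det, area_threshold=1500.0, softness=500.0))
```

In this sketch both rules agree on the final label, but the fuzzy variant also returns a membership degree that a downstream step could use to flag borderline boxes. The repository linked above is the place to look for the rules the paper actually uses.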

updated: Thu Jun 08 2023 03:14:27 GMT+0000 (UTC)

published: Thu Mar 16 2023 01:39:11 GMT+0000 (UTC)

arXiv
