Carton dataset synthesis based on foreground texture replacement

Lijun Gou; Shengkai Wu; Jinrong Yang; Hangcheng Yu; Chenxi Lin; Xiaoping Li; Chao Deng

One major impediment to rapidly deploying object detection models in industrial applications is the lack of large annotated datasets. We previously presented the Stacked Carton Dataset (SCD), which contains carton images from three scenarios: a comprehensive pharmaceutical logistics company (CPLC), an e-commerce logistics company (ECLC), and a fruit market (FM). However, due to domain shift, a model trained on carton data from one of the three SCD scenarios generalizes poorly when applied to the other two. To solve this problem, we propose a novel image synthesis method that replaces the foreground texture of the source datasets with the foreground instance texture of the target datasets. This method can greatly augment the target datasets and improve the model's performance. First, we propose a surface segmentation algorithm to identify the different surfaces of each carton instance. Second, we propose a contour reconstruction algorithm to handle occlusion, truncation, and incomplete contours of carton instances. Finally, a Gaussian fusion algorithm fuses the background from the source datasets with the foreground from the target datasets. On the target domain, the proposed method boosts AP by 4.3% to 6.5% on RetinaNet and by 3.4% to 6.8% on Faster R-CNN. On the source domain, it improves AP by 1.7% to 2% on RetinaNet and by 0.9% to 1.5% on Faster R-CNN. Code is available at https://github.com/hustgetlijun/RCAN.
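The abstract does not spell out how the Gaussian fusion step works, so the following is only a minimal sketch of one common reading: the binary foreground mask is softened with a Gaussian blur and then used as an alpha map to blend the foreground into the background. The function names (`gaussian_kernel`, `blur_mask`, `gaussian_fuse`) and the `sigma` parameter are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def gaussian_kernel(sigma: float) -> np.ndarray:
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return kernel / kernel.sum()


def blur_mask(mask: np.ndarray, sigma: float) -> np.ndarray:
    """Blur a 2-D mask with a separable Gaussian, using edge padding."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(mask.astype(np.float64), radius, mode="edge")
    # Filter along rows first, then along columns; "valid" mode removes
    # the padding again, so the result has the same shape as the input.
    rows = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 1, padded
    )
    both = np.apply_along_axis(
        lambda v: np.convolve(v, kernel, mode="valid"), 0, rows
    )
    return np.clip(both, 0.0, 1.0)


def gaussian_fuse(foreground, background, mask, sigma=2.0):
    """Blend foreground over background with a Gaussian-softened mask.

    Assumption: foreground and background are H x W x C arrays of the
    same shape, and mask is an H x W array with 1 inside the instance
    and 0 elsewhere. This is an illustrative alpha blend, not the
    paper's implementation.
    """
    alpha = blur_mask(mask, sigma)[..., None]  # H x W x 1 for broadcasting
    return alpha * foreground + (1.0 - alpha) * background


if __name__ == "__main__":
    fg = np.full((16, 16, 3), 200.0)   # stand-in for target-domain texture
    bg = np.full((16, 16, 3), 50.0)    # stand-in for source-domain background
    mask = np.zeros((16, 16))
    mask[4:12, 4:12] = 1.0             # square "carton" region
    fused = gaussian_fuse(fg, bg, mask, sigma=1.0)
    print(fused.shape, fused[0, 0, 0], fused[8, 8, 0])
```

In this sketch, pixels deep inside the mask keep the foreground value, pixels far outside keep the background value, and pixels near the boundary take intermediate values, which is what softens the seam between the two sources.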

updated: Thu Mar 25 2021 14:00:28 GMT+0000 (UTC)

published: Fri Mar 19 2021 11:21:31 GMT+0000 (UTC)

arXiv
