Efficient Classification of Very Large Images with Tiny Objects

Fanjie Kong; Ricardo Henao

An increasing number of applications in computer vision, especially in medical imaging and remote sensing, become challenging when the goal is to classify very large images with tiny informative objects. Specifically, these classification tasks face two key challenges: i) the size of the input image is usually on the order of megapixels or gigapixels, but existing deep architectures do not easily operate on such large images due to memory constraints, so we seek a memory-efficient method to process these images; and ii) only a very small fraction of each input image is informative of the label of interest, resulting in a low region of interest (ROI) to image ratio. However, most current convolutional neural networks (CNNs) are designed for image classification datasets that have relatively large ROIs and small image sizes (sub-megapixel). Existing approaches have addressed these two challenges in isolation. We present an end-to-end CNN model, termed Zoom-In network, that leverages hierarchical attention sampling to classify large images with tiny objects using a single GPU. We evaluate our method on four large-image histopathology, road-scene and satellite imaging datasets, and one gigapixel pathology dataset. Experimental results show that our model achieves higher accuracy than existing methods while requiring fewer memory resources.
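
To make the two challenges concrete, the following back-of-the-envelope sketch estimates the memory taken by the raw input tensor alone and the ROI-to-image ratio for a tiny object. It is only an illustration: the image sizes, the 3-channel float32 layout, the 64x64 object size, and the helper names `input_tensor_bytes` and `roi_ratio` are assumptions chosen for the arithmetic, not values or code from the paper.

```python
GIB = 1024 ** 3  # bytes in one gibibyte


def input_tensor_bytes(width, height, channels=3, bytes_per_value=4):
    """Bytes needed to hold one image as a dense tensor (float32 by default)."""
    return width * height * channels * bytes_per_value


def roi_ratio(roi_pixels, width, height):
    """Fraction of the image area covered by the region of interest."""
    return roi_pixels / (width * height)


# A typical sub-megapixel classification input (assumed 224 x 224).
small = input_tensor_bytes(224, 224)

# A hypothetical gigapixel image (32768 x 32768 = 2**30 pixels).
giga = input_tensor_bytes(32768, 32768)

# A hypothetical 64 x 64 pixel object inside that gigapixel image.
ratio = roi_ratio(64 * 64, 32768, 32768)

print(f"224 x 224 input:     {small / 1024:.0f} KiB")
print(f"32768 x 32768 input: {giga / GIB:.0f} GiB")
print(f"ROI-to-image ratio:  {ratio:.2e}")
```

Under these assumptions the input tensor alone occupies 12 GiB before any intermediate activations are stored, which suggests why processing only a sampled subset of high-resolution regions is attractive on a single GPU.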

updated: Fri Dec 03 2021 20:45:50 GMT+0000 (UTC)

published: Fri Jun 04 2021 20:13:04 GMT+0000 (UTC)

arXiv
