Scalable Reverse Image Search Engine for NASA Worldview

Abhigya Sodani; Michael Levy; Anirudh Koul; Meher Anand Kasam; Siddha Ganju

Researchers often spend weeks sifting through decades of unlabeled satellite imagery (on NASA Worldview) to develop datasets on which they can start conducting research. We developed an interactive, scalable, and fast image similarity search engine (which can take one or more images as the query) that automatically sifts through the unlabeled dataset, reducing dataset generation time from weeks to minutes. In this work, we describe key components of the end-to-end pipeline. Our similarity search system was built to identify images similar to an input image in a potentially petabyte-scale database. To do this, we break down each query image into its features, which are generated by a CNN trained in a supervised manner and then stripped of its classification layer. To store and search these features efficiently, we had to make several scalability improvements. To improve the speed, reduce the storage, and shrink the memory requirements of embedding search, we add a fully connected layer to our CNN that maps every image to a vector of length 128 before the classification layers. This compressed our image features from 2048 dimensions (for ResNet, which we initially tried as our featurizer) to 128 for our new custom model. Additionally, we use existing approximate nearest neighbor search libraries to significantly speed up embedding search. Our system currently searches our entire database of images in 5 seconds per query on a single virtual machine in the cloud. In the future, we would like to incorporate a SimCLR-based featurizing model, which could be trained without any human labeling (since the classification aspect of the model is irrelevant to this use case).
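As a rough illustration of the embedding search described in the abstract, the following Python sketch stores one 128-length vector per image and ranks images by cosine similarity to a query. It is not the authors' implementation: the function names, the `tile_*` image identifiers, the float32 storage assumption, the random stand-in embeddings, and the choice to average several query embeddings into one are all hypothetical. The exact linear scan is used only for clarity; the abstract states that an approximate nearest neighbor library is used at scale, without naming which one.

```python
import math
import random

EMBED_DIM = 128        # embedding length stated in the abstract
RESNET_DIM = 2048      # ResNet feature length stated in the abstract
BYTES_PER_FLOAT32 = 4  # assumption: features are stored as float32


def storage_bytes(num_images, dim, bytes_per_value=BYTES_PER_FLOAT32):
    """Raw bytes needed to store one dim-length vector per image."""
    return num_images * dim * bytes_per_value


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mean_vector(vectors):
    """Average several query embeddings (an assumed way to combine queries)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def search(index, query_vectors, k=5):
    """Return the k (image_id, similarity) pairs closest to the query images.

    This is an exact linear scan over the index; an approximate nearest
    neighbor library would replace this loop for a large database.
    """
    query = normalize(mean_vector([normalize(q) for q in query_vectors]))
    scored = [
        (image_id, sum(a * b for a, b in zip(query, vec)))
        for image_id, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_demo_index(num_images=1000, seed=0):
    """Build a hypothetical index of random unit vectors standing in for CNN features."""
    rng = random.Random(seed)
    return {
        f"tile_{i}": normalize([rng.gauss(0, 1) for _ in range(EMBED_DIM)])
        for i in range(num_images)
    }


if __name__ == "__main__":
    # Per-image storage shrinks by the ratio of the two feature lengths.
    print(storage_bytes(1, RESNET_DIM) // storage_bytes(1, EMBED_DIM))
    index = build_demo_index()
    for image_id, similarity in search(index, [index["tile_42"]], k=3):
        print(image_id, round(similarity, 3))
```

Under these assumptions, moving from 2048 to 128 values per image cuts raw feature storage by a factor of 16, which also reduces the work done per comparison during search.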

updated: Tue Aug 10 2021 07:03:00 GMT+0000 (UTC)

published: Tue Aug 10 2021 07:03:00 GMT+0000 (UTC)

arXiv
