Content and Context Features for Scene Image Representation

Chiranjibi Sitaula; Sunil Aryal; Yong Xiang; Anish Basnet; Xuequan Lu

Existing research in scene image classification has focused on either content features (e.g., visual information) or context features (e.g., annotations). Because the two capture different, potentially complementary information that helps discriminate images of different classes, we hypothesize that fusing them will improve classification results. In this paper, we propose new techniques to compute content features and context features, and then fuse them. For content features, we design multi-scale deep features based on background and foreground information in images. For context features, we use annotations of similar images available on the web to design a codebook of filter words. Our experiments on three widely used benchmark scene datasets, using a support vector machine classifier, reveal that our proposed context and content features produce better results than existing context and content features, respectively. The fusion of the two proposed types of features significantly outperforms numerous state-of-the-art features.
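The following is a minimal illustrative sketch of the general idea, not the authors' implementation. The abstract does not state how the codebook is applied or how the two feature types are fused, so this sketch assumes a simple word-count histogram over the codebook and fusion by L2-normalizing each feature vector and concatenating them. All function names, the toy codebook, and the toy values are hypothetical.

```python
import math
from collections import Counter


def context_feature(annotation_tokens, codebook):
    """Count how often each codebook word occurs in the annotation tokens.

    Assumption: a plain count histogram; the paper may weight words differently.
    """
    counts = Counter(token.lower() for token in annotation_tokens)
    return [float(counts[word]) for word in codebook]


def l2_normalize(vec):
    """Scale a vector to unit L2 norm; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fuse(content_vec, context_vec):
    """Assumed fusion: normalize each part, then concatenate."""
    return l2_normalize(content_vec) + l2_normalize(context_vec)


# Toy inputs (hypothetical): a 2-D stand-in for deep content features and
# annotation words gathered from similar web images.
codebook = ["beach", "sand", "kitchen", "stove"]
annotations = ["Sand", "beach", "sea", "beach"]
content = [3.0, 4.0]

context = context_feature(annotations, codebook)
fused = fuse(content, context)
```

The fused vector could then be passed to any classifier, such as the support vector machine mentioned in the abstract. Normalizing each part before concatenation keeps one feature type from dominating purely because of its scale.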

updated: Sat Apr 24 2021 05:37:31 GMT+0000 (UTC)

published: Fri Jun 05 2020 03:19:13 GMT+0000 (UTC)

arXiv
