SRRM: Semantic Region Relation Model for Indoor Scene Recognition

Chuanxin Song; Xin Ma

Despite the remarkable success of convolutional neural networks in various computer vision tasks, recognizing indoor scenes remains a significant challenge because of their complex composition. Effectively leveraging the semantic information in a scene has therefore become a key issue in advancing indoor scene recognition. Unfortunately, the limited accuracy of semantic segmentation has constrained the effectiveness of existing approaches that use semantic information. As a result, many of these approaches remain at the stage of auxiliary labeling or co-occurrence statistics, and few directly explore the contextual relationships between semantic elements within the scene. In this paper, we propose the Semantic Region Relation Model (SRRM), which starts directly from the semantic information inside the scene. Specifically, SRRM adopts an adaptive and efficient approach to mitigate the negative impact of semantic ambiguity and then models the relationships between semantic regions to perform scene recognition. Additionally, to exploit the information contained in the scene more comprehensively, we combine the proposed SRRM with the PlacesCNN module to create the Combined Semantic Region Relation Model (CSRRM), and propose a novel information-combining approach to effectively explore the complementary content between them. CSRRM significantly outperforms state-of-the-art methods on MIT Indoor 67, the reduced Places365 dataset, and SUN RGB-D without retraining. The code is available at: https://github.com/ChuanxinSong/SRRM

updated: Mon May 15 2023 11:11:11 GMT+0000 (UTC)

published: Mon May 15 2023 11:11:11 GMT+0000 (UTC)

arXiv
