Line Segment Detection Using Transformers without Edges

Yifan Xu; Weijian Xu; David Cheung; Zhuowen Tu

In this paper, we present a joint end-to-end line segment detection algorithm using Transformers that requires neither post-processing nor heuristics-guided intermediate processing (edge/junction/region detection). Our method, named LinE segment TRansformers (LETR), takes advantage of the tokenized queries, self-attention mechanism, and encoding-decoding strategy integrated within Transformers, skipping the standard heuristic designs for edge element detection and perceptual grouping. We equip Transformers with a multi-scale encoder/decoder strategy to perform fine-grained line segment detection under a direct endpoint distance loss. This loss term is particularly suitable for detecting geometric structures such as line segments, which are not conveniently represented by standard bounding box representations. The Transformers learn to gradually refine line segments through layers of self-attention. In our experiments, we show state-of-the-art results on the Wireframe and YorkUrban benchmarks.
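To make the idea of an endpoint distance concrete, the following is a minimal sketch, not the authors' implementation. It assumes a segment is stored as a 4-tuple `(x1, y1, x2, y2)` and that the distance is a plain L1 sum over corresponding endpoint coordinates. The helper names `endpoint_l1_distance` and `bounding_box` are hypothetical, and the abstract does not specify the exact norm or how predictions are matched to targets.

```python
from typing import Sequence, Tuple

# Assumption: a segment is (x1, y1, x2, y2); the abstract does not fix a format.
Segment = Sequence[float]


def endpoint_l1_distance(pred: Segment, target: Segment) -> float:
    """Sum of absolute differences between corresponding endpoint coordinates.

    Illustrative L1 form only; the norm used in the paper is not stated
    in the abstract.
    """
    if len(pred) != 4 or len(target) != 4:
        raise ValueError("segments must have exactly four coordinates")
    return sum(abs(p - t) for p, t in zip(pred, target))


def bounding_box(seg: Segment) -> Tuple[float, float, float, float]:
    """Axis-aligned box (xmin, ymin, xmax, ymax) enclosing a segment."""
    x1, y1, x2, y2 = seg
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


# Two diagonals of the unit square: different segments, identical boxes.
rising = (0.0, 0.0, 1.0, 1.0)
falling = (0.0, 1.0, 1.0, 0.0)

same_box = bounding_box(rising) == bounding_box(falling)
distance = endpoint_l1_distance(rising, falling)
```

Here `same_box` is true while `distance` is nonzero: a bounding box cannot tell the two diagonals apart, whereas a distance defined directly on endpoints can. This is one way to read the abstract's remark that line segments are not conveniently represented by bounding boxes.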

updated: Fri Apr 30 2021 17:34:55 GMT+0000 (UTC)

published: Wed Jan 06 2021 08:00:18 GMT+0000 (UTC)

arXiv
