SPAN: Spatial Pyramid Attention Network for Image Manipulation Localization

Xuefeng Hu; Zhihan Zhang; Zhenye Jiang; Syomantak Chaudhuri; Zhenheng Yang; Ram Nevatia

We present a novel framework, Spatial Pyramid Attention Network (SPAN), for detection and localization of multiple types of image manipulations. The proposed architecture efficiently and effectively models the relationship between image patches at multiple scales by constructing a pyramid of local self-attention blocks. The design includes a novel position projection to encode the spatial positions of the patches. SPAN is trained on a generic, synthetic dataset but can also be fine-tuned for specific datasets. The proposed method shows significant gains in performance on standard datasets over previous state-of-the-art methods.
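
To make the idea of a pyramid of local self-attention blocks concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes a 3x3 neighborhood per block, zero padding at the borders, residual connections between blocks, and illustrative dilations of 1, 3 and 9. The per-offset key matrices `wk[n]` are only a stand-in for the position projection mentioned in the abstract, and the names `local_self_attention` and `pyramid` are hypothetical.

```python
import numpy as np

# Relative positions of the 3x3 neighborhood (assumed size, for illustration).
OFFSETS = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def local_self_attention(x, wq, wk, wv, dilation):
    """Attend from every cell of x (H, W, D) to its 9 dilated neighbors.

    wq, wv: (D, D) projections. wk: (9, D, D), one key projection per
    relative offset (an assumed stand-in for a position projection).
    """
    h, w, d = x.shape
    # Zero-pad so that every neighbor lookup stays inside the array.
    xp = np.pad(x, ((dilation, dilation), (dilation, dilation), (0, 0)))
    q = x @ wq
    scores, values = [], []
    for n, (dy, dx) in enumerate(OFFSETS):
        y0 = dilation * (1 + dy)
        x0 = dilation * (1 + dx)
        nb = xp[y0:y0 + h, x0:x0 + w]            # neighbor features, (H, W, D)
        k = nb @ wk[n]
        scores.append((q * k).sum(axis=-1) / np.sqrt(d))
        values.append(nb @ wv)
    s = np.stack(scores, axis=-1)                # (H, W, 9)
    s = s - s.max(axis=-1, keepdims=True)        # numerically stable softmax
    a = np.exp(s)
    a = a / a.sum(axis=-1, keepdims=True)
    v = np.stack(values, axis=-2)                # (H, W, 9, D)
    return (a[..., None] * v).sum(axis=-2)       # (H, W, D)


def pyramid(x, params, dilations=(1, 3, 9)):
    """Stack blocks with growing dilation so later blocks see wider context."""
    for p, dil in zip(params, dilations):
        x = x + local_self_attention(x, p["wq"], p["wk"], p["wv"], dil)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    feats = rng.normal(size=(16, 16, d))         # stand-in patch features
    params = [
        {
            "wq": rng.normal(size=(d, d)) * 0.1,
            "wk": rng.normal(size=(9, d, d)) * 0.1,
            "wv": rng.normal(size=(d, d)) * 0.1,
        }
        for _ in range(3)
    ]
    print(pyramid(feats, params).shape)
```

Each block only compares a cell with nine neighbors, so the cost grows linearly with the number of cells, while increasing the dilation from block to block widens the region that influences each output.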

updated: Thu Jan 14 2021 01:43:21 GMT+0000 (UTC)

published: Tue Sep 01 2020 21:59:35 GMT+0000 (UTC)

arXiv
