Improving Translation Invariance in Convolutional Neural Networks with Peripheral Prediction Padding

Kensuke Mukai; Takao Yamanaka

Zero padding is commonly used in convolutional neural networks to keep the feature map from shrinking at each layer. However, recent studies have shown that zero padding promotes the encoding of absolute positional information, which may adversely affect performance on some tasks. This work proposes a novel padding method, Peripheral Prediction Padding (PP-Pad), which replaces zero padding and enables end-to-end training of padding values suited to each task. It also presents novel metrics for quantitatively evaluating a model's translation invariance. Evaluation with these metrics confirmed that the proposed method achieves higher accuracy and translation invariance than previous methods on a semantic segmentation task.
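To make the abstract's point about zero padding concrete, here is a minimal 1-D sketch in plain Python. It is not the authors' PP-Pad implementation: the helper names (`conv1d_valid`, `zero_pad`, `predicted_pad`) and the fixed weights `[0.5, 0.5]` are assumptions made for illustration. In a trainable setting such weights would presumably be learned end-to-end with the rest of the network, whereas here they are hard-coded.

```python
# Illustrative sketch only; not the authors' PP-Pad implementation.

def conv1d_valid(signal, kernel):
    """Plain 1-D cross-correlation without padding (the output shrinks)."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]


def zero_pad(signal, width):
    """Conventional zero padding on both sides."""
    return [0.0] * width + list(signal) + [0.0] * width


def predicted_pad(signal, weights):
    """Pad one value per side, predicted from the nearest border values.

    `weights` is a hypothetical, fixed set of coefficients; weights[0]
    applies to the value closest to the border on each side.
    """
    n = len(weights)
    left = sum(w * v for w, v in zip(weights, signal[:n]))
    right = sum(w * v for w, v in zip(weights, reversed(signal[-n:])))
    return [left] + list(signal) + [right]


kernel = [1.0, 1.0, 1.0]          # 3-tap box filter
flat = [2.0, 2.0, 2.0, 2.0, 2.0]  # constant input: carries no positional content

out_zero = conv1d_valid(zero_pad(flat, 1), kernel)
out_pred = conv1d_valid(predicted_pad(flat, [0.5, 0.5]), kernel)

print(out_zero)  # [4.0, 6.0, 6.0, 6.0, 4.0] -> the borders are distinguishable
print(out_pred)  # [6.0, 6.0, 6.0, 6.0, 6.0] -> response is uniform
```

With zero padding, the output of a constant input differs at the borders, so later layers can infer where the image edge is. Padding values predicted from the peripheral pixels remove that cue in this toy case.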

updated: Sat Jul 15 2023 06:44:34 GMT+0000 (UTC)

published: Sat Jul 15 2023 06:44:34 GMT+0000 (UTC)

arXiv
