Perspective Transformation Layer

Nishan Khatri; Agnibh Dasgupta; Yucong Shen; Xin Zhong; Frank Shih

Incorporating geometric transformations that reflect changes in the relative position between an observer and an object into computer vision and deep learning models has attracted much attention in recent years. However, existing proposals mainly focus on affine transformations, which cannot fully represent viewpoint changes. Furthermore, current solutions often apply a neural network module to learn a single transformation matrix, which ignores the possibility of multiple viewpoints and introduces extra module parameters that must be trained. In this paper, a layer (PT layer) is proposed to learn perspective transformations, which not only model the geometries covered by affine transformations but also reflect viewpoint changes. In addition, because it can be trained directly with gradient descent like traditional layers such as convolutional layers, a single PT layer can learn an adjustable number of viewpoints without training extra module parameters. The experiments and evaluations confirm the superiority of the proposed PT layer.
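The contrast between affine and perspective transformations can be illustrated with a minimal sketch. This is not the proposed PT layer; it only applies two hand-picked 3x3 matrices to the corners of a unit square using homogeneous coordinates. The function name `apply_homography` and both matrices are assumptions chosen for illustration. An affine matrix has a last row of (0, 0, 1) and keeps parallel edges parallel, whereas a perspective matrix with a nonzero entry in the last row makes the far edge shrink, as it would under a change of viewpoint.

```python
def apply_homography(H, points):
    """Map 2D points through a 3x3 matrix H using homogeneous coordinates."""
    mapped = []
    for x, y in points:
        xh = H[0][0] * x + H[0][1] * y + H[0][2]
        yh = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        # Dividing by w is what distinguishes perspective from affine maps:
        # for an affine matrix w is always 1.
        mapped.append((xh / w, yh / w))
    return mapped


# Corners of the unit square, listed counter-clockwise from the origin.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]

# Illustrative affine matrix (horizontal shear); last row is (0, 0, 1).
affine = [[1.0, 0.5, 0.0],
          [0.0, 1.0, 0.0],
          [0.0, 0.0, 1.0]]

# Illustrative perspective matrix; the nonzero entry in the last row
# makes w depend on y, so points with larger y are scaled down.
perspective = [[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0],
               [0.0, 0.5, 1.0]]

affine_square = apply_homography(affine, square)
perspective_square = apply_homography(perspective, square)


def edge_width(quad, i, j):
    """Horizontal extent between two corners of a quadrilateral."""
    return abs(quad[i][0] - quad[j][0])


# Bottom edge uses corners 0 and 1, top edge uses corners 3 and 2.
print("affine bottom/top width:",
      edge_width(affine_square, 0, 1), edge_width(affine_square, 3, 2))
print("perspective bottom/top width:",
      edge_width(perspective_square, 0, 1), edge_width(perspective_square, 3, 2))
```

Under the shear, the top and bottom edges keep the same width, so the square becomes a parallelogram. Under the perspective matrix, the top edge is narrower than the bottom edge, so the square becomes a trapezoid, which no affine matrix can produce.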

updated: Fri Jan 14 2022 23:09:26 GMT+0000 (UTC)

published: Fri Jan 14 2022 23:09:26 GMT+0000 (UTC)

arXiv
