PS-Transformer: Learning Sparse Photometric Stereo Network using Self-Attention Mechanism

Satoshi Ikehata

Existing deep calibrated photometric stereo networks aggregate observations under different lights using pre-defined operations such as linear projection and max pooling. While these operations are effective with dense capture, simple first-order operations often fail to capture the high-order interactions among observations taken under a small number of different lights. To tackle this issue, this paper presents a deep sparse calibrated photometric stereo network named PS-Transformer, which leverages a learnable self-attention mechanism to properly capture the complex inter-image interactions. PS-Transformer builds upon a dual-branch design to explore both pixel-wise and image-wise features, and each feature is trained with intermediate surface normal supervision to maximize geometric feasibility. A new synthetic dataset named CyclesPS+ is also presented, along with a comprehensive analysis of how to train photometric stereo networks successfully. Extensive results on publicly available benchmark datasets demonstrate that the surface normal prediction accuracy of the proposed method significantly outperforms other state-of-the-art algorithms given the same number of input images, and is even comparable to that of dense algorithms that take 10× more input images.
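To make the contrast between pre-defined aggregation and learnable self-attention concrete, the following is a minimal NumPy sketch, not the PS-Transformer implementation. It assumes each light contributes one feature vector for a given pixel, and that a single-head scaled dot-product attention layer (a standard formulation, chosen here for illustration) mixes those vectors before max pooling. The function names, the feature dimension, and the random projection matrices are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def max_pool_aggregate(features):
    """Pre-defined aggregation: element-wise max over the light axis.

    features: (num_lights, dim) -> (dim,)
    """
    return features.max(axis=0)


def self_attention(features, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention across lights.

    Every observation attends to every other observation, so the output
    for one light depends on interactions with all the others.
    Returns the attended features and the attention weights.
    """
    q = features @ w_q
    k = features @ w_k
    v = features @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (num_lights, num_lights)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights


def attention_then_pool(features, w_q, w_k, w_v):
    """Hypothetical fusion: mix observations with attention, then max pool."""
    attended, _ = self_attention(features, w_q, w_k, w_v)
    return attended.max(axis=0)


# Illustrative sizes only: 10 lights, 8-dimensional per-light features.
rng = np.random.default_rng(0)
num_lights, dim = 10, 8
features = rng.normal(size=(num_lights, dim))
# Random matrices stand in for projections that would be learned in training.
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) * 0.1 for _ in range(3))

pooled = max_pool_aggregate(features)
attended, weights = self_attention(features, w_q, w_k, w_v)
fused = attention_then_pool(features, w_q, w_k, w_v)
```

Because no positional encoding is used in this sketch, shuffling the order of the lights permutes the attended features in the same way, and the final max pooling removes that ordering, so `fused` does not depend on the order in which the images are supplied.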

updated: Mon Nov 21 2022 11:58:25 GMT+0000 (UTC)

published: Mon Nov 21 2022 11:58:25 GMT+0000 (UTC)

arXiv
