Unsupervised Spike Depth Estimation via Cross-modality Cross-domain Knowledge Transfer

Jiaming Liu; Qizhe Zhang; Jianing Li; Ming Lu; Tiejun Huang; Shanghang Zhang

Neuromorphic spike data, an emerging modality with high temporal resolution, has shown promising potential in real-world applications because of its inherent advantage in overcoming high-velocity motion blur. However, training a spike depth estimation network poses significant challenges in two respects: spike data provides only sparse spatial information for a dense regression task, and paired depth labels are difficult to obtain for temporally intensive spike streams. In this paper, we therefore propose a cross-modality cross-domain (BiCross) framework that realizes unsupervised spike depth estimation with the help of open-source RGB data. It first transfers cross-modality knowledge from source RGB data to simulated source spike data, which serves as an intermediary, and then realizes cross-domain learning from the simulated source spike data to the target spike data. Specifically, Coarse-to-Fine Knowledge Distillation (CFKD) is introduced to transfer cross-modality knowledge at the global and pixel levels in the source domain, complementing sparse spike features with the rich semantic knowledge of image features. We then propose an Uncertainty Guided Teacher-Student (UGTS) method to realize cross-domain learning on the spike target domain, ensuring domain-invariant global and pixel-level knowledge in the teacher and student models through alignment and uncertainty-guided depth selection measurement. To verify the effectiveness of BiCross, we conduct extensive experiments on three scenarios: Synthetic to Real, Extreme Weather, and Scene Changing. The code and datasets will be released.
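
The abstract does not say how the teacher is updated or how uncertain depth predictions are filtered, so the following is only a generic teacher-student sketch, not the BiCross implementation. It assumes an exponential-moving-average teacher and a fixed uncertainty threshold; the names `ema_update`, `masked_l1_loss`, `momentum`, and `max_uncertainty` are hypothetical and chosen for illustration.

```python
from typing import List, Sequence


def ema_update(teacher: Sequence[float], student: Sequence[float],
               momentum: float = 0.99) -> List[float]:
    """Move each teacher weight toward the matching student weight.

    Assumption: an exponential moving average, as in common
    teacher-student schemes; the abstract does not specify the rule.
    """
    return [momentum * t + (1.0 - momentum) * s
            for t, s in zip(teacher, student)]


def masked_l1_loss(student_depth: Sequence[float],
                   teacher_depth: Sequence[float],
                   teacher_uncertainty: Sequence[float],
                   max_uncertainty: float = 0.1) -> float:
    """Mean absolute error over pixels the teacher is confident about.

    Assumption: a pixel is kept when its teacher uncertainty is at or
    below a fixed threshold; returns 0.0 when no pixel is kept.
    """
    kept = [abs(s - t)
            for s, t, u in zip(student_depth, teacher_depth, teacher_uncertainty)
            if u <= max_uncertainty]
    return sum(kept) / len(kept) if kept else 0.0


# Toy flattened depth maps with three pixels; the third pixel has high
# teacher uncertainty and is excluded from the loss.
student_depth = [2.0, 5.0, 9.0]
teacher_depth = [2.5, 4.0, 1.0]
teacher_uncertainty = [0.05, 0.02, 0.9]
loss = masked_l1_loss(student_depth, teacher_depth, teacher_uncertainty)
print(loss)  # 0.75
```

In this toy run the large disagreement on the third pixel does not affect the loss, which is the intended effect of selecting depth pseudo-labels by uncertainty: unreliable teacher predictions on the target domain are not propagated to the student.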

updated: Wed Nov 30 2022 16:35:33 GMT+0000 (UTC)

published: Fri Aug 26 2022 09:35:20 GMT+0000 (UTC)

arXiv
