Fully complex-valued deep learning model for visual perception

Aniruddh Sikdar; Sumanth Udupa; Suresh Sundaram

Deep learning models that operate in the complex domain are used for their rich representation capacity. However, most of these models are either restricted to the first quadrant of the complex plane or project the complex-valued data into the real domain, causing a loss of information. This paper proposes that operating entirely in the complex domain increases the overall performance of complex-valued models. A novel, fully complex-valued learning scheme is proposed to train a Fully Complex-valued Convolutional Neural Network (FC-CNN) using a newly proposed complex-valued loss function and training strategy. Benchmarked on CIFAR-10, SVHN, and CIFAR-100, FC-CNN achieves a 4-10% gain over its real-valued counterpart while maintaining the model complexity. With fewer parameters, it achieves performance comparable to state-of-the-art complex-valued models on CIFAR-10 and SVHN. On the CIFAR-100 dataset, it achieves state-of-the-art performance with 25% fewer parameters. FC-CNN also shows better training efficiency and much faster convergence than all the other models.
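The two limitations named in the abstract can be illustrated with a small sketch. This is not the paper's method or code. It assumes, for illustration only, that "restricted to the first quadrant" corresponds to applying ReLU separately to the real and imaginary parts, and that "projecting into the real domain" corresponds to taking the magnitude. The function names `split_relu` and `magnitude_projection` are hypothetical.

```python
import math

def split_relu(z: complex) -> complex:
    # Assumed example: ReLU applied independently to the real and imaginary
    # parts, so every output has real >= 0 and imag >= 0 (first quadrant).
    return complex(max(z.real, 0.0), max(z.imag, 0.0))

def magnitude_projection(z: complex) -> float:
    # Assumed example: mapping a complex value to its magnitude,
    # which discards the phase entirely.
    return abs(z)

# One sample point from each quadrant of the complex plane.
inputs = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]

activated = [split_relu(z) for z in inputs]
magnitudes = [magnitude_projection(z) for z in inputs]

for z, a, m in zip(inputs, activated, magnitudes):
    print(f"input={z}  split_relu={a}  magnitude={m:.4f}")
```

In this sketch, the four inputs have distinct phases, yet all four share the same magnitude after projection, and every split-ReLU output lies in the first quadrant. Both effects are instances of the information loss the abstract describes.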

updated: Wed Dec 14 2022 10:40:35 GMT+0000 (UTC)

published: Wed Dec 14 2022 10:40:35 GMT+0000 (UTC)

arXiv
