From Local Binary Patterns to Pixel Difference Networks for Efficient Visual Representation Learning

Zhuo Su; Matti Pietikäinen; Li Liu

Local Binary Patterns (LBP) is a successful hand-crafted feature descriptor in computer vision. However, in the deep learning era, deep neural networks, especially convolutional neural networks (CNNs), can automatically learn powerful task-aware features that are more discriminative and have higher representational capacity. To some extent, such hand-crafted features can be safely ignored when designing deep computer vision models. Nevertheless, LBP's favorable properties for visual representation learning have motivated research into its value for enhancing modern deep models in terms of efficiency, memory consumption, and predictive performance. In this paper, we provide a comprehensive review of such efforts, which aim to incorporate the LBP mechanism into the design of CNN modules to make deep models stronger. Reflecting on what has been achieved so far, we discuss open challenges and directions for future research.

updated: Wed Mar 15 2023 07:28:46 GMT+0000 (UTC)

published: Wed Mar 15 2023 07:28:46 GMT+0000 (UTC)

arXiv
