FDDH: Fast Discriminative Discrete Hashing for Large-Scale Cross-Modal Retrieval

Xin Liu; Xingzhi Wang; Yiu-ming Cheung

Cross-modal hashing, favored for its effectiveness and efficiency, has received wide attention for facilitating efficient retrieval across different modalities. Nevertheless, most existing methods do not sufficiently exploit the discriminative power of semantic information when learning the hash codes, and they often involve time-consuming training procedures when handling large-scale datasets. To tackle these issues, we formulate the learning of similarity-preserving hash codes in terms of orthogonally rotating the semantic data so as to minimize the quantization loss of mapping such data to Hamming space, and propose an efficient Fast Discriminative Discrete Hashing (FDDH) approach for large-scale cross-modal retrieval. More specifically, FDDH introduces an orthogonal basis to regress the targeted hash codes of training examples to their corresponding semantic labels, and utilizes the ε-dragging technique to provide provable large semantic margins. Accordingly, the discriminative power of semantic information can be explicitly captured and maximized. Moreover, an orthogonal transformation scheme is further proposed to map the nonlinear embedding data into the semantic subspace, which guarantees the semantic consistency between the data feature and its semantic representation. Consequently, a closed-form solution is derived for discriminative hash code learning, which is very computationally efficient. In addition, an effective and stable online learning strategy is presented for optimizing modality-specific projection functions, featuring adaptivity to different training sizes and streaming data. The proposed FDDH approach theoretically approximates the bi-Lipschitz continuity, runs sufficiently fast, and also significantly improves the retrieval performance over the state-of-the-art methods. The source code is released at: https://github.com/starxliu/FDDH.
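
The idea of rotating data orthogonally to reduce the quantization loss of mapping to Hamming space can be illustrated with a small sketch. The code below is not the FDDH algorithm (the abstract describes a closed-form, label-driven solution that is not reproduced here). It is a generic alternating scheme in the style of iterative quantization: with real-valued data V fixed, it alternates between taking binary codes B = sign(VR) and solving an orthogonal Procrustes problem for the rotation R. The names `learn_rotation`, `quantization_loss`, `V`, `B`, and `R` are hypothetical and chosen only for illustration.

```python
import numpy as np


def quantization_loss(V, B, R):
    """Squared Frobenius distance between binary codes B and rotated data V @ R."""
    return float(np.linalg.norm(B - V @ R) ** 2)


def learn_rotation(V, n_iter=50, seed=0):
    """Alternately update binary codes B and an orthogonal rotation R
    so that ||B - V R||_F^2 does not increase (illustrative only)."""
    rng = np.random.default_rng(seed)
    r = V.shape[1]
    # Random orthogonal initialization: Q factor of a Gaussian matrix.
    R, _ = np.linalg.qr(rng.standard_normal((r, r)))
    for _ in range(n_iter):
        # Fix R, update B: nearest vertex of the hypercube {-1, +1}^r.
        B = np.where(V @ R >= 0, 1.0, -1.0)
        # Fix B, update R: orthogonal Procrustes solution from the SVD of V^T B.
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    # Final codes for the final rotation.
    B = np.where(V @ R >= 0, 1.0, -1.0)
    return B, R


if __name__ == "__main__":
    # Synthetic stand-in data (an assumption, not data from the paper).
    V = np.random.default_rng(1).standard_normal((200, 8))
    B0, R0 = learn_rotation(V, n_iter=0)  # random rotation only
    B, R = learn_rotation(V)              # after alternating updates
    print(quantization_loss(V, B0, R0), quantization_loss(V, B, R))
```

Each of the two updates is the exact minimizer of the loss with the other variable held fixed, so the loss never increases across iterations. The scheme only reaches a local optimum, which is one reason a closed-form solution such as the one the abstract claims is attractive for large-scale training.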

updated: Sat May 15 2021 03:53:48 GMT+0000 (UTC)

published: Sat May 15 2021 03:53:48 GMT+0000 (UTC)

arXiv
