Binary Embedding-based Retrieval at Tencent

Yukang Gan; Yixiao Ge; Chang Zhou; Shupeng Su; Zhouchuan Xu; Xuyuan Xu; Quanchao Hui; Xiang Chen; Yexin Wang; Ying Shan

Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information in a large corpus that may contain tens or hundreds of billions of documents. With so many documents and highly concurrent queries, storage and computation become expensive and inefficient, which makes further scaling difficult. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables a customized number of bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multilayer perceptron (MLP) blocks. We can therefore tailor the number of bits to each application, trading off accuracy loss against cost savings. Importantly, we enable task-agnostic, efficient training of the binarization model using a new embedding-to-embedding strategy. We also exploit compatible training of binary embeddings so that the BEBR engine can support indexing across multiple embedding versions within a unified system. To make search more efficient still, we propose Symmetric Distance Calculation (SDC), which achieves lower response time than Hamming codes. We have successfully deployed BEBR in Tencent products, including Sogou, Tencent Video, and QQ World. The binarization algorithm generalizes seamlessly to various tasks across multiple modalities. Extensive experiments on offline benchmarks and online A/B tests demonstrate the efficiency and effectiveness of our method, which saves 30% to 50% of index costs with almost no loss of accuracy at the system level.
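
The idea of compressing a float vector into a composition of several binary vectors, with a tunable number of bits per dimension, can be illustrated with a minimal Python sketch. This is not the authors' model: the learned residual MLP blocks are replaced here by a plain sign function applied to the running residual, and the power-of-two level weights, along with the names `recurrent_binarize`, `reconstruct`, and `max_error`, are assumptions made for illustration only.

```python
import random


def recurrent_binarize(vec, num_levels):
    """Greedy residual sign quantization (illustrative stand-in, no learned MLPs).

    Returns `num_levels` binary vectors with entries in {-1, +1}.
    Level k is assumed to carry weight 2**-k, so every dimension
    ends up encoded with `num_levels` bits.
    """
    residual = list(vec)
    codes = []
    for level in range(num_levels):
        weight = 2.0 ** -level
        # Binarize whatever the previous levels failed to capture.
        bits = [1 if r >= 0 else -1 for r in residual]
        codes.append(bits)
        # Subtract this level's contribution to get the new residual.
        residual = [r - weight * b for r, b in zip(residual, bits)]
    return codes


def reconstruct(codes):
    """Recombine the binary vectors into a float approximation."""
    dim = len(codes[0])
    return [
        sum((2.0 ** -k) * codes[k][i] for k in range(len(codes)))
        for i in range(dim)
    ]


def max_error(vec, codes):
    """Largest per-dimension gap between the input and its approximation."""
    approx = reconstruct(codes)
    return max(abs(v - a) for v, a in zip(vec, approx))


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical 8-dimensional embedding with components in [-1, 1].
    embedding = [random.uniform(-1.0, 1.0) for _ in range(8)]
    for levels in (1, 2, 4):
        codes = recurrent_binarize(embedding, levels)
        print(levels, "bits per dimension, max error:", max_error(embedding, codes))
```

Under these assumptions, and for inputs with components in [-1, 1], the reconstruction error after L levels is at most 2^-(L-1) per dimension. Adding levels therefore tightens the worst-case error at the cost of more bits, which is the kind of accuracy versus cost trade-off the abstract describes.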

updated: Fri Feb 17 2023 06:10:02 GMT+0000 (UTC)

published: Fri Feb 17 2023 06:10:02 GMT+0000 (UTC)

arXiv
