Retrieval Augmentation for Deep Neural Networks

Rita Parada Ramos; Patrícia Pereira; Helena Moniz; Joao Paulo Carvalho; Bruno Martins

Deep neural networks have achieved state-of-the-art results in a variety of vision and language tasks. Despite the use of large training datasets, most models are trained by iterating over single input-output pairs, and the remaining examples are ignored when making the current prediction. In this work, we actively exploit the training data, using information from the nearest training examples to aid prediction during both training and testing. Specifically, our approach uses the target of the most similar training example to initialize the memory state of an LSTM model, or to guide attention mechanisms. We apply this approach to image captioning and sentiment analysis, through image retrieval and text retrieval, respectively. Results confirm the effectiveness of the proposed approach for the two tasks, on the widely used Flickr8k and IMDB datasets. Our code is publicly available at http://github.com/RitaRamo/retrieval-augmentation-nn.
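
The retrieval step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the linked repository for that). It assumes that inputs and targets are already encoded as fixed-size vectors, that similarity is measured by cosine similarity, and that the retrieved target vector is copied directly into the initial LSTM memory state. The names cosine, retrieve_nearest and initial_memory_state, and the toy data, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def retrieve_nearest(query, train_inputs, exclude=None):
    """Return the index of the training input most similar to `query`.

    `exclude` skips one index. Assumption: during training the query is
    itself a training example, so it should not retrieve itself.
    """
    candidates = [i for i in range(len(train_inputs)) if i != exclude]
    return max(candidates, key=lambda i: cosine(query, train_inputs[i]))


def initial_memory_state(query, train_inputs, train_target_vecs, exclude=None):
    """Hypothetical helper: use the target vector of the nearest training
    example as the initial memory (cell) state c_0 of an LSTM."""
    idx = retrieve_nearest(query, train_inputs, exclude=exclude)
    return idx, list(train_target_vecs[idx])


# Toy data: three training inputs with made-up target representations.
train_inputs = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
train_target_vecs = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.8], [0.4, 0.4, 0.2]]

# Test time: the query is a new input, so nothing is excluded.
idx, c0 = initial_memory_state([0.9, 0.1], train_inputs, train_target_vecs)

# Training time: example 0 queries the set but must not retrieve itself.
train_idx, train_c0 = initial_memory_state(
    train_inputs[0], train_inputs, train_target_vecs, exclude=0
)
```

In a real model, c0 would be passed as the initial cell state of the LSTM decoder or classifier in place of a zero vector. An exhaustive scan is used here only for brevity; on a large training set an approximate nearest-neighbour index would be the usual choice.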

updated: Mon Apr 26 2021 09:14:47 GMT+0000 (UTC)

published: Thu Feb 25 2021 17:38:31 GMT+0000 (UTC)

arXiv
