Learning from Children: Improving Image-Caption Pretraining via Curriculum

Hammad A. Ayyubi; Rahul Lokesh; Alireza Zareian; Bo Wu; Shih-Fu Chang

Image-caption pretraining has been used successfully for downstream vision tasks such as zero-shot image classification and object detection. However, image-caption pretraining remains a hard problem: it requires multiple concepts (nouns) in a caption to be aligned with several objects in an image. To tackle this problem, we go back to the roots and look to the best learners, children. We take inspiration from cognitive science studies of children's language learning to propose a curriculum learning framework. Learning begins with easy-to-align image-caption pairs containing one concept per caption. The difficulty is increased progressively in each new phase by adding one more concept per caption. Correspondingly, the knowledge acquired in each learning phase is used in subsequent phases to effectively constrain the learning problem to aligning one new concept-object pair per phase. We show that this learning strategy improves over vanilla image-caption training in various settings: pretraining from scratch, using a pretrained image encoder and/or a pretrained text encoder, and the low-data regime.
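
As a rough illustration of the phase structure described in the abstract, the sketch below groups image-caption pairs by the number of concepts in each caption, so that phase 1 holds one-concept captions, phase 2 holds two-concept captions, and so on. It is not the authors' implementation. The names `count_concepts` and `build_curriculum`, the fixed `concept_vocab` set, and the whitespace tokenization are all assumptions made for this example; the abstract does not say how concepts are extracted, and it does not cover the training step in which knowledge from earlier phases constrains later ones.

```python
from collections import defaultdict


def count_concepts(caption, concept_vocab):
    """Count distinct concepts from a known vocabulary that appear in a caption.

    Assumption: concepts are matched by lowercase single-word lookup, which
    stands in for whatever noun extraction a real pipeline would use.
    """
    tokens = {tok.strip(".,!?").lower() for tok in caption.split()}
    return len(tokens & concept_vocab)


def build_curriculum(pairs, concept_vocab, max_phase):
    """Group (image_id, caption) pairs into phases by concept count.

    Returns a list in which index k - 1 holds the pairs whose captions
    contain exactly k concepts. Pairs with zero concepts or with more than
    max_phase concepts are dropped.
    """
    phases = defaultdict(list)
    for image_id, caption in pairs:
        k = count_concepts(caption, concept_vocab)
        if 1 <= k <= max_phase:
            phases[k].append((image_id, caption))
    return [phases[k] for k in range(1, max_phase + 1)]


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper.
    vocab = {"dog", "ball", "cat"}
    data = [
        ("img0", "A dog."),
        ("img1", "A dog chasing a ball."),
        ("img2", "A cat, a dog and a ball."),
        ("img3", "Blue sky."),
    ]
    for phase, items in enumerate(build_curriculum(data, vocab, 3), start=1):
        print(phase, [image_id for image_id, _ in items])
```

A training loop under these assumptions would iterate over the returned list in order, training on phase 1 first and carrying the model weights into each subsequent phase.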

updated: Tue May 30 2023 15:43:50 GMT+0000 (UTC)

published: Sat May 27 2023 17:59:54 GMT+0000 (UTC)

arXiv

