Otter: A Multi-Modal Model with In-Context Instruction Tuning

Bo Li; Yuanhan Zhang; Liangyu Chen; Jinghao Wang; Jingkang Yang; Ziwei Liu

Large language models (LLMs) have demonstrated significant universal capabilities as few-shot and zero-shot learners across various tasks, owing to their pre-training on vast amounts of text data. This is exemplified by GPT-3, which was later boosted into InstructGPT and ChatGPT, models that effectively follow natural language instructions to accomplish real-world tasks. In this paper, we propose introducing instruction tuning into multi-modal models, motivated by the interleaved format of the Flamingo model's upstream pre-training dataset. We adopt a similar approach to construct our MultI-Modal In-Context Instruction Tuning (MIMIC-IT) dataset. We then introduce Otter, a multi-modal model based on OpenFlamingo (an open-source version of DeepMind's Flamingo), trained on MIMIC-IT and showing improved instruction-following ability and in-context learning. We also optimize OpenFlamingo's implementation for researchers, democratizing the required training resources from 1× A100 GPU to 4× RTX-3090 GPUs, and integrate both OpenFlamingo and Otter into Huggingface Transformers so that more researchers can incorporate the models into their customized training and inference pipelines.

updated: Fri May 05 2023 17:59:46 GMT+0000 (UTC)

published: Fri May 05 2023 17:59:46 GMT+0000 (UTC)

arXiv

