FooDI-ML: a large multi-language dataset of food, drinks and groceries images and descriptions

David Amat Olóndriz; Ponç Palau Puigdevall; Adrià Salvador Palau

In this paper we introduce the FooDI-ML dataset. It contains over 1.5M unique images and over 9.5M store names, product names, descriptions, and collection sections gathered from the Glovo application. The data correspond to food, drink and grocery products from 37 countries in Europe, the Middle East, Africa and Latin America. The dataset covers 33 languages, including 870K samples in languages of Eastern European and Western Asian countries, such as Ukrainian and Kazakh, which have so far been underrepresented in publicly available visio-linguistic datasets. It also includes widely spoken languages such as Spanish and English. To assist further research, we include benchmarks on two tasks: text-image retrieval and conditional image generation.

updated: Fri Aug 26 2022 11:23:29 GMT+0000 (UTC)

published: Tue Oct 05 2021 13:33:08 GMT+0000 (UTC)

arXiv
