DisinfoMeme: A Multimodal Dataset for Detecting Meme Intentionally Spreading Out Disinformation

Jingnong Qu; Liunian Harold Li; Jieyu Zhao; Sunipa Dev; Kai-Wei Chang

Disinformation has become a serious problem on social media. In particular, given their short format, visual appeal, and humorous nature, memes spread easily among online communities, making them an effective vehicle for disinformation. We present DisinfoMeme to help detect disinformation memes. The dataset contains memes mined from Reddit covering three current topics: the COVID-19 pandemic, the Black Lives Matter movement, and veganism/vegetarianism. The dataset poses multiple unique challenges: limited data and label imbalance, reliance on external knowledge, multimodal reasoning, layout dependency, and noise from OCR. We test multiple widely used unimodal and multimodal models on this dataset. The experiments show that current models still leave substantial room for improvement.

updated: Wed May 25 2022 09:54:59 GMT+0000 (UTC)

published: Wed May 25 2022 09:54:59 GMT+0000 (UTC)

arXiv
