Multimodal Learning for Hateful Memes Detection

Yi Zhou; Zhenhao Chen

Memes are used to spread ideas through social networks. Although most memes are created for humor, some become hateful through the combination of image and text. Automatically detecting hateful memes can help reduce their harmful social impact. Unlike conventional multimodal tasks, where the visual and textual information is semantically aligned, hateful memes detection is challenging because the image and text in a meme are often only weakly aligned or even unrelated. The model must therefore understand the content and reason over multiple modalities. In this paper, we focus on multimodal hateful memes detection and propose a novel method that incorporates the image captioning process into the memes detection process. We conduct extensive experiments on multimodal meme datasets and demonstrate the effectiveness of our approach. Our model achieves promising results on the Hateful Memes Detection Challenge.

updated: Sun Dec 06 2020 22:16:30 GMT+0000 (UTC)

published: Wed Nov 25 2020 16:49:15 GMT+0000 (UTC)

arXiv
