Contrastive Domain Adaptation for Early Misinformation Detection: A Case Study on COVID-19

Zhenrui Yue; Huimin Zeng; Ziyi Kou; Lanyu Shang; Dong Wang

Despite recent progress in improving the performance of misinformation detection systems, classifying misinformation in an unseen domain remains an elusive challenge. A common approach to this problem is to introduce a domain critic and encourage domain-invariant input features. However, early misinformation often exhibits both conditional shift and label shift relative to existing misinformation data (e.g., class imbalance in COVID-19 datasets), which makes such methods less effective for detecting early misinformation. In this paper, we propose a contrastive adaptation network for early misinformation detection (CANMD). Specifically, we leverage pseudo labeling to generate high-confidence target examples for joint training with source data. We additionally design a label correction component to estimate and correct the label shift (i.e., the difference in class priors) between the source and target domains. Moreover, a contrastive adaptation loss is integrated into the objective function to reduce the intra-class discrepancy and enlarge the inter-class discrepancy. As such, the adapted model learns corrected class priors and an invariant conditional distribution across both domains for improved estimation of the target data distribution. To demonstrate the effectiveness of the proposed CANMD, we study the case of COVID-19 early misinformation detection and perform extensive experiments using multiple real-world datasets. The results suggest that CANMD can effectively adapt misinformation detection systems to the unseen COVID-19 target domain, with significant improvements over state-of-the-art baselines.
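To make the pseudo labeling and label correction steps concrete, the following is a minimal sketch and not the CANMD implementation. It assumes a source-trained classifier that outputs class probabilities for unlabeled target examples, keeps only predictions above a confidence threshold, estimates target class priors from those pseudo labels, and rescales a prediction by the ratio of target to source priors. That prior-ratio rescaling is a standard label shift correction swapped in for illustration; the paper's own label correction component may differ. All function names, the threshold of 0.9, the additive smoothing, and the toy probabilities are hypothetical.

```python
from collections import Counter


def pseudo_label(probs, threshold=0.9):
    """Return (index, class) pairs for examples whose top probability >= threshold."""
    selected = []
    for i, p in enumerate(probs):
        top = max(range(len(p)), key=lambda k: p[k])
        if p[top] >= threshold:
            selected.append((i, top))
    return selected


def estimate_prior(labels, num_classes, smoothing=1.0):
    """Estimate class priors from labels, with additive smoothing (an assumption)."""
    counts = Counter(labels)
    total = len(labels) + smoothing * num_classes
    return [(counts.get(c, 0) + smoothing) / total for c in range(num_classes)]


def correct_label_shift(p, source_prior, target_prior):
    """Rescale a prediction by target/source prior ratios, then renormalize.

    Assumes the class-conditional distribution p(x | y) is shared across domains.
    """
    weighted = [p[c] * target_prior[c] / source_prior[c] for c in range(len(p))]
    z = sum(weighted)
    return [w / z for w in weighted]


# Toy target-domain predictions (hypothetical numbers, two classes).
probs = [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92], [0.97, 0.03]]
selected = pseudo_label(probs, threshold=0.9)

# Priors: balanced source (assumed) vs. target estimated from pseudo labels.
source_prior = [0.5, 0.5]
target_prior = estimate_prior([c for _, c in selected], num_classes=2)

# Re-score the low-confidence example under the estimated target priors.
corrected = correct_label_shift(probs[1], source_prior, target_prior)
```

In this toy run the second example falls below the threshold and is excluded from the pseudo-labeled set, and the prior correction then nudges its prediction toward the class that is more frequent among the confident target examples. The contrastive adaptation loss described above is not covered by this sketch.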

updated: Sat Aug 20 2022 02:09:35 GMT+0000 (UTC)

published: Sat Aug 20 2022 02:09:35 GMT+0000 (UTC)

arXiv
