Exploring Semantic Perturbations on Grover

Pranav Kulkarni; Ziqing Ji; Yan Xu; Marko Neskovic; Kevin Nolan

With news and information now so easy to access, it is more important than ever to ensure that people are not misled by what they read. Recently, the rise of neural fake news (AI-generated fake news) and its demonstrated effectiveness at fooling humans have prompted the development of models to detect it. One such model is Grover, which can both detect neural fake news and generate it, the latter to demonstrate how such a model could be misused to fool human readers. In this work, we explore Grover's fake news detection capabilities by performing targeted attacks through perturbations on input news articles. In doing so, we test Grover's resilience to these adversarial attacks and expose some potential vulnerabilities that should be addressed in further iterations so that it can detect all types of fake news accurately.

updated: Wed Feb 01 2023 15:28:55 GMT+0000 (UTC)

published: Wed Feb 01 2023 15:28:55 GMT+0000 (UTC)

arXiv
