Private Eye: On the Limits of Textual Screen Peeking via Eyeglass Reflections in Video Conferencing

Yan Long; Chen Yan; Shilin Xiao; Shivan Prasad; Wenyuan Xu; Kevin Fu

Using mathematical modeling and human subjects experiments, this research explores the extent to which emerging webcams might leak recognizable textual and graphical information through eyeglass reflections captured during video conferencing. The primary goal of our work is to measure, compute, and predict the factors, limits, and thresholds of recognizability as webcam technology evolves. Our work explores and characterizes the viable threat models based on optical attacks using multi-frame super resolution techniques on sequences of video frames. Our models and experimental results in a controlled lab setting show that on-screen text as small as 10 mm in height can be reconstructed and recognized with over 75% accuracy using a 720p webcam. We further apply this threat model to web textual content with varying attacker capabilities to find the thresholds at which text becomes recognizable. Our user study with 20 participants suggests that present-day 720p webcams are sufficient for adversaries to reconstruct textual content on big-font websites. Our models further show that the evolution towards 4K cameras will tip the threshold of text leakage to the reconstruction of most header texts on popular websites. Besides textual targets, a case study on recognizing a closed-world dataset of Alexa top 100 websites with 720p webcams shows a maximum recognition accuracy of 94% with 10 participants, even without using machine-learning models. Our research proposes near-term mitigations, including a software prototype that users can use to blur the eyeglass areas of their video streams. For possible long-term defenses, we advocate an individual reflection testing procedure to assess threats under various settings, and justify the importance of following the principle of least privilege in privacy-sensitive scenarios.

updated: Wed Sep 14 2022 03:10:22 GMT+0000 (UTC)

published: Sun May 08 2022 23:29:13 GMT+0000 (UTC)

arXiv
