3ET: Efficient Event-based Eye Tracking using a Change-Based ConvLSTM Network

Qinyu Chen; Zuowen Wang; Shih-Chii Liu; Chang Gao

This paper presents a sparse Change-Based Convolutional Long Short-Term Memory (CB-ConvLSTM) model for event-based eye tracking, a key capability for next-generation wearable healthcare technology such as AR/VR headsets. We leverage the advantages of retina-inspired event cameras over traditional frame-based cameras, namely their low-latency response and sparse output event stream. Our CB-ConvLSTM architecture efficiently extracts spatio-temporal features for pupil tracking from the event stream, outperforming conventional CNN structures. Using a delta-encoded recurrent path that enhances activation sparsity, CB-ConvLSTM reduces arithmetic operations by approximately 4.7× without losing accuracy when tested on a v2e-generated event dataset of labeled pupils. This efficiency gain makes it well suited to real-time eye tracking on resource-constrained devices. The project code and dataset are openly available at https://github.com/qinche106/cb-convlstm-eyetracking.
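
To make the idea of a delta-encoded path concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a simple thresholded scheme: each element of a state vector is transmitted only when it has changed by at least `threshold` since the value last transmitted, and all other changes are emitted as zero. The names `delta_encode`, `zero_fraction`, and the threshold value are hypothetical and chosen for illustration only.

```python
def delta_encode(sequence, threshold):
    """Return thresholded change vectors for a list of equal-length vectors.

    Assumption: an element is "transmitted" only when its change relative
    to the last transmitted value is at least `threshold`.
    """
    reference = [0.0] * len(sequence[0])  # last transmitted value per element
    deltas = []
    for vector in sequence:
        delta = []
        for i, value in enumerate(vector):
            change = value - reference[i]
            if abs(change) >= threshold:
                reference[i] = value  # remember what was transmitted
                delta.append(change)
            else:
                delta.append(0.0)     # sub-threshold change is dropped
        deltas.append(delta)
    return deltas


def zero_fraction(deltas):
    """Fraction of delta entries that are exactly zero."""
    total = sum(len(d) for d in deltas)
    zeros = sum(1 for d in deltas for x in d if x == 0.0)
    return zeros / total


# Toy sequence of three 3-element states (illustrative values only).
states = [[0.0, 1.0, 0.5], [0.05, 1.0, 0.9], [0.05, 1.02, 0.9]]
deltas = delta_encode(states, threshold=0.1)
print(deltas)
print(zero_fraction(deltas))
```

Every zero entry in a delta vector corresponds to a multiply-accumulate that a sparsity-aware implementation could skip, which is the general mechanism by which a higher zero fraction translates into fewer arithmetic operations. The specific 4.7× figure reported above comes from the paper's own model and dataset, not from this toy example.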

updated: Tue Aug 22 2023 20:24:24 GMT+0000 (UTC)

published: Tue Aug 22 2023 20:24:24 GMT+0000 (UTC)

arXiv
