VOCALExplore: Pay-as-You-Go Video Data Exploration and Model Building [Technical Report]

Maureen Daum; Enhao Zhang; Dong He; Stephen Mussmann; Brandon Haynes; Ranjay Krishna; Magdalena Balazinska

We introduce VOCALExplore, a system designed to support users in building domain-specific models over video datasets. VOCALExplore supports interactive labeling sessions and trains models using user-supplied labels. VOCALExplore maximizes model quality by automatically deciding how to select samples based on observed skew in the collected labels. It also selects the optimal video representations to use when training models by casting feature selection as a rising bandit problem. Finally, VOCALExplore implements optimizations to achieve low latency without sacrificing model performance. We demonstrate that VOCALExplore achieves close to the best possible model quality given candidate acquisition functions and feature extractors, and it does so with low visible latency (~1 second per iteration) and no expensive preprocessing.
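The abstract says the system decides how to select samples based on observed skew in the collected labels. As a rough illustration of that idea only, the sketch below measures skew as the ratio between the most and least frequent label counts and switches from random sampling to a hypothetical "active" strategy once the skew passes a threshold. The function names, the skew measure, the threshold, and the strategy names are all assumptions made for this example; the abstract does not specify how VOCALExplore measures skew or which acquisition functions it chooses between.

```python
from collections import Counter


def label_skew(labels):
    """Return max label count / min label count (1.0 means balanced).

    Assumption: this ratio is only one possible skew measure; it is not
    taken from the paper.
    """
    counts = Counter(labels)
    if len(counts) < 2:
        # With zero or one distinct label there is nothing to compare yet.
        return 1.0
    return max(counts.values()) / min(counts.values())


def choose_acquisition(labels, skew_threshold=3.0, min_labels=10):
    """Pick a sampling strategy name from the labels collected so far.

    Hypothetical policy: sample randomly until enough labels exist to
    estimate skew, then switch to an "active" strategy if labels look
    imbalanced. Both parameters are illustrative defaults.
    """
    if len(labels) < min_labels:
        return "random"
    if label_skew(labels) >= skew_threshold:
        return "active"
    return "random"


if __name__ == "__main__":
    collected = ["walking"] * 9 + ["eating"] * 1
    print(choose_acquisition(collected))
```

In an interactive labeling loop, a check like this would run after each batch of user-supplied labels, so the strategy can change as the label distribution becomes clearer.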

updated: Tue Jul 25 2023 20:09:55 GMT+0000 (UTC)

published: Tue Mar 07 2023 17:26:04 GMT+0000 (UTC)

arXiv
