Tuning computer vision models with task rewards

André Susano Pinto; Alexander Kolesnikov; Yuge Shi; Lucas Beyer; Xiaohua Zhai

Misalignment between model predictions and intended usage can be detrimental for the deployment of computer vision models. The issue is exacerbated when the task involves complex structured outputs, as it becomes harder to design procedures which address this misalignment. In natural language processing, this is often addressed using reinforcement learning techniques that align models with a task reward. We adopt this approach and show its surprising effectiveness across multiple computer vision tasks, such as object detection, panoptic segmentation, colorization and image captioning. We believe this approach has the potential to be widely useful for better aligning models with a diverse range of computer vision tasks.
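
To make the idea of aligning a model with a task reward concrete, here is a minimal sketch. The abstract does not name a specific algorithm, so the sketch assumes a REINFORCE-style policy-gradient update with a batch-mean baseline. Everything in it is hypothetical and for illustration only: the "model" is a softmax over four discrete outputs, `task_reward` stands in for a real task metric, and the names `reinforce_tune`, `pretrained`, and `tuned` are not from the paper.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def task_reward(output):
    """Hypothetical task reward: a stand-in for a real task metric.
    Here output 2 is the only one the task considers correct."""
    return 1.0 if output == 2 else 0.0


def reinforce_tune(logits, reward_fn, steps=200, batch=16, lr=0.5, seed=0):
    """Assumed REINFORCE-style tuning: sample outputs from the model,
    score them with the reward, and raise the log-likelihood of
    samples that score above the batch average."""
    rng = random.Random(seed)
    logits = list(logits)
    n = len(logits)
    for _ in range(steps):
        probs = softmax(logits)
        samples = rng.choices(range(n), weights=probs, k=batch)
        rewards = [reward_fn(s) for s in samples]
        baseline = sum(rewards) / batch  # batch-mean baseline reduces variance
        grad = [0.0] * n
        for s, r in zip(samples, rewards):
            advantage = r - baseline
            for k in range(n):
                # d log p(s) / d logit_k = 1[k == s] - p_k for a softmax
                indicator = 1.0 if k == s else 0.0
                grad[k] += advantage * (indicator - probs[k]) / batch
        # Gradient ascent on the expected reward
        logits = [l + lr * g for l, g in zip(logits, grad)]
    return logits


# Stand-in for a pretrained model that initially prefers output 0.
pretrained = [1.0, 0.5, 0.0, -0.5]
tuned = reinforce_tune(pretrained, task_reward)

p_before = softmax(pretrained)[2]
p_after = softmax(tuned)[2]
print(f"P(rewarded output) before: {p_before:.3f}, after: {p_after:.3f}")
```

In this toy setting the reward needs no gradient of its own; it only scores sampled outputs. That is what makes a reward-based update usable with task metrics that are not differentiable.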

updated: Thu Feb 16 2023 11:49:48 GMT+0000 (UTC)

published: Thu Feb 16 2023 11:49:48 GMT+0000 (UTC)

arXiv
