Aerial View Goal Localization with Reinforcement Learning

Aleksis Pirinen; Anton Samuelsson; John Backsund; Kalle Åström

Climate-induced disasters are and will continue to be on the rise, and thus search-and-rescue (SAR) operations, where the task is to localize and assist one or several people who are missing, become increasingly relevant. In many cases the rough location may be known and a UAV can be deployed to explore a given, confined area to precisely localize the missing people. Due to time and battery constraints it is often critical that localization is performed as efficiently as possible. In this work we approach this type of problem by abstracting it as an aerial view goal localization task in a framework that emulates a SAR-like setup without requiring access to actual UAVs. In this framework, an agent operates on top of an aerial image (proxy for a search area) and is tasked with localizing a goal that is described in terms of visual cues. To further mimic the situation on an actual UAV, the agent is not able to observe the search area in its entirety, not even at low resolution, and thus it has to operate solely based on partial glimpses when navigating towards the goal. To tackle this task, we propose AiRLoc, a reinforcement learning (RL)-based model that decouples exploration (searching for distant goals) and exploitation (localizing nearby goals). Extensive evaluations show that AiRLoc outperforms heuristic search methods as well as alternative learnable approaches, and that it generalizes across datasets, e.g. to disaster-hit areas without seeing a single disaster scenario during training. We also conduct a proof-of-concept study which indicates that the learnable methods outperform humans on average. Code and models have been made publicly available at https://github.com/aleksispi/airloc.
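
To make the setup described above concrete, the following is a minimal sketch of a partially observable goal-localization environment. It is not the authors' implementation: the class name `GlimpseGridEnv`, the four-direction action set, the grid layout, and the step budget are all illustrative assumptions. The sketch only shows the key constraint from the abstract, namely that the agent sees its current patch and a description of the goal, but never the whole search area.

```python
import random
from dataclasses import dataclass

# All names and values here are hypothetical, chosen for illustration only;
# the abstract does not specify the action set, grid size, or step budget.
MOVES = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


@dataclass
class GlimpseGridEnv:
    image: list          # 2D grid of patches, standing in for the aerial image
    start: tuple         # (row, col) of the agent's initial patch
    goal: tuple          # (row, col) of the goal patch
    max_steps: int = 10  # assumed budget, mimicking time/battery limits

    def __post_init__(self):
        self.pos = self.start
        self.steps = 0

    def observe(self):
        """Return only the current patch and the goal patch (partial glimpse)."""
        r, c = self.pos
        gr, gc = self.goal
        return self.image[r][c], self.image[gr][gc]

    def step(self, action):
        """Move one patch, clamping at the border of the search area."""
        dr, dc = MOVES[action]
        rows, cols = len(self.image), len(self.image[0])
        r = min(max(self.pos[0] + dr, 0), rows - 1)
        c = min(max(self.pos[1] + dc, 0), cols - 1)
        self.pos = (r, c)
        self.steps += 1
        success = self.pos == self.goal
        done = success or self.steps >= self.max_steps
        return self.observe(), success, done


def run_random_episode(env, rng):
    """Heuristic baseline: pick uniformly random moves until the episode ends."""
    done = success = False
    while not done:
        _, success, done = env.step(rng.choice(list(MOVES)))
    return success, env.steps
```

A learned policy such as the one the paper proposes would replace the random choice in `run_random_episode` with an action selected from the observed patch pair; how AiRLoc does this is described in the paper and the linked repository, not in this sketch.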

updated: Fri Feb 10 2023 13:34:10 GMT+0000 (UTC)

published: Thu Sep 08 2022 10:27:53 GMT+0000 (UTC)

arXiv
