OODIn: An Optimised On-Device Inference Framework for Heterogeneous Mobile Devices

Stylianos I. Venieris; Ioannis Panopoulos; Iakovos S. Venieris

Radical progress in the field of deep learning (DL) has led to unprecedented accuracy in diverse inference tasks. As such, deploying DL models across mobile platforms is vital to enable the development and broad availability of next-generation intelligent apps. Nevertheless, the wide and optimised deployment of DL models is currently hindered by the vast system heterogeneity of mobile devices, the varying computational cost of different DL models and the variability of performance needs across DL applications. This paper proposes OODIn, a framework for the optimised deployment of DL apps across heterogeneous mobile devices. OODIn comprises a novel DL-specific software architecture together with an analytical framework for modelling DL applications that: (1) counteract the variability in device resources and DL models by means of a highly parametrised multi-layer design; and (2) perform a principled optimisation of both model- and system-level parameters through a multi-objective formulation, designed for DL inference apps, in order to adapt the deployment to the user-specified performance requirements and device capabilities. Quantitative evaluation shows that the proposed framework consistently outperforms status-quo designs across heterogeneous devices, delivers performance gains of up to 4.3x and 3.5x over highly optimised platform-aware and model-aware designs respectively, and effectively adapts execution to dynamic changes in resource availability.

updated: Tue Jun 08 2021 22:38:18 GMT+0000 (UTC)

published: Tue Jun 08 2021 22:38:18 GMT+0000 (UTC)

arXiv
