Optimising Resource Management for Embedded Machine Learning

Lei Xun; Long Tran-Thanh; Bashir M Al-Hashimi; Geoff V. Merrett


Machine learning inference is increasingly executed locally on mobile and embedded platforms because of clear advantages in latency, privacy and connectivity. In this paper, we present approaches for online resource management in heterogeneous multi-core systems and show how they can be applied to optimise the performance of machine learning workloads. Performance can be defined using platform-dependent (e.g. speed, energy) and platform-independent (e.g. accuracy, confidence) metrics. In particular, we show how a Deep Neural Network (DNN) can be made dynamically scalable to trade off these performance metrics. Achieving consistent performance across different platforms is necessary yet challenging, because platforms differ in the resources they provide and in the capability of those resources, and because resource availability varies over time when other workloads execute alongside. Managing the interface between available hardware resources (often numerous and heterogeneous), software requirements, and user experience is increasingly complex.
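The trade-off described in the abstract can be illustrated with a small sketch. The abstract does not say how the scaling or the selection is implemented, so everything below is an assumption made for illustration: the `OperatingPoint` type, the `select_operating_point` function, the configuration names, and all latency, energy and accuracy figures are hypothetical and are not taken from the paper. The sketch assumes a scalable DNN exposes a few pre-profiled configurations, and a runtime manager picks the most accurate one that fits the current latency and energy budgets.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class OperatingPoint:
    """One profiled configuration of a scalable DNN (hypothetical values)."""
    name: str
    latency_ms: float   # platform-dependent metric
    energy_mj: float    # platform-dependent metric
    accuracy: float     # platform-independent metric


def select_operating_point(
    points: Sequence[OperatingPoint],
    max_latency_ms: float,
    max_energy_mj: float,
) -> Optional[OperatingPoint]:
    """Return the most accurate point that meets both budgets, or None."""
    feasible = [
        p for p in points
        if p.latency_ms <= max_latency_ms and p.energy_mj <= max_energy_mj
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p.accuracy)


# Made-up profile table for one platform; real values would be measured.
POINTS = [
    OperatingPoint("small", latency_ms=10.0, energy_mj=20.0, accuracy=0.60),
    OperatingPoint("medium", latency_ms=20.0, energy_mj=45.0, accuracy=0.68),
    OperatingPoint("large", latency_ms=35.0, energy_mj=80.0, accuracy=0.72),
    OperatingPoint("full", latency_ms=55.0, energy_mj=130.0, accuracy=0.75),
]

# Plenty of resources available: the full network fits the budgets.
relaxed = select_operating_point(POINTS, max_latency_ms=60.0, max_energy_mj=150.0)

# Another workload has taken resources, so the latency budget shrinks.
constrained = select_operating_point(POINTS, max_latency_ms=25.0, max_energy_mj=150.0)

print(relaxed.name if relaxed else None)
print(constrained.name if constrained else None)
```

Re-running the selection whenever the budgets change is one simple way to react to time-varying resource availability; a real resource manager would likely also adjust hardware settings, which this sketch leaves out.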

updated: Sat May 08 2021 06:10:05 GMT+0000 (UTC)

published: Sat May 08 2021 06:10:05 GMT+0000 (UTC)

arXiv

