What can we Learn by Predicting Accuracy?

Olivier Risser-Maroix; Benjamin Chamand


This paper seeks to answer the question: "What can we learn by predicting accuracy?" Classification is one of the most popular tasks in machine learning, and many loss functions have been developed to maximize accuracy, a non-differentiable objective. Unlike past work on loss function design, which was guided mainly by intuition and theory before being validated experimentally, we propose to approach the problem in the opposite direction: we seek to extract knowledge from experiments. This data-driven approach is similar to the one used in physics to discover general laws from data. We used a symbolic regression method to automatically find a mathematical expression that is highly correlated with a linear classifier's accuracy. The formula, discovered on more than 260 datasets of embeddings, has a Pearson correlation of 0.96 and an r^2 of 0.93. More interestingly, the formula is highly explainable and confirms insights from various previous papers on loss design. We hope this work will open new perspectives in the search for heuristics that lead to a deeper understanding of machine learning theory.
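
As a rough illustration of the two summary statistics quoted in the abstract, the following minimal sketch computes a Pearson correlation and an r^2 between the scores produced by some candidate formula and the measured accuracies of a linear classifier. The function names and the toy numbers are hypothetical and are not taken from the paper; the sketch also assumes that r^2 denotes the coefficient of determination, which the abstract does not specify.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: one entry per embedding dataset.
measured_accuracy = [0.50, 0.62, 0.71, 0.80, 0.93]   # linear classifier accuracy
formula_output = [0.52, 0.60, 0.73, 0.78, 0.95]      # score from a candidate formula

print("Pearson r:", pearson(formula_output, measured_accuracy))
print("r^2:", r_squared(measured_accuracy, formula_output))
```

In a symbolic regression setting, statistics like these would typically serve as the fitness signal used to rank candidate expressions; how the paper does this exactly is not stated in the abstract.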

updated: Tue Aug 23 2022 13:36:31 GMT+0000 (UTC)

published: Tue Aug 02 2022 10:58:17 GMT+0000 (UTC)

arXiv
