Multi-point dimensionality reduction to improve projection layout reliability

Farshad Barahimi

In ordinary Dimensionality Reduction (DR), each data instance in a high-dimensional space (original space), or in a distance matrix denoting original-space distances, is mapped to (projected onto) one point in a low-dimensional space (visual space). The resulting layout of projected points tries to preserve, as much as possible, some property of the data such as distances, neighbourhood relationships, and/or topology structures, with the ultimate goal of approximating semantic properties of the data with preserved geometric properties or topology structures in visual space. This paper elaborates on the concept of Multi-point Dimensionality Reduction, in which each data instance can be mapped to (projected onto) possibly more than one point in visual space, and provides the first general solution (algorithm) for it as a move toward improving the reliability, usability and interpretability of dimensionality reduction. Furthermore, by allowing the points in visual space to be split into two layers while maintaining the possibility of more than one projection (mapping) per data instance, the paper discusses the benefit of separating more reliable points from less reliable points, notwithstanding the effort to improve less reliable points. The proposed solution (algorithm), named Layered Vertex Splitting Data Embedding (LVSDE), is built upon and extends a combination of ordinary DR and graph drawing techniques. On the empirical side, the paper shows that LVSDE practically outperforms popular ordinary DR methods visually (semantics, group separation, subgroup detection or combinational group detection) in a way that is easily explainable, and that it performs close to the top on most of the data sets studied in a quantitative analysis based on KNN classification accuracy. [Abstract truncated for length]
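The abstract mentions a quantitative analysis based on KNN classification accuracy but does not describe the protocol. As a rough illustration of what such a score can look like for a layout in visual space, the sketch below computes leave-one-out KNN accuracy over a set of projected points. It is not LVSDE and not the paper's evaluation procedure: the function name `knn_accuracy`, the value of `k`, the leave-one-out scheme, and the toy data are all assumptions made here for illustration, and the handling of multiple projections per data instance is not modelled.

```python
from collections import Counter
from math import dist


def knn_accuracy(points, labels, k=3):
    """Leave-one-out KNN classification accuracy on a low-dimensional layout.

    Hypothetical helper for illustration only; the paper's exact
    evaluation protocol is not given in the abstract.
    """
    correct = 0
    for i, p in enumerate(points):
        # Distances from point i to every other point in visual space,
        # sorted by distance (ties broken by index).
        others = sorted(
            (dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        # Majority vote among the k nearest neighbours.
        votes = Counter(labels[j] for _, j in others[:k])
        predicted = votes.most_common(1)[0][0]
        correct += predicted == labels[i]
    return correct / len(points)


# Toy layout with two well-separated groups (made-up data).
layout = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
          (5.0, 5.0), (5.1, 5.2), (5.2, 5.1)]
groups = ["a", "a", "a", "b", "b", "b"]
accuracy = knn_accuracy(layout, groups, k=2)
```

A layout that keeps same-label points close together scores near 1, while a layout that interleaves labels scores near 0, which is why a score of this kind can serve as a proxy for how well group structure survives projection.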

updated: Sun May 29 2022 13:52:40 GMT+0000 (UTC)

published: Fri Jan 15 2021 17:17:02 GMT+0000 (UTC)

arXiv
