AttentionViz: A Global View of Transformer Attention

Catherine Yeh; Yida Chen; Aoyu Wu; Cynthia Chen; Fernanda Viégas; Martin Wattenberg

Transformer models are revolutionizing machine learning, but their inner workings remain mysterious. In this work, we present a new visualization technique designed to help researchers understand the self-attention mechanism in transformers that allows these models to learn rich, contextual relationships between elements of a sequence. The main idea behind our method is to visualize a joint embedding of the query and key vectors used by transformer models to compute attention. Unlike previous attention visualization techniques, our approach enables the analysis of global patterns across multiple input sequences. We create an interactive visualization tool, AttentionViz, based on these joint query-key embeddings, and use it to study attention mechanisms in both language and vision transformers. We demonstrate the utility of our approach in improving model understanding and offering new insights about query-key interactions through several application scenarios and expert feedback.
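
As a rough illustration of the idea of a joint query-key embedding, the sketch below computes scaled dot-product attention weights for a single head and then projects the queries and keys into one shared 2-D space. Everything here is an assumption made for illustration: the random `Q` and `K` matrices stand in for vectors taken from a real model, the function names are hypothetical, and PCA is used only as a simple stand-in projection, since the abstract does not say which projection method AttentionViz uses.

```python
import numpy as np

def attention_weights(Q, K):
    """Scaled dot-product attention weights: row-wise softmax(Q K^T / sqrt(d))."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    w = np.exp(scores)
    return w / w.sum(axis=-1, keepdims=True)

def joint_embedding_2d(Q, K):
    """Project queries and keys into one shared 2-D space using PCA (via SVD).

    PCA is an assumption here, chosen for simplicity; it is not claimed to be
    the projection used by the tool described above.
    """
    X = np.vstack([Q, K])                   # stack so both sets share one space
    X = X - X.mean(axis=0, keepdims=True)   # center before PCA
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    coords = X @ Vt[:2].T                   # coordinates on the top two components
    return coords[:len(Q)], coords[len(Q):]

# Hypothetical stand-in data: 6 tokens, head dimension 16.
rng = np.random.default_rng(0)
n_tokens, d_head = 6, 16
Q = rng.normal(size=(n_tokens, d_head))
K = rng.normal(size=(n_tokens, d_head))

W = attention_weights(Q, K)          # (6, 6); each row sums to 1
q2d, k2d = joint_embedding_2d(Q, K)  # two (6, 2) arrays in the same coordinate system
```

Plotting `q2d` and `k2d` on the same axes, with queries and keys in different colors, gives a single-head scatterplot. Stacking the vectors from many input sequences before projecting is what would make the view global rather than per-sequence.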

updated: Thu May 04 2023 23:46:49 GMT+0000 (UTC)

published: Thu May 04 2023 23:46:49 GMT+0000 (UTC)

arXiv
