Exploring GPT-2 Attention with BertViz
Shivam Kumar (2411AI63), Saumyadweepta Paul (2411AI62)
Ankit Kumar Pandey (2411AI63), Aashish Kumar Gupta (2411CS25)
24 April 2025
Objective

● To visualize and interpret attention mechanisms in GPT-2 using BertViz.

● To understand how GPT-2 models syntactic and semantic relationships.

● To identify specialized attention heads and investigate model behavior.
Introduction to Transformers and GPT-2

● Transformers rely entirely on self-attention to process input sequences.

● GPT-2 is a decoder-only language model trained to predict the next token.

● Self-attention enables GPT-2 to model long-range dependencies efficiently (a minimal sketch of the computation follows).
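To make the mechanism concrete, here is a minimal sketch of causal self-attention in PyTorch. The function and weight names are ours, not GPT-2's actual implementation:

```python
import torch
import torch.nn.functional as F

def causal_self_attention(x, Wq, Wk, Wv):
    # Project tokens to queries, keys, and values.
    Q, K, V = x @ Wq, x @ Wk, x @ Wv
    scores = Q @ K.transpose(-2, -1) / K.shape[-1] ** 0.5
    # Causal mask: a decoder-only model may not attend to future tokens.
    mask = torch.triu(torch.ones_like(scores), diagonal=1).bool()
    weights = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)
    return weights @ V, weights

x = torch.randn(5, 16)  # 5 tokens, hidden size 16
Wq, Wk, Wv = (torch.randn(16, 16) for _ in range(3))
out, weights = causal_self_attention(x, Wq, Wk, Wv)
print(weights)  # lower-triangular rows summing to 1: what BertViz draws
```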


What is BertViz?

BertViz Visualization Tool

● An open-source tool for interpreting attention in Transformer models.

● Offers multiple views: Attention-Head View, Model View, and Neuron View.

● Adapted to support both encoder (BERT) and decoder (GPT-2) models.

● Helps explore head specialization, bias, and structure.


Methodology Overview

● Input carefully crafted sentences to GPT-2.

● Visualize each layer's self-attention using BertViz.

● Analyze layer-wise and head-wise attention distributions.

● Capture screenshots to document key attention behaviors (a code sketch of this pipeline follows).
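A minimal sketch of the pipeline, assuming the transformers and bertviz packages (the interactive views render in a Jupyter notebook):

```python
from transformers import GPT2Model, GPT2Tokenizer
from bertviz import head_view, model_view

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2Model.from_pretrained("gpt2", output_attentions=True)

sentence = "The cat that the dog chased was fast."
inputs = tokenizer(sentence, return_tensors="pt")
attention = model(**inputs).attentions  # one (1, heads, seq, seq) tensor per layer
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

head_view(attention, tokens)    # per-head view, one layer at a time
# model_view(attention, tokens) # thumbnail grid of every layer and head
```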


Dataset and Sentence Design

Linguistic Phenomena in Input Sentences

● Coreference: “The doctor spoke to the nurse. She listened.”

● Ambiguity: “The chicken is ready to eat.”

● Subject–verb agreement: “The cat that the dog chased was fast.”

● Gender bias detection: contrast male vs. female pronouns in context (the probe set is collected below).
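As a sketch, the probe set can be kept as a small mapping from phenomenon to sentence; the bias-contrast pairing is our illustrative choice:

```python
# Probe sentences from the slides, keyed by linguistic phenomenon.
probes = {
    "coreference":   "The doctor spoke to the nurse. She listened.",
    "ambiguity":     "The chicken is ready to eat.",
    "agreement":     "The cat that the dog chased was fast.",
    "bias_contrast": "The doctor spoke to the nurse. He listened.",
}
```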


Syntax Attention Patterns

Example: Complex Clause Interpretation

Sentence: “The cat that the dog chased was fast.”

● Observed strong backward attention from “was” to “cat” across heads.

● Some heads focused on aligning subject and verb.

● Others distributed attention across tokens, possibly for context modeling (quantified in the sketch below).
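These observations can be quantified from the attention and tokens produced in the methodology sketch. attention_from_to is a hypothetical helper of ours, and the "Ġ" prefix is how GPT-2's tokenizer marks a token preceded by a space:

```python
import torch

def attention_from_to(attention, tokens, src, tgt):
    # Attention weight from token src to token tgt for every (layer, head).
    i, j = tokens.index(src), tokens.index(tgt)
    att = torch.stack([layer[0] for layer in attention])  # (layers, heads, S, S)
    return att[:, :, i, j]

w = attention_from_to(attention, tokens, "Ġwas", "Ġcat")
layer, head = divmod(int(w.argmax()), w.shape[1])
print(f"strongest 'was' -> 'cat' link: layer {layer}, head {head}, "
      f"weight {float(w.max()):.2f}")
```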


Coreference Attention Patterns

Example: Gender Bias in Coreference

Sentence: “The doctor spoke to the nurse. She listened.”

● Certain heads linked “She” more to “nurse” than “doctor.”

● Reveals how GPT-2 encodes coreference, possibly influenced by gender stereotypes.

● Replacing roles (e.g., “engineer” instead of “doctor”) affects the attention pattern (compared numerically below).
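Reusing the hypothetical attention_from_to helper from the syntax slide, the coreference preference can be read off directly; averaging over all layers and heads is our choice of summary:

```python
# Re-run the pipeline on the coreference probe, then compare how strongly
# "She" attends to each candidate antecedent (mean over layers and heads).
inputs = tokenizer("The doctor spoke to the nurse. She listened.",
                   return_tensors="pt")
attention = model(**inputs).attentions
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])

for antecedent in ("Ġnurse", "Ġdoctor"):
    w = attention_from_to(attention, tokens, "ĠShe", antecedent)
    print(f"She -> {antecedent[1:]}: {float(w.mean()):.3f}")
```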


Ambiguity Resolution

Example: Structural Ambiguity

Sentence: “The chicken is ready to eat.”

● Model attention varies: “chicken” links to both “is” and “eat.”

● GPT-2 distributes attention across interpretations: subject vs. object.

● No clear resolution: GPT-2 appears to maintain the ambiguity unless it is disambiguated by context.
Attention Specialization

Head and Layer Behavior

● Some heads specialize in:


○ Syntactic roles (subject–verb)
○ Punctuation and clause boundaries
○ Coreference tracking

● Redundant heads exhibit diffuse or uniform attention (an entropy sketch follows this list).

● Patterns change across layers: deeper layers show more abstract dependencies.
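One way to flag diffuse heads numerically is the entropy of each attention row, since a near-uniform head approaches the maximum entropy log(seq_len). A rough sketch (causal masking lowers the achievable maximum for early rows, so the normalization is approximate):

```python
import torch

# Entropy per head, reusing `attention` from an earlier probe.
att = torch.stack([layer[0] for layer in attention])  # (layers, heads, S, S)
row_entropy = -(att * (att + 1e-9).log()).sum(-1)     # per query position
diffuseness = row_entropy.mean(-1) / torch.log(torch.tensor(float(att.shape[-1])))
print(diffuseness.round(decimals=2))  # values near 1.0 = near-uniform heads
```
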
Key Findings

● Attention heads are interpretable in some cases, revealing structure and semantics.

● Certain heads encode biases (e.g., gendered associations).

● Not all heads contribute equally; some may be pruned without loss in performance (a pruning sketch follows).

● BertViz aids in debugging and understanding model decisions.
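The pruning observation can be tested directly: Hugging Face transformers supports structured head pruning. The layer and head indices here are hypothetical stand-ins for heads flagged as diffuse:

```python
from transformers import GPT2Model

model = GPT2Model.from_pretrained("gpt2")
# {layer_index: [head_indices]} -- the listed heads' parameters are removed.
model.prune_heads({0: [7], 5: [3]})
```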


Conclusion and Future Work
● BertViz helps demystify attention in GPT-2, showing structure and specialization.

● Key dependencies like coreference, syntax, and bias are traceable.

● Future directions:

○ Use the neuron view to isolate responsible units.
○ Explore intervention strategies (neuron editing, head pruning).
○ Apply similar methods to newer models (GPT-3, GPT-4).
Thank You!
