Group Crosscoders for Mechanistic Analysis of Symmetry

Gorton, Liv

arXiv:2410.24184 (cs)

[Submitted on 31 Oct 2024 (v1), last revised 1 Nov 2024 (this version, v2)]

Authors: Liv Gorton

Abstract: We introduce group crosscoders, an extension of crosscoders that systematically discover and analyse symmetrical features in neural networks. While neural networks often develop equivariant representations without explicit architectural constraints, understanding these emergent symmetries has traditionally relied on manual analysis. Group crosscoders automate this process by performing dictionary learning across transformed versions of inputs under a symmetry group. Applied to InceptionV1's mixed3b layer using the dihedral group $\mathrm{D}_{32}$, our method yields two key insights. First, it naturally clusters features into interpretable families that correspond to previously hypothesised feature types, providing more precise separation than standard sparse autoencoders. Second, our transform block analysis enables the automatic characterisation of feature symmetries, revealing how different geometric features (such as curves versus lines) exhibit distinct patterns of invariance and equivariance. These results demonstrate that group crosscoders can provide systematic insights into how neural networks represent symmetry, offering a promising new tool for mechanistic interpretability.
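To make "transformed versions of inputs under a symmetry group" concrete, the following is a minimal sketch that enumerates a dihedral group as 2x2 matrices and applies each element to a 2D point. It is an illustration only, not the paper's implementation: the helper names (`dihedral_elements`, `matmul`, `is_closed`, `apply`) are hypothetical, and because the abstract does not say which notational convention $\mathrm{D}_{32}$ follows, the number of rotations `n` is left as a parameter.

```python
import math


def dihedral_elements(n):
    """Return the 2n elements of a dihedral group with n rotations.

    Each element is a 2x2 matrix stored as a tuple of row tuples.
    The value of n is an assumption; the abstract does not fix it.
    """
    elems = []
    for flip in (False, True):
        for k in range(n):
            theta = 2 * math.pi * k / n
            c, s = math.cos(theta), math.sin(theta)
            if flip:
                # Rotation composed with a reflection across the x-axis.
                elems.append(((c, s), (s, -c)))
            else:
                # Pure rotation by theta.
                elems.append(((c, -s), (s, c)))
    return elems


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def det(g):
    """Determinant: +1 for rotations, -1 for reflections."""
    return g[0][0] * g[1][1] - g[0][1] * g[1][0]


def close(a, b, tol=1e-9):
    """Entrywise comparison up to floating-point tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


def is_closed(elems):
    """Check that the product of any two elements is again in the set."""
    return all(
        any(close(matmul(g, h), e) for e in elems)
        for g in elems
        for h in elems
    )


def apply(g, point):
    """Apply a group element to a 2D point (x, y)."""
    x, y = point
    return (g[0][0] * x + g[0][1] * y, g[1][0] * x + g[1][1] * y)


if __name__ == "__main__":
    group = dihedral_elements(16)  # 16 rotations and 16 reflections
    orbit = [apply(g, (1.0, 0.0)) for g in group]
    print(len(group), "elements; closed under composition:", is_closed(group))
    print("first three orbit points:", orbit[:3])
```

In an image setting, each matrix would correspond to a rotation or reflection of the input image, and the activations collected for all transformed copies would form the data on which dictionary learning is performed.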

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:2410.24184 [cs.LG]
(or arXiv:2410.24184v2 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.2410.24184

Submission history

From: Liv Gorton
[v1] Thu, 31 Oct 2024 17:47:01 UTC (756 KB)
[v2] Fri, 1 Nov 2024 03:29:29 UTC (756 KB)
