sparse-autoencoder

Here are 76 public repositories matching this topic...

vgel / repeng

A library for making RepE control vectors

machine-learning transformers language-model sparse-autoencoders sae sparse-autoencoder saes representation-engineering

Updated Sep 24, 2025
Jupyter Notebook

PaulPauls / llama3_interpretability_sae

A complete end-to-end pipeline for LLM interpretability with sparse autoencoders (SAEs) using Llama 3.2, written in pure PyTorch and fully reproducible.

pytorch feature-extraction open-research sparse-autoencoder llama3 llm-interpretability feature-steering

Updated Mar 23, 2025
Python

ruizheliUOA / Awesome-Interpretability-in-Large-Language-Models

This repository collects all relevant resources about interpretability in LLMs

dictionary-learning sparse-autoencoder interpretability-and-explainability mechanistic-interpretability

Updated Nov 1, 2024

wblgers / tensorflow_stacked_denoising_autoencoder

Implementation of the stacked denoising autoencoder in TensorFlow

tensorflow autoencoder denoising-autoencoders sparse-autoencoder stacked-autoencoder

Updated Aug 21, 2018
Python

codelion / pts

Pivotal Token Search

Updated Dec 20, 2025
Python

neilwen987 / CSR_Adaptive_Rep

Official Code for Paper: Beyond Matryoshka: Revisiting Sparse Coding for Adaptive Representation

retrieval efficient-algorithm sparse-autoencoder sparse-representations matryoshka-representation-learning

Updated May 6, 2026
Python

syorami / Autoencoders-Variants

PyTorch implementations of various types of autoencoders

deep-learning pytorch autoencoder variational-autoencoder sparse-autoencoder

Updated Dec 4, 2018
Python

explanare / ravel

Evaluate interpretability methods on localizing and disentangling concepts in LLMs.

intervention interpretability sparse-autoencoder probing disentangled-representations causal-intervention

Updated Oct 30, 2025
Jupyter Notebook

glami / sansa

SANSA - sparse EASE for millions of items

collaborative-filtering recommender-system sparse-matrix sparse-autoencoder approximate-inverse

Updated Nov 21, 2025
Python

recombee / CompresSAE

Sparse Embedding Compression for Scalable Retrieval in Recommender Systems

recommender-systems sae similarity-search sparse-autoencoder embedding-compression

Updated Nov 21, 2025
Python

tim-lawson / mlsae

Multi-Layer Sparse Autoencoders (ICLR 2025)

transformer sae sparse-autoencoder mechanistic-interpretability

Updated Feb 6, 2026
Python

khoink94 / tensorflow-Deep-learning

TensorFlow examples

Updated May 11, 2017
Python

corl-team / flexsae

Official Triton kernels for TopK and HierarchicalTopK Sparse Autoencoder decoders.

triton emnlp sae interpretability sparse-autoencoder llm emnlp2025

Updated Sep 29, 2025
Python

snooky23 / K-Sparse-AutoEncoder

Sparse autoencoder and regular MNIST classification with mini-batches

deep-neural-networks python3 mnist-dataset pure-python sparse-autoencoder

Updated Jul 18, 2025
Jupyter Notebook

Ki-Seki / Awesome-Transformer-Visualization

Explore visualization tools for understanding Transformer-based large language models (LLMs)

visualization awesome interactive transformer attention-mechanism bert gemma interactive-visualizations sae sparse-autoencoder explainable-ai large-language-models llm mechanistic-interpretability

Updated Dec 1, 2024

mrquincle / keras-adversarial-autoencoders

Experiments with Adversarial Autoencoders using Keras

jupyter keras autoencoder variational-autoencoder sparse-autoencoder adversarial-autoencoder

Updated Dec 31, 2019
Jupyter Notebook

zer0int / CLIP-SAE-finetune

Sparse Autoencoders (SAE) vs CLIP fine-tuning fun.

vit fine-tune clip sae adversarial-learning sparse-autoencoder finetune fine-tuning adversarial-attacks vision-transformer

Updated Dec 19, 2024
Python

Butanium / tiny-activation-dashboard

A tiny easily hackable implementation of a feature dashboard.

sparse-autoencoders sparse-autoencoder feature-visualization feature-dashboard

Updated Oct 21, 2025
Jupyter Notebook

ssfgunner / VL-SAE

[NeurIPS 2025] Official repository for VL-SAE: Interpreting and Enhancing Vision-Language Alignment with a Unified Concept Set

sparsity autoencoder interpretability sparse-autoencoder interpretable-machine-learning vision-language-model

Updated Oct 29, 2025
Python

09Catho / axon

Real-time 3D visualisation of SAE feature activations inside GPT-2, token by token

python threejs machine-learning deep-learning websocket 3d-visualization sparse-autoencoder fastapi gpt2 mechanistic-interpretability transformerlens llm-interpretability

Updated May 19, 2026
JavaScript
