sparse-autoencoder-training

Guides users in training Sparse Autoencoders to analyze neural network activations and discover interpretable features.

Security score: 93/100

The sparse-autoencoder-training skill was audited on May 23, 2026. The audit found 7 low-severity security issues in 1 threat category (external URL references). Review the findings below before installing.

Security Issues

low, line 331

External URL reference

Source: SKILL.md
331: Browse pre-trained SAE features at [neuronpedia.org](https://neuronpedia.org):

low, line 364

External URL reference

Source: SKILL.md
364: - [ARENA SAE Curriculum](https://www.lesswrong.com/posts/LnHowHgmrMbWtpkxx/intro-to-superposition-and-sparse-autoencoders-colab)

low, line 367

External URL reference

Source: SKILL.md
367: - [Towards Monosemanticity](https://transformer-circuits.pub/2023/monosemantic-features) - Anthropic (2023)

low, line 368

External URL reference

Source: SKILL.md
368: - [Scaling Monosemanticity](https://transformer-circuits.pub/2024/scaling-monosemanticity/) - Anthropic (2024)

low, line 369

External URL reference

Source: SKILL.md
369: - [Sparse Autoencoders Find Highly Interpretable Features](https://arxiv.org/abs/2309.08600) - Cunningham et al. (ICLR 2024)

low, line 372

External URL reference

Source: SKILL.md
372: - [SAELens Docs](https://jbloomaus.github.io/SAELens/)

low, line 373

External URL reference

Source: SKILL.md
373: - [Neuronpedia](https://neuronpedia.org) - Feature browser

Scanned on May 23, 2026