SAELens

Llama-3-8B SAEs (layer 25, Post-MLP Residual Stream)

Introduction

We train a Gated SAE on the post-MLP residual stream of layer 25 of the Llama-3-8B-Instruct model. The SAE hidden dimension is 65,536 (a 16x expansion factor).

The SAE is trained on 500M tokens from the OpenWebText corpus.

Feature visualizations are hosted at https://www.neuronpedia.org/llama3-8b-it. The wandb run is recorded here.

Load the SAEs

This repository contains the following SAEs:

  • blocks.25.hook_resid_post

Load these SAEs using SAELens as shown below:

from sae_lens import SAE

# <sae_id> is one of the SAE ids listed above, e.g. "blocks.25.hook_resid_post"
sae, cfg_dict, sparsity = SAE.from_pretrained("Juliushanhanhan/llama-3-8b-it-res", "<sae_id>")
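
For a fuller workflow, the sketch below (not part of the original card) shows one way to run the SAE on cached residual-stream activations. It assumes TransformerLens is installed, that you have access to the meta-llama/Meta-Llama-3-8B-Instruct weights, and that the example prompt and variable names are purely illustrative.

# Minimal usage sketch, assuming TransformerLens and access to Llama-3-8B-Instruct.
import torch
from transformer_lens import HookedTransformer
from sae_lens import SAE

device = "cuda" if torch.cuda.is_available() else "cpu"

# Load the SAE trained on the layer-25 post-MLP residual stream.
sae, cfg_dict, sparsity = SAE.from_pretrained(
    "Juliushanhanhan/llama-3-8b-it-res",
    "blocks.25.hook_resid_post",
    device=device,
)

# Load the base model and cache the activations at the hook point the SAE was trained on.
model = HookedTransformer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct", device=device)
_, cache = model.run_with_cache("The quick brown fox jumps over the lazy dog.")
resid = cache["blocks.25.hook_resid_post"]  # shape: [batch, seq, d_model]

# Encode into the 65,536-dimensional sparse feature space and reconstruct.
feature_acts = sae.encode(resid)
reconstruction = sae.decode(feature_acts)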

Citation

@misc{jiatong_han_2024,
    author    = {Jiatong Han},
    title     = {llama-3-8b-it-res (Revision 53425c3)},
    year      = 2024,
    url       = {https://huggingface.co/Juliushanhanhan/llama-3-8b-it-res},
    doi       = {10.57967/hf/2889},
    publisher = {Hugging Face}
}