by NitpickLawyer
47 days ago
> We also release an interactive frontend for exploring NLAs on several open models through a collaboration with Neuronpedia.

Whatever they did on Llama didn't work. Nothing makes sense in their example where they ask the model to lie about 1+1. Either the model is too old or whatever they used isn't working, but the autoencoder's output is nothing like their examples with Claude. Gemma is similarly bad.