| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by roadside_picnic 302 days ago

> Attention as a concept itself is already quite unintuitive

Once you realize that Attention is really just a re-framing of Kernel Smoothing it becomes wildly more intuitive [0]. It also allows you to view Transformers as basically learning a bunch of stacked Kernels which leaves them in a surprisingly close neighborhood to Gaussian Processes.

0. http://bactra.org/notebooks/nn-attention-and-transformers.ht...

1 comments

tough 302 days ago

Nice read

> I'd be grateful for any pointers to an example where system developers (or someone else in a position to know) have verified the success of a prompt extraction.

You can try this yourself with any open source llm setup that lets you provide a system prompt no? Just give it a prompt, ask the model the prompt ,and see if it matches.

gpt-oss is trained to refuse so it wont share (you can provide system prompt on lmstudio)

link