| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by internetguy 300 days ago
	wow - this is really well made! i've been doing research w/ Transformer-based audio/speech models and this is made with incredible detail. Attention as a concept itself is already quite unintuitive for beginners due to is non-linearity, so this also explains it very well

2 comments

roadside_picnic 300 days ago

> Attention as a concept itself is already quite unintuitive

Once you realize that Attention is really just a re-framing of Kernel Smoothing it becomes wildly more intuitive [0]. It also allows you to view Transformers as basically learning a bunch of stacked Kernels which leaves them in a surprisingly close neighborhood to Gaussian Processes.

0. http://bactra.org/notebooks/nn-attention-and-transformers.ht...

link

tough 300 days ago

Nice read

> I'd be grateful for any pointers to an example where system developers (or someone else in a position to know) have verified the success of a prompt extraction.

You can try this yourself with any open source llm setup that lets you provide a system prompt no? Just give it a prompt, ask the model the prompt ,and see if it matches.

gpt-oss is trained to refuse so it wont share (you can provide system prompt on lmstudio)

link

adityamwagh 300 days ago

It’s a very popular article that has been around for a long time!

link

gdiamos 300 days ago

It's so good it is worth revisiting often

link