by Gigachad 249 days ago
I bet they're adding that to the system prompt, at least in the short term while people are paying attention, before looking for a longer-term answer.

The system prompts I've seen are absolutely massive.

1 comment

I find it interesting that their blog post on prompt/context engineering kind of argues against their own ultra-long system prompt. Maybe the prompt isn't "too specific" in the sense of their visual example (too specific / just right / too vague). Blog post: https://www.anthropic.com/engineering/effective-context-engi... and the system prompt: https://docs.claude.com/en/release-notes/system-prompts#sept...
> This attention scarcity stems from architectural constraints of LLMs. LLMs are based on the transformer architecture, which enables every token to attend to every other token across the entire context. This results in n² pairwise relationships for n tokens.

The n² time complexity smells like it could be reduced by algorithm engineering. Maybe a preprocessing pass could filter out tokens (not sure what the right term of art is here) that do not contribute significantly to the meaning of the input, so that nothing has to attend to them. Basically some sort of context compression mechanism.
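
To put rough numbers on that, here is a toy sketch in plain Python. It is not how any real model does this: `full_attention` is a bare-bones single-head attention that counts how many query/key scores it computes, and `prune_tokens` is a made-up pre-pass that keeps the tokens with the largest vector norm, which is only a stand-in for "contributes significantly to the meaning". All names and the scoring rule are assumptions for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def full_attention(queries, keys, values):
    """Every query attends to every key, so n tokens cost n * n scores."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs = []
    n_scores = 0
    for q in queries:
        scores = [dot(q, k) * scale for k in keys]
        n_scores += len(scores)
        weights = softmax(scores)
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
        )
    return outputs, n_scores


def prune_tokens(tokens, keep):
    """Hypothetical pre-pass: keep the `keep` tokens with the largest
    squared norm, preserving their original order. The norm is only a
    placeholder for a real importance score."""
    ranked = sorted(
        range(len(tokens)), key=lambda i: dot(tokens[i], tokens[i]), reverse=True
    )
    kept_indices = sorted(ranked[:keep])
    return [tokens[i] for i in kept_indices]


# Eight toy 2-dimensional "token" vectors.
tokens = [[float(i % 3), float(i % 5)] for i in range(8)]

_, full_cost = full_attention(tokens, tokens, tokens)
kept = prune_tokens(tokens, keep=4)
_, pruned_cost = full_attention(kept, kept, kept)

print(full_cost, pruned_cost)  # 64 16
```

Halving the token count cuts the pairwise work to a quarter, which is the whole appeal. The hard part, which this sketch sidesteps, is deciding which tokens are safe to drop without first doing something like attention over them.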