| HN Mirror



	by SOLAR_FIELDS 142 days ago
	It does feel like with each new frontier model release, the major improvement I notice is that the model is, in fact, getting better at reading your mind. What I mean is that it gets better at understanding the nuance and subtleties of what you are saying, and teasing out the actual intent of what you want better. So it gets easier and easier for the model to build a world around less input. In a significant way, yes, newer models are reading your mind, because they are probabilistically figuring out better how most humans communicate in natural language and filling in the gaps. Re writing code: most people find the writing of code to be a chore. For those that don’t, I don’t envy them, because that is the part that just got completely destroyed by AI. It’s becoming pretty abundantly clear that if you enjoy hand-writing code, it will be a hobby rather than something you can do professionally and succeed at over people who aren’t writing by hand.

3 comments

linkregister 141 days ago

I think that the skill of hand-writing software is still useful in 2026. The vast majority of programming is a module calling another API. This does not spark joy. Truly interesting classes of problems (applying algorithms or complex arcane knowledge) are often not handled well by LLMs. Also, what the author wrote really strikes a chord. We should write the exceptionally difficult sections ourselves so we understand how the software operates.

This reminds me of the observation that Anthropic's unsupervised LLM-generated Rust implementation of sqlite3 was correct for the subset of features they chose, but thousands of times slower (wall clock). Of course, performance will be the next skill to be targeted by expert-led RLHF, but this is a hard problem with many tradeoffs. It may prove to be time-consuming to improve.


andai 141 days ago

Yeah, they have more "common sense", though not as much as I'd like. I used to think Opus was big, but after using it a lot, I think it should actually be a lot bigger. The difference from Sonnet to Opus is really noticeable, but the difference from Opus to human (in common sense) is also massive. I expect as the hardware improves, we'll see 3-10x bigger models become the default.

Small models are making great strides of course, and perhaps we will soon learn to distill common sense ;) but subtlety and nuance appear physically bound to parameter count...


qsera 141 days ago

> teasing out the actual intent of what you want better.

Do you mean they ask clarifying questions before generating a response?


SOLAR_FIELDS 140 days ago

Kind of. I mean that they have gotten way better at taking some braindead sentence like “trace the performance of this app” and deriving what you actually mean, which involves looking at your codebase, identifying your deployment scenario, identifying the steps required to pull the traces, writing the query to sample the traces, then correlating it all together. Just an example: you say 5 words and it’s able to figure out exactly what you want it to do. It might ask questions to clarify, but otherwise it’s really good at figuring out what you actually need.

In the dark before times of 6 months ago, the thing would go completely off the rails and fuck it all up. In today’s world, 80% of the time it’s gonna get you pretty close to what you actually want with literally 5 words for simple tasks.

Complex tasks require more upfront work, but my anecdata suggests that complex tasks are showing similar relative reductions in the upfront planning and effort needed to succeed.
