Hacker News new | ask | show | jobs
by bsaul 152 days ago
An ecosystem is being built around AI : Best prompting practices, mcps, skills, IDE integration, how to build a feedback loop so that LLM can test its output alone, plug to the outside world with browser extensions, etc...

For now i think people can still catch up quickly, but at the end of 2026 it's probably going to be a different story.

3 comments

Okay, end of 2026 then what? No one ever learns how to use the tools after that? No one gets a job until the pre-2026 generation dies?
For now i think people can still catch up quickly, but at the end of 2027 it's probably going to be a different story.
I heard 2028 is when it really gets impossible to catch up.
Lol
> Best prompting practices, mcps, skills, IDE integration, how to build a feedback loop so that LLM can test its output alone, plug to the outside world with browser extensions, etc...

Ah yes, an ecosystem that is fundamentally inherently built on probabilisitic quick sand and even with the "best prompting practices", you still get agents violating the basics of security and committing API keys when they were told not to. [0]

[0] https://xcancel.com/valigo/status/2009764793251664279

One of the skills needed to effectively use AI for code is to know that telling AI "don't commit secrets" is not a reliable strategy.

Design your secrets to include a common prefix, then use deterministic scanning tools like git hooks to prevent then from being checked in.

Or have a git hook that knows which environment variables have secrets in and checks for those.

That's such an incredibly basic concept, surely AIs have evolved to the point where you don't need to explicitly state those requirements anywhere?
They can still make mistakes.

For example, what if your code (that the LLM hasn't reviewed yet) has a dumb feature in where it dumps environment variables to log output, and the LLM runs "./server --log debug-issue-144.log" and commits that log file as part of a larger piece of work you ask it to perform.

If you don't want a bad thing to happen, adding a deterministic check that prevents the bad thing to happen is a better strategy than prompting models or hoping that they'll get "smarter" in the future.

Part of why these things feel "not fit for purpose" is that they don't include the things Simon has spent three years learning? (I know someone else who's doing multi-LLM development where he uses job-specialty descriptions for each "team member" that lets them spend context on different aspects of the problem; it's a fascinating exercise to watch, but it feels even more like "if this is how the tools should be used, why don't they just work that way"?)
Doesn't seem to work for humans all the time either.

Some of this negativity I think is due to unrealistic expectations of perfection.

Use the same guardrails you should be using already for human generated code and you should be fine.

I have tons of examples of AI not committing secrets. this is one screenshot from twitter? I don’t think it makes your point

CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works

> I have tons of examples of AI not committing secrets.

"Trust only me bro".

It takes 10 seconds to see the many examples of API keys + prompts on GitHub to verify that tweet. The issue with AI isn't limited to that tweet which demonstrates its probabilistic nature; Otherwise why do need a sandbox to run the agent in the first place?

Nevermind, we know why: Many [0] such [1] cases [2]

> CPUs are billions of transistors. sometimes one fails and things still work. “probabilistic quicksand” isn’t the dig you think it is to people who know how this stuff works

Except you just made a false equivalence. CPUs can be tested / verified transparently and even if it does go wrong, we know exactly why. Where as you can't explain why the LLM hallucinated or decided to delete your home folder because the way it predicts what it outputs is fundamentally stochastic.

[0] https://old.reddit.com/r/ClaudeAI/comments/1pgxckk/claude_cl...

[1] https://old.reddit.com/r/ClaudeAI/comments/1jfidvb/claude_tr...

[2] https://www.google.com/search?q=ai+deleted+files+site%3Anews...

you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)

my point is more “skill issue” than “trust me this never happens”

my point on CPUs is people who don’t understand LLMs talk like “hallucinations” are a real thing — LLMs are “deciding” to make stuff up rather than just predicting the next token. yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are. can you really explain in detail how everything you use works? I’m guessing I can explain failure modes of agentic systems (and how to avoid them so you don’t look silly on twitter/github) and how neural networks work better than most people can explain the technology they use every day

> you could find tons of API keys on GitHub before these “agentic” tools too. that was my point, one screenshot from twitter vs one anecdote from me. I don’t think either proves the point, but posting a screenshot from twitter like it’s proof of some widespread problem is what I was responding to (N=2, 1 vs 1)

That doesn't refute the probabilistic nature of LLMs despite best prompting practices. In fact it emphasises it. More like your 1 anecdotal example vs my 20+ examples on GitHub.

My point tells you that not only it indeed does happen, but a previous old issue is now made even worse and more widespread, since we now have vibe-coders without security best practices assuming the agent should know better (when it doesn't).

> my point is more “skill issue” than “trust me this never happens”

So those that have this "skill issue" are also those who are prompting the AI differently then? Either way, this just inadvertently proves my whole point.

> yes it’s probabilistic, so is practically everything else at scale. yet it works and here we are.

The additional problem is can you explain why it went wrong as you scale the technology? CPUs circuit design go through formal verification and if a fault happens, we know exactly why; hence it is deterministic in design which makes them reliable.

LLMs are not and don't have this. Which is why OpenAI had to describe ChatGPT's misaligned behaviour as "sycophancy", but could not explain why it happened other than tweaking the hyper-parameters which got them that result.

So LLMs being fundamentally probabilistic and are hence, more unexplainable being the reason why you have the screenshot of vibe-coders who somehow prompted it wrong and the agent committed the keys.

Maybe that would never have happened to you, but it won't be the last time we see more of this happening on GitHub.

I was pointing out one screenshot from twitter isn’t proof of anything just to be clear; it’s a silly way to make a point.

yes AI makes leaking keys on GH more prevalent, but so what? it’s the same problem as before with roughly the same solution

I’m saying neural networks being probabilistic doesn’t matter — everything is probabilistic. you can still practically use the tools to great effect, just like we use everything else that has underlying probabilities

OpenAI did not have to describe it as sycophancy, they chose to, and I’d contend it was a stupid choice

and yes, you can explain what went wrong just like you can with CPUs. we don’t (usually) talk about quantum-level physics when discussing CPUs; talking about neurons in LLMs is the wrong level of abstraction

I have tons of examples of drivers not running into objects.
like my other comment, my point is one screenshot from twitter vs one anecdote. neither proves anything. cool snarky response though!
> probably going to be a different story

Can you elaborate? Skill in AI use will be a differentiator?

Yes.

At some point you will need to combine multiple skills together:

- communication

- engineering skills (understanding requirements, finding edge cases, etc)

- architectural proficiency

- prompting

- agentic workflows and skills

- context management

- and yes, proper old fashioned coding skills to keep things tidy and consistent