| HN Mirror


by AntiUSAbah 42 days ago

There is always marketing involved and people should be able to put marketing into perspective.

Also, curl in this regard is an open source project, relatively small but critical, well known and used everywhere. Besides image libraries, tools like curl, sudo, su, passwd, etc. would also be the first things I'd try.

Mythos is still not known at all what it can do. What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

Nonetheless, the fact that LLMs got significantly better at finding this, better than humans, started to happen half a year ago? So at some point we need to address the elephant in the room and state that today you need to do security scanning with LLMs in addition. You need to take this seriously.

In the worst case, use Anthropic's marketing to state that it's a must now and that something changed.

3 comments

Tade0 41 days ago

> What does it mean from cost and benchmark pov to have a 10 Trillion parameter model?

To me it means that we've hit the top end of the S-curve with regard to the effects of scaling: if the tool isn't remarkably better despite the scale, then we're firmly in diminishing returns territory.

u_fucking_dork 41 days ago

> Mythos is still not known at all what it can do.

And this is very much on purpose my friend. Think about what people already believe it can do though.

flohofwoe 42 days ago

> Nonetheless, the fact that LLMs got significantly better at finding this, better than humans, started to happen half a year ago?

*rolls eyes* Regular static analyzers have also been "better than humans" for decades; being better than a human at a specific mechanical task really doesn't mean much. The interesting new thing is the type of potential "fuzzy bugs" described in the article that LLMs are able to identify: a comment not matching the code it describes, uncommon usage of a 3rd party library, a mismatch between code and the protocol it implements, or often just generally weird looking code somebody should have a closer look at. This closes a gap in the traditional debugging toolboxes, but shouldn't replace them.
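
To make the first kind of fuzzy bug concrete, here is a made-up sketch in Python of a comment that does not match its code. `clamp_port` and `clamp_port_fixed` are hypothetical names for illustration only; they are not taken from curl or from the article.

```python
def clamp_port(port: int) -> int:
    """Clamp port to the valid TCP range 1..65535 (inclusive)."""
    # Fuzzy bug: the docstring promises a lower bound of 1, but the code uses 0.
    # The types are fine and nothing crashes, so a type checker has nothing to flag.
    return max(0, min(port, 65535))


def clamp_port_fixed(port: int) -> int:
    """Clamp port to the valid TCP range 1..65535 (inclusive)."""
    # Code and docstring now agree on the lower bound.
    return max(1, min(port, 65535))
```

Nothing here is mechanically wrong; the problem only shows up when somebody (or something) reads the docstring and the code side by side.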

AntiUSAbah 41 days ago

You don't have to dismantle a comment on a microlevel.

It has been clear for ages that certain types of bugs or issues are better solved by software.

But there were still plenty of things a proper SecOps person would be able to find with help from tooling that automatic tooling alone wouldn't find.

Taking a limited amount of resources and focusing on the critical things.

I do think this is gone now. Same with threat modeling etc.

pixl97 41 days ago

Static analyzers are balls. For every real bug they find you are dealing with piles of false positives and negatives.

Now, I'm not saying you shouldn't use them. They do catch the low hanging fruit. It's that LLMs actually have a much better understanding of things like intent when looking at your code and general architecture configurations that can lead to problems.

As you say, we've had static analyzers forever, hence why they aren't dropping out 50 new CVEs a day. LLMs are. There is a massive stack of software out there that is getting analyzed and exploited at a rate faster than it's getting patched. Add to that things like npm's exploited package of the day and popular GitHub repository takeovers, and this year looks massively different from last year in quantity and quality of exploits alone.

flohofwoe 41 days ago

IME LLMs generate at least as many false positives as static analyzers, but they're good at catching entirely different types of problems than static analyzers. 99% of false positives are avoided with proper assert hygiene, and from what I've seen that seems to be true both for traditional static analyzers and LLMs. Those asserts annotate the code with valuable hints that may go beyond a specific type system's capabilities.
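
A rough sketch of what that can look like, written in Python for brevity (the same idea applies to `assert` in C). The function names and the header example are assumptions made up for illustration. A checker such as mypy narrows `Optional[str]` to `str` after `assert value is not None`, and the second assert records a fact about the value that the type `str` cannot express.

```python
from typing import Dict, Optional


def find_header(headers: Dict[str, str], name: str) -> Optional[str]:
    """Look up a header; keys in `headers` are assumed to be lowercase."""
    return headers.get(name.lower())


def content_length(headers: Dict[str, str]) -> int:
    """Parse Content-Length; the caller is assumed to have validated the headers."""
    value = find_header(headers, "Content-Length")
    # Without this assert, a checker would warn that `value` may be None below.
    assert value is not None, "caller must ensure Content-Length is present"
    # Hint beyond the type system: the string holds only decimal digits.
    assert value.isdigit(), "Content-Length must be a non-negative integer"
    return int(value)
```

For example, `content_length({"content-length": "42"})` returns 42, while a missing header trips the first assert instead of leaving the tool (or the reviewer) guessing. Note that Python drops `assert` statements when run with `-O`, so these are annotations for tools and readers, not a substitute for input validation.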
