Hacker News new | ask | show | jobs
by noperator 481 days ago
A concept that I've been thinking about a lot lately: transforming complex problems into document ranking problems to make them easier to solve. LLMs can assist greatly here, as I demonstrated at inaugural DistrictCon this past weekend.
1 comments

So would this be 1600 commits and one of which fixes the bug (which might be easier with commit messages?) or is this a diff between two revisions, with 1600 chunks, each chunk a “document” ?

I am trying to grok why we want to find the fix - is it to understand what was done so we can exploit unpatched instances in the wild?

Also also

“identifying candidate functions for fuzzing targets“ - if every function is a document I get where the list of documents is, what what is the query - how do I say “find me a function most suitable to fuzzing”

Apologies if that’s brusque - trying to fit new concepts in my brain :-)

Great questions. For commits or revision diffs as documents—either will work. Yes, I've applied this to N-day vulnerability identification to support exploit development and offensive security testing. And yes, for fuzzing, a sensible approach would be to dump the exported function attributes (names, source/disassembled code, other relevant context, etc.) from a built shared library, and ask, "Which of these functions most likely parses complex input and may be a good candidate for fuzzing?" I've had some success with that specific approach already.