HN Mirror

by ozgrakkurt 1 day ago

You are not thinking straight if you are making out of bounds errors in inline asm.

Inline asm should take 10x or more effort compared to writing the surrounding C++ code, and it should be tested with protected pages at the edges of its buffers if possible. It should also always have assertions before and after it that check invariants.
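A minimal sketch of the "protected pages at the edges" idea, assuming a POSIX system with `mmap` and `mprotect`. The names `GuardedBuffer`, `make_guarded` and `free_guarded` are hypothetical helpers made up for illustration, not part of any tool discussed in this thread. The buffer ends flush against an inaccessible page, so a one-byte overrun faults immediately instead of silently reading a neighbor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <unistd.h>

// Hypothetical test helper: a buffer with PROT_NONE pages on both sides.
struct GuardedBuffer {
    void* map;            // start of the whole mapping (leading guard page)
    std::size_t map_len;  // total mapped bytes, guards included
    unsigned char* data;  // usable bytes are [data, data + len)
    std::size_t len;
};

inline GuardedBuffer make_guarded(std::size_t len) {
    assert(len > 0);
    const std::size_t page = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
    const std::size_t body = (len + page - 1) / page * page;  // round up to whole pages
    const std::size_t total = body + 2 * page;                // guard + body + guard

    // Map everything inaccessible, then open up only the body.
    void* map = mmap(nullptr, total, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED) return GuardedBuffer{nullptr, 0, nullptr, 0};

    unsigned char* base = static_cast<unsigned char*>(map);
    if (mprotect(base + page, body, PROT_READ | PROT_WRITE) != 0) {
        munmap(map, total);
        return GuardedBuffer{nullptr, 0, nullptr, 0};
    }
    // Place the data so its END touches the trailing guard page. The start is
    // only flush against the leading guard when len is a multiple of the page size.
    return GuardedBuffer{map, total, base + page + body - len, len};
}

inline void free_guarded(const GuardedBuffer& g) {
    if (g.map != nullptr) munmap(g.map, g.map_len);
}
```

A test would run the asm routine on `g.data` with length `g.len`. Catching underruns the same way needs a second run with the data placed at the start of the body instead.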

Also, there are a lot of cases where this won't work. One example is implementing strlen using AVX-512, where you want to align the address down to a multiple of 64 and run until the end of the page, so you can use SIMD while avoiding a segfault.
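The address arithmetic behind that strlen trick can be sketched in portable C++ without any intrinsics. The 4096-byte page size and the names `align_down`, `valid_lanes` and `same_page` are assumptions for illustration only. An aligned 64-byte block never straddles a page, so loading it cannot fault as long as the first byte of the string is mapped. The bytes before the string start are read but masked out of the comparison.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uintptr_t kBlock = 64;    // bytes in one 512-bit vector
constexpr std::uintptr_t kPage  = 4096;  // assumed page size

// Round an address down to the start of its 64-byte block.
constexpr std::uintptr_t align_down(std::uintptr_t addr) {
    return addr & ~(kBlock - 1);
}

// Bit i is set when byte i of the aligned block lies at or after addr,
// i.e. belongs to the string rather than to the bytes in front of it.
constexpr std::uint64_t valid_lanes(std::uintptr_t addr) {
    return ~std::uint64_t{0} << (addr & (kBlock - 1));
}

// True when the aligned block containing addr sits inside a single page.
// With kPage a multiple of kBlock this holds for every address.
constexpr bool same_page(std::uintptr_t addr) {
    return align_down(addr) / kPage == (align_down(addr) + kBlock - 1) / kPage;
}
```

In a real AVX-512 version, the result of the zero-byte comparison on the first block would be ANDed with `valid_lanes(addr)` before searching for the first set bit.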

Another example is just handling loop remainders with masking in AVX-512.
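Remainder handling with a mask can be illustrated with a scalar stand-in, so the sketch runs on any machine. `tail_mask`, `masked_add_bytes` and `add_bytes` are hypothetical names, and the inner loop only imitates what a single masked 64-lane instruction would do. The point is that the final partial block uses the same code path as the full blocks and never touches lanes whose mask bit is clear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Mask with the low `rem` bits set. rem must be below 64, because
// shifting a 64-bit value by 64 is undefined behavior.
inline std::uint64_t tail_mask(std::size_t rem) {
    assert(rem < 64);
    return (std::uint64_t{1} << rem) - 1;
}

// Scalar stand-in for one masked 64-lane byte add: lanes whose mask bit
// is 0 are neither read nor written.
inline void masked_add_bytes(std::uint8_t* dst, const std::uint8_t* src,
                             std::uint64_t mask) {
    for (unsigned lane = 0; lane < 64; ++lane) {
        if ((mask >> lane) & 1) {
            dst[lane] = static_cast<std::uint8_t>(dst[lane] + src[lane]);
        }
    }
}

// dst[i] += src[i] for i in [0, n): full blocks first, then one masked tail.
inline void add_bytes(std::uint8_t* dst, const std::uint8_t* src, std::size_t n) {
    std::size_t i = 0;
    for (; i + 64 <= n; i += 64) {
        masked_add_bytes(dst + i, src + i, ~std::uint64_t{0});
    }
    if (i < n) {
        masked_add_bytes(dst + i, src + i, tail_mask(n - i));
    }
}
```

From the checker's point of view, the tail block nominally spans 64 bytes even though only `n - i` of them are accessed, which is the kind of access a naive bounds check would reject.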

Also, it is pretty naive to think an LLM got this right.

Overall it seems like a huge waste of time.

If you are writing inline asm and want to make it better, just get as many LLMs or, even better, humans to review it. LLMs are really good at finding mistakes in inline asm, with a high false positive rate though, so you have to understand the concept.

For example, one bug I had was about not consuming the inputs before writing to the outputs. The compiler can assign the same register to an input and an output unless the output is marked with & (the early-clobber modifier). It was super frustrating to debug this until I asked an LLM and it found the problem.
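A small sketch of that bug class, assuming GCC or Clang extended asm on x86-64 with the default AT&T operand order. The function `add_twice` is a made-up example, and other targets fall back to plain C++. The output is written by the first instruction while an input is still read afterwards, so without `&` the compiler would be allowed to put `out` and `b` in the same register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical example: returns a + 2 * b.
inline std::uint64_t add_twice(std::uint64_t a, std::uint64_t b) {
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
    std::uint64_t out;
    asm("mov %1, %0\n\t"   // out = a   (the output is written here...)
        "add %2, %0\n\t"   // out += b  (...but input b is still read here)
        "add %2, %0"       // out += b
        : "=&r"(out)       // '&' = early clobber: out gets a register of its own
        : "r"(a), "r"(b)
        : "cc");           // add modifies the flags
    return out;
#else
    return a + 2 * b;      // portable fallback for other targets
#endif
}
```

With a plain `"=r"(out)` constraint, `out` and `b` may share a register. The `mov` then overwrites `b`, and the function returns `4 * a` instead. Whether that happens depends on register allocation, which is why the bug can hide until an unrelated change. A quick check is `assert(add_twice(3, 4) == 11)`.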

1 comment

avadodin 9 hours ago

Just don't add memory safety bugs is solid advice, but treating "asm" as C's "unsafe" keyword would void the memory safety guarantees in Fil-C.

I don't know how the author's proprietary LLM swarm handled the job, but his stated approach sounds reasonable to me.

ozgrakkurt 3 hours ago

> Just don't add memory safety bugs is solid advice, but treating "asm" as C's "unsafe" keyword would void the memory safety guarantees in Fil-C.

This doesn't sound right to me, and I have written a decent amount of inline assembly in C-like C++ code.

Are you saying this because you had unexpected memory safety bugs in inline asm?

avadodin 2 hours ago

There is a difference between being probably safe and being provably safe.

Rust's unsafes are likely safe.

Assembly snippets in the Linux kernel are likely safe.

These statements have no bearing on whether the present asm block being compiled right now is actually for-a-fact safe.

When none of the instructions perform a memory access, that is a guarantee.

As a diagnostics tool, Fil-C finds issues that are rarely present in any code I work on. A large subset of the issues are C++-adjacent.

I still believe its ideas, if applied correctly, can secure systems where someone thought an object system hacked in a weekend in C belongs in the tool "all LLMs depend on" or whatever.

BTW, I do hand-write general-purpose assembly that is not a straightforward intrinsic-equivalent, and early drafts are full of all sorts of memory and register bank safety bugs.

ozgrakkurt 31 minutes ago

Thanks for explaining. I understand the importance of memory safety, but I don't think it is relevant enough in an inline assembly context to justify implementing something like this.
