by fragmede 434 days ago
One thing I've thought about for coding with LLMs: feeding source code through a tokenizer built for English text (CLIP's or whatever) seems like it would be suboptimal compared to training on the AST that the compiler generates after parsing the source.
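
To make the idea concrete, here's a rough sketch of what "AST as input" could look like, using Python's standard `ast` module as a stand-in for a compiler front end. The `ast_tokens` helper and the pre-order flattening are my own assumptions for illustration, not how any particular model is actually trained.

```python
import ast


def ast_tokens(source: str) -> list[str]:
    """Flatten a Python AST into a pre-order list of node-type names,
    plus identifier names and literal values (hypothetical scheme)."""
    tokens: list[str] = []

    def visit(node: ast.AST) -> None:
        tokens.append(type(node).__name__)
        if isinstance(node, ast.Name):
            tokens.append(node.id)            # keep the identifier itself
        elif isinstance(node, ast.Constant):
            tokens.append(repr(node.value))   # keep the literal value
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(ast.parse(source))
    return tokens


if __name__ == "__main__":
    compact = "x=1+2"
    spaced = "x   =   1 +  2"
    # Text-level splitting sees two different inputs...
    print(compact.split(), spaced.split())
    # ...while the AST-derived sequence is the same for both.
    print(ast_tokens(compact))
    print(ast_tokens(compact) == ast_tokens(spaced))
```

The point is just that whitespace and formatting differences vanish at the AST level, while the structure (an assignment containing a binary operation) is spelled out explicitly instead of left for the model to infer from characters.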