Hacker News
by ifyoubuildit 1009 days ago
The difference there is that you can probably look at (or design a test for) that hodgepodge of regexes and understand the range of outputs.

You can prompt GPT-4 and get something that looks plausible for a few test cases with very little effort, but can you get any guarantee that it will behave reasonably for most inputs? And if you can, will those guarantees hold as the model is updated underneath you?
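
To make the "range of outputs" point concrete, here is a rough sketch. The rules, labels, and the `route` helper are all made up for illustration, not taken from any real system. The point is only that with a pile of regexes you can list every possible output by reading the rule table, which you can't do with a prompt.

```python
import re

# Hypothetical rule table, invented for illustration only.
RULES = [
    (re.compile(r"\b(refill|prescription)\b", re.I), "pharmacy"),
    (re.compile(r"\b(appointment|reschedule)\b", re.I), "scheduling"),
    (re.compile(r"\b(bill|invoice|payment)\b", re.I), "billing"),
]
FALLBACK = "human_review"


def route(message):
    """Return the label of the first rule that matches, else the fallback."""
    for pattern, label in RULES:
        if pattern.search(message):
            return label
    return FALLBACK


def possible_outputs():
    """Every value route() can ever return, read straight off the rules."""
    return {label for _, label in RULES} | {FALLBACK}
```

Whatever string you throw at `route`, the result is one of the four labels in `possible_outputs()`. It may still be the wrong label, but it can't be something you never anticipated.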

1 comment

I would be very worried that the LLM would say something medically wrong, and we'd get sued for a lot of money. ISTM that a better thing to do is to use the LLM to generate a lot of training data that you then test your handwritten super-regex against.
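
A minimal sketch of what I mean, with everything in it assumed for illustration: the dosage pattern, the `mentions_dosage` helper, and the example list are hypothetical. The list is hardcoded here as a stand-in for cases you'd ask the LLM to generate, and you'd presumably want a person to check the expected labels before trusting them.

```python
import re

# Hypothetical handwritten pattern: flags text that mentions a dosage.
DOSAGE_RE = re.compile(r"\b\d+(?:\.\d+)?\s?(?:mg|ml|mcg)\b", re.I)


def mentions_dosage(text):
    """True if the handwritten regex finds a dosage-like token."""
    return DOSAGE_RE.search(text) is not None


# Stand-in for LLM-generated (text, expected) pairs; hardcoded so the
# sketch runs without calling any model.
GENERATED_CASES = [
    ("Take 200 mg twice a day", True),
    ("I took 5ml of the syrup", True),
    ("0.5 mcg was prescribed", True),
    ("Can I move my appointment to Friday?", False),
    ("My order number is 200", False),
]


def failures(cases):
    """Return the cases where the regex disagrees with the expected label."""
    return [(text, expected) for text, expected in cases
            if mentions_dosage(text) != expected]
```

The LLM only ever produces test inputs offline. What runs in production is still the regex, so nothing the model says reaches a user.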