HN Mirror


by gnat 1136 days ago

I find that when it writes code, I can't trust its output. I handle that by being Grumpy Old Programmer: I eyeball it closely and ask myself about error handling, assumptions, off-by-one errors, is it confusing the 1.1 API with 2.0, is it efficient or naive, and all the other questions that happen in a code review.

So you're pushing 100 Chinese numbers through ChatGPT to get Arabic equivalents. What do you then do to ensure the quality of output is high? Do you eyeball the list and go "hm, seems plausible"? Spot checks? Is there some context around the lists that means erroneous translations will be quite obvious to the trained eye?

I'm always curious what QA looks like in other fields.

1 comment

tatrajim 1128 days ago

I'm working with historical documents from the 18th century, and will likely be the last person to look at them for a decade or so, so I need to be correct. I translate them quickly by hand, then use ChatGPT to check my work. It tends to catch my errors (about 1 in 20), and I catch its predictable mistakes (easy to see), which occur about 1 time in 10. The type of mistake it makes (inventing a digit in the 1000s position instead of using a 0) is one I am unlikely to make, and vice versa (I am usually off by one due to a typo). So a strange, but serviceable team.
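For anyone wanting a third, deterministic opinion alongside the human and the model, here is a minimal sketch of what that cross-check might look like. It is not the workflow described above, just one way to flag rows where the manual value, the model's value, and a rule-based parse disagree. The names `chinese_to_int` and `find_disagreements` are made up for illustration, and the parser assumes plain positional numerals below 100,000,000 (no 亿, no financial forms like 壹/貳, no colloquial shorthand like 一万五).

```python
# Hypothetical helper names; a sketch under the assumptions stated above.
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "兩": 2, "三": 3,
          "四": 4, "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
TEN_THOUSAND = {"万", "萬"}


def chinese_to_int(text):
    """Parse a plain Chinese numeral below 10**8 into an int."""
    total = 0    # value already closed off by a 万
    section = 0  # value accumulated in the current four-digit section
    digit = 0    # digit waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare 十 (as in 十二) means one ten.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in TEN_THOUSAND:
            total += (section + digit) * 10_000
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


def find_disagreements(rows):
    """rows: iterable of (source, manual_value, model_value).

    Returns (source, parsed, manual, model) for every row where the
    three values are not all equal, so a human can re-read just those.
    """
    flagged = []
    for source, manual, model in rows:
        parsed = chinese_to_int(source)
        if not (parsed == manual == model):
            flagged.append((source, parsed, manual, model))
    return flagged
```

A rule-based parser will not make the "invented digit in the thousands place" mistake, but it will happily misread anything outside its assumptions, so the flagged rows still need a human look at the source.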