Hacker News
by mike256 1322 days ago
Do I understand this correctly? When I have to create a spreadsheet like this, there are two options. Option 1: I write a table mapping ZIP code to state and use it to generate my column. If I check my table carefully, my spreadsheet will be okay. Option 2: I ask GPT-3 to do my work, but then I have to check the whole spreadsheet for errors.

I dealt with something similar. I was creating a large directory of childcare centres in Canada. I had thousands of listings with a URL but no email address. I created a Mechanical Turk job asking turkers to go to each website and find an email address. Many came back with email addresses like admin@<<actualURL>>.com. After checking a few, I realized that the turkers were just guessing that admin@ would work and that I'd approve their work. I ended up having to double check all the work.
> I ended up having to double check all the work.

Me too. Different project and different labelling company, but the same conclusion: it's better to do it in-house. Labelling is hard. You need to see, talk with, and train your labelling team.

That’s why you always set up layers of work with MTurk, with later layers validating the earlier ones. Or give the same task to multiple workers and compare the results.
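
A minimal sketch of the "multiple workers, compare the results" idea, assuming each task's answers arrive as a plain list of strings. The function name `consensus` and the `min_agreement` threshold are made up for illustration and are not part of any MTurk API.

```python
from collections import Counter

def consensus(answers, min_agreement=2):
    """Return the answer that at least `min_agreement` workers gave, else None."""
    # Normalise so differences in case or stray whitespace don't count as disagreement.
    cleaned = [a.strip().lower() for a in answers if a and a.strip()]
    if not cleaned:
        return None
    # most_common(1) gives the single most frequent answer and its count.
    value, count = Counter(cleaned).most_common(1)[0]
    return value if count >= min_agreement else None
```

Tasks that come back as `None` would go to another round or to manual review. Agreement alone won't catch workers who all make the same guess (such as admin@), so a separate validation layer is probably still worth having.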
For getting email addresses when I had a URL, it would likely have been easier to set up a scraper to visit each page and pull any string with an @ in it, then scroll through the list to find the most obvious general intake email.
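
Roughly what that scraper's extraction step could look like, assuming the page's HTML has already been downloaded into a string. The regex is deliberately loose and the name `extract_emails` is just for illustration; picking the general intake address from the results is still a manual step.

```python
import re

# Loose pattern: a local part, "@", then a domain containing at least one dot.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")

def extract_emails(page_text):
    """Return the unique email-like strings in page_text, in order of first appearance."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        email = match.lower()  # treat Info@ and info@ as the same address
        if email not in seen:
            seen.append(email)
    return seen
```

This won't find addresses that are obfuscated ("info at example dot ca") or rendered by JavaScript, so some pages would still need a human to look at them.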
I wonder if those workers can be reported and fired.
It was quite a while ago, so I don't remember what my options were. If I recall correctly, I was able to see individual workers' responses, so once I found one that was obviously faking results, I was able to reject all their submissions.
I mean, depending on how the OP phrased the work to be done, they probably did valid work.