by eserorg
6060 days ago
Sweet! I'm hacking together a Perl script right now using the Mechanical Turk API. It's automatically splitting up the MP3 files, generating the HTML forms, and loading them into MTurk. I'm going to use 2x coverage for each chunk and see what happens. This is a brilliant idea. Thank you for the suggestion! I can't believe it didn't occur to me before -- and we are _very_ heavy users of AWS.

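For anyone following along, here is a rough sketch of the chunk planning with 2x coverage described above. It is written in Python rather than Perl and is not the actual script. The names `ChunkTask`, `plan_chunks`, and `total_assignments`, and the 60-second chunk length, are all assumptions made up for illustration. It only computes chunk boundaries and how many transcriptions to request. It does not cut any audio or call the Mechanical Turk API.

```python
from dataclasses import dataclass


@dataclass
class ChunkTask:
    """One chunk of audio to be transcribed (hypothetical structure)."""
    source: str      # name of the original audio file
    index: int       # position of the chunk within the file
    start_s: float   # chunk start, in seconds
    end_s: float     # chunk end, in seconds
    coverage: int    # number of independent transcriptions requested


def plan_chunks(source, duration_s, chunk_s=60.0, coverage=2):
    """Split a recording of duration_s seconds into consecutive chunks.

    The 60-second default is an assumed value, not one from the thread.
    """
    if chunk_s <= 0:
        raise ValueError("chunk_s must be positive")
    tasks = []
    start = 0.0
    index = 0
    while start < duration_s:
        # The last chunk may be shorter than chunk_s.
        end = min(start + chunk_s, duration_s)
        tasks.append(ChunkTask(source, index, start, end, coverage))
        start = end
        index += 1
    return tasks


def total_assignments(tasks):
    """Total number of transcriptions requested across all chunks."""
    return sum(task.coverage for task in tasks)
```

With these assumptions, `plan_chunks("call.mp3", 150.0)` gives three chunks (0-60, 60-120, and 120-150 seconds), and 2x coverage means six transcription assignments in total. Two transcripts per chunk can then be compared against each other to spot disagreements.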
The system we use now has multiple stages. We split the files into smaller chunks, which are then picked up by our transcribers. Each transcript is then reviewed, speaker initials and timestamps are added, and finally the chunks are collated.
We've gotten pretty decent results with our system so far, and have some very satisfied customers.
More about our process at http://callgraph.biz/transcriptionservice#process