by woodson
622 days ago
There just isn’t a single one-size-fits-all model/pipeline. You choose the right one for the job, depending on whether you need streaming (i.e., low latency, with words output as they’re spoken), whether it runs on device (e.g., a phone) or on a server, which languages/dialects you need to cover, whether the speech is conversational or more “produced” like a news broadcast or podcast, etc. The best way to choose is to benchmark with data from your target domain.
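For what it's worth, here is a rough sketch of what such a benchmark could look like. It assumes word error rate (WER) as the metric, which the comment above doesn't specify, and the names `word_edit_distance`, `benchmark`, `models`, and `samples` are made up for illustration. Each "model" is just a callable that takes an audio item and returns a transcript string, so you would wrap whatever engine you're actually testing.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def word_edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Word-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the ref prefix seen so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def benchmark(models: Dict[str, Callable[[object], str]],
              samples: List[Tuple[object, str]]) -> Dict[str, float]:
    """Return corpus-level WER per model: total word edits / total reference words.

    `models` maps a label to a hypothetical transcribe callable;
    `samples` is a list of (audio, reference_transcript) pairs from your domain.
    """
    results = {}
    for name, transcribe in models.items():
        edits = 0
        words = 0
        for audio, reference in samples:
            # Very naive normalization (lowercase, whitespace split); an assumption.
            ref = reference.lower().split()
            hyp = transcribe(audio).lower().split()
            edits += word_edit_distance(ref, hyp)
            words += len(ref)
        if words == 0:
            raise ValueError("no reference words to score against")
        results[name] = edits / words
    return results
```

You'd call it with something like `benchmark({"model_a": transcribe_a, "model_b": transcribe_b}, samples)` and compare the numbers. Text normalization (punctuation, numerals, casing) can move WER a lot, so apply the same normalization to every model's output.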