by yorwba
3237 days ago
If I understand correctly, "customizing the model" essentially adds new words to the vocabulary and adjusts the language model to change the probability of some phrases, but does not require any information about pronunciation, let alone audio samples. But isn't relying on the English text alone really error-prone, especially for terms of art and proper names that might even have roots in foreign languages? E.g. some people pronounce SQL as "sequel", and the English pronunciation of French words varies between "French pronunciation with an English accent" and "French orthography interpreted as English orthography". (I'm guessing your model would tend towards the latter?) So what I'm interested in is whether you have encountered examples of this during your testing, and whether you have some way to work around it (I would try phonemic transcriptions in addition to the English text); or whether this is not relevant for the use cases you are trying to cover, and the convenience of text-only input outweighs the resulting loss in accuracy.
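To make the phonemic-transcription idea concrete, here is a minimal sketch in Python of what I have in mind. The `LEXICON` dict and the `pronunciations` helper are hypothetical names made up for illustration, not part of any actual API, and the transcriptions use ARPAbet symbols as an assumed notation.

```python
# Hypothetical pronunciation lexicon: each custom term maps to one or more
# phonemic transcriptions (ARPAbet symbols, assumed here for illustration).
LEXICON = {
    "SQL": [
        ["S", "IY1", "K", "W", "AH0", "L"],         # spoken as "sequel"
        ["EH1", "S", "K", "Y", "UW1", "EH1", "L"],  # spelled out: "S-Q-L"
    ],
}


def pronunciations(term, lexicon=LEXICON):
    """Return every known transcription of `term` as a space-separated string.

    Unknown terms yield an empty list, which is where a text-only system
    would have to fall back on guessing from the spelling.
    """
    return [" ".join(phonemes) for phonemes in lexicon.get(term, [])]


if __name__ == "__main__":
    for variant in pronunciations("SQL"):
        print(variant)
```

Listing both variants for one written form lets the recognizer accept either pronunciation, instead of committing to whatever it would infer from the spelling.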