| HN Mirror


	by moron4hire 1587 days ago
	What are you talking about? Microsoft's Text-To-Speech APIs are the best on the market. Google's are definitely a distant second: not as many languages, not as many voices, and the output is nowhere near as good. After those two, there isn't really anything left worth mentioning.

2 comments

simion314 1587 days ago

I'm talking about the fact that they have one set of documentation, but if you look deeper there are 2 products. They say there are 2 options, the long and short APIs, but look closer and you see that one uses names like "Name" and "Voice" while the other uses "name" and "voice". The lists of voices for these 2 options are not the same, and you randomly get weird errors with shit messages that resolve themselves within a few days. If my memory is correct, you also authenticate in 2 different ways.

So I would prefer MS do this:

1. Say it plainly: these are our 2 completely different and incompatible APIs. They might look similar but are not the same, and output can differ even if you send the same params to each one.

2. Give me good error messages: if a request fails because of a fault on your side, make that clear; if my input is the problem, make it clear that it is me and what is wrong.
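
To illustrate point 2, here is a minimal sketch of the kind of error reporting meant here. It relies only on the standard HTTP convention that 4xx statuses indicate a problem with the request and 5xx statuses indicate a problem on the server. The names `Fault`, `classify_status`, and `describe_failure` are made up for this example and are not part of any Microsoft API.

```python
from enum import Enum


class Fault(Enum):
    """Who is responsible for a failed request (hypothetical helper type)."""
    NONE = "none"
    CALLER = "caller"
    SERVICE = "service"


def classify_status(status: int) -> Fault:
    """Map an HTTP status code to the side that caused the failure."""
    if 400 <= status < 500:
        return Fault.CALLER   # 4xx: the request itself was wrong
    if 500 <= status < 600:
        return Fault.SERVICE  # 5xx: the server failed to handle a valid request
    return Fault.NONE


def describe_failure(status: int, detail: str = "") -> str:
    """Build a message that states whose fault it is and, if known, why."""
    fault = classify_status(status)
    if fault is Fault.CALLER:
        msg = f"HTTP {status}: problem with your request"
    elif fault is Fault.SERVICE:
        msg = f"HTTP {status}: problem on the service side, retry later"
    else:
        return f"HTTP {status}: no error"
    return f"{msg} ({detail})" if detail else msg
```

For example, `describe_failure(400, "unknown voice")` names the caller's input as the problem, while `describe_failure(503)` says the service is at fault.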

moron4hire 1587 days ago

I mean, there's the old Windows-only SAPI from the Windows XP days that they haven't been developing for several years now, and the current Azure Cognitive Services, which is just a REST API with a pretty standard auth scheme. There's an official .NET package for wrapping that REST API, but it's certainly not necessary to use it if you know how to handle REST APIs. Is that what you're talking about?

simion314 1587 days ago

Yes, the Azure API. There are 2 different things under the hood, the short and long APIs, which have different names, different auth headers, different supported voices, and voices with the same name that support different styles. Bad error messages pop up and get fixed in a few days, but only on one of the APIs. The issue is that I am trying to combine the short and long APIs in one product and I am hitting these big inconsistencies. I can clearly see there are 2 teams that do things differently; if you use only one section you have a completely different experience.

Edit: I make the REST calls directly, not via an SDK, and I use MS's documentation for the REST API, so there is no SDK documentation or SDK code hiding the issues.
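
A rough sketch of the kind of adapter this forces you to write when combining the two APIs in one product. Everything here is a placeholder: the header names, the voice names, and which API uses capitalized keys are assumptions made up for illustration, not the real Azure values. It makes no network calls; it only shows how one set of inputs has to be reshaped per API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiProfile:
    """Per-API quirks (all values below are hypothetical placeholders)."""
    name: str
    auth_header: str       # header that carries the credential
    capitalize_keys: bool  # True -> "Voice"/"Text", False -> "voice"/"text"
    voices: frozenset      # voices this API accepts


# Placeholder profiles; the real header and voice names differ.
SHORT = ApiProfile("short", "X-Example-Key", False, frozenset({"voice-a", "voice-b"}))
LONG = ApiProfile("long", "X-Example-Token", True, frozenset({"voice-a"}))


def build_request(profile: ApiProfile, credential: str, voice: str, text: str):
    """Return (headers, payload) shaped for the given API profile."""
    if voice not in profile.voices:
        # Fail early with a message that says what is wrong and for which API.
        raise ValueError(f"voice {voice!r} is not supported by the {profile.name} API")
    fields = {"voice": voice, "text": text}
    if profile.capitalize_keys:
        fields = {key.capitalize(): value for key, value in fields.items()}
    headers = {profile.auth_header: credential}
    return headers, fields
```

Calling `build_request(SHORT, ...)` and `build_request(LONG, ...)` with the same voice and text yields differently named headers and payload keys, and a voice missing from one profile is rejected before any request is sent.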

causi 1587 days ago

> Microsoft's Text-To-Speech APIs are the best on the market.

Wow, I had no idea they were that good. Is there a way to get at them from a consumer level? For example, there are plenty of e-reader apps that use Google's TTS to read epub books as audiobooks. Anything similar for Microsoft or is it all on the developer side?

zidad 1587 days ago

You can just try it out here from the browser if you like:

https://azure.microsoft.com/en-us/services/cognitive-service...

causi 1587 days ago

That's a good demo, but it isn't much use for making audio content from ebooks.

hobo_mark 1587 days ago

Transcribing an entire ebook this way is going to be _expensive_.

AshamedCaptain 1587 days ago

They have basically been licensing L&H then Nuance until they bought it outright.
