Hacker News
by applesan 1049 days ago
Generally, if you are a beginner and actually want to learn the language, you should focus on input, not output.

A few problems I find with these kinds of apps:

How is that different from just using ChatGPT?

Why can't I just write my response instead of using the mic?

ChatGPT makes mistakes, and as a beginner you can't spot them. (I talked to it in Polish, my native language, and it made grammar mistakes.)

Speech recognition is not the same as a native speaker listening to you: speech recognition software may "understand" you when a native speaker would not, and vice versa.

ChatGPT can't correct your pronunciation.

Replies generated are stiff and unnatural.

TTS can't model speech accurately (it lacks emotions etc.)

1 comments

I personally like that it's using speech recognition.

First, chatting and speaking don't use nearly the same skills; training one does not necessarily train the other, and you can end up struggling to find the words you want on the spot.

Second, speech recognition, while not perfect, does help you make yourself understood by native speakers. It usually works best on what are considered the most neutral accents in the target language, which, as a foreign speaker, is exactly what you want. Seeing the recognition fail is a clue that you might need to practice those words again.

> TTS can't model speech accurately (it lacks emotions etc.)

I do agree on this last part though, and TTS usually lacks support for other accents.

I agree that chatting and speaking require different skill sets. However, I would argue that this is even more of a reason not to use speech recognition here (or at least not to force it), because ChatGPT is chatting while the learner is speaking. Transcription will always lose some information (for example, your tone can indicate sarcasm, but ChatGPT can't detect it).

To the second point: Whisper can be helpful, but how can you know whether it fails because of you or because of the software's error? I spoke in my native language with a traditional accent and it still made mistakes; it also hallucinates. Additionally, being understood by Whisper doesn't mean a native speaker will understand you.

I do agree that the text generated by ChatGPT isn't really "natural": it's more verbose and text-oriented, it includes things a native speaker would not necessarily say, and it won't understand speech nuances. It's clearly not perfect, and I'm really not claiming that it is.

It's a good tool that I'm going to use a few times per day though; there's really no substitute for speaking to get better at it. I'm also using other methods and tools, and this would be a minor addition to my learning schedule.

> I spoke in my native language with a traditional accent and it still made mistakes; it also hallucinates.

I'm actually in this scenario too: I'm a native French speaker and I cannot make myself understood by Google or Siri at all, because my accent is way too strong and far outside the training voices they used.

It's kind of a paradox, but in my opinion it's less of a problem for non-native speakers, who are trying to pick up the most common accents they can in order to be broadly understood.

That's understandable. These tools definitely can be helpful, but learners should know their limitations and problems.

Also, I just think speaking to an actual native speaker is still much better practice, especially given TTS quality. It even pronounces words incorrectly in Japanese (wrong pitch accent).

Oh yeah, sure, there's no doubt about that. It's just that these ChatGPT tools have two massive advantages over native speakers: they are available at any time and in any timezone, day or night (even for just 2 minutes), and they never get bored, so you can ask the most mundane questions over and over again to practice.