by SamBam 403 days ago
Like others recently, I've been extremely impressed by LLMs' ability to play GeoGuessr, or, more generally, to geo-locate random snapshots that you give them, with what seem (to me) to be almost no context clues. (I gave ChatGPT loads of holiday snapshots, screenshotted to remove metadata, and it did amazingly.)

I assume that, with enough training, we could get similarly accurate guesses of a person's linguistic history from their voice data.

Obviously it would be extremely tricky for lots of people. For instance, many people think I sound English or Irish. I grew up in France to American parents who both went to Oxford and spent 15 years in England. I wouldn't be surprised, though, if a well-trained model could do much better on my accent than "you sound kinda Irish."

4 comments

We actually did something like this for non-native English speakers a few months back. Check out https://accentoracle.com (most mind-blowing if you're a non-native English speaker).
Well, it says I'm Finnish. But now I have a new game, where I put on my best Italian or Russian or Greek or Australian accent and try to see how close I am.

I'm terrible, according to the program. My Italian is Russian or Hungarian or Swedish, my Australian is English.

New party game unlocked.

Amazing! If you can make it go viral again too, I will love you!
I've been building that exact game

accentgame.xyz

Fun. I have a strongly modulated North American midwestern accent so unsurprisingly it had me read several paragraphs before only being able to say with any certainty that my accent was 83% English with the rest being Spanish/Russian. It couldn't detect the country of origin.
Agreed, pretty meh. Tried my usual accent (the one where natives mostly can't tell where I'm from) and got 78%. Then went full cartoon Russian 'bad neighborhood' mode and somehow scored 68%.
Interestingly enough, it thought I was Russian, even though my native language is French. It was tied at 32% with French though.

Edit: Tried it a few times and also got English as an accent. Pretty fun application!

I would love to be able to explore combinations of X spoken language with Y accent, like for example I've always been curious how French sounds spoken with an Indian accent.
It detects my non-native English accent correctly, but then it makes the mistaken assumption that I want to sound like an American of all things?
I'm 42% Arabic apparently! And 20% Russian. Got an 81% American accent level. I guess it is tuned to non-native-English speaker accents.
Was that right? Or what is the correct native language it should have predicted? Note that the percentages in the accent breakdown section are prediction probabilities.
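To illustrate what "prediction probabilities" means here: a classifier typically produces one raw score per candidate language and normalizes the scores so they sum to 100%. The sketch below uses a plain softmax for that step. The scores and the label set are made up for illustration, and softmax itself is an assumption about how such a model might normalize its output, not a description of how the site works.

```python
import math

def softmax(scores):
    """Turn raw per-class scores into probabilities that sum to 1."""
    peak = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - peak) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

# Hypothetical raw scores for one recording; NOT output from the real site.
raw_scores = {"Arabic": 2.1, "Russian": 1.4, "Spanish": 0.9, "French": 0.3}

probs = softmax(raw_scores)
for label, p in sorted(probs.items(), key=lambda kv: -kv[1]):
    print(f"{label}: {p:.0%}")
```

Read this way, a figure like "42% Arabic" would be the model's confidence in that label, not a claim that the accent is 42% Arabic.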
Native English speaker with a rolled-R accent, so I can see why it picked Arabic/Russian.
Swiss-German accent doesn't seem to be on the list, so it guessed mostly Swedish.
Wow that was actually accurate
Yes, although I believe this is a speaker embedding model here, so not LLM related.

This kind of speech clustering has been possible for years - the exciting point with their model here is how it's highly focused on accents alone. Here's a video of mine from 2020 that demonstrated this kind of voice clustering in the Mozilla TTS repo (sadly the code got broken + dropped after a refactoring). Bokeh made it possible to directly click on points in a cluster and have them play.

https://youtu.be/KW3oO7JVa7Q?si=1w-4pU5488WxYL3l

note: take care when listening as the audio level varies a bit (sorry!)
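For anyone wondering how an embedding model could be used for this, here is a minimal sketch. It assumes each recording has already been mapped to a fixed-length vector, then picks the accent whose average vector (centroid) is most similar by cosine similarity. This is a nearest-centroid lookup, a simplified stand-in for the clustering described above. The 3-dimensional vectors, the labels, and the helper names are all made up; real speaker embeddings have hundreds of dimensions and come from a trained model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def centroid(vectors):
    """Component-wise mean of a list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

# Hypothetical toy "embeddings" grouped by accent label (illustrative only).
labelled = {
    "French": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Russian": [[0.1, 0.9, 0.2], [0.2, 0.8, 0.1]],
    "Swedish": [[0.1, 0.2, 0.9], [0.0, 0.1, 0.8]],
}
centroids = {label: centroid(vectors) for label, vectors in labelled.items()}

def nearest_accent(embedding):
    """Return the label whose centroid is most similar to the embedding."""
    return max(centroids, key=lambda label: cosine(embedding, centroids[label]))

print(nearest_accent([0.7, 0.3, 0.1]))
```

An accent missing from the labelled set can only ever map to its nearest neighbor among the labels that are present.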

Correct, not LLM
I bet you are right.

I had a forensic linguistics TA during college who was able to identify the island in southeast Asia one of the students grew up on, and where they moved to in the UK as a teenager before coming to the US (if I am remembering this story right).

From what I gather, there are a lot of clues in how we speak that most brains edit out when parsing language.

Or the classic scene in Mrs Doubtfire where Pierce Brosnan attempts to locate the origin of Robin Williams’s fake English accent.
I’ve seen some online quizzes based on regional variations in accent (does root rhyme with foot or boot?) and vocabulary (what do you call a sweet fizzy beverage?) that did a great job of locating where my Facebook friends grew up, back in the day. It was a bit off for me, largely because while I grew up in Chicago, I have spent most of my adult life in Los Angeles, so I tend to prefer “freeway” to “expressway” (changing that answer moved me from Rockford to Chicago).