Hacker News
by nequo 929 days ago
It’s being done right now:

https://blog.padi.com/talk-to-whales-with-ai/

One difficulty is that there isn’t nearly as much whale chatter available for training data as there is human chatter.

And one important question is, why is this useful if we don’t know what the LLM says to them? But the post above touches on that too.

2 comments

>And one important question is, why is this useful if we don’t know what the LLM says to them?

There's no reason to assume we couldn't know.

It's not like there are examples in the training set for every language-to-language combination that modern models are capable of translating.

Won't they need some documents that combine whale speech noises with human words to bridge the gap? Otherwise they are making comments that are just word-like or sound-like fragments.

I don't think this is strictly needed. An English dictionary may seem pointless because it defines every word using only other English words. But the meaning is contained in the _relationship_ between the words.

I'm sure you've seen the example of word vectors that capture some of this meaning: king - man + woman = queen

In Spanish, rey - hombre + mujer = reina
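
To make the arithmetic concrete, here's a minimal sketch with hand-made 2-D vectors. The numbers are my own assumption, picked so the analogy works, and don't come from any trained model. Real embeddings have hundreds of dimensions and the result is only approximately "queen".

```python
import math

# Toy vectors, assumed for illustration only (not from a trained model).
# Axis 0 loosely encodes "royalty", axis 1 loosely encodes "gender".
vectors = {
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "bread": [-1.0, 0.3],
}

def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def analogy(a, b, c, vocab):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding the inputs."""
    target = [x - y + z for x, y, z in zip(vocab[a], vocab[b], vocab[c])]
    candidates = [w for w in vocab if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(vocab[w], target))

print(analogy("king", "man", "woman", vectors))  # prints "queen"
```

Excluding the three input words from the candidates is the usual convention for this kind of demo. Without it, the nearest vector is often one of the inputs.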

The _relationship_ between "king" and "queen" in English may look close enough to the _relationship_ between "rey" and "reina" in Spanish, allowing you to bridge the gap between the two languages, even if they are entirely disconnected and you've never seen a direct translation between them.
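
Here's a rough sketch of that idea in its most idealized form. I'm assuming the "Spanish" space is just the English space rotated by an arbitrary angle, which real embeddings are not; the vectors, the angle, and the helper names are all made up for illustration. Each word is matched using only its similarities to the other words in its own language, so no translation pair is ever used for the matching itself.

```python
import math

# Toy English vectors, assumed for illustration only.
english = {
    "king":  [1.0, 1.0],
    "queen": [1.0, -1.0],
    "man":   [0.0, 1.0],
    "woman": [0.0, -1.0],
    "bread": [-1.0, 0.3],
}

def rotate(v, theta):
    """Rotate a 2-D vector by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1], s * v[0] + c * v[1]]

# Hypothetical Spanish space: same geometry, different coordinates.
# `truth` is used only to build the toy data and to check the result.
truth = {"king": "rey", "queen": "reina", "man": "hombre",
         "woman": "mujer", "bread": "pan"}
spanish = {truth[w]: rotate(v, 1.1) for w, v in english.items()}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def profile(word, vocab):
    """Sorted similarities from `word` to every other word in its own language."""
    return sorted(cosine(vocab[word], v) for w, v in vocab.items() if w != word)

def match(word, source, target):
    """Pick the target word whose similarity profile is closest to `word`'s."""
    src = profile(word, source)
    def distance(t):
        return sum((a - b) ** 2 for a, b in zip(src, profile(t, target)))
    return min(target, key=distance)

for w in english:
    print(w, "->", match(w, english, spanish))
```

This only works because a rotation preserves every cosine similarity, and because I added "bread" to break the symmetry of the toy data. With just the four king/queen/man/woman vectors, "king" and "queen" have identical profiles and can't be told apart. Real languages are noisier than a pure rotation, so treat this as a picture of the intuition, not a method.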

If you had enough recordings, you could (I think) build weights based _solely_ on whale speech. Humans wouldn't be able to understand the weights, and the word vectors in that model wouldn't match the word vectors in an English model, but I suppose there's a chance that the vectors might be similar? I don't know. I think you'd have to be very good at both linguistics and AI to know.