by matroid
670 days ago
Thanks for the feedback. You raise great points, and this is why we wrote this post: so that we can hear from people where the actual problem lies. On a related note, this sort of explains why our model is struggling to fit on 500 hours of our current dataset (even on the training set). Even so, the current state of automatic translation for Indian Sign Language is that, in the wild, even individual words cannot be detected very well. We hope that what we are building might at least improve the state of the art there.
> It's more of a bad and broken transliteration that if you struggle to think about you can parse out and understand.
Can you elaborate a bit more on this? Do you think that if we make a system for bad/broken transliteration and funnel its output through ChatGPT, it might give meaningful results? That is, ChatGPT might be able to correct the errors, as it is a strong language model.
|
No, because ChatGPT has practically no way of knowing what a real sign language looks like, since there's no real written form of any sign language and ChatGPT learned its languages from writing.
Sincerely: I think it's awesome that you're taking something like this on, and even better that you're open to learning about it and correcting flawed assumptions. Others have already noted some holes in your understanding of sign, so I'll just add that I think a solid brush-up on the fundamentals of what language models are and aren't is called for. They're not linguistic fairy dust you can sprinkle on a language problem to make it fly. They're statistical machines that can predict likely results based on their training corpus, and that corpus is more or less all the text on the internet.
I'm afraid I'm not in a good position to recommend beginner resources (I learned this stuff in university back before it really took off), but I've heard good things about Andrej Karpathy's YouTube channel.