by ttmb 899 days ago
Wasn't one of their initial monetization ideas that crowdsourced translations - from amateurs, not even paid contractors - would outperform AI translation? The translation thing never came close to panning out, from what I can tell.

Back in the day you used to be able to see a mini discussion thread about any exercise. The comments in these threads - from unpaid users - were frequently much more helpful than the official teaching notes. But they got rid of those a while back, too. At the time people suggested it was because they wanted to push people into some sort of paid tutoring offering that they'd announced. But I don't see any hint of that around. Now I'd guess it's because the questions are no longer hand crafted by people, and are now autogenerated by AI, leading to an inability to have discussions about specific questions.

4 comments

This seems to be a scaling issue for all of these online language/SRS learning systems. Every single one of them scales and then starts dialing back/deleting/hiding all of the user-generated content that helped the platform get to where it is today.

Almost a decade ago "smart.fm" was a thing, and the best thing about it was all of the user-generated content (blogs) offering discussions and explanations for so many languages, from explanations of grammatical concepts to people asking questions and getting answers from other users in a public Q&A-style format. Then smart.fm got rid of all of that, became "iknow.jp", and deleted everyone's blogs and the hard work that had built the smart.fm community (many of whose members then moved to Memrise).

Memrise was originally about using user-submitted mnemonics and it had a vast library of them being created. Nowadays, as far as I can tell, none of the mnemonics are around anymore and they're no longer the focus. The focus is on monetization and gamification of their Pro user stats (of which I have lifetime membership until the year 9999 due to my help/work during their Beta testing years).

Because mnemonics were the focus, words across all of their courses had to be combined so that the mnemonics would carry over between courses. A number of (unpaid) volunteers and I spent months combining all of the words for every popularly used language (I helped with the Japanese dictionary) across all courses that existed at the time.

Hosting (and moderating) user-generated content is an issue at scale. At the start when you have mostly good faith actors and few trolls it works quite well. But after a certain scale moderation becomes a massive issue.

At least I got a cool staff-only T-shirt out of it. Ben is an awesome dude and when I donated to the Memrise bus tour GoFundMe I asked if I could have one of their staff shirts that I knew they had - and they actually sent me one instead of one of the bus tour shirts!

The better funded of these sites seem to last a bit longer/scale a bit larger but it seems the death of user-generated content is inevitable after a certain point.

The founder's academic work is literally on using games to get people to generate training data for AIs. I watched his lecture on this in ~2007.
Have a link?
Maybe not exactly what was being referred to, but one of Duolingo's founders, Luis von Ahn, is one of the inventors of CAPTCHA:

https://link.springer.com/chapter/10.1007/3-540-39200-9_18

https://kilthub.cmu.edu/articles/CAPTCHA_Using_Hard_AI_Probl... (direct PDF download)

https://en.wikipedia.org/wiki/CAPTCHA

RE: the computer game thing, https://dl.acm.org/doi/abs/10.1145/985692.985733

Wait, CAPTCHAs don't generate or capture any data. They block automated systems, which were a huge pain at the time. Google then came around and released reCAPTCHA, which originally used books for Google Translate and later Street View photos for Google Maps. That was when data collection and learning were introduced.
reCAPTCHA is his baby.
Oh, they've removed those mini threads? That's a shame, those used to be super helpful.
Yeah, you used to be able to dive into really interesting/useful discussions (pretty much a forum) about nuances of certain phrases/words, usually involving people who grew up speaking the respective language. It was really helpful to get that extra detail and context on weird little quirks that may not have been obvious from the content built into Duolingo itself.
It was fairly recent, maybe in the last 3-4 months. I know for much of 2023 I would get the mini threads for some (not all) questions, but now I don't get them for any question. I agree it's a shame, because they really were the most educational thing on the app.
Those threads were great