Hacker News
by ecshafer 520 days ago
They would be buying one of the largest collections of user-generated data in the world. Sounds good to train on.
3 comments

The entire point is that Perplexity is several orders of magnitude smaller than ByteDance and just making the offer is a sign of immaturity.
I think Perplexity just wants to stay in the news cycle and get more than its fair share of discussion.

This is obviously not going to happen, and it's absurd anyone's really engaging with it as a serious proposition. But because they have a news cycle behind it, a lot more people are talking about Perplexity than Claude etc.

Lots of training data for:
- Lip syncing
- Interpretive dance
- Hover text
Electrical/framing/machining/woodworking/physics/tumbling/rockclimbing/instruments/etc/etc/etc.

Honestly it would be a goldmine of training data.

Is there anything of value in TikTok videos?
If you're using the term "value" to refer to monetary value, yes. If you're using it to refer to some other kind of value, it probably isn't relevant to the viability of the proposed merge.
I guess the PP question was like "what real-world problems could an AI trained on TikTok videos solve?". Sociology research, maybe.
The training data is an absurd moat for a mega social network. That's why you save everything forever: the machine learning architecture to exploit it hasn't even been invented yet, and you have longitudinal dimensions all but impossible to get otherwise.

Voice, video, sentiment, ads stuff. With today’s technology you know if a user is about to get cheated on before they do.

The same as an AI trained on YouTube. Extremely valuable.
Electricians showing bench demonstrations of what symptoms faulty neutrals cause, machinists doing step-by-step demonstrations of making tooling, including their fuckups, scrapyards doing deep dives into the in-practice economics of when and why they buy and sell, lawyers giving explanations of the laws involved in popular viral accident videos, baking recipes, comedians showing their material, musicians demonstrating techniques, etc.