	by gcmac 2110 days ago
	I’d say it’s more likely they have super advanced/clever ways of doing the latter. The algorithm could be a simple dot product and the result could be great or terrible depending on how good the feature extraction is. Pulling useful features out of videos is no small task. The fact that everyone raves about how good the recommendations are indicates to me that this is where their innovation lies.
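
To make the "simple dot product" point concrete, here is a minimal sketch of ranking by dot product over already-extracted features. The feature dimensions, video ids, and numbers are invented for illustration and say nothing about what TikTok actually does; the quality of the ranking depends entirely on how good the vectors are.

```python
# Hypothetical example: all vectors and names below are made up for illustration.

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def rank(user_vec, videos):
    """Return video ids sorted by dot-product score, highest first."""
    return sorted(videos, key=lambda vid: dot(user_vec, videos[vid]), reverse=True)

# Assumed feature axes: (cooking, dance, sports)
user = [0.9, 0.1, 0.0]
videos = {
    "pasta": [1.0, 0.0, 0.0],
    "shuffle": [0.0, 1.0, 0.0],
    "dunk": [0.0, 0.0, 1.0],
}

ranking = rank(user, videos)
print(ranking)
```

The scoring step is trivial; the hard part the comment points at is producing meaningful vectors like `videos["pasta"]` from raw video in the first place.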

xnx 2110 days ago

There's so much good metadata (likes, comments, duration, sound used, views, like/view ratio, skips, loops, subscribes, etc.) that I'd be surprised if they were digging into the contents of the video at all right now.
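
As a rough sketch of how such metadata could be turned into features without touching the video itself, the snippet below derives a few per-view rates from raw counters. The field names and the choice of ratios are assumptions for illustration, not a known schema.

```python
# Hypothetical counters for one video; field names are illustrative only.

def metadata_features(stats):
    """Turn raw engagement counters into per-view rates."""
    views = max(stats["views"], 1)  # guard against division by zero
    return {
        "like_view_ratio": stats["likes"] / views,
        "comment_rate": stats["comments"] / views,
        "skip_rate": stats["skips"] / views,
        "loop_rate": stats["loops"] / views,
    }

stats = {"views": 2000, "likes": 300, "comments": 40, "skips": 500, "loops": 100}
features = metadata_features(stats)
print(features)
```

Rates like these could feed directly into a scoring function such as a dot product with a user vector.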

dannyw 2110 days ago

Bytedance has thousands of the smartest data scientists in China.

rvnx 2110 days ago

Bytedance has thousands of Manual Labor specialists as well.

Using ML it is very easy to tag videos.

jakear 2110 days ago

They could also be digging only into audio, doing speech recognition on it, then clustering the text. Augment that with the text users have put into the video directly using the in-app editor and you have some pretty solid data.
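
A toy sketch of the "cluster the text" step, assuming the transcripts already exist (the speech recognition itself is out of scope here). It uses word-set Jaccard similarity and a greedy single pass; the threshold and sample transcripts are made up, and a real system would likely use something far more robust.

```python
# Hypothetical transcripts; the clustering method is a deliberately simple stand-in.

def tokens(text):
    """Lowercased set of whitespace-separated words."""
    return set(text.lower().split())

def jaccard(a, b):
    """Size of intersection over size of union for two sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster(transcripts, threshold=0.3):
    """Greedy pass: join the first cluster whose seed text is similar enough."""
    clusters = []  # list of (seed_tokens, [video ids])
    for vid, text in transcripts.items():
        toks = tokens(text)
        for seed, members in clusters:
            if jaccard(seed, toks) >= threshold:
                members.append(vid)
                break
        else:
            clusters.append((toks, [vid]))
    return [members for _, members in clusters]

transcripts = {
    "a": "how to cook pasta at home",
    "b": "cook pasta at home fast",
    "c": "best dance moves this year",
}
groups = cluster(transcripts)
print(groups)
```

Text typed into the in-app editor could be appended to each transcript before tokenizing.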

ramimac 2110 days ago

If that were true, it'd be interesting to see if they push out support for closed captioning. It's an accessibility push, but it would also leverage a lot of the same capabilities...

novok 2110 days ago

I would also start doing image recognition in the video frames, to extract things like gender, objects, etc.

thekyle 2110 days ago

Would this have any advantage over just using video embeddings (or a sequence of frame embeddings), which in theory should capture those things in vectorized form?
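
One simple way to get from a sequence of frame embeddings to a single video embedding is mean pooling, sketched below. The frame vectors are placeholders; in practice they would come from some image model, which is assumed here and not shown.

```python
# Hypothetical frame embeddings; a real pipeline would get these from an image model.

def mean_pool(frame_embeddings):
    """Average per-frame vectors into one video-level vector."""
    if not frame_embeddings:
        raise ValueError("need at least one frame embedding")
    n = len(frame_embeddings)
    return [sum(column) / n for column in zip(*frame_embeddings)]

frames = [
    [1.0, 0.0],
    [0.0, 1.0],
    [1.0, 1.0],
    [0.0, 0.0],
]
video_vec = mean_pool(frames)
print(video_vec)
```

Mean pooling discards frame order, so it would miss anything that depends on how the video unfolds over time.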

wombatmobile 2110 days ago

> I'd be surprised if they were digging into the contents of the video at all right now.

Why would you be surprised to learn TikTok is doing video content analysis?

blueblisters 2110 days ago

It can be a) very expensive and b) very difficult to implement.

Video understanding is an active field of research, and I'm not sure the state of the art is there yet for capturing nuances like engagement potential, categories, etc.

wombatmobile 2109 days ago

State of the art where? College? Silicon Valley? Bangalore? Shanghai? Beijing? Hangzhou?

blueblisters 2109 days ago

State of the art in academia, which is largely location agnostic.

xnx 2110 days ago

Google was able to build a very useful search engine that ran for decades relying on the significance of links and keywords, without much understanding of the meaning of page content. You can get very far with the readily available data, before you need to delve into the fancy stuff to make it a few percent better.

btilly 2110 days ago

They claim to be looking at the music in the video and avoiding sending you to another video with the same music.

Drew_ 2110 days ago

That would be the "sound used". The music in the video is specified/labeled before upload so there's no need to actually process the sound of the video.
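
If the sound is just a label attached at upload, avoiding back-to-back repeats needs no audio processing at all. The sketch below reorders a ranked list so adjacent videos differ in sound id where possible; the ids and the greedy strategy are assumptions for illustration.

```python
# Hypothetical ranked feed and sound labels; the reordering rule is illustrative.

def avoid_repeat_sound(ranked, sound_of):
    """Greedily pick the best-ranked video whose sound differs from the previous one."""
    remaining = list(ranked)
    feed = []
    while remaining:
        last = sound_of[feed[-1]] if feed else None
        # Fall back to the top remaining video if every option repeats the sound.
        pick = next((v for v in remaining if sound_of[v] != last), remaining[0])
        feed.append(pick)
        remaining.remove(pick)
    return feed

ranked = ["v1", "v2", "v3", "v4"]
sound_of = {"v1": "s1", "v2": "s1", "v3": "s2", "v4": "s1"}
feed = avoid_repeat_sound(ranked, sound_of)
print(feed)
```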

srean 2110 days ago

Almost all of those apply to YouTube, do they not?

htrp 2110 days ago

IIRC, YouTube videos are too long to do any useful feature extraction from them.

srean 2110 days ago

The comment I was responding to mentioned a lot of metadata around videos; that is what I was addressing.
