Ask HN: TikTok scraping – maximize signal when only 5% of content is useful?
2 points
by alliewithane
100 days ago
Hey everyone. I'm working on a machine learning project that needs a lot of TikTok video data related to ads and behavior. Once I have the video IDs, I plan to transcribe and analyze the individual videos. The problem I'm facing: I estimate that only about 5% of the data fetched by tools like ensembledata is useful, given the degrees of freedom they allow (individual hashtag / keyword search). I understand this is possibly a confounding problem. Has anyone here worked on something similar? How did you approach it? Did you use iterative or stratified sampling? Thank you!
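
For concreteness, one way the iterative sampling idea could look is sketched below. It is a minimal sketch under stated assumptions, not a tested pipeline: it treats each hashtag or keyword as an arm and uses Thompson sampling (a technique swapped in here for illustration) to spend more of the fetch budget on queries whose results turn out to be useful. The function names, the hashtags, and the hit rates are all hypothetical, and `fetch_and_label` is a simulated stand-in for whatever scraper call plus relevance check is actually used; it is not the ensembledata API.

```python
import random


def allocate_fetches(queries, fetch_and_label, rounds, batch_size, seed=0):
    """Spend a fetch budget across queries with Thompson sampling.

    fetch_and_label(query, n) is a hypothetical callback that returns a list
    of n booleans, True where a fetched video was judged useful.
    """
    rng = random.Random(seed)
    # Beta(1, 1) prior per query, stored as [useful + 1, not_useful + 1].
    stats = {q: [1, 1] for q in queries}
    fetched = {q: 0 for q in queries}
    for _ in range(rounds):
        # Draw a plausible useful-rate for each query and fetch from the best draw.
        q = max(queries, key=lambda name: rng.betavariate(*stats[name]))
        labels = fetch_and_label(q, batch_size)
        hits = sum(labels)
        stats[q][0] += hits
        stats[q][1] += len(labels) - hits
        fetched[q] += len(labels)
    return stats, fetched


def make_simulated_source(true_rates, seed=1):
    """Simulated scraper: each query yields useful videos at a fixed made-up rate."""
    rng = random.Random(seed)

    def fetch_and_label(query, n):
        return [rng.random() < true_rates[query] for _ in range(n)]

    return fetch_and_label


# Made-up hashtags and hit rates, for illustration only.
true_rates = {"#ad": 0.40, "#sponsored": 0.10, "#fyp": 0.01, "#viral": 0.02}
source = make_simulated_source(true_rates)
stats, fetched = allocate_fetches(list(true_rates), source, rounds=200, batch_size=20)

for q in true_rates:
    useful = stats[q][0] - 1
    print(q, "fetched:", fetched[q], "useful:", useful)
```

In a real run the labels would come from a cheap relevance check (a caption keyword filter or a small classifier) on each fetched batch. Stratified sampling would instead fix the per-query quota up front; the sketch above adapts the quota as evidence comes in.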