| Hi Chony, thanks for asking. 1. What method did you use to get the summary out of all the subtitles? I measured how semantically similar the words in one sentence are to the words in another. If the words in two sentences are not similar enough, the sentences are split into two different chapters. To measure semantic similarity I used word2vec (something like BERT would be more accurate, but this is just a prototype). 2. How to get the subtitles of the video (Youtube API)? Subtitles are available in the HTML of the YouTube video page, so you can write a crawler to get them. The YouTube API might also be a way. 3. How to get the timestamp of the specific word in the subtitle?
I would really like to build something similar! Thanks a lot! Since the timestamps are sentence-level only, there is no exact way to get one for each word; you would need to approximate it. I didn't do that in my case. Hope the answers are helpful. Let me know if you have more questions! |
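
To make the chapter-splitting idea in answer 1 a bit more concrete, here is a minimal sketch. It is not the prototype's actual code: to keep it dependency-free it uses plain word-count vectors as a stand-in for word2vec, and the names `bow_vector`, `cosine`, `split_into_chapters` and the `threshold` value are all made up for illustration. With real word2vec you would replace `bow_vector` and `cosine` with embedding-based versions.

```python
import math
from collections import Counter


def bow_vector(sentence):
    # Assumption: word counts stand in for a word2vec-based sentence vector.
    return Counter(sentence.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_into_chapters(sentences, threshold=0.2):
    # Start a new chapter whenever a sentence is not similar enough
    # to the last sentence of the current chapter.
    chapters = []
    for sentence in sentences:
        if chapters and cosine(bow_vector(chapters[-1][-1]), bow_vector(sentence)) >= threshold:
            chapters[-1].append(sentence)
        else:
            chapters.append([sentence])
    return chapters
```

For example, `split_into_chapters(["the cat sat", "the cat ran", "stocks fell today"])` keeps the first two sentences together and puts the third in its own chapter. The threshold is arbitrary here and would need tuning on real subtitles.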
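
For question 3, one possible approximation (again, not something the prototype does) is to spread each sentence's time span over its words in proportion to their character lengths. The sketch below assumes you already have a sentence's text plus its start and end time in seconds; the function name `approximate_word_times` is hypothetical.

```python
def approximate_word_times(text, start, end):
    # Estimate when each word begins by assuming speaking time is
    # proportional to the number of characters spoken so far.
    words = text.split()
    total_chars = sum(len(w) for w in words)
    if total_chars == 0:
        return []
    times = []
    elapsed_chars = 0
    for word in words:
        offset = (end - start) * elapsed_chars / total_chars
        times.append((word, start + offset))
        elapsed_chars += len(word)
    return times
```

Calling `approximate_word_times("ab cd", 0.0, 4.0)` gives `[("ab", 0.0), ("cd", 2.0)]`. This ignores pauses and speaking rate, so treat the results as rough estimates only.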