by justEgan 2138 days ago

The syncing is done automatically, at least mostly.

TL;DR: ScreenplaySubs fetches the subtitles from Netflix, parses the PDF-formatted screenplays into JSON, and syncs by calculating the sentence similarities between subtitle and screenplay dialogue.

In particular, we use the Universal Sentence Encoder to decide whether a subtitle matches a line of screenplay dialogue. If a screenplay line is similar enough to a subtitle, the screenplay line is tagged with that subtitle's timestamp.
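
A minimal sketch of this threshold-matching step, not the project's actual code. To stay self-contained it swaps the Universal Sentence Encoder for a toy bag-of-words embedding (`toy_embed`); the function names, the `embed` parameter, and the 0.7 threshold are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def toy_embed(sentence):
    """Stand-in for a real sentence encoder: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def tag_dialogue(dialogue, subtitles, embed=toy_embed, threshold=0.7):
    """Tag each screenplay line with the start time of its best-matching subtitle.

    dialogue:  list of screenplay dialogue strings
    subtitles: list of (start_seconds, text) pairs
    Returns a list of (line, start_seconds or None); None means no subtitle
    reached the (hypothetical) similarity threshold.
    """
    sub_vectors = [(start, embed(text)) for start, text in subtitles]
    tagged = []
    for line in dialogue:
        vector = embed(line)
        best_start, best_score = None, 0.0
        for start, sub_vector in sub_vectors:
            score = cosine(vector, sub_vector)
            if score > best_score:
                best_start, best_score = start, score
        tagged.append((line, best_start if best_score >= threshold else None))
    return tagged


subtitles = [(12.5, "Where did you put the money?"), (20.0, "It is under the floor.")]
dialogue = ["Where did you put the money", "Nobody mentioned this anywhere"]
result = tag_dialogue(dialogue, subtitles)
```

Passing a real encoder as `embed` would only require that `cosine` be adapted to dense vectors; lines that come back with `None` are the ones that would need manual attention.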

Many of the underlying problems at each step sound deceptively simple at first but turn out to be quite challenging and fun to research. E.g., parsing PDFs in general is not straightforward (https://filingdb.com/b/pdf-text-extraction), and there’s only a handful of resources on parsing PDF screenplays beyond a few research papers (https://github.com/drwiner/ScreenPy/blob/master/INT17_screen...), which led us to create our own open source repo for this (https://github.com/SMASH-CUT/screenplay-pdf-to-json).

Our screenplay-pdf-to-JSON converter captures all the dialogue, transitions, and actions within each screenplay scene. With this, we treat scenes as atomic, which lets us detect changes in scene ordering based on the tagged scene timestamps. It also means that if dialogue is reordered within a scene in the movie, there will be some syncing inconsistencies.
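
One way the scene-ordering check could look, as a sketch under stated assumptions rather than the extension's real logic: each scene is reduced to a single representative time (here the median of its tagged dialogue timestamps, an assumed choice), and sorting scenes by that time gives the order in which the film plays them. `scene_time` and `film_order` are hypothetical names.

```python
from statistics import median


def scene_time(timestamps):
    """Representative time for a scene: median of its tagged lines, or None."""
    known = [t for t in timestamps if t is not None]
    return median(known) if known else None


def film_order(scenes):
    """Return screenplay scene indices sorted by when they appear in the film.

    scenes: list (in screenplay order) of per-scene timestamp lists, where
            None marks a dialogue line that could not be matched.
    Scenes with no tagged lines at all are left out of the result.
    """
    timed = [(scene_time(ts), index) for index, ts in enumerate(scenes)]
    return [index for time, index in sorted(t for t in timed if t[0] is not None)]


scenes = [
    [10.0, 12.0],          # scene 0
    [300.0, None, 310.0],  # scene 1, one line unmatched
    [100.0, 105.0],        # scene 2, plays before scene 1 in the film
    [None],                # scene 3, no dialogue matched
]
order = film_order(scenes)  # [0, 2, 1]
```

Any index that is out of ascending order in the result marks a scene the film moved relative to the screenplay; scenes missing from the result have no timestamps to go on.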

Some scenes have little to no dialogue, so the extension can only work on a best-effort basis there. E.g., the opening scene of There Will Be Blood has minimal dialogue, if any. In cases like this I need to jump in and sync the screenplay manually. OTOH, the opening scene of Inglourious Basterds works very well, since it has tons of dialogue. This is why I can’t just add movies and instantly upload them to the site.

Would you be interested in more details? I was thinking of writing a series of technical blog posts if there is enough interest!

5 comments

abathur 2138 days ago

Interesting work. Glad you've been able to chart a path through some tedious problems.

Over the last several years I've imagined a lot of projects (both serious utilities, and the absurd/artistic) in roughly the territory you're exploring...

- For my MFA thesis (2012) I used plaintext (thankfully, though they had plenty of their own problems) transcripts of a TV show as a corpus for generating poems from, and at the time I thought it would be an interesting follow-up project to turn them back into video clips.

- Mapping film quotes/citations back to the script/film and accuracy-checking movie quotes. (can imagine both of these being useful for film forums like the movies/sci-fi stack-exchange sites).

- Generating script-cuts of movies that re-order/drop scenes and just show the printed script on-screen where scenes were cut.

- A film-analysis/screenwriting-class sort of interface oriented around reading a segment and then playing it (could be particularly interesting when there happen to be multiple known script drafts?)

- Re-constructing a character monologue from lines spoken by an actor that turned down the role.

- Generating a super-cut of actor X saying Y.

- Generating focused cuts of a film that cover, say, every scene a given character does/doesn't appear in, or every scene that mentions X.

walterbell 2138 days ago

Please blog about the details! Are you following the W3C work on synchronized multimedia?

https://github.com/w3c/sync-media-pub

https://www.w3.org/community/sync-media-pub/

justEgan 2138 days ago

Will do! I am not aware of that, tell me more!

AriaMinaei 2138 days ago

I'd definitely be interested to read more about the tech. I wonder if it can be used to time-sync audiobooks to their ebook counterparts.

This is my use-case:

Kindle has a feature called "Audible Narration." You buy a Kindle book and the matching Audible audiobook, and Kindle plays the audio while highlighting the words in the book as you listen. This effortless switching between audio and text enables some interesting reading behavior. Certain books become easier to read. Note taking also gets much easier (highlighting text is much easier than bookmarking timestamps in an audiobook).

The problem is, getting your annotations and highlights and other data out of Kindle is very difficult, because Kindle does not have a public API. Same with Audible.

So I'm thinking of emulating Audible narration with a hybrid ebook/audiobook reader app. The ebook would be a simple HTML page (converted from epub, formatting be damned) and a simple audio player. As the audio plays, the HTML page would scroll and words would be highlighted.

The challenge is to timestamp-tag the HTML against the audio track. I'd guess I could run speech-to-text on the audio track and then somehow diff the generated text against the epub content. Given that some audiobooks are abridged, some read the footnotes at each mention, and some describe the visuals, I would assume the diffing would not be very straightforward.

Do you know of any solutions I could look into?
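
The diffing idea above can be sketched with Python's standard `difflib`, assuming a speech recognizer has already produced a transcript with per-word start times. `align_words` is a hypothetical helper, and this is a plain sequence diff, not a full alignment system.

```python
from difflib import SequenceMatcher


def align_words(book_words, transcript):
    """Give each book word the start time of the spoken word it lines up with.

    book_words: list of words from the ebook text
    transcript: list of (word, start_seconds) pairs from speech recognition
    Returns a list the same length as book_words; entries are None where the
    diff found no matching spoken word (e.g. abridged passages).
    """
    written = [word.lower() for word in book_words]
    spoken = [word.lower() for word, _ in transcript]
    times = [None] * len(book_words)
    # autojunk=False keeps frequent words ("the", "and") from being ignored.
    matcher = SequenceMatcher(a=written, b=spoken, autojunk=False)
    for block in matcher.get_matching_blocks():
        for offset in range(block.size):
            times[block.a + offset] = transcript[block.b + offset][1]
    return times


book = ["The", "quick", "brown", "fox", "jumps"]
heard = [("the", 0.0), ("quick", 0.4), ("fox", 0.9), ("jumps", 1.3)]  # "brown" was skipped
times = align_words(book, heard)  # [0.0, 0.4, None, 0.9, 1.3]
```

Unmatched words stay `None`, so a reader app could interpolate between neighbors or simply not highlight them. Recognition errors and narrator asides would show up as more gaps, which is where this simple diff stops being enough.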

davidzweig 2138 days ago

The task is called 'forced alignment'; take a look at aeneas and other projects at https://www.readbeyond.it/ :) IIRC, Aeneas has some features for handling extra text at the beginning/end of the book, while abridgement etc. isn't handled.

AriaMinaei 2138 days ago

You just saved me weeks of work. Thank you :)

smnrchrds 2138 days ago

> ScreenplaySubs fetches the subtitles from Netflix

How is this done? Isn't everything on Netflix protected by DRM?

justEgan 2138 days ago

You can fetch them by recording the network requests, as explained in this repo: https://github.com/isaacbernat/netflix-to-srt
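
Once the subtitles are in SRT form (the target format named by that repo), reading them into timed cues might look roughly like this. It is a sketch that assumes well-formed SRT input; `parse_srt` and `to_seconds` are hypothetical helpers, not part of the linked repo.

```python
import re

# Matches an SRT timing line such as "00:01:02,250 --> 00:01:04,000".
CUE_TIME = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(text):
    """Parse SRT text into a list of (start_seconds, end_seconds, cue_text)."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        for i, line in enumerate(lines):
            match = CUE_TIME.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is cue text; join wrapped lines.
                cue_text = " ".join(lines[i + 1:]).strip()
                cues.append((to_seconds(*fields[:4]), to_seconds(*fields[4:]), cue_text))
                break
    return cues


SAMPLE = """1
00:00:01,500 --> 00:00:03,000
Hello there.

2
00:01:02,250 --> 00:01:04,000
Second line,
wrapped.
"""
cues = parse_srt(SAMPLE)
```

Each cue's start time is what would be attached to a matching screenplay line.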

monadic2 2138 days ago

Is there any support for querying the actual timestamp of the video?
