Hi everyone,

My name’s Jim, and I created MovieChat.org as an archive of and replacement for IMDb’s message boards, which are shutting down this week.

For those of you not familiar, the IMDb message boards allowed you to discuss any single movie or TV show with others (there was a separate forum for each movie/show). IMDb recently announced they were shutting down the message boards, and its users were furious (there's a petition with close to 10k signatures here: https://www.ipetitions.com/petition/petition-to-keep-the-imd...). I set out to create an archive (of all the existing posts) and a replacement, and hence MovieChat.org was born.

Key features of MovieChat.org:

1. Any movie/show on IMDb is also on MovieChat.org (over 4 million and counting). We have separate boards for each movie/show, just like IMDb.
2. I backed up most of the posts for IMDb’s top 10,000 movies/shows. Most existing conversations on IMDb should also appear on MovieChat.org. We have over 3 million posts already (and I'm working non-stop to back up even more from IMDb)!

Please visit http://MovieChat.org, join or start a discussion, and let me know what you think. If you like it, please spread the word. If there’s anything I can improve, just email me (jim@moviechat.org) and I’ll get on it.

Jim
jim@moviechat.org
Just throwing it out there, but would you consider making a dump of the data you scraped that could be used by data scientists? Maybe as a torrent or something like that? Data about movies and what people say about them could form the basis of a lot of NLP projects.
What other big datasets are there for forum post text data? The Reddit dataset comes to mind first, and I've also seen a similar one for HN comments. Any others?