If anyone here is enrolled in Udacity's CS101 class, this actually seems like a good extension/application of what the class is currently teaching, i.e., using Python's find method to search for URLs and using indices to return the string that follows "I don't mean to be".
There are probably much better ways to collect this data, but I thought this was a relevant and interesting connection.
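
For anyone who wants to try it, here is a rough sketch of the find-and-slice idea in the style CS101 uses. The function name `find_followups`, the list of stop characters, and the sample text are all made up for illustration, and the match is case-sensitive; real data would likely need more care.

```python
MARKER = "I don't mean to be"

def find_followups(text, marker=MARKER):
    """Collect the text that follows each occurrence of marker.

    Each result runs from the end of the marker up to the nearest
    stop character (an assumed set of punctuation) or the end of text.
    """
    results = []
    start = text.find(marker)          # -1 when the marker is absent
    while start != -1:
        begin = start + len(marker)    # index just past the marker
        end = len(text)
        # Pick the closest stop character after the marker, if any.
        for stop in (",", ".", "!", "?", "\n"):
            pos = text.find(stop, begin)
            if pos != -1 and pos < end:
                end = pos
        results.append(text[begin:end].strip())
        # Continue searching after the slice we just took.
        start = text.find(marker, end)
    return results

sample = ("I don't mean to be rude, but this is wrong. "
          "I don't mean to be a pedant but it's true.")
followups = find_followups(sample)
print(followups)
```

The loop is the same pattern the class uses for pulling links out of a page: call find, slice with the indices, then call find again starting from where the last slice ended.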