Nice read. I did something similar with the same dataset about a year ago: I compared LDA (Latent Dirichlet Allocation) to TF-IDF as tools for finding similar beers based on their review text. LDA surfaced lots of intuitive and funny topics.
I suggest you play with LDA; it seemed to work really well at generating topics. There is also a lot of fascinating, very readable research using it. Check out SNAP's work on the same dataset [1] and some of the Yelp Dataset Challenge winners [2]. If you do try it, Gensim [3] was pleasant enough to work with.
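
If you want a baseline before reaching for Gensim, here is a rough, stdlib-only sketch of the TF-IDF side of that comparison: build a TF-IDF vector per beer from its review text, then rank other beers by cosine similarity. The review snippets, beer names, and function names are all made up for illustration; they are not taken from the dataset or from the comparison described above, and the smoothed IDF formula is just one common variant.

```python
import math
from collections import Counter

# Toy review text, invented for illustration (NOT from the real dataset).
reviews = {
    "stout_a": "roasted coffee chocolate dark creamy",
    "stout_b": "dark chocolate roasted malt coffee",
    "ipa_a": "hoppy citrus pine bitter grapefruit",
}

def tfidf_vectors(docs):
    """Return {name: {term: weight}} using a smoothed TF-IDF weighting."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many reviews does each term appear?
    df = Counter()
    for tokens in tokenized.values():
        df.update(set(tokens))
    vectors = {}
    for name, tokens in tokenized.items():
        tf = Counter(tokens)
        vectors[name] = {
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def most_similar(name, vectors):
    """Name of the other beer whose vector is closest to `name`'s."""
    others = (other for other in vectors if other != name)
    return max(others, key=lambda other: cosine(vectors[name], vectors[other]))

vectors = tfidf_vectors(reviews)
print(most_similar("stout_a", vectors))
```

The LDA version has the same shape: swap the TF-IDF vectors for per-beer topic distributions and compare those instead.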