Hacker News
by drkevorkian 4537 days ago

  from collections import Counter
  for word, count in Counter(open('test.txt').read().lower().split()).most_common():
      print(word, count)

I like this one. Though I would do it like this to keep the lines under 80 characters:

    from collections import Counter

    words = open('test.txt').read().lower().split()
    for word, count in Counter(words).most_common():
        print(word, count)
(edited per child comment)
Yeah, those are nice -- and may actually be more efficient on smaller files, as you're only doing the lower() once on a big string. However, for big files you don't necessarily want to read the whole thing in at once.
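
For the big-file case, a minimal sketch of a line-at-a-time version might look like the following. The function name `count_words_streaming` and its `path` parameter are made up for illustration; the only assumption is that words are whitespace-separated, as in the snippets above.

```python
from collections import Counter


def count_words_streaming(path):
    """Count lowercased, whitespace-separated words one line at a time."""
    counts = Counter()
    with open(path) as f:
        for line in f:  # the file object yields one line per iteration
            # lower() and split() now run per line rather than once per file
            counts.update(line.lower().split())
    return counts

# Hypothetical usage, mirroring the loop above:
#     for word, count in count_words_streaming('test.txt').most_common():
#         print(word, count)
```

Since split() with no arguments breaks on any whitespace, newlines included, this should produce the same counts as reading the whole file at once. Only one line's worth of text is held at a time, although the Counter itself still grows with the number of distinct words.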

One nitpick: it's Pythonic (I think) to just name the list of words "words" rather than "word_list".

Yes, that's a classic tradeoff; a proficient programmer will have to pick one.

Personally, I always read entire files into memory first unless I have reason to believe memory will be an issue or I need to program defensively against malicious or careless input. The code is always much cleaner and easier to read, and if you need to do a second pass on the data, you don't need to re-read it from disk.
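
As a rough illustration of the second-pass point, here is a small sketch. The `word_stats` name and the choice of "longest word" as the second pass are arbitrary examples of mine, not anything from the snippets above.

```python
from collections import Counter


def word_stats(path):
    """Read the file once, then make two passes over the in-memory words."""
    with open(path) as f:
        words = f.read().lower().split()  # the whole file is held in memory

    counts = Counter(words)  # first pass: word frequencies
    # Second pass reuses the same list, so nothing is re-read from disk.
    longest = max(words, key=len) if words else None
    return counts, longest
```

A streaming version would either have to track both results inside one loop or open the file a second time.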