| HN Mirror

by drkevorkian 4584 days ago

  from collections import Counter
  for word, count in Counter(open('test.txt').read().lower().split()).most_common():
      print(word, count)

Goladus 4584 days ago

I like this one. Though I would do it like this to keep the lines under 80 characters:

    words = open('test.txt').read().lower().split()
    for word, count in Counter(words).most_common():
        print(word, count)

(edited per child comment)

benhoyt 4584 days ago

Yeah, those are nice -- and may actually be more efficient on smaller files, as you're only doing the lower() once on a big string. However, for big files you don't necessarily want to read the whole thing in at once.
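
For the big-file case, a minimal sketch of the streaming alternative might look like the following. It assumes Python 3, and the helper name `count_words` and the throwaway sample file are made up for illustration. Note that it calls `lower()` once per line rather than once on one big string, which is the cost mentioned above.

```python
import os
import tempfile
from collections import Counter


def count_words(path):
    """Count lowercased, whitespace-separated words, one line at a time."""
    counts = Counter()
    with open(path) as f:
        for line in f:  # the file object yields lines lazily
            counts.update(line.lower().split())
    return counts


# Demo on a small throwaway file so the sketch runs anywhere
# (hypothetical contents, not from the thread).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'test.txt')
    with open(sample, 'w') as f:
        f.write('The cat saw the dog\nthe dog ran\n')
    counts = count_words(sample)

for word, count in counts.most_common():
    print(word, count)
```

Only one line of text is held in memory at a time, although the `Counter` itself still grows with the number of distinct words.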

One nitpick: it's Pythonic (I think) to just name the list of words "words" rather than "word_list".

Goladus 4583 days ago

Yes, that's a classic tradeoff; a proficient programmer will have to pick one.

Personally, I always read entire files into memory first unless I have reason to believe memory will be an issue or I need to program defensively against malicious or careless input. The code is almost always much cleaner and easier to read, and if you need to do a second pass on the data you don't need to re-read it from disk.
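
As a rough illustration of the second-pass point, here is a sketch that reads the file once and then makes two passes over the same in-memory list. It assumes Python 3, and the sample text and the "longest word" pass are invented for the example.

```python
import os
import tempfile
from collections import Counter

# Write a small throwaway file so the sketch runs anywhere
# (hypothetical contents, not from the thread).
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, 'test.txt')
    with open(sample, 'w') as f:
        f.write('The cat saw the dog\nthe elephant ran\n')
    with open(sample) as f:
        text = f.read()  # the only disk read

words = text.lower().split()

counts = Counter(words)        # first pass: word frequencies
longest = max(words, key=len)  # second pass: reuses the in-memory list

print(counts.most_common(1), longest)
```

The tradeoff is the one discussed above: both passes are short and readable, but the whole file has to fit in memory.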
