HN Mirror



	by georgehotelling 993 days ago
	I'm just happy for itertools.batched for chunking iterables: https://docs.python.org/3.12/library/itertools.html#itertool...

6 comments

philshem 993 days ago

Yes. I’ve written explicit code that needed this 100s of times.


intalentive 993 days ago

Yeah I've memorized it by now:

    for i in range(len(lst) // batch_size + 1):
        batch = lst[i * batch_size : (i + 1) * batch_size]


hddqsb 993 days ago

You have a minor bug -- when len(lst) is a multiple of batch_size, this will have an extra iteration at the end with an empty batch. The fixed version is `range((len(lst) + batch_size - 1) // batch_size)`, which emulates `ceil(len(lst) / batch_size)`. Yet more proof that this should be part of stdlib :)
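To make the off-by-one concrete, here is a small sketch comparing the two `range` expressions. The helper names `batches_buggy` and `batches_fixed` are made up for illustration, and both assume `lst` is a list and `batch_size` is a positive int.

```python
def batches_buggy(lst, batch_size):
    # Original expression: yields a trailing empty batch when
    # len(lst) is an exact multiple of batch_size.
    return [lst[i * batch_size:(i + 1) * batch_size]
            for i in range(len(lst) // batch_size + 1)]


def batches_fixed(lst, batch_size):
    # Integer ceiling division: (a + b - 1) // b == ceil(a / b) for b > 0.
    n_batches = (len(lst) + batch_size - 1) // batch_size
    return [lst[i * batch_size:(i + 1) * batch_size]
            for i in range(n_batches)]


print(batches_buggy(list(range(6)), 3))  # [[0, 1, 2], [3, 4, 5], []]
print(batches_fixed(list(range(6)), 3))  # [[0, 1, 2], [3, 4, 5]]
print(batches_fixed(list(range(7)), 3))  # [[0, 1, 2], [3, 4, 5], [6]]
```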

Personally I think I'd actually write it like this:

    for i in range(0, len(lst), batch_size):
        batch = lst[i:i+batch_size]

The docs give another pretty nice implementation using iter() and islice() in a loop (but it uses the walrus operator `:=` so it requires Python 3.8+ as written).
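For reference, the iter()/islice() approach looks roughly like the sketch below. It is written from memory of the docs recipe, so treat it as an approximation rather than the exact text; the name `batched_sketch` is mine, chosen to avoid shadowing the real `itertools.batched`.

```python
from itertools import islice


def batched_sketch(iterable, n):
    # Approximation of the docs recipe; requires Python 3.8+ for ':='.
    if n < 1:
        raise ValueError("n must be at least one")
    it = iter(iterable)
    # islice pulls up to n items; an empty tuple means the input is exhausted.
    while batch := tuple(islice(it, n)):
        yield batch


print(list(batched_sketch("ABCDEFG", 3)))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G',)]

# Unlike the slicing versions, this also works on inputs without len(),
# such as generators.
print(list(batched_sketch((x * x for x in range(5)), 2)))
# [(0, 1), (4, 9), (16,)]
```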


rossant 993 days ago

Blatant reason why a native solution was long overdue.


kastden 993 days ago

This is the greatest addition since f-strings!


ehsankia 992 days ago

99% of my more_itertools imports are exactly for this.

There are one or two other things from more_itertools that I think should make it into itertools. I'd actually like to see usage statistics for the various more_itertools functions across huge monorepos and open-source projects.


miiiiiike 992 days ago

Same but for the ‘batch’, ‘ibatch’, and ‘abatch’ functions I started writing back in 2008.


kzrdude 993 days ago

Great call-out! (Mistakes elided..)


LegionMammal978 993 days ago

What do you mean by "empty sequence in"? The function doesn't raise if the input iterable is empty: it only raises if the chunk size n is 0. While that does have a natural interpretation of returning an infinite sequence of empty tuples, such a behavior would be qualitatively different than for other chunk sizes. The caller would never be able to retrieve any elements from the input iterable, and the output would be infinite even if the input is finite. In that light, it makes some sense (IMO) to avoid letting applications hit such an edge case unintentionally.
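A quick sketch of what the "natural interpretation" for n = 0 would look like, to show why it is qualitatively different. `batched_allow_zero` is a hypothetical variant with no check on n; it is not what the stdlib does (the stdlib function raises ValueError instead).

```python
from itertools import islice


def batched_allow_zero(iterable, n):
    # Hypothetical: same chunking idea, but n == 0 is permitted.
    it = iter(iterable)
    while True:
        batch = tuple(islice(it, n))
        if n > 0 and not batch:
            return  # input exhausted (only detectable when n > 0)
        yield batch


finite_input = iter([1, 2, 3])

# With n == 0 the output never ends, so we have to cut it off ourselves.
print(list(islice(batched_allow_zero(finite_input, 0), 5)))
# [(), (), (), (), ()]

# The input was never advanced: all elements are still there.
print(list(finite_input))
# [1, 2, 3]
```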


kzrdude 993 days ago

I skimmed that too quickly and was mistaken. Thanks.


abyesilyurt 992 days ago

Check out more-itertools for more variants: https://pypi.org/project/more-itertools/


zem 993 days ago

yeah! that's been in the ruby stdlib practically from day one, no idea why python was so resistant to it.
