by stfwn 2905 days ago
This immediately looks useful for things like:

    if foo := bar.get(baz):  # .get() so a missing key yields None instead of raising KeyError
        bar[baz] += 1
        return foo
    else:
        bar[baz] = 1
        return 0
Where bar is a dict keeping track of multiple things, and a non-existing key (baz) is never an error but rather the start of a new count. Faster and more readable than

    if baz in list(bar.keys()):
    ....
Similar to Swift’s ‘if let’, it seems.

The place I see using it is in (quoting Python's "python.exe-gdb.py"):

        m = re.match(r'\s*(\d+)\s*', args)
        if m:
            start = int(m.group(0))
            end = start + 10

        m = re.match(r'\s*(\d+)\s*,\s*(\d+)\s*', args)
        if m:
            start, end = map(int, m.groups())
With the new syntax this becomes:

        if m := re.match(r'\s*(\d+)\s*', args):
            start = int(m.group(0))
            end = start + 10

        if m := re.match(r'\s*(\d+)\s*,\s*(\d+)\s*', args):
            start, end = map(int, m.groups())
This pattern occurs just often enough to be a nuisance. For another example drawn from the standard library, here's modified code from "platform.py":

    # Parse the first line
    if (m := _lsb_release_version.match(firstline)) is not None:
        # LSB format: "distro release x.x (codename)"
        return tuple(m.groups())

    # Pre-LSB format: "distro x.x (codename)"
    if (m := _release_version.match(firstline)) is not None:
        return tuple(m.groups())

    # Unknown format... take the first two words
    if l := firstline.strip().split():
        version = l[0]
        if len(l) > 1:
            id = l[1]
It's a problem with the re module, really.

re.match should return a match object no matter what, and .group() should return strings, or an empty string if none were matched.

I don't see how that would improve things. Could you sketch a solution based around your ideas?
Don't wait for 3.8, and don't bother with defaultdict.

collections.Counter is what you want for the counting case.

dict.get() + dict.setdefault() for the general case.

defaultdict is only useful if the factory is expensive to call.
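A minimal sketch of the two suggestions above, using made-up sample words; the variable names are illustrative only:

```python
from collections import Counter

# Counting case: Counter treats a missing key as 0, so += just works.
counts = Counter()
for word in ["spam", "eggs", "spam"]:
    counts[word] += 1

# General case: setdefault() inserts the default only when the key is absent,
# while get() reads a value without inserting anything.
groups = {}
for word in ["spam", "eggs", "salad"]:
    groups.setdefault(word[0], []).append(word)

missing = groups.get("z", [])  # "z" is not added to groups
```

Reading a missing key from a Counter returns 0 without storing it, so lookups alone don't grow the dict.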

As pointed out, you can either use a defaultdict or, simply and [more pythonically](https://blogs.msdn.microsoft.com/pythonengineering/2016/06/2...):

    try:
      bar[baz] += 1
    except KeyError:
      bar[baz] = 1
Also, you can check whether a key is in a dict simply with "if baz in bar". There is no need for "list(bar.keys())", which will be slow (temporary list plus a linear scan) compared with an O(1) hashmap lookup.
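For illustration, a small sketch of the two membership checks side by side (the dict contents are made up). Both give the same answer; only the cost differs:

```python
bar = {"a": 1, "b": 2}
baz = "b"

# Direct membership test: a single hash lookup, no temporary objects.
fast = baz in bar

# Builds a list of every key first, then scans that list linearly.
slow = baz in list(bar.keys())

same_answer = fast == slow
```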
The error-catching method seemed too drastic to me before, but the article explains the LBYL vs. EAFP argument quite well. Thanks!

I should find a way to get more code reviews; I really enjoy learning these small nuggets of info.

Alternatively

`bar[baz] = bar.get(baz, 0) + 1`

One line and no error checking.

But the OP was probably just illustrating a basic example where you might have some more involved logic.

It also saves time, since the hash lookup needs to be done at most once. GP has two lookups in the hash table.
For stuff like that I'd just use `defaultdict`. That if/else tree then reduces to 2 lines total.
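Roughly what that could look like, as a sketch; `bump` is a hypothetical wrapper name, and it assumes the goal is to return the previous count, as in the original snippet:

```python
from collections import defaultdict

bar = defaultdict(int)  # missing keys start at int() == 0

def bump(baz):
    """Increment the count for baz and return the count it had before."""
    bar[baz] += 1
    return bar[baz] - 1
```

Note that merely reading `bar[baz]` on a defaultdict inserts the key, which may or may not be what you want.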
That’s a good tip, thanks!