by drahazar 1229 days ago

  2.14 True/False Evaluations
  Use the “implicit” false if at all possible.

This one is my personal bug-bear. I find this:

  if not users:
      ...
significantly worse than:

  if users == []:
      ...
The second is totally explicit, reminds the reader that users is (expected to be) a list and makes it totally clear that we can only enter the conditional block if users is an empty list.

The first option:

a) obfuscates the type of users on first reading

b) evaluates to True if users is None (or LOADS of other things?!) which can lead to hard-to-find bugs.

Granted, type-checking can help here but purely from a readability perspective the second option seems way more friendly and for almost no downside. The same holds true for all of the "False-y" objects:

  if users == {}:
  if users == 0:
  if users is None:
  if users == ():
  if users is False:
Why is the implicit:

  if not users:
an improvement in any of these cases?
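For reference, a rough sketch of how far apart the two checks are. The sample values and the helper names are made up for illustration:

```python
# Hypothetical sample values, chosen only to illustrate the difference.
samples = [[], (), {}, set(), "", 0, 0.0, None, False, [0], "a"]

def implicit_check(value):
    # What `if not users:` tests: truthiness of the value.
    return not value

def explicit_check(value):
    # What `if users == []:` tests: equality with an empty list.
    return value == []

falsy = [v for v in samples if implicit_check(v)]
equal_to_empty_list = [v for v in samples if explicit_check(v)]

print(len(falsy))            # 9 of the 11 samples are falsy
print(equal_to_empty_list)   # only the empty list itself: [[]]
```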

  If you need to distinguish False from None then chain the expressions, such as if not x and x is not None:.
!!!

Why not just:

  if x is False:

?
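To be fair to the guide, the two spellings are not equivalent. A minimal sketch (the function names are mine, not the guide's) shows the chained form also accepts 0, "" and empty containers, while the identity check accepts only False:

```python
def chained(x):
    # The style guide's spelling: falsy, but not None.
    return not x and x is not None

def identity(x):
    # The alternative asked about above: only the False singleton.
    return x is False

for value in (False, None, 0, "", []):
    print(repr(value), chained(value), identity(value))
```

Which one is right depends on whether "False but not None" is meant to include other falsy values.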
10 comments

Dynamic languages work because they are very polymorphic across types. Your function is more general if it doesn't have to know whether you pass a list, a tuple or a dict; the implicit false will work on all of those. Making strict type checks quickly makes dynamic typing impossible to work with, and then you start to require strict typing everywhere, and at that point I'd not work with python but in some other language.
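A small sketch of what I mean, with a made-up helper called describe that only relies on truthiness and iteration:

```python
def describe(users):
    # Hypothetical helper: it never asks which container type it received.
    if not users:
        return "no users"
    return ", ".join(sorted(str(u) for u in users))

print(describe([]))                  # no users
print(describe(("bob", "alice")))    # alice, bob
print(describe({"carol": 1}))        # carol
print(describe(set()))               # no users
```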

The same goes if you make a library: your library will be easier to use if you are less strict about the inputs you take, since that allows your user to work in a more naturally dynamic way. I love static types, but I have worked on making python libraries and there accepting a wide range of inputs is an important part of usability.

On the other hand, big codebases without type hints rely on their engineers remembering types of every argument. Or the functions being overly defensive.

Each to their own liking, I prefer knowing what argument types a function accepts so I don't need to think about it, and focus on writing business logic. If the function could accept more types, I'd just improve it.

I used type hints everywhere in python, they are orthogonal to what I talked about.
Isn't your first paragraph all about not knowing what argument types the function accepts just looking at its declaration, as it defeats the purpose of using a dynamic language?
Type hints support generics and abstract interfaces; you use those to display what behaviour you are using within the function and then you try to do what you need in the function as dynamically as possible.
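Something along these lines, as a sketch only. The function first_or_default is a hypothetical example, and Collection is just one of several abstract interfaces that could fit:

```python
from __future__ import annotations

from collections.abc import Collection
from typing import TypeVar

T = TypeVar("T")

def first_or_default(items: Collection[T], default: T) -> T:
    # Collection promises __len__, __iter__ and __contains__;
    # the body uses nothing beyond truthiness and iteration.
    if not items:
        return default
    return next(iter(items))

print(first_or_default([], "nobody"))            # nobody
print(first_or_default(("ann", "bo"), "nobody")) # ann
```

The hint documents the behaviour the function depends on without pinning callers to list.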

That is for library code, maybe it would be too cumbersome to try to do that for code with less reuse. I have never worked on a large python codebase that wasn't a library so I'm not sure what is best there.

This makes it easier for errors to go unnoticed in large codebases. If the function expects to take a list, and someone passes a tuple, it's likely that they passed the wrong value by accident.

For beginners, and in toy examples, it's kind of neat when code bends over backwards to work. Take this example:

    >>> def all_uppercase(lst):
    ...     return [s.upper() for s in lst]
    ...
    >>> all_uppercase('hi')
    ['H', 'I']
    >>>
Kind of neat, right? It's almost like a joke in code. Ha ha, iterating over a string gives you strings! But the charm of finding cute things to do with unexpected inputs doesn't scale. What is a helpful attitude at a small scale translates to "errors should manifest as far away as possible from the programming mistake that caused them" at a large scale. 99 times out of 100, if your code expects a list and somebody passes a tuple, they want a stack trace, not a return value.
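A minimal sketch of the stricter alternative, assuming the function really is meant to take a list only (all_uppercase_strict is a hypothetical name):

```python
def all_uppercase_strict(lst):
    # Fail at the call site instead of quietly returning something odd.
    if not isinstance(lst, list):
        raise TypeError(f"expected list, got {type(lst).__name__}")
    return [s.upper() for s in lst]

print(all_uppercase_strict(["hi", "there"]))   # ['HI', 'THERE']

try:
    all_uppercase_strict("hi")
except TypeError as exc:
    print("rejected:", exc)
```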

> I have worked on making python libraries and there accepting a wide range of inputs is an important part of usability

I work with a large Python codebase at work, and this is a frequent source of frustration. I frequently track down bugs and find that on some untested code path our code passes nonsensical values of the wrong type to a third-party library, and the library just... finds some way of interpreting it.

Even if all the code paths get tested, they can't be tested with every possible input. Property-based testing seems like overkill for our application, and our tests are already almost slow enough to be annoying. And what if the third-party library is side-effecting in a way that's hard to test? It gets mocked out. And I find that the mocks are configured to expect the nonsensical values, because the original programmer found that they "work."

All because libraries don't want to make an unfriendly impression by throwing a stack trace.

> Why is the implicit:

> if not users:

> an improvement in any of these cases?

Function iterates over the input. User provides a list,

     if users == ():
test fucks up because lists and tuples are never equal.

Literally no gain, only pain.
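Quick sketch of the failure mode (helper names are made up):

```python
def is_empty_explicit(users):
    # The explicit spelling, written against tuples.
    return users == ()

def is_empty_implicit(users):
    return not users

print(is_empty_explicit([]))   # False: a list never equals a tuple
print(is_empty_implicit([]))   # True
print(is_empty_implicit(()))   # True
```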

I don't quite understand. In the case where the function is expecting a list, why would you want to execute the logic for "empty list" on an empty tuple?

Wouldn't this just be the developer using the wrong comparison for the types the function is expecting (hence more reason to be explicit instead of using the implicit false)?

> I don't quite understand. In the case where the function is expecting a list, why would you want to execute the logic for "empty list" on an empty tuple?

Because for most functions that's not a relevant or useful distinction, in Python a tuple is an immutable list, both are sequences.

By mis-handling empty tuples you're just unnecessarily constraining the caller. Not only that, but you might also create an inconsistency which is hard for the caller to notice if your function only fucks up on empty collections.

Good point!
A good answer is because you don't necessarily know if the thing you're getting is a list, a tuple, or a RepeatedCompositeFieldContainer (a protobuf list), or some other type that meets the Sequence/MutableSequence abstract base class contract. `if not foo` will check that they're all empty, while `if foo == []` will have unexpected behavior if foo starts returning a set tomorrow.

The generalization of this is to code against as generic an api as possible: you wouldn't do `list.__eq__(x, y)` in your code, but you're suggesting almost exactly that.

(granted you can still run into this kind of issue if foo is a generator, but that's a less common way to explode).
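A short sketch of both caveats, the set case and the generator case (variable names are illustrative):

```python
# A set compared against a list literal is never equal, even when both are empty.
empty_set_equals_list = set() == []
empty_set_is_falsy = not set()

# A generator object is always truthy, even if it will yield nothing.
gen = (user for user in [])
generator_looks_empty = not gen
generator_contents = list(gen)

print(empty_set_equals_list)   # False
print(empty_set_is_falsy)      # True
print(generator_looks_empty)   # False
print(generator_contents)      # []
```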

The style guide does tell you to use explicit `x is None`, instead of implicit bool when checking noneness, specifically to disambiguate between binary and ternary values, but usually that's not what you want.

>type-checking can help

If we use Python as a strongly typed language, it makes no difference which one you use.

If we don't (i.e. use Python as it is: a dynamically-typed language), then this is just a preference.

Using (or exploiting, depending on how you think) truthiness this way is actually an intentional choice in lots of cases, especially if you have an "else" branch.

Think of it this way: you're going to split the conditions into two: `users` is non-empty, which is the "good" condition; and `users` is empty, which is the "bad" condition.

Then you have unexpected condition that "users" is something that shouldn't be, most commonly being None. In most of cases, this is a "bad" condition. So it makes sense it's grouped together with `users == []`.

If `users` is "True" or "False" as you said (which you should ensure to not happen in other ways anyway), then indeed it will not be captured by `users == []`, but it would still be broken/unmanaged in "else" side.

> If we use Python as a strongly typed language, it makes no difference which one you use. If we don't (i.e. use Python as it is: a dynamically-typed language),

Nitpick: Python is strongly typed, it's also dynamic. The strongly-weakly typed axis is different from the static-dynamic axis.
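A tiny sketch of the two axes:

```python
# Dynamic: the same name can be rebound to a value of another type.
value = 1
value = "1"

# Strong: the interpreter refuses to coerce str and int implicitly.
try:
    result = value + 1
except TypeError as exc:
    result = f"TypeError: {exc}"

print(result)
```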

Thanks for letting me know
> Think of it this way: you're going to split the conditions into two: `users` is non-empty, which is the "good" condition; and `users` is empty, which is the "bad" condition.

In a lot of cases though, an empty collection isn't a "bad" condition at all, e.g. it's a valid collection to apply filters/maps to.

Similarly when people get used to doing "if not i" for ints, but then forget about the times that zero is a valid value.
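A sketch of the zero problem, using a made-up retries parameter where 0 is supposed to mean "do not retry":

```python
def describe_retries(retries=None):
    # Buggy: 0 is falsy, so it is treated the same as "not provided".
    if not retries:
        return "using default of 3 retries"
    return f"using {retries} retries"

def describe_retries_fixed(retries=None):
    # Only None means "not provided"; 0 is respected.
    if retries is None:
        return "using default of 3 retries"
    return f"using {retries} retries"

print(describe_retries(0))         # using default of 3 retries
print(describe_retries_fixed(0))   # using 0 retries
```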

It's true that dynamic coercion is a feature of the language, but coding conventions generally are often about enforcing "least surprise" to remove a burden from the person reading the code.

> an empty collection isn't a "bad"

Then you don't need to check if it's empty to begin with.

I think they're making the distinction between special cases and error cases.

An empty list could be valid (e.g. a search of users providing no results) so you still need to differentiate between "special cases that need special logic" and "bad input".

Lumping the two together in one `if` block makes any code less readable imo, because they're not the same thing.

Yes, agreed, but the trouble is that if you see "if not users", you have to second-guess the intent behind it.

Is an empty list being routed to the else branch because it is an error in this instance or because it's an error in 90% of the codebase so the author forgot to handle it explicitly here?

Or is the author always expecting users will be a full or empty list and that other falsy values will never occur?

Ideally you'd use `if len(x) == 0`, which handles lists but also list-like things like tuples, while not letting None and False through.
len will count the object whereas “not” will only check for truthiness. This can matter in cases where counting takes significantly longer; for example, a sql query set.
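A sketch of both points. LazyResults is a hypothetical stand-in for an expensive collection and assumes the type defines its own cheap `__bool__`; if a type defines only `__len__`, `not` falls back to calling it.

```python
def is_empty(x):
    # len() raises TypeError for objects without a length (None, False, 0, ...).
    return len(x) == 0

for candidate in ([], (), "", [1], None, False):
    try:
        print(repr(candidate), is_empty(candidate))
    except TypeError:
        print(repr(candidate), "TypeError")

class LazyResults:
    def __init__(self):
        self.len_calls = 0

    def __bool__(self):
        # Cheap existence check.
        return True

    def __len__(self):
        # Pretend this is the expensive full count.
        self.len_calls += 1
        return 1000

results = LazyResults()
is_falsy = not results          # uses __bool__, never touches __len__
print(is_falsy, results.len_calls)
```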
This is purely subjective. First of all, I prefer putting type hints everywhere. PyCharm helps reminding one in a lot of cases.

If there is a bug and somebody passes invalid type to my function, I would just fix it and move on.

Very often both None and empty list are not an interesting case and I return early. Thus `not users` makes sense.

There are cases, though, when None means a sane default should be used instead and you can't use default argument values due to mutability. In these cases `users is None` makes perfect sense.
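The usual shape of that, as a sketch with a made-up add_user function:

```python
def add_user(name, users=None):
    # None is the sentinel for "caller passed nothing". A default of []
    # would be created once and shared across calls.
    if users is None:
        users = []
    users.append(name)
    return users

print(add_user("ann"))          # ['ann']
print(add_user("bo"))           # ['bo'], not ['ann', 'bo']

existing = []
add_user("cy", existing)
print(existing)                 # ['cy']
```

Writing `if not users:` here would replace a caller's empty list with a fresh one, so `existing` would stay empty.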

There are also cases I explicitly check for True and False - tests. In such cases I wouldn't rely on truthy/falsy values and assert True and False values by reference.

Having said that, it's all subjective. You like this style, someone else likes another style. What ultimately matters are two things: automatic formatters and consistency.

One thing I haven't seen anyone else mention, besides it not being idiomatic Python, is that "if users == []" and its ilk allocate a new empty object for the comparison. It's unnecessarily slow and wasteful.

Others have mentioned that the comparison to say a tuple will also fail. If the intent is to ensure a list instance, use isinstance, instead.
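For example, a minimal sketch (is_empty_list is a hypothetical helper):

```python
def is_empty_list(users):
    # True only for an actual list (or list subclass) with no elements.
    return isinstance(users, list) and not users

print(is_empty_list([]))      # True
print(is_empty_list(()))      # False
print(is_empty_list(None))    # False
print(is_empty_list([1]))     # False
```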

You are 100% right. Truthiness leads to bugs. I would have thought that was well known by now.
When you're comparing collection literals, often type checking is part of equality.
No, this isn't explicit enough:

    if users == []:
Nor is this:

    if (users == []) == True:
Nor is this:

    if ((users == []) == True) == True:
...

/s