by jerrygenser 1275 days ago
Ever since the start of Python typing, it was recommended to use a more generic type like Iterable instead of List. The author claims that List is too specific; this seems like a straw man argument against typing that doesn't acknowledge Python's own advice.

Also, mypy has gotten really good in recent years and I can vouch that on projects that have typing I catch bugs much, much sooner. Previously I would only catch bugs when unit testing; now they much more commonly surface as type errors.

The other thing typing does is allow for refactoring code. If anything, high code quality relates to the ability to refactor code confidently, and typing helps with this. Therefore I would put it at the top of the list, above all the tooling presented (except CI/CD, which I agree with).


Iterable is an import away, while list is already at my fingers.

There's zero harm in using list in private interfaces: I know I'm the only one passing the value, I know it is always a list.

As an argument type, Iterable is compatible with list, so its benefits are minimal (with rare exceptions).

Lists are easier to inspect in a debugging session.

Iterable can be useful as return type, because it limits the interface.

Iterable is useful if you are actually making use of generators because of memory implications, but in this case you already know to use it, because your interfaces are incompatible with lists.

I can count on the fingers of my hands the times when using Iterable instead of list actually made a difference.

> As an argument type, Iterable is compatible with list, so its benefits are minimal (with rare exceptions).

Iterable is not compatible with list, but list is compatible with Iterable. As the more general type, Iterable is better as an argument type unless you have a reason to force consumers to use lists. Even in private interfaces, I tend to prefer it, because I often end up wanting to pass something constructed on the fly, and creating an extra list for that rather than using a genexp just seems wasteful.
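
A minimal sketch of that point, assuming Python 3.9+ and a hypothetical `total` function that is not from the thread: an argument annotated as `Iterable[int]` accepts a list, a tuple, or a generator expression, so no intermediate list has to be built.

```python
from collections.abc import Iterable


def total(values: Iterable[int]) -> int:
    """Sum the values; the body only iterates, so Iterable is enough."""
    result = 0
    for value in values:
        result += value
    return result


# All three calls satisfy the Iterable[int] annotation.
from_list = total([1, 2, 3])
from_tuple = total((1, 2, 3))
from_genexp = total(n * n for n in range(4))  # squares of 0..3, no list built
```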

What I meant is that an argument marked as Iterable accepts a list being passed.

> Iterable is better as an argument type unless you have a reason to force consumers to use lists

See, I feel the exact opposite: I use Iterable only if I have a reason to force consumers to use Iterable.

When you're marking an argument as Iterable, how confident do you feel that you will never query collection size or access it by index?

I understand the desire to limit the interface and YAGNI, but since lists are more familiar and ubiquitous, using Iterable feels more complicated and unnecessarily pedantic.

> See, I feel the exact opposite: I use Iterable only if I have a reason to force consumers to use Iterable.

A broad argument type doesn’t force consumers not to use a narrower type. (It forces the implementer of the function to not rely on additional features of the narrower type, but if I am writing the function, I can be certain whether or not that is acceptable.)

Meanwhile, using a narrower type than needed for an argument does impose additional, unnecessary constraints on the consumer.

> When you're marking an argument as Iterable, how confident do you feel that you will never query collection size or access it by index?

Absolute certainty, since I know what the function does and what I need to do it.

> I understand the desire to limit the interface and YAGNI, but since lists are more familiar and ubiquitous, using Iterable feels more complicated and unnecessarily pedantic.

Since all lists are Iterables but not all Iterables are lists, Iterables are necessarily more ubiquitous than lists.

> Since all lists are Iterables but not all Iterables are lists, Iterables are necessarily more ubiquitous than lists.

Yeah, that's what I meant by being pedantic :)

Here's a question: you receive a JSON payload that contains a list. You will then pass this list to two functions, one of them only iterates, another one uses list interface (let's say checks length among other things). Should you mark the argument as a list, or as an Iterable in the first function?

Solely from the code perspective, it's definitely an Iterable. But in my mental model it still remains a list. I don't like it when code deviates from my mental model. Forcibly treating it as an Iterable only makes it more complicated, while not giving anything in return.

Sure, you could say that callee should not have expectations of the caller, but what if those functions are already coupled? They are in the same module, and argument names clearly denote a collection. The fact that in certain scenarios it is "technically Iterable" serves nothing but pedantic value.
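
The scenario in this comment can be sketched as follows. This is only an illustration: the function names and the payload shape are hypothetical, and Python 3.9+ is assumed. The first function only iterates, the second relies on `len()` and indexing, and the same JSON-decoded list is passed to both.

```python
import json
from collections.abc import Iterable


def count_active(users: Iterable[dict]) -> int:
    """Only iterates, so Iterable is the least it needs to ask for."""
    return sum(1 for user in users if user.get("active"))


def last_user_name(users: list[dict]) -> str:
    """Uses len() and indexing, so it asks for a list."""
    if len(users) == 0:
        return ""
    return users[-1]["name"]


# Hypothetical payload; json.loads returns a plain list here.
payload = json.loads(
    '[{"name": "ann", "active": true}, {"name": "bob", "active": false}]'
)
active = count_active(payload)
last = last_user_name(payload)
```

Annotating `count_active` with `list[dict]` instead would type-check just as well for this caller; the disagreement in the thread is about which annotation communicates intent better.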

> Solely from the code perspective, it's definitely an Iterable. But in my mental model it still remains a list.

A list is an iterable with special additional features, so this is no conflict at all.

> I don't like it when code deviates from my mental model

But how is there a deviation? Being an Iterable is part of being a list, not a deviation from it.

> Forcibly treating it as an Iterable only makes it more complicated, while not giving anything in return.

How is there anything “forcible”? Broader typing doesn’t “forcibly” impose anything. And it does give something: more freedom to callers.

> Sure, you could say that callee should not have expectations of the caller, but what if those functions are already coupled?

If there is coupling that exists for good cause and demands a list as the type of the data structure to be passed around, then, fine, use list. But usually Iterable or Sequence makes more sense; coding to interfaces that impose only what is actually required is better than coding to unnecessarily specific concrete types.
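
As a rough sketch of where Sequence fits (the `middle` function is hypothetical, Python 3.9+ assumed): when a function needs `len()` and indexing but not list-specific methods, `Sequence` states exactly that and still accepts tuples.

```python
from collections.abc import Sequence


def middle(items: Sequence[str]) -> str:
    """Needs len() and indexing, which Sequence provides and Iterable does not."""
    if not items:
        raise ValueError("items must not be empty")
    return items[len(items) // 2]


mid_of_list = middle(["a", "b", "c"])
mid_of_tuple = middle(("x", "y", "z", "w"))
```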

> Iterable is an import away, while list is already at my fingers.

`list` might be but `List` isn't. Are you not defining the type of the contents of the list?

typing.List is deprecated.

https://docs.python.org/3/library/typing.html#typing.List

> Deprecated since version 3.9: builtins.list now supports subscripting ([]). See PEP 585 and Generic Alias Type.
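
For illustration, assuming Python 3.9+ and a hypothetical `first_words` function: the built-in `list` can be subscripted directly, so the element type can be stated without importing `typing.List`. `Iterable` still needs an import, here from `collections.abc`.

```python
from collections.abc import Iterable


def first_words(lines: Iterable[str]) -> list[str]:
    """Return the first word of each non-blank line."""
    return [line.split()[0] for line in lines if line.strip()]


words = first_words(["hello world", "", "typed python"])
```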

> The other thing typing does is allow for refactoring code.

No. What allows you to refactor code confidently is automated tests. I honestly can't understand why people are so obsessed about types, especially in languages like Python or Javascript.

It's not just about types; it's about having interfaces, which I should expand on. I'm assuming there are automated tests anyway, and typing is additive to them. I should clarify that it's also about having defined interfaces and using the type system to define them.

By depending on interfaces/abstractions instead of specific cases you can refactor the interface and not break clients. It's very difficult to do this unless you have types.
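
One way to sketch this in Python is `typing.Protocol`. The `Notifier` interface and both implementations below are hypothetical, and Python 3.9+ is assumed. The client function depends only on the interface, so implementations can be swapped or refactored without touching it.

```python
from typing import Protocol


class Notifier(Protocol):
    """The interface clients depend on: anything with a matching send()."""

    def send(self, message: str) -> None: ...


class EmailNotifier:
    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        self.sent.append(f"email: {message}")


class LogNotifier:
    def __init__(self) -> None:
        self.sent: list[str] = []

    def send(self, message: str) -> None:
        self.sent.append(f"log: {message}")


def alert(notifier: Notifier, message: str) -> None:
    """Client code: knows only the Notifier interface, not the concrete class."""
    notifier.send(message)


email = EmailNotifier()
log = LogNotifier()
alert(email, "disk full")
alert(log, "disk full")
```

A type checker such as mypy would flag a class passed to `alert` that lacks a compatible `send` method; at runtime the Protocol itself enforces nothing.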

This is something that Go is really good at and encourages but can be done with python/js on top of their type systems.

> I honestly can't understand why people are so obsessed about types,

Types in Python feel like an added layer of confidence that my code is structured the way I expect it to be. PyCharm frequently catches incorrect argument types and other mistakes I've made while coding that would likely result in more time spent debugging. If you don't use any tools that leverage types you won't see any benefit.

> I honestly can't understand why people are so obsessed about types

It's a very powerful sanity check that lets me write correct code faster, avoiding stupid bugs that the unit tests will also, eventually, find.

And, to me, reading the code is much much nicer. Types provide additional context to what's going on, at first glance, so I don't have to try to guess what something is, based on its name:

    results: list[SomeAPIResult] = some_api.get_results()

is much easier to grok.

> I don't have to try to guess what something is, based on its name

It's probably just a bad example, but in case it isn't:

Sounds like you ended up at the same place. You went from guessing what some_api.get_results() is, based on its name, to guessing what SomeAPIResult is, also based on its name.

If some_api is your library, then you could have just added type hints to get_results() and let type inference do its job.

If it's a third party library, then using your custom SomeAPIResult means that code is becoming alien to other engineers that worked with that library in the past. It might be worth it, but it's definitely controversial. You probably should've done it with stubs anyway.

> guessing what is SomeAPIResult

I disagree. It’s not a guess, it is precisely what it is, where the variable name is free to betray me. A sane IDE/linter will tell me if my local assumption is incorrect, where a variable called result_SomeAPIResult relies on an assumed, possibly ancient, state of reality.

You do realize nobody writes code like that, right? Even in static typing land people rely on type inference.

list[SomeAPIResult] in your example is redundant. You can get all the benefits of types without it.
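
The two styles being debated can be put side by side. This is a sketch with stand-in definitions: `SomeAPIResult` and `get_results` are modeled here as a dataclass and a plain function, which the thread does not specify, and Python 3.9+ is assumed.

```python
from dataclasses import dataclass


@dataclass
class SomeAPIResult:
    value: int


def get_results() -> list[SomeAPIResult]:
    """The return annotation is what a type checker infers from."""
    return [SomeAPIResult(1), SomeAPIResult(2)]


# Inferred: a checker such as mypy derives list[SomeAPIResult] from the
# return annotation, with nothing written at the call site.
inferred = get_results()

# Explicit: the same type restated at the call site for the reader.
explicit: list[SomeAPIResult] = get_results()
```

Both variables have the same static type; the explicit form additionally makes the checker report a mismatch at this line if `get_results` later changes its return type.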

Type inference is good for the writer, not the reader.

Relying on type inference isn’t some rule. You can find many projects that use it selectively, being explicit where it makes sense. The point of writing code is to make it readable and maintainable. The explicit type isn’t redundant, it’s explicit in presentation, and can be functional, like my example.

I mean, just look at this example. You know the type without having to dig in, do you not? You don’t have to look at the function definition. You know immediately. That’s the point of being explicit, where it makes sense. No guessing, where it makes sense. This is why we have all these type hints now, in a dynamic language: because guessing sucks.

> What allows you to refactor code confidently is automated tests.

Typing facilitates automated testing; e.g., hypothesis can infer test strategies for type-annotated code.

Shorter feedback loops = increased productivity.

I got good use out of the run-time type checking of typeguard [0] when I recently invoked it via its pytest plugin [1]. For all code visited in the test suite, you get a failing test whenever an actual type differs from an annotated type.

[0]: https://github.com/agronholm/typeguard/

[1]: https://typeguard.readthedocs.io/en/latest/userguide.html#us...

> Ever since the start of Python typing, it was recommended to use a more generic type like Iterable instead of List. The author claims that List is too specific

Don't these statements contradict each other? List is too specific, and Sequence[item] is preferred. Sometimes you are dealing with a tuple, or a generator, and so it makes more sense to annotate that it is a generic iterable versus a concrete list.

From the original article:

> For example, you basically never care whether something is exactly of type list, you care about things like whether you can iterate over it or index into it. Yet the Python type-annotation ecosystem was strongly oriented around nominal typing (i.e., caring that something is exactly a list) from the beginning.

I'm saying that this quote is a straw man: contrary to what it claims, the ecosystem recommends Iterable[Item] or Sequence[Item] rather than List[Item] where applicable.

I think we agree; I'm not sure which part of my comment you think is contradictory.

Whether something is generic/specific depends on the context.

As an argument type, Iterable is permissive (generic).

As a return type, Iterable is restrictive (specific).
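
A small sketch of the return-type side, with hypothetical function names and Python 3.9+ assumed: by promising only `Iterable[int]`, the function restricts what callers may rely on, which in turn leaves the implementation free to switch from a list to a generator.

```python
from collections.abc import Iterable


def evens(limit: int) -> Iterable[int]:
    """Promises only iteration; happens to be backed by a list."""
    return [n for n in range(limit) if n % 2 == 0]


def evens_lazy(limit: int) -> Iterable[int]:
    """Same signature, backed by a generator; iterating callers are unaffected."""
    return (n for n in range(limit) if n % 2 == 0)


# Callers that stick to the declared interface work with either version.
eager = list(evens(7))
lazy = list(evens_lazy(7))
```

A caller that wrote `len(evens(7))` would run today but be rejected by a type checker, which is the restriction the comment describes.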