HN Mirror


by pedrovhb 1992 days ago

I think this is neat but I'm not sure it's the best way to go about things.

> all(map(func3, filter(func2, map(func1, zip(a, b)))))

> a.zip(b).map(func1).filter(func2).forall(func3)

The original is indeed terrible and the second version is a bit better. A lot better than either one, though, is splitting your logic into multiple lines and assigning a descriptive identifier to each step. Maybe even throw in some inline comments if you're particularly respectful of others' time.

As tempting as it is to do something super clever and cram a ton of functionality into a small number of lines or characters (it does feel good), it's just better to be a bit more verbose and write simple, obvious code. I feel like code should be read like a book, not a puzzle.
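A minimal sketch of the "one named step per line" style described above. The bodies of `func1`, `func2`, and `func3` and the input lists are hypothetical stand-ins, since the quoted one-liner does not say what they do:

```python
# Hypothetical stand-ins for func1, func2, func3 in the quoted one-liner.
def func1(pair):
    height, width = pair
    return height * width

def func2(area):
    return area > 4

def func3(area):
    return area % 2 == 0

a = [1, 2, 3]
b = [4, 5, 6]

# The nested form quoted above.
nested = all(map(func3, filter(func2, map(func1, zip(a, b)))))

# The same pipeline with a descriptive name for each step.
pairs = zip(a, b)                    # (1, 4), (2, 5), (3, 6)
areas = map(func1, pairs)            # 4, 10, 18
large_areas = filter(func2, areas)   # 10, 18
all_large_areas_even = all(map(func3, large_areas))
```

Each intermediate here is still a lazy iterator, so the named version does the same work as the nested one.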

10 comments

brundolf 1992 days ago

What I like about "cramming a ton of functionality into [a single expression]" is that it doesn't leak any intermediates to the rest of the block, and it doesn't allow for mutation. There's a single output exposed; you can't accidentally use the wrong value downstream. You could wrap it all in an inner function, I guess, but that seems like overkill unless you plan to reuse it.

Though to be fair, having explicit intermediate variables is idiomatic in Python, from what I've seen. It's one of my biggest pet-peeves about the language, but it's not without precedent.
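A rough sketch of the inner-function option mentioned above, assuming made-up inputs and a hypothetical helper name `big_areas`. The intermediate list is local to the function, so only the final result is visible to the caller:

```python
heights = [1, 2, 3]
widths = [4, 5, 6]

def big_areas(hs, ws, threshold=10):
    # `areas` is local to this function and is not visible to the caller.
    areas = [h * w for h, w in zip(hs, ws)]
    return [area for area in areas if area > threshold]

# Only the final value is exposed to the surrounding block.
result = big_areas(heights, widths)
```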

techdragon 1992 days ago

This is exactly the main situation where I'll happily "get clever" with my code.

It's not being reused, and one of the following is true: I don't want to leave behind intermediary objects for whatever reason is relevant, or I feel it's worth compressing the logic so I can use a language feature that requires an expression, like lambdas or list/dict comprehensions.

bko 1992 days ago

> a.zip(b).map(func1).filter(func2).forall(func3)

Let's make this a somewhat concrete example.

---

heights = [1,2,3]

widths = [4,5,6]

# printing area greater than 10

# functional

heights.zip(widths).map(to_area).filter(lambda area: area > 10).forall(lambda a: print(f"Area {a}"))

# verbose

hw_zipped = heights.zip(widths)

areas = hw_zipped.map(to_area)

big_areas = areas.filter(lambda a: a > 10)

for a in big_areas: print(f"Area {a}")

---

Which do you prefer? I would argue the right level of abstraction is the functional way in this example, and it's often the case in my experience, especially in Python, where you don't often use a namespace to store these intermediary variables and you can't rely on typing.

claytonjy 1992 days ago

As another point of comparison, as of Python 3.8 you can do this in one list comp, without nesting or computing the areas twice, by using the walrus operator:

    result = [area for x,y in zip(heights,widths) if (area := to_area(x,y)) > 10]

I don't think that's very easy to read; I'd opt for two list comps like

    areas = [to_area(x,y) for x,y in zip(heights,widths)]
    result = [area for area in areas if area > 10]

But I agree with OP that map+filter is easier to read.

bko 1992 days ago

I agree. My main problem is I don't want intermediary variables floating around, especially something like "areas". If Python scoped variables to the enclosing block, I wouldn't mind.

In scala:

---

val widths = Seq(1,2,3)

val heights = Seq(4,5,6)

widths.zip(heights).foreach { case (w, h) => {

  val area = w * h

  if (area > 10) {

    println(s"Area: ${area}")

  }

}}

println(area) // error: not found: value area

---

joshuamorton 1992 days ago

You can do this without the walrus in a one liner as well, I believe:

    [area for area in (to_area(x, y) for x, y in zip(h, w)) if area > 10]

Or, more generally, you can take a multiline statement like the one you have and replace each named value with its expression. Add some indentation and it's not too bad:

    [area for area in 
     (to_area(x, y) for x, y in zip(h, w))
     if area > 10]

syrrim 1992 days ago

  for x, y in zip(a,b):
      area = to_area(x, y)
      if area > 10:
          print(f"Area {area}")

>in python where you don't often use a namespace to store these intermediary variables

Hm? Most python code is within a function, in my experience.

bko 1992 days ago

You can abstract it out to a function, but I think it's overkill, even if you generalize to something like print_area_filter(heights, widths, value, cmp) or whatever.

If it's not in a function, your example may leave a floating variable called area out there (or may not, if either a or b has a length of zero).
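The empty-input caveat can be checked with a small experiment. This is only a sketch: it uses `exec` with a fresh dictionary to imitate top-level module code, inlines `x * y` in place of `to_area`, and the helper name `module_names_after_loop` is made up for illustration:

```python
# The loop from the comment above, run as if it were top-level module code.
# `x * y` stands in for to_area(x, y) to keep the sketch self-contained.
LOOP_SOURCE = """
for x, y in zip(a, b):
    area = x * y
"""

def module_names_after_loop(a, b):
    """Run the loop in a fresh namespace and list the names it leaves behind."""
    namespace = {"a": a, "b": b}
    exec(LOOP_SOURCE, namespace)
    # Ignore the inputs and the __builtins__ entry that exec adds.
    return sorted(set(namespace) - {"a", "b", "__builtins__"})

leaked = module_names_after_loop([1, 2], [3, 4])   # loop body ran
not_leaked = module_names_after_loop([], [3, 4])   # loop body never ran
```

With non-empty inputs the loop variables and `area` all remain in the namespace afterwards; with an empty input none of them are ever bound.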

lauriat 1992 days ago

I agree, and yes, the line may be a bit excessive. The idea of Arrays is not just to cram a heap of functions into a single line. The readability (at least to me) is improved even with, e.g., a single map

  arr.map(func)

vs.

  list(map(func, arr))
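To make the comparison concrete, here is a toy list subclass with a `map` method. It is an illustration only and should not be read as how funct.Array is actually implemented:

```python
class Array(list):
    """Toy list subclass for illustration; not funct.Array's real code."""

    def map(self, func):
        # Eagerly apply func to every element and wrap the result again,
        # so that further method calls can be chained.
        return Array(func(x) for x in self)

arr = Array([1, 2, 3])

method_style = arr.map(lambda x: x * 2)
builtin_style = list(map(lambda x: x * 2, arr))
```

Both spellings produce the same elements; the difference is only in reading order and in the type of the result.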

snicker7 1992 days ago

> assigning a descriptive identifier to each step

Working with data scientists, in practice, these identifiers are usually "arr1", "arr2", &c. I'd rather have method chaining. Often the intermediates are not meaningful.

disgruntledphd2 1992 days ago

I agree with you in general, people (especially data scientists) are bad at naming things.

It's probably the core skill of good programmers though, so it should be taught more. I don't think anyone sets out to use misleading names, but it's easy for name and code to diverge, and it's crippling to readability.

However, often when refactoring/updating such data scientist code (or even understanding), I need to break apart the long method chains, and this is much, much more annoying than dealing with crummy names.

At least I can print the values associated with the names, which is not easily possible in the really long method chain.
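One possible workaround for plain-iterator pipelines is a pass-through helper that reports each value as it flows by. The name `tap` and its `log` parameter are hypothetical, and a fluent class would need its own equivalent method:

```python
def tap(label, iterable, log=print):
    """Yield items unchanged, reporting each one as it passes through."""
    for item in iterable:
        log(f"{label}: {item!r}")
        yield item

heights = [1, 2, 3]
widths = [4, 5, 6]

seen = []  # collect messages here instead of printing, for demonstration
areas = tap("area", (h * w for h, w in zip(heights, widths)), log=seen.append)
big_areas = [a for a in areas if a > 10]
```

Dropping the `log=seen.append` argument prints each intermediate value without breaking the pipeline into named variables.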

derwiki 1992 days ago

Code is read more often than it’s written; optimize for reading.

dragonwriter 1992 days ago

> As tempting as it is to do something super clever and cram a ton of functionality into a small number of lines or characters (it does feel good), it's just better to be a bit more verbose and write simple, obvious code.

I find fluent style often clearer as well as more terse than with superfluous intermediate variables. Verbosity isn't the same thing as clarity.

(But in Python, comprehensions/genexps are often clearer than either.)

ElevenPhonons 1992 days ago

Are these really the same?

The idiomatic Python 3 version uses generators to compose the computation and to avoid unnecessary memory allocations. Does funct.Array also do this?

- https://docs.python.org/3/library/functions.html#map
- https://docs.python.org/3/library/functions.html#filter
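The laziness of the built-in version can be observed directly. The sketch below says nothing about funct.Array's behaviour; it only shows that Python 3's `map` does no work until consumed and that `all` stops early. The `record` helper is made up for the demonstration:

```python
calls = []

def record(x):
    # Remember every value the pipeline actually touches.
    calls.append(x)
    return x

lazy = map(record, [1, 2, 3])
calls_before_consuming = list(calls)   # nothing has been computed yet

# all() stops at the first falsy result, so the last element is never mapped.
result = all(x < 2 for x in lazy)
calls_after_consuming = list(calls)
```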

6gvONxR4sf7o 1992 days ago

You can split the a.b.c.d onto different lines and comment each, which is a decent middle ground sometimes (a\n.b\n.c\n.d). A problem, still, is exceptions and debugging. You get paged and see that something went wrong in that expression that does so many different things, and it’s much more frustrating to track down the bug. It makes step debugging trickier too. I’d love better error message/debugger support for that kind of programming.
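In Python this layout works when the chain is wrapped in parentheses. A small sketch using only built-in `str` methods, with a made-up input string:

```python
raw = "  Height, Width ,AREA  "

columns = (
    raw
    .strip()       # drop surrounding whitespace
    .lower()       # normalise case
    .split(",")    # one entry per column
)

# Tidy up the whitespace left around individual entries.
columns = [name.strip() for name in columns]
```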

rowanG077 1992 days ago

I disagree with this. Splitting this simple pipeline into more variables makes stuff a lot less readable. Splitting it into variables would very clearly indicate to me the intermediate computations are used elsewhere. Which wouldn't be the case here.

Phemist 1992 days ago

This feels like a strawman example. I feel like list comprehension results in a much more readable example here. I think, at least.

> all(func3(a) for h,w in zip(a,b) for a in func1(h,w) if func2(a))

lauriat 1992 days ago

Fair enough. Readability is subjective, but I understand the sentiment. Constructing list comprehensions from such long chained expressions can be rather tedious and error-prone, though (as your example shows).
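For reference, the quoted comprehension as written appears to reuse `a` as both an input list and a loop variable, and to iterate over the return value of `func1`. Two generator-expression spellings that do match the original `all(map(...))` pipeline are sketched below; the function bodies and inputs are hypothetical stand-ins:

```python
# Hypothetical stand-ins for the functions in the original one-liner.
def func1(pair):
    h, w = pair
    return h * w

def func2(area):
    return area > 4

def func3(area):
    return area % 2 == 0

a = [1, 2, 3]
b = [4, 5, 6]

original = all(map(func3, filter(func2, map(func1, zip(a, b)))))

# Generator expression over the mapped values.
as_genexp = all(func3(x) for x in map(func1, zip(a, b)) if func2(x))

# Python 3.8+: bind the mapped value with the walrus operator instead.
as_walrus = all(func3(x) for pair in zip(a, b) if func2(x := func1(pair)))
```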