Hacker News new | ask | show | jobs
by patresh 2125 days ago
I find that the fact that the functions min and max have the same name as the variables min and max increases cognitive load which makes it harder to think about it.

I find the following easier to read :

  Math.min(Math.max(num, lower_bound), upper_bound)
7 comments

Easy to remember, but may take some time to grasp:

  Arrays.sort( {lower_bound, num, upper_bound} )[1];
Next challenge: teach the optimizer to make that almost as fast as the min/max way ;-)

(You can’t reduce it to the min/max call because it also works if you accidentally pass a lower bound that’s larger than the upper bound. Worst-case, the above takes 3 comparisons, unless at least two of the inputs are constants)

> Next challenge: teach the optimizer to make that almost as fast as the min/max way ;-)

I did exactly this for my PhD!

https://chrisseaton.com/phd/

I'm glad things like this are being worked on. I have been writing a set of implementations of the nBody benchmark in JavaScript using various forms of abstractions. In an ideal world they should all take the same speed since they perform the same fundamental task and produce the same result. They just represent different scales of optimization effort.

It's interesting seeing the difference between vectors in arrays vs objects and if you do immutable versions. The trickiest to optimize form is using a micro vector library which uses closures and array map().

    var vop = op => ((a, b) => (a.map((v, i) => op(v, b[i]))));
    var vdiff = vop((a, b) => a - b);
    var vequals = (a, b) => {  return (vdiff(a, b).reduce((c, d) => c + Math.abs(d), 0)) === 0;};
    var vadd = vop((a, b) => a + b);
    var vdot = (a, b) => a.reduce((ac, av, i) => ac += av * b[i], 0);
    var vlength = a => Math.sqrt(vdot(a, a));
    var vscale = (a, b) => a.map(v => v * b);
    var vdistance = (a, b) => vlength(vdiff(a, b));
Currently on my lowly atom laptop and FireFox. The version using mutable objects is ten times faster than the mapping immutable array version. I live in hope that one day there will be an optimizer that turns

    var transpose = matrix => (matrix.reduce(($, row) => row.map((_, i) => [...($[i] || []), row[i]]), []));
    var multiply = (a, b, ...rest) => (!b) ? a : multiply(a.map((p => transpose(b).map(q => vdot(p, q)))), ...rest);
Into a CPU optimum (or dare I suggest GPU) version of a matrix multiply.
For those reading, this work went into TruffleRuby, which implements Ruby on top of Truffle/GraalVM.
Very noble, but I'm assuming (hoping) the technique will be more generally applicable than Ruby?
Wait, how is that not trivial?
> how is that not trivial?

It could be trivial to implement an optimisation which does this for that exact code. But what are you going to do? Hand-code an optimisation for every similar thing people could write? I implemented a general solution.

So it also works through metaprogramming:

    [1, 2, 3].send(:sort).send(:[], 1)
Through user-defined sorting order:

    [1, 2, 3].sort_by { |a, b| b <=> a }[1]
When nested:

    [[1, 2].sort[1], 3].sort[0]
And so on.

Note that it also needs to be transparent to debuggers and profilers, it needs to handle multiple method redefinitions (for example what happens if someone redefines the sorting order for integers).

It's not a pattern-matching optimization - it's partial evaluation enabled by a new kind of polymorphic inline cache.

> But what are you going to do? Hand-code an optimisation for every similar thing people could write?

While I appreciate that you solved the general problem, I wonder if there are legs here. Specifically, could one mine GitHub to find automatically common patterns that one could write specific optimizers for, or at minimum, leverage that to learn what semi-general cases are worth optimizing? To my knowledge optimizing compilers already do have effectively handlers for common operations, but I don’t know if anyone has leveraged “big data” to help guide this.

IIRC, various tweaks and optimizations in Java were guided by Sun analyzing their own code based. GitHub is just so much bigger, and polyglot.

It's not just a hardcoded optimization for that construct.
You know what's great about that? The order of the arguments doesn't matter. So all the debate about "should it be num, min, max or min, num, max"- your solution does not care. Put them in any order you like!

You've redefined the problem from clamping a given value into picking the middle value from 3. This is a lovely way to re-interpret it.

Well, the order of the arguments does matter. Sorting three values requires 2.67 comparisons where clamping a value between two other values requires exactly 2. There are plenty of contexts where cleverly avoiding a problem by doing 33% more work isn't viewed as desirable.
True, but I think there are more situations where minimizing the chance of programmer error, now or in the future, is more important than a micro-optimization that will never matter. All depends on context.
Beware that in JavaScript, the language that this tweet is about, the sort function sorts alphabetically by default.

    [100, 10, 11].sort()[1] === 100
Most popular programming language folks. You give it 3 numbers and it treats them as 3 strings of characters...
Wait, really?

BRB, just checking some code…

Yep, the sort method converts values to strings before comparing. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe...
I was confused why the author claims this doesn't work, then I saw JavaScript... ah
If you are going to call a library function, just call median().
I think some of us (most? all?) develop a tendency to act/be clever instead of act/be clear. When I resist that temptation, I ask for something like:

    clamped(num, range=[ lower_bound, upper_bound ]);
and then have no interest in how this actually gets implemented.
this doesn't work for NaN, e.g {0d, Double.NaN, 1d} returns 1d.
Does NaN have an "order" in the set of reals or integers or whatever? I would have no idea what to expect from `min(NaN, x)` or max same. But is it specified by an IEEE standard or something?
Both min and max should return NaN, if any of their parameters is NaN. Sorting can be defined where to place the NaNs (head/tail) but it's largely irrelevant in this case as simply the substitution won't be permitted by any compiler.

NaN is part of IEEE754 but of course it's not a 'real' number (integer numbers don't have NaNs)

Edit: you can consider NaN (and to a degree both infinities) as an exception, once it occurs - it has to be propagated. Any operation involving NaN should be returning NaN, any operation comparing NaN to anything has to return 'false'. That includes "if (NaN == NaN)". boolean isNaN(double d) is effectively "return d != d;"

> Both min and max should return NaN, if any of their parameters is NaN.

That's not a given, though. IEEE 754-2008 defined min and max as returning the non-NaN parameter. They have been removed in IEEE 754-2019 though.

The reference to IEEE 754 is made later on, mostly to answer the question posted. I meant regular functions in C alike languages - Math.min/max - java/javascript, fmin/fmax - C++. They do the "right" thing to propagate the NaN
No, the only valid result for NaN is NaN. Saturating range bounds are for real numbers.
No, it breaks the ordering requirements. NaN compares greater than and less than every number.

You can say that a call to max should return NaN if any argument is NaN, but you can't say the same about sorting. (For one thing... sorting an array doesn't return a scalar value.) Sorting is done with comparisons, and what happens if a NaN gets into the list of values will depend on which specific comparisons happen to be done.

The only reason I know what to expect is because Suckerpinch on YouTube made a video in which he managed to define a logic system using NaN and +∞, and does so by abusing min and max, among other expressions:

https://youtu.be/5TFDG-y-EHs

> Does NaN have an "order" in the set of reals or integers or whatever?

By definition, something that is not a number (real, integer, etc.) cannot be compared to something that is a number.

It depends on what space you're working on (e.g. the https://en.wikipedia.org/wiki/Extended_real_number_line define an order on the real field union {-∞, +∞}).
Yes, but in that context, ∞ is a number. We often interpret "NaN" to mean "infinity," but it only means "not a number." Maybe I'm being pedantic, but if we want a token representing infinity as a number, it ought not be called "not a number."
typeof NaN
So I tried a thing in python3.

    >>> max(float('nan'), 0)
    nan
    >>> max(0, float('nan'))
    0
Numpy works though

    >>> np.maximum(float('nan'), 0)
    nan
    >>> np.maximum(0, float('nan'))
    nan
Edit: Fixed numpy example.
The functions minNum and maxNum ([IEEE 754-2008, 5.3.1, p19]) take two arguments and return the min and max, respectively. They have the special, distinguished property that “if exactly one argument is NaN, they return the other. If both are NaN they return NaN.”

Source: http://tom7.org/nand/nand.pdf

Edit: As of 2019, the formerly required minNum, maxNum, minNumMag, and maxNumMag in IEEE 754-2008 are now deleted due to their non-associativity. [https://en.wikipedia.org/wiki/IEEE_754#2019]

And for this particular use case, numpy has .clip(), that takes:

    np.clip(ndarray, lower_bound, upper_bound)
And handles NaNs correctly.
Having a NaN at that point feels like a bug anyway, the solution is probably to check the arguments and throw an exception if NaN is provided (or use an input type that doesn't allow invalid values)
That would depend on what you do with the NaNs. For instance I have been using them extensively in time series data representation to denote a specific entry has no value - think of Saturday and stock/forex markets.
That's not what NaN means. It means there is a value, and that value is... Check this out... "Not a Number".

Recording the lack of a value is what null is for.

Null does not pertain to primitive types in java and in C - it'd be zero when applied to a 'double'. (note: you want double[] as backing storage in java and you absolutely do not indirections). Aside that I have quite a good idea how NaN is represented internally and what it does, e.g. you can have several different NaNs that have different representation bitwise. Baring that - NaNs are pretty decent to represent lack of value as all operations with them result into a NaN. In the end NaN is just a composition of bits that the hardware can optimize for.
ohh this is clever!
Assuming we know (lower_bound <= upper_bound) I'd write:

Math.max(lower_bound, Math.min(num, upper_bound))

Since i read right to left.

It’s amazing what a difference it makes when you just good names and formatting
It’s usually nicer to make a helper function:

  const clamp = (x, low, high) =>
    Math.max(low, Math.min(x, high));
Then it can be easily used later via a descriptive name. e.g.

  color_component = clamp(color_component, 0, 255);
That doesn't really help. "Max" to enforce a "lower bound" is briefly halting.
I think the reason it's weird is that we might intuitively think of the "enforce a lower bound" function as taking two named arguments (lowerBound and inputValue) and the order of those two arguments mattering.

But of course, it turns out that the order of the arguments doesn't matter: applying a lowerBound of 5 to an inputValue of 100 turns out to be the exact same thing as applying a lowerBound of 100 to an inputValue of 5.

We know that the order of arguments doesn't matter for the Math.max function, so I think that's where the moment of incredulity comes from.

I agree.

But "min(-, constant_x)" should be thought of as "at most constant_x" and similarly for max. Maybe there's a way to make it more expressive.

I think your "at most" language is pretty expressive. You could do that as an alias for `min` and `max`

I think `at_most(at_least(num, lower_bound), upper_bound)` is much easier to understand instantly than `min(max(...))`.

I'm tempted to make these aliases myself in some of my development actually. I find a pretty big conceptual difference between "I want to find the minimum point in this data", and "I want to restrict the range of this number" that giving them different names will probably help the readability of my code.

(Of course, for `min(max(...))` I usually write a `clamp()` function to hide that for me, but someones I want to only clamp in one direction)

> (Of course, for `min(max(...))` I usually write a `clamp()` function to hide that for me, but someones I want to only clamp in one direction)

You could make clamp work in only one direction too. clamp(number, None, upper_bound) or the idiomatic equivalent in your language of choice seems pretty readable.

Maybe add an alias to Max called AtLeast and one for Min called AtMost to be used in these situations ;)
I would prefer it if the methods in java.lang.Math had been called "larger" and "smaller", instead of "min" and "max".

I sometimes mix up "min" as "take the minimum" rather than "take the larger given this minimum".

But "min" does take the minimum: `Math.min(1,3) == 1`
I think this was posted purely for the limerick quality.

  There was a young coder whose hacks
  His manager often claimed lacked
  The requisite clarity
  For to clamp vars would he:
  Math.min(Math.max(number, min), max);

    @jaffathecake had a problem of truncation
    and posted to Twitter his calculation.
    The gist of his attack
    was min( max( num, min ), max)
    yet refused to add any annotation.
In a file of utility hacks

near get(), post(), and ajax()

// we alias at import

// to keep all the code short

min(max(number, min), max)

Haiku is nice, too:

    Cryptic sorcerer!
    "Math.min(Math.max(v, min), max);"
    his incantation.
I find the extra words wordy.
I prefer "ceiling" and "floor", but yes, agreed.
Those also happen to be fairly common names for operations (rounding up or down to nearest integer).
I use ceil/floor (and make people use whenever I can) if something is going to happen when something hits the ceiling or drops to the floor. And avoid if it is just for clamping.