Hacker News new | ask | show | jobs
by avgcorrection 1251 days ago
> I think you’re supposed to write some nonsense, like assert fibonacci(15) == 8, then when the test says “WRONG! Expected 8, got 610”, you’re supposed to copy and paste the 610 from your terminal buffer into your editor.

> This is insane!

The sane approach is presumably to either expand the call tree and verify all the unique subsolutions. Or to do every step with a calculator if you can’t expand the call tree.

> The %expect block starts out blank precisely because you don’t know what to expect. You let the computer figure it out for you. In our setup, you don’t just get a build failure telling you that you want 610 instead of a blank string. You get a diff showing you the exact change you’d need to make to your file to make this test pass; and with a keybinding you can “accept” that diff. The Emacs buffer you’re in will literally be overwritten in place with the new contents [1]:

Oh okay. The non-insane approach is to do the first thing but Emacs copies the result on your behalf.

3 comments

Well, the non-insane thing is to do property-based testing. Instead of testing only a handful of examples.
They also do that, the post refers to their Quickcheck library. But how do you property test the Fibonacci function ? There isn't much to say about it...
Properties of the Fibonacci function:

It is non-decreasing monotonic. fib(n) <= fib(n+1)

It is increasing monotonic after 1. fib(n) < fib(n+1)

Its domain and codomain are non-negative integers.

fib(n) + fib(n+1) == fib(n+2) Notice this is like the recursive solution except going the other way (addition not subtraction) and is missing the base case.

It also converges closer and closer to the golden ratio. Implementing a property test for that would be interesting.

    a, b, c = fib(n), fib(n+1), fib(n+2)
    assert abs(c / b - phi) < abs(b / a - phi)
I believe the way to test it is to have a property like `n in integer | fib(n+2) == fib(n+1) + fib(n)`. This is close to the naive (but obviously correct?) implementation of fib and can be used to test the optimized version of the function.

You can also test that the sequence is increasing like `fib(n+2) > fib(n+1) > fib(n)`.

You use the naive implementation as a test oracle, limit `n` to something small (through the property tester), and use the test oracle on your efficient implementation.

Unit testing elegant functions has no value.

(fib is often used as an example. But you asked how to test it.)

In combinatorics the adjusted fibonacci numbers start with 1 instead and is more commonly used as it aligns with many other results. One might want to document in the code, via a test, which sequence is of interest.

This is just an example of course but elegant functions might need to be tested.

In addition to what travisjungroth said, you can also check against a reference implementation.

Eg if you coded up an O(n) version of the Fibonacci calculation, you can check against the naive recursive one (or if you are feeling confident, you can check against the O(log n) solution via repeated squaring of matrices.)

Actually, Fibonacci results can be fairly precisely tested with the golden ratio ratio.
You could compare against a closed-form solution:

    let fib2 n =
        let sq5 = sqrt(5.0)
        ((1.0 + sq5)/2.0)**(float n)/sq5
        |> round |> int
The problem is that the closed-form solution is vulnerable to floating-point error. If the calculations are done in float32 (including all intermediate steps), then the 32nd fibonacci number is erroneously given as 2178310, instead of the correct value of 2178309. Using float64 does better, but still has an error at the 71st fibonacci number. (I made a quick plot of the error as a function of N at https://i.imgur.com/bbc9OFC.png. As soon as the error crossed ±0.5, the rounding results in the wrong result.)

These are fine for property-based testing, so long as you restrict yourself to the range in which you have a correct value. But at that point, you might as well just hard-code the first 93 fibonacci numbers (the most that will fit in a uint64_t) and be done with it.

I prefer "code it twice and hope you get it right once" testing.

Complex systems use that system everywhere. Why aren't we doing it for our code?

Comparing the output of your system against an oracle is one property you can test.

But you don't always have an oracle. So other properties still make sense.

As a simple example: if you code up a quantum mechanics simulator, that's hard, and I wouldn't be able to code up an oracle for you straight away. But I can tell you that you probably want to check that things like momentum and energy better be conserved.

Yes, I have difficulty understanding the point of a test-writing system that relies on your explicit assumption that whatever the code already does is correct.

What are you testing? Why?

A regression test is checking causality: Changes in new code, updating dependencies, updating the OS the software is running on, updating shared libraries, porting the code to a new platform, etc. aren't supposed to change the test results.

"I may not know what cos(x) means, but whatever it is shouldn't depend on what OS version I'm running"

> "I may not know what cos(x) means, but whatever it is shouldn't depend on what OS version I'm running"

Cosine is a terrible example to use for that idea. It's pretty likely to change, for certain x, in similar circumstances to your examples of "when test results should never change".

If it's likely to change, then you especially want the regression test so you can decide how to handle the divergence during your port. Maybe one library preserves the signal on NaNs and the other doesn't. Or maybe the CPU's default rounding mode is different when called in this context, and you're off by 1 ulp.

In either case, if the behavior is to change, it should change as an informed decision and not because nobody noticed.

This looks similar to snapshot testing in UI, where you save an output of UI components and test system notifies you when the output changes. This can be useful to detect changes in components that you didn’t intend to change.
Do you simply mean regression testing?
Yeah, weird to see all these variations

- Aha, an expect test!

- Oh, you mean a snapshot test!

- This here is akin to UI testing framework X where the test framework can compare an expected screenshot of the UI to a screenshot of the actual UI!

The last one basically requires automation if you want anyone to make use of it. The regression testing automation described in the OP is a nice-to-have, not a so-good-that-it-gets-a-new-name.

… And apparently also “change detector test” https://news.ycombinator.com/item?id=34379175
...and my favourite term, "characterization test": https://en.wikipedia.org/wiki/Characterization_test

"Regression test" means something else, at least at the companies I've worked at: It means a test that was written after a defect was found in production, to ensure that the same defect doesn't happen again (that the fix doesn't "regress"). It can be a manual test or an automated test. https://en.wikipedia.org/wiki/Regression_testing

Lol I came here to post this but you beat me.