Hacker News new | ask | show | jobs
by theshrike79 2182 days ago
Also: have a framework in place, which supports worry-free refactoring.

Comprehensive Unit/Integration tests, a robust type-system, pick whatever suits your style.

It's a lot easier to refactor stuff when you don't have to worry about breaking something hard to debug with a big code change.

1 comments

Unit tests are absolutely fantastic during refactoring. I once had to rewrite a piece of code where nobody really knew what it did or what it had to do, and the original author just made some guesses about the intention.

I started out by writing unit tests for everything, which became my handhold and documentation for what the system originally did. Then I started reorganising the code into a more structured and more readable form, without changing the functionality, as proven by my unit tests. Then I started asking domain experts what exactly it should do, unit tests in hand, asking if these answers were correct. If they weren't, I changed the unit test, and then changed the code to match.

Surprisingly painless for something that by all reasonable standards was a terrible mess.

You invented the same refactoring technique Michael Feathers suggests in his book[1]. You write tests to document the current state of the legacy software and then start slowly changing it.

Great book BTW, should be on a top10 must read list for software developers. (#1 will always be Peopleware[2])

[1] https://www.goodreads.com/book/show/44919.Working_Effectivel... [2] https://www.goodreads.com/book/show/67825.Peopleware

Peopleware is great, but seeing that it was written more than 30 years ago, also completely pointless. All the advice that book gives is ignored by literally every leader I’ve ever met.

Our CIO was regaling us with the story about how they were going to change our office today, make it more open, fancy like Google or Facebook, free desk. But when I asked if they’d considered dealing with the noise issues we were having, no, no they hadn’t considered that.

It just blows my mind. I just really wanted to ask them what the hell they thought they were doing modifying the office layout without asking the people who need to work there.

I'm intrigued what level you write these tests at, because one of the most painful things to me is having to do large refactorings in a codebase with lots of little niggly unit tests, but few integration or functional tests. During a refactoring, you're constantly breaking impelementation details, and many people feel compelled to have test coverage of these details. My favourite systems are ones where use cases are explicitly modeled, or at least you have a well-defined (and tested) service layer. This is often something you'll introduce first on top of a legacy system. In systems built this way, it's far easier modify the guts, knowing you're not breaking anything with business value, but without the overhead of small unit tests that are highly coupled to implementation details.
I would argue that having a robust type system and some (not too many) end-to-end tests for your software makes unit testing almost completely useless overhead.
End-to-end tests are more overhead than unit tests. End-to-end tests are also not great for testing the many edge cases that you are likely to mess up if you refactor carelessly.

I also don't see how a robust type really helps there. It might even be part of the thing that needs to be refactored. Besides, many languages don't have a very robust type system.

End-to-end tests and type systems certainly have their uses, but for refactoring messy code, I don't think there's a good substitute for thorough unit tests.

Although there are different kinda of refactoring of course. The case I'm referring to was about one very messy module. This makes it very easy to unit test. If instead it's the entire architecture of your application that needs to be refactored, then you're looking at a very different case, and end-to-end tests become more important than edge cases.

Unit tests are too. What the comment you are replying to is referencing is Integration tests, the type that instead of working on a “this class returns this” instead works on “component A calls Component B to do something, are the two still speaking the same language and expectations?”
There's three levels of testing: unit, integration, and end-to-end.

End-to-end (e2e) tests the entire stack: deploy the site with database and all, run a script that visits a page, clicks and stuff, and checks if the right things become visible.

Unit tests are the other extreme: you take a single unit of code and test whether it does what it's supposed to, while mocking all communication with other units of code.

Integration tests are in between those two extremes, and probably the least understood as a result (at least by me). They look like unit tests, and don't generally test the UI, but they don't mock the other components, and ideally also set up a database to test against.

I think there's a bit of overlap between unit and integration tests; a sloppy unit test where you don't mock (some) other components but treat them as part of the unit you're testing, start to look like integration tests at some point. If you want a clear demarcation, I think you might consider a database connection essential to count as integration test.

Conventional type systems don't really help you when your code is pretty much just taking in vectors/matrices of floats and returning vectors/matrices of floats.
You still have the option of introducing new types, even there.

In Ada, the programmer is discouraged from using the Ada equivalent of int directly, and is encouraged to instead introduce a subtype that reflects the specific use of int (including automatic range checking).

This isn't as natural in C++ but is still possible. Boost offers a BOOST_STRONG_TYPEDEF [0] to deliberately introduce an incompatible type. (I do recall having trouble getting it to behave, but it's been a while.)

Whether this makes sense in most mathematical code, I'm not sure, but it seems like it's an option.

[0] https://www.boost.org/doc/libs/1_73_0/boost/serialization/st...

If you are working with physicial quantities in C++ there are also

[0] https://github.com/nholthaus/units

[1] https://github.com/mpusz/units

You are correct, however that's a niche use case that would warrant using a non-conventional type system. Conventional type systems are mostly for just dumping a bunch of strings and integers to and from a database and formatting them neatly.
The problem is that you want to have tests for your edge cases (e.g. a text field whose contents get stored to the database with specific validation will have a lot of different cases to test for), an end-to-end test will take a LOT longer than a unit test. Unit tests are for rapid feedback on a small section of your application.
Perhaps in some world where you never modify a type (e.g. concatenate/truncate strings, multiply numbers, etc). Type systems check types, they don’t help verify correctness of the logic performed on the types.

For an example, dig into some crypto libraries. They operate on bytes all over the place performing XORs, etc. A type system isn’t really gonna help you ensure you got the correct number of AES rounds and stitched the blocks in the right order.

IMO the only systems where this “type system eliminates most tests” philosophy seriously works are the ones that don’t do anything other than pass data between components without doing anything beyond calling some serialization methods.

> IMO the only systems where this “type system eliminates most tests” philosophy seriously works are the ones that don’t do anything other than pass data between components without doing anything beyond calling some serialization methods.

Which is what 90% of programmers on this website are essentially doing. And for the remaining 10% there is likely a better suited, different programming language or type system available. Even if there isn't unit tests would still be very niche and the general case would be that by default you shouldn't be unit testing.

If the code is a mess the type system can’t help, it is part of the mess.

End to end tests are really slow, but if you can get them into the 300-1600 tests per second range then i have no beef. I value tests but I seriously grudge waiting for tests.

If the code is that bad, unit tests aren't going to help you. Messy spaghetti code with massive structural issues will break every unit test every time you change anything in the code, in ways that make no sense and provide you with little useful information. You will fix things faster if you just let the tests fail, fix the code and then rewrite the tests. Which makes the unit tests useless.

Also, have you ever seen a project where the code is a mess but the unit tests are perfect? Even if somehow you could write unit tests that would cover for super bad code (which you can't), it is extremely unlikely that your unit tests would be that amazing.

What do you mean with "perfect" unit tests? For this case, I wrote the unit tests to document the current functionality. That's basically what unit tests do: they document functionality and enable you to preserve that functionality of that piece of code. Of course once you realise that the functionality is wrong, you should change it and the unit test. And you can't fix the code if nobody knows what it's meant to do.

There's quite a lot you can refactor without breaking unit tests. If you've got a single 200-line function full of nested loops with cryptic variable names, modern IDEs make it really easy to extract those loops to their own functions. Figure out what they're meant to do, give them a descriptive name, and you already improved the code a bit without breaking any unit tests. If your IDE does this well, you could even do this without unit tests, but you really will need those unit tests once you start reorganising the code making use of the excessive number of parameters those extracted functions invariably end up with.

Of course you can write unit tests for super bad code. If it's a function that returns something, it's trivial to unit test, no matter how badly written that function is. If it calls other code, you have to mock those calls and test that those mocks get called under the same conditions. If they mess with global variables, that's terrible, but even that can be mocked.

If the code uses gotos to code outside the module, somebody needs to get shot, and I guess you need a unit testing framework that can mock those gotos. I've never seen one, because nobodoy uses gotos anymore.

Of course they can help, you don’t have to use the existing tests, you can add your own as you learn the system and what its supposed to do
This means you test your code manually ? I can't imagine not doing TDD, except during extremely early prototyping before knowing if a code will be useful at all.
TDD and unit tests aren't exactly the same thing. Also, not doing TDD doesn't mean that there's no automated testing. I prefer to write my code and then do a few end-to-end automated tests for the most important parts of the code to serve as a backup in case some change in the code causes massive failures. But TDD is overall tedious for (usually) little benefit when compared to a few well selected end-to-end tests. And unit testing is even less benefit for even more work, unless you are doing something very very specific.
Interesting, in my experience TDD is easy (it's just that a specific mental process needs to happen, besides learning a xUnit API or something, and also experience tells how to know which tests to write and which not to write, so that maintaining the tests doesn't become a burden) and always provides better ROI on the long run.

With end-to-end tests such as when piloting a browser, it's not really easy to get things like tracebacks into the console output for example.

Exactly, in the time it would take me to write a proper TDD suite, I've written a skeleton of a product from end to end and can start iterating over it.

If you're working on a very specific box that has well defined, well known inputs and outputs then TDD is an excellent tool.

But for anything with a non-specific "We'd like to do X and display the result on Y" it just gets in the way.

I'd TDD that "do X results in Y" and then add an end to end that that verifies that "Y is displayed", unless that code that displays Y is too trivial then an end to end test might not even be necessary.
Test-driven development is a useful tool, but it doesn't remove the need for manual testing completely, especially on frontend projects.

I've worked on mobile apps before with a small team, and inevitably, we'll find bugs that show up in the user interface when the user rotates their phone. It's hard to unit test for rotation changes, and it's also hard to code a rotation change into an end-to-end test on mobile. Animations are also something that's difficult to test in an automated fashion, and all the testing in the world won't be as good as showing the animation in front of a designer. So some level of manual testing is needed on mobile.

I've worked on web frontends where there would be bugs with scrolling jumping back and forth. An end-to-end test using Selenium may not catch the issue, but for a user, it can be painfully obvious. Similarly, animations are also hard to unit test on the web. So some level of manual testing is needed on the web.

The only place where I could see manual testing NOT being needed is for backend development, since the input and output to a backend system is much more controlled. You could write an end-to-end test for any scenario a user could throw at your system.

In summary, don't underestimate the value of manual QA!

"Fantastic" or "The only way" ? has anyone seen a case of refactoring without test end well ? I haven't.

Seems like that "refactoring untested code" would be a well established recipe for disaster.

> "Fantastic" or "The only way"?

Not sure if it counts as a 'way', but a static type system helps too.