by jasonmit 2928 days ago
> you have to do a manual recording to create them the first time (which means manually setting up test data in production that is just-right for your test case)

Your test invokes the recorder. There isn't anything manual outside of writing & running your test.

> you have to manually re-record when they fail

Again, nothing manual. It would require running your test again with Polly in record mode if you want to "refresh" the recording with a newer set of responses.
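
To make the record vs. replay distinction concrete, here is a toy sketch of the general idea. This is not Polly's API; `Recorder`, `fake_upstream` and the tape file are made-up names for illustration, and a real tool would match on far more than the URL.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


class Recorder:
    """Toy record/replay wrapper (hypothetical, not Polly's API)."""

    def __init__(self, path: Path, mode: str, fetch: Callable[[str], str]) -> None:
        self.path = path
        self.mode = mode    # "record" or "replay"
        self.fetch = fetch  # the real upstream call
        # Load an existing tape if there is one, else start empty.
        self.tape: Dict[str, str] = (
            json.loads(path.read_text()) if path.exists() else {}
        )

    def get(self, url: str) -> str:
        if self.mode == "replay":
            # Never touches upstream; raises KeyError if nothing was recorded.
            return self.tape[url]
        # Record mode: hit upstream and overwrite whatever was saved before.
        body = self.fetch(url)
        self.tape[url] = body
        self.path.write_text(json.dumps(self.tape))
        return body


calls: List[str] = []


def fake_upstream(url: str) -> str:
    """Stand-in for a real service whose responses change over time."""
    calls.append(url)
    return f"response #{len(calls)} for {url}"


tape = Path(tempfile.mkdtemp()) / "tape.json"
first = Recorder(tape, "record", fake_upstream).get("/users/1")      # hits upstream
replayed = Recorder(tape, "replay", fake_upstream).get("/users/1")   # served from tape
refreshed = Recorder(tape, "record", fake_upstream).get("/users/1")  # re-record
```

The same test code runs in all three cases; only the mode changes, and re-running in record mode is what "refreshes" the saved response.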

> In both cases, you've accepted that you can't easily setup specific, isolated test data in your upstream system, and are really just doing "slightly faster manual testing".

This is by no means a replacement for E2E testing. It is a form of acceptance/integration testing where you're testing your application against a point in time at which you verified all systems were talking correctly with your application. E2E tests are much slower and harder to debug, and are intended to capture those breakages in contracts.

It's a tool for your toolbox, reach for it when needed. We plan to release a tutorial/talk that should clear up any misconceptions. There are also other applications for Polly, such as building features offline or giving a demo using faker to easily hide any confidential data.

3 comments

> It's a tool for your toolbox, reach for it when needed

Sure, apologies for being negative about a tool you've worked on and are rightly proud of. I'm sure you already have more users than any open source project I've ever written. :-)

I struggle a bit at this point in my career, as I've made and seen enough mistakes that I generally have strong gut opinions on "yeah, that's probably not going to work/scale/etc."

So, when observing new developers/teams starting to "make a mistake" that I've seen before, my gut says "no! bad idea!"...but I know I could be wrong, so it's tempting to say "well, sure, that didn't work for us, but go ahead and try again".

Because, who knows, maybe eventually someone will figure out an innovation that makes a previously-bad approach now tenable, and even best-practice.

But, realistically, that rarely happens, and so teams, orgs, and the industry as a whole stumble around re-making the same mistakes, and codebases/teams/etc. pay the cost.

I've thought a lot about micro-service testing at scale:

http://www.draconianoverlord.com/2018/01/21/microserving-tes...

Basically there are no easy answers, short of some sort of huge, magical, up-front investment in testing infra that only someone like a top-5/top-10 tech company has the eng resources to do.

So, definitely appreciate needing to do "something else" in the meantime. ...record/replay is just not a "something else" I would go with. :-)

> Again, nothing manual

Yes, sorry for being inexact/overusing the term--I understand the tests drive the recording.

What I meant by manual is getting the e2e system into your test's initial state.

E.g. tests are invariably "world looks like X", "system under test does Y", "world looks like Z".
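
A minimal sketch of that shape, with the initial state spelled out in the test itself. Everything here is hypothetical: `FakeAccounts` stands in for the upstream system and `transfer` for the system under test.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class FakeAccounts:
    """Hypothetical in-memory stand-in for the upstream system."""
    balances: Dict[str, int] = field(default_factory=dict)


def transfer(upstream: FakeAccounts, src: str, dst: str, amount: int) -> None:
    """Hypothetical system under test: moves money between two accounts."""
    if upstream.balances.get(src, 0) < amount:
        raise ValueError("insufficient funds")
    upstream.balances[src] -= amount
    upstream.balances[dst] = upstream.balances.get(dst, 0) + amount


def test_transfer_moves_money() -> None:
    # "World looks like X": coded, isolated, and documented right here.
    upstream = FakeAccounts(balances={"alice": 100, "bob": 0})
    # "System under test does Y".
    transfer(upstream, "alice", "bob", 30)
    # "World looks like Z".
    assert upstream.balances == {"alice": 70, "bob": 30}


test_transfer_moves_money()
```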

In record/replay, "world looks like X" is not coded, isolated, documented in your test, and is instead implicit in "whatever the upstream system looked like when I hit record".

Which is almost always "the developer manually clicked around a test account to make it look like X".

This is basically a giant global variable that will change, and come back to haunt you when recordings fail, b/c you have to a) re-divine what "world looks like X" was for this test, and then b) manually restore the upstream system to that state.

If no one has touched the upstream test data for this specific test case, you're good, but when you get into ~10s/100s of tests, it's tempting to share test accounts, someone accidentally changes it, or else you're testing mutations and your test explicitly changes it (so you need to undo the mutation to re-record), or you wrote the test 2 years ago and the upstream system aged off your data.

All of these lead to manually clicking around to re-setup "world looks like X", so yes, that is what I should have limited the "manual" term to.

But in the case we're talking about, where you're reliant on an external service that can change underneath you, "world looks like X" is genuinely not under your control. It feels like pretending that it is will lead to just as many failures as acknowledging its inherent volatility.

Agreed! And, to me, record/replay is still pretending like it's controllable, b/c even if you decouple for replays, records will always be a PITA.

My depressing solution is to just not even try to automate tests against the upstream system and instead invest in test builders/DSLs that make mocks/stubs on both sides as pleasant as possible.

And when bugs slip through, make sure to update your stubs/mocks on both sides to prevent the regression.
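
For a rough idea of what I mean by test builders, here is a small sketch. All names (`StubUserService`, `UserBuilder`, `greeting`) are made up for illustration; the point is only that defaults live in the builder, so each test states just the fields it cares about.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class StubUserService:
    """Hypothetical stub of an upstream user service."""
    users: Dict[str, Dict[str, Any]] = field(default_factory=dict)

    def get_user(self, user_id: str) -> Dict[str, Any]:
        return self.users[user_id]


class UserBuilder:
    """Builder with defaults; tests override only what matters to them."""

    def __init__(self, user_id: str) -> None:
        self._user: Dict[str, Any] = {
            "id": user_id,
            "name": "Test User",
            "active": True,
        }

    def named(self, name: str) -> "UserBuilder":
        self._user["name"] = name
        return self

    def inactive(self) -> "UserBuilder":
        self._user["active"] = False
        return self

    def save_to(self, service: StubUserService) -> Dict[str, Any]:
        # Store a copy so later builder calls can't mutate the stub's data.
        service.users[self._user["id"]] = dict(self._user)
        return service.users[self._user["id"]]


def greeting(service: StubUserService, user_id: str) -> str:
    """Hypothetical code under test."""
    user = service.get_user(user_id)
    return f"Hello, {user['name']}" if user["active"] else "Account disabled"


service = StubUserService()
UserBuilder("u1").named("Ada").save_to(service)
UserBuilder("u2").inactive().save_to(service)
```

When a bug slips through because upstream behaves in a way the stub didn't model, that behavior becomes a new builder method plus a regression test.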

To me this gets the most agility and reliability, and will be a test suite that developers don't hate 1-2-5 years down the road.

Can you post the tutorial here? Thanks @jasonmit