by tayloramurphy 1210 days ago
I was fortunate enough to learn this strategy outside of a software context first. In grad school I was responsible for maintaining our GC-MS and LC-MS machines (Gas/Liquid Chromatograph / Mass Spectrometer) in addition to using them for my own experiments on a regular basis. They were complex pieces of equipment that had a ton of fiddly components.

The best way to troubleshoot these machines was to split the problem in half again and again until you found the part that needed to be cleaned, repaired, or replaced. The way to do that was to run different known chemicals, each with its own known signature, through the machine. For example, you could run a single chemical (maybe ethanol - I honestly don't remember which ones at this point) and see if it goes straight through the GC and gives you the peak you expect. If it doesn't, you could look at the result and see if the timing was off (which points to the GC or gas flow) or if the peak was wonky (which points to the MS). And then you just keep going. (Sounds a lot like unit and integration tests, right?)

Applying that same strategy works wonders for data pipelines as well. Is it something with the extractor or loader (or god forbid Airflow)? Break it down and go from there.
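
A minimal sketch of what "break it down" could look like for a pipeline, in the spirit of running a known chemical through the machine: feed each stage a known input and compare against the known expected output. The stage names and the `extract` / `load` / `broken_load` functions are hypothetical, made up for illustration, and not taken from any real pipeline or from Airflow.

```python
from typing import Callable, Optional

# Hypothetical stage description: (name, function, known_input, expected_output).
Stage = tuple[str, Callable[[object], object], object, object]


def first_failing_stage(stages: list[Stage]) -> Optional[str]:
    """Run each stage on its own known input.

    Returns the name of the first stage that raises or whose output differs
    from the expected output, or None if every stage passes.
    """
    for name, func, known_input, expected in stages:
        try:
            actual = func(known_input)
        except Exception:
            return name
        if actual != expected:
            return name
    return None


# Toy stages, assumed purely for the example.
def extract(raw: str) -> list[list[str]]:
    return [line.split(",") for line in raw.splitlines()]


def load(rows: list[list[str]]) -> dict[str, int]:
    return {row[0]: int(row[1]) for row in rows}


def broken_load(rows: list[list[str]]) -> dict[str, str]:
    return {row[0]: row[1] for row in rows}  # bug: forgets the int() conversion


KNOWN_RAW = "a,1\nb,2"
KNOWN_ROWS = [["a", "1"], ["b", "2"]]
KNOWN_LOADED = {"a": 1, "b": 2}
```

Calling `first_failing_stage` with the extractor and `broken_load` would name the loader as the culprit without ever running the two stages together, which is the point: each stage gets its own known signal.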

My only nit on this is that while bifurcate is technically accurate (or is it actually?), it feels unnecessarily complex-sounding for folks learning this skill.


I've always used "divide and conquer" which is a cliche for a reason.

Bifurcate is commonly used by Indian colleagues, and perhaps programmers are familiar with the term, but it's not something used by US business people in my experience.

Also, binary search
I more often heard and used "bisect". Is that subtly different or subtly the same?
It's the same, e.g. the git bisect command is just a binary search API for a commit history.
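
For anyone who wants to see the binary search spelled out, here is a rough sketch of the idea over a list of commits. This is not git's actual implementation; `first_bad` and `is_bad` are made-up names, and it assumes the first commit is good, the last is bad, and everything after the first bad commit is also bad.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an ordered history.

    Assumptions: len(commits) >= 2, commits[0] is good, commits[-1] is bad,
    and once a commit is bad every later commit is bad too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the breakage is at mid or earlier
        else:
            lo = mid  # the breakage is after mid
    return commits[hi]
```

Each check halves the remaining range, so a history of a thousand commits needs about ten checks rather than a thousand.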
To finish the thought explicitly:

If one has pretty good automated tests, it's possible to automate the search with git bisect, using the test results as input, and pinpoint the first commit where the tests fail (and therefore the last working commit just before it).
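
A hedged sketch of a check script for that, assuming the project's tests run with `python -m pytest -q` (that command and the file name `check.py` are assumptions, not something from this thread). `git bisect run` reads the command's exit code: 0 marks the commit good, 1 through 127 (except 125) marks it bad, and 125 means "skip this commit".

```python
import subprocess
import sys


def bisect_exit_code(test_returncode: int) -> int:
    """Map a test runner's return code onto what `git bisect run` expects.

    0 means the commit is good. Any other result is collapsed to 1 ("bad"),
    so unusual codes (negative values from signals, or codes above 127)
    never abort the bisect and never collide with 125 ("skip").
    """
    return 0 if test_returncode == 0 else 1


def main() -> int:
    # Assumption: the test suite is run with pytest from the repository root.
    result = subprocess.run([sys.executable, "-m", "pytest", "-q"])
    return bisect_exit_code(result.returncode)


# To use this file as a script, end it with: sys.exit(main())
```

With that last line added and the file saved outside the repository (so checking out old commits doesn't change it), the session would be `git bisect start`, `git bisect bad`, `git bisect good <known-good-commit>`, then `git bisect run python /path/to/check.py`.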