by wruza 1365 days ago

We OCR and parse a dynamic set of rows, which produces a stream of snapshots of ordered, mostly unique colored integers. Each snapshot has exactly N items, except when there are naturally fewer than N or parsing failed for some rows. The delay between two snapshots may be arbitrary, so assume that we can even miss a few rows completely. Which algorithm or data structure would you use to find only the (most likely) new black items? Optimize for fewer errors and duplicates.
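To make the shape of the problem concrete, here is a naive baseline, not a real answer: remember the last few rows seen and report black rows that are not in that memory. Everything in it is an assumption for illustration: the `NewItemDetector` name, rows as `(value, color)` tuples, `None` for a row that failed to parse, and the `history` size. It also shows why the problem is awkward, since "mostly unique" means a legitimately repeated value gets dropped as a duplicate, and an OCR misread shows up as a "new" item.

```python
from collections import deque


class NewItemDetector:
    """Report black rows not seen among the last `history` distinct rows.

    Hypothetical sketch: row format and history size are assumptions.
    """

    def __init__(self, history=1000):
        self._history = history
        self._order = deque()  # distinct rows, oldest first
        self._seen = set()     # same rows, for O(1) membership checks

    def feed(self, snapshot):
        """Take one snapshot (list of (value, color) or None) and
        return the values of black rows not seen recently."""
        fresh = []
        for row in snapshot:
            if row is None:        # parsing failed for this row
                continue
            if row in self._seen:  # already reported or already known
                continue
            self._seen.add(row)
            self._order.append(row)
            # Forget the oldest row once the memory is full.
            if len(self._order) > self._history:
                self._seen.discard(self._order.popleft())
            value, color = row
            if color == "black":
                fresh.append(value)
        return fresh


detector = NewItemDetector(history=4)
first = detector.feed([(1, "black"), (2, "red"), (3, "black")])
second = detector.feed([(2, "red"), (3, "black"), (4, "black")])
third = detector.feed([None, (4, "black"), (5, "black")])
```

Overlapping snapshots only yield the rows that were not there before, and a failed row is simply skipped. It does nothing about rows missed between snapshots or about misreads, which is where the actual work is.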

This is a typical example of real-world programming. Algorithms and data structures make you a better engineer in vitro, but whether they do so in situ is an open question.

As a personal anecdote, I have helped businesses calculate and automate things for 15 years and only once had to use something “advanced”, a makeshift BFS (it was a production planning system in a plastics factory that could pick up from any state of shops and inventories and tell which positions and quantities to order to meet the plan). All other algorithm and data magic usually sits behind an RDBMS and other well-tested systems.

I don’t think it is worth anyone’s time to learn to pattern-detect and/or implement these things, except when it is a literal job description. Just being aware that they exist and having some programmer-level intelligence for search is enough, imo.