|
I encountered one about 17 years ago. It was for diffing IP packets, TCP segments, and network event payloads. At the time I worked at RSA Security on a network forensics and security analytics product, written in a mix of C and C++. In one of the projects I worked on, we needed to let users diff the packets, segments, and payloads. Back then we were very conservative about adding third-party libraries to the product. I have written more about that culture here: https://news.ycombinator.com/item?id=39951673 Long story short, due to the conservative culture, most data structures and algorithms were implemented in house. The diff algorithm for packets/segments/payloads was written in house too and I was the one to write it. My implementation was based on a straightforward dynamic programming solution to the longest common subsequence problem. If I recall correctly, it ran in O(mn) time and O(min(m, n)) space in the worst case, where m and n are the lengths of the two sequences. I knew there were more efficient algorithms, but this code was not performance critical. I chose to keep the implementation simple so anyone could understand it, learn it quickly, and fix bugs if they arose. It served us well for the next seven years until the product was replaced with a new one. On a related note, I sometimes miss that older style of software development where we would dive deep into a problem domain, master it, and design solutions ourselves. I am not being naively nostalgic though. I am very well aware that modern development, with its reliance on well established libraries, usually delivers much greater productivity and reliability. Still, I think the slower and more deliberate approach of building things from the ground up had a certain charm. |