by jakequist
2683 days ago
I've felt your pain. So much so that I took 6 months off and put a lot of groundwork into starting a company that would solve this problem. But in the end, I decided to abandon it. I realized that in order to be 10x better than the alternatives, I was going to need to solve some very tricky AI problems. For example, accurately deduplicating a customer record "John Doe" vs "Johnathon Doe" is not straightforward. Maybe it's two different people? Maybe it's just a spelling mismatch? The system must have a great deal of context to accurately determine whether the data is indeed duplicated. And even if it is, perhaps there's a perfectly good reason for the spelling mismatch (e.g. perhaps one table holds his preferred name, while the other is just referential). In the end, deduplication often comes down to the requirements of the company, and it's hard to generalize. I think there's space in the market for this kind of business, but it'll be a slog. Unless you have a 10x solution (i.e. super AI), you'll be competing with the likes of Trifacta. And it's hard to compete with that kind of sales force. Really good question. Thanks for posting.
I think the question is about line-of-business software, and the issues there are very different.
For instance, there is a literature on record matching, and good techniques exist, but without an exception-handling workflow you have no way to deal with the unusual cases the code turns up.
I would love to talk and share notes about what you did.
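To make the "exception handling workflow" point concrete, here is a rough sketch of what I mean, not anyone's actual product. It scores name pairs with Python's standard `difflib.SequenceMatcher`, merges only the near-certain ones, and routes the ambiguous middle band to a human review queue. The thresholds and the names `AUTO_MATCH`, `NEEDS_REVIEW`, `classify`, and `triage` are made up for illustration; real matching would use more fields than a name and cutoffs tuned to the company's own requirements.

```python
from difflib import SequenceMatcher

# Hypothetical cutoffs, chosen only for illustration.
AUTO_MATCH = 0.95    # at or above this, treat the pair as the same record
NEEDS_REVIEW = 0.75  # between the two cutoffs, ask a human


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity ratio in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def classify(a: str, b: str) -> str:
    """Label a candidate pair as 'match', 'review', or 'distinct'."""
    score = similarity(a, b)
    if score >= AUTO_MATCH:
        return "match"
    if score >= NEEDS_REVIEW:
        return "review"
    return "distinct"


def triage(pairs):
    """Split candidate pairs into auto-merged pairs and a review queue."""
    merged, review_queue = [], []
    for a, b in pairs:
        label = classify(a, b)
        if label == "match":
            merged.append((a, b))
        elif label == "review":
            review_queue.append((a, b))
    return merged, review_queue


merged, review_queue = triage([
    ("John Doe", "john doe"),
    ("John Doe", "Johnathon Doe"),
    ("John Doe", "Jane Smith"),
])
```

With these cutoffs, "John Doe" vs "Johnathon Doe" lands in the review queue rather than being merged or dropped, which is the whole point: the code surfaces the unusual case and a person with context decides.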