by parad0x0n
146 days ago
So fuzzy matching only makes sense if you expect the two columns to hold more or less the same data; otherwise you can skip that step. Then you have to pick a threshold: if the similarity of two strings is above it, it's a match; otherwise it's not. Keep the threshold high to prevent false positives. The LLM will take care of the non-matches.
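To make the threshold idea concrete, here is a minimal sketch using Python's standard-library `difflib`. The similarity measure, the `fuzzy_match` helper, and the 0.9 cutoff are all my own assumptions for illustration, not anything specific to the tool being discussed; pick whatever string metric and threshold suit your data.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio, ignoring case and outer whitespace."""
    return SequenceMatcher(None, a.strip().lower(), b.strip().lower()).ratio()


def fuzzy_match(left, right, threshold=0.9):
    """Pair each left value with its best right candidate.

    Hypothetical helper: a value counts as matched only if its best
    score is above `threshold` (0.9 is an assumed, deliberately high
    default to limit false positives). Everything else is returned as
    unmatched, to be handed to a later step such as an LLM.
    """
    matches = {}
    unmatched = []
    for value in left:
        # Find the most similar candidate in the other column, if any.
        best = max(right, key=lambda cand: similarity(value, cand), default=None)
        if best is not None and similarity(value, best) > threshold:
            matches[value] = best
        else:
            unmatched.append(value)
    return matches, unmatched
```

Calling `fuzzy_match(col_a, col_b)` returns a dict of confident pairs plus a list of leftovers; only the leftovers would go to the LLM. Note this compares every value against every candidate, so it is fine for small columns but would need blocking or an index for large ones.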