by mynegation
1482 days ago
I remember helping my little sister with an entity resolution homework assignment (people’s names and company names) for a programming class 26 years ago (she is an economics major and I am CS). It was infuriating and intellectually challenging at the same time. We came up with a combination of n-grams, Levenshtein distance, and canonicalization of common abbreviations (think “Inc.” and “Corp.”). It worked reasonably well.
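
For anyone curious how those three pieces might fit together, here is a minimal Python sketch. It is not the original homework solution: the abbreviation table, the thresholds, and the function names (`canonicalize`, `ngram_similarity`, `levenshtein`, `is_match`) are all assumptions made up for illustration.

```python
import re

# Hypothetical abbreviation table; a real one would be far longer.
ABBREVIATIONS = {
    "incorporated": "inc",
    "corporation": "corp",
    "company": "co",
    "limited": "ltd",
}


def canonicalize(name):
    """Lowercase, replace punctuation with spaces, map long forms to abbreviations."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)


def ngrams(text, n=3):
    """Set of character n-grams, padded with one space on each side."""
    padded = f" {text} "
    return {padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))}


def ngram_similarity(a, b, n=3):
    """Jaccard similarity of the two n-gram sets (0.0 to 1.0)."""
    ga, gb = ngrams(a, n), ngrams(b, n)
    return len(ga & gb) / len(ga | gb)


def levenshtein(a, b):
    """Classic dynamic-programming edit distance, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (ca != cb),  # substitution (free if equal)
            ))
        prev = curr
    return prev[-1]


def is_match(a, b, ngram_threshold=0.5, max_edit_ratio=0.2):
    """Assumed decision rule: cheap n-gram filter first, then edit distance."""
    ca, cb = canonicalize(a), canonicalize(b)
    if ca == cb:
        return True
    if ngram_similarity(ca, cb) < ngram_threshold:
        return False
    return levenshtein(ca, cb) / max(len(ca), len(cb)) <= max_edit_ratio


print(is_match("Acme Corporation", "ACME Corp."))
print(is_match("Acme Inc.", "Globex Corp."))
```

The n-gram check acts as a cheap pre-filter so the quadratic edit-distance step only runs on pairs that already look similar. Both thresholds are arbitrary starting points and would need tuning on real data.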