by leobg
709 days ago
The example could be handled with no machine learning at all. Just use a bag-of-words comparison with a subword tokenizer. And if you do need embeddings (to map synonyms/topics), fastText is faster, cheaper, and runs locally. For hard cases, you can feed the source/target schemas to gpt-4o once to create a map, and then apply that one map to all instances.
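To make the bag-of-words idea concrete, here is a rough sketch, not a tested recipe. It assumes the task is matching field names between two schemas, and it uses character trigrams as a stand-in for a real subword tokenizer. The field names and the helpers `subword_bag`, `cosine`, and `match_fields` are all made up for illustration.

```python
from collections import Counter
from math import sqrt


def subword_bag(text: str, n: int = 3) -> Counter:
    """Bag of character n-grams, with < and > marking word boundaries."""
    bag = Counter()
    for word in text.lower().replace("_", " ").split():
        padded = f"<{word}>"
        if len(padded) <= n:
            # Word too short to slice: keep it as a single token.
            bag[padded] += 1
            continue
        for i in range(len(padded) - n + 1):
            bag[padded[i:i + n]] += 1
    return bag


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bags of n-grams."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm = sqrt(sum(c * c for c in a.values())) * sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0


def match_fields(source: list[str], target: list[str]) -> dict[str, str]:
    """Map each source field to the most similar target field."""
    target_bags = {name: subword_bag(name) for name in target}
    mapping = {}
    for name in source:
        bag = subword_bag(name)
        mapping[name] = max(target, key=lambda t: cosine(bag, target_bags[t]))
    return mapping


# Hypothetical schemas: build the map once...
mapping = match_fields(
    ["first_name", "last_name", "email_address"],
    ["FirstName", "LastName", "Email"],
)

# ...then apply that one map to every record.
record = {"first_name": "Ada", "last_name": "Lovelace", "email_address": "ada@example.com"}
renamed = {mapping[key]: value for key, value in record.items()}
```

This always picks the best-scoring target, even when the best score is poor, so in practice you would probably want a similarity threshold and send anything below it to the embedding or LLM fallback.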
The question is whether the quality will be acceptable.