by chuckhend 835 days ago
It's certainly up for debate, and there is a lot of nuance. I think it can simplify the system's architecture quite a bit if the consumers of the data do not need to keep track of which transformer model to use. After all, once the embeddings are derived from the source data, any subsequent search query needs to use the same transformer model that created those embeddings in the first place.
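
A minimal sketch of that idea, assuming nothing about any real system: the index records which model produced its vectors, so callers pass plain text and never pick a model themselves. Everything here is hypothetical, including `toy_embedder` (a hashing stand-in for a real transformer), the `MODELS` registry, and the `EmbeddingIndex` class.

```python
import hashlib
import math
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def toy_embedder(salt: str, dim: int = 16) -> Callable[[str], Vector]:
    """Hypothetical stand-in for a transformer: hashes tokens into a unit vector."""
    def embed(text: str) -> Vector:
        vec = [0.0] * dim
        for token in text.lower().split():
            digest = hashlib.sha256((salt + token).encode()).digest()
            vec[digest[0] % dim] += 1.0
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        return [x / norm for x in vec]
    return embed


# Hypothetical registry: different "models" map the same text to different vectors.
MODELS: Dict[str, Callable[[str], Vector]] = {
    "toy-model-a": toy_embedder("a"),
    "toy-model-b": toy_embedder("b"),
}


@dataclass
class EmbeddingIndex:
    """Stores vectors together with the name of the model that produced them."""
    model_name: str
    rows: List[Tuple[str, Vector]] = field(default_factory=list)

    def _embed(self, text: str) -> Vector:
        # Documents and queries both go through the model recorded on the index.
        return MODELS[self.model_name](text)

    def add(self, text: str) -> None:
        self.rows.append((text, self._embed(text)))

    def search(self, query: str, k: int = 1) -> List[str]:
        q = self._embed(query)
        # Vectors are unit length, so the dot product is the cosine similarity.
        ranked = sorted(
            self.rows,
            key=lambda row: -sum(a * b for a, b in zip(q, row[1])),
        )
        return [text for text, _ in ranked[:k]]


index = EmbeddingIndex(model_name="toy-model-a")
index.add("postgres vector search")
index.add("baking sourdough bread")
results = index.search("postgres vector search")
```

The point of the design is that `search` takes raw text: a consumer cannot accidentally embed a query with `toy-model-b` and compare it against vectors made by `toy-model-a`.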

I think the same problem exists with classical/supervised machine learning. Most models' features go through some sort of transformation, and when it's time to call the model for inference, those same transformations need to happen again.
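
The classical version of this can be sketched the same way, assuming a single numeric feature and a toy classifier: the scaler fitted at training time is bundled with the model, so inference callers pass raw values. `StandardScaler`, `ThresholdModel`, and `Pipeline` below are hypothetical hand-rolled classes written for illustration, not any library's API.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import List


@dataclass
class StandardScaler:
    """Hypothetical scaler: remembers the training mean and standard deviation."""
    mu: float = 0.0
    sigma: float = 1.0

    def fit(self, xs: List[float]) -> "StandardScaler":
        self.mu = mean(xs)
        self.sigma = pstdev(xs) or 1.0  # avoid dividing by zero on constant data
        return self

    def transform(self, x: float) -> float:
        return (x - self.mu) / self.sigma


class ThresholdModel:
    """Toy classifier that expects a scaled feature: predicts 1 when it is above zero."""

    def predict(self, z: float) -> int:
        return 1 if z > 0 else 0


@dataclass
class Pipeline:
    """Bundles the fitted transformation with the model that depends on it."""
    scaler: StandardScaler
    model: ThresholdModel

    def predict(self, raw_x: float) -> int:
        # The training-time transformation is replayed on every inference call.
        return self.model.predict(self.scaler.transform(raw_x))


train = [10.0, 20.0, 30.0, 40.0]  # made-up training values, mean 25
pipeline = Pipeline(StandardScaler().fit(train), ThresholdModel())

above_mean = pipeline.predict(35.0)          # scaled value is positive
below_mean = pipeline.predict(15.0)          # scaled value is negative
unscaled = ThresholdModel().predict(15.0)    # skipping the scaler flips the answer
```

Calling the model directly on the raw value 15.0 gives a different prediction than going through the pipeline, which is the failure mode when a consumer forgets or mismatches the transformation.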