by seanhunter
951 days ago
The problem with this line of thinking is that the specific embedding chosen has a big impact on task performance[1], and the person producing a piece of content often doesn't know what you're going to want to use it for. The second problem is that certain embedding formats are specific to particular model architectures.

From a practical perspective, there are some standard embedding formats that people use a lot, so if you're performing a fairly ordinary task there's probably a standard format for it (e.g. it's worth checking out spaCy's embeddings library, which works with a lot of different libraries[2]).

[1] For example, see this paper for a comparison of performance for different code embeddings: https://arxiv.org/pdf/2109.07173.pdf

[2] https://spacy.io/usage/embeddings-transformers/
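To make the first point concrete, here is a toy sketch of how the choice of embedding changes which document counts as "nearest" to a query. It uses two hand-rolled count vectors (whole words vs. character trigrams) rather than learned embeddings, and the function names `word_embedding`, `char_trigram_embedding`, `cosine` and `nearest` are made up for illustration; they are not from spaCy or the paper above.

```python
import math
from collections import Counter


def word_embedding(text):
    """Toy embedding: counts of whole words (sensitive to exact vocabulary)."""
    return Counter(text.lower().split())


def char_trigram_embedding(text):
    """Toy embedding: counts of character trigrams (sensitive to spelling)."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query, candidates, embed):
    """Return the candidate most similar to the query under `embed`."""
    q = embed(query)
    return max(candidates, key=lambda c: cosine(q, embed(c)))


candidates = ["colour", "the color of money and other things"]

# Same query, same candidates, different embedding function.
print(nearest("color", candidates, word_embedding))
print(nearest("color", candidates, char_trigram_embedding))
```

The word-level version picks the long sentence because it contains the exact token "color", while the trigram version picks "colour" because the spellings overlap. Neither is wrong; which one you want depends on the task, and that is exactly what the content producer can't know in advance.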