Yes, but they are not explicitly trained to make semantically similar texts land close together, only to do next-token prediction. Embedding models are trained with a contrastive loss that minimizes the distance between pairs of semantically similar content and maximizes the distance to all other embeddings.
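To make the contrastive part concrete, here is a rough sketch of one common form of that loss (InfoNCE with in-batch negatives). This is only an illustration under my own assumptions: the function names, the cosine similarity, the temperature of 0.1, and the toy 2-D vectors are all made up for the example, and real embedding models differ in the details.

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchors, positives, temperature=0.1):
    # anchors[i] and positives[i] form a semantically similar pair.
    # Every positives[j] with j != i acts as a negative for anchors[i].
    # The temperature value is an assumption chosen for illustration.
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        # Numerically stable log-sum-exp over the whole batch.
        top = max(logits)
        log_denominator = top + math.log(sum(math.exp(x - top) for x in logits))
        # Cross-entropy with the matching pair as the correct class.
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)

# Toy batch: two anchors, each with one matching positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
matched = [[0.9, 0.1], [0.1, 0.9]]   # positives[i] is close to anchors[i]
swapped = [[0.1, 0.9], [0.9, 0.1]]   # positives[i] is close to the wrong anchor

loss_matched = info_nce_loss(anchors, matched)
loss_swapped = info_nce_loss(anchors, swapped)
print(loss_matched < loss_swapped)
```

The loss is low when each anchor is closest to its own pair and high when it is closer to someone else's, which is the pressure that next-token prediction alone never applies.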