| HN Mirror

> As with all LLM models and their subproducts, the only way to ensure good results is to test yourself, ideally with less subjective, real-world feedback metrics.

This is excellent advice. Sadly, very few people/organizations implement their own evaluation suites.

It doesn't make much sense to put data infrastructure in production without first evaluating its performance (IOPS, uptime, scalability, etc.) on internal workloads; it is no different for embedding models or models in general for that matter.