by ck_one
842 days ago
Congrats on the launch! In what way are "prefix-aware embedding models trained with contrastive loss" better than the standard embedding model provided by OpenAI?
"added in learning from feedback and time based decay" => Sounds interesting! Have you seen significant gains in precision and recall here?
It looks like you are using the Next.js app dir + an external backend. Why did you decide against using Next.js for both frontend and backend? Are you happy with your choice?
|
For learning from feedback, for sure! No exact benchmarks, but we've heard from quite a few users about how useful this is for pushing high-quality docs up and reducing the prevalence of poor docs. This is all very hard to evaluate since there aren't readily available, real-world "corporate tool / knowledge base" datasets out there. We're actually building our own in-house right now, so we should have more concrete numbers around these things soon.
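To make the idea concrete, here is a minimal sketch of how feedback and time-based decay could be folded into a retrieval score. It is not the product's actual implementation: the `Doc` fields, the half-life, the feedback weight, and the `tanh` squashing are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    similarity: float  # base relevance from the retriever (assumed to be in [0, 1])
    feedback: int      # net upvotes minus downvotes (hypothetical field)
    age_days: float    # days since the doc was last updated (hypothetical field)


def adjusted_score(doc: Doc, half_life_days: float = 180.0,
                   feedback_weight: float = 0.1) -> float:
    """Scale the base similarity by an exponential time decay and a feedback boost."""
    # The score halves every `half_life_days` (assumed decay shape).
    decay = 0.5 ** (doc.age_days / half_life_days)
    # tanh bounds the boost so a few votes cannot swamp the base relevance.
    boost = 1.0 + feedback_weight * math.tanh(doc.feedback)
    return doc.similarity * decay * boost


def rerank(docs: list[Doc], **kwargs: float) -> list[Doc]:
    """Return docs sorted from highest to lowest adjusted score."""
    return sorted(docs, key=lambda d: adjusted_score(d, **kwargs), reverse=True)
```

With these assumed defaults, a fresh upvoted doc outranks a fresh downvoted doc with slightly higher base similarity, and both outrank a two-year-old doc with no feedback.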
For the backend, we do a lot of stuff with local embedding models / cross encoders / tokenization / stemming / stop-word removal, etc. Python has the most mature ecosystem for this kinda stuff (and the retrieval pipeline is the core of our product), so we don't regret it at all!
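As a rough illustration of the text-processing steps mentioned above, here is a toy, standard-library-only sketch of tokenization, stop-word removal, and stemming. The stop-word list and the suffix-stripping stemmer are deliberately simplistic assumptions; a real pipeline would more likely use a mature library stemmer (for example a Porter stemmer) and a fuller stop-word list.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a real-world list).
STOP_WORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "for", "on"})

# Suffixes tried in order; a crude stand-in for a real stemming algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def naive_stem(token: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text: str) -> list[str]:
    """Tokenize, drop stop words, then stem what remains."""
    return [naive_stem(t) for t in tokenize(text) if t not in STOP_WORDS]
```

For example, `preprocess("Indexing the updated documents")` yields `["index", "updat", "document"]`, which is the kind of normalized term list a keyword index could match against.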