by soco
39 days ago
But can you actually get usable results from those embeddings, especially in multilingual setups? In my experience the similarities they find are more random than not, and without building some (fckin expensive) ontology and graph search you're done for. Sample size of one: I'm trying to build a pipeline able to answer legal questions like "cases where self-defense was rejected" or "discussion about parental authority vs custody". The vector RAG collects seemingly random results that match strongly on either term, but mostly without any link to the actual problem. Edit: I didn't try query rewriting though; it might have mitigated this a bit, but not hugely.