| HN Mirror

	by soco 39 days ago
	But can you actually get usable results from those embeddings, especially in multilingual setups? My experience is that the similarities they find are more random than not, and without building some (fckin expensive) ontology and graph search you're done for. Data set of one: trying to build a pipeline able to answer legal questions like "cases where self-defense was rejected" or "discussion about parental authority vs custody". The vector RAG collects random results that match either term strongly, but mostly without any link to the actual problem. Edit: I didn't try query rewriting though; it might have mitigated it a bit, but not hugely.