HN Mirror



	by araghuvanshi 1204 days ago
	I'll admit I don't know much about how Elasticsearch works under the hood, but I can describe how ours does. We use a large language model called BART, which has two key benefits:

	1. It has more "general knowledge", so it doesn't just measure semantic similarity between the text and the tags you provide. This makes the tagging more accurate.

	2. It can be fine-tuned fairly easily for specific use cases or on more recent datasets. For example, if you have your own taxonomy that you'd like to categorize text by, it's pretty straightforward to teach this model. This is popular among advertisers who need to place ads on specific types of content, and e-commerce companies that need to categorize products on their site in an easily separable way.

	The synonyms question is a good one. So far, we've found that adding them isn't really necessary, since they all end up with similar scores.

1 comments

apsurd 1204 days ago

thanks, i've been conflicted between "go with what you know" and investing time to keep up with tech.

your value prop is a nice hedge in that i can try it out and play with results without much investment, thanks!

(long-winded added context: i'm interested in classifying recipes into useful taxonomies. ex: "sweet potato" -> vegetables, root-vegetables, potato family, flavor: sweet, etc. still exploring, my goal is to recommend ingredient substitutions in an intuitive way. like "other potatoes" yes, but with the reason attached: it's starchy with a particular flavor profile, etc.)
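One way to sketch the substitution idea, once ingredients carry taxonomy tags, is to rank candidates by tag overlap and report the shared tags as the "why". Everything here is hypothetical: the tags are hand-written examples based on the ones listed above (in practice they might come from a classifier), and Jaccard similarity is just one simple choice of overlap measure.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical, hand-written tags for illustration only.
TAXONOMY: Dict[str, Set[str]] = {
    "sweet potato": {"vegetable", "root vegetable", "potato family", "starchy", "sweet"},
    "russet potato": {"vegetable", "root vegetable", "potato family", "starchy"},
    "carrot": {"vegetable", "root vegetable", "sweet"},
    "banana": {"fruit", "starchy", "sweet"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def substitutions(
    ingredient: str, taxonomy: Dict[str, Set[str]], top_k: int = 3
) -> List[Tuple[str, float, List[str]]]:
    """Rank other ingredients by tag overlap; include the shared tags."""
    target = taxonomy[ingredient]
    ranked = [
        (name, jaccard(target, tags), sorted(target & tags))
        for name, tags in taxonomy.items()
        if name != ingredient
    ]
    # Highest overlap first; break ties alphabetically for stable output.
    ranked.sort(key=lambda row: (-row[1], row[0]))
    return ranked[:top_k]


for name, score, shared in substitutions("sweet potato", TAXONOMY):
    print(f"{name}: {score:.2f} (shared: {', '.join(shared)})")
```

With these example tags, the russet potato ranks first because it shares the most tags, and the shared-tag list gives the intuitive explanation (starchy, root vegetable) rather than just "other potatoes".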


araghuvanshi 1204 days ago

Of course, we're happy to help :) The pace of change, especially in AI right now, is pretty dizzying so I can certainly relate.

That's a super interesting use case. I'm curious to see if the model can achieve that out of the box or if it'll need to be fine-tuned. Please keep us updated!
