HN Mirror



	by siva7 1055 days ago
	Opinion pieces like shopping recommendations are quite hard for current LLMs. Either it is hard fact or it is pure creative work; that's where AI shines. Anything in between, and things get tricky.

1 comments

2bitencryption 1055 days ago

This is one of those areas where the poor quality of the data influences the output, I think.

There are so many garbage, lazily written product reviews from websites that exist only to get people to click affiliate links. These sites have one goal, which is to get you to click an affiliate link and make a purchase. So it is not in their best interest to say "You shouldn't buy this."

Rather, they make a list of "top X Foobars": they start with a really expensive one, then follow with a more reasonably priced one and give it a very positive review. It leads to clicks and purchases.

Given this, it's not surprising to me that even the best LLMs carry pieces of this with them. Ask it to predict text describing some tech product on a sales page, and of course parts of that low-quality data will bleed through.


cubefox 1055 days ago

There is an argument to be made for automatically downweighting anything with affiliate links, whether in training (fewer epochs) or in search ranking (a lower PageRank score). But I guess it would be trivial to hide them behind a redirect.
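To make the downweighting idea concrete, here is a minimal sketch, not anything a real crawler or training pipeline is known to do. The parameter names in `AFFILIATE_PARAMS`, the `penalty` value, and both function names are hypothetical, picked only for illustration. It also shows the weakness mentioned above: a link hidden behind a redirect carries no such parameter and would pass straight through.

```python
import re
from urllib.parse import urlparse, parse_qs

# Hypothetical query parameters treated as affiliate markers (an assumption).
AFFILIATE_PARAMS = {"tag", "affid", "aff_id"}

# Naive href extraction; a real pipeline would use a proper HTML parser.
HREF_RE = re.compile(r'href="([^"]+)"')


def has_affiliate_link(html: str) -> bool:
    """Return True if any link in the page carries a known affiliate parameter."""
    for url in HREF_RE.findall(html):
        query = parse_qs(urlparse(url).query)
        if any(param in query for param in AFFILIATE_PARAMS):
            return True
    return False


def sample_weight(html: str, penalty: float = 0.2) -> float:
    """Weight for a page: 1.0 normally, `penalty` if it has affiliate links."""
    return penalty if has_affiliate_link(html) else 1.0
```

The returned weight could then scale how often a page is sampled during training or how much it counts in ranking, whichever the system supports.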

That being said, I recently asked the Bing chatbot about the difference between two similar-sounding printer models, and it gave a good explanation which I previously couldn't quickly find via Google. In Bing's case it is sometimes not completely clear to what degree its answer depends on the Web search, if it performed one, and to what degree it is just answering from its background knowledge (which could be prone to hallucination, but is less "gullible", so to speak). It provides sources, but not everything it says is necessarily present in the source. I'm actually surprised how quickly Bing is able to search (load and read) multiple websites, given that the loading times are not always trivial. It turns out LLMs are much faster at reading than at typing. Indeed, each forward pass reads the entire context window, and that happens once for every generated token!
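A rough way to see the "once for every generated token" point is a toy count, assuming the simplest possible decoding loop with no key/value caching (real systems cache, so they do far less repeated work). The function name and the numbers are illustrative assumptions, not measurements of Bing or any other model.

```python
def tokens_read_without_cache(prompt_len: int, new_tokens: int) -> int:
    """Total tokens read across all forward passes in naive decoding.

    Assumes one forward pass per generated token, and that each pass
    reads the whole context (the prompt plus everything generated so far).
    """
    total = 0
    context = prompt_len
    for _ in range(new_tokens):
        total += context  # this pass reads the entire current context
        context += 1      # the newly generated token joins the context
    return total


# Illustrative numbers: a 1000-token prompt and a 100-token answer.
print(tokens_read_without_cache(1000, 100))
```

Under these assumptions the model reads the 1000-token prompt a hundred times over while only "typing" 100 tokens, which fits the impression that reading is the cheap part.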


nowooski 1055 days ago

For sure. The garbage in, garbage out problem is quite real for ecommerce applications.
