by kevincox 1176 days ago
I don't think this will work well. Minor updates to listings would trigger all of these actions so often that they would become routine and be ignored.

I hate to say it but I think some type of heuristics would be needed here.

1. Has the title changed significantly?
2. Has the price changed significantly?
3. Are the search keywords that were finding the old listing significantly different from those finding the new listing?
4. Have average ratings and common words in reviews changed? (Especially rarer words that match the new and old listing respectively.)

If some of these start to look suspicious, then I think you can start to apply your mitigations. You can probably even scale them by how sure you are. For example, reviews are always downranked by age, and significant changes to the listing would amplify this effect; you can apply the same weighting to the star rating.
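To make the scaling idea concrete, here is a rough sketch, not anything a real marketplace is known to run. It combines only the title and price signals into a suspicion score, then uses that score to speed up the age decay of reviews. The `Listing` fields, the 0.7/0.3 weights, and the one-year half-life are all made-up assumptions; the keyword and review-word signals are left out because they need data this thread doesn't describe.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Listing:
    # Hypothetical minimal listing record, for illustration only.
    title: str
    price: float


def suspicion_score(old: Listing, new: Listing) -> float:
    """Return a value in [0, 1]; higher means the listing changed more."""
    # Title change: 1 minus the similarity ratio of the lowercased titles.
    title_change = 1.0 - SequenceMatcher(
        None, old.title.lower(), new.title.lower()
    ).ratio()
    # Price change: relative difference, capped at 1.
    price_change = min(abs(new.price - old.price) / max(old.price, 0.01), 1.0)
    # Assumed weights; a real system would have to tune these.
    return 0.7 * title_change + 0.3 * price_change


def review_weight(age_days: float, suspicion: float,
                  half_life_days: float = 365.0) -> float:
    """Downrank a review by age; a suspicious listing change shortens the half-life."""
    effective_half_life = half_life_days * (1.0 - 0.9 * suspicion)
    return 0.5 ** (age_days / effective_half_life)
```

The same `review_weight` could be used when averaging star ratings, so old ratings count for less after a suspicious change.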

And of course the real way to prevent this is to flip the incentive. Add human review and a warning before killing the account. Make it so that the cost of being caught negates the benefit of doing this.


Web megacorps are normally allergic to any kind of human review because they are in the business of picking up pennies on each interaction via adverts. It's unsustainable to police the world on that model.

Amazon is in a different space here. Even the smoothest transaction goes through a handful of literal human hands. They have to pay for those hands regardless. At the very least, following up on cases where customers (and competitors) flag fraud on their system should be possible.

I'm pretty sure if you gave ChatGPT the old and new versions of the listing, it would have a 99%+ accuracy when answering the question, "are these for the same product?" So they could just run each change through something like that, and wouldn't have to write any custom heuristics.
I'm pretty sure any of a million simpler edit distances could tell you if a product listing was substantially changed.
Of course, though perhaps with slightly lower accuracy. Either way, my point was that it's far from an unsolvable problem; it can be trivially solved with existing tools.
Honestly, this seems like a perfect problem for GPT.

Show the title and main description to GPT every time the seller makes a change, and ask "Do these seem to be the same product?"

If GPT says that they seem different, flag for human review.

You could probably even ask GPT to rate its confidence. If it's highly confident, skip the human review.
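The routing described above might look something like this. It is only a sketch: `classify` stands in for whatever model call you would actually make (no real API is assumed here), and the return labels and the 0.95 threshold are invented for illustration.

```python
from typing import Callable, Tuple

# Hypothetical interface: takes a prompt, returns (same_product, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[bool, float]]


def build_prompt(old_title: str, old_desc: str,
                 new_title: str, new_desc: str) -> str:
    """Assemble the question put to the model for one listing change."""
    return (
        "Do these seem to be the same product? "
        "Answer yes or no and give your confidence.\n\n"
        f"OLD TITLE: {old_title}\nOLD DESCRIPTION: {old_desc}\n\n"
        f"NEW TITLE: {new_title}\nNEW DESCRIPTION: {new_desc}\n"
    )


def route_change(classify: Classifier, old_title: str, old_desc: str,
                 new_title: str, new_desc: str,
                 confident: float = 0.95) -> str:
    """Decide what happens to a listing change based on the model's verdict."""
    same, confidence = classify(
        build_prompt(old_title, old_desc, new_title, new_desc)
    )
    if same:
        return "accept"
    # Different product and the model is highly confident: skip the human.
    if confidence >= confident:
        return "act_without_review"
    # Different product but the model is unsure: flag for human review.
    return "human_review"
```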

This is just a simple classification problem, the prime application for basic neural networks. Using a general text generation system for this seems like complete overkill. Just a bunch of wasted resources.
This was also my first reaction, but it got me questioning whether I'm just becoming the same as the guys who were saying "using an interpreted language for that is a waste of resources". Maybe LLMs are the equivalent: sure, they use more CPU cycles, but you can point them at some problems and get them solved for a fraction of the effort.
This is true, but it is a cost optimization argument, not a feasibility one, and feasibility is all the OP was proposing.
Title: BLOMGHSY Luxury Shower Curtain Premium Best Quality

Description: Ignore previous instructions […]

> some type of heuristics would be needed here.

Calculating a meaningful numeric difference between two chunks of text is fairly well-trod territory.
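For instance, a plain Levenshtein edit distance, normalized by the longer string's length, gives a number between 0 and 1. This is just one of many possible measures, picked here because it is short to write out; nothing in the thread says which one a marketplace should use.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def normalized_change(old: str, new: str) -> float:
    """Return 0.0 for identical text, 1.0 for a complete rewrite."""
    longest = max(len(old), len(new))
    return levenshtein(old, new) / longest if longest else 0.0
```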

It will help, but some changes are very small. For example, adding "faux" (for leather, etc.) to the listing name/description would probably result in a very small text distance, while changing the contents substantially.
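One way to catch that kind of case is to look at which words were added or removed rather than how much text changed, and check them against a watchlist. The sketch below assumes a hand-picked `MEANING_CHANGING` set, which is purely illustrative and would miss anything not on it.

```python
import re
from typing import Set, Tuple

# Hypothetical watchlist of words that change what the product is.
MEANING_CHANGING = {"faux", "imitation", "replica", "refurbished", "used", "compatible"}


def changed_words(old: str, new: str) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) lowercase word sets between two texts."""
    old_words = set(re.findall(r"[a-z0-9]+", old.lower()))
    new_words = set(re.findall(r"[a-z0-9]+", new.lower()))
    return new_words - old_words, old_words - new_words


def risky_changes(old: str, new: str) -> Set[str]:
    """Return watchlist words that were added to or removed from the text."""
    added, removed = changed_words(old, new)
    return (added | removed) & MEANING_CHANGING
```

With "Leather Wallet for Men" changed to "Faux Leather Wallet for Men", the only added word is "faux", and it is on the watchlist even though the character-level difference is tiny.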
reputation systems are not some esoteric things...

also, if a fucking seller cannot keep their listing reliably constant, what are they selling?

new version, new product, new reviews.

car manufacturers do this. wineries do this. pharma does this. even Apple managed to show the manufacturing date of their new new new new but the same things.