by echen 1478 days ago

Great question! I'd love to measure that more rigorously too.

That said, from what we've seen, how much context sensitivity matters really depends on the labeling task / application.

For example, context matters even more when you're labeling a reply tweet than when you're labeling a parent tweet. Without the full thread, it's often hard to understand what the reply is talking about, and without knowing whether the replier and the original tweeter follow each other, it can be hard to tell whether something is a joke or an insult. This is important because customers sometimes don't realize this and send us the tweet text by itself instead of a full tweet link.
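To make that concrete, here's a rough sketch of what a labeling task record could look like with and without context. The class and field names (`TweetLabelingTask`, `thread`, `authors_follow_each_other`) are made up for illustration and aren't from any real API or from our actual pipeline:

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TweetLabelingTask:
    """Hypothetical record handed to a human labeler."""
    text: str
    is_reply: bool = False
    # Earlier tweets in the thread, oldest first (empty if not provided).
    thread: List[str] = field(default_factory=list)
    # None means the follow relationship is unknown.
    authors_follow_each_other: Optional[bool] = None

    def missing_context(self) -> List[str]:
        """List the context a labeler would lack for this task."""
        missing = []
        if self.is_reply and not self.thread:
            missing.append("thread")
        if self.is_reply and self.authors_follow_each_other is None:
            missing.append("follow relationship")
        return missing


# Tweet text sent by itself: the labeler can't see what it replies to.
text_only = TweetLabelingTask(text="oh sure, genius idea", is_reply=True)

# Same reply with the thread and follow relationship attached.
with_context = TweetLabelingTask(
    text="oh sure, genius idea",
    is_reply=True,
    thread=["Thinking of rewriting the whole app this weekend"],
    authors_follow_each_other=True,
)

print(text_only.missing_context())
print(with_context.missing_context())
```

The first task flags both the thread and the follow relationship as missing, which is roughly the situation a labeler is in when they only get the raw text of a reply.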

It's also important because even if your models use text alone (and not a richer set of context/features), there may be patterns in the text itself that an ML model could pick up on but that a human wouldn't notice without that extra context.

We also have another post on context sensitivity if you're curious: https://www.surgehq.ai/blog/why-context-aware-datasets-are-c...