by danso 3678 days ago
Given that there's often a lot of boilerplate in a TOS, would it be fairly easy to come up with a few heuristics for a tool that, when fed the text of a TOS, could deemphasize the boilerplate and perhaps flag known problematic (or, at least, esoteric) language? Not a sophisticated AI thing, mind you, just something that cuts a little more to the chase.

edit: I mean that the tool should be a dumb high-pass filter that would work in addition to what tosdr provides through user reviews and manual classification via its plugin: https://tosdr.org/classification.html
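
To make the "dumb high-pass filter" idea concrete, here is a rough sketch, not a finished tool. It assumes you already have a pile of other ToS texts to compare against: sentences that show up in most of them are treated as boilerplate and dropped, and anything matching a small watchlist is kept regardless. The function names, the `max_share` threshold, and the watchlist phrases are all made up for illustration and are not a vetted legal list.

```python
import re
from collections import Counter

# Hypothetical watchlist: illustrative phrases only, not legal guidance.
WATCHLIST = ("binding arbitration", "class action", "perpetual", "irrevocable")

def split_sentences(text):
    """Naive splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def normalize(sentence):
    """Keep only lowercase words so near-identical boilerplate collapses together."""
    return " ".join(re.findall(r"[a-z]+", sentence.lower()))

def boilerplate_counts(corpus):
    """Count in how many reference documents each normalized sentence appears."""
    counts = Counter()
    for doc in corpus:
        counts.update({normalize(s) for s in split_sentences(doc)})
    return counts

def high_pass(text, corpus, max_share=0.5):
    """Return (sentence, matched_watchlist_phrases) for sentences worth reading.

    A sentence passes if it appears in at most `max_share` of the reference
    documents, or if it contains a watchlist phrase.
    """
    counts = boilerplate_counts(corpus)
    results = []
    for sentence in split_sentences(text):
        share = counts[normalize(sentence)] / len(corpus) if corpus else 0.0
        flagged = [p for p in WATCHLIST if p in sentence.lower()]
        if share <= max_share or flagged:
            results.append((sentence, flagged))
    return results

if __name__ == "__main__":
    corpus = [
        "We may update these terms. You agree to be bound.",
        "We may update these terms. Contact us anytime.",
        "We may update these terms. Governing law is X.",
    ]
    tos = "We may update these terms. You grant us a perpetual license to your soul."
    for sentence, flagged in high_pass(tos, corpus):
        print(sentence, flagged)
```

Exact sentence matching is the crudest possible similarity test; real boilerplate varies in wording, so a fuzzier comparison would probably be needed in practice.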

3 comments

I'm a lawyer, and not an expert in AI. I once gathered a ton of Terms of Service, cleaned them up as text files, and began training a classifier. By the time I was done, it was pretty good at telling me which section heading in the typical ToS table of contents any random text belonged to.
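
For anyone curious what that kind of classifier might look like, below is a minimal sketch. It is a guess at the general approach, not the setup described above: a bag-of-words Naive Bayes model with add-one smoothing, trained on (section heading, text) pairs. The class name, the headings, and the training snippets are invented for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and keep alphabetic words only."""
    return re.findall(r"[a-z]+", text.lower())

class SectionClassifier:
    """Multinomial Naive Bayes over words, with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # heading -> word frequencies
        self.doc_counts = Counter()              # heading -> number of examples
        self.vocab = set()

    def train(self, examples):
        """`examples` is an iterable of (heading, text) pairs."""
        for heading, text in examples:
            tokens = tokenize(text)
            self.word_counts[heading].update(tokens)
            self.doc_counts[heading] += 1
            self.vocab.update(tokens)

    def predict(self, text):
        """Return the heading with the highest log-probability for `text`."""
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for heading, n_docs in self.doc_counts.items():
            counts = self.word_counts[heading]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for token in tokens:
                score += math.log((counts[token] + 1) / denom)
            if score > best_score:
                best, best_score = heading, score
        return best

if __name__ == "__main__":
    clf = SectionClassifier()
    clf.train([
        ("Termination", "We may suspend or terminate your account at any time"),
        ("Termination", "Either party may terminate this agreement with notice"),
        ("Governing Law", "These terms are governed by the laws of the State of California"),
        ("Governing Law", "Any dispute is subject to the laws and courts of England"),
        ("Privacy", "We collect personal data as described in our privacy policy"),
        ("Privacy", "Your personal data may be shared with our partners"),
    ])
    print(clf.predict("we can terminate your account"))
```

With only a handful of training snippets this is a toy; the point is just that rigidly structured documents give even a simple word-count model a lot to work with.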

It was an interesting thought experiment, so I expanded it to my large body of private contracts that I've collected after many years of law practice. The results were less accurate (because ToS tend to follow a pretty rigid pattern, mostly), but still pretty good.

The big question is: does anyone really care about it? For example, I never look at tosdr.org because what are you going to do? Negotiate a ToS? It's not as if there's any meaningful freedom of choice in this space. My personal view (not legal advice!) is that most browse-wrap ToS aren't enforceable as contracts.

Not a lawyer, but I used to read EULAs etc. carefully.

At some point I gave in and decided a better approach would be to just point out that no sane consumer can read all that.

In Germany at least (and thus maybe most of the EU?) no customer actually has to read any of it. Anything in a ToS/EULA that could be considered "surprising" is unenforceable and therefore void.

Of course that means the exact enforceable contents of every ToS ultimately boil down to case law but for customers this is a much better solution than "you may have accidentally sold your soul".

That's more or less true in America, too. Consumer contracts are generally subject to the "unconscionability" test: https://en.wikipedia.org/wiki/Unconscionability#United_State.... It's not the same as "surprising", but the sentiment is roughly the same.
The good news is that if you wait long enough, they'll incorporate terms from URIs that are dead. I can't tell you how many commercial contracts I review from the 2000s that are governed by 404 pages.
You took the path of reading EULAs carefully?! These documents are painfully lacking in clarity and sincerity. Content generators basically chump subscribers into clicking "I Agree".
Off topic, but you must have a lot of cool potential projects in your domain to apply programming.
I have a lot of "mad science" projects, but only to help me make my work easier. I would never try to turn any of my work into products for sale. Selling tech to lawyers and law firms is a good way to make yourself completely miserable. (They are nearly always nightmare customers.)

Lawyers and law firms currently have no serious imperative to work efficiently. Until companies let go of their reliance on big law firms, this will continue. I hear a lot of talk about people wanting lawyers to change, but then if you talk to any funded startup, they nearly always either waste a ton of money on expensive law firms, or they avoid lawyers altogether. It's a complicated problem that perhaps only time can change.

> what are you going to do?

Possibly avoid the service or at least limit what data I share with them.

That's the right idea. It's a take-it-or-leave-it proposition. I think what's more interesting than policing Terms of Service changes is making the terms interactive, and attaching pricing to terms that are more consumer-friendly. Then I can choose how much I want to pay for my right to be treated like a valued human being.
Slightly related, but there's a plugin for Chrome that sort of does this: https://chrome.google.com/webstore/detail/terms-of-service-d...

It doesn't read the ToS for a site. Rather, it pulls from its own database that describes potential issues with a site's ToS and alerts you to them when you visit. YouTube always gets flagged for me.

What about when the problematic becomes boilerplate?