by nikolayasdf123 282 days ago
> scraping LinkedIn profiles

is this legal? last time I checked, linkedin.com/robots.txt does not allow scraping without explicit approval from LinkedIn


If it is publicly available information it is legal to scrape it, regardless of what robots.txt says.

See: https://www.webspidermount.com/is-web-scraping-legal-yes/

As an attorney (and this is not legal advice), I don't think it's quite that simple. The court held that the CFAA does not proscribe scraping of pages to which the user already has access and in a way that doesn't harm the service, and thus it's not a crime. But there are other mechanisms that might impact a scraper, such as civil liability, that have not been addressed uniformly by the courts yet. And if you scrape in such a way that does harm the operator (e.g. by denying service), it might still be unlawful, even criminal.

There's a relevant footnote in the cited HiQ Labs v. LinkedIn case:

"LinkedIn’s cease-and-desist letter also asserted a state common law claim of trespass to chattels. Although we do not decide the question, it may be that web scraping exceeding the scope of the website owner’s consent gives rise to a common law tort claim for trespass to chattels, at least when it causes demonstrable harm."

They also said: "Internet companies and the public do have a substantial interest in thwarting denial-of-service attacks and blocking abusive users, identity thieves, and other ill-intentioned actors."

It's a good idea to take legal conclusions from media sites with a grain of salt. Same goes for any legal discussion on social media, including HN. If you want a thorough analysis of legal risk--either for your business or for personal matters--hire a good lawyer.

Smart
Or run your legal questions through a frontier model and then have a lawyer verify the answers. You can save a lot of money and time.

Yes, all LLM caveats apply. Do your due diligence. But they are quite good at this now.

Have you actually tried this approach? I’m curious as to the result, especially when you took it to your lawyer. Not a contract review but a business practice risk evaluation.
Some context from coverage of GPT 5:

https://legaltechnology.com/2025/08/08/openai-launches-gpt-5...

https://www.artificiallawyer.com/2025/08/08/gpt-5-tops-harve...

Remember when "asking for a friend" was a thing?

Today's expression is "I asked a friend". You can try that when talking to your lawyer about your latest ChatGPT — they might still believe you.

Hmm this is a good idea too
what nonsense. they explicitly say "do not scrape us, unless we approve". they put up paywalls and captchas. their service is literally selling access to users' data.

now you are scraping it. this is a direct violation and direct harm to their business, despite their explicit statements telling you to stop.

you lose the case, it is clear as day.

what nonsense. this is the equivalent of "sovereign citizens" online. go and try it, and get yourself into jail.
Do not confuse strong language with strong argument. Yours is the former not the latter.
LinkedIn has an API. So why scrape?
because they are pulling what they are not supposed to. they are doing it illegally. that's why.
> they are doing it illegally.

ToS aren't real laws, mate.

Edit: oops, just saw a message from the creator of this thing saying he gets the data in the most illegal possible ways. They have no salvation.

It is possible to do what they propose legally, tho, if the "agent" is just the user's computer.

ToS are legally binding contracts. they are there for a reason.

contracts are not laws themselves. but a correctly written ToS (I bet LinkedIn's is) holds very real legal power.

We get our data from third party data vendors who we assume have gotten explicit approval from linkedin!
You assume! Such due diligence!
Unfortunately not able to get into their codebase
Or yours...
What would you like to see?

Can tell you :)

you're building a tool that is designed to sink its tentacles into people's most personal accounts and take unsupervised automated actions with them, using a technology that has serious, well-known, documented security issues. you haven't demonstrated any experience with, awareness of, or consideration for the security issues at hand, so the ideal amount of code to share would likely be all of it.