by rootlocus 7 days ago
You also need to scrape huge amounts of data with no regard for copyright which is:

1. No longer possible the same way it was for OpenAI and Anthropic, and

2. Much more regulated in the EU

Also, the EU would need state backing since we don't have the same private capital, meaning the regulations are even tighter.


You can do what frontier labs do today which is to properly license things that are copyrighted and use open source web crawls for things that don’t have copyright issues. You can then also commission new datasets (volume needed goes down when quality is high).

The European regulations are the thing that will kneecap anything meaningful coming out of Europe. It's mind-blowing to me that this is considered worth the tradeoff, since Europe will be beholden to other frontier labs, be it in China or the US. The regulations accomplish very little, if anything, in shaping actual AI development, and Europe loses vast amounts of leverage in the process.

> You can do what frontier labs do today which is to properly license things that are copyrighted and use open source web crawls for things that don’t have copyright issues. You can then also commission new datasets (volume needed goes down when quality is high).

It cost Anthropic $1.5 billion for training on LibGen's 480k pirated ebooks.

Investors will cough up that money if you're already clearly a frontier lab with a model people are paying a lot of money for.

Tough to get that much cash without anything to show.

I thought the joke was that people aren't paying enough money.

Regulations aside, Europe is extremely divided. There's constant resistance from individual states, disputes, and far-right extremism gaining traction. At this point, it seems like the EU can barely agree on any decision.

Well-meaning restrictions that threaten Europe's ability to compete sound like something that would eventually encourage far-right extremism by impeaching the validity of the restrictions' philosophical underpinnings.

In my experience it doesn't take anything that substantial to encourage far-right extremism. It's enough to point at an existing problem and a convincing scapegoat. It works today, not eventually, and it works regardless of any reason or reality.

It's true though that multiple problems mean multiple propaganda seeds.