HN Mirror



	by neximo64 1132 days ago
	Are you actually throttled if you try to git clone, or is that just the theory? Or is the assumption that it uses API calls to scrape through GitHub? Has anyone actually tried, because I've cloned lots of repos and have never been throttled. I'd go so far as to say the author of that post has never even tried it.

3 comments

jackdaniel 1132 days ago

I'm not arguing for or against whether they are in the dominant position; what I'm doing is pointing out that the grandparent quoted part of the text (and argues against it) without quoting the justification the author provided that is directly relevant to what they say.

> There’s an important notion to address here. Open source code on GitHub might be thought of as “open and freely accessible” but it is not. It’s possible for any person to access and download one single repo from GitHub. It’s not possible for a person to download all repos from Github or a percentage of all repos, they will hit limitations and restrictions when trying to download too many repos. (Unless there’s some special archives or mechanisms I am not aware of).


logifail 1132 days ago

> Has anyone actually tried, because I've cloned lots of repos and have never been throttled

(Full disclosure: I have some pretty serious data hoarding issues)

When someone says "I've cloned lots of repos and have never been throttled" I'm afraid I immediately start wondering whether "lots" means multiple GB or multiple TB ... or more!


quickthrower2 1132 days ago

With 21TB of data, they might rate limit you! It might be possible via proxies, though, and only for public repos.


neximo64 1132 days ago

Copilot was only trained on public repos. I'd be surprised if you were throttled.


scott_w 1132 days ago

I'd be surprised if they didn't throttle anyone trying to download 21TB of data. And I wouldn't judge them for it.
