Hacker News
by kibwen 557 days ago
> Value increases with # total downloads and LTM downloads on PyPI.

While I applaud the OP for the initiative, if this ever takes off it will cause people to exploit the system in the following ways:

1. Hammer the package registries with fake downloads, which will increase the financial burden on the registries both in terms of increased resource usage and in employing countermeasures to stop bad actors.

2. Spam the repositories of popular packages with PRs that just so happen to add a dependency on their own personal package, so that they can pick up transitive downloads. This increases the burden on package authors, who will need to spend time rejecting all of these.

So this approach carries the risk of possibly making things even worse for OSS maintainers.

If a metric can be gamed, and there is a financial incentive to game it, it will be gamed. I coin this the "this is why we can't have nice things" law.

7 comments

> While I applaud the OP for the initiative, if this ever takes off it will cause people to exploit the system in the following ways

It's true that the metrics used in this story could be exploited. But the value of the initiative is not in the specific method used to donate, but in the idea of finding worthy yet non-obvious projects to donate to and in leading by example.

If the initiative catches on, the community can find better, harder-to-exploit methods to find deserving targets, as has happened with NGOs, for example. This idea could create a healthy ecosystem that supports FLOSS, just like the idea of a stock exchange supported the emergence of publicly traded corporations in the XVIII and XIX centuries.

Exactly! The idea is to use available data to evaluate the value and risk of OSS and then allocate donations according to a wide, algorithm-based systemic index, not to a narrow set of manually picked projects (usually large or popular ones).

The current algorithm is far from being perfect (it's an MVP) and will never be, but with more measurable inputs and after multiple iterations with the help of the community, it can lead to an analogue of "S&P500" for OSS, that's worth using for donating to reduce the risk of the global OSS supply chain we all rely on.

As with publicly traded companies, having a decentralized set of private donors with skin in the game helps a lot to efficiently evolve the approach and make it harder to exploit in the future. And on the contrary, I would not trust an algorithm created and maintained by some state-owned or simply very large institution.

Even an index fund has some human-curated criteria for what to include, though, right? The S&P 500 isn't open to just anyone. So it seems totally legitimate to have it be not completely algorithmic.

If there were an "Open 500" that was trying to be like the open equivalent to the S&P 500, I would happily donate to it. Right now I do GitHub sponsors but it feels kind of random.

You just don't want to include projects like React or TypeScript that are operated by a for-profit company - they don't need our donations. You want the money to actually go to an organization that will invest it in software quality.

Totally agree! Actually I had outlined similar ideas and even an example (Pydantic) in https://news.ycombinator.com/item?id=42353209

In a nutshell:

- Algorithmic does not always mean automatic. An algorithm can have a human-in-the-loop element, as S&P500 or NASDAQ Composite have.

- Future versions of the index will account for known funding of OSS owners and maybe even exclude well-funded companies.

If everyone uses their own idiosyncratic algorithm for choosing OSS to donate to, it's going to be awfully hard to exploit.

There are probably only so many obvious metrics from which to pick, and you wouldn't have to game them all; just pick the easiest ones and keep grinding. Fraudsters are usually motivated and not that dumb.

The mechanism is kinda like the Spotify fake songs case: https://www.justice.gov/usao-sdny/pr/north-carolina-musician...

In the same way, there was a fixed pot of money available, split up by popularity, so making thousands of songs and streaming them as much as possible with bot accounts was profitable, even though each bot account cost a few dollars per month.

Here, the bots you use to juice your numbers don’t even need a subscription fee!

Which is why Spotify should pay a percentage of MY subscription fee to only the artists that I listen to. My money shouldn’t go to Taylor Swift if I don’t listen to Taylor Swift.

That would eliminate direct financial payment from botting. But botting could still affect trending or “related” recommendations for indirect financial boost.

The issue there is that the listens from people who listen to less music would be worth more than the listens from people who listen to more music.

That's not an issue, that's accurately reflecting reality. If I'm paying the same $10/month just to listen to $OBSCURE_ARTIST for 10 plays per month, then each play of that _is_ worth more to Spotify than each play from a 10-year-old listening to the same track of $SUPERSTAR one thousand times in a month.

In one case, 10 plays brought in $10 of revenue to Spotify, and those 10 plays should get $PERCENT of that $10.

In the other case, 1000 plays brought in $10 of revenue to Spotify, and those 1000 plays should also get the same $PERCENT of that $10.
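To make the arithmetic above concrete, here is a minimal sketch comparing the per-subscriber split described here with a single shared pool. The 70% payout share is purely an assumption standing in for $PERCENT, and the function names are hypothetical; nothing here reflects Spotify's actual formula.

```python
from fractions import Fraction

FEE = Fraction(10)              # monthly subscription in dollars, from the example above
PAYOUT_SHARE = Fraction(7, 10)  # ASSUMPTION: stands in for $PERCENT, not a real figure


def user_centric_per_play(fee, plays, payout_share):
    """Value of one play when a subscriber's fee is split only among their own plays."""
    return fee * payout_share / plays


def pooled_per_play(fees, total_plays, payout_share):
    """Value of one play when all fees go into one pot split by total play count."""
    return sum(fees) * payout_share / total_plays


# Subscriber A: 10 plays of $OBSCURE_ARTIST. Subscriber B: 1000 plays of $SUPERSTAR.
niche = user_centric_per_play(FEE, 10, PAYOUT_SHARE)
heavy = user_centric_per_play(FEE, 1000, PAYOUT_SHARE)
pooled = pooled_per_play([FEE, FEE], 10 + 1000, PAYOUT_SHARE)

print(f"user-centric, 10 plays:   ${float(niche):.4f} per play")
print(f"user-centric, 1000 plays: ${float(heavy):.4f} per play")
print(f"pooled, 1010 plays:       ${float(pooled):.4f} per play")

# Under the pooled model the obscure artist receives far less than the $7
# that subscriber A's fee alone would have paid out.
print(f"obscure artist, pooled:   ${float(pooled * 10):.2f}")
print(f"obscure artist, per-user: ${float(niche * 10):.2f}")
```

In both per-subscriber cases the artist side receives the same total from a $10 fee; only the per-play value differs, which is the point being argued.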

A fixed monthly subscription amount with unlimited usage will always carry this deficiency. A solution that addresses this would be usage-based pricing.

I don't understand how this is a deficiency.

That's not an issue. That's the entire point. You track listens per account and if you're only listening to a single niche musician, all your money (not someone else's) goes to that musician.

The real mystery is why it should work any differently, because the cross subsidy seemingly creates a perverse profit incentive for bots to scalp off some of that cross subsidy. The economics are broken. This is socialism for the rich and popular.

I find that court case very off-putting, since it was Spotify that stole the royalties, because the same mechanic applies to simply being popular. When will someone sue Taylor Swift for stealing royalties?

Also, since they didn't change the economics, they have done nothing to prevent this from happening again. Any economist who sees that he can earn $12 from an $11 payment would keep doing this until the risk-adjusted return is equal to the interest rate. Ironically, this will remain profitable until the cross subsidy is gone. I.e. there is an incentive to use the bots to boost real musicians who lose out from not being the recipient of the cross subsidy.

Exactly, and Goodhart's law drives the nails in the coffin.

https://en.wikipedia.org/wiki/Goodhart%27s_law

Makes me think of the "cobra effect", like the Great Hanoi Rat Massacre.[1]

Set arbitrary metrics like download count -> bad actors make bots to download their package -> they profit while the registry suffers from very heavy load.

[1]: https://en.wikipedia.org/wiki/Great_Hanoi_Rat_Massacre
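The chain above (metric -> bots -> profit) can be sketched with a toy allocation. This assumes a fixed pot split strictly in proportion to download counts, which is a simplification and not the index's actual formula; the package names and numbers are made up for illustration.

```python
def allocate(pot, downloads):
    """Split a fixed pot in proportion to each package's download count."""
    total = sum(downloads.values())
    return {name: pot * count / total for name, count in downloads.items()}


POT = 1000.0  # hypothetical donation pot in dollars

# Hypothetical honest download counts.
honest = {"lib_a": 900_000, "lib_b": 90_000, "bot_pkg": 10_000}
before = allocate(POT, honest)

# The owner of bot_pkg scripts one million fake downloads of their own package.
gamed = dict(honest, bot_pkg=honest["bot_pkg"] + 1_000_000)
after = allocate(POT, gamed)

for name in honest:
    print(f"{name}: ${before[name]:.2f} -> ${after[name]:.2f}")
```

Every dollar the gamed package gains comes out of the honest packages' shares, and the registry pays for serving the extra million downloads.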

Rather than trying to donate to the most popular packages, people could try to donate to the packages they use, and then their dependencies (it would be nicer, though, if repos had a way for packages to list their dependencies and automatically propagate donations they received down, which would be abusable by the top-level packages but, eh, you have to trust people at some point).
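A rough sketch of what such propagation could look like, assuming each package keeps a fixed fraction of what it receives and splits the rest evenly among its direct dependencies. The `keep` fraction, the even split, and the package names are all assumptions for illustration, and the dependency graph is assumed to be acyclic.

```python
def propagate(package, amount, deps, keep=0.5, payouts=None):
    """Credit `package` with a share of `amount` and pass the rest to its dependencies.

    `deps` maps a package name to the list of its direct dependencies.
    A package with no dependencies keeps everything it receives.
    """
    if payouts is None:
        payouts = {}
    children = deps.get(package, [])
    if not children:
        payouts[package] = payouts.get(package, 0.0) + amount
        return payouts
    payouts[package] = payouts.get(package, 0.0) + amount * keep
    share = amount * (1 - keep) / len(children)
    for child in children:
        propagate(child, share, deps, keep, payouts)
    return payouts


# Hypothetical dependency graph.
deps = {
    "app_framework": ["http_lib", "json_lib"],
    "http_lib": ["tls_lib"],
}

payouts = propagate("app_framework", 100.0, deps)
for name, value in payouts.items():
    print(f"{name}: ${value:.2f}")
```

The trust problem mentioned above shows up directly in `keep`: a top-level package that sets it to 1.0 passes nothing down.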
2 is already happening; I have seen this multiple times.

Sweet, sweet human nature.