by mjangle1985 1947 days ago
I participated in this bounty and it was a blast.

The team at DoltHub is great and extremely accessible on their Discord. These bounties seem like a great use case for their tech.

If you're into git and data (like I am) then these bounties are just awesome.


So you acquired and preprocessed some data for them? How did they verify its correctness?

I agree this looks like a great enterprise.

So data was accepted via pull requests.

The maintainer reviews each PR and either accepts it, asks you to modify it to better fit their requirements, or rejects it.

For example, one of their requirements was no rows with 0 votes. That's a pretty simple SQL query on the database, and it can be checked before the maintainer does a merge.
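To make that concrete, here is a rough sketch of what such a pre-merge check could look like. The table name `vote_tallies` and the column `votes` are made up for illustration (I'm not quoting the real schema), and an in-memory SQLite database stands in for the actual database so the snippet runs on its own.

```python
import sqlite3


def count_zero_vote_rows(conn, table="vote_tallies", column="votes"):
    """Return how many rows in `table` have 0 in `column`.

    The default table and column names are hypothetical placeholders,
    not the bounty's real schema.
    """
    query = f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" = 0'
    return conn.execute(query).fetchone()[0]


# Small in-memory stand-in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vote_tallies (county TEXT, candidate TEXT, votes INTEGER)"
)
conn.executemany(
    "INSERT INTO vote_tallies VALUES (?, ?, ?)",
    [
        ("Adams", "Candidate A", 1200),
        ("Adams", "Candidate B", 0),  # this row would violate the rule
        ("Baker", "Candidate A", 845),
    ],
)

zero_rows = count_zero_vote_rows(conn)
if zero_rows:
    print(f"{zero_rows} row(s) with 0 votes, fix before opening the PR")
```

A contributor could run something like this locally before submitting, and a maintainer could run the same query before merging.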

All data was required to be sourced. I got most of my data from state and county websites so those links were included with the comments in the PR.

In addition, I was in communication with the team via their Discord, so they would ask for changes to PRs from there too.

Interesting, thanks for sharing. So for the latest one where they have a bounty for assembling the largest healthcare dataset -- how do they determine who gets what portion of the bounty? It's not just winner-takes-all, right?

This data looks cool too, I'll have a look in the Discord...

It's divided based upon total additions to the final data set, I believe. They have a GitHub repo for their bounty board that shows that calculation, I think.

I think the final calculation is based on the percent you've added to the final dataset.

edit:

yeah here's the repo that calculates the final payment. https://github.com/dolthub/bounties
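
As I understand it, the idea is a proportional split. The sketch below is only my illustration of that idea, not the actual logic in the repo above; the function name, the prize amount, and the contributor counts are all invented.

```python
def split_bounty(total_prize, additions_by_contributor):
    """Split `total_prize` in proportion to each contributor's additions.

    `additions_by_contributor` maps a name to the number of additions
    that person has in the final dataset. This is a hypothetical
    illustration, not the real payout code.
    """
    total_additions = sum(additions_by_contributor.values())
    if total_additions == 0:
        # Nothing was added, so nobody is paid.
        return {name: 0.0 for name in additions_by_contributor}
    return {
        name: total_prize * count / total_additions
        for name, count in additions_by_contributor.items()
    }


# Invented numbers: a 10,000 prize and three contributors.
payouts = split_bounty(10_000, {"alice": 600, "bob": 300, "carol": 100})
for name, amount in payouts.items():
    print(f"{name}: {amount:.2f}")
```

With these made-up numbers, someone who contributed 60% of the additions would get 60% of the prize.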

Do they require you to "show your work" or somehow demonstrate that your methodology is sound? Is there any requirement that it be repeatable (e.g. let's say they find a minor issue in data you sourced. If you were required to provide code that does the work, rather than just the data, it could be fixed and re-run).

I'm kind of fascinated by the process, but I am having a hard time figuring out how this can really work. It can't really be as simple as paying people to shove arbitrary data of unknown value in their dbs, can it?

Join the Discord and hit them with some questions. I'm sure they'll be happy to answer you.

Why not answer here? It’s an opportunity to generate more hacker interest. I’m probably not going to sign up for Discord to ask one question.