by Chris2048 3349 days ago
I think here is an idea that isn't unique to quants/finance: curated datasets.

Hosted data-sets that are fully cleaned, verified and kept up to date. You pay a fee for the feed, which essentially covers initial and on-going curation. Fees would probably be based on usage (dev/test/commercial etc) but also the realistic market value of curation.

There's a mile of difference between a data-set that's been fed through a few cleaners and is 99% right, and one that is thoroughly checked, 99.99% right, and still updated as such with little delay. The former is the "one-man dev looking for easy passive income", the latter is the "quality datasets taken seriously".
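To put rough numbers on that gap, here is a minimal sketch. It assumes "99% right" means a per-record accuracy, and the dataset size of 1,000,000 records is purely hypothetical; the function name is made up for illustration.

```python
def expected_bad_records(total_records: int, accuracy: float) -> int:
    """Expected number of wrong records, assuming accuracy is per record."""
    return round(total_records * (1 - accuracy))


# Hypothetical dataset of one million records.
quick_clean = expected_bad_records(1_000_000, 0.99)    # fed through a few cleaners
fully_checked = expected_bad_records(1_000_000, 0.9999)  # thoroughly checked

print(quick_clean, fully_checked)
```

Under these assumptions the "99% right" set carries about a hundred times as many bad records as the "99.99% right" one.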

3 comments

Algo trading is about having the edge.

The edge is something you have and other people don't.

Enjoy your feature engineering.

(Meaning: selling the same curated data product to many customers undermines its value. Overpricing it and selling only to the selected few, on the other hand...)

But this is algo trading specifically; the scope for curated datasets is larger.

Plus, what stops anyone building on top of a dataset? If this isn't dive ebay value do any third parties add?

A dataset sold to many customers doesn't undermine the price the seller can charge, since there is no competitive advantage in not using it either.

s/"dive ebay"/"done, what"/

And to clarify the last point:

If I create a dataset for $100, I could sell it to one person for $120, or to 6 people for $20 each. I make the same either way, even if the value to each individual client is reduced. On the other hand, compare the price with what it costs a client to make their own ($100): the single buyer pays $20 more than that (120-100), while each of the six pays $80 less (100-20), so fewer clients are likely to "roll their own" competing datasets.
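A rough sketch of that arithmetic, assuming a client's cost to roll their own equals the seller's $100 build cost (an assumption for illustration, as are the function and variable names). Saving is defined as build cost minus price, so a negative value means buying costs more than building.

```python
def compare(build_cost: int, price: int, buyers: int) -> tuple[int, int, int]:
    """Return (seller revenue, seller profit, per-buyer saving vs. building).

    Assumes every buyer could build the same dataset for build_cost.
    """
    revenue = price * buyers
    seller_profit = revenue - build_cost
    buyer_saving = build_cost - price  # negative: cheaper to roll your own
    return revenue, seller_profit, buyer_saving


exclusive = compare(build_cost=100, price=120, buyers=1)
shared = compare(build_cost=100, price=20, buyers=6)

print(exclusive)  # (120, 20, -20)
print(shared)     # (120, 20, 80)
```

The seller's revenue and profit are identical in both cases; only the buyer's incentive to build a competing dataset changes.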

Seems like neural networks would have some advantage here...
What sort of datasets do quants and finance need? For finance, I imagine it has to be both accurate and realtime for it to be of any value?
Accurate yes. Realtime not necessarily. EOD risk doesn't need real time market data.
Yes, where are these datasets?
That's been around for 20 years, and comes with its own hardware platform (the Bloomberg terminal).