Hacker News
by chollida1 3349 days ago
Intersection of trading and data mining.

I've said this a few times but we're going through a growth period like AAA video games have over the past 20 years.

It used to be that 2 guys could make a video game, then it went to 10, then 50, and now it's around 200 from what I've last heard.

Hedge funds are going through a similar shift.

It used to be that one person could manage both data cleaning and algo generation for a fund.

Then cleaning got split out into its own job.

Then the number of data streams exploded, growing by a couple of orders of magnitude.

Then the data types diverged, so that each new data stream needs its own special cleaning and normalization, and even its own data storage; i.e. some data, like satellite images, isn't suitable for storage in a SQL or NoSQL database.

Nowadays a typical algo fund might make use of 100 different algos for trading, each of which has 20 different inputs, some real time, some updated irregularly.

It takes those signals and weights them to come up with a trading signal, which then gets mixed with portfolio balancing signals and risk signals.
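
A rough sketch of that weighting step in its simplest linear form. The signal names, the weights, and the way the balancing and risk inputs are mixed in are all made up for illustration; a real fund's blending is far more involved than this.

```python
def combine_signals(signals, weights):
    """Weighted average of per-algo signals, each assumed to lie in [-1, 1]."""
    total_weight = sum(weights[name] for name in signals)
    if total_weight == 0:
        return 0.0
    return sum(signals[name] * weights[name] for name in signals) / total_weight


def final_position(alpha, balance_adjustment, risk_scale):
    """Mix the alpha signal with a portfolio balancing adjustment and a
    risk multiplier in [0, 1], then clamp to [-1, 1] (full short to full long)."""
    raw = (alpha + balance_adjustment) * risk_scale
    return max(-1.0, min(1.0, raw))


# Hypothetical signals from three algos and their (made-up) weights.
signals = {"momentum": 0.8, "mean_reversion": -0.2, "satellite_traffic": 0.5}
weights = {"momentum": 2.0, "mean_reversion": 1.0, "satellite_traffic": 1.0}

alpha = combine_signals(signals, weights)
position = final_position(alpha, balance_adjustment=-0.1, risk_scale=0.5)
print(alpha, position)
```

Even in this toy version, the final position depends on every input at once, which hints at why attributing performance back to one signal gets hard.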

It can be tough to disentangle each individual signal from the algos themselves so even things like detecting if a signal still has alpha generating abilities is tough.
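
One common way to watch a single signal in isolation is a rolling correlation between the signal and the return that followed it (often called an information coefficient). The sketch below assumes you already have the signal and its forward returns as aligned lists; the numbers are invented, and it does nothing to solve the disentangling problem described above.

```python
from statistics import mean


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / (var_x * var_y) ** 0.5


def rolling_ic(signal, forward_returns, window):
    """Correlation of the signal with its following return, over each trailing window."""
    return [
        correlation(signal[i - window:i], forward_returns[i - window:i])
        for i in range(window, len(signal) + 1)
    ]


# Invented data: the signal predicts returns early on, then stops working.
signal = [1, -1, 1, -1, 1, -1, 1, -1]
forward_returns = [0.02, -0.01, 0.03, -0.02, -0.01, 0.01, 0.0, 0.01]

ic = rolling_ic(signal, forward_returns, window=4)
print(ic)
```

With this data the first window shows a strong positive correlation and the last one a negative correlation, which is the kind of drift you would want flagged.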

You can have 10 people just backtesting signals and monitoring risk levels.

And the growth of data and data sources isn't slowing down.

This is good if you are one of the larger players (see Virtu buying out competitor KCG, which previously ate competitor Knight Capital, yes, that fund with the huge blowup), but not such great news if you want to remain a small fund, headcount-wise.

Not sure how to run a quant fund anymore with only 4 people. Not sure anything can be done about it.

4 comments

The tools still exist to be a one man shop though. Get yourself an Interactive Brokers account, learn pyalgo, and off you go.
I would caution people not to believe it is that simple to actually make money like this.
Agreed. It's a quick way to go broke if you don't know what you're doing.
That's a great way to give your hard-earned money to large trading firms.
Are you sure you understood the site? You don't give them money or anything.
I am pretty sure you are misunderstanding the comment.

The comment is not "By signing up for Quantopian, you pay a membership fee to Quantopian, which is run by large trading firms."

The comment is "Large trading firms make money by being the counterparty of day traders making mistakes." When you mess up - which isn't just losing USD, it is underperforming the market's usually-positive return - there is someone out there on the other side of the trade who is buying what you sold and selling what you bought. When you underperform the market, they overperform the market by exactly as much.

Okay... but on Quantopian you just write algorithms, you don't have to invest your own money. You don't mess up and lose USD, you don't pay a membership fee.

The site's owners decide to invest with the best algorithms on the site, and you get a portion of the returns if they choose yours.

> When you mess up - which isn't just losing USD, it is underperforming the market's usually-positive return - there is someone out there on the other side of the trade who is buying what you sold and selling what you bought. When you underperform the market, they overperform the market by exactly as much.

If I trade profitably but make below-market returns, the notional someone else who is taking the opposite side of my trades is not outperforming the market by as much as I am underperforming; they are losing money and, thereby, underperforming even worse than I am.

There are obviously people in the market overperforming, but it's not someone taking the opposite of my positions that is doing it.

Counter position is hard to define in a large market, so take a market with two stocks A and B. You buy shares of A from counter-position-inc who moves that money into B (possibly buying the shares you just dumped).

Over a time period A goes up 5% and B goes up 10%. You sell your shares in A (profitably) but underperformed the average market return by 2.5%. counter-position-inc sells its shares in B, overperforming the market average by 2.5%.
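
The arithmetic in the two-stock example, spelled out. The one assumption here is that "the market" means an equal-weighted average of A and B, which is what the 2.5% figures imply.

```python
# Returns over the period for the two stocks in the example above.
return_a = 0.05   # you hold A
return_b = 0.10   # counter-position-inc holds B

# Assumption: the market return is the equal-weighted average of A and B.
market_return = (return_a + return_b) / 2

your_excess = return_a - market_return
counterparty_excess = return_b - market_return

print(f"market: {market_return:.1%}")
print(f"you: {your_excess:+.1%}")
print(f"counter-position-inc: {counterparty_excess:+.1%}")
```

Both sides made money in absolute terms, but the two excess returns cancel out, which is the sense in which the counterparty gains what you give up relative to the market.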

I think here is an idea that isn't unique to quants/finance: curated datasets.

Hosted data-sets that are fully cleaned, verified and kept up to date. You pay a fee for the feed, which essentially covers initial and on-going curation. Fees would probably be based on usage (dev/test/commercial etc) but also the realistic market value of curation.

There's a mile of difference between a data-set that's been fed through a few cleaners and is 99% right, and one that is thoroughly checked, 99.99% right, and still updated as such with little delay. The former is the "one-man dev looking for easy passive income", the latter is the "quality datasets taken seriously".

Algo trading is about having the edge.

The edge is something you have and other people don't.

Enjoy your feature engineering.

(Meaning: selling the same curated data product to many customers undermines its value. Overpricing it and selling only to the selected few, on the other hand...)

But this is algo trading specifically, the scope for curated datasets is larger.

Plus, what stops anyone building on top of a dataset? If this isn't done, what value do any third parties add?

A dataset sold to many customers doesn't undermine the price charged by the seller, as there would be no competitive advantage by not using it either.

And to clarify the last point:

If I create a dataset for $100, I could sell it to one person for $120 or to 6 people for $20 each. I make the same either way, even if the value to each individual client is reduced; on the other hand, the value to each client versus making their own is (120-100=) $20 in the first case, but (120-20=) $100 in the second, so fewer clients are likely to "roll their own" competing datasets.

Seems like neural networks would have some advantage here...
What sort of datasets do quants and finance need? For finance, I imagine it has to be both accurate and realtime for it to be of any value?
Accurate yes. Realtime not necessarily. EOD risk doesn't need real time market data.
Yes, where are these datasets?
That's been around for 20 years, and comes with its own hardware platform (the Bloomberg terminal).
Do you recommend any open-source tool/stack for complex event processing?
Akka has had a lot of success.