Hacker News
by codingslave 2417 days ago
"you cannot brute force hypotheses", this isn't really true. Credit card data has notorious gaps and bias, but that doesn't mean that an algorithm cannot determine and make decisions about certain situations within that data. For example, if I receive a daily feed file of walmart transactions and the data is increasing in some kind of confidence measure that walmart will beat earnings, I stand to make a good sum by jumping into the market before competitors. It's common that all competitors are aware of the situation, aware of the possible alpha, and competing on speed/accuracy for it. So the superior ability of my model to take a calculated risk from incomplete data (as well as combine other data sources) is one way for me to make money. I may build a model of the common structure of the transactions, ensuring that any signal coming from the data is a real signal, and not one coming from one of the many data quality issues. In the case that my data quality classifier is pushing out high confidence, the result is saying earnings will beat, and other data sources are saying the same, then my model buys. Completing this kind of analysis by hand is too cumbersome (for my usual length of holding period), the money is in who gets there first. There are many ways, some more conservative.