by ikeboy 3233 days ago
Several flaws immediately jump out, all potentially fatal to the conclusion and title:

1. "We will do so by removing outliers in the 10th and 90th percentiles." You can't remove outliers when investing unless you have a time machine.

2. Their "starting price" includes an average of 10 days, 5 of which are before the analyst releases the recommendation. Again, to buy before the release requires a time machine (or inside information). Suppose a stock is at $50 for 5 days, an analyst releases a $75 target, stock jumps to $60, then hits $70 over the next 12 months. The gain based on the average of $55 would be $70/$55 - 1 = 27%. But the gain based on the price you could actually buy at is $70/$60 - 1 = 17%.
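
To make the arithmetic in point 2 concrete, here is a minimal Python sketch using only the hypothetical prices from the example ($50 before the release, $60 after, $70 a year later). The helper name `simple_return` is made up for illustration and is not taken from the article's code.

```python
def simple_return(entry_price: float, exit_price: float) -> float:
    """Fractional gain from buying at entry_price and selling at exit_price."""
    return exit_price / entry_price - 1.0


# Hypothetical prices from the example in point 2.
pre_release = 50.0   # price on the 5 days before the recommendation
post_release = 60.0  # price on the 5 days after the recommendation
exit_price = 70.0    # price 12 months later

# A 10-day window centered on the release averages both price regimes.
window_average = (5 * pre_release + 5 * post_release) / 10

reported = simple_return(window_average, exit_price)
tradable = simple_return(post_release, exit_price)

print(f"start price from 10-day average:   {window_average:.2f}")
print(f"return vs. averaged start price:   {reported:.1%}")
print(f"return vs. price you could buy at: {tradable:.1%}")
```

The gap between the two figures comes entirely from the 5 pre-release days, which no one acting on the recommendation could have traded on.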

3. They narrow down analysts apparently a second time to pick out only the top 10, which seems to be distinct from the first outlier removal (not entirely clear what's being removed in either case.)

I see no statistical tests relating to removing outliers, significance, etc. This is all besides the considerable degrees of freedom (cutoff price, cutoff marketcap, 100 rating minimum for analysts which seems to have been implemented after collecting data, etc).

8 comments

> you can't remove outliers

To drive home this point, portfolio returns are driven at the margin. A minority of positions dictate the majority of performance.
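
A rough sketch of why this matters, using made-up returns for 20 positions rather than anything from the article's data. The helper `trim_tails` is hypothetical and drops the lowest and highest 10% of values by count, which is only one guess at what "removing outliers in the 10th and 90th percentiles" means.

```python
from statistics import mean


def trim_tails(values, fraction=0.10):
    """Drop the lowest and highest `fraction` of values, by count."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)
    return ordered[k:len(ordered) - k] if k else ordered


# Synthetic one-year returns for 20 positions (made-up numbers):
# mostly small moves, one large loser, two large winners.
returns = [
    -0.30, -0.12, -0.08, -0.05, -0.04, -0.03, -0.02, -0.01, 0.00, 0.01,
    0.02, 0.03, 0.04, 0.05, 0.06, 0.08, 0.10, 0.15, 0.60, 2.50,
]

full_mean = mean(returns)
trimmed_mean = mean(trim_tails(returns))

print(f"mean return, all positions: {full_mean:.2%}")
print(f"mean return, tails trimmed: {trimmed_mean:.2%}")
```

In this toy portfolio most of the average return sits in the two positions that trimming throws away, so the trimmed mean describes a portfolio nobody could actually have held.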

This has the Texas Sharpshooter fallacy all over it. Shoot at the side of a barn and then put the bullseye where the most holes are.

Your number 3 is basically "the ten people who did the best did better than everybody else." Really? You don't say...

This is nothing like the sharpshooter fallacy. The analysis determined the average performance of the analysts with >100 stocks rated and 10 analysts out of 16 did better than the rest.
What? They removed all the outliers. They cooked the data to say something that made sense instead of something that was accurate.
To be fair, looking at real data if you look at the 20 best analysts _one year ago_ they made a pretty decent margin (the article doesn't cover this). I can see if I can dig up some old charts if you're interested.
There are a couple of services which collect analyst ratings and at least in the US and Europe, they have gone to great lengths to be nice to analysts so analysts will enter their ratings into their systems or banks will integrate with them. One of the things that have been built into most systems is the ability to "correct" so called data-entry errors. This functionality is heavily abused with analysts revising bad calls to make themselves look better than they actually are.

One will find fairly large differences between data sets that collect analyst ratings and "corrections" as they happen (which you are going to have to collect yourself) and analyst ratings as reported by the various services and the differences almost always make the analysts seem more accurate.

It is possible things are different in Canada, but if you look at bias-free data for the US or Europe, analyst recommendations are mostly random. Analyst recommendation data is mostly useful so you can bet against the short term spike or drop caused by people reacting to a change in analyst recommendation. This was very effective up to the 2000s but doing this with a reasonable Sharpe Ratio today is very hard (otherwise I wouldn't write about it) though it may be useful as an add on to an otherwise profitable strategy.

TipRanks was founded (I'm no longer affiliated) to solve that. Analyst opinions are extracted from real world news rather than paid for.

The point was to hold analysts accountable for their actions by objectively collecting them. TipRanks sells this data and I'd love to run a study on it but unfortunately I no longer have access.

The Marketbeat data set used here is tiny in comparison and only consists of a third of analyst recommendations if I remember it correctly.

Bloomberg has a much better dataset which is nearly as good as TipRanks'.

Again, I can't really make claims on their behalf (I'm employed at Peer5 (YC@W17) at the moment and am no longer affiliated).

Also, can confirm that TipRanks never "correct"ed any errors - which often resulted in "cease and desist"s and legal threats. No analyst who loses money likes a site saying it like it is.

The legal threats held no merit - but to be fair they were very scary as a young startup before partnerships with E*Trade, Nasdaq and others started.

Yep, this is like coming up with a stock algorithm that says "every tech stock goes up on average 20% in the first week of September unless it's an even year and the stock starts with the letter T."
I upvoted the parent post ONLY so people could read this comment. I wish there were a way to tag parent posts as "wrong."

As an antidote, take a look at this short McKinsey piece analyzing the performance of sell-side analysts, concluding that "earnings forecasts exceed realized earnings per share:" http://www.mckinsey.com/business-functions/strategy-and-corp...

Obviously, predicting company earnings is not the same as predicting stock prices, but if sell-side analysts have such a terrible track record at the former, one would expect them to have a terrible track record at the latter too.

I will agree that the methodology is not as rigorous as it could be but where can you prove it is "wrong"?

My blogpost shows that stock price predictions also show a terrible track record. They are wildly off and on average higher than actual results.

You are the one making a controversial (I'd say fantastical) claim, so you are the one who has to prove it's "right."

Here are some basic questions for you, related to the points made above:

* WHY did you remove outliers in the 10th and 90th percentiles? What happens if you don't remove them?

* WHY did you use a 10-day window centered on dates of recommendation? What happens if you use the price on the same day?

* Why did you choose those return horizons? What happens if you choose different ones?

* WHY did you pick out only the top 10 analysts? What happens if you don't?

* WHY did you not do statistical tests relating to removing outliers, significance, etc.?

* WHY did you choose those cutoffs for price, marketcap, and minimum analyst rating?

Outliers were removed to get a better measure of the "accuracy" of the price targets.

10-day windows were used to reduce the amount of volatility/noise in a time frame.

A return horizon of 1 year was used because price targets are for one year.

There are only 15 or so analysts I looked at.

I was doing this as an exploratory data analysis and didn't want to pull out my old stats textbook.

Cutoffs were chosen to reduce volatility of measurements since I was looking at percentages. A stock going from $1.5 to $2.0 is a 33% increase whereas the movement of $100 to $133 is significantly more impactful. Stocks with lower market caps have more volatility. The minimum analyst rating was chosen to eliminate analysts with a very small number of ratings as they would be unreliable.

Even though there may be some flaws in the stock price basis etc., the pandas and Python code and the methodology of calculating 'returns based on analysts' 12-month price forecasts' are very valuable.

One can easily tweak the code to suit one's needs. I appreciate the code author's effort. The pandas and Python code can be used as a springboard to further one's research.

I've never seen a backtest that wasn't amazing. Problem is that there are so many errors, mistakes, and statistical biases that creep into a study. Without a proper review by multiple people before publishing, chances are you will be plain wrong. I subscribe to the strong EMH. The only way to make money in the markets is to either insider trade (which will certainly land you in jail) or be a service provider.
Funds like Renaissance consistently make money
1. Removing outliers was for making the data easier to analyze as some outliers were skewing the average. Of course when investing you cannot ignore outliers, but you could possibly curb them with stop losses/stop limits.

2. A more in-depth analysis could be done on the effect of analyst releases on prices, but assuming that this does occur, the performance is understated, which provides further evidence for the conclusion that outperform ratings can do better than the market.

3. Not sure how narrowing down the top analysts is a flaw here.

This blogpost is probably not as mathematically rigorous as it could be as I just wrote it as an exploratory analysis for fun and out of curiosity.

The problem is that "making the data easier to analyze" may make the analysis invalid. Your response is not increasing my faith that you carefully considered what removing the outliers would do to the validity of the analysis.

> Not sure how narrowing down the top analysts is a flaw here.

Potentially, because how did you decide what "top analysts" were? If it's using the same methods you used to determine they were successful, it just means analysts that come out of your math come out of your math.

If 50K people flip 10 coins, one of them might flip 10 heads. It doesn't mean that person is better at flipping heads. We could in fact calculate the chances of one of 50K people flipping ten heads. If I decided it meant that some people really were better at flipping heads, I'd probably be wrong. (Although if I calculated the chances and discovered it was like a one in a bazillion chance that even one of 50K people would flip ten heads... I'd probably at least consider that they might be better at flipping heads! But I'd probably run the experiment again. :) )

If I pick the top 100 heads-flippers from my 50K coin flippers, and show that they really are better at flipping heads because they flipped more heads in the same dataset that I used to pick them as the top 100 heads-flippers in the first place --- I haven't really shown that at all. By "narrowing down top analysts", depending on how you did it, it's possible you simply found the analysts who got lucky, while ignoring the ones who didn't.
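
The selection effect described here can be simulated in a few lines. This is a sketch under the coin-flip assumption only (fair coins, no skill at all); the variable names and the fixed seed are arbitrary choices for illustration.

```python
import random


def count_heads(rng: random.Random, flips: int = 10) -> int:
    """Number of heads in `flips` tosses of a fair coin."""
    return sum(rng.random() < 0.5 for _ in range(flips))


rng = random.Random(0)  # fixed seed so the run is repeatable
n_people, n_top = 50_000, 100

# Round 1: everyone flips 10 coins; rank people by heads count.
first_round = [count_heads(rng) for _ in range(n_people)]
ranked = sorted(range(n_people), key=lambda i: first_round[i], reverse=True)
top = ranked[:n_top]

# "In-sample": score the top 100 on the same flips used to pick them.
in_sample = sum(first_round[i] for i in top) / n_top

# "Out-of-sample": the same 100 people flip 10 fresh coins.
out_of_sample = sum(count_heads(rng) for _ in top) / n_top

print(f"top {n_top}, heads on the flips used to select them: {in_sample:.2f}")
print(f"top {n_top}, heads on a fresh set of flips:          {out_of_sample:.2f}")
```

Scored on the selection data the "top flippers" look exceptional; scored on new flips they drift back toward 5 heads out of 10, which is what a held-out period would be checking for in the analyst data.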

Statistical analysis is _tricky_.

If 50,000 people each flip 10 coins, it's actually overwhelmingly likely that someone will get 10 heads. The chance that it doesn't happen is about one in a sextillion (10^21).
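
A quick check of that figure, assuming fair coins and independent flippers. `math.log1p` is used so the tiny probability is computed without losing precision.

```python
import math

n_people = 50_000
p_ten_heads = 0.5 ** 10  # chance one person flips 10 heads: 1/1024

# P(nobody flips 10 heads) = (1 - 1/1024) ** n_people, computed in log space.
log_p_nobody = n_people * math.log1p(-p_ten_heads)
p_nobody = math.exp(log_p_nobody)

# Expected number of people who flip 10 heads.
expected_winners = n_people * p_ten_heads

print(f"P(nobody gets 10 heads) = {p_nobody:.2e}")
print(f"that is about one in {1 / p_nobody:.2e}")
print(f"expected number of 10-heads flippers = {expected_winners:.1f}")
```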
It's a completely different matter. As far as I know, analysts do their analysis when pricing stocks. They may not be very good at it, or may not discount all the factors, but analysing the products of some companies, their revenues, returns and other parameters seems extremely different from flipping coins to me. So your analysis has no basis whatsoever, given that you are comparing a completely random outcome of a well-known physical action to a chaotic system (the market) in which at least the basic factors influencing its constituents are well understood. Or are you suggesting, for example, that Warren Buffett has just been lucky for decades on end, and that you and everyone else know how to at least equal his performance? If that is what you think, then please, I would be rather amused to see your performance as an investor compared to his over the course of several decades.
Well, see, that's the whole deal, investigating _how_ different it is than flipping coins. That's the whole question, really. Starting with the assumption that they _must_ be doing better than chance is not the right place to start in order to analyze if they are or not.

Most statistical analysis is about trying to distinguish meaningful results (implying a repeatable correlation of some kind that means something), from random chance with no meaning. The whole point is you _don't_ start out knowing if the thing you are investigating is random chance or not, if you did, you wouldn't need to analyze it. That's what statistical analysis is for. In part because we humans are really really good at finding patterns and assuming a meaningful correlation when in fact it's just random chance.

The coin example is useful because we all know (or define for the sake of the discussion) that it must be random chance, so any analysis that appeared to say it wasn't is probably in error. And using the same sort of analysis on something where you don't know how much of the effect is due to random chance--is not going to answer the question.

Funny you mention Buffett: he's about to win his bet that a set of many hedge funds will fail to beat the market over a decade.
I mention him because, apparently, for the parent message he is only a coin thrower who can give us insight only into the percentage of people that can get 10 heads in a row.
If someone tried to use Buffett as an example of beating the market but did zero statistical analysis to determine how likely it is to be actual skill, they would also deserve to be dismissed.
I probably should have included the outliers when analyzing overall performance but if I recall correctly they did not have a significant effect.

The top analysts were determined by the average performance from one year after their ratings have been made. This isn't the top analysts out of 50,000 it's the top out of 50 or so analyst-rating pairs. There were only 16 or so analysts in total that I looked at. This isn't an instance of survivor bias as your example states. If I were to be more rigorous I could give a statistical test for this.

Looking for top performers is always invoking survivor bias. It's a classic data snooping issue where the common sense approach is exactly wrong, but it'll sell a lot of books and it'll convince people that you know what you're doing as a stock analyst when it's just random luck.
Top 10 performers out of 16 or so analysts in the analysis is not survivor bias.
Sure it is, it was survivor bias when you selected the 16.
If you're ultimately making a claim that strategy X beats the market, you need to make sure strategy X is something you can implement without time travel.

For 2, it's not just the effect of the release on prices, but could also be the effect of other things on both releases and prices - imagine great news came out which bumped up the price and also caused analysts to upgrade their ratings.

You should use price data from the day after release.
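
As a sketch of what that could look like, the snippet below picks the first trading day strictly after the release date as the entry point. The dates and prices are hypothetical (reusing the $50 to $60 jump from the earlier example), and `entry_price_after` is an illustrative helper, not part of the article's code.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical daily closing prices; weekends have no entry.
closes = {
    date(2017, 3, 1): 50.0,
    date(2017, 3, 2): 50.0,
    date(2017, 3, 3): 50.0,  # recommendation released on this day (a Friday)
    date(2017, 3, 6): 60.0,  # next trading day, after the weekend
    date(2017, 3, 7): 61.0,
}
trading_days = sorted(closes)


def entry_price_after(release_day: date):
    """Return (day, close) for the first trading day strictly after release_day."""
    idx = bisect_right(trading_days, release_day)
    if idx == len(trading_days):
        raise ValueError("no trading day after the release in the data")
    day = trading_days[idx]
    return day, closes[day]


entry_day, entry_price = entry_price_after(date(2017, 3, 3))
print(f"release: 2017-03-03, entry: {entry_day}, entry price: {entry_price:.2f}")
```

Using `bisect_right` rather than a fixed "+1 day" offset skips weekends and holidays, so the entry price is always one that existed after the recommendation was public.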

For top analysts, if you only know who's top after looking at their performance, then it's again not a repeatable strategy. Compare: "you can beat the market just by buying the top 10 stocks!"

It's nice to play with data, I'm just laying out some of the reasons these won't work in the "real world", and pointing towards where a future analysis could be improved.

You're correct that I should have accounted for outliers when measuring performance of this strategy. If I recall correctly, even with the outliers, they did not significantly affect the average performance of analyst ratings. I would have to rerun the numbers though.

The analysis was more about measuring the performance of analysts, which is why I used the price data from before and after the recommendation. For practical purposes of using this strategy, you are right that the price data from days after release would be better.

If the top 10 stocks you picked beat the market and have consistent earnings and dividends over a period of time, would this not be a repeatable strategy?

Point is, you don't know which were the top ten until after the fact.
How do you measure top ten if you don't have a ranking system?