by ahulak 3487 days ago
There are ~100M homes, ~3k counties, ~27k zipcodes, and ~300k neighborhoods and subdivisions that all have different market dynamics. A sample size of 9 makes this analysis completely meaningless.
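
To put rough numbers on the sample-size point, here is a minimal sketch of how the uncertainty of an average shrinks with sample size. It assumes an independent random sample with finite variance, which the 9 homes in the article almost certainly are not, so treat it as a best case. The function name is made up for illustration.

```python
import math

def standard_error_ratio(n: int) -> float:
    """Standard error of a sample mean, expressed as a fraction of the
    population standard deviation (sigma). Assumes n independent draws."""
    if n <= 0:
        raise ValueError("sample size must be positive")
    return 1.0 / math.sqrt(n)

# Compare a 9-home sample with larger (hypothetical) sample sizes.
for n in (9, 100, 1_000, 10_000):
    print(f"n={n:>6}: standard error = {standard_error_ratio(n):.3f} x sigma")
```

With 9 homes the average is only pinned down to about a third of the spread between individual homes, and that is before splitting the sample across markets with different dynamics.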

Equally important, the author of this article doesn't appear to know much about how the real estate data industry works, especially with regard to market trends and how they are used to generate Automated Valuation Models (AVMs for short). That's a bit surprising given the nature of the business he is representing.

The Zestimate is one of a handful of AVMs. It's the only one with a consumer-facing brand, and as a result it sees the most scrutiny, even though the other AVMs on the market are the ones the banks actually use during the appraisal process.

If you think the Zestimate is bad, you should see some of the other commercial AVMs.

The reality is that creating an automated valuation for the ~100M properties in the US is an incredibly difficult task given the availability of data (or lack thereof!). Zillow does a pretty darn good job and has a bit of a data advantage given its incredibly high coverage of for-sale listings (though they lack the experience: some firms have been producing AVMs for decades!).

Another thing to note is that AVMs (like any statistical model) usually aim for a sweet spot that covers as many homes as possible. As a result, ultra high-end properties are often incorrectly valued, e.g. that $25M mansion down the street will be very hard for the model to price. Additionally, certain luxury features, like, say, a tennis court, are almost impossible to incorporate into these models on a national scale.