by quotedmycode 3980 days ago
From what I understand, they never implemented any of the algorithms, because their existing one was good enough and the premise had changed. It used to be that you'd make suggestions so people could get DVDs sent to them, and you'd know they would enjoy them. Now, with streaming, the cost of sending something you don't like is low. So the recommendation engine just has to come up with something you'd try to watch and perhaps enjoy. If you start watching something and it's not to your taste, you switch streams. So there's not a lot to gain from improving recommendations by 10% unless the recommendations were pretty low quality to begin with.
4 comments

I think the cost of postage/fulfillment/bandwidth should be a secondary concern to how many users Netflix can attract by providing content they actually enjoy. A bad recommendation, whether for streaming or DVD, wastes n minutes of viewing time. A bad DVD recommendation additionally wastes 2-6 days of waiting on USPS.

They already have the recommendation engine built and they need every differentiation they can get to compete with the crowd of streaming providers.

I completely agree. Real hype is generated when Netflix consistently nails recommendations. Content is still king, but curation of that content is the engine that drives the whole thing.
I agree about your "real hype" statement. There is always the potential for a provider to build business value by nailing recommendation, especially when it comes to lesser known content. Most viewers have heard of big-budget films they plan to watch, or they'll watch any movie with <insert actor>. However, if someone created a system that did this well enough for users to trust it with unknown content, it would go a long way toward meaningful differentiation.

Right now, I don't know of anyone who has limited spare time who would risk trying something unknown just because it shows up in their Netflix recommendations. If Netflix got those right frequently, people would rely on it more and probably enjoy it more. A 10% increase in accuracy has a decent chance to push beyond the invisible "good enough to trust" threshold and create an experience other streaming services just don't have.

A lot of Netflix's success comes from finding great, unknown movies and allowing you to discover them. They're great for Netflix because they're cheap for them and provide high value.

When Netflix users decide what movies to watch based only on the titles they recognize from marketing campaigns, their disappointment with Netflix's selection will be higher.

Their optimal strategy is to have a recommendation engine that routes users to good, but unknown movies that they know people end up enjoying if they just take the dive to start watching them.

In Search of General Tso.
> So the recommendation engine just has to come up with something you'd try to watch and perhaps enjoy.

Heck, if it did that I'd be happy. I don't even know why I still have my Netflix subscription. The selection is basically as good (bad) as Hulu and Crackle, the price is higher, and the recommendations are laughable (right now they're pushing: Velvet, a TV series about a Spanish fashion house; Sense8, the latest lameness from the Wachowski brothers; Elsa & Fred, aimed at old people; a Robert Pattinson vehicle; Orange Is the New Black; The Butler; and piles of other stuff I would never watch if you tied me to a chair).

Frankly, I don't believe they have a recommendation engine anymore: I think they just suggest whatever is cheapest for them to play.

There was a parsimonious model that got to 8% improvement. We know this as it was posted publicly by "Simon Funk" [0] after he decided to give up halfway through the competition.

This simple but more accurate model would have been a useful foundation for Netflix to build its own internal models on, and would have highlighted the flaws in the assumptions of the Cinematch algo.

[0] http://sifter.org/~simon/journal/20061211.html

That looks interesting. I wonder if nowadays that result could be replicated with readily available algorithms from scikit-learn.
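
For anyone who wants to poke at it, here is a minimal sketch of the general idea as I understand it from the linked post: learn a small vector per user and per movie by gradient descent on the observed ratings only. This is not Funk's actual code. The ratings are made-up toy data, the names (`funk_style_sgd`, `rmse`) and hyperparameters are my own assumptions, and all factors are trained jointly for brevity. It uses plain NumPy because, as far as I know, scikit-learn's matrix factorization estimators treat unobserved entries as zeros rather than as missing.

```python
import numpy as np

def funk_style_sgd(ratings, n_users, n_items, n_factors=2,
                   lr=0.02, reg=0.02, n_epochs=300, seed=0):
    """Fit user/item factors by SGD over observed (user, item, rating) triples.

    Hypothetical helper for illustration; hyperparameters are guesses.
    """
    rng = np.random.default_rng(seed)
    P = rng.normal(0.0, 0.1, (n_users, n_factors))  # user factors
    Q = rng.normal(0.0, 0.1, (n_items, n_factors))  # item factors
    mu = float(np.mean([r for _, _, r in ratings]))  # global mean rating
    for _ in range(n_epochs):
        # Visit the observed ratings in a fresh random order each epoch.
        for idx in rng.permutation(len(ratings)):
            u, i, r = ratings[idx]
            err = r - (mu + P[u] @ Q[i])
            pu = P[u].copy()  # keep the old value for the item update
            P[u] += lr * (err * Q[i] - reg * P[u])
            Q[i] += lr * (err * pu - reg * Q[i])
    return mu, P, Q

def rmse(ratings, mu, P, Q):
    """Root mean squared error of the predictions on the given triples."""
    errs = [r - (mu + P[u] @ Q[i]) for u, i, r in ratings]
    return float(np.sqrt(np.mean(np.square(errs))))

# Toy data (invented): users 0-1 prefer items 0-1, users 2-3 prefer items 2-3.
ratings = [
    (0, 0, 5), (0, 1, 4), (0, 2, 1),
    (1, 0, 4), (1, 1, 5), (1, 3, 2),
    (2, 0, 1), (2, 2, 5), (2, 3, 4),
    (3, 1, 2), (3, 2, 4), (3, 3, 5),
]

mu, P, Q = funk_style_sgd(ratings, n_users=4, n_items=4)

# Baseline: predict the global mean for every rating (all factors zero).
baseline_rmse = rmse(ratings, mu, np.zeros_like(P), np.zeros_like(Q))
model_rmse = rmse(ratings, mu, P, Q)
improvement = 1.0 - model_rmse / baseline_rmse  # relative RMSE reduction

print(f"baseline RMSE {baseline_rmse:.3f}, model RMSE {model_rmse:.3f}, "
      f"improvement {improvement:.1%}")
```

Note that this measures error on the training ratings, so the improvement it prints says nothing about held-out accuracy. A real replication would score against a held-out set, which is how the contest percentages were measured.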