Hacker News
by api 4118 days ago
The failure is the failure to generate novelty or adequately explore alternatives.

Enough people might reliably go see 25 sequels, but none of those films will be memorable. None will advance the art of film-making. None will change anyone's life or mentality or affect the culture in any meaningful way.

Being data driven means chasing the biggest, loudest signal in your data set. It means pandering to that signal, because it swamps all others. A data-driven approach is not going to lead you anywhere new.

In machine learning / AI we refer to such algorithms as "greedy." A classic example would be simple hill climbing (gradient descent is its close cousin, walking downhill on a loss instead of uphill on fitness). These algorithms are known to be very good at optimizing within the bounds of a simple, well-behaved, regular fitness landscape, but they readily become stuck at local maxima when presented with a solution space that has any complex structure.

We're living in what I'm tempted to call the dark age of the local maximum, the age of gradient descent.
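
To make the local-maximum point concrete, here is a minimal sketch of greedy hill climbing on a made-up one-dimensional fitness landscape. The `landscape` values and the `hill_climb` helper are illustrative assumptions, not taken from any real system.

```python
def hill_climb(landscape, start):
    """Greedy hill climbing over a list of fitness values.

    From `start`, keep moving to the better adjacent index until no
    neighbour is strictly better, then return the index reached.
    """
    pos = start
    while True:
        # Neighbours are the adjacent indices that exist in the list.
        neighbours = [i for i in (pos - 1, pos + 1) if 0 <= i < len(landscape)]
        if not neighbours:
            return pos  # single-point landscape: nowhere to go
        best = max(neighbours, key=lambda i: landscape[i])
        if landscape[best] <= landscape[pos]:
            return pos  # no neighbour is strictly better: a local maximum
        pos = best


# Hypothetical landscape: a small hill (peak 3 at index 2) and a taller
# one (peak 9 at index 7), separated by a valley.
landscape = [1, 2, 3, 2, 1, 4, 7, 9, 6]

stuck = hill_climb(landscape, start=0)  # climbs the small hill and stops there
lucky = hill_climb(landscape, start=5)  # happens to start on the tall hill
print(stuck, landscape[stuck])
print(lucky, landscape[lucky])
```

Starting on the left, the climber never crosses the valley, because every step toward the taller hill looks like a loss. Which peak it ends on depends entirely on where it began.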


> Being data driven means chasing the biggest, loudest signal in your data set.

This reminds me of how one time, a professor of mine was discussing a pie chart. He pointed towards the smallest sliver and said "Know what that is? That's opportunity!" He was most interested in the smallest signal because it represented untapped potential. (IIRC the topic was related to replacing oil with renewable energy.)

> The failure is the failure to generate novelty or adequately explore alternatives.

Perhaps you value novelty more than the average person. What might not be "adequate" for you might very well be adequate for a huge portion of people.

Throughout this comment you mention several potential goals of content producers (being memorable, advancing the art, changing someone's life, etc.), but you make no argument for why those goals ought to be prioritized over other goals, other than the implication that you personally prefer those goals.

> Perhaps you value novelty...

> What might not be "adequate" for you...

It isn't about what he values versus what you value. What the author complains about are well-known problems with recommendation engines. Take the naive reco algo - "You just bought a 50 inch Philips TV. You might also like - 50 inch Sony TV, 50 inch LG TV, 50 inch Samsung TV, 50 inch ..." - See the problem? I already bought my fucking TV, you can't expect me to buy more & more of the same or similar goods.

Then there's the CF (collaborative filtering) algorithm & its variants, with well-known problems - namely, they don't actually match content to one's preferences. Typically, one POV dominates across the board due to the sparsity of the matrix, & getting the diversity required for the matrix to fill out takes a long, long time and a very large number of people with diverse opinions. You mistakenly give a five star rating to Godfather & you are bombarded with mafia movies for a long time. You attempt to confuse the system by giving Pretty Woman five stars as well. Then the system tries to gamely proceed by suggesting "Those who watched Godfather AND Pretty Woman are more likely to watch - So I Married an Axe Murderer."

Can't win.
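
A toy sketch of the co-occurrence flavour of collaborative filtering shows how a sparse matrix lets one popular taste dominate. The viewing histories and the `recommend` helper below are made up for illustration; real CF systems are considerably more elaborate.

```python
from collections import Counter

# Hypothetical viewing histories: user -> set of liked titles.
likes = {
    "u1": {"Godfather", "Goodfellas", "Casino"},
    "u2": {"Godfather", "Goodfellas", "Scarface"},
    "u3": {"Godfather", "Casino", "Scarface"},
    "u4": {"Pretty Woman", "Notting Hill"},
}


def recommend(my_likes, likes, top_n=3):
    """Rank unseen titles by how often they co-occur with titles I liked."""
    scores = Counter()
    for other in likes.values():
        overlap = len(my_likes & other)
        if overlap == 0:
            continue  # users with nothing in common contribute no signal
        for title in other - my_likes:
            scores[title] += overlap
    return [title for title, _ in scores.most_common(top_n)]


# One five-star rating for Godfather pulls in only mafia films.
print(recommend({"Godfather"}, likes))
# Adding Pretty Woman barely helps: three mafia fans outvote one rom-com fan.
print(recommend({"Godfather", "Pretty Woman"}, likes))
```

With this data the second call still returns three mafia titles, because the lone rom-com viewer cannot outscore the three users who share the Godfather rating.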

There is auto-complete screenplay software that basically makes a composite of the top 100 best selling screenplays & does what in the industry is called a flip. Namely, change male to female, winner to loser, comedy to tragedy etc. Such data-driven screenplay software might suggest that if you take the ladies from Thelma & Louise & replace them with grizzled old men, you get Unforgiven.

There is lyric generation software with the same flavor, umpteen loop generators & infinite jukeboxes, content recommendation systems along the same lines - since you just starred this code sample on angular, you will enjoy this github repo on react,...

Hopefully you see the downside.

Exactly. I don't want what I already have. Give me what I don't have, or better yet something so creative I don't even know I want it yet.

But that's hard, while stupid hill climbers and infinite monkey machines are easy. It will keep working until people tire of it, which judging from the abysmal sales in music is happening. Won't be long before people also stop going to see Batman Redone.

I understand the general argument, and I don't disagree in general. My intuition is that showing 50 inch TVs to someone who just bought one is not ideal [0]. My point is that the specific examples provided (Hollywood blockbusters, etc.) are not accompanied by any evidence or reasoning to convince me that these industries are not doing a good job of satisfying the market.

[0] That said, I've seen lots of counterintuitive but very real phenomena regarding user behavior, so I won't claim to be that confident about this being ineffective. Perhaps people return TVs a lot and buy other ones. I don't have the data.

The data might just show that offering 50 inch TVs to someone who just bought one is actually a rich opportunity. Perhaps both televisions were stolen in a burglary. Or someone is finally upgrading all their televisions from CRT to flat screen -- maybe they moved from a house to a small apartment. It would not surprise me in the least if people were 10 times more likely to purchase a television having just bought one, compared to individuals randomly selected.

And yet you can find plenty of articles about how procedurally generated game environments create longevity.

Imagine a machine that could make procedurally generated movies. Would that interest you?

Stories have already been distilled to the Seven Basic Plots [1]

Novelty needs repetition in order to be novel.

[1] http://tvtropes.org/pmwiki/pmwiki.php/Main/TheSevenBasicPlot...

It seems clear that any content that advances the art is better than content that doesn't advance the art. I guess that what he meant is that a global maximum is better than a local maximum (which is clearly true), and data-driven vision is a hill climbing strategy, thus locking you into a local maximum.

> It seems clear that any content that advances the art is better than content that doesn't advance the art.

Better in what way?

Let's say Movie A is a well loved blockbuster that millions of people see and enjoy. Movie B is a very mixed piece that isn't really enjoyable to watch, but "advances the art" in some key ways. Movie C is an even more well loved blockbuster than movie A, which even more millions of people will see and enjoy, but that was only made because the director was one of the few that really understood Movie B.

The argument, as I understand it, is that Movie B is somehow objectively better than Movie A and Movie C, because it enables Movie C to exist, even though Movie C isn't actually good, because it doesn't advance the art? That doesn't make sense to me. The journey has value only to the extent that the destination is valuable, no? If C is trash, then what was the point of "advancing the art" enough that we could make C? (Conversely, if we're discussing "advancing the art" in a way that isn't required to make anything anyone wants to watch, then we're clearly not discussing finding a global maximum, right?)

You're retroactively adding new premises (e.g. that Movie C is crap) in order to support your conclusion. Step back from your post for a moment and you'll see it's based on a logical fallacy.

B does not exist for the sake of C, which is profiteering from B.

B exists for D, which will be better than B.

Isn't this the history of progress in creative human activities?

Sounds like an exploitation vs. exploration discussion. The correct answer is that each is worthless without the other; executing movies well is worthless without exploring new movie ideas and vice versa.

Given that we've been exploring for quite some time I guess exploitation is generally preferable if we had to choose.

Executing a movie with new ideas well is not an option, as you're relying on the new idea being good, i.e. dumb luck.
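
The exploitation vs. exploration trade-off is usually illustrated with a multi-armed bandit. Below is a minimal epsilon-greedy sketch, assuming fixed, noise-free payoffs so the behaviour is easy to follow; the `payoffs` values and the `run_bandit` helper are hypothetical.

```python
import random


def run_bandit(payoffs, epsilon, steps, seed=0):
    """Epsilon-greedy on arms with fixed (noise-free) payoffs.

    Returns how many times each arm was pulled.
    """
    rng = random.Random(seed)
    estimates = [0.0] * len(payoffs)  # current belief about each arm
    pulls = [0] * len(payoffs)
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try an arm at random, whatever we currently believe.
            arm = rng.randrange(len(payoffs))
        else:
            # Exploit: pull the arm that currently looks best.
            arm = max(range(len(payoffs)), key=lambda a: estimates[a])
        pulls[arm] += 1
        estimates[arm] = payoffs[arm]  # payoffs are deterministic here
    return pulls


# Hypothetical payoffs: arm 0 is the safe sequel, arm 2 the untested hit.
payoffs = [1.0, 0.5, 3.0]

print(run_bandit(payoffs, epsilon=0.0, steps=1000))  # pure exploitation
print(run_bandit(payoffs, epsilon=0.1, steps=1000))  # a little exploration
```

With `epsilon=0.0` the agent pulls arm 0 once, sees a positive payoff, and never tries anything else. With a small `epsilon` it eventually stumbles on arm 2 and spends most of the remaining pulls there, which is the sense in which neither mode is worth much alone.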

> It seems clear that any content that advances the art is better than content that doesn't advance the art.

Perhaps that's true with all else being equal, but clearly all else isn't equal.

> I guess that what he meant is that a global maximum is better than a local maximum (which is clearly true), and data-driven vision is a hill climbing strategy, thus locking you into a local maximum.

My issue is that none of those examples are backed by any evidence that they are not doing a decent job of finding a global maximum.

It's true in the sense that what makes technology and culture interesting is invention, not endless recycling.

You're pretty much suggesting that using strong feedback to force culture to stay within a tiny area of the total possible cultural phase space is just as interesting as allowing chaotic exploration of the entire space.

It's not just an argument against creativity, it's an argument against invention in general.

> My issue is that none of those examples are backed by any evidence that they are not doing a decent job of finding a global maximum.

That's the thing about global maxima - you only know that you've found a global maximum if you've explored the entire space.

Otherwise you've just stumbled across a local attractor, and you're stuck in a loop around it.

This isn't even a good analogy, because cultural attractors are contingent, and they vary over time. They're also unpredictable.

The reason they capture attention isn't because they're maxima in some analytic sense. They become found maxima because they summarise some aspect of human experience, so they appeal to a lot of people at once.

The spectrum of possible maxima is mysterious and not understood, which is how you get - say - Harry Potter coming out of nowhere and captivating a generation.

Culture is music, not a sine wave. You don't just want a single signal - you want a mix of related-but-different signals running all the time.

> It's true in the sense that what makes technology and culture interesting is invention, not endless recycling.

That may very well be what makes technology and culture interesting to you. It's not necessarily what makes technology and culture interesting to everyone, or even a large portion of people.

> You're pretty much suggesting that using strong feedback to force culture to stay within a tiny area of the total possible cultural phase space is just as interesting as allowing chaotic exploration of the entire space.

I made no remarks even remotely suggesting any of those claims.

> That's the thing about global maxima - you only know that you've found a global maximum if you've explored the entire space.

That may be a useful statement for extremely small problem spaces. It's not useful for the problem space of films. There are a lot of different possible 90 minute long 1080p 24FPS 24-bit color films. Good luck performing a search over that problem space.
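
A rough back-of-the-envelope count of that search space, assuming 1080p means 1920x1080 pixels and ignoring audio and compression (both assumptions are mine, not part of the comment above):

```python
import math

# Film parameters from the comment; the 1920x1080 resolution is the
# usual meaning of "1080p" and is assumed here.
minutes, fps = 90, 24
width, height = 1920, 1080
bits_per_pixel = 24

frames = minutes * 60 * fps
bits_per_film = frames * width * height * bits_per_pixel

# Each distinct bit pattern is a distinct film, so there are
# 2**bits_per_film of them. That number is far too large to print,
# so estimate how many decimal digits it has instead.
decimal_digits = math.floor(bits_per_film * math.log10(2)) + 1

print(f"{frames:,} frames, {bits_per_film:,} bits per film")
print(f"number of possible films has roughly {decimal_digits:,} decimal digits")
```

Even the digit count of the number of possible films runs into the trillions, so exhaustive search is not a meaningful option here.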

> That may very well be what makes technology and culture interesting to you.

Do you really prefer static, stagnating cultures to dynamic and inventive ones?

How do you advance art?

I am not very educated in the field of art. I always find orders of magnitude more beauty in nature than in art.

It's possible to make that exact argument in reverse. Those goals are, if not quite universal human values, products of a rich and extremely broad-based cultural tradition.

In contrast, being data-driven means selecting goals that are easy to measure without any attempt to justify them as intrinsically more valuable than other, more difficult to measure goals.

The reality is that goal-selection is always at least somewhat arbitrary, and words like 'data' and 'science' can be and often are used as a cudgel against anyone who might support different (arbitrary) goals.

You can make exceptions to all those rules though. Empire Strikes Back is probably the most obvious example of a sequel that went far beyond the first and changed cinema.

If you're talking about money-making sequels, then Toy Story 2 and 3 were memorable and interesting.

Data is useless for art because art is expected to explore the boundaries of what's acceptable and what's conventional. There is no data for things that haven't even been done yet.

Historical data on sales of paintings would've never told Picasso to pursue Cubism, or Salinger to write 'Catcher in the Rye'.

It was data that caused so many publishing houses to reject future bestsellers. Sales figures for past children's books told publishers that "Harry Potter" would never work.

We all know how that played out.

As for business, existing data would have never told Jobs to build an iPhone - simply because there really was no data on touchscreen devices.

> Historical data on sales of paintings would've never told Picasso to pursue Cubism, or Salinger to write 'Catcher in the Rye'.

In that case I wish art would be data-driven, and produce more of the pre-Picasso paintings that are fun to look at and pre-Salinger books that are fun to read.

> ...25 sequels, but none of those films will be memorable.

* Back to the Future 2
* Indiana Jones and the Temple of Doom
* Army of Darkness
* Goldeneye
* Evil Dead 2
* Terminator 2
* Ghostbusters 2
* Captain America: Winter Soldier
* Batman: The Dark Knight

Yeah, none of those sequels are memorable. We should stop making sequels and only make original films so we don't waste money on travesties such as these.

Those were not sequels planned for this year.

Feh. What you're saying is only true if 100% of decisions are data driven. Just because Hollywood has chosen to make 25 sequels doesn't mean that Sundance no longer exists. It, and countless events and organizations like it, are celebrated inside and outside of Hollywood.

The implications of the hypothetical extension of this trend are clear to a lot of people. As long as some percent of people are pushing the boundary, culture will be fine. Let everyone else enjoy their sequels and reality shows.

disagree about sequels not advancing the art of filmmaking

although the Star Wars prequels were not great IMO, they certainly brought a ton of technological innovations which have had enduring impact on filmmaking in general