by gabipurcaru 3514 days ago
Why is everyone working on style transfer? It doesn't seem like such an interesting problem in the field, compared to things like speech recognition for example. Is it just because it's a "cracked" problem and it looks nice? I'm just genuinely curious here, not trying to bash the amazing work these people do.
5 comments

Style transfer is part of a new trend concerned with generating content. It is very difficult to generate images or text because the space of possible shapes/messages is enormous and high-dimensional. We know how to classify into 1000 categories (which corresponds to picking a tag from a set of 1000 choices), but painting requires selecting a combination of pixels from a much, much higher-dimensional space. Hence the difficulty.
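
To put rough numbers on that gap, here is a minimal sketch. The 256x256 RGB image with 8 bits per channel is an assumption chosen only for illustration, and `log10_output_space` is a hypothetical helper, not something from any library. It counts distinct outputs in log10 terms so the huge integer never has to be built.

```python
import math

def log10_output_space(num_values: int, num_slots: int) -> float:
    """Return log10(num_values ** num_slots) without building the huge integer."""
    return num_slots * math.log10(num_values)

# Classifier: a single output slot with 1000 possible labels.
classifier_exp = log10_output_space(1000, 1)

# Image generator: ASSUMED 256x256 RGB image, 8 bits (256 values) per channel.
height, width, channels = 256, 256, 3
image_exp = log10_output_space(256, height * width * channels)

print(f"classifier: about 10^{classifier_exp:.0f} possible outputs")
print(f"image:      about 10^{image_exp:.0f} possible outputs")
```

Almost all of those pixel combinations are noise, of course; the hard part is landing on the tiny fraction that looks like a coherent picture.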

But I think that generating in high-dimensional spaces, as in translation, style transfer, gameplay and robotics, is the most interesting part of AI. It is what makes AI appear more intelligent and creative to us. AlphaGo was impressive because it could select move sequences from a space of 10^120 possible combinations (compare that with an ImageNet classifier that outputs from a space of 10^3 labels).

So, in conclusion, it is essential to learn to generate images, text, sounds and behavior or movement that are just as complex and coherent as those created by humans. Being able to do so would get us halfway to AGI: we could have talking, moving robots that are not lame. Remember the latest text-to-speech engine from DeepMind - that's speech generation from a higher-dimensional space. It shows the difference compared to regular TTS.

I don't think anybody is taking it extremely seriously. For Google, it's PR. For individuals working on it, it's fun, interesting, and accessible.
Simply put, it's because apps like Prisma have demonstrated that there are hundreds of millions of people who want this. So developers are following the market demand.
If you can change the style of an image to anybody's style I guess you could:

- take photos and apply the styles of famous photographers

- take your writing and apply the styles of famous writers

- take your code and apply the style of famous coders

etc.

I'd like you to be right, but I don't think you are.

There's a big difference between style transfer in art vs. literature or code. In art, it's ok to get close enough; laymen will forgive a lot of noise. A lossy painting is still a painting.

With great literature, every word is carefully chosen. You can't take something like Franz Kafka and randomly fuzz it; you'll destroy the hidden features that differentiate it from the mediocre.

With code it's even harder. There's almost zero room for noise; a stray period throws it off completely.

There's some recent work on style transfer for sentences. The way someone says something can vary a huge amount between individuals, even if the meaning of the sentence is the same. The hard part is separating meaning from style, which requires a dataset of different sentences with the same meaning. Translation datasets are one possible solution. Different translators might have unique styles and word choices that an NN could learn to separate.
Because it's a way to avoid confronting the increasingly unavoidable fact that the AI renaissance DNNs were supposed to usher in is looking increasingly less impressive. Unsurprising, given that throwing more computing power at neural networks doesn't constitute a fundamental leap forward -- but disconcerting to a community that expected, and promised, far more than is being delivered.
Hold on a second. We're still in the very, very early stages here. We haven't even started to connect those networks together to make hierarchies.

You're speaking like someone watching the Wright brothers testing some of their earliest models, and going "supersonic flight my ass, you guys can't even fly across this football field".

> We haven't even started to connect those networks together to make hierarchies.

What's stopping us at this point?

Nothing, but it might not be the right approach.

We didn't progress from the Wright Flyer by stacking more and more wings on. (Although that path was explored for a couple of decades.)

What exactly do you think was 'promised and expected'? Because from here it looks like deep learning has delivered an awful lot more than what anyone expected. No one expected it to beat Go. No one expected it to achieve human level results on problems like image recognition. And no one expected all this to happen in just a few years.

NNs have made measurable and enormous progress in many different AI domains in a very short space of time. There are awesome new applications and improvements coming out every day.

It's easy to say, from the vantage point of hindsight bias, that everything that's happened was predictable. So what exactly do you expect from NNs and AI in the near future? Make some testable predictions.

I actually agree with you, as I feel that deep neural networks have exceeded expectations, but I like the guessing game, so I'll do a few predictions that, who knows, might be exceeded.

Fully autonomous vehicles (as in, all passengers can sleep) with fewer deaths than human drivers in 2020.

Realtime text-to-speech matching top humans, including proper intonation, in 2025.

Fully autonomous computer factories (as in, trucks deliver raw materials in containers at one location, and fetch the computers in containers at another) in 2035.

Right, making the best Go player in the world and cutting Google's power bill by 40% were huge yawns.
Optimization problems -- the bread and butter of machine learning for years. DNNs are certainly more powerful than many earlier-generation systems, but it's a quantitative difference, not a qualitative one. A DNN may have more neurons, more synapses, and access to more data, but it's not doing anything genuinely new.

A lot of hopes seem (to me) to have been pinned on the notion that neural nets (as we currently understand them) are the one true algorithm. This notion seems to have been fueled by the significant success of DNNs for certain (highly specific) problems, and by a (shallow) analogy with the human brain. However, it's becoming increasingly clear that this is not the case -- that an artificial neural net is an artificial neural net, no matter how many GPUs you throw at it.

From what I understand, the current bottlenecks for machine learning are:

- The lack of good data. Machine learning, and DNNs specifically, perform best with large, labeled datasets. Google has open-sourced some, but they (supposedly) keep the vast majority of their training data private.

- Compute resources. Training on these datasets (which can be terabytes in size) takes a lot of computational power, and only the largest tech companies (e.g. Google, Facebook, Amazon) have the capital to invest in it. Training a neural net can take a solo developer weeks or months, while Google can afford to do it in a day.

There are actually a lot of advances being made in the algorithms, but iteration cycles are long because of these two bottlenecks, and only large tech companies and research institutions have the resources to overcome them. Web development didn't go through a renaissance until reduced server costs (via EC2 and PaaS offerings like Heroku) made web technology affordable and accessible to startups and hobbyists.

By that analogy, I think we're still in the early days of machine learning, and better developer tools and resources could spur more innovation.

I don't have the impression that serious researchers regard them as a One True Algorithm, or as sufficient in their own right for development of human-level AI. Why do you believe that?
I'm not claiming that they do, although AI researchers who focus on DNNs certainly have a vested interest in accentuating their capabilities -- particularly when they have industry ties. I'm referring more to intellectual trends in Silicon Valley at large.
Who promised the renaissance?! We should put them in the stocks and throw overripe fruit at them!