by CompleteSkeptic 3175 days ago
An appropriate quote: "If you can't intelligently argue for both sides of an issue, you don't understand the issue well enough to argue for either."

There are many people for whom the declarative paradigm is a huge plus. I would say there are at least 2 major approaches to running fast neural networks: 1. Figure out the common big components and make fast versions of those. 2. Figure out the common small components and how to make those run fast together.

Different libraries have different strengths and weaknesses that match the abstraction level that they work at. For example, Caffe is the canonical example of approach 1, which makes writing new kinds of layers much harder than with other libraries, but makes connecting those layers quite easy as well as enabling new techniques that work layer-wise (such as new kinds of initialization). Approach 2 (TensorFlow's approach) introduces a lot of complexity, but it allows for different kinds of research. For example, because how you combine the low-level operations is decoupled from how those things are optimized together, you can more easily create efficient versions of new layers without resorting to native code.
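To make approach 2 concrete, here is a minimal sketch of a deferred graph of small operations. It is not TensorFlow's API: `Node`, `PRIMITIVES`, `placeholder`, `const`, `op`, `softplus`, and `evaluate` are all hypothetical names made up for illustration. The point is only that a "new layer" can be written as a composition of primitives, while evaluation (and any optimization) happens in a separate step.

```python
import math

class Node:
    """A deferred computation: an op name plus its input nodes."""
    def __init__(self, op, inputs=(), value=None, name=None):
        self.op = op
        self.inputs = tuple(inputs)
        self.value = value   # only used by "const" nodes
        self.name = name     # only used by "placeholder" nodes

# Small primitive ops. A real library would ship optimized kernels for these.
PRIMITIVES = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "exp": lambda a: math.exp(a),
    "log": lambda a: math.log(a),
}

def placeholder(name):
    return Node("placeholder", name=name)

def const(value):
    return Node("const", value=value)

def op(name, *inputs):
    return Node(name, inputs)

def softplus(x):
    """A 'new layer' built only from primitives: log(1 + exp(x))."""
    return op("log", op("add", const(1.0), op("exp", x)))

def evaluate(node, feed):
    """Walk the graph and compute a value.

    Because the graph is plain data, an optimizer could rewrite it
    (for example, fuse ops) before this step without touching softplus().
    """
    if node.op == "placeholder":
        return feed[node.name]
    if node.op == "const":
        return node.value
    args = [evaluate(i, feed) for i in node.inputs]
    return PRIMITIVES[node.op](*args)

x = placeholder("x")
y = softplus(x)                    # builds a graph, computes nothing yet
result = evaluate(y, {"x": 0.0})   # softplus(0) = log(2)
```

Nothing here needed native code to define `softplus`; the cost is the extra indirection between describing the computation and running it.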

After being exposed to several declarative tools during my career, I must say they age poorly: make, autoconf, TensorFlow, and so on. They may start out being elegant, but every successful library is eventually (ab)used for something the original authors didn't envision, and with declarative syntax it descends into the madness of "So if I change A to B here, does it apply before or after C becomes D?"

At least TensorFlow isn't at that level, because its "declarative" syntax is just another imperative language living on top of Python. But it still makes performance debugging really hard.

With PyTorch, I can just sprinkle torch.cuda.synchronize() liberally and the code will tell me exactly which CUDA kernel calls are consuming how many milliseconds. With TensorFlow, I have no idea why it is slow, or whether it can be any faster at all.
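A rough sketch of that timing pattern, assuming PyTorch is installed. CUDA kernel launches are asynchronous, so the clock is only read after `torch.cuda.synchronize()`. The `sync` and `timed` helpers and the `timings` dict are hypothetical names for illustration; without PyTorch or a GPU the sketch falls back to plain wall-clock timing.

```python
import time
from contextlib import contextmanager

try:
    import torch
    HAS_CUDA = torch.cuda.is_available()
except ImportError:  # the sketch still runs without PyTorch
    torch = None
    HAS_CUDA = False

def sync():
    """Block until all queued CUDA kernels have finished (no-op without CUDA)."""
    if HAS_CUDA:
        torch.cuda.synchronize()

@contextmanager
def timed(label, timings):
    """Record the elapsed milliseconds of the enclosed block under `label`."""
    sync()                      # make sure earlier kernels are not counted
    start = time.perf_counter()
    try:
        yield
    finally:
        sync()                  # wait for the kernels launched inside the block
        timings[label] = (time.perf_counter() - start) * 1000.0

timings = {}
with timed("matmul", timings):
    if torch is not None:
        device = "cuda" if HAS_CUDA else "cpu"
        a = torch.randn(256, 256, device=device)
        b = a @ a
    else:
        b = sum(i * i for i in range(10000))  # stand-in workload

for label, ms in timings.items():
    print(f"{label}: {ms:.3f} ms")
```

Wrapping each suspicious region in its own `timed` block gives a per-region breakdown, at the cost of serializing the GPU while measuring.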

I believe that make's declarative nature is not the cause of its problems at all. Its poor syntax and lack of support for programming abstractions are what make it clunky to use.

Something like rake, which operates on the same fundamental principles (i.e. declarative dependency description) but uses Ruby syntax, has aged better.

Indeed. Getting these text-based configuration tools to work requires a lot of experience in language design.

Lots of tools become accidentally Turing complete, like Make. You need to plan these things from the start. If you want any computation possible at all, you need to be extremely vigilant, and base your language on firm foundations. See e.g. Dhall, a non-Turing complete configuration language (http://www.haskellforall.com/2016/12/dhall-non-turing-comple...).

If you are happy to get Turing completeness, you might want to write your tool as an embedded DSL and piggyback on an existing language, declarative or otherwise.
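As a rough illustration of the embedded-DSL idea, here is a tiny make-like task runner written directly in Python. It is a sketch, not how rake or any real tool is implemented: the `task` decorator, the `TASKS` registry, `run`, and the example task names are all hypothetical. Dependencies are still declared, but the host language supplies functions, variables, and abstraction for free.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

TASKS = {}   # task name -> (dependency names, function)

def task(*deps):
    """Decorator that declares a task and the tasks it depends on."""
    def register(fn):
        TASKS[fn.__name__] = (deps, fn)
        return fn
    return register

@task()
def fetch():
    return "fetched"

@task("fetch")
def compile_sources():
    return "compiled"

@task("compile_sources")
def package():
    return "packaged"

def run(target):
    """Run `target` after everything it transitively depends on."""
    # Collect only the tasks reachable from the target.
    needed, stack = {}, [target]
    while stack:
        name = stack.pop()
        if name not in needed:
            deps, _ = TASKS[name]
            needed[name] = set(deps)
            stack.extend(deps)
    # static_order() yields each task after all of its dependencies.
    executed = []
    for name in TopologicalSorter(needed).static_order():
        TASKS[name][1]()
        executed.append(name)
    return executed

executed = run("package")
print(executed)
```

The ordering question ("does A run before or after B?") is answered by the dependency graph rather than by the position of lines in the file.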

SBT in the Scala world would also fit this description.

I took the article to be the counterpoint to the uninhibited praise of TF. In that light, I don't think it was meant as a balanced assessment of the whole product, but had a narrow scope of simply pointing out a handful of flaws that he thinks aren't discussed enough.

It's the same feeling when you hate a movie that everyone gives five stars: you might agree with some aspects of the praise (or even most of it), but that's not what you're going to be talking about. You'll talk about how and why it sucks compared to better movies.

I'd guess he could make a strong pro-TF argument if desired, but that just wasn't the point of this post.

The assumption that there are always two intelligent sides to an issue is a pretty big one. The idea is that if you understand both sides of an issue really deeply and you choose side B over side A, you should still be able to argue intelligently for side A; otherwise your choice of side B was not made intelligently. But this falls down on further examination.

If you believe that side B is correct and side A is incorrect, given your deep understanding of the issue, then an argument for side A is in some way not intelligent, because you must keep your most potent arguments for side B out of your argument for side A. You must deny their existence in your head and thus argue from a less intelligent position than you normally would.

The ability to argue both sides is only really possible when all sides are considered trivial in their differences.

on edit: improved formatting for legibility.

You should still be able to give the other side the best defense imaginable. (See 'steelmanning'.)

ok, I've seen stellmanning? https://www.google.com/search?dcr=0&source=hp&q=stellmanning...

on edit: never mind, I see you mean steelmanning. However, that does not really have anything to do with what I said. You should be able to give someone the best defence imaginable, but what if the best defence imaginable is shit compared to the other side? Then you cannot argue both sides equally, and this does not mean you do not understand either side. It means one side is actually wrong and the other is correct.

Sorry, I can't spell. It's steelmanning. (Edited the other comment.)

The idea is to beat a steelman of the idea. Because that's a greater victory than beating a strawman.

Sure. E.g. you'd be hard-pressed to find good arguments for 2=3 (without resorting to shenanigans around definitions).

"If you can't intelligently argue for both sides of an issue, you don't understand the issue well enough to argue for either."

Please argue the opposite of this before continuing.