by Florin_Andrei 3566 days ago
I'm not sure which TensorFlow source you're compiling, but I've tried many times recently and it fails in many, many different ways. It's a never-ending maze of fail, basically; I have yet to see the end of it. It failed again today, so the code base is not getting better.

I'm using Ubuntu 16.04, CUDA 8.0RC + the gcc patch, cuDNN 5.1, nvidia-driver-[367|370], tensorflow-master, python-2.7. My process is basically identical to yours.

A few issues are listed here:

https://github.com/tensorflow/tensorflow/issues/2559#issueco...

In some cases, Bazel seems to be the culprit. In other cases, it's Tensorflow itself. I've also seen a "gcc: internal compiler error" https://github.com/tensorflow/tensorflow/issues/4214

Some issues with your howto:

There's a chapter titled "Install Nvidia Toolkit 7.5 & CudNN", but the instructions below it use 8.0RC.

```
Configure TensorFlow Installation

$ cd ~/tensorflow
$ ./configure

Use defaults by pressing enter for all except:

Please specify the location of python. [Default is /usr/bin/python]:
```

No. If you do that, it won't compile with GPU support. You have to hit Enter on every question except these:

- Do you wish to build TensorFlow with GPU support? (answer: y)

- Please specify a list of comma-separated Cuda compute capabilities you want to build with. (answer: 6.1, or less for older GPUs)

- Please specify the Cuda SDK version you want to use, e.g. 7.0. [Leave empty to use system default]: (answer 8.0)

You don't have to specify the cuDNN version; apparently it is detected automatically. Only the CUDA version detection fails: https://github.com/tensorflow/tensorflow/issues/3985
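For anyone scripting this, here is a minimal sketch of the three non-default answers above expressed as environment variables. It assumes the `./configure` script in your checkout reads `TF_NEED_CUDA`, `TF_CUDA_VERSION` and `TF_CUDA_COMPUTE_CAPABILITIES` before prompting (I believe the script of that era did, but check your copy). The helper name `configure_env` is made up for illustration. cuDNN is deliberately left unset so it can be detected automatically.

```python
import os


def configure_env(cuda_version="8.0", compute_capabilities="6.1", base_env=None):
    """Return a copy of the environment with the three non-default answers set.

    Assumption: ./configure checks these variables and skips the matching
    prompt when they are already set. Verify against your checkout.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["TF_NEED_CUDA"] = "1"              # GPU support: y
    env["TF_CUDA_VERSION"] = cuda_version  # CUDA SDK version: 8.0
    # 6.1, or lower for older GPUs
    env["TF_CUDA_COMPUTE_CAPABILITIES"] = compute_capabilities
    return env


if __name__ == "__main__":
    # Only print the settings; nothing is executed here.
    env = configure_env()
    for key in ("TF_NEED_CUDA", "TF_CUDA_VERSION", "TF_CUDA_COMPUTE_CAPABILITIES"):
        print(key, "=", env[key])
```

The returned dict can be passed as `env=` to `subprocess.run(["./configure"], ...)`. Any question not covered by these variables would still prompt interactively.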

"You must also have the 361.42 NVidia drivers installed"

No, that would not work with Pascal GPUs.

The only way I've seen it work is if you install CUDA 7.5 and cuDNN 4, and install Tensorflow from the binary package. But then you get weird errors if you run complex models on Pascal GPUs, because CUDA 7.5 doesn't work well with Pascal.

Seriously, if you made it work on Ubuntu 16.04 with CUDA 8 and it's GPU enabled, please upload the pip package somewhere. I'd love to give it a try.


Just follow the instructions, friend.
I may go ahead and follow your instructions to the letter. However, looking at your process, AFAICT without actually running it, it's exactly what I already do, step by step.

It's also the fact that it fails in so many different ways. Bazel bombs out after ./configure. The master branch today does not even begin to build, and the old Bazel workaround no longer works. Then there are the gcc issues.

You may have gotten lucky once, who knows why.

Again, do you still have the pip package you claim you've built using this procedure? If so, can you upload it somewhere? I would very much like to test it. Thank you.

https://www.dropbox.com/s/0cdoy7e8xh54wrx/tensorflow-0.10.0r...

Here is the pip3 wheel, but I am skeptical it will work for you, given that it was built for my system.
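Part of why a wheel built on someone else's machine may not install is visible in the filename, which encodes the Python, ABI and platform tags it was built for. A minimal sketch, assuming the standard wheel naming scheme (`name-version[-build]-python-abi-platform.whl`). The filename used below is a made-up example, not the Dropbox file, and `parse_wheel_tags` and `matches_interpreter` are hypothetical helpers that only handle plain CPython tags such as `cp35`.

```python
import sys


def parse_wheel_tags(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when the optional build tag is present
        raise ValueError("unexpected wheel filename: %r" % filename)
    return {
        "name": parts[0],
        "version": parts[1],
        "python": parts[-3],
        "abi": parts[-2],
        "platform": parts[-1],
    }


def matches_interpreter(tags, version_info=None):
    """True if a CPython-specific python tag matches the interpreter version."""
    major, minor = (version_info or sys.version_info)[:2]
    return tags["python"] == "cp%d%d" % (major, minor)


if __name__ == "__main__":
    # Hypothetical filename, for illustration only.
    tags = parse_wheel_tags("example_pkg-1.0-cp35-cp35m-linux_x86_64.whl")
    print(tags["python"], tags["abi"], tags["platform"])
    print("matches this interpreter:", matches_interpreter(tags))
```

Even when the tags match, a GPU build still links against whatever CUDA and cuDNN libraries were on the build machine, which the filename does not capture.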

The pip package is building right now; I'll post a Dropbox link soon.
Sorry, I missed that other 7.5 reference and have replaced it with 8. I copied and pasted my 7.5 instructions because they were mostly the same.
And yes, sorry, I thought answering yes for GPU support was obvious and did not see the need to spell that one out.

There is also a link to the list of CUDA compute capabilities in the post.

"No, that would not work with Pascal GPUs."

I cannot test this, but you could probably use whichever driver version works for your GPU and make sure the CUDA run file does not install different drivers.