| HN Mirror


by ericjang 2782 days ago

This is neat! I hope you don't mind a bit of constructive criticism: early in my research career I also thought it would be a good idea to "visualize the neural network connectome" in 3D (I implemented a very rudimentary version of your browser-based visualizer in Qt + OpenGL, with no training frontend).

I then followed up with an early TensorFlow visualizer: https://github.com/ericjang/tdb

While such tools seem useful at first glance, they turn out to be not that helpful to power users. For models bigger than LeNet, things get really ugly to visualize. And once you understand a high-level module and can take its training for granted, there isn't really a need to look at it anymore. It can also be kind of annoying to tumble around in 3D when you just want to look at some activation maps. What does the 3D aspect of the visualization buy you here?

Tools like TensorBoard, plus Jupyter notebooks for inspecting weights and ad-hoc visualizations (e.g. Visdom), seem to strike the right balance.

If you want to continue pushing in this direction, I highly encourage embarking on an actual Deep Learning research project using your tool. In ML it's so important to dogfood your own software!

3 comments

cr0sh 2782 days ago

To kinda build on this, I think where this particular tool could be most beneficial would be for users just learning about neural networks, deep learning, CNNs and the like. That is, newbies to the arena.

Some may not need such a tool, and can easily visualise the layers, interconnections, functionality, etc. Others may be able to get by with 2D and "paper" representations.

I think, though, there may be a segment of learners who could benefit from a tool like this. For such people, the tool wouldn't have to support anything super-complex; smaller models and architectures would be fine.

I don't have a lot of experience with ANNs (just a few MOOCs and tutorials here and there), but from what I recall of those experiences, a tool like this could be beneficial both for visualizing a complete NN graph and for visualizing a partial graph as it is built up, layer by layer (and for investigating "middle layer" operations and processes).


syt123450 2780 days ago

Thanks for your suggestions! TensorSpace is a framework still in development, and I acknowledge that it has some shortcomings. As you mentioned, a big network becomes too long to visualize in TensorSpace. I am trying to figure out how to improve this, for example by adding a "layer group" concept in the future, which may make long networks shorter.

Honestly speaking, there are many mature tools for developers to use in model training, and everyone has their own "optimum scheme". For me, TensorSpace is not designed to be a "silver bullet" that replaces previous solutions or tools. I hope it can help engineers understand existing models and find areas where they can be applied.

Personally, I am interested in data visualization, and I found that neural network visualization is a really cool area. My original intention in making TensorSpace was to share this "beautiful scenery" with hackers!


mendeza 2782 days ago

Are there any good guides, tutorials, or research papers that investigate or advise how to inspect weights during training for debugging? The only things I have read are to watch out for vanishing gradients, and that when fine-tuning, most of the change is seen in layers toward the end of the network, not in the beginning layers.
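
For anyone who wants to see the vanishing-gradient effect concretely, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the network is a toy chain of scalar sigmoid units, and the helper name `layer_gradient_magnitudes` is hypothetical, not part of any library. It just prints the gradient magnitude per layer so you can watch it shrink toward the input.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_gradient_magnitudes(weights, x, target):
    """Return |dLoss/dw_k| for each layer of a scalar sigmoid chain.

    Toy network (an assumption for illustration):
    a_k = sigmoid(w_k * a_{k-1}) with a_0 = x,
    loss = 0.5 * (a_last - target) ** 2.
    """
    # Forward pass, keeping every activation for the backward pass.
    activations = [x]
    for w in weights:
        activations.append(sigmoid(w * activations[-1]))

    # Backward pass: delta is dLoss/d(output activation of layer k).
    delta = activations[-1] - target
    grads = [0.0] * len(weights)
    for k in reversed(range(len(weights))):
        a_out = activations[k + 1]
        a_in = activations[k]
        dz = delta * a_out * (1.0 - a_out)  # back through the sigmoid
        grads[k] = abs(dz * a_in)           # gradient w.r.t. w_k
        delta = dz * weights[k]             # hand off to the layer below
    return grads

random.seed(0)
weights = [random.uniform(-1.0, 1.0) for _ in range(8)]
grads = layer_gradient_magnitudes(weights, x=0.5, target=1.0)
for k, g in enumerate(grads):
    print(f"layer {k}: |grad| = {g:.3e}")
```

In a real framework you would log the same per-layer quantity (for example as a histogram or scalar per layer) every few hundred steps rather than computing it by hand.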


ericjang 2782 days ago

Yes, a recent exciting phenomenon of interest to researchers is how and why the spectrum of the Hessian appears to separate into two parts - a "bulk" part that changes very slowly and "outliers" that change quickly. This suggests that only a few weights in the model actually change during training. If one could determine which weights these are, it might lead to faster and more efficient learning algorithms that don't have to backprop to all the parameters in a large neural network.

https://arxiv.org/pdf/1706.04454.pdf

https://openreview.net/forum?id=ByeTHsAqtX
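
To make the "bulk vs. outliers" idea a bit more tangible, here is a rough sketch in plain Python. It is not the method from the linked papers: it swaps in a tiny logistic regression whose Hessian can be written down exactly, and the data, sizes, and helper names (`logistic_hessian`, `top_eigenvalue`) are all assumptions for illustration. It compares the largest Hessian eigenvalue (found by power iteration) against the average eigenvalue (trace divided by dimension); a large gap hints at an outlier direction.

```python
import math
import random

def logistic_hessian(X, w):
    """Hessian of the mean logistic loss: (1/n) * sum_i p_i (1 - p_i) x_i x_i^T."""
    n, d = len(X), len(w)
    H = [[0.0] * d for _ in range(d)]
    for x in X:
        z = sum(wj * xj for wj, xj in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        s = p * (1.0 - p) / n
        for a in range(d):
            for b in range(d):
                H[a][b] += s * x[a] * x[b]
    return H

def matvec(H, v):
    return [sum(h * c for h, c in zip(row, v)) for row in H]

def top_eigenvalue(H, iters=500):
    """Largest eigenvalue of a symmetric PSD matrix via power iteration."""
    d = len(H)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        Hv = matvec(H, v)
        norm = math.sqrt(sum(c * c for c in Hv))
        if norm == 0.0:
            return 0.0
        v = [c / norm for c in Hv]
    # Rayleigh quotient of the (unit-norm) converged vector.
    return sum(a * b for a, b in zip(v, matvec(H, v)))

random.seed(0)
d, n = 6, 200
# Assumed synthetic data: the first feature has 5x the scale of the others,
# which creates one dominant curvature direction.
X = [[5.0 * random.gauss(0, 1)] + [random.gauss(0, 1) for _ in range(d - 1)]
     for _ in range(n)]
w = [random.uniform(-0.1, 0.1) for _ in range(d)]

H = logistic_hessian(X, w)
lam_max = top_eigenvalue(H)
mean_eig = sum(H[i][i] for i in range(d)) / d  # trace / d
print(f"top eigenvalue:  {lam_max:.4f}")
print(f"mean eigenvalue: {mean_eig:.4f}")
```

For a real network you would not build the Hessian explicitly; Hessian-vector products from an autodiff framework can play the role of `matvec` in the same power iteration.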
