| HN Mirror



	by tbenst 2190 days ago
	First of all, this is very cool. Dunno if the author is on here, but I’m curious why both Flux and Knet are used rather than just one of them (Flux seems the most Julianic?). Also, is this really faster than PyTorch/TF? Last time I benchmarked Flux on non-trivial networks, the speed was quite good with small models, but memory usage was ~5x higher than PyTorch, and I couldn’t fit my models on the GPU with Flux. For large models, I had to compromise on batch size in Julia, although maybe with Zygote.jl the memory issues have been resolved?

4 comments

jonath_laurent 2190 days ago

I suspect Flux/Knet are still slightly slower and less memory efficient than PyTorch/TF, although things are moving very fast here!

This is not relevant to understanding AlphaZero.jl's speed, though. The reason it is much faster than Python implementations is that tree search is also a bottleneck, and Julia shines here!


tbenst 2190 days ago

Ah, I hadn’t appreciated this. Thanks for making & sharing your code!


jonath_laurent 2190 days ago

Author here. AlphaZero.jl does indeed support both Flux and Knet, and users can choose whichever framework they prefer.

As far as I understand, Flux and Knet have different strengths. I think Knet is a bit more stable and mature for large-scale deep learning, but Flux shines for "scientific-ML" use cases where low AD overhead is crucial.


ViralBShah 2190 days ago

While some of these problems may already be addressed and others are being addressed, what would really help us is if people filed issues when they don't find performance to be adequate. If you still have the code handy, please do open some issues.


tbenst 2187 days ago

I ran the test 15 months ago using example code from Metalhead vs. the PyTorch examples repo. Unfortunately, my test consisted of staring at nvidia-smi, so I don’t have the code handy. I believe I benchmarked ResNet.

Edit: I also had a problem with VGG and opened an issue: https://github.com/FluxML/Metalhead.jl/issues/42. Perhaps this has since been resolved.


dunefox 2190 days ago

> 5x higher than pytorch, and I couldn’t fit my models on the GPU for flux. For large models, I had to compromise on batch size in Julia

I had the exact same experience. While I like Julia and Flux, I can't use them in this state for my models.


dklend122 2190 days ago

Would you mind opening corresponding issues on the repo? That would help guide the ongoing compiler work.
