| HN Mirror

	by sandGorgon 2672 days ago
	Interesting: there's no scikit support, which has long been the mainstay for data scientists everywhere. Are people migrating from scikit to TensorFlow in production for non-deep-learning use cases?

6 comments

mmq 2672 days ago

I think it should probably support scikit as well as any other library, since it's only making suggestions of hyper-parameters based on recorded/historical observations or random evaluations.

At least that's the behaviour of the platform[1] I am working on.

[1]: https://github.com/polyaxon/polyaxon#hyperparameters-tuning
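To make the "it's only making suggestions" point concrete, here is a minimal sketch of a tuner that knows nothing about the model library. It is not the API of NNI or Polyaxon; `RandomSearchTuner`, `toy_objective`, and the search-space format are all hypothetical names made up for illustration, and the objective is a stand-in for "train a model and return a validation score".

```python
import random


class RandomSearchTuner:
    """Suggests hyper-parameters and records results; independent of the model library."""

    def __init__(self, space, seed=0):
        self.space = space              # name -> list of candidate values (hypothetical format)
        self.rng = random.Random(seed)  # seeded so runs are repeatable
        self.history = []               # (params, score) pairs observed so far

    def suggest(self):
        # Random evaluation: draw each hyper-parameter independently.
        return {name: self.rng.choice(values) for name, values in self.space.items()}

    def record(self, params, score):
        # Store the observation; a smarter tuner would use this history to guide suggest().
        self.history.append((params, score))

    def best(self):
        return max(self.history, key=lambda item: item[1])


def toy_objective(params):
    # Stand-in for fitting any estimator (scikit-learn, TensorFlow, ...) and scoring it.
    return -(params["max_depth"] - 6) ** 2 - abs(params["learning_rate"] - 0.1)


space = {"max_depth": [2, 4, 6, 8, 10], "learning_rate": [0.01, 0.05, 0.1, 0.5]}
tuner = RandomSearchTuner(space, seed=0)
for _ in range(50):
    params = tuner.suggest()
    tuner.record(params, toy_objective(params))

best_params, best_score = tuner.best()
print(best_params, best_score)
```

The tuner only ever sees dictionaries and scores, so swapping `toy_objective` for a scikit-learn fit would not change the tuner at all.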

pplonski86 2672 days ago

I think it all depends on the purpose of the library and who the target user is. NNI is a package for tuning neural network models, so it will mostly be used in cases that require deep neural networks, like image classification or voice recognition.

BTW, I think all autoML solutions forget about end users. They all require too much engineering knowledge from the user. I think it would be nice to have an autoML solution that can be used by citizen data scientists.

human_scientist 2671 days ago

What about approaches like auto-sklearn [1]? With these it is basically:

  >>> import autosklearn.classification
  >>> # X_train, y_train, X_test are your own data arrays
  >>> automl = autosklearn.classification.AutoSklearnClassifier()
  >>> automl.fit(X_train, y_train)
  >>> y_hat = automl.predict(X_test)

[1] https://automl.github.io/auto-sklearn/stable/

minimaxir 2671 days ago

> BTW, I think all autoML solutions forget about end users. They all require too much engineering knowledge from the user. I think it would be nice to have an autoML solution that can be used by citizen data scientists.

This is the approach of a project I am currently working on (and I am now making that explicit in the README!).

pplonski86 2671 days ago

Could you provide some link to the project?

mmq 2672 days ago

UPDATE: Looking at the docs, there's an example[1] using this library with scikit-learn.

[1]: https://nni.readthedocs.io/en/latest/sklearn_examples.html

scottlegrand2 2671 days ago

At a previous gig we tried to do this: port a computational graph that wasn't a neural network to TensorFlow. It was a disaster. TensorFlow is very tightly optimized for the things Google thinks are important. If you fall off those paths, TensorFlow is a god-awful slow tool to use. We saw a ~20x regression in performance.

In contrast, when we wrote bespoke GPU code for the graph, we saw a ~25x performance increase over relying on CPU plus MKL. I am being deliberately vague here and I cannot give further detail.

ec109685 2662 days ago

You are somewhat uniquely qualified to do so:

> possibly the world's first or second (full-time) CUDA programmer, with 14 filed patents, and the world's fastest implementations of molecular dynamics (CUDA ports of Folding@Home and AMBER).

scottlegrand2 2660 days ago

Yes, compared to someone who insists on doing all of their computation from Python alone, I have a unique (and in my opinion absurd) advantage.

Because I think that's insane. It's one thing if you don't care about speed and you care more about time-to-market. It's another thing if you're complaining about things being too slow but you're not willing to learn about anything that would let you do anything about it. I run into far more of the latter.

williamsmj 2671 days ago

There is scikit support. There's an example in the docs.

https://nni.readthedocs.io/en/latest/sklearn_examples.html

cuchoi 2672 days ago

I think that for neural networks scikit has not been the "go to" library. In particular, AutoML advertises that it automates neural architecture search, and I don't think scikit allows much flexibility for that.

samcodes 2671 days ago

Have you seen TPOT [0]? AutoML library that uses genetic algorithms to write scikit code for you. So fun.

[0] https://github.com/EpistasisLab/tpot
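For anyone curious what "genetic algorithms over pipelines" means in practice, here is a toy sketch of the general idea. It is not TPOT's implementation or API: the search space, the `fitness` function (a stand-in for a cross-validated score), and every name below are assumptions invented for illustration.

```python
import random

rng = random.Random(0)  # seeded so the run is repeatable

# Hypothetical search space: each "gene" is one pipeline choice.
SPACE = {
    "scaler": ["none", "standard", "minmax"],
    "model": ["logreg", "random_forest", "knn"],
    "n_features": [5, 10, 20, 40],
}


def random_individual():
    return {key: rng.choice(values) for key, values in SPACE.items()}


def fitness(ind):
    # Stand-in for a cross-validated score; a real tool would fit the pipeline here.
    score = {"none": 0.0, "standard": 0.2, "minmax": 0.1}[ind["scaler"]]
    score += {"logreg": 0.1, "random_forest": 0.3, "knn": 0.0}[ind["model"]]
    score += 0.2 - abs(ind["n_features"] - 20) / 100
    return score


def crossover(a, b):
    # Each gene is inherited from one of the two parents at random.
    return {key: rng.choice([a[key], b[key]]) for key in SPACE}


def mutate(ind, rate=0.2):
    # With probability `rate`, resample a gene from the search space.
    return {key: (rng.choice(SPACE[key]) if rng.random() < rate else value)
            for key, value in ind.items()}


def evolve(generations=10, population_size=12, elite=4):
    population = [random_individual() for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[:elite]  # keep the fittest individuals unchanged
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(population_size - elite)]
        population = parents + children
    return max(population, key=fitness)


best = evolve()
print(best, round(fitness(best), 3))
```

A real tool would then turn the winning individual into actual scikit-learn code; that step is left out here.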

ayidnelm 2672 days ago

There's also auto scikit-learn https://github.com/automl/auto-sklearn if you haven't already come across that.

streetcat1 2671 days ago

scikit-learn involves a different type of search, hence it will not be supported by this tool or any DNN search tool.

DNNs require an architecture search, i.e. the building blocks are full layers, the depth of the network, the optimizer, etc.

A scikit-learn search covers a parameter space, i.e. the algorithm's parameters are much simpler and fewer.

So to sum up, DNN search involves big building blocks, while scikit-learn search (or, for that matter, search over any "classical ML" algorithm) is more of a parameter search.

[The actual scikit-learn search would also include preprocessing steps, which can be seen as a separate block.]

Also, note that DNN search is much more expensive than scikit-learn search (100x).

human_scientist 2671 days ago

Automatically building a scikit-learn estimator might involve many conditional hyperparameters, and a very large number of them (<100) [1]. However, joint architecture and hyperparameter search can be framed on a much simpler search space. For example, in a recent paper that aims to automate the design of RNA molecules, we formulated a 14-dimensional search space that includes very few conditional hyperparameters [2].

The tools included in the repository are very broadly applicable and only a few of them are specifically targeted at neural architecture search.

[1] https://www.kdnuggets.com/2016/08/winning-automl-challenge-a...

[2] https://openreview.net/forum?id=ByfyHh05tQ
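As a rough illustration of what "conditional hyperparameters" means here, consider the sketch below. It is not the search space of auto-sklearn or of the paper above; the `SPACE` layout, the parameter names, and the helper functions are hypothetical. The point is only that the choice of classifier decides which other hyperparameters exist at all.

```python
import random

# Hypothetical conditional space: the chosen "classifier" activates its own
# child hyperparameters; "scaler" is unconditional.
SPACE = {
    "classifier": {
        "svm": {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]},
        "random_forest": {"n_estimators": [50, 100, 200], "max_depth": [4, 8, None]},
        "knn": {"n_neighbors": [3, 5, 11]},
    },
    "scaler": ["none", "standard"],
}


def sample(space, rng):
    # Pick the parent first, then sample only the children it activates.
    name = rng.choice(sorted(space["classifier"]))
    config = {"classifier": name, "scaler": rng.choice(space["scaler"])}
    for param, values in space["classifier"][name].items():
        config[param] = rng.choice(values)
    return config


def count_configs(space):
    # Sum over branches (not a product), because the branches are mutually exclusive.
    total = 0
    for children in space["classifier"].values():
        size = 1
        for values in children.values():
            size *= len(values)
        total += size
    return total * len(space["scaler"])


rng = random.Random(0)
print(sample(SPACE, rng))
print(count_configs(SPACE))  # (6 + 9 + 3) * 2 = 36
```

A flat space, by contrast, would always sample every dimension, which is why a 14-dimensional space with few conditionals can be easier to search than a classical-ML space with many branches.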

williamsmj 2671 days ago

This tool absolutely supports scikit-learn. Please see the docs: https://nni.readthedocs.io/en/latest/sklearn_examples.html
