Hacker News
by mlthoughts2018 2927 days ago
This is very confusing and meandering. It gives flow charts and lists of steps that don’t map to my experience building deep learning models at scale, and spends a strange amount of time passive-aggressively dismissing Lua Torch and extolling virtues of TensorFlow that aren’t very important.

As with all of these purported pipelining systems, I’m skeptical and happy to let a bunch of other people deal with the headaches of making it adequately general for a few years before I’ll even start caring about grokking it for my use cases.

In the meantime, creating my own build tooling, data pretreatment tooling and deployment tooling is pretty valuable: it helps me understand the business considerations and makes sure my modeling and experimentation aren’t just time-wasting ivory tower projects. That matters particularly for customizing performance characteristics on a situation-to-situation basis, since I stay free to design the deployed system without being constrained to a particular serving architecture.

It also makes me very uninterested in applying to work for the Cortex team, because even though the article is talking about DeepBird v2 as a means to free ML engineers to do more research, it seems pretty obvious that there’s a huge surface area of maintenance and feature management for this platform. Your job is probably going to be less about research, which is scarce work that people compete over anyway.

Possibly attractive for people who just like deep C++ platform building, which is an internal drive not often found in people wanting to solve business problems with ML models.

2 comments

> Possibly attractive for people who just like deep C++ platform building, which is an internal drive not often found in people wanting to solve business problems with ML models.

And the world has plenty of people who are not interested in trying to solve business problems with ML models, but are rather interested in the engineering side of ML. I am one of those people. My current work (at Cortex) involves improving the time taken to train the types of models not usually described in research papers.

If anyone is interested in machine learning but wants to build a high-performance training/serving platform, message me on Twitter (@pavan_ky). Cortex is hiring.

Personally I am very interested in both how models can help solve business problems and how to make effectively engineered tools for machine learning.

In my work I spend a lot of time on new deep learning architectures or experimenting with modifications or fine-tuning or ensembling.

I write a lot of container and Makefile tooling to ensure experiments are reproducible and results have identifiers that map back to the full set of data, software and parameters.
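A minimal sketch of what such a result identifier could look like, assuming the identifier is just a hash over a data digest, a code version string (for example a git commit) and the parameter dict. The function names `file_digest` and `experiment_id` are made up for illustration and are not from any particular tool.

```python
import hashlib
import json


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def experiment_id(data_digest, code_version, params):
    """Derive a short, deterministic ID from data, code and parameters.

    Serializing with sorted keys means the same parameters always give
    the same ID, regardless of dict ordering.
    """
    payload = json.dumps(
        {"data": data_digest, "code": code_version, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:16]
```

Tagging a container image or an output directory with this ID is one way to map a result back to exactly what produced it. Truncating to 16 hex characters is an arbitrary choice for readability.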

I also write a lot of backend server software to wrap trained models in a web application, mostly in Python. After profiling, I do a lot of work with Cython to target only those spots of the code that reveal actual performance bottlenecks, measured against a specific business problem’s latency or throughput requirements. In other words, I avoid the huge premature optimization step of assuming a whole system needs to be written in C++, and instead use profiling and case-by-case diagnostics to know when to write something as an optimized C extension module callable from Python for very specific and localized sections of code.
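To make the profile-first idea concrete, here is a rough sketch using the standard library’s `cProfile` and `pstats`. The request handler and its deliberately quadratic `slow_score` are hypothetical stand-ins; the point is only that the report names the hot function before any C or Cython gets written.

```python
import cProfile
import io
import pstats


def tokenize(text):
    return text.lower().split()


def slow_score(tokens):
    # Deliberately quadratic: counts matching token pairs.
    total = 0
    for a in tokens:
        for b in tokens:
            total += (a == b)
    return total


def handle_request(text):
    # Hypothetical request handler wrapping the two steps above.
    return slow_score(tokenize(text))


def profile_call(func, *args, top=5):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()
```

Calling `profile_call(handle_request, some_text)` and reading the report should show `slow_score` dominating the cumulative time, which would make it the one candidate for an extension module while `tokenize` stays in plain Python.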

My experience has been that there is a real lack of transparency about how deployment will work, how performance will work, etc., when using cookie cutter pipeline approaches, like sklearn Pipelines, TensorFlow Serving, Fargate, etc. You’ll always need to break some assumption of the pipeline, layer in new diagnostics, debug latency issues, etc., on a case-by-case basis.

99% of the time, specifying a new model or articulating an experiment is not hard, requires little dev work, and only represents about 10% of the actual work needed to explore a model’s appropriateness for a given problem at hand.

The rest requires very specialized control and visibility to basically perform application-specific surgery on the pipeline, customizing and tailoring many aspects, from how multi-region deployment should look to how optimized the web service code should be to whether to use asynchronous workers or a queuing service to stage and process requests, to optimizing preprocessing treatments, to instrumenting some extra New Relic metric tracking that the pipeline isn’t extensible enough to just specify in some config, and so on.
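As one illustration of the “asynchronous workers or a queuing service” choice, here is a small sketch using only `asyncio` from the standard library. It assumes an in-process bounded queue rather than an external queuing service, and `predict` is a hypothetical placeholder for real model inference.

```python
import asyncio


async def predict(x):
    # Hypothetical stand-in for model inference.
    await asyncio.sleep(0)
    return x * 2


async def worker(queue):
    # Pull (input, future) pairs off the queue and resolve the future.
    while True:
        x, fut = await queue.get()
        try:
            fut.set_result(await predict(x))
        except Exception as exc:
            fut.set_exception(exc)
        finally:
            queue.task_done()


async def serve(inputs, n_workers=2, max_pending=8):
    # A bounded queue applies backpressure: put() waits when it is full.
    queue = asyncio.Queue(maxsize=max_pending)
    workers = [asyncio.create_task(worker(queue)) for _ in range(n_workers)]
    loop = asyncio.get_running_loop()
    futures = []
    for x in inputs:
        fut = loop.create_future()
        await queue.put((x, fut))
        futures.append(fut)
    results = await asyncio.gather(*futures)
    # Shut the workers down once every request has been answered.
    for w in workers:
        w.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return results
```

Running `asyncio.run(serve([1, 2, 3]))` stages three requests and returns their results in submission order. The number of workers and the queue bound are exactly the kind of knobs that tend to need per-application tuning.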

What’s been most important is that the deep learning engineers on the team, who are researchers, are also excellent systems engineers across all those topics, display a high degree of curiosity towards them, and absolutely do not look at them as “boring work” that distracts from the experimentation they would rather do. Their value add is not driven by spending more time experimenting; that’s virtually never the case. Their value add is in knowing the details of the deep learning models intimately while also knowing the deep details of implementation, optimization, deployment, and diagnostics.

In that sense, tying model development to a cookie cutter pipeline framework, whether Spark or sklearn or something custom in-house, is something I believe strongly is an anti-pattern.

I’m developing the build tooling for C/C++ in pants (the Twitter build tool) to support this work; see [0]. Twitter open sources basically all of its infrastructure. We don’t really like throwing open source over the wall, and we maintain several high-quality open source libraries that don’t require you to develop specific expertise in platform building (e.g. scalding [1], one of my faves)!

[0] https://github.com/pantsbuild/pants/pull/5815

[1] https://github.com/twitter/scalding