by vbychkovsky 2910 days ago

We had two big challenges: (1) getting people to clearly specify the feedback mechanism, and (2) figuring out what features to use.

(Isn’t all ML about good labels and features? :-) )

Structured ML systems require you to provide (ideally) unambiguous examples of expected behavior. In the case of Spiral (or any other online learning system), such examples need to be generated automatically. In our experience, this part took a good amount of effort: distributed systems issues (aka race conditions and transient bugs in remote systems) made automatic generation of “clean” examples difficult. Once the bulk of these problems had been addressed, the system began to operate very smoothly. Specifically, it adapted to changing conditions very well.
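To make the "clean examples" point concrete, here is a rough sketch of one way to turn logged decisions plus feedback events into training examples while dropping the ambiguous ones. This is not Spiral's API: `build_examples`, the ID-keyed dicts, and the "exactly one consistent label" rule are all assumptions made up for illustration.

```python
from collections import defaultdict

def build_examples(decisions, feedback_events):
    """Pair logged decisions with feedback, keeping only unambiguous cases.

    decisions: dict mapping item id -> feature set used for the decision
    feedback_events: iterable of (item id, label) pairs, possibly with
        duplicates, conflicts, or gaps (hypothetical format)
    """
    labels = defaultdict(set)
    for item_id, label in feedback_events:
        labels[item_id].add(label)

    examples = []
    for item_id, features in decisions.items():
        seen = labels.get(item_id, set())
        # Skip items with no feedback (lost or late events) and items with
        # conflicting feedback (e.g. caused by a race); keep the rest.
        if len(seen) == 1:
            examples.append((features, next(iter(seen))))
    return examples
```

With feedback `[("a", True), ("b", True), ("b", False)]` and decisions for `a`, `b`, and `c`, only `a` would produce an example: `b` has conflicting labels and `c` has none.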

Spiral is designed to be a drop-in replacement for hand-coded heuristics. In other words, if you had a somewhat working tree of if-else statements that specified your image caching policy (if size<100k and type==jpeg..), you should already have an idea of what features to use. There is a bit of work involved in translating these features into a form suitable for classifiers in Spiral. For example, if a classifier is using binary features, the file size feature would need to be quantized (123kb -> “100-200kb bucket”). While this type of work requires forethought and effort, the runtime cost of running this classifier is very low.
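As a minimal sketch of the quantization step (again, not Spiral's actual interface), a numeric file size can be mapped to a named bucket and emitted as a binary "feature is present" string. The bucket edges, the `bucket_label` and `binary_features` helpers, and the `name=value` feature format are all assumptions chosen for illustration.

```python
from bisect import bisect_right

# Hypothetical bucket edges in KB; real edges would come from your own data.
SIZE_EDGES_KB = [100, 200, 500, 1000]

def bucket_label(size_kb):
    """Map a file size in KB to the label of the bucket that contains it."""
    i = bisect_right(SIZE_EDGES_KB, size_kb)
    if i == 0:
        return f"<{SIZE_EDGES_KB[0]}kb"
    if i == len(SIZE_EDGES_KB):
        return f">={SIZE_EDGES_KB[-1]}kb"
    # Lower edge is inclusive, upper edge is exclusive.
    return f"{SIZE_EDGES_KB[i - 1]}-{SIZE_EDGES_KB[i]}kb"

def binary_features(size_kb, file_type):
    """Return the set of binary features that are 'on' for this item."""
    return {f"size={bucket_label(size_kb)}", f"type={file_type}"}
```

Here `binary_features(123, "jpeg")` gives `{"size=100-200kb", "type=jpeg"}`, which matches the 123kb example above. The lookup is a binary search over a handful of edges, so the per-item cost stays small.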