by mohaps 2908 days ago
One of the authors here. Be happy to answer questions
2 comments

Could you talk a bit about the challenges that you faced in developing this? And the before and after benchmarks?

I tried something kinda similar to help with tuning data engineering jobs and pipelines for performance and costs. But, it turned out to be a fruitless activity because there were too many variables that affected performance. I’d produce some models that seemed to be marginally effective. But, after code changes, configuration changes, changes to data input sizes, the models quickly became stale and ineffective.

We had two big challenges: (1) getting people to clearly specify the feedback mechanism, (2) figuring out what features to use.

(Isn’t all ML about good labels and features? :-) )

Structured ML systems require you to provide (ideally) unambiguous examples of expected behavior. In case of Spiral (or any other online learning) such examples need to be generated automatically. In our experience this part took a good amount of effort: distributed systems issues (aka race conditions and transient bugs in remote systems) made automatic generation of “clean” examples difficult. Once the bulk of these problems were addressed the system began to operate very smoothly. Specifically, it adapted to changing conditions very well.

Spiral is designed to be a drop-in replacement for hand-coded heuristics. In other words, if you had a somewhat working tree of if-else statements that specified your image caching policy (if size<100k and type==jpeg..), you should already have an idea of what features to use. There is a bit of work involved in translating these features into a form suitable for the classifiers in Spiral. For example, if a classifier uses binary features, the file size feature would need to be quantized (123kb -> “100-200kb bucket”). While this type of work requires forethought and effort, the runtime cost of running this classifier is very low.
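To make the quantization step concrete, here is a rough sketch of turning a file size and type into binary (present/absent) features. The bucket edges, the feature-string format, and the function names are all made up for illustration; only the "123kb -> 100-200kb bucket" example comes from the comment above, and none of this is Spiral's actual code.

```python
from bisect import bisect_right

# Hypothetical bucket edges in kilobytes (assumption, not from Spiral).
BUCKET_EDGES_KB = [100, 200, 500, 1000]


def size_bucket(size_kb):
    """Map a file size in kilobytes to the label of the bucket containing it."""
    # bisect_right returns how many edges are <= size_kb.
    i = bisect_right(BUCKET_EDGES_KB, size_kb)
    if i == 0:
        return f"size<{BUCKET_EDGES_KB[0]}kb"
    if i == len(BUCKET_EDGES_KB):
        return f"size>={BUCKET_EDGES_KB[-1]}kb"
    return f"size={BUCKET_EDGES_KB[i - 1]}-{BUCKET_EDGES_KB[i]}kb"


def binary_features(size_kb, file_type):
    """Return the set of binary features that are 'on' for this item."""
    return {size_bucket(size_kb), f"type={file_type}"}


features = binary_features(123, "jpeg")
```

Each item ends up as a small set of strings, so a classifier only has to look up one weight per active feature, which is one way the runtime cost can stay low.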

I'm having trouble fully grasping this:

> Today, rather than specify how to compute correct responses to requests, our engineers encode the means of providing feedback to a self-tuning system.

"encod[ing] the means of providing feedback to a self-tuning system", got it, very cool!

But don't they still have to "specify how to compute correct responses to requests"?

Not OP / GP; from what I understand, this isn't an API generated by ML, it's a cache manager. "Computing correct responses to requests" refers to deciding whether or not it should read it from some caching layer and whether or not it should be cached in the future, and it does this by optimizing some parameters. The difference is something like it decides whether or not to use the API or the cache to load an image or some content, rather than saying "this request should return a response with these properties". Hopefully I'm right and this makes sense.
The basic difference was to switch from the earlier approach of

if (conditionA && conditionB && !conditionC) cache_it()

to

hey look, an item with featureset X is cacheable while one with featureset Y is not.

This is reflected in the API, which is just two calls: predict() and feedback()

this simplifies the integration code and is easily debuggable even in the face of changes.
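For anyone trying to picture the predict()/feedback() loop, below is a toy sketch of the idea. Only the two method names come from the comment above; the class name, the perceptron-style update rule, and the feature strings are assumptions for illustration and are not Spiral's real implementation or signatures.

```python
class OnlineCachePolicy:
    """Toy online classifier over binary features (perceptron-style update)."""

    def __init__(self):
        # One weight per feature string; unseen features count as 0.
        self.weights = {}

    def _score(self, features):
        return sum(self.weights.get(f, 0.0) for f in features)

    def predict(self, features):
        """Guess whether an item with these features should be cached."""
        return self._score(features) > 0

    def feedback(self, features, was_cacheable):
        """Report the observed outcome; adjust weights only on a wrong guess."""
        if self.predict(features) == was_cacheable:
            return
        delta = 1.0 if was_cacheable else -1.0
        for f in features:
            self.weights[f] = self.weights.get(f, 0.0) + delta


policy = OnlineCachePolicy()
item_x = {"size=100-200kb", "type=jpeg"}  # hypothetical featureset X
item_y = {"size>=1000kb", "type=mp4"}     # hypothetical featureset Y

before = policy.predict(item_x)   # no evidence yet, so the guess is "don't cache"
policy.feedback(item_x, True)     # the system later observes X was worth caching
policy.feedback(item_y, False)    # ...and that Y was not
after = policy.predict(item_x)
```

The calling code never states the caching rule itself. It only reports what happened, so when conditions change the feedback stream changes and the weights follow.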