Hacker News new | ask | show | jobs
by dhairya 1583 days ago
The challenge for the engineers at our AI startup is that deterministic testing paradigms don't adapt well to probabilistic models that are continually being retrained. As a scientist is hard to convey the acceptable range of variance and often the random change of individual predictions at the decision boundary. It's also hard debug behavioral issues that actually are systematic model failure versus those that are traditional infrastructure bugs. Often times the band aid is that build lookup tables to ensure certain behavior which in turn also underlying issues from being discovered.

Testing paradigms are either too high level or too specific. Recent work on evolving behavioral tests addresses this but it requires more manual effort and interpretation which kinda defeats to point of automated tests.

1 comments

Good discussion. I have often wondered about that interface.

This bears more resemblance to traditional manufacturing actually. I think there may be some value in borrowing ideas from statistical process control, rather than trying to force predictions into deterministic cases.