Hacker News
by rohxnsxngh 114 days ago
We are doing exactly what you described with continuous calibration. We have essentially built our own in-house labeling, ingestion, and task-assignment software for this. Low-confidence predictions get flagged for review, corrected labels feed back into training, and we retrain on a rolling basis. We also stratify our calibration datasets intentionally by time of day, tank conditions, and fish density rather than just grabbing random frames. Early on, our datasets were too homogeneous and the models would work great in testing, then degrade in production. The architecture matters less than having a tight feedback loop between deployment and retraining.
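To make the two ideas above concrete, here is a rough sketch in Python of flagging low-confidence predictions and drawing a stratified calibration sample. It is illustrative only and not our actual pipeline: the 0.6 threshold, the function names, and the dict fields (`confidence`, `time_of_day`, `density`) are all made up for the example.

```python
import random
from collections import defaultdict

# Hypothetical cutoff; the right value depends on the model and the task.
REVIEW_THRESHOLD = 0.6


def flag_for_review(predictions, threshold=REVIEW_THRESHOLD):
    """Return the predictions whose confidence is below the threshold."""
    return [p for p in predictions if p["confidence"] < threshold]


def stratified_sample(frames, keys, per_stratum, seed=0):
    """Draw up to `per_stratum` frames from every combination of `keys`.

    Each frame is a dict of metadata. Grouping on `keys` first keeps rare
    conditions (for example night plus high density) from being swamped by
    common ones, which is what plain random sampling tends to do.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for frame in frames:
        strata[tuple(frame[k] for k in keys)].append(frame)

    sample = []
    for key in sorted(strata):
        group = strata[key]
        rng.shuffle(group)
        sample.extend(group[:per_stratum])
    return sample
```

In a loop like the one described, the flagged predictions would go to human review, and the corrected labels would be added to the pool that `stratified_sample` draws from before the next retrain.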

On the welfare angle, yes, we are thinking about this carefully. The data we collect includes body shape, fin integrity, spinal curvature, and other morphological traits that are signals of fish health and robustness, not just growth rate. Farms that care about sustainability can use this to select for fish that are healthy and resilient rather than just fast-growing. The tool is neutral but the selection criteria are up to the breeder. We do not want to enable the same failure mode that happened with poultry.

The talent pipeline point is interesting too. You are right that most CV talent ends up in adtech or fintech. We have found that people get excited about working on something physical and tangible once they realize the problems are just as hard.


That feedback loop architecture sounds right. The homogeneous dataset trap is one of the most common failure modes in production CV, and it's good you caught it early. The welfare data angle is particularly interesting. Your point that "the tool is neutral but the selection criteria are up to the breeder" makes sense because it avoids the poultry industry's mistake of optimizing the selector itself. Have you published anything on the calibration pipeline? Would love to see the methodology.
We have not published anything yet, but it is on our roadmap. Right now we are heads down on deployment and iterating fast, so documentation has taken a back seat. Once we have more data across multiple facilities and species, we want to write up what we have learned about calibration strategies for non-controlled environments. There is not a lot of practical literature out there on edge CV in wet and variable conditions, so it feels like something worth sharing.