by ajratner 2166 days ago
Agreed! As noted in the other answer, Snorkel certainly does not work for everything :) And indeed, in many cases it may be easier to express what you know extensionally (label examples) than intensionally (write functions). A lot comes down to the unit cost per label over time, and whether it's more economical to label a bunch of data by hand or to write LFs (labeling functions) or similar.

That's btw why a lot of examples of ML today are ones where the data is (i) simple for non-experts to label, (ii) non-private and therefore easy to outsource for labeling, and (iii) slow to change (e.g. images for self-driving, basic NLP stuff for chat bots, etc.). This kind of data can be labeled cheaply and once, so hand-labeled training sets are (barely) economically feasible to build. However, most data is not that easy or cheap to label and needs to be relabeled constantly to adapt to change, so the investment in a programmatic approach is often far better, even if certainly not push-button!
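
To make the cost argument concrete, here is a rough back-of-the-envelope sketch. It is not anything from Snorkel itself: the function names, the linear cost model, and every number are hypothetical assumptions chosen only for illustration. Costs are in whole cents to keep the arithmetic exact.

```python
# Hypothetical cost model (an assumption, not a Snorkel API):
# hand labeling is paid again in full every time the data changes,
# while a programmatic approach pays once up front plus upkeep per round.

def hand_label_cost(n_examples, cents_per_label, n_rounds):
    """Total cost if every round of change means relabeling by hand."""
    return n_examples * cents_per_label * n_rounds

def programmatic_cost(upfront_cents, maintenance_cents_per_round, n_rounds):
    """Total cost of writing LFs once and maintaining them each round."""
    return upfront_cents + maintenance_cents_per_round * n_rounds

def break_even_rounds(n_examples, cents_per_label,
                      upfront_cents, maintenance_cents_per_round):
    """Smallest number of rounds at which hand labeling costs strictly more.

    Returns None if hand labeling never becomes the more expensive option.
    """
    hand_per_round = n_examples * cents_per_label
    if hand_per_round <= maintenance_cents_per_round:
        return None
    saving_per_round = hand_per_round - maintenance_cents_per_round
    return upfront_cents // saving_per_round + 1

# Made-up numbers: 10,000 examples at 10 cents each, versus
# $5,000 up front plus $200 of maintenance per relabeling round.
rounds = break_even_rounds(10_000, 10, 500_000, 20_000)
```

Under this toy model, data that is labeled cheaply and only once favors hand labeling, while data that has to be relabeled over many rounds tips the balance toward the programmatic approach.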