by eggie5
2166 days ago
You have a dataset of images and you write code (labeling functions, LFs) to label the images. Snorkel handles the pipeline but, more importantly, resolves the conflicts and correlations between the LFs. The output is a supervised dataset with mutually exclusive labels, a la softmax classification. The labels are noisy, but you get a quantity of them that you could not get from human annotators, and at a faster and cheaper rate. They provide analysis arguing that, for discriminative models, quantity can outweigh quality. To your point, it's not typically used with the image-only modality. It's mostly used where there is some metadata attached.
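To make the idea concrete, here is a rough sketch in plain Python, not Snorkel's actual API. The label names, the `caption`/`tags` metadata fields, and every function name are made up for illustration. The majority vote is a deliberately simple stand-in: as I understand it, Snorkel instead fits a model that estimates how accurate and how correlated the LFs are, and weights their votes accordingly.

```python
from collections import Counter

# Hypothetical label space: mutually exclusive classes plus an "abstain" value.
ABSTAIN = -1
CAT, DOG = 0, 1

# Each labeling function looks at an image's metadata (not its pixels)
# and either votes for a class or abstains.
def lf_caption_mentions_cat(item):
    return CAT if "cat" in item["caption"].lower() else ABSTAIN

def lf_caption_mentions_dog(item):
    return DOG if "dog" in item["caption"].lower() else ABSTAIN

def lf_tag_feline(item):
    return CAT if "feline" in item["tags"] else ABSTAIN

LFS = [lf_caption_mentions_cat, lf_caption_mentions_dog, lf_tag_feline]

def majority_vote(item, lfs=LFS):
    """Combine LF votes into one noisy label; abstain on no votes or a tie.

    This is a simple stand-in for a learned label model.
    """
    votes = [v for v in (lf(item) for lf in lfs) if v != ABSTAIN]
    if not votes:
        return ABSTAIN
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return ABSTAIN  # conflicting LFs with equal support
    return counts[0][0]

# Toy metadata records standing in for images.
items = [
    {"caption": "My cat on the sofa", "tags": ["feline"]},
    {"caption": "Dog at the park", "tags": []},
    {"caption": "Sunset", "tags": []},
    {"caption": "cat chasing a dog", "tags": []},
]

labels = [majority_vote(item) for item in items]
print(labels)
```

Items that end up with `ABSTAIN` would simply be dropped before training the downstream discriminative model on the rest.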