by elandau25 1971 days ago
Hi Sriku, with regard to your first point, not necessarily. As I mentioned in another comment, the model you are building from the labels and the labelling process are related but not the same: they have different fundamental constraints and rely on different techniques. For example, you can't have a human in the loop to help you out in a live model.

You are right that there are a bunch of difficult problems this technique isn't perfect for, but it can still improve the efficiency of labelling a lot, and when I do it I get the added bonus of understanding the dataset a lot better.


I still wouldn't push this beyond a limited number of cases where the average human can identify patterns and explain them without much doubt. For example, if you want to apply this to radiology images, where the intention behind using a DL technique is to discover and exploit patterns we may not be able to notice ourselves, the approach would probably be as labour-intensive as labeling the datasets.

Otherwise I agree with you overall that we should consider this where we can, as evidenced by my Snorkel link.

I hear you, but I don't think the labor intensiveness is a lost cause here. Labor spent mining insights from a dataset is worth way more than labor spent on manual labelling.

I think we are lazily giving up our intellectual power to models, hoping that they will just discover patterns by magic, when it is actually well worth going through the data science process, starting with labelling, because we learn as humans along the way. The thesis is that this will also make our DL models better in the long run. We would never have come up with cool algorithms if we had always outsourced this work to models.