by fractionalhare
2173 days ago
If I understand correctly, it sounds like your platform is primarily intended for improving awareness and understanding of the data a team has, so they know which features to focus on and emphasize. Do you think you'll get into synthetic data generation as well? In other words, improving dataset quality additively, not just curatively.
Said another way: once you've found "I do badly on green cones," we use similarity search on the embeddings of known green cone examples to find more instances of green cones in the wild. We pick the right examples from streams of unlabeled data, then send them to labeling and add them to your dataset, so the model does better the next time you retrain.
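
To make the idea concrete, here is a rough sketch of that selection step, not the platform's actual implementation. It assumes embeddings are plain lists of floats, averages the known examples into a single query vector, and ranks unlabeled items by cosine similarity. The names `select_for_labeling`, `known_green_cones`, and `unlabeled_stream`, along with the toy 3-dimensional vectors, are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def select_for_labeling(known_embeddings, unlabeled, top_k=2):
    """Return the ids of the top_k unlabeled items most similar to the known examples.

    Assumption: averaging the known embeddings gives a usable query vector.
    A real system would more likely use an approximate nearest-neighbor index.
    """
    query = mean_vector(known_embeddings)
    scored = [
        (cosine_similarity(query, embedding), item_id)
        for item_id, embedding in unlabeled.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [item_id for _, item_id in scored[:top_k]]


# Hypothetical toy data: embeddings of known "green cone" examples.
known_green_cones = [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]]

# Hypothetical unlabeled stream, keyed by frame id.
unlabeled_stream = {
    "frame_001": [0.85, 0.15, 0.05],
    "frame_002": [0.0, 0.1, 0.95],
    "frame_003": [0.7, 0.3, 0.0],
    "frame_004": [0.1, 0.9, 0.2],
}

picked = select_for_labeling(known_green_cones, unlabeled_stream, top_k=2)
print(picked)  # ['frame_001', 'frame_003']
```

The picked ids are what would go to labeling and then into the training set. Brute-force scoring like this only works for small streams; at scale the same ranking would come from a vector index.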