by kenhktsui 673 days ago
Hi HN community,

In this release, I introduced synthetic data generation for text classification, so you can build a classifier from scratch even without a dataset. On 5 benchmarks, classifiers trained on the synthetic data achieve results comparable to those trained on real data.

There is plenty of open research and implementation work ahead:
- research on synthetic data algorithms that yield higher performance
- an agentic workflow for model evaluation, error analysis, and model improvement
- multilingual support

If you are interested, please give it a star, try it out, and contribute.

Detailed Blog: https://huggingface.co/blog/kenhktsui/anyclassifier

Best, Ken