by skyde 770 days ago
Forests are good at classification, but they cannot leverage pre-training on unclassified data.
2 comments

https://gradientdescending.com/unsupervised-random-forest-ex...

You can cluster data using unsupervised random forests and then use these cluster indices as features.
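
For anyone who wants to try this, here is a rough sketch of one common way to do it, which may or may not match the linked post: train a forest to separate the real rows from a column-shuffled copy, measure how often two rows land in the same leaf, and cluster on that proximity. It assumes scikit-learn and SciPy are installed; the function name `unsupervised_rf_clusters`, the toy blob data, and all parameter values are mine, chosen only for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.datasets import make_blobs
from sklearn.ensemble import RandomForestClassifier


def unsupervised_rf_clusters(X, n_clusters=3, n_trees=200, seed=0):
    """Cluster X with a forest trained to tell real rows from shuffled ones."""
    rng = np.random.default_rng(seed)
    # Synthetic copy: permute each column independently, which keeps the
    # marginals but destroys the dependence between features.
    X_synth = np.column_stack([rng.permutation(col) for col in X.T])
    X_all = np.vstack([X, X_synth])
    y_all = np.r_[np.ones(len(X)), np.zeros(len(X_synth))]

    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(X_all, y_all)

    # Proximity = fraction of trees in which two real rows share a leaf.
    leaves = forest.apply(X)  # shape (n_samples, n_trees)
    proximity = (leaves[:, None, :] == leaves[None, :, :]).mean(axis=2)

    # Hierarchical clustering on the dissimilarity 1 - proximity.
    condensed = squareform(1.0 - proximity, checks=False)
    tree = linkage(condensed, method="average")
    # fcluster labels start at 1; shift them to start at 0.
    return fcluster(tree, t=n_clusters, criterion="maxclust") - 1


# Toy data (an assumption for the demo, not from the linked post).
X, _ = make_blobs(n_samples=150, centers=3, n_features=4, random_state=0)
cluster_idx = unsupervised_rf_clusters(X)
# Append the cluster index as an extra column for a downstream model.
X_augmented = np.column_stack([X, cluster_idx])
```

The pairwise proximity matrix is quadratic in the number of rows, so this only works as written for small datasets.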

Aren't LLMs a classification problem in a sense? "Given this text, classify it based on the next token" seems like a viable interpretation of the problem. Although there are a couple thousand or so classes, which might be a lot (but I know very little about this field).
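
To make that reading concrete, here is a minimal sketch of next-token prediction as multi-class classification: one class per vocabulary entry, a softmax over the scores, and cross-entropy against the token that actually came next. The six-word vocabulary and the logits are made up for illustration. Real vocabularies tend to be larger than a couple thousand (GPT-2's has 50,257 entries), but the setup is the same.

```python
import numpy as np

# Hypothetical toy vocabulary; real tokenizers have far more entries.
vocab = ["the", "cat", "sat", "on", "mat", "."]


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    shifted = logits - logits.max()
    exp = np.exp(shifted)
    return exp / exp.sum()


def cross_entropy(probs, target_idx):
    # Standard classification loss: negative log-probability of the true class.
    return -np.log(probs[target_idx])


# Made-up scores a model might emit for the context "the cat sat on the".
logits = np.array([0.5, 0.1, 0.2, 0.3, 2.5, 0.4])
probs = softmax(logits)

# "Classify" the context by picking the most probable next token.
predicted = vocab[int(np.argmax(probs))]
# Loss if the true next token is "mat".
loss = cross_entropy(probs, vocab.index("mat"))
```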