Hacker News
by jupiterelastica 1330 days ago
You are very clear about the current limits on data size, which I find refreshingly honest! How sensible do you think it would be to fine-tune the model for a specific problem with more than 1000 observations by resampling the data (similar to bootstrapping) and retraining on the subsamples? As I understand it, this would fine-tune the algorithm that TabPFN has learned to the specific problem.

Many thanks also for open-sourcing your work and making the Colab notebook. I've been playing around with it a bit.

Edit: spelling

1 comment

We tried it a bit a while back but did not get conclusive results. I expect you can bend it to perform better on larger datasets too, but I cannot say for sure exactly how. Bootstrapping is definitely a good candidate for this.
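
For anyone who wants to experiment with the resampling idea, here is a minimal sketch of the subsampling step only. It does not fine-tune anything: it draws bootstrap subsamples of at most 1000 rows and, as a simpler stand-in for retraining, averages predicted probabilities across them. The function names, the `make_model` argument, and the defaults (`max_rows=1000`, `n_subsamples=8`) are all assumptions made for illustration, not something from the TabPFN code.

```python
import numpy as np


def bootstrap_subsamples(n_rows, max_rows=1000, n_subsamples=8, seed=0):
    """Return a list of index arrays, each drawn with replacement from range(n_rows).

    Hypothetical helper: max_rows mirrors the ~1000-observation limit
    discussed in the thread; n_subsamples is an arbitrary choice.
    """
    rng = np.random.default_rng(seed)
    size = min(n_rows, max_rows)
    return [rng.integers(0, n_rows, size=size) for _ in range(n_subsamples)]


def averaged_predict_proba(make_model, X_train, y_train, X_test, **kwargs):
    """Fit one model per bootstrap subsample and average the class probabilities.

    make_model is any zero-argument callable returning an object with
    scikit-learn-style fit(X, y) and predict_proba(X) methods.
    Assumes every subsample contains every class; otherwise the
    probability arrays would have different shapes.
    """
    probas = []
    for idx in bootstrap_subsamples(len(X_train), **kwargs):
        model = make_model()
        model.fit(X_train[idx], y_train[idx])
        probas.append(model.predict_proba(X_test))
    return np.mean(probas, axis=0)
```

Any classifier with a scikit-learn-style interface can be passed as `make_model`. Whether this kind of averaging, or actual retraining on the subsamples, helps on larger datasets is exactly the open question above.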