Hacker News
by lossolo 818 days ago
Don’t overlook the training data (used for both training and instruction fine-tuning). It is one of the most crucial aspects, if not the most critical, given the significant differences observed between models with similar architectures.