HN Mirror



	by miven 900 days ago
	What do they consider to be an "LLM of this size"? While this technique of scaling up an existing pre-trained model via fine-tuning is really impressive, it feels a bit unfair to compare what's essentially now an 8.3B model to mostly standard 7B ones, especially considering how important scale is in predicting LLM performance.