by knaik94
1130 days ago
It makes up those numbers. I asked about the difference in training data set size between the small and large PaLM 2 models, and it asserted that the small model was trained on 540 billion and the large model on 540 trillion. A different draft instead specified 1.4 trillion for the large model.

Here is a table that summarizes the key differences between the two language models:
Feature              | Palm              | Bard
Number of parameters | 400 billion       | 540 billion
Vocabulary size      | 137 billion words | 1.5 trillion words