by rasbt
1029 days ago
Good catch. Above that paragraph, I wrote that the Code Llama models were initialized with the Llama 2 weights, so this is indeed contradictory. What I meant to say here was 500B domain-specific tokens. Maybe "domain-specific" is not the right term; I mean tokens related to the problems that the LLM aims to solve. EDIT: Updated the text to make this clearer.