Hacker News
by
miohtama
217 days ago
Shouldn't all caps be normalised to the same tokens as lowercase? There are no separate tokens for all caps and lowercase in Llama, or at least there weren't in the past.
1 comment
minimaxir
217 days ago
Looking at the tokenizer for the older Llama 2 model, its vocabulary does have capital letters in it:
https://huggingface.co/meta-llama/Llama-2-7b-hf
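For anyone who wants to check this themselves, here is a minimal sketch of how one might test whether a vocabulary keeps upper- and lowercase as separate entries. The helper names and the toy vocabulary are made up for illustration and are not the real Llama 2 vocab; the commented-out lines at the end assume the `transformers` package and access to the gated repo.

```python
def case_sensitive_tokens(vocab):
    """Return the tokens that contain at least one uppercase letter."""
    return sorted(tok for tok in vocab if any(c.isupper() for c in tok))


def has_distinct_case_entries(vocab, word):
    """True if the lowercase and uppercase spellings of `word` are separate vocab entries."""
    lower, upper = word.lower(), word.upper()
    return lower in vocab and upper in vocab and vocab[lower] != vocab[upper]


# Toy vocabulary, invented for illustration (NOT the real Llama 2 vocab).
toy_vocab = {"▁the": 0, "▁The": 1, "▁THE": 2, "ing": 3, "A": 4, "a": 5}

upper_tokens = case_sensitive_tokens(toy_vocab)
print(upper_tokens)
print(has_distinct_case_entries(toy_vocab, "a"))

# Assumption: with access to the gated repo, the real vocab could be loaded like this:
# from transformers import AutoTokenizer
# vocab = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf").get_vocab()
```

If the real vocabulary returns a non-empty list from `case_sensitive_tokens`, the tokenizer is not normalising case away before tokenization.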