Ask HN: Can tokenizers go from fixed length tokens to varying length?
3 points
by jinen83
642 days ago
I am going through a workshop on building an LLM from the ground up. While studying tokenizers like BPE, I was curious: why not borrow ideas from encoding techniques like Huffman coding to build better-optimized encoders? Right now a token like '.' gets the same size token as the word 'Algorithm'. Could taking tokens' frequency of occurrence and their size into account save us some GPUs?
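To make the idea concrete, here is a minimal sketch of what I mean, assuming a toy whitespace-split corpus and a hypothetical helper `huffman_code_lengths` (neither comes from the workshop). It compares the bits needed to store token ids at a fixed width against Huffman code lengths derived from token frequencies. It only measures storage of the id stream; whether that would translate into any GPU savings is exactly what I am unsure about.

```python
import heapq
import math
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code over freqs.

    Hypothetical helper for illustration; only code lengths are tracked,
    not the actual bit patterns.
    """
    if len(freqs) == 1:
        # A lone symbol still needs one bit to be written down.
        return {sym: 1 for sym in freqs}
    # Heap entries: (total frequency, unique tie-breaker, {symbol: depth}).
    heap = [(f, i, {sym: 0}) for i, (sym, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in d1.items()}
        merged.update({s: d + 1 for s, d in d2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Toy corpus (an assumption): '.' and 'the' are frequent, 'algorithm' is rare.
tokens = "the cat sat . the dog sat . the algorithm ran .".split()
freqs = Counter(tokens)
lengths = huffman_code_lengths(freqs)

# Fixed-width ids: every token costs ceil(log2(vocab size)) bits.
fixed_bits = math.ceil(math.log2(len(freqs)))
total_fixed = fixed_bits * len(tokens)

# Variable-width ids: frequent tokens get shorter codes.
total_huffman = sum(lengths[t] * freqs[t] for t in freqs)

print("fixed-width bits:", total_fixed)
print("huffman bits:    ", total_huffman)
```

On this toy corpus the fixed-width encoding needs 36 bits and the Huffman encoding 32, with '.' and 'the' getting the shortest codes.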