Hacker News
by lt, 2103 days ago
A character, both input and output.
1 comment
throwaway287391, 2103 days ago
Not exactly. GPT-3 uses a variant of BPE [1], so one token can correspond to a single character, an entire word or more, or anything in between. The paper [2] says a token corresponds to 0.7 words on average.
[1] https://en.wikipedia.org/wiki/Byte_pair_encoding
[2] https://arxiv.org/abs/2005.14165, page 24
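To make the "character, word, or anything in between" point concrete, here is a toy sketch of byte pair encoding in Python. It is not GPT-3's actual tokenizer: the corpus, the number of merges, and the function names (learn_merges, apply_merge, tokenize) are all made up for illustration, and it starts from characters rather than bytes.

```python
from collections import Counter

def apply_merge(symbols, pair):
    """Replace every adjacent occurrence of `pair` with one merged symbol."""
    out, i = [], 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return out

def learn_merges(words, num_merges):
    """Learn up to `num_merges` merges from a list of words (toy BPE)."""
    vocab = Counter(tuple(w) for w in words)  # word as symbols -> frequency
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair; ties are broken by the pair itself so the
        # result is deterministic.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[tuple(apply_merge(list(symbols), best))] += freq
        vocab = new_vocab
    return merges

def tokenize(word, merges):
    """Split a word into tokens by replaying the learned merges in order."""
    symbols = list(word)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols

# Hypothetical toy corpus: "the" is frequent, "cat" is rare.
corpus = ["the"] * 5 + ["they"] * 2 + ["cat"]
merges = learn_merges(corpus, num_merges=2)

print(tokenize("the", merges))   # ['the']          one token = a whole word
print(tokenize("they", merges))  # ['the', 'y']     something in between
print(tokenize("cat", merges))   # ['c', 'a', 't']  one token = one character
```

With only two merges the frequent word "the" already collapses into a single token, while the rare word "cat" stays split into characters. A real vocabulary has tens of thousands of merges, but the same effect is why token counts do not map one-to-one onto characters or words.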