by docmechanic
425 days ago
Source: Generative Deep Learning by David Foster, 2nd edition, published in 2023. From “Tokenization” on page 134. “If you use word tokens: …. will never be able to predict words outside of the training vocabulary.” “If you use character tokens: The model may generate sequences of characters that form words outside the training vocabulary.”
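To make the distinction concrete, here is a rough sketch of my own (not from the book). The toy corpus and the helper names `word_representable` and `char_representable` are made up for illustration, and it only checks which strings each vocabulary can express; no model is involved.

```python
# Toy corpus; everything here is illustrative, not taken from the book.
corpus = "the cat sat on the mat"

# Word-level vocabulary: each distinct whitespace-separated word is one token.
word_vocab = set(corpus.split())

# Character-level vocabulary: each distinct character (including the space) is one token.
char_vocab = set(corpus)


def word_representable(text):
    """True if every word in `text` is a token in the word vocabulary."""
    return all(word in word_vocab for word in text.split())


def char_representable(text):
    """True if every character in `text` is a token in the character vocabulary."""
    return all(char in char_vocab for char in text)


new_word = "chat"  # never appears as a word in the corpus
print(word_representable(new_word))  # False: no word token for "chat"
print(char_representable(new_word))  # True: c, h, a, t all occur in the corpus
```

A word-token model can only emit tokens from `word_vocab`, so "chat" is out of reach. A character-token model could, in principle, emit c-h-a-t because each character is in `char_vocab`.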