Hacker News
by
sva_
1046 days ago
In this case the tokens are actually the individual characters:
vocab = sorted(list(set(lines)))
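
To see why that line gives character-level tokens, here is a minimal sketch. It assumes `lines` is one long string (the sample text below is made up), and `stoi`, `itos`, `encode` and `decode` are illustrative names, not necessarily what the original code uses:

```python
# Hypothetical sample text standing in for `lines`; assumed to be a single string.
lines = "hello world\nhello there\n"

# Iterating over a string yields characters, so set(lines) is the set of unique characters.
vocab = sorted(set(lines))

# Illustrative lookup tables: character -> integer id, and back.
stoi = {ch: i for i, ch in enumerate(vocab)}
itos = {i: ch for ch, i in stoi.items()}


def encode(text):
    """Turn a string into a list of character ids."""
    return [stoi[ch] for ch in text]


def decode(ids):
    """Turn a list of character ids back into a string."""
    return "".join(itos[i] for i in ids)


print(vocab)
print(encode("hello"))
print(decode(encode("hello")))
```

If `lines` were a list of strings instead, `set(lines)` would deduplicate whole lines rather than characters, so the vocabulary would no longer be character-level.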