by riku_iki 1061 days ago
Absolutely not. Transformer layers already communicate using embeddings, and ASCII would be far less efficient there.

And how many bits are in an embedded vector?
12k for GPT-3.
It is not bits, but weights
So somehow ASCII is less information-dense than 12k 32-bit floats per token?
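
For anyone who wants the raw numbers behind that question, here is a back-of-the-envelope sketch. It takes the "12k" above to mean GPT-3's hidden size of 12288 and uses the 32-bit floats mentioned in the comment. The 4 characters per token and 8 bits per ASCII character are my own rough assumptions, not figures from the thread. It only compares storage size, which says nothing about how much information either representation actually carries.

```python
# Back-of-the-envelope storage comparison, not a measure of information content.
D_MODEL = 12288          # GPT-3 175B hidden size (the "12k" above)
BITS_PER_FLOAT = 32      # as stated in the comment; fp16 would halve this
ASCII_BITS_PER_CHAR = 8  # assumption: one byte per character as usually stored
CHARS_PER_TOKEN = 4      # assumption: rough average for English text


def embedding_bits(d_model=D_MODEL, bits_per_float=BITS_PER_FLOAT):
    """Raw storage in bits for one token's embedding vector."""
    return d_model * bits_per_float


def ascii_bits(chars=CHARS_PER_TOKEN, bits_per_char=ASCII_BITS_PER_CHAR):
    """Raw storage in bits for the same token spelled out in ASCII."""
    return chars * bits_per_char


if __name__ == "__main__":
    emb = embedding_bits()
    txt = ascii_bits()
    print(f"embedding: {emb} bits per token")
    print(f"ascii:     {txt} bits per token")
    print(f"ratio:     {emb // txt}x")
```

Under those assumptions the embedding takes several orders of magnitude more raw bits per token than the ASCII text, which is the gap the question is pointing at.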