|
|
|
|
|
by endymi0n
1641 days ago
|
|
Just doing some napkin math, the whole GPT-J corpus was around 500 billion tokens, which at 4 tokens per byte would be roundabout 2 Terabyte. That, parked on a fast NVMe SSD will give you roundabout 1MM random lookups per second. Even with some transfers inbetween, this should be more than enough to not just perform in equal time, but probably less — as well as cost you less than the GPU you need for the reduced size model. Exciting times. |
|
If you don't update your database and indices they are great. But that's something really tempting to do when you do some machine learning, (specially if you know that people with deeper pockets will do so).
Typically you will have a neural network, you run it on your dataset, it produces a new dataset of embeddings, you index them, and you use this index to train a new neural network, and you repeat the loop, hopefully improving results along the way.
NVMe SSD can write at 6GB/s but can only write ~800TB that's about 37 hours of lifetime at max speed.