by soleveloper
239 days ago
Hey, cool initiative! Worth mentioning in the title that it's CPU-only: >1200 tokens/s on a single thread is impressive. Have you considered doing optimization iterations like nanogpt-speedrun? Would be interesting to see how far you can push the performance.