Hacker News new | ask | show | jobs
by bjornsing 644 days ago
Have you tried implementing your ternary transformers on AVX(-512)? I think it fits relatively well with the hardware philosophy, and being able to run inference without a GPU would be a big plus.
1 comments

Our CPU implementation for X86/AMD64 utilizes AVX-512 or AVX-2 instructions where possible. We're experimenting with support for ARM with NEON.