by dandinu 196 days ago
Oh darn, you're absolutely right (pun intended) about the 32-bit situation. SSE2 is really the "floor" there if you want any kind of reasonable compatibility. I was being a bit optimistic with SSE4.2 even for 64-bit - technically safe for most chips from that era but definitely not all.

The Pentium Pro challenge though... pure x87 FPU inference? That would be gloriously cursed. You'd basically be doing matrix math like it's 1995. `-mfpmath=387` and pray.
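
For a sense of what the x87-only path means in practice, here is a minimal sketch of a scalar matrix-vector product in C. `matvec_scalar` is a hypothetical helper, not a function from llama.cpp or ggml, and it assumes row-major float32 weights with no quantization; the real kernels are considerably more involved.

```c
#include <stddef.h>

/* Hypothetical helper (not part of llama.cpp or ggml): computes y = W * x,
 * where W is stored row-major as rows x cols float32 values.
 * Plain scalar loops only, no intrinsics, so the compiler can lower this
 * to x87 instructions when SSE is unavailable or disabled. */
void matvec_scalar(const float *w, const float *x, float *y,
                   size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; r++) {
        float acc = 0.0f;                    /* one accumulator per output row */
        for (size_t c = 0; c < cols; c++) {
            acc += w[r * cols + c] * x[c];   /* one multiply-add at a time */
        }
        y[r] = acc;
    }
}
```

Built with something like `gcc -m32 -march=pentiumpro -mfpmath=387`, loops of this shape should compile to plain x87 code with no SIMD at all.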

I'm genuinely tempted to try this now. The build flags would be something like:

  -DGGML_AVX=OFF -DGGML_AVX2=OFF -DGGML_FMA=OFF \
  -DGGML_F16C=OFF -DGGML_SSE42=OFF -DGGML_SSSE3=OFF \
  -DGGML_SSE3=OFF -DGGML_SSE2=OFF  # pain begins here

And then adding `-ffast-math` to `CMAKE_C_FLAGS` because at that point, who cares about IEEE 754 compliance, we're running a transformer on hardware that predates Google.
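
Roughly what `-ffast-math` trades away: it lets the compiler reassociate floating-point operations, which strict IEEE 754 semantics forbid because float addition is not associative. A small C sketch of the difference, meant to be compiled without `-ffast-math` (the function names are made up for illustration):

```c
/* Hypothetical illustration: the same three-term sum, grouped two ways.
 * Under strict IEEE 754 semantics the compiler must keep the grouping
 * exactly as written in the source. */
float sum_left(float a, float b, float c)
{
    return (a + b) + c;   /* add a and b first, then c */
}

float sum_right(float a, float b, float c)
{
    return a + (b + c);   /* add b and c first, then a */
}
```

With `a = 1e20f`, `b = -1e20f`, `c = 1.0f`, the first form gives 1 and the second gives 0 under strict semantics, because the 1 is lost when added to -1e20. With `-ffast-math` the compiler is free to treat the two as interchangeable.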

If someone actually has a Pentium Pro lying around and wants to see Qwen-0.5B running on it... that would be the ultimate read for me as well.

Thanks for the kind words. Always fun to find fellow retro computing degenerates in the wild.