Hacker News
by mekpro 313 days ago
Try enabling flash attention and offloading all layers to the GPU.