by coder543 898 days ago
The fallback does seem to work, although the 4-bit 7B models only run at about one token every several seconds.

I still wish Phi-2, Dolphin Phi-2, and TinyLlama-Chat-v1.0 were available, but I understand you have plans to make it easier to download any model in the future.