Seems like torchchat is exactly what the author was looking for.
> And the 8B model typically gets killed by the OS for using too much memory.
Torchchat also provides some quantization options so you can reduce the model size to fit into memory.
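
For a rough sense of why quantization helps here, this is a back-of-the-envelope sketch of weight memory at different precisions. It assumes a round 8 billion parameters and counts weights only. The helper name `weight_memory_gib` is made up for illustration and is not part of torchchat. Real usage will be higher once you add the KV cache, activations, and the scales that quantized formats store.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Assumption: "8B" taken as a round 8e9 parameters.
PARAMS_8B = 8e9

if __name__ == "__main__":
    for label, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{label}: ~{weight_memory_gib(PARAMS_8B, bits):.1f} GiB")
```

So going from 16-bit to 4-bit weights cuts the weight footprint to roughly a quarter, which can be the difference between getting killed by the OS and fitting in RAM.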