
I have 64 GB of RAM, but it really depends on the quantization. In LM Studio I see versions ranging from 15 GB to 49 GB, and that's roughly how much RAM each one will require.
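
For a rough feel of why quantization changes the size so much, here is a back-of-the-envelope sketch. It assumes the weights dominate memory use and ignores context/KV cache and runtime overhead, so real numbers will be somewhat higher. The 70B parameter count and the bit widths are hypothetical values picked for illustration, not figures from LM Studio.

```python
def estimate_model_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (10^9 bytes): parameters * bits / 8.

    Assumption: every weight is stored at the same average bit width.
    """
    return n_params_billion * bits_per_weight / 8


if __name__ == "__main__":
    # Hypothetical 70B-parameter model at a few illustrative bit widths.
    for bits in (2, 4, 5, 8):
        size = estimate_model_gb(70, bits)
        print(f"{bits} bits/weight: about {size:.1f} GB")
```

If the estimate for a given quantization is close to or above your total RAM, that version probably won't be usable without offloading or swapping.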

LM Studio will also let you do partial GPU offloads, but I've only started experimenting with that. The 1-2 tokens/second figure is what I got using GPT4All.