You need ollama[1][2] and hardware that can run 20-70B models at Q4 quantization or better to get an experience similar to commercially hosted models. I use codestral:22b, gemma2:27b, gemma2:27b-instruct, and aya:35b.
Smaller models are useless for me because my native language is Ukrainian (it's easier to spot a model's mistakes in a language with more complex grammar rules).
As a GUI, I use the Page Assist[3] plugin for Firefox, or aichat[4], a command-line and WebUI tool.
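If you'd rather script it than use a GUI, here is a minimal sketch of talking to a locally running ollama from Python. It assumes ollama is listening on its default port (11434) and that the model has already been pulled; the helper names build_request and ask are made up for this example.

```python
import json
import urllib.request

# Assumption: ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the local ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the prompt and return the generated text.

    Requires a running ollama server; big models can take a while,
    hence the generous (arbitrary) timeout.
    """
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

With the server up, something like ask("gemma2:27b", "Translate 'good morning' into Ukrainian.") should return the reply as a plain string. Only the standard library is used so there is nothing extra to install.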
I have no idea what "reasonably fast" means for you. Performance is best when the model fits entirely in the graphics card's memory. An Nvidia 4090 with 24 GB will be more than enough to start learning. I use a gaming notebook with an Nvidia 3080 Ti with 16 GB of video memory.
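For a rough idea of what fits where, here is a back-of-the-envelope sketch. The numbers are my assumptions, not measurements: about 4.5 bits per weight for a Q4-style quantization, plus a flat 1.5 GB for context and runtime overhead. Real usage depends on the exact quantization and context length.

```python
# Assumption: Q4-style quantizations average roughly 4.5 bits per weight.
BITS_PER_WEIGHT_Q4 = 4.5
# Assumption: flat allowance for context (KV cache) and runtime buffers.
OVERHEAD_GB = 1.5


def estimate_model_gb(params_billion: float,
                      bits_per_weight: float = BITS_PER_WEIGHT_Q4,
                      overhead_gb: float = OVERHEAD_GB) -> float:
    """Rough memory footprint in GB for a quantized model."""
    weights_gb = params_billion * bits_per_weight / 8  # billions of params * bytes per param
    return weights_gb + overhead_gb


def fits(params_billion: float, memory_gb: float) -> bool:
    """True if the estimated footprint fits in the given amount of memory."""
    return estimate_model_gb(params_billion) <= memory_gb


if __name__ == "__main__":
    for size in (13, 22, 27, 35, 70):
        need = estimate_model_gb(size)
        print(f"{size}b: ~{need:.1f} GB | 16 GB card: {fits(size, 16)} | 24 GB card: {fits(size, 24)}")
```

By this estimate a 22b model fits on a 16 GB card, 27b and 35b want a 24 GB card, and 70b does not fit on either. A model that doesn't fit can still run with part of it in system RAM, just slower.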
I have no issues with using just the CPU on smaller (<= 13b) models and it's quite fast enough for me. Even 70b models still work if you have the RAM, they're just much slower.