Also consider a hosted Llama-as-a-service provider like DeepInfra. I have a local 3090 for playing around with the smaller models, but it's nice having the option of calling the 405B.
Ooh, I like that. I can see using them as a stepping stone: I get to use an open-source model without the hassle of setting up my own machine (which I could still do later).