There are so-called 'open-weight' language models, and a lot of them (those with between 8B and 34B parameters, anyway) are considerably cheaper to run than any of OpenAI's models.
Moreover, you get performance surprisingly far above their size class if you fine-tune for your specific problem space, even if you only train in a parameter-efficient way.
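
To give a rough sense of what "parameter-efficient" can mean in practice, here is a minimal arithmetic sketch. It assumes a LoRA-style low-rank adapter, which is only one of several parameter-efficient methods and is not named in the text above. The hidden size, layer count, rank, and number of adapted matrices are hypothetical values chosen for illustration, not figures for any particular model.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters in a rank-`rank` adapter for one d_out x d_in weight matrix."""
    # A LoRA-style adapter learns two small matrices instead of the full weight:
    # A with shape (rank, d_in) and B with shape (d_out, rank).
    return rank * d_in + d_out * rank


# Hypothetical shapes, for illustration only (assumptions, not from the text).
hidden = 4096            # width of each adapted square projection
layers = 32              # number of transformer layers
rank = 8                 # adapter rank
adapted_per_layer = 2    # assume two projections per layer get an adapter

# Parameters you would update when fully fine-tuning those same matrices.
full = layers * adapted_per_layer * hidden * hidden

# Parameters you update when training only the low-rank adapters.
lora = layers * adapted_per_layer * lora_trainable_params(hidden, hidden, rank)

fraction = lora / full
print(f"full: {full:,}  adapter: {lora:,}  fraction: {fraction:.4%}")
```

Under these assumed shapes the adapters hold about 0.39% of the parameters in the matrices they modify, which is why this kind of training fits on much more modest hardware than full fine-tuning.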