Specifically: to explore your open-source options under compute limitations, ask the community at r/LocalLLaMA on Reddit. That's where the current SOTA open-source text-to-text models get discussed.
Yeah, I was looking there earlier. Thankfully we mostly have MacBooks, but I recently found out new devs are getting the smaller 8 GB RAM MacBooks as well, which is going to be even more frustrating.
Since my team is mostly remote, running an LLM on a cluster in the office is not really viable in the short term.
This is totally going to suck, but here's one option that was suggested to me a few minutes ago: https://www.reddit.com/r/LocalLLaMA/comments/1th1mqx/comment... For context, I was asking about running anything OpenClaw-friendly on my RTX 4060 with 8 GB of VRAM. I know yours is a more involved use case, but there's still some optionality here.