This is what I'm fiddling with. My 2080 Ti is not quite enough to make it viable. I find the small models fail too often, so I need larger Whisper and LLM models.
The 4060 Ti, for example, would have been a nice fit if it weren't for its narrow memory bus, which makes it slower than my 2080 Ti for LLM inference.
A more expensive card is too costly to justify leaving idle in my server, and my gaming card is at times busy gaming.
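
For a rough feel of why the bus width matters: token generation is mostly limited by how fast the weights can be read from VRAM, so peak memory bandwidth gives a ceiling on tokens per second. Here is a back-of-the-envelope sketch. The bus widths and data rates are the published specs as I remember them (an assumption, double-check them), the 8 GB model size is a made-up example, and the function names are just mine.

```python
# Bandwidth-only estimate; ignores compute, cache effects, and VRAM capacity.

def bandwidth_gb_s(bus_width_bits: int, data_rate_gbps: float) -> float:
    """Peak memory bandwidth in GB/s: bus width in bytes times per-pin data rate."""
    return bus_width_bits / 8 * data_rate_gbps


def max_tokens_per_s(bandwidth: float, model_size_gb: float) -> float:
    """Upper bound on decode speed if every weight is read once per token."""
    return bandwidth / model_size_gb


# (bus width in bits, per-pin data rate in Gbps); assumed from published specs.
CARDS = {
    "RTX 2080 Ti": (352, 14.0),
    "RTX 4060 Ti": (128, 18.0),
}
MODEL_SIZE_GB = 8.0  # hypothetical quantized model, not a specific one

for name, (bus, rate) in CARDS.items():
    bw = bandwidth_gb_s(bus, rate)
    ceiling = max_tokens_per_s(bw, MODEL_SIZE_GB)
    print(f"{name}: {bw:.0f} GB/s, at most ~{ceiling:.0f} tokens/s")
```

With those numbers the 2080 Ti has a bit over twice the bandwidth of the 4060 Ti, so its ceiling is a bit over twice as high. Real throughput will land below the ceiling on both cards.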