by blemis 65 days ago
love the budget-constraints section, feels way too familiar. i'm running 7 NLP models on an $11/month VPS for a different project and every architecture decision ends up being "what's the cheapest way that doesn't burn my bank account on one runaway request."

question: how are you handling demucs cold starts on modal? cold start was what eventually pushed me off modal for a request-response use case. the user is staring at a spinner for 20-30s on first invocation and it kills conversion. did you solve it or just eat it because the rest is so computationally heavy anyway?