HN Mirror



	by erikbern 852 days ago
	Founder of Modal here. We've spent a ton of time on this, including building our own distributed file system optimized for low-latency, high-throughput workloads. We don't use K8s or Docker and built our own custom infrastructure instead. Cold starting containers quickly is a fascinating problem. We've come a long way but there's still a lot more to do. For GPU-based inference, starting containers isn't enough – you also need to initialize the model on the GPU quickly. We are working on a long list of things that will bring down cold start latency even further.

1 comment

hanrelan 848 days ago

Is Modal a good solution for running fine-tuned LLMs and Whisper models? If the cold-start time is low we're more than willing to modify our code to use Modal's infra. Happy to follow up via email but didn't see one in your profile.
