| These are the biggest ways to lower cost that I've used in the past. With a high burn rate, it's important to focus on the things that can change the economics on a short timeline (think next week) as well as activities on a longer timeline (next year). You should have a plan in place for your board, and be able to discuss the cost reduction strategy for Cost of Goods Sold in any future financing rounds. Carefully consider the full TCO: buying colo hardware means opting out of ~3 years of future price reductions and hardware improvements in the cloud, plus the opportunity cost.

1) Call your provider and find out what options they have to cut your cost. This can take the form of discounts, credits, or increased reservations.
2) It's not uncommon for ML teams to have excess capacity sitting around for forgotten R&D activities. Make sure that your team is tearing down hardware it no longer uses, and consider giving all scientists their own dedicated workstation for model development activities. You can smoke test the opportunity here by verifying that the GPUs are actually being utilized at ~40-80% average capacity.
3) Really dive into whether you need the parameters/model architecture you have. The best model for your company will need to balance latency/cost with accuracy. If you're using a transformer where a CNN or even a logistic regression with smart feature extractors could do the job with a 1% accuracy loss, then do your customers really need the transformer?
4) As others have suggested, drill down on the inference and training costs. Train less frequently, not at all, or on a sample of your data. Generally the benefit of using more data in a model is logarithmic at best, while training time grows linearly.
5) Buy your own hardware, particularly for GPU inference. RTX cards can be purchased in servers for your own colo, but not in clouds. The lead time would be a few months, but the payoff could occur within ~2-6 months in a colo.
6) Leaving this here as it used to affect analytics/ad-tech and other "big-data" companies. Programming languages are not created equal in performance, and given equal implementations a statically typed language will crunch data between 10 and 1000x faster and cheaper than a dynamically typed language. If your business is COGS pressed, then your team will probably spend more time optimizing hardware deployments and squeezing perf out of your dynamic language than it gains in productivity. Drill down on your costs, check how much of it is raw data processing/transaction scheduling/GPU scheduling, and make sure that you're on the right tech path for your customers.

Lastly, at an 80% Cost of Goods Sold (COGS) it's quite possible that your business is either low margin or, since this is a new startup, the pricing structure isn't well aligned. Ask yourself if you expect to raise prices for future non-founding customers. If so, then it's possible that your current customers are helping reduce your marketing expenditures, and you may be able to leverage the relationship to help "sell" to future customers. |
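
The colo payback estimate in point 5, and the TCO caveat about opting out of future cloud price reductions, can be sanity-checked with a quick calculation. What follows is only a rough sketch: the function name `payback_months`, the dollar figures, and the assumed annual cloud price decline are all hypothetical placeholders, not numbers from any provider, and it ignores opportunity cost and staffing entirely.

```python
def payback_months(hardware_cost, cloud_monthly, colo_monthly,
                   annual_cloud_price_drop=0.0, horizon_months=36):
    """Return the first month in which cumulative cloud spend catches up with
    the hardware purchase plus cumulative colo spend, or None if that does
    not happen within the horizon.

    All inputs are assumptions you supply yourself:
      hardware_cost           - upfront purchase price of the servers
      cloud_monthly           - current monthly cloud bill for the same workload
      colo_monthly            - monthly colo fees (rack, power, bandwidth)
      annual_cloud_price_drop - assumed yearly cloud price decline (0.2 = 20%)
    """
    # Convert the assumed yearly price decline into a per-month multiplier.
    monthly_factor = (1.0 - annual_cloud_price_drop) ** (1.0 / 12.0)

    cloud_total = 0.0
    owned_total = float(hardware_cost)
    cloud_price = float(cloud_monthly)

    for month in range(1, horizon_months + 1):
        cloud_total += cloud_price
        owned_total += colo_monthly
        if cloud_total >= owned_total:
            return month
        # The cloud gets cheaper next month under the assumed decline.
        cloud_price *= monthly_factor
    return None


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    print(payback_months(60_000, 20_000, 5_000))
    print(payback_months(60_000, 20_000, 5_000, annual_cloud_price_drop=0.3))
```

With flat cloud prices the hypothetical purchase above pays for itself in month 4; assuming a 30% yearly cloud price decline pushes that to month 5. If the function returns `None` for your own numbers over a ~3 year horizon, the purchase probably does not pay off before the hardware is dated.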