| He has GLM 4.5 running at ~100 tokens per second. Assumptions: batch 4x to get 400 tokens per second, pushing his power consumption to 900 W instead of the underutilized 300 W. Electricity is around €0.2/kWh. Tokens are valued at €1 per 1M output tokens. Assume ~70% utilization. Result: you get ~1M tokens per hour (400 × 3600 × 0.7), so roughly €1/hr in revenue against ~€0.18/hr in electricity, for a net profit of ~€0.8/hr. That gives a payoff time of a bit over a year given the €9K investment. Honestly though, there is a lot of handwaving here. The most significant unknown is whether you can get high utilization with aggressive batching and 24/7 load. Also, the demand for privacy can make the tokens worth much more than typical API prices for open-source models. Looked at another way, renting 2 H100s costs around $6 per hour, and measured against that the payback time is a bit over two months. |
GLM 4.5 Air, to be precise. It's the smaller 106B model, not the full 355B one.
Worth mentioning when discussing token throughput.
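
For anyone who wants to rerun the quoted payback estimate with their own numbers, here is a rough Python sketch of the same arithmetic. The function and parameter names are made up for illustration. It assumes throughput scales linearly with 4x batching, that the box draws the full 900 W around the clock (slightly pessimistic at 70% utilization), and it treats the $6/hr rental price as if it were euros, which is as loose as the original comparison.

```python
def payback_estimate(
    tokens_per_second=400.0,      # assumed batched throughput (4 x ~100 tok/s)
    utilization=0.70,             # fraction of each hour spent serving requests
    power_watts=900.0,            # assumed draw under batched load
    eur_per_kwh=0.20,             # electricity price
    eur_per_million_tokens=1.0,   # value assigned to output tokens
    hardware_cost_eur=9000.0,     # up-front investment
    rental_per_hour=6.0,          # 2x H100 rental, USD treated 1:1 as EUR (assumption)
):
    """Back-of-envelope payback time for a local inference box."""
    tokens_per_hour = tokens_per_second * 3600 * utilization
    revenue_per_hour = tokens_per_hour / 1e6 * eur_per_million_tokens
    # Power is charged for the full hour regardless of utilization.
    power_cost_per_hour = power_watts / 1000 * eur_per_kwh
    net_per_hour = revenue_per_hour - power_cost_per_hour
    return {
        "tokens_per_hour": tokens_per_hour,
        "net_eur_per_hour": net_per_hour,
        # Payback if the tokens are sold at the assumed price.
        "payback_years_selling": hardware_cost_eur / net_per_hour / 24 / 365,
        # Payback measured against renting equivalent GPUs instead.
        "payback_days_vs_rental": hardware_cost_eur / rental_per_hour / 24,
    }


if __name__ == "__main__":
    for name, value in payback_estimate().items():
        print(f"{name}: {value:,.2f}")
```

With the defaults this comes out to about €0.83/hr net and a payback of roughly 1.2 years when selling tokens, or about two months when measured against rental cost, in line with the quoted figures. Dropping `utilization` or `tokens_per_second` shows how quickly the estimate falls apart if batching doesn't deliver.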