Hacker News
"Tinyboxes finally have a buy it now button" (twitter.com)
197 points by hedgehog0 655 days ago
15 comments

I thought to myself this morning: "boy, that $15k price tag is tempting." Then I thought to myself "how many times have I downloaded a GitHub repo only to hand-replace cuda with mps, and then tried to figure out if there's a version of xformers that will work this week with my M3?" and then I thought "boy, that $25k is tempting." (15k: Radeon / 25k: Nvidia).

For those wondering about power: it draws 3200W. In residential / low-end commercial settings in the US, they say you'll need two separate circuits, but they have a built-in power limiting utility in the OS which will let you safely run on one circuit at reduced speed.

The only part of this that gives me pause is interconnect -- over PCIe, 64GB/s stated. This is much, much lower than InfiniBand -- can any ML engineers comment on using this box for a full finetune of, say, Llama 3.1 70B?

You can't fine tune a 70B model with this. It barely even fits the weights before it runs out of vram. Need a bigger machine.
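
A rough sketch of the arithmetic behind that claim. The assumptions are mine, not from the product page: 24 GB per GPU (both the 7900XTX and the 4090 ship with 24 GB), BF16 weights at 2 bytes per parameter, and the common ~16 bytes per parameter rule of thumb for a full fine-tune with Adam in mixed precision (BF16 weights and gradients, plus an FP32 master copy and two FP32 optimizer moments). Activations are ignored, so the real requirement is higher.

```python
# Back-of-the-envelope VRAM estimate for a 70B model on a 6x 24 GB box.
# All constants below are illustrative assumptions, not measured values.

GB = 1e9

def weights_gb(params: float, bytes_per_param: float = 2.0) -> float:
    """Memory for the weights alone (BF16 = 2 bytes per parameter)."""
    return params * bytes_per_param / GB

def full_finetune_gb(params: float, bytes_per_param: float = 16.0) -> float:
    """Weights + grads + FP32 master weights + two Adam moments, no activations."""
    return params * bytes_per_param / GB

total_vram_gb = 6 * 24          # 144 GB across the box
params = 70e9

print(f"VRAM available:    {total_vram_gb} GB")
print(f"BF16 weights:      {weights_gb(params):.0f} GB")        # 140 GB
print(f"Full fine-tune:    {full_finetune_gb(params):.0f} GB")  # 1120 GB
```

So the weights alone leave about 4 GB of headroom, and a full fine-tune under these assumptions needs several times the box's total VRAM. Parameter-efficient methods or heavy sharding change the picture, but that is a different workload.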
I think the idea is you network them together if you need more and most models can be split nicely.
For that you'd probably be better off removing one of the GPUs, and replacing it with a networking card.

The problem of the form factor will remain. The tinybox is 15U big for compute that you'd normally expect to find in a 4U form factor.

I don't think they're intended for rack usage like that. More like for people to put under their desks... there would be no reason to build the giant case with fancy silent-ish cooling if you're going to put them next to your other jet engines.
Fully agree, and I think the tinybox is great if you put only one of them somewhere in your local office.

I just don't think it makes sense to connect multiple of them into a "cluster" to work with bigger models, as the networking bandwidth isn't good enough and you'd have to fit multiple of these big boxes into your local space. Then I might as well put up a rack in a separate room.

3kW under your desk... no need to turn on the heat in the winter!
Most models actually can't be split nicely by 6. There's a reason nvidia builds nodes with 4 and 8 GPUs.
I don't see why 6 is inherently worse than 4 or 8, not all of the layers are exactly equal or a power of 2 in count. 2^2, 2^3, vs 2^1*3^1 might give you more options.

The main issue I run into is flops vs RAM in any given card/model.

Usually you want to split each layer to run with tensor parallelism, which works optimally if you can assign each kv head to a specific GPU. All currently popular models have a power of 2 number of kv heads.
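
To make the divisibility point concrete, here is a minimal sketch. The helper name is hypothetical, and 8 KV heads is just an illustrative power-of-2 value; check the config of the model you actually care about.

```python
def even_tp_split(num_kv_heads: int, num_gpus: int) -> bool:
    """True if every GPU can own the same whole number of KV heads."""
    return num_kv_heads % num_gpus == 0

NUM_KV_HEADS = 8  # assumed example value (a power of 2)

for gpus in (2, 4, 6, 8):
    ok = even_tp_split(NUM_KV_HEADS, gpus)
    print(f"{gpus} GPUs: {'clean split' if ok else 'uneven split'}")
# 2, 4 and 8 GPUs split cleanly; 6 does not.
```

With 6 GPUs you either leave some cards idle for the tensor-parallel part, replicate heads, or fall back to a different parallelism scheme (e.g. pipeline stages), each of which costs something.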
The networking of the tinybox is woefully inadequate. I.e. it only has an OCP 3.0 interface which is unoccupied. If you can fit everything onto one tinybox, then you'll be good, if you cannot, then you'd be better off by having a more professional workstation solution like e.g. NVIDIA RTX cards which have more memory.
That OCP 3.0 card has the same link bandwidth as the GPUs, so you can scale out without much loss of all-reduce bandwidth. In practice, for all models except the largest, the ~16GB/s all-reduce is totally fine. You just need to make sure you can all-reduce all weights in your training step time.

Say you are training a 3B parameter model in BF16. That's 6GB of weights, as long as your step time is >=500ms you won't see a slowdown.
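
A minimal sketch of that estimate, assuming the simplest possible model: the time to all-reduce is just payload size divided by the ~16GB/s all-reduce bandwidth quoted above. It ignores latency, overlap with compute, and gradient compression, so treat the output as a lower bound on the communication time.

```python
def allreduce_seconds(params: float, bytes_per_param: float, bw_gb_s: float) -> float:
    """Naive all-reduce time: total bytes divided by effective bandwidth."""
    return params * bytes_per_param / (bw_gb_s * 1e9)

ALLREDUCE_GB_S = 16   # figure quoted in the comment above
BF16_BYTES = 2

for name, params in (("3B", 3e9), ("70B", 70e9)):
    t = allreduce_seconds(params, BF16_BYTES, ALLREDUCE_GB_S)
    print(f"{name}: {t:.3f} s per all-reduce")
# 3B:  0.375 s
# 70B: 8.750 s
```

The naive number for 3B is 375ms, so the 500ms step time mentioned above leaves some margin. The same formula shows why much larger models are a different story over this link.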

> 3B parameter model

That's tiny. Can it train/fine-tune 70B models?

a 220 volt 20 amp circuit should be good for over 3500 watts constant load in North America. Why is it requiring two circuits?
Most likely what they actually mean is:

This server has two IEC C20 connectors, rated for ~16 amps, each feeding a PSU rated for 1600W (i.e. 16A @ 100v)

If you're plugging in to 110v you shouldn't plug them both into the same outlet, as a 20A circuit can't supply 32A.

As each PSU is rated for 1600W you'll have to plug both in to get 3200W even if you're running on 220v - although they'd only draw ~7.2A each in that case.
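
The per-PSU currents above can be reproduced with a few lines. This is only the I = P / V arithmetic with power factor taken as 1 and PSU efficiency ignored, so real wall-side current will be somewhat higher.

```python
def amps(watts: float, volts: float) -> float:
    """Current drawn by a load of `watts` at `volts` (power factor assumed 1)."""
    return watts / volts

PSU_WATTS = 1600  # per-PSU rating from the comment above

for volts in (100, 110, 220):
    per_psu = amps(PSU_WATTS, volts)
    print(f"{volts} V: {per_psu:.1f} A per PSU, {2 * per_psu:.1f} A for both")
```

At 220v this gives about 7.3A per PSU, close to the ~7.2A quoted, and under 15A for both together. At 110v the pair lands near 29A, which is why one 20A circuit is not enough.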

US Residential 220v dryer outlets are usually wired one-circuit-to-one-outlet, and multi-way adaptors are discouraged. So although plugging two 7.2A loads into a single 20A feed would work from a current perspective (and indeed it's common in Europe), I don't know how easy it is to do legally.

If you're in a data centre with a 3-phase 220v power you probably know what you're doing. Your UPS guy will probably thank you if you split your load over two phases instead of putting the whole load onto one phase.

Imagine dropping $15k on this but not wanting to spend $800 on an electrician to properly wire a 50A circuit so you run extension cords across the room creating a fire hazard.

As for the datacenter (I’ve racked many things with A/B power) the entire point is redundancy which this defeats the purpose of since each PSU is not properly rated. Seems incredibly bizarre to me in so many ways.

> As for the datacenter (I’ve racked many things with A/B power) the entire point is redundancy which this defeats the purpose of. Seems incredibly bizarre to me in so many ways.

Yes - often for the data centre you'd end up with something like [1] with 4x 2700W power supplies, providing redundancy and ample power at the same time. It does mean you need four 220v power feeds though.

[1] https://www.supermicro.com/en/products/system/gpu/4u/sys-421...

Why is it a fire hazard if the extension cords are properly rated for this load?
Extension cords are only supposed to be used for 90 days or so, you're technically violating the NEC if you're using them in a permanent installation.
You can feed a US outlet the split phase 240V and get two 120V@20A each.

It used to be done in kitchens in the US, back when appliances were power hungry. I have done so in my workshop for the same reason.

Houses are wired in split phase 240V, with the neutral in the middle. That is, you have two opposite 120V phases, around the same neutral.

This is a clever way to double the power, while adding a single wire.

In the US the standard outlet receptacle has two outlets. Bring the same neutral to the two outlets, and assign one phase per outlet (outlets have metal tabs you can break off, you don't need any extra wiring).

At the panel, you have a dual breaker. One breaker per phase, with a physical linkage forcing them to trip and arm together at once.

As a benefit (but a very unsafe one), you can make up a Y that plugs into the two 120V outlets, and gives you a single 240V receptacle. This is unsafe because if you plug in only one of the 120V plugs, the other one now has 120V on its exposed phase prong! On the other hand, I have both 240V@20A and 2×120V@20A anywhere in the shop ;)

Skip the Y hack and do it in style, legally!

https://store.leviton.com/products/duplex-receptacle-outlet-...

I am aware of this. But then I have a single 120@20 vs a single 240@20.

With my setup I have 2×120@20 always available, and 240@20 for the occasional welding.

I could assign a different 120 phase to every other outlet but then I would need some clear identification.

The two phases are assigned to the top and bottom outlets the same way all around the shop. If I need to run two high amperage machines, I only have to remember to use one bottom and one top outlet.

If you're talking about a workshop and anticipating that much ad-hoc power usage, I'd just put two dual 6-20 receptacles side by side rather than splitting one. And then since you're actually creating the premises wiring, stick an (L)14-20R next to them in parallel and get rid of your need to fuss with hacky combiner cords. At least that's what I plan to do when I have the time for such luxuries.
I only very rarely need 240V, if I had permanently mounted 240V machines or frequent needs, I would do exactly what you propose.
GFCI requirements will interfere with the legality of many modern-day multi-wire branch circuit plans, yeah?
You can get a two-pole GFCI breaker for this purpose. The prices are a bit silly.
These are not redundant PSUs, each PSU powers different GPUs in the same machine. Are you sure connecting them to different phases is a good idea?

I've been looking for a proper answer to this for a while, because I want to build a similar machine with 8 GPUs (~4500W max load) which would need to be split between two 16A 230V circuits.

The transformers in the power supplies provide 'isolation' between the input and output - which means you can connect the outputs together, even when the inputs are on different phases.

Are you planning to build such a machine for your personal home use? If so you should be aware that (a) you might find server hardware hasn't thoroughly tested compatibility with things like suspend; (b) you might find games haven't thoroughly tested compatibility with multi-GPU setups; and (c) you might find the idle power consumption is 200W or more, even while doing nothing.

It's for personal use, though it would not run any games, it would be for running offline inference and other experiments. Probably not a smart purchase, but a fun one...

That is good to know multiple phases can work. Perhaps there would still be a fire risk in case of a short? Like somehow bridging the circuits > breakers don't trip?

Keep in mind GPUs (and the rest of the computer) run on DC, not AC, so there is no phase by the time it comes to your computer. The PSU will step down the AC to the right voltage and then rectify it into DC, and they do that independently so whatever phase they started with shouldn't matter.

Something to keep in mind though is that (at least with consumer-grade PSUs) it is not safe to simply tie the outputs together, even if both PSUs produce 5V, 12V, 3.3V, etc. The voltages will be slightly different and connecting them together will cause current to flow back into one of the PSUs.

You can still use this setup though, the key is that the GPUs do not (or should not) connect the motherboard voltage provided via the card slot to the voltage provided via the power connector. This detail allows you to safely power the motherboard from one PSU and power the GPU from another one, you just have to be careful not to mix connectors on the same card between different PSUs (if it has multiple). Additionally the motherboard should be entirely powered from a single PSU.

Because most households in the US might have maybe 3 breakers setup this way, all of which are likely running critical infrastructure already.

Most folks aren’t going to unplug their water heater to turn on their AI.

Swapping an electric dryer around is maybe more practical. It also gives you an obvious place to dump the waste heat.

If I was serious about this I'd have an electrician and HVAC installer on the way first. A mini split in the computer room with a dedicated 50A/220v circuit.

I assume the people dropping $15k on one of these to have in their house are comfortable with paying an electrician to wire it in if necessary.
There are few or no 220 volt circuits in North America. Your choices in that range are 208V or 240V.

But yes, a power supply can draw around 240V times 20A = 4800VA, which is nearly 4800W if the power factor is close to 1. An office in an office building is more likely to have 208V.

IIRC USA is 110V, not 220V?
I have a lot of 220V circuits. One is like 80A and powers a whole building. Also, almost all power comes into a home as 220V single phase from the local power distribution.

Water heater, heat pumps, stove, dryer, hot tub, etc are all 220.

Most US homes have at least one 220v split phase line for major appliances like stoves or AC.
Yes, but most homes don't have extra 220v outlets except for the ones provided for the specific appliances that need them.

So if you want to plug in a device like this "tinybox" at home, it's going to be a lot easier to find two separate 110v outlets on different circuits than to have a new 220v circuit added, or to unplug your stove every time you want to use it.

I don't know what adversarial relationship you have with electricians, but adding more 220v outlets is absolutely feasible. Usually takes an electrician a day of work.
Who needs a stove? My 3200W GPU box puts out more than enough heat to roast a chicken.
Most homes have a 240V supply with a neutral wire (V1, V2, N). This allows for split phase 120V power (V1+N, V2+N). You can also get 240V (V1+V2).

It's common for EVs, clothes dryers, ovens, and hot water heaters to use 240V while most other appliances are 120V.
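
For anyone who wants to see where the numbers come from, here is a small sketch using the standard phasor identity: two legs of equal RMS voltage V separated by a phase angle θ have a leg-to-leg voltage of 2·V·sin(θ/2). The function name is mine.

```python
import math

def leg_to_leg_volts(v_leg: float, phase_deg: float) -> float:
    """RMS voltage between two legs of equal magnitude, `phase_deg` apart."""
    return 2 * v_leg * math.sin(math.radians(phase_deg) / 2)

print(f"split-phase (180 deg): {leg_to_leg_volts(120, 180):.1f} V")  # 240.0 V
print(f"three-phase (120 deg): {leg_to_leg_volts(120, 120):.1f} V")  # 207.8 V
```

The same formula gives the 208V figure that comes up elsewhere in the thread: it is what you measure between two 120V legs of a three-phase supply.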

220V is American version of what is known as 380V/400V elsewhere.
US three-phase power is mostly 208V, 240V, and 480V. The 208V is what normal residential 120/240V split-phase was made from. 240V is high-leg delta three phase and I think was old alternative to split-phase. 480V is used for light industrial that needs more power.

There is nothing in US power system that is 220V.

Ackshually, you need to tell that to the GP of the thread; they began using "220v".
I had a preorder in for this but I canceled a few weeks back.

My experience trying to run machines this powerful in residential settings has been extremely poor.

All of the Seasonic power supplies that go beyond 1kW or so will trip my shitty (i.e. probably defective) Siemens AFCI breakers. Not even the same circuit all the time.

Even after violating local electrical code, I have found that living with a 1500w+ monster inside my house during the summer at 100% utilization is a complete joke. Unless you live in the perfect datacenter climate (i.e. the people who designed the tiny box), this thing needs to be inside. All of that wattage is pure heat being dumped into your home. The HVAC solutions in most residences were not designed for this kind of heat load. It would be like running your oven with the door hanging open all day. For those of us in places like Texas, this machine simply would not be feasible to run for half the year.

> All of the Seasonic power supplies that go beyond 1kW or so will trip my shitty (i.e. probably defective) Siemens AFCI breakers. Not even the same circuit all the time.

I don’t know much about US electrical standards but aren’t your residential circuits rated for 1800w or 2400w? Here in New Zealand they are 2400w and people regularly plug in 2400w fan heaters without issue.

> The HVAC solutions in most residences were not designed for this kind of heat load. It would be like running your oven with the door hanging open all day. For those of us in places like Texas, this machine simply would not be feasible to run for half the year.

Yes it wouldn’t be pleasant running this 24/7 in summer in any living space. But you could install a heatpump with 7kw of cooling capacity which should handle it (adding to the electricity bill).

> I don’t know much about US electrical standards but aren’t your residential circuits rated for 1800w or 2400w?

The residential AFCI issue I describe isn't about the wattage directly. It's about transient currents generated by large switch-mode power supplies being detected as arc faults. Similar concern as with induction motors.

That is interesting. In New Zealand AFCI is only required on 20A sub circuits in places that have a high fire risk, contain irreplaceable items, school sleeping areas and some other minor circumstances.
In the US the National Electrical Code caps continuous draw at 80% of the circuit's rating. So a 15A circuit is permitted a 1440W load even though it should handle 1800W.
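
The arithmetic, as a quick sketch. The 0.8 factor is the derating described above; the function name and the 120V nominal voltage are my assumptions.

```python
def continuous_watts(breaker_amps: float, volts: float, derate: float = 0.8) -> float:
    """Allowed continuous load on a circuit after applying the derating factor."""
    return breaker_amps * volts * derate

for breaker in (15, 20):
    peak = breaker * 120
    allowed = continuous_watts(breaker, 120)
    print(f"{breaker} A @ 120 V: {peak} W rated, {allowed:.0f} W continuous")
```

A 20A circuit comes out at 1920W continuous under the same rule, which is still well short of a 3200W box.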
Crazy. Our hair-dryer has 2kW.
> All of the Seasonic power supplies that go beyond 1kW or so will trip my shitty (i.e. probably defective) Siemens AFCI breakers.

In my experience, the Siemens AFCI just do that. I recommend switching them out for Eaton AFCI. That fixed all my nuisance tripping, especially from induction in other lines

I didn't realize Eaton had AFCI breakers listed for use with Siemens panels (or that they work better). I swapped it out for a non-AFCI Siemens, but if I can make it code compliant I'd much rather that.
Yeah. I think they have separate models, I forget the details but it's working in my panel!
If you're spending $15k on a box, you can also spend $1200 for a small insulated shed kit and $800 for a small mini-split heat pump. I live in a much warmer summer climate than Texas and this solution works fine for me for my small network cabinet.
If the main argument for the box is compute/$, "and then you need to spend another 20% on top to even make it work" is not the most winningest position. (20% because you also need to pay an electrician for the two-circuit wiring. Well, three, you want to run the heat pump too)

At that point it isn't super price-efficient, it's an absolute space hog, and you need to maintain a whole bunch of infra. Still might work for you, but it's losing a lot of general appeal

Could you not duct the heat through a hose and out a window? Like with portable AC units
yep, or drop it in your basement. also, will help heat your home during winter.
https://tinygrad.org/#tinybox

Looks like good value, but I wonder if it would get CPU/RAM bottlenecked, especially if you want to train something with a lot of preprocessing in the pipeline. Something comparable I've found is a 7x4090 build which comes to about $50k, but with much better CPU/RAM (3x CPU, 4x RAM, 5x SSD):

https://www.overclockers.co.uk/8pack-supernova-mk3-amd-ryzen...

> 6x PCIe 4.0 x16 (64 GB/s)

Wikipedia [0] states that PCIe 4.0 x16 has a throughput of ~32GB/s. What does the (64 GB/s) indicate on the website? Is this just a typo and you have 6x ~32GB/s, or does it mean that in total you can "only" expect a throughput of 64GB/s across all slots combined?

If so, wouldn't you also be bottlenecked by the PCIe bandwidth (when moving data between CPU and GPU)?

[0] https://en.wikipedia.org/wiki/PCI_Express#Comparison_table

Most EPYCs have 128 PCIe lanes, so I'd expect a full x16 link for all six GPUs.

Pedantically, the combined bidirectional bandwidth of PCIe x16 is ~64 GB/s, as it's a full-duplex ~32 GB/s link, but that's an awfully misleading spec if this is the intent (akin to claiming Gigabit Ethernet is 2 Gb/sec).
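
For reference, both figures fall out of the link parameters. PCIe 4.0 runs at 16 GT/s per lane with 128b/130b encoding; the sketch below ignores packet and protocol overhead, so achievable throughput is a bit lower.

```python
def pcie_gb_s(gt_per_s: float, lanes: int, encoding: float = 128 / 130) -> float:
    """Raw one-direction PCIe bandwidth in GB/s (8 bits per byte)."""
    return gt_per_s * lanes * encoding / 8

one_way = pcie_gb_s(16, 16)   # PCIe 4.0 x16
print(f"per direction:  {one_way:.1f} GB/s")      # ~31.5 GB/s
print(f"both directions: {2 * one_way:.1f} GB/s")  # ~63.0 GB/s
```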

It's the same way NVIDIA states bandwidth for PCIe and NVLink.

https://www.nvidia.com/en-us/data-center/h100/

Well they're specifying the AMD EPYC, and one of the things that the server line of AMD CPUs does that the consumer grade ones don't is offer lots of connectivity. So for example an AMD EPYC 8324P is a 32 core CPU with 96 lanes of PCIe Gen 5. Given that the 4090 GPU is PCIe Gen 4, I think that's where you get the discrepancy. The 6 GPUs are connected in parallel to the CPU with 6 x16 connections (96 total lanes). The CPU could do this at Gen 5 (64GB/s for each GPU) but the 4090 GPU is Gen 4 only, so you'll only actually get 32GB/s per connection.
It's 32GB/s in both directions. So when exchanging data two GPUs each can do this at 64GB/s. Is that a useful way to measure it? Who knows.
Closer to $42k, I think, at least if you're comparing it to the Tinybox price -- the price in pounds on the site includes VAT, which you wouldn't pay as a business or if you were getting it for export outside the UK, whereas you'd need to add on VAT if you were getting a Tinybox in the UK.
It’s weird how non-specific the CPU is there. Why wouldn’t they list a CPU part number? We don’t even know what generation of Epyc it is. (I get that it’s not the focus… but it is still important.)
They have a real website, why not link to that instead of Twitter?
I looked at the specs at the start of the year and just built something with the high end of consumer parts at around 4k usd. I was able to replicate the mlc 2x7900xtx results running some LLMs. Good enough to run most of the big models in gpu memory with a little quantization.
So is the plan for these to quietly update the hardware as better consumer hardware becomes available? This is a really interesting idea but as a small fry I would definitely be building myself if I went this route.
"tinybox red is 6x 7900XTX, tinybox green is 6x 4090"

So the red is ~$5k in gpu's - where is the other $10k going?

Motherboard and CPU, memory, NVMe drives, PSUs, SlimSAS cables and breakouts, a custom machined case, assembly, support.

You're free to try building one yourself for cheaper. If you consider your time for researching/assembling/testing it to be worthless, and are happy with a contraption in a miner frame, then you can probably do it.

PC builds seem to short circuit everyone's pricing logic and drive any labor cost down to $0, just because they're willing to do it for free. Anything above that $0 is considered overpriced.
It’s not that they’re willing to do it for free. It’s that they’re doing it for fun. It’s a hobby, not work.

Part of the fun is planning, researching, putting together the pieces and powering it on.

There are services that will build a PC for $200. It's entirely valid to ask where the money goes, and the answer is not the labor to put the pieces together. There's no reason to assume OP is being dismissive of that specific cost.
Very unusual specs on paper.

- Air cooling 6x4090 and a 32 core CPU for sustained peak workloads.

- 3200W total power when a single 4090 can draw close to 600W.

Maybe they are targeting startups who aren't interested in overclocking.

I think the plan with this all along was George went off and built exactly what he thinks he needs for his specific work and then just makes it available. So is the wildly underpowered CPU bad? I don't know, I don't know his use case.

It also seems just weird from a business point of view. He's not going to sell many, he's not going to offer support, he's not at a scale where vendors are going to offer much particular support, and despite being absolutely tiny in scale he's still offering two totally different SKUs.

This thing could run off a single US outlet (1600W) with some throttling, if I'm not mistaken?

Shame it wasn't designed for EU sockets. 230V*16A = 3700W, or double that on separate breakers!

3200W seems perfect for a EU socket.

Separate breakers aren't really a thing here, at least in my country. Usually if you need more power you draw 400V.

No it doesn't. A standard EU Socket is not certified for 24/7 3.2kW.

You should max. pull 2.7kW.

For everything else you need a blue eu socket or camper socket.

I learned this from my EV, which can be charged through a normal socket but deliberately regulates the current down for this reason, and has a temperature sensor built in as well.

US circuits are the same way. "Sustained use" (over 3 hours IIRC) has to be de-rated to 80% of max. So an EV can draw 40A on a 50A circuit.
So if it doesn't use 3.2kW continuously, but varies significantly based on what it's doing (perhaps even idle sometimes) then it's fine?
Yes. But you shouldn't risk it if you don't know. An ML job can run for hours or days.
Is it per spec, or from experience?
from spec. full load only needs to be supported for up to an hour
On their site they say it can run at 220V 15A if you got that.

https://docs.tinygrad.org/tinybox/

Just being nitpicky - I'm from the EU, but I think in the US, you can get either:

  240V: Split-phase, this gives you 120V between each leg and neutral, and 240V across the two legs.
  208V: The interphase in a 3-phase system.
Might be still within tolerance of 220V :)

HTH, ducking out :)

Specs on website updated, anywhere from 100-240V is fine
A 4090 draws 450W unless you unlock the power limit
Transient might be an issue. GN discuss transients of ~30/40% over the nominal 450w (https://www.youtube.com/watch?v=j9vC9NBL8zo&t=616s).

And with distributed training you can end up with "synchronized" transients over all cards :(.
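
A worst-case sketch of what that would look like, assuming all six cards spike at exactly the same moment and counting GPUs only (no CPU, fans or drives). The 450W nominal figure and the 30-40% overshoot are the numbers quoted above; the 3200W budget is the box's stated total.

```python
def gpu_peak_watts(num_gpus: int, nominal_watts: float, overshoot: float) -> float:
    """Combined GPU draw if every card overshoots nominal by `overshoot` at once."""
    return num_gpus * nominal_watts * (1 + overshoot)

PSU_BUDGET_W = 3200

for overshoot in (0.0, 0.3, 0.4):
    peak = gpu_peak_watts(6, 450, overshoot)
    status = "over" if peak > PSU_BUDGET_W else "within"
    print(f"+{overshoot:.0%}: {peak:.0f} W ({status} {PSU_BUDGET_W} W budget)")
```

Nominal draw is 2700W, but fully synchronized 30-40% spikes would land around 3500-3800W. Whether that matters depends on how short the spikes are and how much transient headroom the PSUs have, which this sketch does not model.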

I hold an electrical certification in the EU, though I'm not currently practicing.

A quick point: transient surges are usually fine. Both cables and circuit breakers are designed to fail (trip or burn out) under sustained overloads. For example, a 16A Class C circuit breaker might take around an hour to trip with a constant 17A load, but a ~80A load would trip it ~instantly.

PS: Of course, everything is a matter of integration over time (heat dissipation in cables mostly).

The workstation equivalent, the RTX 6000 Ada, defaults to 300W. You can get most of the performance of a 4090 by capping the power.
Unless you're making money off it, $15k + however much you have to spend on installing a new breaker panel is too much to spend on hardware that will be outdated in 2 years. If you're making money off it, but you're still cheap, then buy a Supermicro + H100s and colo it in a datacenter. If you're not cheap, you'll just use Azure. So I'm not sure who this product is supposed to be for.
> The $15k tinybox red is the best perf/$ ML box in the world. It's fully networkable, so that's the metric that matters.

No it isn't. Capex is only part of the equation. Opex (power and cooling amongst other things) is important. And networking at scale isn't cheap either.

Looks like the Mac cheese grater now has serious competition.
Anyone know how well this would compare to the Nvidia based workstations at the GPTShop.ai place?

ie: things like this https://gptshop.ai/config/indexus.html

What is the most profitable thing I can do with 10 of these fine Tinybox boxes?
Sell them and invest the money
At least that’s what geo decided is the most profitable thing to do
Got confused with George Motz and thought this was some super burger making thing.