Hacker News
by fnands 358 days ago
Mojo (and Modular's whole stack) is pretty much completely focused on people who are interested in inference, not so much training or research at this moment.

So they are going after people who need to build low-latency, high-throughput inference systems.

Also, as someone else pointed out, they target all kinds of hardware, not just NVIDIA.


Why not use PyO3 instead? It has a much cleaner interface than Cython and C++ libraries.

The primary advantage of Mojo seems to be GIL-free execution with a syntax that is as close to Python as possible.

GPU programming in Rust isn't great.

In Mojo it's pretty much the whole point of the language. If you're only using CPUs, then yeah, PyO3 is a good choice.

What about Candle, made by Hugging Face? It seems to at least cover the basics and has lots of examples, all of which run on both CPU and GPU. I haven't dived deeper into it, but I played around with it a bit and found it good enough for embedding purposes at least.
I think the big value add of Mojo is that you are no longer writing GPU code that only runs on one particular GPU architecture.

In the same way that LLVM allows CPU code to target more than one CPU architecture, MLIR/Mojo allows GPU code to target multiple vendors' GPUs.

There is some effort required to write the backend for a new GPU architecture, and Lattner has discussed it taking about two months for them to bring up H100 support.
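
A toy sketch of that idea in Python, in case it helps: one target-independent kernel description, plus a small per-vendor lowering function. This is not Mojo, MLIR, or LLVM code; every name here (`SAXPY`, `vendor_a`, `vendor_b`, `compile_kernel`) and both output syntaxes are made up for illustration.

```python
# Toy illustration only. Nothing here is a real Mojo, MLIR, or LLVM API;
# all names and both "assembly" syntaxes are hypothetical.
from typing import Callable, Dict, List, Tuple

# One instruction in a tiny target-independent IR: (op, dst, lhs, rhs).
Instr = Tuple[str, str, str, str]

# The kernel is written once, with no vendor-specific details.
SAXPY: List[Instr] = [
    ("mul", "t0", "a", "x"),   # t0 = a * x
    ("add", "y", "t0", "y"),   # y  = t0 + y
]


def lower_vendor_a(instr: Instr) -> str:
    """Lower one IR instruction to made-up vendor A syntax."""
    op, dst, lhs, rhs = instr
    return f"{op}.f32 %{dst}, %{lhs}, %{rhs};"


def lower_vendor_b(instr: Instr) -> str:
    """Lower one IR instruction to made-up vendor B syntax."""
    op, dst, lhs, rhs = instr
    return f"v_{op}_f32 {dst}, {lhs}, {rhs}"


# Supporting a new target means adding one more entry here;
# the kernel itself does not change.
BACKENDS: Dict[str, Callable[[Instr], str]] = {
    "vendor_a": lower_vendor_a,
    "vendor_b": lower_vendor_b,
}


def compile_kernel(kernel: List[Instr], target: str) -> List[str]:
    """Lower every instruction of the kernel for the chosen target."""
    lower = BACKENDS[target]
    return [lower(instr) for instr in kernel]


if __name__ == "__main__":
    for target in BACKENDS:
        print(target, compile_kernel(SAXPY, target))
```

The per-target work lives entirely in the lowering functions, which is roughly where the "effort to write a backend" goes in the real thing, only at a vastly larger scale.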

Indeed, and not only GPUs but accelerators in general. Mojo will be able to target weird, esoteric hardware (portably, if that is important).
Currently it looks more like CPUs now and AMD eventually, judging from their YouTube sessions and the whole blog post series about freedom from NVIDIA and such.

They also lack CPU support on Windows, unless you use WSL.

There's pretty broad support for server-grade and consumer GPUs. It's a bit buried, but one of the most reliable lists of supported GPUs is in the Mojo info documentation: https://docs.modular.com/mojo/stdlib/gpu/host/info/
GPU code, kernels, and complete models can already run on datacenter AMD GPUs using the same code, the same programming model, and the same language constructs.
Laptops?
Yes, recent NVIDIA and AMD consumer GPUs are supported: https://docs.modular.com/max/faq/
Not sure; Modular is focusing mainly on enterprise applications. But if you look at the current PRs, you can see people hacking in support for standalone consumer-grade NVIDIA and AMD GPUs because it is easy: you just add the missing or different intrinsics for the architecture at the lowest level (in pure Mojo code), wire it up in a few places, and voilà, you can already program and run code on this GPU. iGPUs and Apple GPUs are not supported yet, but it would be interesting to see their integration.