


	by IshKebab 1061 days ago
	Setting up and deploying models in production or on edge devices is much, much more complex if you have to deal with Python and Conda and whatnot.


dbmikus 1061 days ago

You can compile the models to something that runs on edge though, right? For example, TensorFlow is a C++ framework that has Python bindings and a Python library, but when the models are served they are running in C++.

Maybe the act of compilation is an extra step, but I'd much rather have my development be in a high level language that is very suited to experimentation, probing, and testing, and then compile the final result down to something performant.

EDIT: I don't know much about the IoT world, and TensorFlow is likely a bad example as it's not designed to run on edge. So, I can understand that things like llama.cpp, GGML and GGUF are making strides towards easier runtimes. But I still think that for dev-time, Python makes sense!


shepardrtc 1060 days ago

Cloudflare lets you just upload the model itself: https://blog.cloudflare.com/introducing-constellation/

No idea what they're using to run it though. But there's no way I'll stop using Python for working with ML code lol. It just makes life easy.


IshKebab 1060 days ago

> TensorFlow is a C++ framework that has Python bindings and a Python library, but when the models are served they are running in C++

Sure, and it's only a simple 20-step process that involves building TensorFlow from source. Yay!

https://medium.com/@hamedmp/exporting-trained-tensorflow-mod...

Let me see what the process for compiling an LLM written in Rust is...

https://github.com/rustformers/llm

    cargo install llm-cli

Oh look, it doesn't make me immediately want to give up.


dbmikus 1060 days ago

llm-cli looks like it loads model files and doesn't help with model development. It runs GGML model files. The models aren't written in Rust. Beside the point, but GGUF is the successor to GGML. There are a variety of ways to convert PyTorch, Keras, etc. models to GGML or GGUF.

I dunno, maybe we're talking about different things. I'm saying it's better to do model development in a high-level language and then export the training or runtime to a lower-level framework, several of which exist and have for some time. It's becoming simpler to use low-level runtimes (llama.cpp vs TensorFlow). Is that the point you're making?
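
To illustrate the point that the model is a data file the runtime loads, not code written in the runtime's language, here is a minimal sketch that reads the fixed header of a GGUF file using only the Python standard library. The layout is an assumption based on my reading of the GGUF spec for version 2 and later (4-byte magic `GGUF`, then a little-endian uint32 version, uint64 tensor count, and uint64 metadata key/value count); version 1 files used 32-bit counts, so this would not parse those. `read_gguf_header` is a hypothetical helper name, not part of any library.

```python
import io
import struct

GGUF_MAGIC = b"GGUF"  # first four bytes of a GGUF file (assumed per the spec)


def read_gguf_header(stream):
    """Read the fixed-size GGUF header from a binary stream.

    Assumes the v2+ layout: magic, uint32 version, uint64 tensor count,
    uint64 metadata key/value count, all little-endian.
    """
    magic = stream.read(4)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file: magic={magic!r}")

    raw = stream.read(20)  # 4 + 8 + 8 bytes
    if len(raw) != 20:
        raise ValueError("truncated GGUF header")

    version, tensor_count, kv_count = struct.unpack("<IQQ", raw)
    return {
        "version": version,
        "tensor_count": tensor_count,
        "metadata_kv_count": kv_count,
    }


if __name__ == "__main__":
    # Build a fake in-memory header instead of needing a real model file.
    fake = GGUF_MAGIC + struct.pack("<IQQ", 3, 0, 0)
    print(read_gguf_header(io.BytesIO(fake)))
```

With a real model you would pass `open(path, "rb")` instead of the in-memory buffer. Everything after the header (metadata and tensor data) is what a runtime like llama.cpp actually interprets.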
