Hacker News
by yuvalr1 606 days ago
I'm looking into how to deploy 1B and 3B Llama models on Android for inference. Some posts online recommend using Termux (an amazing app) to get a Linux-like shell environment and then installing everything as you would on Linux, using ollama for example. However, this forces users through a manual installation process, and most people don't know what Termux is and would be wary of installing it from F-Droid.

Can anyone recommend a way to deploy Llama on Android without Termux, ideally something that can be implemented entirely inside an app?

I'm currently looking into compiling llama.cpp for Android and bundling it inside an app. Is that a viable path? I'd love to hear from anyone who has tried something similar.

3 comments

I actually did something similar using llama.cpp a while back. I'd be curious to see the speedup with this model.

https://github.com/TugdualKerjan/bunny/tree/main

You can use MLC LLM: https://llm.mlc.ai/