
by yuvalr1 606 days ago

I'm looking at how to deploy the 1B and 3B Llama models on Android for inference. Some posts online recommend using Termux (an amazing app) to get an emulated shell and then installing everything as you would on Linux, using ollama for example. However, this forces users through a manual installation process, and most people don't know what Termux is and would be wary of installing it from F-Droid.

Can someone recommend a way to deploy Llama on Android without Termux, ideally something that can be implemented entirely inside an app?

I'm currently looking into compiling llama.cpp for Android and bundling it inside an app. Is that a viable path? Would love to hear from someone who tried something similar.
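
For reference, here is a rough sketch of what the cross-compile step might look like, assuming llama.cpp is configured through CMake with the Android NDK's toolchain file. The helper name `android_cmake_args`, the NDK path, the ABI, and the API level are all placeholders of mine, not something confirmed in this thread, and the exact options llama.cpp needs may differ between versions.

```shell
#!/bin/sh
# Sketch only: prints CMake configure arguments for cross-compiling a CMake
# project (such as llama.cpp) with the Android NDK toolchain file.
# android_cmake_args is a hypothetical helper; the defaults are assumptions.
android_cmake_args() {
    ndk_dir="$1"              # path to an installed Android NDK (assumed)
    abi="${2:-arm64-v8a}"     # target ABI; arm64-v8a covers most recent phones
    api="${3:-28}"            # minimum Android API level (example value)
    printf '%s\n' \
        "-DCMAKE_TOOLCHAIN_FILE=${ndk_dir}/build/cmake/android.toolchain.cmake" \
        "-DANDROID_ABI=${abi}" \
        "-DANDROID_PLATFORM=android-${api}" \
        "-DCMAKE_BUILD_TYPE=Release" \
        "-DBUILD_SHARED_LIBS=ON"
}

# Example use (commented out so that sourcing this file runs nothing):
#   cmake -S llama.cpp -B build-android $(android_cmake_args "$ANDROID_NDK")
#   cmake --build build-android
```

The unquoted `$(...)` relies on word splitting, so this breaks if the NDK path contains spaces. The resulting shared libraries would then presumably go under the app's `jniLibs/<abi>/` directory and be called from Kotlin or Java through a small JNI wrapper, which is the part I have not worked out yet.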

3 comments

tugdual 606 days ago

I actually did something similar using llama.cpp a while back. I'd be curious to see the speedup with this model.

https://github.com/TugdualKerjan/bunny/tree/main


niutech 603 days ago

You can use MLC LLM: https://llm.mlc.ai/


antonvs 606 days ago

This might be of use:

https://github.com/a-ghorbani/pocketpal-ai
