|
|
|
|
|
by pickettd
220 days ago
|
|
Was the on-device local LLM stack that you tried llama.cpp or something like MLC? I've seen better performance with MLC than llama.cpp in the past - but it has been probably at a least a year since I tested iphones and androids for local inference |
|
It was a while ago. If I was to do it over again I might try https://github.com/tirthajyoti-ghosh/expo-llm-mediapipe. Maybe newer models will help.