I'm just about to ship an update to the iOS version of my offline LLM app that will replace its current 3B default model (RedPajama Chat) with StableLM 2 Zephyr 1.6B. It works extremely well even when quantized. I initially wanted to ship it with TinyLlama Chat, but TinyLlama and its fine-tunes are quite subpar, and many of my beta testers complained that it's much worse than even the old 3B model. Then I found StableLM 2 Zephyr 1.6B. :)
https://imgur.com/a/Imd2l9o