Probably, yes. The slowness is on the Replicate API end, not the Streamlit end. The docs for the 13b API [0] say:
> Predictions typically complete within 9 seconds.
Whereas the docs for the 70b API [1] say:
> Predictions typically complete within 18 seconds. The predict time for this model varies significantly based on the inputs.
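
Going by those figures, the 70b model takes roughly twice as long per prediction. If you want to confirm where the time goes in your own app, a rough sketch is to wrap the model call in a timer and compare it with the total page render time. Here `timed` and `fake_model_call` are hypothetical names I made up for illustration; the stand-in just sleeps, so swap in your actual Replicate request:

```python
import time
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def timed(fn: Callable[[], T]) -> Tuple[T, float]:
    """Run fn once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def fake_model_call() -> str:
    # Hypothetical stand-in for the Replicate request; replace with the real call.
    time.sleep(0.05)
    return "response"


if __name__ == "__main__":
    reply, seconds = timed(fake_model_call)
    print(f"model call took {seconds:.2f}s")
```

If the measured time accounts for nearly all of the wait, the bottleneck is the API rather than Streamlit.
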
[0] https://replicate.com/a16z-infra/llama13b-v2-chat
[1] https://replicate.com/replicate/llama70b-v2-chat