The notebook can be helpful for those trying to load Llama2 on Colab.
1) Installed Transformers from the main branch (and other libraries)
2) Loaded Llama-2-13b-chat-hf on Colab using 4-bit quantization, thanks to the material shared by Younes Belkada
3) Disabled Tensor Parallelism, which caused some issues
4) Installed a minimal version of Haystack
5) Found a hacky way to load the model in Haystack's PromptNode
6) Had a fun chat session with the model, discussing everything from David Guetta to Don Matteo (an Italian TV series)!
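
Steps 2 and 3 can be sketched roughly as follows. This is not the notebook's exact code, only a minimal illustration under some assumptions: the Hub id `meta-llama/Llama-2-13b-chat-hf` (a gated repository, so access must already be granted), the NF4 quantization settings, and the guess that "disabling Tensor Parallelism" corresponds to setting `pretraining_tp` to 1 on the model config. The helper names `estimate_weight_gb` and `load_llama2_4bit` are made up for this example. The small estimate function shows why 4-bit loading matters: the weights alone shrink from about 26 GB in 16-bit to about 6.5 GB.

```python
def estimate_weight_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes).

    Ignores activations, the KV cache and quantization overhead.
    """
    return n_params * bits_per_param / 8 / 1e9


def load_llama2_4bit(model_id: str = "meta-llama/Llama-2-13b-chat-hf"):
    """Hypothetical helper: load a Llama 2 checkpoint in 4-bit precision."""
    # Imports live inside the function so the sketch can be imported without
    # a GPU. Calling it requires torch, transformers, accelerate and
    # bitsandbytes to be installed.
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        BitsAndBytesConfig,
    )

    # Assumed settings: NF4 quantization with float16 compute.
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=quant_config,
        device_map="auto",  # let accelerate place the layers on the GPU
    )

    # Assumption: this is what "disabling Tensor Parallelism" refers to.
    model.config.pretraining_tp = 1
    return model, tokenizer


if __name__ == "__main__":
    n_params = 13e9  # Llama-2-13b
    print(f"16-bit weights: ~{estimate_weight_gb(n_params, 16):.1f} GB")
    print(f" 4-bit weights: ~{estimate_weight_gb(n_params, 4):.1f} GB")
```

On a Colab GPU runtime, `model, tokenizer = load_llama2_4bit()` would then return objects that can be used for generation directly or handed to another framework.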