this for zero shot instructions: https://huggingface.co/Open-Orca/OpenOrcaxOpenChat-Preview2-...
easiest way would be https://github.com/oobabooga/text-generation-webui
a little more complex way I do is I have a stack with llama.cpp server, a openai adapter, and bettergpt as frontend using the openai adapter as the custom endpoint. bettergpt ux beats oogaboga by a long way (and chatgpt on certain aspects)