Hacker News
by simonw 1047 days ago
If you want to try Llama 2 on a Mac and have Homebrew (or Python/pip) you may find my LLM CLI tool interesting: https://simonwillison.net/2023/Aug/1/llama-2-mac/

Does it support Metal / MPS acceleration?
I’ve gotten mine running with FastChat - they have a Metal/MPS option.

Sadly, the 7B model is not very good at SQL tasks. I think it would struggle even with RAG.

What's the inference time without a GPU?
It might be the time mentioned at the bottom of the page, since the author isn't sure that the GPU is being used:

>How to speed this up—right now my Llama prompts often take 20+ seconds to complete.