Hacker News
by atairov 1045 days ago
If your goal is to make it as fast as possible, then a Python implementation is certainly not the solution here. I think this is exactly the reason llama.cpp got so much attention.