by menaerus 555 days ago
My question wasn't about how to run multiple queries against the LLM, but rather how it is even possible, from a transformer-architecture point of view, for a single LLM to serve multiple different end clients. I'm probably missing something, but I can't figure it out yet.
1 comment

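If you have a branchless program, you can execute the same step of the program on multiple different inputs at the same time. https://en.wikipedia.org/wiki/SIMD A transformer forward pass is mostly that kind of program: the same matrix multiplications are applied to every input regardless of its content, so requests from different clients can be stacked into one batch and pushed through the same weights together.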
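To make that concrete, here is a minimal toy sketch, not how any real inference server is written. The names `WEIGHTS`, `step_single` and `step_batched` are made up for illustration, and the "model" is a single tiny matrix multiply. The point is only that a step with no data-dependent branches gives each client the same answer whether its input runs alone or as one row of a batch.

```python
# Hypothetical toy "model step": one shared weight matrix and no
# data-dependent branches. The values are made up for illustration.
WEIGHTS = [
    [1.0, 0.0, 2.0],
    [0.0, 1.0, -1.0],
]  # maps 2 input features to 3 output features


def step_single(x):
    """Run the step for one client's input vector."""
    return [
        sum(x[i] * WEIGHTS[i][j] for i in range(len(WEIGHTS)))
        for j in range(len(WEIGHTS[0]))
    ]


def step_batched(batch):
    """Run the same step for every row of a batch at once.

    Each row is a different client's input. Every weight is read once
    and reused for all rows, and rows never interact with each other.
    """
    rows, inner, cols = len(batch), len(WEIGHTS), len(WEIGHTS[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(inner):
        for j in range(cols):
            w = WEIGHTS[i][j]  # loaded once, shared by every client
            for b in range(rows):
                out[b][j] += batch[b][i] * w
    return out


# Three unrelated clients, one input vector each.
batch = [[1.0, 2.0], [3.0, 4.0], [0.5, -0.5]]

batched = step_batched(batch)
separate = [step_single(x) for x in batch]
print(batched == separate)
```

On real hardware the innermost loop over `b` is the part that gets vectorised. A real LLM server also has to handle things this sketch ignores, such as requests of different lengths and per-request attention state, but the underlying idea of sharing the weights across rows of a batch is the same.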