by bambax
450 days ago
I'm confused by the language here; "model" seems to mean different things. To me a "model" is a static file containing numbers. In front of that file sits an inference engine that receives input from a user, runs it through the "model" and outputs the result. That inference engine is a program (not a static file) that can be generic (able to run any number of models of the same format, like llama.cpp) or specific/proprietary. This program usually offers an API. "Wrappers" talk to those APIs and therefore don't do much (they're neither an inference engine nor a model) -- their specialty is UI. But in this post the term "model" seems to cover a kind of full package that goes from LLM to UI, including a specific, dedicated inference engine? If so, the point of the article would be that, because inference is in the process of being commoditized, the industry is moving to vertical integration so as to protect itself and create unique value propositions. Is this interpretation correct?
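To make the three layers concrete, here is a minimal toy sketch in Python. Everything in it is an assumption for illustration: the JSON "model format", the `InferenceEngine` class, and the `api_endpoint` and `wrapper_ui` functions are hypothetical names and bear no relation to how llama.cpp or any real product is built. It only shows the split described above: static numbers, a generic program that runs them, and a thin UI layer on top of the program's API.

```python
import json

# A "model" in the sense above: static numbers with no behaviour of their own.
# The format is a made-up toy: one weight matrix and one bias vector as JSON.
MODEL_FILE_CONTENTS = json.dumps(
    {"weights": [[1.0, 2.0], [0.5, -1.0]], "bias": [0.0, 1.0]}
)


class InferenceEngine:
    """Generic program: can run any model stored in the toy format above."""

    def __init__(self, serialized_model: str):
        params = json.loads(serialized_model)
        self.weights = params["weights"]
        self.bias = params["bias"]

    def run(self, inputs):
        # One affine layer: output[i] = sum_j weights[i][j] * inputs[j] + bias[i]
        return [
            sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(self.weights, self.bias)
        ]


def api_endpoint(engine, request):
    """The API the engine exposes: structured request in, structured response out."""
    return {"output": engine.run(request["input"])}


def wrapper_ui(engine, user_text):
    """A "wrapper": parses user input, calls the API, formats the reply."""
    numbers = [float(tok) for tok in user_text.split()]
    response = api_endpoint(engine, {"input": numbers})
    return "Result: " + ", ".join(f"{v:.2f}" for v in response["output"])


engine = InferenceEngine(MODEL_FILE_CONTENTS)
print(wrapper_ui(engine, "1 2"))  # prints "Result: 5.00, -0.50"
```

Swapping `MODEL_FILE_CONTENTS` for a different set of numbers changes the behaviour without touching the engine or the wrapper, which is the sense in which the engine is generic and the wrapper is mostly UI.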
What makes a file non-static (dynamic?) other than +x?
Both are instructions about how to perform a computation. Both require other software/hardware/microcode to run. In general, the stack is tall!
Even so, I do agree that “a bunch of matrices” feels different to “a bunch of instructions” - although arguably the former may be closer in architecture to the greatest computing machine we know (the brain) than the latter.
</armchair>