|
|
|
|
|
by _moog
338 days ago
|
|
I recently started diving into LLMs a few weeks ago, and one thing that immediately caught me off guard was how little standardization there is across all the various pieces you would use to build a chat stack. Want to swap out your client for a different one? Good luck - it probably expects a completely different schema. Trying a new model? Hope you're ready to deal with a different chat template. It felt like every layer had its own way of doing things, which made understanding the flow pretty frustrating for a noobie. So I sketched out a diagram that maps out what (rough) schema is being used at each step of the process - from the initial request all the way through Ollama and an MCP server with OpenAI-compatible endpoints showing what transformations occur where. Figured I'd share it as it may help someone else. https://moog.sh/posts/openai_ollama_mcp_flow.html Somewhat ironically, Claude built the JS hooks for my SVG with about five minutes of prompting. |
|