Hacker News new | ask | show | jobs
by _moog 338 days ago
I recently started diving into LLMs a few weeks ago, and one thing that immediately caught me off guard was how little standardization there is across all the various pieces you would use to build a chat stack.

Want to swap out your client for a different one? Good luck - it probably expects a completely different schema. Trying a new model? Hope you're ready to deal with a different chat template. It felt like every layer had its own way of doing things, which made understanding the flow pretty frustrating for a noobie.

So I sketched out a diagram that maps out what (rough) schema is being used at each step of the process - from the initial request all the way through Ollama and an MCP server with OpenAI-compatible endpoints showing what transformations occur where.

Figured I'd share it as it may help someone else.

https://moog.sh/posts/openai_ollama_mcp_flow.html

Somewhat ironically, Claude built the JS hooks for my SVG with about five minutes of prompting.

2 comments

Have you tried BAML? We use it to manage APIs and clients, as well as prompts and types. It gives great low level control over your prompts and logic, but acts as a nice standardisation later.
That's going to be super useful for some of the high-level prompt-testing work I'm doing. Thanks!

I'm also getting more into the lower-level LLM fine-tuning, training on custom chat templates, etc. which is more of where the diagram was needed.

+1 for BAML. I find that the "prompts as typed functions" concept really simplifies the mental model here, making LLM apps easier to reason about.
I found this really helpful. I've read a few different bits around this area, and being able to quickly click and scroll around this has confirmed my understanding of it now - thanks!

I thought it funny to think how this is all to give the impression to the user that the AI, for example, _knows_ the weather. The AI doesn't: it's just getting it from a weather API and wrapping some text around it.

Now, imagine being given a requirement 5 years ago like: "When the user asks, we need to be able to show them the weather from this API, and wrap some text around it". Imagine something like your diagram came back as the proposed the solution:| Not at all a criticism of any of your stuff, but it blows my mind how tech develops.