| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by amthewiz 283 days ago

It has puzzled me why someone hasn't already done this yet, given LLMs are so good at language now.

Probably the short answer is that is it hard to get this to actually work. There were many open questions that one has to tackle simultaneously -

- What is the right balance between relying on LLMs to do the right thing vs the runtime around LLMs? For example, I went back and forth a few times getting LLMs to manage stack as you call one playbook from another one. Finally decided that it is most reliable to let the runtime take care of that.

- Context engineering - What to put in the prompt, in what order, how to represent state, how to handle artifacts, how to make sure that we use LLM cache optimally as context grows, how to "unwind" context as playbook calls return, how to compact specific types of information, how to make sure important context isn't lost, etc

- LLMs today have vastly different capabilities than 2 years ago. I have had to rewrite the whole stack from scratch 4 times to adjust. Wasn't fun, but had to be done.

- Language(s): How to represent the pseudocode so that it is both fluid natural language and a capable programming language? How to transform it so that it can be executed reliably through LLMs? How to NOT lose the flexibility and fluidity in the process (e.g. easy to convert to a graph like LangGraph, but then you are stuck with control flow), how to create a semantic compiler for that transition, what primitives to use for the compiled language that I call Playbooks Assembly Language [1].

- Agents and multi-agent system considerations - How to represent agents, how they should communicate. Agents are classes and they expose public playbooks that other agents can call). Agents can send natural language messages to each other and engage in conversations. Agents can call multi-party meetings. How can the behavior across all these interaction patterns be defined so it remains intuitive. For example, lifetime of a meeting is tied to "meeting: true" playbooks so the agent simply returns from the playbook to exit a meeting and meeting lifecycle is tied to the host returning from its meeting playbook.

- Which LLMs to support? Go for "Bring your own LLM" or restrict the set? Which LLM? LLM selection impacts how all the internal prompts are implemented so prompt building had to occur simultaneously with LLM selection.

It felt like playing an N-dimensional game of chess!

[1] https://playbooks-ai.github.io/playbooks-docs/reference/play...