by 3cats-in-a-coat 240 days ago
I'll say the obvious. A lot of this you can just do in JSON.

Let's take the example:

    {
      "users": [
        { "id": 1, "name": "Alice", "role": "admin" },
        { "id": 2, "name": "Bob", "role": "user" }
      ]
    }

    users[2]{id,name,role}:
      1,Alice,admin
      2,Bob,user
We can keep it JSON, but use more compact list expressions, as tuples when pragmatic:

    ["users",
       [1, "Alice", "admin"],
       [2, "Bob", "user"]
    ]
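For what it's worth, producing the tuple form from ordinary structs takes only a few lines. Here is a minimal Go sketch; the `User` struct and the `MarshalTuples` helper are hypothetical names made up for illustration, not part of any library:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// User is a hypothetical struct mirroring the fields in the example above.
type User struct {
	ID   int
	Name string
	Role string
}

// MarshalTuples encodes users as ["users", [id, name, role], ...].
// The field order inside each tuple is a convention you have to
// explain to the model (and to other devs) out of band.
func MarshalTuples(users []User) (string, error) {
	out := []any{"users"}
	for _, u := range users {
		out = append(out, []any{u.ID, u.Name, u.Role})
	}
	b, err := json.Marshal(out)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	users := []User{{1, "Alice", "admin"}, {2, "Bob", "user"}}
	s, err := MarshalTuples(users)
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // ["users",[1,"Alice","admin"],[2,"Bob","user"]]
}
```

The result is still plain JSON, so any standard parser on the other end can read it back.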
The thing is, the game with LLMs is not about what's shortest, but what's:

1. Mainstream, so they understand it.

2. What they're tuned for, and they're tuned for what's mainstream (JSON).

If you want to go for extreme compression, you can shove it all into JSON strings too and keep the larger structure JSON:

    ["users",
       "1:admin:Alice",
       "2:user:Bob"
    ]
You may say "how is this better". Well it's better because it's still JSON, there's less to explain to the LLM, and to your other devs. Even if we use a weird compact format like "id:role:name" this is still shorter to explain than a completely different syntax with its whole world of rules.
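A rough Go sketch of the packed-string variant, to show how little there is to explain. `User`, `Pack`, and `Unpack` are hypothetical names, and the code assumes the id and role never contain a colon:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// User is a hypothetical struct mirroring the fields in the example above.
type User struct {
	ID   int
	Name string
	Role string
}

// Pack renders a user as "id:role:name".
func Pack(u User) string {
	return strconv.Itoa(u.ID) + ":" + u.Role + ":" + u.Name
}

// Unpack parses "id:role:name". SplitN with a limit of 3 keeps any
// colons inside the name intact, since the name is the last field.
func Unpack(s string) (User, error) {
	parts := strings.SplitN(s, ":", 3)
	if len(parts) != 3 {
		return User{}, fmt.Errorf("want 3 fields in %q, got %d", s, len(parts))
	}
	id, err := strconv.Atoi(parts[0])
	if err != nil {
		return User{}, fmt.Errorf("bad id in %q: %w", s, err)
	}
	return User{ID: id, Role: parts[1], Name: parts[2]}, nil
}

func main() {
	users := []User{{1, "Alice", "admin"}, {2, "Bob", "user"}}
	out := []string{"users"}
	for _, u := range users {
		out = append(out, Pack(u))
	}
	b, err := json.Marshal(out)
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b)) // ["users","1:admin:Alice","2:user:Bob"]
}
```

Putting the free-text field last is a design choice in this sketch: it lets the parser tolerate colons in names without any escaping rules.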

In fairness to TOON, the alternative JSON you're giving doesn't include hints on structure.

Not sure LLMs are more “tuned” to JSON.

That said, your general point holds that TOON may be unnecessary, especially in the examples given. Perhaps plain text would suffice. TOON could be useful when automating inputs with many different shapes.

Yea exactly. The LLMs are tuned to natural language. I don't think anything will beat good ol' templating (a.k.a. plain text). In Go I do something like this:

  {{- /* mytemplate.tmpl */ -}}
  Description="The following data is for the users in our application."
  Format="id,name,role"
  Length={{len .}}
  Data:
  {{- range .}}
  {{.ID}},{{.Name}},{{.Role}}
  {{- end}}
This way you're able to change the formatting to something the LLM understands for each struct. The LLM might understand some structs better as JSON, others as YAML, and others in an arbitrary format. Templating gives you the most flexibility to choose which one will work best.
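To make that concrete, here is a minimal sketch of rendering such a template with Go's standard `text/template` package. The `User` struct and the `Render` helper are hypothetical, and the template is a variant of the one above with two assumptions of mine: it uses trim markers (`{{-`) to avoid blank lines between rows, and `{{len .}}` so the length is not hard-coded.

```go
package main

import (
	"os"
	"strings"
	"text/template"
)

// User is a hypothetical struct; the template reads its exported fields.
type User struct {
	ID   int
	Name string
	Role string
}

// userTmpl is an assumed variant of the template in the comment above.
const userTmpl = `Description="The following data is for the users in our application."
Format="id,name,role"
Length={{len .}}
Data:
{{- range .}}
{{.ID}},{{.Name}},{{.Role}}
{{- end}}
`

// Render executes the template against users and returns the prompt text.
func Render(users []User) (string, error) {
	t, err := template.New("users").Parse(userTmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, users); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	out, err := Render([]User{{1, "Alice", "admin"}, {2, "Bob", "user"}})
	if err != nil {
		panic(err)
	}
	os.Stdout.WriteString(out)
}
```

Swapping formats per struct then just means swapping the template string; the calling code stays the same.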