| Lots of broken links in the doc, though I guess the YAML file specifies everything: https://github.com/open-llm-initiative/open-message-format/b... The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately. It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. But strict requirements on message order seem to be going away, a sort of Postel's Law adaptation perhaps. Gemini has a way of basically doing text completion by leaving out the role [2]. But I suppose that's out of the standard. Parameters like top_p are very eclectic between providers, and so I suppose it makes sense to leave them out, but temperature is pretty universal. In general this looks like a codification of a minimal OpenAI GPT API, which is reasonable. It's become the de facto standard, and provider gateways all seem to translate to and from the OpenAI API. I think it would be easier to understand if the intro made it more clear that it's really trying to specify an emergent standard and isn't proposing something new. [1] https://github.com/open-llm-initiative/open-message-format/b... [2] https://ai.google.dev/gemini-api/docs/text-generation?lang=r... |
> The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.
For the metadata you are right. Request and response tokens are billed separately and should be captured accordingly. I've put a PR to address that [2]
> It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. ...
We do assume that last message in the array to be from user. But we are not forcing it at the moment.
[1] https://github.com/open-llm-initiative/open-message-format/p...
[2] https://github.com/open-llm-initiative/open-message-format/p...