by un1imited
650 days ago
hey @ianbicking - thanks a lot for the feedback. I've merged a change to fix the links [1].

> The metadata tokens is a string [1]... that doesn't seem right. Request/response tokens generally need to be separated, as they are usually priced separately.

For the metadata you are right. Request and response tokens are billed separately and should be captured accordingly. I've put up a PR to address that [2].

> It doesn't specify how the messages have to be arranged, if at all. But some providers force system/user/assistant/user... with user last. ...

We do assume the last message in the array is from the user, but we are not enforcing it at the moment.

[1] https://github.com/open-llm-initiative/open-message-format/p...
[2] https://github.com/open-llm-initiative/open-message-format/p...
|
Multiple system messages are kind of a hack to invoke that distinct role in different positions, especially the last position. I.e., second to last message is what the user said, last message is a system message telling the LLM to REALLY FOLLOW THE INSTRUCTIONS and not get overly distracted by the user. (Though personally I usually rewrite the user message for that purpose.)
Multiple user messages in a row are likely caused by some failure in the system to produce an assistant response, like no network. You could ask the client to collapse those, but I think it's most correct to allow them. The user understands the two messages as distinct.
Multiple assistant messages, or no trailing user message, is a reasonable way to represent "please continue" without a message. These could also be collapsed, but that may or may not be accurate depending on how the messages are truncated.
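To make "collapsing" concrete, here is a rough sketch of what a client could do before sending to a provider that insists on alternating roles. It assumes messages are plain dicts with "role" and "content" keys; that shape, the function name, and the separator are my assumptions for illustration, not something the spec defines.

```python
from typing import Dict, List

# Assumed message shape: {"role": "...", "content": "..."} (not taken from the spec).
Message = Dict[str, str]


def collapse_consecutive(messages: List[Message], sep: str = "\n\n") -> List[Message]:
    """Merge runs of adjacent messages that share a role into one message."""
    collapsed: List[Message] = []
    for msg in messages:
        if collapsed and collapsed[-1]["role"] == msg["role"]:
            # Same role as the previous message: append the text to it.
            collapsed[-1]["content"] += sep + msg["content"]
        else:
            # Copy so the caller's original messages are left untouched.
            collapsed.append(dict(msg))
    return collapsed
```

Note that this is lossy: once merged, you can no longer tell that the user sent two distinct messages, which is the reason I'd rather the format allow them.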
This all gets even more complicated once tools are introduced.
(I also notice there's no max_tokens or stop reason. Both are pretty universal.)
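For illustration only, something like the following is what I have in mind for the metadata. Every field name here is hypothetical (none of them come from the spec or from any particular provider), and the prices are made-up parameters; the point is just that request and response tokens are counted separately so they can be priced separately, with max_tokens and a stop reason sitting alongside.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    # Hypothetical field names, chosen for this sketch.
    request_tokens: int
    response_tokens: int
    max_tokens: Optional[int] = None   # cap requested by the caller, if any
    stop_reason: Optional[str] = None  # why generation ended; values are provider-specific

    def cost(self, request_price_per_1k: float, response_price_per_1k: float) -> float:
        """Total cost when the two token counts are billed at different rates."""
        return (
            self.request_tokens * request_price_per_1k
            + self.response_tokens * response_price_per_1k
        ) / 1000
```

With a single string or a single combined count, that cost calculation isn't possible.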
These message order questions do open up a more meta question you might want to think about and decide on: is this a prescriptive spec that says how everyone _should_ behave, a descriptive spec that is roughly the outer bounds of what anyone (either user or provider) can expect... or a combination like prescriptive for the provider and descriptive for the user.
Validation suites would also make this clearer.
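As a sketch of how a validator could encode that choice, here is a check with a permissive (descriptive) mode and a strict (prescriptive) mode. The role names, the dict shape, and the strict rule itself (optional leading system message, then alternating user/assistant, ending on user) are assumptions on my part, modeled on what some providers require rather than on anything the spec says.

```python
from typing import Dict, List

KNOWN_ROLES = {"system", "user", "assistant"}  # assumed role set


def check_order(messages: List[Dict[str, str]], strict: bool = False) -> List[str]:
    """Return a list of problems; an empty list means the messages pass."""
    if not messages:
        return ["message list is empty"]

    problems = [
        f"message {i}: unknown role {m.get('role')!r}"
        for i, m in enumerate(messages)
        if m.get("role") not in KNOWN_ROLES
    ]
    # Permissive mode stops here: any arrangement of known roles is accepted.
    if not strict or problems:
        return problems

    # Strict mode: skip one optional leading system message, then alternate.
    start = 1 if messages[0]["role"] == "system" else 0
    expected = "user"
    for i in range(start, len(messages)):
        role = messages[i]["role"]
        if role != expected:
            problems.append(f"message {i}: expected {expected}, got {role}")
        expected = "assistant" if role == "user" else "user"

    if messages[-1]["role"] != "user":
        problems.append("last message is not from the user")
    return problems
```

Two user messages in a row pass with strict=False and fail with strict=True, which is exactly the kind of difference a validation suite would force the spec to take a position on.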