<s>[INST] What is your favorite condiment? [/INST]
"Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"</s> [INST] The right amount of what? [/INST]
Note that the sentence starting "Well, I'm quite partial isn't inside the tag.
ollama run Mistral "<s>[INST] What is your favorite condiment? [/INST] Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen</s> [INST] The right amount of what? [/INST]"
That's the whole context with two user inputs in the INST tags, and one assistant output between/outside of the tags. They're just simulating the beginning of a conversation. You can see this very clearly in the JSON version in the next code block:
messages = [
{"role": "user", "content": "What is your favourite condiment?"},
{"role": "assistant", "content": "Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavour to whatever I'm cooking up in the kitchen!"},
{"role": "user", "content": "Do you have mayonnaise recipes?"}
]
Yes, but this whole block of text gets passed to the LLM on each call as the conversation history. The [INST] tags tell the LLM which parts were inputs (system instructions) as opposed to outputs.