| HN Mirror



	by stavros 81 days ago
	What causes these? Given how simple the LLM interface is (just completion), why don't teams make a simple, standardized template available with their model release so the inference engine can just read it and work properly? Can someone explain the difficulty with that?

1 comments

Yukonv 81 days ago

The model does have the format specified, but there is no _one_ standard. For this model it’s defined in tokenizer_config.json [0]. As for llama.cpp, they seem to be using a more type-safe approach to reading the arguments.

[0] https://huggingface.co/google/gemma-4-31B-it/blob/main/token...
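To make "type-safe reading of the arguments" concrete, here is a rough sketch, assuming the model emits tool-call arguments as a JSON object. The schema, the function name, and the weather tool are all hypothetical; this is not llama.cpp's actual code, just an illustration of checking argument types instead of passing through whatever the model produced.

```python
import json

# Hypothetical tool schema: argument name -> expected Python type.
WEATHER_SCHEMA = {"city": str, "days": int}


def parse_tool_arguments(raw: str, schema: dict) -> dict:
    """Parse JSON tool-call arguments and reject missing or mistyped values."""
    args = json.loads(raw)
    if not isinstance(args, dict):
        raise TypeError("arguments must be a JSON object")

    checked = {}
    for name, expected in schema.items():
        if name not in args:
            raise KeyError(f"missing argument: {name}")
        value = args[name]
        # bool is a subclass of int in Python, so rule it out explicitly.
        if isinstance(value, bool) and expected is not bool:
            raise TypeError(f"{name}: expected {expected.__name__}, got bool")
        if not isinstance(value, expected):
            raise TypeError(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )
        checked[name] = value
    return checked


if __name__ == "__main__":
    # A well-formed call passes through unchanged.
    print(parse_tool_arguments('{"city": "Oslo", "days": 3}', WEATHER_SCHEMA))
    # A model that quotes the number ("3") is rejected rather than silently accepted.
    try:
        parse_tool_arguments('{"city": "Oslo", "days": "3"}', WEATHER_SCHEMA)
    except TypeError as err:
        print("rejected:", err)
```

A stricter reader like this surfaces mismatches between what the model emits and what the tool expects, which a lenient reader would quietly let through.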


stavros 81 days ago

Hm, but surely there will be converters for such simple formats? I'm confused as to how there can be calling bugs when the model already includes the template.
