| HN Mirror



	by throwaway19423 846 days ago
	I am confused about how all these things are able to interoperate. Are the creators of these models following the same I/O conventions for their models? Won't the tokenizer or token embedder be different? I am genuinely confused by how the same code works for so many different models.

1 comments

brucethemoose2 846 days ago

It's complicated, but basically it's because most of them use the Llama architecture. Meta all but set the standard for open-source LLMs when they released LLaMA 1, and anyone trying to deviate from it has run into trouble because their models don't work with the hyper-optimized llama runtimes.

Also, there's a lot of magic going on behind the scenes with the configs stored in GGUF/Hugging Face format models, and in the libraries that use them. There are different tokenizers, but they mostly follow the same standards.
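
To make the config point a bit more concrete, here is a rough sketch of how a loader might pick its code path from a Hugging Face style config. The field names `model_type`, `num_hidden_layers`, and `vocab_size` do appear in Llama-family `config.json` files; the `LOADERS` registry and both functions are hypothetical names made up for illustration, not any real library's API.

```python
import json

def load_llama(cfg):
    # Hypothetical loader: a real one would build the network from these fields.
    return f"llama-style model: {cfg['num_hidden_layers']} layers, vocab {cfg['vocab_size']}"

# Hypothetical registry mapping the config's "model_type" to a loader function.
LOADERS = {"llama": load_llama}

def load_from_config(config_text):
    """Choose a loader based only on what the config file declares."""
    cfg = json.loads(config_text)
    model_type = cfg["model_type"]
    if model_type not in LOADERS:
        # A model that deviates from a known architecture has no code path here.
        raise ValueError(f"unsupported model_type: {model_type}")
    return LOADERS[model_type](cfg)
```

Any fine-tune that declares `"model_type": "llama"` goes through the same loader, which is roughly why one runtime can serve so many models, and why a model that deviates from the architecture needs new code before it works at all.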


null_point 845 days ago

I found the magic! https://github.com/search?q=repo%3Aggerganov%2Fggml%20magic&...


null_point 845 days ago

Hey, c'mon now. Just being playful about the "magic" string used in GGUF files to detect that a file is in fact a GGUF file.
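
For anyone curious what that check looks like, here is a minimal sketch, assuming the GGUF layout where the file starts with the four ASCII bytes `GGUF` followed by a little-endian 32-bit version number. The function name `read_gguf_version` is just something made up for this example, not part of ggml.

```python
import struct

GGUF_MAGIC = b"GGUF"  # the "magic" bytes at the very start of a GGUF file

def read_gguf_version(path):
    """Return the GGUF version number, or None if the magic does not match."""
    with open(path, "rb") as f:
        header = f.read(8)
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        return None
    # The magic is followed by a little-endian uint32 version field.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

A loader can run a check like this before parsing anything else, so a file in some other format gets rejected up front instead of being misread.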
