by arendtio
141 days ago
Thanks for the info, but I don't think it answers the question. You could train a 20-node network on 36 trillion tokens; it wouldn't make much sense, but you could. So I was asking about the size of the model itself: the number of nodes/parameters, or the file size in GB. In addition, there seem to be many different versions of Qwen3, e.g. the list in the ollama library: https://ollama.com/library/qwen3/tags
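
To make the distinction concrete: the file size I'm asking about depends on the parameter count and the bits stored per weight, not on how many tokens the model was trained on. A rough sketch, assuming weight-only storage (no metadata or per-block quantization overhead); the function name and the parameter count are made up for illustration and not taken from any Qwen3 model card:

```python
def approx_file_size_gb(num_params: float, bits_per_weight: float) -> float:
    """Rough weight-only size in GB (10^9 bytes).

    Ignores file metadata and any per-block quantization overhead,
    so real files will usually be somewhat larger.
    """
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical parameter count, chosen only for illustration.
params = 8e9
for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{approx_file_size_gb(params, bits):.1f} GB")
```

The number of training tokens never appears in the calculation, which is why "trained on 36 trillion tokens" doesn't tell me how big the download is.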