by
spwa4
64 days ago
It is much faster, though. On my M1 Max, describing a picture (a quick way to get a fairly large context):
Qwen 3.6 35b a3b: 34 tok/sec
Qwen 3.5 27b: 10 tok/sec
Qwen 3.5 35b a3b: doesn't support image input
2 comments
upboundspiral
63 days ago
I've been using Qwen 3.5 35B-A3B with images as input, so I suspect you didn't include the vision part of the model during testing. I use llama.cpp, and I learned that I needed to load the separate mmproj file alongside the main model.
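For anyone hitting the same thing, here is a rough sketch of what passing the projector looks like. It only assembles a command line for llama.cpp's `llama-mtmd-cli` tool and does not run it. The GGUF file names and the image path are made-up placeholders, and the helper `build_vision_command` is hypothetical, so adjust everything to your own setup.

```python
import shlex


def build_vision_command(model_path, mmproj_path, image_path, prompt):
    """Assemble an argv list for llama.cpp's multimodal CLI.

    The vision projector is passed separately via --mmproj; without it
    the model is loaded as text-only. All paths are supplied by the caller.
    """
    return [
        "llama-mtmd-cli",
        "-m", str(model_path),          # main language-model weights (GGUF)
        "--mmproj", str(mmproj_path),   # separate vision projector (GGUF)
        "--image", str(image_path),     # picture to describe
        "-p", prompt,                   # text prompt that accompanies the image
    ]


if __name__ == "__main__":
    # Placeholder file names, not real release artifacts.
    cmd = build_vision_command(
        "qwen-35b-a3b.gguf",
        "mmproj-qwen-35b-a3b.gguf",
        "photo.jpg",
        "Describe this picture.",
    )
    # shlex.join quotes arguments so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

If the mmproj argument is left out, image input will not work even though the text side of the model loads fine, which may be what happened in the test above.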
m-emre
61 days ago
What quantization level are you using for your Qwen 3.6 35b a3b model?