Hacker News
by hydra-f 34 days ago
Vision has become totally underappreciated, whereas I believe it brings important advantages to a model

Also, a big caveat in using Qwen models has always been their speech patterns. I do wonder how Google made the Gemma lineup so good at this

Let's hope Alibaba continues to open source its models

Agreed. Incidentally, in my testing, qwen models (qwen3.6-35b-a3b and earlier 3.5) are WAY better with vision than gemma4-26b-a4b. I would normally want to stick with gemma4 only (I use it for spam filtering), but it just doesn't cut it for vision work, and qwen models do.
That has been my experience as well.

Qwen 3.5/3.6 are far better at vision. Even the 9B model beats Gemma 4 31B in my use case. They describe the scene more accurately and they focus on the important elements like a human would.

Gemma 4 frequently misses important elements, doesn't understand what things are, and is very coy even if you ask for lots of detail. You have to give it hints like "hey, what's that round thing on the left?" to get half-decent answers.

(Yes I did set the min-tokens correctly. I also tested bf16 and Q8 to make sure it wasn't a quant issue.)

It's unfortunate because Gemma 4 is so so so much better at natural language interactions.

God I love qwen3.6-35b-a3b especially Q8
I second this notion, I am impressed daily with what little Qwen can do
> qwen models (qwen3.6-35b-a3b and earlier 3.5) are WAY better with vision than gemma4-26b-a4b

Can you give an example? And/or is there a benchmark specifically for this?

I have my own benchmark: a photo of filament boxes in my drawer. The task is to list (inventory) the filaments. It's not an easy task: the letters are small, the boxes are in various orientations, the labels have lots of writing on them, etc.

This is what qwen produces (basically perfect):

Fiberlogy (Blue Boxes - Left Side)

ASA Black: 750 g (0.75 kg)

Easy PLA Graphite: 850 g (0.85 kg)

Ref Easy PLA Gray: 850 g (0.85 kg)

Ref Easy PLA White: 850 g (0.85 kg)

Ref Easy PLA Inox: 850 g (0.85 kg)

Bambu Lab (Brown Boxes - Right Side)

ASA White: 750 g (0.75 kg)

PLA-CF (Burgundy Red): 1,000 g (1 kg)

PAHT-CF (Black): 500 g (0.5 kg)

PET-CF (Black): 500 g (0.5 kg)

PETG-CF (Black): 1,000 g (1 kg)

----------

And this is gemma-4-26b-a4b (basically useless):

### *Fiberlogy Filaments*

All Fiberlogy items listed below share these common specifications: Made in Poland, $1.75\text{mm}$ Diameter, $0.75\text{kg}$ Weight, and Print Temperature of $200-230^{\circ}\text{C}$.

* *PLA Black* (Barcode: 5903281997986)

* *Easy PLA Graphite* (Barcode: 5903281994057)

* *Ref Easy PLA Grey*

* *Ref Easy PLA White* (Barcode: 5903281994386)

* *Ref Easy PLA True...* (Text is cut off)

**

### *Bambu Lab Filaments*

All Bambu Lab items listed below share these common specifications: Made in China, Removable Spool (Do Not Remove).

* *PLA-CF (Carbon Fiber Reinforced)*

    \*   Color: Burgundy Red
    \*   Diameter: $1.75 \pm 0.02\text{mm}$
    \*   Weight: $1.0\text{kg}$
    \*   Suggested Drying Conditions: $45^{\circ}\text{C}$ for $6-12$ hours
* *PAHT-CF (High Temperature Polyamide with Carbon Fiber)*

    \*   Color: Black
    \*   Diameter: $1.75 \pm 0.02\text{mm}$
    \*   Weight: $0.5\text{kg}$
    \*   Suggested Drying Conditions: $80^{\circ}\text{C}$ for $6-12$ hours
* *PETG-CF (Carbon Fiber Reinforced)*

    \*   Color: Black
    \*   Diameter: $1.75 \pm 0.02\text{mm}$
    \*   Weight: $1.0\text{kg}$
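
A comparison like the one above can be turned into a rough number by scoring each model's list against a reference inventory. The following is only a minimal sketch, not what the commenter actually ran: `normalize` and `score` are hypothetical helpers, the reference is assumed to be the Qwen list (since it was described as basically perfect), and the Gemma entries are transcribed by hand from the output quoted above, using the 0.75 kg weight it claimed for every Fiberlogy box.

```python
import re

def normalize(name: str) -> str:
    """Lowercase, drop parenthesised notes and punctuation, unify grey/gray."""
    name = re.sub(r"\(.*?\)", "", name.lower())
    name = name.replace("grey", "gray")
    return re.sub(r"[^a-z0-9]+", " ", name).strip()

def score(reference: dict, predicted: dict) -> dict:
    """Count an item as correct only if both name and weight (grams) match."""
    ref = {normalize(k): v for k, v in reference.items()}
    pred = {normalize(k): v for k, v in predicted.items()}
    correct = sum(1 for k, v in pred.items() if ref.get(k) == v)
    return {
        "precision": correct / len(pred) if pred else 0.0,
        "recall": correct / len(ref) if ref else 0.0,
    }

# Assumed reference: the Qwen list from the comment, weights in grams.
REFERENCE = {
    "ASA Black": 750,
    "Easy PLA Graphite": 850,
    "Ref Easy PLA Gray": 850,
    "Ref Easy PLA White": 850,
    "Ref Easy PLA Inox": 850,
    "ASA White": 750,
    "PLA-CF (Burgundy Red)": 1000,
    "PAHT-CF (Black)": 500,
    "PET-CF (Black)": 500,
    "PETG-CF (Black)": 1000,
}

# Hand transcription of the Gemma output quoted in the comment.
GEMMA = {
    "PLA Black": 750,
    "Easy PLA Graphite": 750,
    "Ref Easy PLA Grey": 750,
    "Ref Easy PLA White": 750,
    "Ref Easy PLA True...": 750,
    "PLA-CF (Carbon Fiber Reinforced)": 1000,
    "PAHT-CF (High Temperature Polyamide with Carbon Fiber)": 500,
    "PETG-CF (Carbon Fiber Reinforced)": 1000,
}

print(score(REFERENCE, GEMMA))  # {'precision': 0.375, 'recall': 0.3}
```

Exact matching on name plus weight is deliberately strict, so a correct name with the wrong weight counts as a miss. A real check would compare against a hand-written ground truth rather than another model's output.
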
Thanks. Did you set the image min/max tokens for Gemma4 to 1120 for this? This might not be a fair comparison without that, due to the differences in architecture.

https://www.reddit.com/r/KoboldAI/comments/1sjnjic/imagemin_...

https://github.com/ollama/ollama/issues/15626

I think 1120 vs 280 tokens is a big difference, and you were perhaps using the latter value?

I did not, and I had no idea such a setting even existed. This could definitely change things. However, I don't see a way to set this in LM Studio, which is what I currently use to run models.
Seems like you can't set it, for now. There's an issue for it: https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues/1...
Thanks, that's very useful. I find people's small individual tests more important than the usual benchmarks that tend to be gamed by every single lab.