Hacker News new | ask | show | jobs
by teamcubitflow 34 days ago
I'm surprised that kind of captioning came from a 2B model; glad the fine tuning process actually shows a deliberate approach to making qwen 3.5 into essentially a new model of it's kind.
1 comments

hey this is shubham, yeah Qwen3.5VL is awesome and it's training vocab is quiet strong so with the right data curation you can prolly take it into a bunch of other narrow tasks eg: we trying to fine-tune it to use SAM3 in a loop for segmentation tasks in the videos