The looping may be due to quantization -- I've seen it on locally quantized Q6_K Qwen 3.5/3.6 models. I recall seeing somewhere (here or r/LocalLlama) that Qwen models are sensitive to quantization of the keys, though I haven't yet experimented with/looked into fixing this. (I've been building up my promptfoo tests/infrastructure to detect looping, etc. on Qwen and other models.)
A fun thing I do with Qwen 3.5 0.8b is to take a screenshot of the Hackernews homepage and ask it to give me a JSON representation of the data and it does surprisingly well. With a well structured prompt I think it could be made to be pretty reliable tool for that type of task out of the box.
Yes apologies, Hackernews was just an example, you can do this with any website - it’s just a simple benchmark I like to use for testing vision models.