| HN Mirror



	by pdyc 122 days ago
	Thanks, I tested it; it failed the strawberry test. Qwen 3.5 0.8B, which is similar in size, passes it and is far more usable.

3 comments

algoth1 122 days ago

Does asking it to think step by step, or character by character, improve the answer? It might be a combination of tokenization and the model's unawareness of its own tokenization shortcomings.


pdyc 122 days ago

No, it did not. Going character by character, it concluded 2 :-)


cztomsik 122 days ago

I hope you are kidding. How is that a test of any capabilities? It's a miracle that any model can learn strawberry, because it cannot see the actual characters and ALSO, the word is likely misspelled a lot in the corpus. I've been playing with this model and I'm pleasantly surprised: it certainly knows a lot, quite a lot for 1.1G.
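
To make the "cannot see the actual characters" point concrete, here is a rough sketch. The vocabulary, the token IDs, and the helper names `toy_tokenize` and `count_letter` are all made up for illustration; real tokenizers split words differently, and this is not the tokenizer of any model mentioned in this thread.

```python
# Toy illustration only: this vocabulary and its IDs are invented,
# not taken from any real tokenizer.
TOY_VOCAB = {"str": 101, "aw": 102, "berry": 103}


def toy_tokenize(word, vocab):
    """Greedy longest-match split of `word` into pieces found in `vocab`."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate first, then shrink it.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:
            raise ValueError(f"no vocabulary piece matches at position {i}")
    return pieces


def count_letter(word, letter):
    """Count occurrences of `letter` by walking the characters directly."""
    return sum(1 for ch in word if ch == letter)


word = "strawberry"
pieces = toy_tokenize(word, TOY_VOCAB)
token_ids = [TOY_VOCAB[p] for p in pieces]

print(pieces)                  # the sub-word pieces
print(token_ids)               # roughly what a model receives as input
print(count_letter(word, "r")) # trivial with character access
```

With character access the count is a one-liner, but a model that only receives something like the ID list has to have memorized which letters each ID stands for, which is why the test says more about tokenization than about general capability.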


selcuka 122 days ago

Interesting. Qwen 3.5 0.8B failed the test for me.
