I’m Hayden, a 13-year-old developer based in Australia, and I’ve built a CoT logical thinking and reasoning AI model similar to OpenAI o1.
It's powered by small open-source models like Llama 3.1 and 3.2, and I would love for you to try it and share your feedback with me.
I built it just for fun and launched it a day after the o1 release. It's not perfect yet, but it's still amazing to see how much difference a detailed prompt can make to the quality of the LLM response!
Please let me know any feedback or suggestions! Thank you!
Impressive work, Hayden! It’s amazing to see someone your age diving into AI development. How did you go about fine-tuning the model? Any plans for future improvements or features?
Sorry to say that it failed at everything I threw at it. For coding questions it flat out refused to answer. For factual questions it was dead wrong. I asked a few very easy questions about Monty Python members, all of which are verifiable with a 10-second search, and it got all the answers wrong.
Hey! Sorry about that, it’s still not perfect but shows that using CoT prompt does improve llm responses. Compared with its base model, you can clearly see some difference. If you like, please email me at contact@pixelverse.tech with some of the prompts that t1 failed to answer correctly and I can take a look.
> but shows that using CoT prompt does improve llm responses.
A wrong answer is a wrong answer. In one of the questions it failed exactly in the same manner that GPT-4o did when I asked, so it’s not clear at all this is better. I could even see the chain and identify exactly where it made the mistake, but that’s not really a consolation.
As I said, it doesn’t answer every question correctly. What I am saying is that CoT prompting does have an effect on the quality of LLM responses. Ask both t1 and Llama 3.1 how many r’s are in “strawberry”, or a similar question, and you will see that the CoT strategy has some effect.
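For anyone who wants to try that comparison themselves, here is a minimal Python sketch of the idea. It builds a plain prompt and a CoT-style prompt for the same letter-counting question and checks a reply against the real count. The prompt wording, the function names, and the `Answer: <number>` convention are all assumptions made up for illustration; this is not t1's actual prompt, and sending the prompts to a model is left to whatever client you use.

```python
import re
from typing import Optional


def count_letter(word: str, letter: str) -> int:
    """Ground truth for checking a model's reply (case-insensitive)."""
    return word.lower().count(letter.lower())


def plain_prompt(word: str, letter: str) -> str:
    """Baseline: ask the question directly, with no reasoning instructions."""
    return f'How many times does the letter "{letter}" appear in "{word}"?'


def cot_prompt(word: str, letter: str) -> str:
    """Hypothetical CoT wording: same question plus explicit reasoning steps."""
    return (
        f"{plain_prompt(word, letter)}\n"
        "Think step by step:\n"
        "1. Spell the word one letter per line.\n"
        "2. Mark each line that matches the target letter.\n"
        "3. Count the marked lines.\n"
        "Finish with a line of the form 'Answer: <number>'."
    )


def extract_answer(reply: str) -> Optional[int]:
    """Return the integer from the last 'Answer: N' in the reply, or None."""
    matches = re.findall(r"Answer:\s*(\d+)", reply)
    return int(matches[-1]) if matches else None


if __name__ == "__main__":
    word, letter = "strawberry", "r"
    print(plain_prompt(word, letter))
    print(cot_prompt(word, letter))
    print("Expected count:", count_letter(word, letter))
```

Send both prompts to the same base model, run each reply through `extract_answer`, and compare the result with `count_letter`. Repeating this over a handful of words gives a rough sense of whether the CoT version helps, though a few prompts are not a real benchmark.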