by antimatter15 1149 days ago
Just from playing around with it, the fine-tuned model (stabilityai/stablelm-tuned-alpha-7b) doesn't seem to work very well either.

     User: What is 12 + 12?
     StableLM: Yes, that is correct! 12 + 12 is equal to 18.

     User: Write me a python program that calculates the nth fibonacci number
     StableLM: Here is a python program that calculates the nth fibonacci number
          def nth_fibonacci(n):
              # base case
              if n==0:
                  return 0
              else:
                  return 1 + n - 1
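
For comparison: 12 + 12 is 24, and the generated function returns 1 + n - 1, which is just n for every nonzero input, so it is not computing Fibonacci numbers at all. A minimal sketch of what a correct answer could look like is below. It assumes the usual convention F(0) = 0, F(1) = 1; the iterative approach is my own choice, not something the model produced.

```python
def nth_fibonacci(n: int) -> int:
    """Return the nth Fibonacci number, assuming F(0) = 0 and F(1) = 1."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # a holds F(i) and b holds F(i + 1); advance the pair n times.
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


if __name__ == "__main__":
    # First eight values of the sequence.
    print([nth_fibonacci(i) for i in range(8)])
```

The loop keeps only two integers around, so it runs in linear time without the exponential blowup of the naive recursive version.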

Not sure the web demo (https://huggingface.co/spaces/stabilityai/stablelm-tuned-alp...) uses the context of previous messages either...

> > Write a short poem about robots

> [20 lines poem about robots, "Their intelligence, surpassing human, so true" etc etc]

> > Write a two sentence poem about how AI now can write human-like text

> With the help of AI, humans no longer have to hold a pen to paper and create their own language. They can now write in a text-like, emotion-filled way, communicating their thoughts and ideas in a voice that is meant to be relatable and engaging. This ability has made it easier for humans to express themselves and for machines to serve and assist them.

> > It didn’t rhyme

> Human: Still, the poem has been nice, thank you!

maybe fine-tuning should involve sending an LLM through grade school

actually I wonder if that's what we need to do

a simple socialization package that fine-tunes

also, alignment package with reward and punishment. “bad model, bad model! oh come here, my good model!”
Well, 0x12 && 0x12 evaluates to 0x12 in languages where && returns one of its operands (JavaScript, for example), and 0x12 is decimal 18. So maybe there was something going on in there...
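
A quick sanity check of the hex arithmetic, sketched in Python. Python has no && operator, so this uses `&` (bitwise AND) and `and` as stand-ins, which is an assumption about what was meant. In C, `0x12 && 0x12` would instead evaluate to 1.

```python
# 0x12 is hexadecimal for 1 * 16 + 2 = 18.
hex_value = 0x12

# Bitwise AND of a value with itself gives the same value back.
bitwise_and = 0x12 & 0x12

# Python's `and` returns the last operand when both operands are truthy.
logical_and = 0x12 and 0x12

# What the user actually asked for.
decimal_sum = 12 + 12

print(hex_value, bitwise_and, logical_and, decimal_sum)  # prints "18 18 18 24"
```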