by Vuizur 1157 days ago
Similar to ChatGPT, it also fails the "What is heavier, one pound of feathers or two pounds of lead?" test. So far, only GPT-4 passes that one.

My local 65B llama prompt gave me this answer:

tyfon: What is heavier, one pound of feathers or two pounds of lead?

Omnius: Two pounds of lead are heavier than one pound of feather.

Not bad :)

Indubitably, good fellow.

I suspect that if we can fine-tune and optimize this 65B model, we can achieve some truly remarkable results.