by tmaly 107 days ago
I was thinking about something similar the other day. I have seen a recurring pattern: a new model comes out, people say it's amazing for a few weeks, then they complain that it got nerfed.

Most of these claims are subjective. I was thinking that if we had a standardized chain-of-thought representation, and could capture each model's chain of thought in that format, we could compare them on the same tasks we run.
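
A rough sketch of what such a format could look like, purely as an illustration. Every name here (`ReasoningStep`, `ReasoningTrace`, `structure_similarity`, the `kind` labels) is hypothetical, and comparing only the sequence of step kinds is an assumption about what "compare" might mean, not an established method:

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class ReasoningStep:
    kind: str  # hypothetical label, e.g. "plan", "compute", "check"
    text: str  # the model's own wording for this step


@dataclass
class ReasoningTrace:
    model: str    # which model (or model snapshot) produced the trace
    task_id: str  # identifier of the task that was run
    steps: list[ReasoningStep] = field(default_factory=list)


def structure_similarity(a: ReasoningTrace, b: ReasoningTrace) -> float:
    """Return a ratio in [0, 1] comparing only the order of step kinds."""
    if a.task_id != b.task_id:
        raise ValueError("traces must come from the same task")
    kinds_a = [step.kind for step in a.steps]
    kinds_b = [step.kind for step in b.steps]
    return SequenceMatcher(None, kinds_a, kinds_b).ratio()
```

Running the same task against the same model at two points in time and tracking this ratio would at least put a number on "it feels different now", though it says nothing about whether the answers got better or worse.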

1 comment

Yeah, that's essentially what I'm looking for. Now that AI has become such a core part of most businesses, it's pretty critical to use the _best_ models + prompts for whatever your use case is.