|
|
|
|
|
by alsodumb
929 days ago
|
|
By comparing on benchmarks that are either limited, or have data leaks, or in most cases just don't make sense in terms of usability - I've personally stopped looking at benchmarks to compare models. Personally, if I want to try a new model I hear a lot of chatter about, I use it for a few hours in my daily workflow. My baseline is GPT3.5 and GPT4, and I compare the models with them in terms of my day to day usage. |
|