by gregsadetsky 921 days ago
I used the just-released Gemini Pro API with multimodal input to test some of the things from the infamous Gemini Demo. You can see here [ https://www.youtube.com/watch?v=__nL7Vc0OCg ] my GPT-4 recreation of that ad which went viral.

Gemini Pro is... not great. In one test, I asked what gesture I was making (while showing a thumbs up) -- it said thumbs down and "The image is a commentary on the changing nature of truth".

I just made a head-to-head comparison -- you can watch it here: https://www.youtube.com/watch?v=1RrkRA7wuoE

Code is here: https://github.com/gregsadetsky/sagittarius

1 comment

I think the fair comparison would be GPT-3.5 (if image inputs were supported) vs Gemini Pro. It would be great to compare this with Gemini Ultra next year.