Hacker News
by alasano 66 days ago
My favorite Google LLM benchmark is asking Gemini models to create a script that fetches API usage (just request counts) for a project from GCP.

100% failure rate.
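
For reference, here is a rough sketch of what such a script could look like. It is not a verified answer to the benchmark. It assumes the counts come from Cloud Monitoring's `serviceruntime.googleapis.com/api/request_count` metric on the `consumed_api` resource, that the `google-cloud-monitoring` package is installed, and that Application Default Credentials with Monitoring read access are set up. The helper names (`build_filter`, `sum_by_service`, `fetch_request_counts`) and the 7-day default window are made up for illustration.

```python
import time
from collections import defaultdict

# Assumption: this is the metric that records requests to APIs a project consumes.
REQUEST_COUNT_METRIC = "serviceruntime.googleapis.com/api/request_count"


def build_filter(metric_type: str = REQUEST_COUNT_METRIC) -> str:
    """Build a Cloud Monitoring filter for the request-count metric."""
    return f'metric.type = "{metric_type}" AND resource.type = "consumed_api"'


def sum_by_service(series_list) -> dict:
    """Add up the int64 points of each time series, keyed by the 'service' label."""
    totals = defaultdict(int)
    for series in series_list:
        service = series.resource.labels.get("service", "unknown")
        totals[service] += sum(p.value.int64_value for p in series.points)
    return dict(totals)


def fetch_request_counts(project_id: str, days: int = 7) -> dict:
    """Query Cloud Monitoring and return total request counts per API service."""
    # Imported here so the helpers above can be used without the client library.
    from google.cloud import monitoring_v3

    client = monitoring_v3.MetricServiceClient()
    now = int(time.time())
    interval = monitoring_v3.TimeInterval(
        {
            "end_time": {"seconds": now},
            "start_time": {"seconds": now - days * 86400},
        }
    )
    series = client.list_time_series(
        request={
            "name": f"projects/{project_id}",
            "filter": build_filter(),
            "interval": interval,
            "view": monitoring_v3.ListTimeSeriesRequest.TimeSeriesView.FULL,
        }
    )
    return sum_by_service(series)
```

Calling `fetch_request_counts("my-project-id")` (a placeholder project ID) should return a dict mapping each API service name to its total request count for the window, provided the assumptions above hold. The metric is a delta metric, so summing the raw points gives the total.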

1 comment

I've yet to receive an accurate response from Gemini about GCP services, beyond completely trivial topics. The most recent, I think, was Gemini advising me that I could attach an existing pd-ssd PVC to an N4 or C4 VM. For whatever unknowable reason, Google doesn't allow this and doesn't offer a migration path, and Gemini doesn't "know" anything about it either. It's wild.
Agreed that it should know; it's their LLM.

What bothers me is that even when it does extensive research in the documentation, it still can't figure it out.

GCP must simply be so unintuitive that the LLM mind cannot comprehend it.