Hacker News
by cl42 856 days ago
I subscribe to both ChatGPT and Gemini Ultra. Some interesting features/experiences with Gemini...

1. I asked it to do an image search and it responded with in-line images throughout its response. This was a nice "wow" moment and very neat.

2. Its drawing/illustration style is different from DALL-E's, so I use both.

3. Quality of general text responses is comparable, though I prefer ChatGPT.

I imagine you can probably just use one rather than have both. I am still primarily using ChatGPT.

How do you get Gemini Ultra to generate images? It just tells me that it can't do that yet.

Try telling instead of asking. When I tried it the other day, "Can you create a picture" got the response "no try DALL-E instead". Then I noticed that one of the example prompts was "Generate an image with an Elephant ...." It worked, as did some other random stuff I tried, as long as I told it to do it rather than asked it to.

I just tried asking it again and asking seems to work now too.

Most European countries are excluded:

> Image generation in Gemini Apps is available in most countries, except in the European Economic Area (EEA), Switzerland, and the UK. It’s only available for English prompts.

(https://support.google.com/gemini/answer/14286560?hl=en)

"Can you draw a photo of an avocado-shaped chair with a pineapple-man sitting in it?"

Came up with 4 images.

You raise a good point, though... I've also asked it to use its "web search" capability for tasks and it says it doesn't have that capability, but when I ask it by implying it should do a web search, it goes ahead and does it. Weird!

Yes, Gemini, and Bard before it, is often confused about its own capabilities. I use it to translate Chinese text in AliExpress product listings by taking screenshots. It’s perfectly capable and quite helpful at translating the text from those screenshots, but depending on how I phrase the question while uploading the photo, it will sometimes say “I’m only a language model I can’t help with that” or even “I can’t help with images”. Once it says that, I think it poisons the chat history, so I start a new session to try to get it to work. I haven’t translated many images, but so far this error happens maybe 20% of the time. It’s very strange.

I have another issue: when I paste a C++ code file into the web interface, the interface returns an error and refuses to accept the file, so Gemini never even sees the code. I opened AI Studio instead of the normal Gemini window and that seems to work, but I’d rather just use the normal chat window.

It’s all statistics. The training set probably included questions about its capabilities, and it was trained to claim fewer capabilities than it actually has. (Or it’s a bad system prompt.)

There’s no internal understanding of itself or its capabilities.

It may have no understanding with which to answer a question about its capabilities, but the point is that it has the capability and the prompt is failing to trigger it. That is separate from "knowing" or not. Think of ChatGPT functions that don't work.

Knowing how these models are trained and how these chat systems are built, I wouldn’t expect the question “Can you search the internet?” to actually cause an internet search.

People generally know things about themselves, and that human quirk is likely reflected in the QA training data and thus in the model’s outputs.

I live in the UK and have the same problem - it doesn’t work here.