by throwuwu 995 days ago
Feature. Once the multimodal rollout is complete, Plus will have image gen, image recognition, voice recognition, and voice gen all integrated with the chat capabilities, so you can combine those features in novel ways, like the link Brockman retweeted showing ChatGPT acting as a language tutor and conversation partner.
1 comments

To be fair, you can do all that from Bing Chat too (image/voice recognition and generation). And plugins are coming to it too.

The downsides with Bing currently are:

1. If you're not prepared to be civil to a language model, you're not going to have a good time.

2. The image input feature isn't quite the same. It feels like the descriptions are bolted on from a separate model (GPT-4V, unless the Bing CTO was lying), so it's lossy in a way that output straight from GPT-4V isn't.

3. Voice recognition and TTS are good but worse than what OpenAI is currently using. Perhaps they'll switch, since the TTS is new? But idk. It's also not hands-off the way OpenAI has designed its implementation.

Still waiting to get image upload access on ChatGPT Plus (GPT-4) myself, but Bing's image recognition, going by my own attempts with its image upload, is vastly inferior to what I've seen on Twitter over the last few days from ChatGPT's image upload feature.