| HN Mirror

	by Loeffelmann 452 days ago
	That's the point. The old models all failed to produce a wine glass that is completely full to the brim, because you can't find many images like that in the data they used for training.

3 comments

colecut 452 days ago

Imagine if they just actually trained the model on a bunch of photographs of a full glass of wine, knowing about this litmus test.

gorkish 452 days ago

I obviously have no idea if they added real or synthetic data to the training set specifically for the full-to-the-brim wineglass test, but I fully expect that this prompt is now compromised: because it is being discussed in the public sphere, it has inherently become part of the test suite.

Remember the old internet adage that the fastest way to get a correct answer online is to post an incorrect one? I'm not entirely convinced this type of iterative gap finding and filling is really much different than natural human learning behavior.

friendzis 451 days ago

> I'm not entirely convinced this type of iterative gap finding and filling is really much different than natural human learning behavior.

Take some artisan; I'll go with a barber. This person is not the best of the best, but still a capable barber who can implement several styles on any head you throw at them. A client comes in and describes a certain style they want. The barber is not sure how to implement that style, so they consult the master barber beside them, who describes the technique required. Our barber then goes and implements the style. Probably not perfectly, as they need to train their mind-body coordination a bit, but the cut is good enough that the client is happy.

There was no traditional training with "gap finding and filling" involved. The artisan already possessed the core skill and knowledge required, was filled in on the particulars of the task at hand, and successfully carried it out. There was no looking at examples of finished work, no looking at examples of the process, no iterative learning by redoing the task a bunch of times.

So no, human learning, at least advanced human learning, is very much different from these techniques. Not that they are not impressive on their own, but let's be real here.

wegfawefgawefg 451 days ago

overfitting vs generalizing

also, we all know real people who fail to generalize and overfit instead: copycats, potentially even with great skill, but no creativity.

vlovich123 452 days ago

Humans don’t train on the entire contents of the Internet, so I’d wager that they do learn differently.

sayamqazi 451 days ago

I think there is a critical aspect of human visual learning which machine learning can't replicate because it is prohibitively expensive. When we look at things as children, we are not just looking at a single snapshot. When you stare at an object for a few seconds, you have practically ingested hundreds of slightly varied images of that object. This gets even more interesting when you take into account that the real world is moving all the time, so you are seeing so many things from so many angles. This is simply undoable with compute.

vlovich123 451 days ago

Then explain blind children? Or blind & deaf children? The senses obviously play some role in development, but there are clearly capabilities at play here that are drastically more efficient and powerful than what we have with modern transformers. While humans learn through example, they clearly need a lot fewer examples to generalize off of and reason against.

sayamqazi 448 days ago

> Then explain blind children?

I was only talking about vision tasks as an example. You can extend the idea to any sense.

> While humans learn through example, they clearly need a lot fewer examples to generalize off of and reason against.

The human brain has been developing over millennia; machines start from zero. What if this few-example learning is just an emergent capability of any "learning function" given enough compute and training?

wegfawefgawefg 451 days ago

they take in many samples of touch data

HelloImSteven 452 days ago

Even if they did, I’d assume the association of “full” and this correct representation would benefit other areas of the model. I.e., there could (/should?) be general improvement for prompts where objects have unusual adjectives.

So maybe training for litmus tests isn’t the worst strategy in the absence of another entire internet of training data…

orbital-decay 452 days ago

A lot of other things are rare in datasets, let alone correctly labeled: overturned cars (showing the underside), views from under the table, people walking on the ceiling with plausible upside-down hair, clothes, and facial features, etc.

myaccountonhn 451 days ago

I believe they still can't generate a watch that shows an arbitrary time, so it could be the case?

nefarious_ends 452 days ago

imagine!

sejje 451 days ago

I did coax the old models (DALL-E) into doing it once, but it was like a fun exercise in prompting. They definitely didn't want to.

jorvi 452 days ago

The old models were doing it correctly, too.

There is no one correct way to interpret 'full'. If you go to a wine bar and ask for a full glass of wine, they'll probably interpret that as a double. But you could also interpret it the way a friend would at home, which is about 2-3 cm from the rim.

Personally I would call a glass of wine filled to the brim 'overfilled', not 'full'.

kalleboo 451 days ago

I think you're missing the context everyone else has - this video is where the "AI can't draw a full glass of wine" meme got traction https://www.youtube.com/watch?v=160F8F8mXlo

The prompts (some generated by ChatGPT itself, since it's instructing DALL-E behind the scenes) include phrases like "full to the brim" and "almost spilling over" that are not up to interpretation at all.

drdeca 452 days ago

People were telling the models explicitly to fill it to the brim, and the models were still producing images where it was filled to approximately the half-way point.
