
by chimi 2444 days ago

You're touching on the "difficulty" of verbalizing it. I see what you mean, because you did learn that the heuristic was changing from nothing more than a yes or no. I said you can't teach that way, but you clearly learned that way, so I wasn't exactly correct, though I still don't think I'm wrong in practice.

I wonder, how would an AI perform on the same test.

What is the mathematical minimum number of questions on such a test, after the heuristic changes, that could guarantee the new heuristic has been learned?

I'm curious about the test. Did it have a name? What were they testing you for?

4 comments

visarga 2443 days ago

> I wonder, how would an AI perform on the same test.

This situation is called the multi-armed bandit problem. In this setup you have a number of actions at your disposal and need to maximise reward by selecting the most effective ones. But the results are stochastic and the player doesn't know in advance which action is best. They need to 'spend' some time trying out various actions, then focus on those that work better. In a variant of this problem, the rewards associated with the actions also change over time. It's a very well-studied problem, a simple form of reinforcement learning.
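
To make the explore-then-focus idea concrete, here's a minimal sketch of one common strategy for this problem, epsilon-greedy. It isn't anything from the test being discussed; the function name `run_bandit`, the three payout probabilities, and the 10% exploration rate are all made up for illustration.

```python
import random


def run_bandit(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy play on a Bernoulli bandit (illustrative sketch).

    true_probs: hidden payout probability of each arm (hypothetical values).
    Returns (pull counts, reward estimates, total reward).
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0

    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: 'spend' a turn on a random arm.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm that currently looks best.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Stochastic result: the player only ever sees this 0 or 1.
        reward = 1 if rng.random() < true_probs[arm] else 0

        counts[arm] += 1
        # Incremental update of the running mean for this arm.
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

    return counts, estimates, total_reward


counts, estimates, total = run_bandit([0.2, 0.5, 0.8])
print("pulls per arm:", counts)
print("estimated payout:", [round(e, 2) for e in estimates])
```

For the variant where rewards change over time, one usual tweak is to replace the `1 / counts[arm]` step with a small constant step size so that old observations fade out.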

rusticpenn 2443 days ago

If the rewards are changing, then isn't it a moving target problem?

gbear605 2444 days ago

Doesn’t it depend on what you mean by guarantee? The test can’t give 100% certainty, since theoretically you could be flipping a coin each time and miraculously get it right 1000 times in a row. The chance of that is minuscule (1/2^1000), but it’s nonzero. So we’d have to define a cutoff point for "guaranteed". The one generally used in many sciences is a 1/20 chance (p = 0.05), so that seems like a plausible one, and with that cutoff I think you’d need five questions passed in a row (1/2^5 = 1/32). In general, for a cutoff of p, you need log2(1/p) questions passed correctly in a row, rounded up.

However, that only works if the only options are random guessing and having learned the heuristic. If you sort of know the heuristic (e.g. you’re right 2/3 of the time), then you’d get the 5 questions right ~13% ((2/3)^5) of the time, which is above the p = 0.05 cutoff. So you also need to define a tolerance around the heuristic, like matching it a fraction X of the time. Then you’d need log(1/p)/log(1/X) questions, rounded up. For example, if you wanted to rule out, at the p = 0.05 threshold, anyone who matches the heuristic only 19/20 of the time or less, you’d need log(1/0.05)/log(1/(19/20)) ~= 59 questions.
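
The arithmetic above is easy to check with a few lines of Python. This is just a sketch of the calculation in the comment: `questions_needed` is a made-up helper name, and it counts upward instead of calling `log` so that floating-point rounding can't push the answer off by one.

```python
def questions_needed(p, x=0.5):
    """Smallest streak length n such that x**n <= p.

    p: cutoff probability (e.g. 0.05).
    x: chance of getting a single question right without fully
       knowing the heuristic (0.5 = coin flip between two answers).
    """
    if not (0 < p < 1 and 0 < x < 1):
        raise ValueError("p and x must both be strictly between 0 and 1")
    n = 0
    streak_prob = 1.0
    while streak_prob > p:
        streak_prob *= x   # probability of one more correct answer in a row
        n += 1
    return n


print(questions_needed(0.05))           # 5  (pure coin flipping)
print(questions_needed(0.05, 2 / 3))    # 8  (right 2/3 of the time)
print(questions_needed(0.05, 19 / 20))  # 59 (right 19/20 of the time)
```

The same helper covers the case of more than two answers per question: pass `x = 1/k` for random guessing among `k` options.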

mark-r 2443 days ago

There were more than two possible answers to choose from on each page, so the odds of being right were considerably lower.

mark-r 2443 days ago

I'm sure the test was a standard one with a name, but I was never told what it was. It was a small part of a 3-hour ordeal evaluating my healing progress since suffering a brain injury in March.

I would agree that it's a very inefficient way of teaching something. It gave me an unexpected insight into machine learning though.

I'm sure the test was designed so that picking the same answer each time or picking one at random would result in a fail.

munin 2444 days ago

Sounds like Raven's Progressive Matrices.

mark-r 2443 days ago

Similar but not the same.
