by platypii 5039 days ago

While I believe the author is not wrong in arguing that information theory places a bound on machine intelligence, I think the problem with information-theoretic bounds is that they are not tight. In other words, I don't think information is currently the limiting factor.

The author argues that any AI will consist of two parts, the machine learning program L and the training set T, which combine to form the "intelligent" program M. Thus, by information theory, k(L) + k(T) >= k(M), where k is the Kolmogorov complexity. So the information in M is bounded by the information in L and T together. The author argues that since both of these depend on humans supplying them, we are limited by the human factor.
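For reference, a slightly more careful statement of that bound, as a sketch under two assumptions that are not in the original article: M is the output of some fixed computable procedure A applied to the pair (L, T), and k is the prefix-free variant of Kolmogorov complexity. The additive constant depends on A and on the reference machine, not on L or T.

```latex
% Assumptions (not from the article): M = A(L, T) for a fixed computable A,
% and k denotes prefix-free Kolmogorov complexity.
k(M) \;\le\; k(L, T) + O(1) \;\le\; k(L) + k(T) + O(1)
```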

But how much information does one really need for AI? Well, how much information is necessary for human intelligence? Assume that L is our genome (ignoring epigenetics) and T is our life experiences. The human genome is about 3 billion base pairs, which is under a gigabyte at 2 bits per base. I would argue that's certainly within the realm of feasibility for a programmer's output. How about the training set T? That's harder to say; does it include video, audio, touch, etc.? How many years until a human is considered intelligent (by AI standards)? I think it's safe to say that a 10-year-old blind human could pass a Turing test. So if we ignore tactile and olfactory feedback, we basically just need 10 years of compressed audio as the training set. Generously encoding the audio at 128kbps, 24/7 for 10 years, comes to about 4.7 terabytes, which is easily within the realm of current machine learning. We have far more information than that (and much more densely encoded, as text), but we still aren't close to True AI.
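The back-of-envelope numbers above can be checked with a few lines of Python. This is only a sketch under stated assumptions: the genome figure is read as roughly 3 billion base pairs stored at 2 bits per base with no compression, a year is taken as 365 days, and the function names are made up for illustration. The 4.7 figure for audio comes out if 128kbps means 128 × 1024 bits per second and the result is counted in binary terabytes; with decimal units it is closer to 5 TB. Either way the order of magnitude is the same.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: ignore leap days


def genome_bytes(base_pairs: int = 3_000_000_000, bits_per_base: int = 2) -> float:
    """Uncompressed size of a genome, assuming 2 bits per base (A, C, G, T)."""
    return base_pairs * bits_per_base / 8


def audio_bytes(bits_per_second: int, years: int = 10) -> float:
    """Size of a continuous audio stream recorded 24/7 for the given number of years."""
    return bits_per_second / 8 * SECONDS_PER_YEAR * years


if __name__ == "__main__":
    print(f"genome: {genome_bytes() / 1e6:.0f} MB")

    decimal = audio_bytes(128_000)    # 128 kbit/s with k = 1000
    binary = audio_bytes(128 * 1024)  # 128 kbit/s with k = 1024
    print(f"audio at 128,000 bit/s: {decimal / 1e12:.2f} TB")
    print(f"audio at 131,072 bit/s: {binary / 2**40:.2f} TiB")
```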

I think the problem is not that we don't have enough information; it's that we have not yet searched enough of the problem space. And that's where more hardware can help us.


enki 5039 days ago

As per http://paulbohm.com/pdf/10.1.1.76.5543.pdf and http://paulbohm.com/pdf/general_limitations.pdf, learning by example requires many more examples than the Kolmogorov complexity of the target concept would suggest.

T, then, is not just the life experience of one human. To reconstruct a human by learning from observed examples, you'd need to observe the life experiences of trillions and trillions of humans. Only then would you have enough information to narrow down the possible implementations to ones that match human behavior not just mechanically on the same input, but also on new, previously untested input.

At roughly 3 billion base pairs of DNA the search space is already huge, but that ignores that the genome alone doesn't contain the information needed to read it (e.g. you need a living thing to use the DNA for it to make sense).


platypii 5039 days ago

Well, I guess I have some citations to read later, but intuitively I just have a hard time believing that lack of training data is the problem. There's an incredible amount of data available on the internet. If that were the limiting factor, it would just be a matter of throwing more data at T to increase the amount of information in M, but that doesn't help if we haven't found the right machine learning algorithm L. And searching that space is the real challenge of AI. To search that space we need either human experts, huge amounts of computational power to brute-force it, or some combination thereof. We've tried human experts alone for decades without much success, so I think it's likely that we will need some computational assistance to find the right algorithms. That's why AI people love throwing around graphs of Moore's law.


enki 5039 days ago

Not if the amount of training data required to learn by example grows exponentially or even combinatorially. We might not even be in the same ballpark. All data ever recorded on digital media might not be enough to learn even a simple intelligence without assumptions about structure.
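To make the "without assumptions about structure" point concrete, here is a toy illustration (not the result from the papers linked earlier, just the textbook worst case). If the target could be any Boolean function of n input bits, there are 2^(2^n) candidates, and each distinct labeled example only halves the set of candidates still consistent with the data. Pinning down one function therefore takes all 2^n possible inputs. The function names are made up for this sketch.

```python
def num_functions(n_bits: int) -> int:
    """Number of distinct Boolean functions on n_bits inputs: 2^(2^n)."""
    return 2 ** (2 ** n_bits)


def consistent_hypotheses(n_bits: int, examples_seen: int) -> int:
    """Functions still consistent after seeing `examples_seen` distinct labeled inputs.

    With no structural assumptions, every unseen input can still map to 0 or 1,
    so 2^(unseen inputs) functions remain.
    """
    total_inputs = 2 ** n_bits
    if not 0 <= examples_seen <= total_inputs:
        raise ValueError("examples_seen must be between 0 and 2**n_bits")
    return 2 ** (total_inputs - examples_seen)


if __name__ == "__main__":
    for n in range(1, 6):
        needed = 2 ** n  # examples needed to leave exactly one hypothesis
        print(f"n={n}: {num_functions(n)} functions, {needed} examples to identify one")
```

Any learner that gets by with fewer examples is relying on some prior about which functions are plausible, which is exactly the structure assumption in question.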

A kind of analogous problem: can you learn how to build a living thing purely from digitally recorded DNA samples, if you don't have access to the internal structure of any living thing? How many DNA samples would you need?
