PAPERT: On Learning in a Machine
Does heredity — with the richness of the genome’s structured chemistry — or epistemology — the implicit logic of ideas in concert — offer the more fruitful explanation of what makes learning possible? Usually this problem is too difficult to decide, but Papert offers a single, compelling example which argues that the power of ideas MUST NOT be ignored in discussing the roots of knowledge and the processes of learning [note 1]. The central question is what must be considered in inferring that an idea is “innate” in a thinking thing. Papert describes a machine called a perceptron as follows.
…Its structure is quite simple: it has a retina on which pictures can be projected, and the purpose of the machine is to recognize whether or not the image has a certain property called “the predicate” (for example, ‘is a square’). It has a large number of submechanisms, each of which can compute the answer, expressed as ‘yes’ or ‘no’, to any well-defined question about a tiny region of the retina. Collectively, these submechanisms (called ‘the local functions’) cover the whole retina, but none of them has any global knowledge. In particular, none of them can see the whole figure.
There is also a central organ, which has access to the answers given by the local mechanisms; but this organ is constrained to a particular simple algorithm to generate the global decision (for example, ‘it is a square!’) from the local functions, namely the ‘linear threshold decision function’: the binary yes/no outputs of the local functions are represented as 1 and 0, respectively; the machine forms a weighted sum of these numbers using weighting coefficients characteristic of the particular perceptron; it then makes its decision according to whether the sum comes out to be more than or less than a certain quantity called ‘the threshold.’
Finally, the perceptron is equipped with a ‘learning mechanism’, which works like this: when the machine says ‘that’s a square’, it will be told whether it is right or wrong, and if wrong, will use this feedback to alter its weighting coefficients.
What can a perceptron learn? The answer is not always immediately obvious, either from an examination of its ‘innate structure’ or from simple experiments. It is easy enough to see that a perceptron could learn to distinguish between dichotomies such as SQUARE versus TRIANGULAR; in this case, the local functions recognize the presence or absence of at least one angle which is not a right angle, and so the hypothesis SQUARE can be eliminated on local grounds. But there are cases where it is much harder to see whether the global decision is reducible to such local ones. The classic example is the predicate ‘is connected’. The picture is assumed to be a black figure on a white ground, and the question for the perceptron is whether the black figure is made of one or several pieces. Is such a decision reducible to local observations? Intuition still says that a perceptron should not be able to tell whether there is only one blob or several. But a deep mathematical theorem by Euler can be adapted to show that the perceptron can learn any predicate like ‘the number of blobs is less than k’.
Let us now imagine an investigator who does not know the Euler theorem and happens to be concerned with whether blob-numerosity (in the sense of the predicate just mentioned) is innate in the perceptron. One can easily imagine such a person being very puzzled when shown the wiring diagram of the perceptron. He would see nothing there which (to his mind) even remotely resembles numerosity. He might conclude that something must be missing from our wiring diagram (so one of the cautioning morals of the story is that one has to be very careful about conclusions like this). But the even more important, if more subtle, conclusion is that even with full knowledge (of the wiring diagram and the mathematics), it is not at all clear whether one ought to say that numerosity is innate. In some senses of ‘innate’ it is, and in other senses it is not. The conclusion for me is that we need a much more carefully elaborated theoretical framework within which to formulate the real questions that lie behind formulations such as, ‘Does the subject X have the notion Y?’ or ‘Is Y a property of the initial state of this subject?’ or ‘Is Y innate?’
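Before turning to Papert’s conclusion, the two mechanisms he describes — the linear threshold decision and the error-correction learning rule — are the classical perceptron, and can be made concrete in a few lines. A minimal sketch in Python (the function names and the unit learning rate are illustrative choices, not Papert’s):

```python
def decide(weights, threshold, local_answers):
    """Linear threshold decision: form a weighted sum of the local
    functions' yes/no (1/0) answers and compare it to the threshold."""
    total = sum(w * a for w, a in zip(weights, local_answers))
    return 1 if total > threshold else 0  # 1 means "yes" (e.g., "it is a square")

def learn_step(weights, threshold, local_answers, correct_answer, rate=1.0):
    """Error-correction learning: when the global decision is wrong,
    shift each weight toward the correct answer; when it is right,
    leave the weights unchanged."""
    error = correct_answer - decide(weights, threshold, local_answers)  # -1, 0, or +1
    return [w + rate * error * a for w, a in zip(weights, local_answers)]
```

On any predicate that is linearly separable in the local answers, repeated application of this update is guaranteed to converge (the perceptron convergence theorem); Papert’s question is precisely which predicates admit such a separation at all.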
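Papert cites the Euler theorem without stating it. One standard form of the relevant fact (a plausible reconstruction, not necessarily the exact argument Papert had in mind) is that the Euler number of a binary picture — the number of blobs minus the number of holes — is a weighted count of purely local 2x2 pixel patterns, so a linear threshold unit over 2x2 local functions can test predicates about it. A sketch of that quad-counting formula (often attributed to Gray) in Python:

```python
def euler_number(image):
    """Euler number (components minus holes, 4-connectivity) of a binary
    image given as a list of 0/1 rows, computed entirely from local 2x2
    windows via the quad-count formula: chi = (Q1 - Q3 + 2*Qd) / 4."""
    rows, cols = len(image), len(image[0])
    # Pad with background so every foreground pixel lies inside some window.
    padded = [[0] * (cols + 2)]
    for row in image:
        padded.append([0] + list(row) + [0])
    padded.append([0] * (cols + 2))
    q1 = q3 = qd = 0
    for r in range(rows + 1):
        for c in range(cols + 1):
            quad = (padded[r][c], padded[r][c + 1],
                    padded[r + 1][c], padded[r + 1][c + 1])
            ones = sum(quad)
            if ones == 1:
                q1 += 1                       # exactly one foreground pixel
            elif ones == 3:
                q3 += 1                       # exactly three foreground pixels
            elif ones == 2 and quad[0] == quad[3]:
                qd += 1                       # two diagonally opposite pixels
    return (q1 - q3 + 2 * qd) // 4
```

For figures without holes the Euler number equals the number of blobs, which is how a predicate like ‘the number of blobs is less than k’ can become locally computable: each 2x2 window is one of Papert’s local functions, and the global count is just their weighted sum — exactly the local-to-global reduction whose invisibility puzzles the imagined investigator.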
Papert concludes from his argument by example that the key problem of learning theory should be to reduce the sense of miracle induced by the power of the mind both to think and to grow. Euler’s theorem can do this for a student of perceptron-learning. The more precise articulation of the processes Piaget named “assimilation” and “accommodation” is the best hope for achieving this end in the case of human learning. By providing crisp examples of how functionally labile cognitive structures can be seen as changing through their local interactions, we move in such a direction.