Lenat & Brown

LENAT and BROWN: — On Why AM Appeared to Work

The limitations of most “expert systems” derive from their computerized imitation of human behavior being restricted to superficial performances rather than focussing on important cognitive processes. A notable exception to this observation was the attempt by Lenat to construct a program which might learn by discovery [note 1]. His retrospective analysis of that program’s success and limitations demonstrates how serious work in Artificial Intelligence quickly rises to address issues that are profound for people as well.

How AM Discovered the Natural Numbers
In essence, AM was an automatic programming system whose primitive actions produced modifications to pieces of LISP code,

FOOT LISP is the premier programming language of the Artificial Intlelligence community. LOGO is a child-accessible dialect of LISP. The three distinguishing characteristics of LISP are its processing of symbolic expressions, its use of recursion in that processing, and its commitment to lists as well as character strings as fundamental data elements.

predicates which represented the characteristic functions of various math concepts…. For instance, AM had a frame that represented the concept LIST-EQUAL, a predicate that checked any two Lisp list structures to see whether or not they were equal…. During one AM task, it sought for examples of LIST-EQUAL in action, and a heuristic accomodated by picking random pairs of examples of LIST, plugging them in for x and y and running the algorithm. Needless to say, VERY few of those executions returned T (about 2%, as there were about 50 examples of LIST at the time). Another heuristic noted that this was extremely low (though nonzero), so it might be worth defining new predicates by slightly generalizing LIST-EQUAL; that is, copy its algorithm and slightly weakening it so that it returns T more often. When that task was chosen from the agenda, another heuristic said that one way to generalize a definition with two conjoined recursive calls was simply to eliminate one of them entirely, or to replace the “and” by an “or”. AM then defined three new predicates…. The first of these (L-E-1) has had the recursion in the CAR direction removed. All it checks for now is that, when elements are stripped off each list, the two lists become null at exactly the same time. That is, (L-E-1) is now the predicate we might call SAME-LENGTH…. AM quickly derived from L-E-1 a function we would call LENGTH and a set of of canonical lists of each possible length ((NIL), (T), (T T), (T T T), (T T T T), etc;) i.e. a set isomorphic to the natural numbers. By restricting list operations to these canonical lists, AM derived the common arithmetic functions (in this case, addition), and soon began exploring elementary number theory. So these simple mutations sometimes led to dramatic discoveries.

Why It Worked
This simple minded scheme worked almost embarassingly well. Why was that? Originally, we attributed it to the power of heuristic search (in defining specific goals such as “generalize LIST-EQUAL”) and to the density of worthwhile math concepts. Recently we have come to see that it is, in part, the density of worthwhile math concepts AS REPRESENTED IN LISP that is the crucial factor.

It was only because of the intimate relationship between Lisp and Mathematics that the mutation operators…turned out to yield a high “hit rate” of viable, useful new math concepts when applied to previously known useful math concepts — concepts represented as Lisp functions….

Why Extensions Failed
At that time, it seemed straight forward to simply add Heuretics (the study of heuristics) as one more field in which to let AM explore, observe, define, and develop. That task — learning new heuristics by discovery — turned out to be much more difficult than was realized initially, and we have just now achieved some successes at it. Along the way, it became clearer why AM had succeded in the first place and why it was so difficult to use the same paradigm to discover new heuristics….

When the basic automatic programming (mutation operators) were applied to viable, useful heuristics, they almost always produced useless (often worse than useless) new heuristics rules….

To rephrase that: a math concept C was represented in AM by its characteristic function, which in turn was represented as a piece of Lisp code….This would typically take about 4-8 lines to write down, of which only 1-3 lines were the “meat” of the function. Syntactic mutation of such tiny Lisp programs led to meaningful, related Lisp programs, which in turn were often the characteristic function for some related MATH concept. But taking a two page program (as many of the AM heuristics were coded)and making a small syntactic mutation is doomed to almost always giving garbage as the result. It is akin to causing a point mutation in an organism’s DNA: in the case of a very simple microorganism, there is a reasonable chance of producing a viable, altered mutant. In the case of a higher animal, however, such point mutations are almost always universally deleterious.

We can never directly mutate the meaning of a concept, we can only mutate the structural form of that concept as embedded in some representation. Thus there is never any guarantee that we aren’t just mutating some “implementation detail” that is a consequence of the representation, rather than some genuine part of the concept’s intensionality….

In terms of the syntax of a given language, it is straightforward to define a collection of mutators that produce minimal generalizations of a given Lisp function by systematic modifications to its implementation structure….Structural generalizations produced in this way can be guaranteed to generalize the extension of function, and that necessarily produces a generalization of its intension, or meaning. Therein lies the lure… We now understand that lure conceals a dangerous barb: minimal generalizations defined over a function’s structural encoding need not bear much relationship to minimal intensional generalizations, especially if these functions are computational objects as opposed to mathematical objects.

Lenat and Brown conclude their discussion with an emphasis on the dichotomy between the worlds of functioning structures and those of meaning.

FOOT This might incline one to take very seriously the difference between the elevation of control and the correlation of perspectives; such would be too simple a reaction(see Chapter 2 for a detailed explanation and exemplification of these phrases).

They sound the alarm that one must seriously attend to the interrelation of form and content; this is an old theme. More importantly, they have established that the local organization of knowledge in programs is a significant factor precisely because it controls the productive modifiablity of that knowledge, that is, its usefulness as a foundation from which learning may take place.

Print Friendly, PDF & Email