Appendix C: The Role of Modeling in Case Study Analyses
With respect to case study analysis, cognitive modeling plays a key role as follows. A first use of cognitive models is to provide fully explicit descriptions of cognitive structures ascribable to the subject on the basis of evidence in the corpus. Such an ascription can be treated as a particular theory in this specific sense: if the subject has the knowledge ascribed to her, then there should exist a range of behaviors which would be possible given that particular state of knowledge. If cognitive modeling techniques are used to represent that particular knowledge and a small domain of possible interaction with the environment, the model is able to generate a complete specification of the possible behavior given that particular knowledge and the interactions of the domain (see Lawler and Selfridge, 1985). The outcome of such a simulation is an explicative model of the knowledge and its interaction potential. The central role of explicative models in case study investigation is that, as particular theories, they can help articulate what other sorts of behavior might possibly occur and how they might be related to the ascribed knowledge state. The researcher can then return to the empirical corpus with an enriched appreciation of what sorts of behavior might support or contradict the ascribed model and can search for such examples. Such explicative models can also serve as a foundation for performance models and for learning models.
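The generative use of an explicative model can be suggested with a schematic sketch. The knowledge state, actions, and interaction domain below are invented for illustration; they stand in for no actual corpus:

```python
# Illustrative sketch only: a toy "explicative model" that enumerates the
# range of behaviors possible given an ascribed knowledge state and a small
# domain of interactions. All names and transitions are invented.

# knowledge ascribed to the subject: which actions she commands in each state
knowledge = {
    "start": ["count", "point"],
    "counted": ["announce"],
}

# a toy interaction domain: how the environment responds to each action
transitions = {
    ("start", "count"): "counted",
    ("start", "point"): "start",
    ("counted", "announce"): "done",
}

def possible_behaviors(state, depth=3):
    """Enumerate every action sequence the ascribed knowledge permits."""
    if depth == 0 or state not in knowledge:
        return [[]]
    behaviors = [[]]  # the subject may also do nothing further
    for action in knowledge[state]:
        nxt = transitions[(state, action)]
        for tail in possible_behaviors(nxt, depth - 1):
            behaviors.append([action] + tail)
    return behaviors

for b in possible_behaviors("start"):
    print(b)  # the complete specification of possible behavior
```

The enumeration is the point: with the knowledge and domain made explicit, the range of possible behaviors is no longer a matter of the analyst's impression but a computed consequence of the ascription.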
It is generally recognized, in disciplines of both natural and artificial intelligence, that having several overlapping representations of a situation available is productive in problem solving. More rarely recognized are the centrality of multiple representations in learning and their specific roles. Simple machine learning procedures, when based upon profound domain analyses (or even clever tricks), can sometimes go surprisingly far in producing self-improving systems, but in their potential for significant learning they do not approach the richness of multiple schemas, with different representation schemes, competing and cooperating to solve a single problem.
When one assumes that multiple schemas in the mind compete and/or cooperate in solving particular problems, a new level of complication is introduced. Modeling must then move beyond explicative models to performance models. Such models require a specification of the processes of interaction between simultaneously active schemas. Learning models require the further specification of how particular schemas in a given performance model change some parts of themselves, both through intra-action (interaction among themselves) and through interaction with the external environment.
Multi-schema modeling is the exploration and exploitation of intra-active learning, by which is meant the development of new knowledge from the interaction of disparate bodies of knowledge within a single individual (that cluster of knowledge, seen as a whole individual, interacts with an external environment in emergent behavior). In computer-based implementations, the objective of multi-schema modeling is to exploit frontier performances: patterns of behavior circumstantially successful but beyond the reach of replicable behavior. These frontier performances, or surprising successes, are used as triggers to initiate reflexive construction of new procedures and structures, reconstruction of existing procedures and structures, and re-balancing of influence among schemas in an ensemble of functionally independent but genetically [in the Piagetian sense] linked active schemas. Simultaneous multiple representation of the problem being solved is assumed to be the normal circumstance of problem solving. The centrality of this aspect of multi-schema modeling to learning is detailed below. Historically, multi-schema modeling began as a reaction against the classic compositional learning paradigm, in which new learned procedures were composed of other procedures upon achieving a preset goal through trying all combinations of already existing primitive and composed procedures.
Some issues of mind are so general that their consideration seems grandiose. To the extent, however, that the general framework guides specific developments, it is appropriate to sketch, albeit somewhat vaguely, the directions in which such modeling might move to address general issues. The primary, larger scale psychological questions implicated in multi-schema modeling are these:
1. how could/can significant generalization take place?
2. if development is not strictly progressive but involves (possibly recurrent) reorganizations, how can this be imagined to occur?
3. recapitulative reconstruction: how can we imagine the spiral of development to emerge from interactions of an ensemble of particular schemas? what could be the role of multiple modes of perception and representation in the cycle of cognitive development?

These questions are addressed in the following sections.
The general stance of this position is that some kinds of generalization can proceed through a mechanism based on the interaction of schemas constructed in compatible but different ways on the basis of experiences essentially different in respect of the mode of perception serving as the original channel of the experience. (This is not a claim that all generalization is of this nature.) One key notion (argued through a worked example in Lawler, 1987a) is the productivity for major cognitive development of partially-compatible systems of representation, based upon disparate, experience-based models whose variety derives from their generation through the primarily diverse modes of human perception and action.
Generalization from Specific Incidents
Within such a non-uniform process model of mind, generalization through problem solving goes forward following a process such as that outlined below:
1. multiple, active schemas work simultaneously on a problem; one is dominant (call it schema D)
2. when the plan of the dominant schema is abandoned, yet success occurs, there are consequences:
a. the success indicates that something worth learning has occurred.
b. by definition, the surprising success could not have been anticipated by the dominant schema D.
c. making a distinction between this new circumstance and others requires explanation from some alternative source. What other source is possible? Information from one of the subdominant schemas, S1, S2, ..., Sn. Let Si provide an acceptable explanation.
d. when such an explanation succeeds in discriminating this new circumstance from others, there are consequences:
+ memory of the circumstance of the surprising win; this amounts to a generalization by extension of schema D’s application to a new effective area.
+ increase in salience/utility of schema Si; this amounts to a generalization by extension of its applicability, through its use as an explanation.
+ most important, inter-linking of the terms of descriptions between schema D and schema Si creates rudiments of a new sub-articulate level of structure which — when richly described — can emerge as a new “level of structure.” This possibly has bearing on the issue of how encapsulation of a process might work and what an encapsulated process might be: a first-order abstraction of equivalencies between at least two schemas, reified as an object fitting that description.
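The process outlined above can be suggested in a schematic sketch. All names, values, and structures here are invented for illustration; they are not Lawler's implementation:

```python
# Schematic sketch only: success-triggered generalization among schemas.
# All names, values, and structures are invented, not Lawler's actual code.

class Schema:
    def __init__(self, name, known_cases, salience=1.0):
        self.name = name
        self.known_cases = set(known_cases)  # circumstances the schema covers
        self.salience = salience             # relative influence as explainer
        self.links = set()                   # inter-schema description links

    def explains(self, circumstance):
        return circumstance in self.known_cases

def surprising_success(dominant, subdominants, circumstance):
    """A win occurred that the dominant schema D could not anticipate."""
    assert not dominant.explains(circumstance)      # (b) by definition
    for s in subdominants:                          # (c) try S1, S2, ..., Sn
        if s.explains(circumstance):
            # (d) consequences of a successful discrimination:
            dominant.known_cases.add(circumstance)  # extend D's effective area
            s.salience += 0.5                       # raise Si's salience/utility
            dominant.links.add(s.name)              # rudiments of a new
            s.links.add(dominant.name)              #   sub-articulate level
            return s
    return None  # no explanation; the circumstance stays unassimilated

D = Schema("D", {"case-1", "case-2"})
S1 = Schema("S1", {"case-3"})
S2 = Schema("S2", {"odd-case"})
winner = surprising_success(D, [S1, S2], "odd-case")
print(winner.name)                  # → S2 supplied the explanation
print("odd-case" in D.known_cases)  # → True: D's application is extended
```

Note that all three consequences fall out of the one event: D's extension, Si's increased salience, and the new links between the two schemas' descriptions.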
The kind of model growing out of Lawler’s learning analyses involves the following dimensions (based on chapter 5 of Lawler 1985):
The basic situation: genetically cognate schemas embodying knowledge about task domains compete among themselves to provide solutions for any problems presented to them. Over time, the more dependable and efficient among them come to dominate behavior more frequently; practice effects increase their relative dominance. One consequence is that later acquired, more nearly-perfect schemas can become functionally dominant. Such increasing robustness implies that later constructed schemas may even provide guidance in interpreting and reinterpreting problems which are, in the nature of things, more specifically the business of some other context-specific schema. That is, a temporally post-cedent schema can become the “logical parent” of some domain-specific, historically applicable, but less efficient schema. This scenario for local cognitive reorganization has wider implications for how we might think about the large-scale structure of mind.
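The competition just described can be caricatured in a few lines. The schema names, the reliability figures, and the update rule are all illustrative assumptions:

```python
# Illustrative caricature: two genetically cognate schemas compete; the more
# dominant one answers each problem, and practice adjusts its dominance in
# proportion to its reliability. All figures are invented assumptions.

schemas = {
    "COUNT": {"reliability": 0.6, "dominance": 1.0},   # early, concrete schema
    "DECADE": {"reliability": 0.9, "dominance": 0.5},  # later, more nearly-perfect
}

def run(trials):
    for _ in range(trials):
        # the currently dominant schema gets the problem
        name = max(schemas, key=lambda n: schemas[n]["dominance"])
        s = schemas[name]
        # practice effect: net dominance gain only when reliability exceeds an
        # (arbitrary, illustrative) threshold of 0.7
        s["dominance"] += s["reliability"] - 0.7
    return max(schemas, key=lambda n: schemas[n]["dominance"])

print(run(10))  # → DECADE: the later constructed schema becomes dominant
```

Even in this caricature, the initially dominant but less reliable schema loses ground with practice, and the later acquired, more nearly-perfect schema comes to dominate behavior.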
To the process creating a generalized form of the relation just described, wherein some specific single schema comes to serve as a later-acquired logical ancestor to a cluster of related, task-specific schemas, we give the name cluster nucleation. We consider it an important event in cognitive development, since it affects cognitive organization on a scale larger than that of a single cluster of schemas.
– such a nuclear schema is an embodiment of what we recognize as a general and powerful idea
– postulate an image of cognitive organization based on a geographic analogy (as in Minsky, 1975):
* in any modern country, there are regional capitals. These capitals serve as hubs for distribution and collection in their regions. Cluster nuclei serve such a function in this sketch of cognitive organization; they become the capitals of the mind.
* with modern transportation, the communication between such hubs is much more efficient than that between cities on the periphery of the hubs.
* the model of mind, then, is one with a global structure based on the interrelation of nuclear schemas, with a local structure derived from the functional dominance of those nuclear schemas in respect of problems in a cluster of epistemologically related domains.
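The geographic analogy can be made concrete with a toy graph, in which shortest-path cost stands in for efficiency of communication. The hub names and link costs are invented for illustration:

```python
# Illustrative toy graph for the geographic analogy: hubs (nuclear schemas)
# communicate cheaply with one another; peripheral, task-specific schemas
# reach other clusters only through their hub. All names and costs invented.
import heapq

graph = {
    "HUB-A": {"HUB-B": 1, "a1": 3, "a2": 3},
    "HUB-B": {"HUB-A": 1, "b1": 3},
    "a1": {"HUB-A": 3},
    "a2": {"HUB-A": 3},
    "b1": {"HUB-B": 3},
}

def cost(src, dst):
    """Cheapest communication path between two schemas (Dijkstra's algorithm)."""
    dist = {src: 0}
    queue = [(0, src)]
    while queue:
        d, node = heapq.heappop(queue)
        if node == dst:
            return d
        if d > dist.get(node, float("inf")):
            continue
        for nbr, w in graph[node].items():
            nd = d + w
            if nd < dist.get(nbr, float("inf")):
                dist[nbr] = nd
                heapq.heappush(queue, (nd, nbr))
    return float("inf")

print(cost("HUB-A", "HUB-B"))  # → 1: hubs communicate efficiently
print(cost("a1", "b1"))        # → 7: periphery-to-periphery runs through hubs
```

The global structure lives in the cheap inter-hub links; the local structure lives in each hub's dominance over its own periphery.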
At any point of time, the organization of a specific mind reflects not only the particular experiences of the individual, but the extent to which reorganization has occurred as a consequence of the particular problems that have been solved and the ways they have been solved.
The Spiral of Development
We suspect that the processes of recapitulative construction are very important to understanding long term cognitive development, but it is very hard to say anything about the spiral of development that is more than vague speculation. This fact is one reason why we need to do empirical studies which will help us address the issue. Nonetheless, we see in the gradual, long term development of a “sub-articulate level of structure” through experience-based interlinking of schemas, the ground for a possible explanation of why there should appear to be active learning in a given area, then a fallow period, then a later reconstruction of “the same” notions at a different level of sophistication. If the same constructive processes work first on schemas derived fairly directly from experiences and later on “reified” objects created from links of correspondence between more concretely-based schemas, the very time taken to establish those original inter-schema links would explain the fallow period. The differences in sophistication might then also be seen as derived from focusing first on schemas which are descriptions of concrete things and later on other structures which are descriptions of relations between the original schemas. Much work needs to be done before anything more satisfactory can be said.
Robert W. Lawler, Spring 1988
1. What Multi-Schema Modeling and Learning Are Not
In addition to the machine learning and cognitive modeling work reported in the proceedings of the Cognitive Science Society, the European Conference on Artificial Intelligence, and the American Association for Artificial Intelligence, a series of Machine Learning Workshops, begun in 1980 and continued biennially since 1983, has brought to the public an extensive literature that provides a reservoir of techniques for application to the modeling of particular knowledge that is our objective (Michalski, 1983b; Michalski et al., 1983; Mitchell et al., 1986). Multi-schema modeling, because it has different objectives, differs in implementation from other major kinds of modeling currently being explored in the following ways:
– from connectionist models: the learning of these systems is embodied in a network whose interconnections, and consequently whose outputs, are partially determined by “weights” which specify the extent to which one component of the system can influence the performance of another component. Parallel Distributed Processing (Rumelhart and McClelland, 1987) is the premier work of this genre, much of the popularity of the paradigm having been inspired by the hope that the back-propagation algorithm will permit such systems to escape the pitfalls of learning by hill-climbing. Minsky disputes this claim (personal communication; also see Minsky and Papert, 1987). In multi-schema modeling, learning is embodied in the creation of new structures, symbolically represented in programs as functions and data structures.
– from Boltzmann machine models: in this category of connectionist models, escape from the hill-climbing trap, and thus a kind of more creative problem solving, is achieved by permitting random noise to intrude into the network and reset subsets of weights within the system, so that over a long period of time the system may converge to a global optimum state (Fahlman, Hinton, and Sejnowski, 1983). With multi-schema modeling, on the other hand, creativity within the system is based on the interplay of interactivity with the environment and intra-activity, that is, specific kinds of interactions of the multiple schemas among themselves.
– from clustering models: multi-schema modeling does not employ computational clustering methods, such as those employed by Michalski’s (1983) star method.
– from version space models: it is not clear how, technically, to apply the methods developed by Mitchell (1983) for heuristic search in a space of symbolic descriptions to the kinds of domains and problems which are and will be the focus of multi-schema modeling. However, in terms of objectives and Mitchell’s view of learning, his work is quite congenial and will be explored and exploited where possible.
– from debugging models: in learning through debugging (brought to prominence in Sussman, 1972), the trigger for learning was purpose-driven repair of faulty procedures, the failure of a function to achieve its anticipated goal: the difference between the goal state and the achieved state was ascribed to a fault or bug of the active procedure. The inversion of some feature of the goal-difference-generated bug description provided the guidance for constructing a more effective procedure which would not fail on a future attempt. It has become common to describe such systems as learning through expectation failure, an umbrella broad enough to cover multi-schema modeling as well as Sussman’s learning model. But in contrast with that work, and with Schank’s recent proposals as well (during the AI and Education panel at the AAAI meeting in Philadelphia, 1986), in multi-schema modeling the trigger for learning is success, not failure. The guidance for the construction of a new function is not the inversion of a bug description but rather the concrete series of actions which serendipitously achieved a recently abandoned goal. This emphasis on learning through recall or reconstruction of the actions which led to accidental success derives directly from Lawler’s analyses of the empirical material in his two major case study corpora (Lawler, 1981, 1987a). Although to some extent multi-schema modeling differs from other approaches because of its objectives and the researcher’s intellectual stance, the issues are sufficiently hard that ideas from other styles of modeling should be followed whenever they could make a contribution to such an effort.
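The contrast between failure-triggered and success-triggered learning can be suggested schematically. The toy world and all names below are invented for illustration and belong to neither Sussman's nor Lawler's systems:

```python
# Schematic contrast, invented for illustration: learning triggered by
# serendipitous SUCCESS keeps the concrete series of actions that achieved
# a recently abandoned goal, rather than inverting a bug description.

class Learner:
    def __init__(self):
        self.procedures = {}  # goal -> learned action sequence
        self.trace = []       # concrete actions taken so far

    def act(self, action, world):
        self.trace.append(action)
        return action(world)

    def on_surprising_success(self, goal):
        # not inversion of a bug description, but reconstruction of the
        # actions which serendipitously achieved the abandoned goal
        self.procedures[goal] = list(self.trace)

def add1(x): return x + 1
def double(x): return x * 2

learner = Learner()
goal = "reach 4"  # suppose this goal was tried earlier and abandoned
world = 1
world = learner.act(add1, world)    # world becomes 2
world = learner.act(double, world)  # world becomes 4: the abandoned goal
if world == 4:
    learner.on_surprising_success(goal)

# replaying the stored procedure reproduces the success
replay = 1
for action in learner.procedures[goal]:
    replay = action(replay)
print(replay)  # → 4
```

The learned procedure is simply the remembered action series; nothing in it was derived from a description of any failure.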