CASE: A Case Analysis Support Environment


Abstract

The synergy between hypertext tools for organizing large, heterogeneous
databases and functioning models as explanations of processes may permit us to
address a class of problems remaining largely bypassed and undervalued in the
study of human learning. If these “power tools for the mind” permit us better
to manage and model complexity, they may bring within our grasp a series of
problems long considered beyond the reach of well articulated understanding.
One such cluster of problems centers on how cognitive development of the
individual relates to particular interactions. Case study has long been an
effective tool in unravelling the more intricate patterns of human behavior and
development. Its focus on individual behavior leads to the detailed
observations necessary to understand an individual’s performance. As
contrasted with methods which attempt to “hold all else equal,” case study
traces in detail the path of change, and so it is the method of choice in
studying learning. Technology has enhanced the dependability of case study
materials: videotape permits capture of enough of a context to permit later
interpretation in detail. This blessing is a burden in disguise, for there is
no balanced development of techniques of analysis. Computers can facilitate
the analysis of case material to redress that imbalance. One may hope their
use will introduce a new period in the study of intelligence, one where
cognitive scientists will have at last the tools to study the development of
knowledge in its full particularity.

An Example of a Case Study Corpus

Since significant learning arises from processes which are extended in
time, its understanding depends upon a multitude of interactions between what
is in the individual’s mind and the accidents of everyday experience. This
stance has led me to study and record the cognitive development of one of my
daughters from the time she was 18 weeks old through the sixth year of her
life. The targeted theme of this study is the interrelationship, if any,
between the development of language skills and knowledge and that of spatial knowledge.
Every week we have videotaped experiments and our play together; we
supplemented those mechanical records with extensive naturalistic observation.
The total number of tapes comprising the corpus is 240 (each containing,
typically, three experimental sessions). For the first three years, the
experiments divide into sets with two different foci. The first is a
continuing series about Peggy’s developing object knowledge; this material
relates to literature of the Piagetian paradigm and is intended as a
calibrating spine of the study. The second set of experiments is more a
miscellany, each one drawing its inspiration from what my wife or I could
notice as most potentially fecund in the child’s behavior. Some incidents of
the naturalistic observations are striking in themselves, such as the child’s
climbing up to a tea table — when she had not yet walked — and pushing it
across the floor, walking behind it. Other observations were driven by
quasi-regular reflection, and they tend to focus around my theoretical
concerns, such as the interplay of language production and other dimensions of
development.

Using Hypertext to Cope with an Extensive Corpus

The information captured in so rich a medium as videotape is beyond all
hope of transcribing completely in any serial symbolic form, such as text based
protocols. Any theory which initially selects the material to be transcribed
must be a preliminary, imperfect theory — but its selection criteria will
screen out possibly critical information. We can begin, however, with partial
transcriptions and use the file updating capability of computer based storage
to extend the transcribed corpus at need. Call this strategy variable depth
transcription. The researcher records what he imagines as relevant, with such
pointers to source material as to make its deepening at need a matter of
course. As his analysis leads to improved theory, that theory will suggest the
need for deeper analysis of parts of the corpus and their more extended
transcription. The extended database will then suggest enhancements of the
theory. A positive feedback loop is established. Hypertext facilities now
existing and under development permit such an approach. They need to be
applied to two problems: recording important details and their interconnections
in on line databases; and developing functioning models of cognitive structures
and their changes, based on the empirical material of the corpus. These are
the objectives of the CASE project.
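
As a rough illustration of variable depth transcription, the sketch below models a record that starts as a summary with a pointer back to its source tape and gains further layers of transcription as the theory calls for them. It is not drawn from the CASE implementation: the class names, the fields, and the tape offsets are all hypothetical, chosen only to show the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SourcePointer:
    """Where the untranscribed source material lives (hypothetical fields)."""
    tape_label: str      # a tape label such as "VT127.P018"
    start_seconds: int   # illustrative offset of the segment on the tape
    end_seconds: int


@dataclass
class Transcript:
    """A record whose transcription can be deepened over time."""
    source: SourcePointer
    summary: str
    details: List[str] = field(default_factory=list)  # one entry per deepening pass

    @property
    def depth(self) -> int:
        # Depth 0 means only the summary has been written so far.
        return len(self.details)

    def deepen(self, text: str) -> None:
        """Add a further layer of transcription prompted by the current theory."""
        self.details.append(text)


record = Transcript(
    source=SourcePointer("VT127.P018", 0, 300),
    summary="Subject with her mother, then with experimental objects.",
)
record.deepen("Closer transcription of the object-play segment.")
print(record.depth)  # 1
```

Because each record keeps its pointer to the source, a later pass can return to the same stretch of tape and add detail without disturbing what was already transcribed.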

Figure 1: CASE, a Case Analysis Support Environment
Data Files:

    Index: A bi-directional index of the contents of the text files; this is a
    list of themes and purposes, on the one hand, and a list of activities and
    results on the other.
    Text: By episodes with scenes: for scenes, a summary description; for
    episodes, a summary description with variable depth transcription of action and
    dialog.
    Examples: Related behaviors: for each discriminated behavior, the file
    will contain an archetypical example with its “link-set”, that is, pointers to all
    other located relevant occurrences (in the literature as well as in the
    specific study); the purpose is to provide a minimally abstract description of the
    exemplified behaviors.
    Models: Speculative, ascribed functional schemata: these are to be the
    minimal models necessary to function over the set of related examples; the file
    will indicate model sequences and correspondences with other models in the
    file.
    Theories: Theories of model development: the possibilities this file
    presents are: describing the minimal changes necessary to cover examples of
    significant development; holding alternate theories relating to the corpus
    in different states of development; further, to the extent
    that the models and theories can be made functional, it will be possible to
    engage in regression testing of theory changes by applying them over the set of
    models and examples.
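
The bi-directional Index file of Figure 1 can be pictured as a pair of lookup tables kept in step. The following is a minimal sketch under that reading, not the project's own code; the class name and the assignment of themes to scene labels are assumptions made for illustration.

```python
from collections import defaultdict


class BidirectionalIndex:
    """Maps themes to activities and activities back to themes."""

    def __init__(self):
        self._by_theme = defaultdict(set)
        self._by_activity = defaultdict(set)

    def relate(self, theme, activity):
        # Record the relation in both directions so either side can be queried.
        self._by_theme[theme].add(activity)
        self._by_activity[activity].add(theme)

    def activities_for(self, theme):
        return sorted(self._by_theme.get(theme, ()))

    def themes_for(self, activity):
        return sorted(self._by_activity.get(activity, ()))


index = BidirectionalIndex()
# Theme assignments below are invented for the example.
index.relate("object knowledge", "T127.Objects")
index.relate("language", "T127.Gretchen")
index.relate("social", "T127.Gretchen")

print(index.themes_for("T127.Gretchen"))        # ['language', 'social']
print(index.activities_for("object knowledge"))  # ['T127.Objects']
```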

Progress to date with the CASE project has been extensive but limited in kind.
The original objective of this research was to explore the use of hypertext
systems as a tool for advancing the analysis and modelling of detailed case
studies. The conceptual focus was two-fold, on developing a database and
psychological analyses, and on exploring the utility of hypertext for tasks
involving the administration of complex bodies of information and even the
development of functional models and their interconnection with that information. Such remain
the long term objectives of the research. In practice, to date the effort has
focussed on establishing the overall structure into which the case material
will be fit over time. Significant segments of the corpus of naturalistic
observations have been entered into the online database. We have followed this
procedure:

  1. information in the case corpus is brought on-line as ASCII text files.

  2. those files are imported into the Notecards environment, where they are broken
    up into small text records (stored on individual cards).

  3. an administrative structure is imposed on those records by storing them in
    hierarchically related systems of fileboxes.

  4. thematic structures are imposed on those records by relating records in the
    notecards to one another with links whose type varies across a spectrum of
    issues.

  5. the result is a networked database of records which the analyst can
    navigate and modify as his questions and knowledge change.
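
The procedure above can be sketched in a few lines. This is only an illustration of the five steps, not the Notecards implementation: the `Card` and `FileBox` classes, the blank-line record separator, and the sample text are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Card:
    title: str
    text: str


@dataclass
class FileBox:
    name: str
    cards: List[Card] = field(default_factory=list)
    subboxes: Dict[str, "FileBox"] = field(default_factory=dict)

    def subbox(self, name: str) -> "FileBox":
        # Create the child box on first use so the hierarchy grows on demand.
        return self.subboxes.setdefault(name, FileBox(name))


def split_into_cards(ascii_text: str) -> List[Card]:
    """Step 2: break a text file into small records, one per card.

    Assumes (hypothetically) that records are separated by blank lines and
    that the first line of each record is its title.
    """
    cards = []
    for block in ascii_text.strip().split("\n\n"):
        title, _, body = block.partition("\n")
        cards.append(Card(title.strip(), body.strip()))
    return cards


# Step 1: the corpus arrives as plain text (contents invented for illustration).
raw = "VN054\nNotes on motor development.\n\nVN055\nNotes on play with objects."

# Steps 2 and 3: cut the text into cards and file them in a filebox hierarchy.
root = FileBox("CASE")
vignettes = root.subbox("Vignettes")
vignettes.cards.extend(split_into_cards(raw))

# Step 4: typed links relate one card to another, stored as (source, type, target).
links: List[Tuple[str, str, str]] = [("VN054", "motor", "VN055")]


# Step 5: the analyst navigates by following links of a chosen type.
def linked_from(title: str, link_type: str) -> List[str]:
    return [dst for src, kind, dst in links if src == title and kind == link_type]


print([card.title for card in vignettes.cards])  # ['VN054', 'VN055']
print(linked_from("VN054", "motor"))             # ['VN055']
```

Keeping the administrative structure (fileboxes) separate from the thematic structure (links) means a card can sit in one place in the hierarchy while taking part in any number of thematic relations.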

A First Implementation

The current Dandelion database, developed at the Army Research Institute
for the Behavioral and Social Sciences (ARI) in the first six months of the
project, occupies more than 3400 pages of the Xerox hard disk storage (this is
approximately 1,700,000 characters). Figure 2 shows a top level view of that
database. The central structure of the database derives from three indices, or
main categories of data. Videotapes represents a catalog of the videotape
corpus. Vignettes is a catalog of notes and short stories based on
naturalistic observation. Citations is a reference list of the books that I
have read or might read that I think should be relevant to analysis of the
corpus.

Bushy Trees

The vignettes catalog in Figure 2 is a list of themes in the vignettes
of naturalistic observations in the database. This text entered the database
as an ASCII file. The vignette database itself is a filebox of notecards, each
of which contains text manipulable by a WYSIWYG editor. Each vignette card is
created with text selected from the vignette catalog, cut, and pasted into a
notecard. As needed, text of individual vignettes has been transcribed from
the manuscript to a notecard and inserted in the vignettes filebox. The
structure of the file is shallow and broad (720 notecards). The file is
logically sequential, ordered by serial date from the day of the subject’s
birth. The sequence is explicit in the notecard labels though not in the
physical organization of the database. For example, VN054 contains notes of
motor development observed on the 54th day of the subject’s life. Themes and
issues that relate one vignette to another are represented by typed links
threading the vignettes along a string of logical interconnection. The primary
link types represent categories of infant development (motor, perceptual,
cognitive, social) and study focal themes (language, physical objects,
methodology). Since any protocol may contain information relevant to several
themes, the threads interweave through the collection of protocols in a
complex but comprehensible fashion.
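
One way to picture the interweaving threads is to group vignette labels by link type, keeping each group in serial-date order. The sketch below does this for a handful of labels; apart from VN054 and its motor theme, the labels and theme assignments are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict

# Vignette label -> link types that apply to it (mostly invented assignments).
vignette_themes = {
    "VN054": ["motor"],
    "VN061": ["motor", "physical objects"],
    "VN070": ["language", "social"],
    "VN072": ["physical objects", "language"],
}


def build_threads(themes_by_card):
    """Group card labels by link type, each thread in serial-date order."""
    threads = defaultdict(list)
    # Labels share a prefix and a zero-padded day number, so sorting the
    # labels as strings also sorts them by day.
    for label in sorted(themes_by_card):
        for theme in themes_by_card[label]:
            threads[theme].append(label)
    return dict(threads)


threads = build_threads(vignette_themes)
print(threads["physical objects"])  # ['VN061', 'VN072']
print(threads["language"])          # ['VN070', 'VN072']
```

Here VN072 lies on two threads at once, which is the sense in which the threads interweave through the collection.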

Each sub-filebox in the videotape database of Figure 2 represents a single
physical tape. Each videotape is divided into scenes (another subfilebox)
named with a label of the form Tnnn.keyword, where nnn is the serial date in
the subject’s life and the keyword names either the other persons in the scene
(a parent, sibling, or pet is typical) or the experimental materials used by
the subject. Each scene is, as appropriate and as needed, further analyzed
into thematically defined episodes which contain in turn sequences of actions,
speech, and commentary by experimenters. For example, videotape “VT127.P018”
in Figure 2 names a physical tape made on the 127th day of Peggy’s life (in her
eighteenth week).

Figure 2: Overview of the Notecards CASE Files Structure

The subboxes “T127.Gretchen” and “T127.Objects”
specify scenes in which the subject was with her mother first and then with a
specific collection of experimental objects. This database is also shallow and
bushy, containing about 750 fileboxes representing scenes (each subdivided into
episodes as well). The labelling is designed so that a simple Lisp
function can sort vignette and videotape card titles within a specific date
range and order the material to support the correlation and interlacing of
events noted in the naturalistic observation and recorded in the videotapes.
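
The original sorting function was written in Lisp and is not reproduced in this paper. The following Python sketch shows one way such a function might work, assuming only the two label forms described above (“VNnnn” for vignettes and “Tnnn.keyword” for scenes); the function names and the sample titles other than VN054, T127.Gretchen, and T127.Objects are hypothetical.

```python
import re

# "VNnnn" names a vignette and "Tnnn.keyword" names a videotape scene,
# where nnn is the serial day of the subject's life.
LABEL = re.compile(r"^(VN|T)(\d+)(?:\.(.+))?$")


def serial_day(title):
    """Return the day number encoded in a card title, or None if it has none."""
    match = LABEL.match(title)
    return int(match.group(2)) if match else None


def titles_in_range(titles, first_day, last_day):
    """Select titles dated within [first_day, last_day], ordered by day.

    sorted() is stable, so vignettes and scenes from the same day keep
    the relative order in which they were supplied.
    """
    dated = [(serial_day(t), t) for t in titles]
    chosen = [(day, t) for day, t in dated
              if day is not None and first_day <= day <= last_day]
    return [t for day, t in sorted(chosen, key=lambda pair: pair[0])]


titles = ["T127.Objects", "VN054", "T127.Gretchen", "VN127", "VN130", "Citations"]
print(titles_in_range(titles, 100, 128))
# ['T127.Objects', 'T127.Gretchen', 'VN127']
```

Titles that carry no serial date, such as an index card, are simply left out of the result rather than raising an error.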

Progress and Limitations

The initial efforts in the first six months of this project were
directed primarily towards familiarization with the system, database design,
and the beginnings of database construction. Database construction took place
in the Smart Technologies Group of the Army Research Institute for the
Behavioral and Social Sciences, Washington, DC. The protocol material for the
database was keyed online at a remote location as ASCII files (still
available in this form), then mailed over the ARPANET to the laboratory at which they
were integrated into the database by cutting and pasting text strings into
notecards. At this point, I have available a structure with which I can begin
the analysis and model building the corpus demands for its serious scientific
exploration. Given that the database I’m constructing is very large and
detailed, it should be no surprise that progress is slow, especially now that
the effort has turned toward analysis of videotaped experiments. A beginning
has been made in the analysis of videotape materials, but only at the top level
of observation. The current phase may best be described as corpus
administration. It is becoming clear that the effort will go forward in three
waves which, although they will overlap, will follow this natural sequence:
corpus administration, corpus exploration, theory construction. The primary
feedback loop ultimately will range between theory construction and corpus
exploration, but before that can begin there must be a critical mass of
material under review and at least partially online. Achieving that critical
mass is the heart of the current effort. Since the psychological analysis
needed to construct decent models of material in the Peggy corpus will take
years, it should be no surprise that I have a need to work with other models
now. Over several years, I worked with Oliver Selfridge to develop simple
models of interactive learning and implemented them in Zetalisp on a Symbolics
3600. For a variety of reasons, I decided to convert those models to run in
Object Logo on a Macintosh computer. I am now attempting to connect those
models to their corpus in a project parallel to the CASE project effort.

The Psychology of the Particular

Many social scientists stand in awe of general theories. They typically
seek an abstract correspondence which will generally permit predictions that
will cover many of the specific events that interest them. For me, the primary
value of a general theory is more down to earth, more like what an engineer
needs; it is the aid a theory offers in understanding and solving particular
problems, such as what enabled a specific person to learn some particular
knowledge in a given context. Why are case studies focussed on a single person
worth paying attention to? I believe these methods and objectives will help
us approach a new way of doing psychology.

Kurt Lewin argued (1935) that psychology is now an Aristotelian science
and will become a modern or Galilean science only when researchers
shift their focus from finding cross classificatory correspondences to
developing explicit explanations for series of events in concrete cases. In
short, human psychology will become a science only when it begins solving
problems in concrete cases, as one does in reading computer memory dumps or
exploring machine learning. Lewin’s specific proposals failed to engender such
a transformation (see chapter 2 in Langer, 1967), yet there remains the sense
that his attempt was profoundly right — to move studies of mind from seeking
correspondences to solving important problems in very specific and concrete
cases.

The New Opportunity

If we can construct what Lewin refers to as “the pure case” (a corpus
with a sufficiency of information to explain adequately all questions on which
it might bear) and extend the modelling successes of function-oriented
psychology, this should impact both theory formation and how one teaches
psychology. The CASE project is one experiment in this spirit. We are trying
to:

    – capture a detailed body of information
    – convert that corpus to an on-line database via variable depth transcription
    – link related events and model development within the corpus
    – offer that linked database, and access to the corpus materials, to colleagues
    for scrutiny and further development, in order to enhance:

    * development of alternative theories

    * application of our own theories to other cases/corpora.

This method will also enhance the acceptability of the case study method by
discriminating between the idiographic focus of the content of case studies and
idiosyncratic interpretations of such studies. Such facilities will provide a
kind of experimental workbench for students where they may undertake, as it
were, a kind of apprenticeship in case study analysis under the tutelage of the
case database developer.

Some may want to argue that such efforts are not scientific in the sense of
permitting replicable experiments in other circumstances, but the effort is
scientific in Peirce’s broader sense — an attempt to approach some imperfectly
understood but well defined reality through seeking the convergence of opinion
based on serious and extended inquiry. That is enough for me.

There is no magic in either cognitive modelling or the use of on-line tools for
managing data, but their synergy will permit us to address and solve some
long-standing, important problems in cognitive psychology. It is the problems
which give the tools their importance. It is the new tools which give us some
hope of coping with the problems by sharing our information, analyses, and
ideas.

REFERENCES

K. Lewin. Aristotelian and Galilean Modes of Thought in Contemporary
Psychology. In A Dynamic Theory of Personality: Selected Papers of Kurt
Lewin, McGraw Hill, 1935.

S. Langer. Idols of the Laboratory. Chapter 2 in Mind: An Essay on Human
Feeling (Vol. 1). Johns Hopkins Press, 1967.

C. S. Peirce. The Fixation of Belief. In Chance, Love, and Logic (M. R.
Cohen, ed.). Harcourt, Brace, and Co. 1923. Lessons from the History of
Science. In Essays in the Philosophy of Science (Vincent Tomas, Ed.). Liberal
Arts Press, 1957.

ACKNOWLEDGEMENTS

This work was undertaken through a senior research residency grant from
the National Research Council at the Army Research Institute for the Behavioral
and Social Sciences. My colleague, Joseph Psotka, director of the Smart
Technologies Project at ARI, was extraordinarily supportive and helpful. His
suggestions have made this work much better than it could ever have been
otherwise. Purdue has subsequently provided me a place and time to carry on this line of work.

Publication notes:

  • Written in 1981. Unpublished.
  • Published as an Information submission to the First National Hypertext Conference,
    Chapel Hill, 1987.
  • Republished in this extended form in Hypertext: The State of the Art, McAleese
    and Green, Intellect, 1990.