TECHNICAL REPORT
Computer Aided Education and Training Initiative

CDRL __________

CONTRACT N66001-95-C-8621

Date of Report

12 January 1998


SUBMITTED TO

Receiving Officer
e-mail address: "caetinrad@nosc.mil"

Rich Laverty       Frank Schindler     Bob Medearis
619-553-2918       619-553-2845        619-553-6377
laverty@nosc.mil   fschindl@nosc.mil   medearis@nosc.mil

SUBMITTED BY

Daniel D. Suthers
Learning Research and Development Center
University of Pittsburgh
Pittsburgh, PA 15260

Phone: 412-624-7036
Fax: 412-624-9149
email: suthers+@pitt.edu


Do not distribute to DTIC or other data depositories.
Distribution authorized to DOD components only; premature dissemination (date). Other requests shall be referred to Naval Command, Control and Ocean Surveillance Center (NCCOSC), RDT&E Division, San Diego, California 92152-5000.


1. Preliminary Information

1.1 Report Title

Collaboration, Apprenticeship, and Critical Discussion: Groupware for Learning


1.2 Notices

As stated in the proposal, the University provides to the Government restricted data rights, including a fully paid non-exclusive royalty-free license for use or distribution of software products produced by this project, with the following limitations:
  1. The University does not provide any licenses to the Government for commercial, off-the-shelf software that has been used as part of the overall effort.
  2. The University reserves the right to develop other products and own them, even if they use ideas implemented in the delivered software.


1.3 Abstract

The project designed, implemented, and deployed a sophisticated client-server software system for supporting collaborative critical inquiry with a diagrammatic "evidence mapping" tool, computer coaching, and a "chat" tool. Working with teachers and CAETI partners, we developed realistic learning scenarios that embedded these tools in the DoDEA 9th grade science curriculum.


1.4 Table of Contents; List of Tables, Figures and Illustrations

1.4.1 Contents

    TECHNICAL REPORT
    1. Preliminary Information
      1.1 Report Title
      1.2 Notices
      1.3 Abstract
      1.4 Table of Contents; List of Tables, Figures and Illustrations
        1.4.1 Contents
        1.4.2 Tables and Figures
      1.5 List of Symbols, Abbreviations, and Acronyms
      1.6 Preface and Acknowledgements
    2. Body of the Report
      2.1 Summary
      2.2 Introduction
      2.3 Methods, Assumptions, and Procedures
        2.3.1 Overview of Belvedere
          2.3.1.1 Cognitive Support
          2.3.1.2 Collaborative Support
          2.3.1.3 Evaluative Support
          2.3.1.4 Other Features
        2.3.2 Architecture
          2.3.2.1 Concepts of Application
            2.3.2.1.1 Critical Inquiry in Science
            2.3.2.1.2 Generality
          2.3.2.2 Interface Presentation
            2.3.2.2.1 A Graphical Interface for Critical Inquiry
            2.3.2.2.2 Relations to Concepts of Applications
            2.3.2.2.3 Comments on the Analysis
          2.3.2.3 Concepts of Operations
            2.3.2.3.1 Supporting Collaborative Coached Critical Inquiry
            2.3.2.3.2 Relations to Other Levels
            2.3.2.3.3 Interoperability and Reusability
          2.3.2.4 Abstract Implementation
            2.3.2.4.1 Relations to Concepts of Operations
            2.3.2.4.2 Interoperability and Reusability Issues
          2.3.2.5 Resource Layer
          2.3.2.6 Architectural Lessons
        2.3.3 Design of Computer Coaches
          2.3.3.1 Introduction
          2.3.3.2 Pedagogical Constraints on Advice
          2.3.3.3 Syntactic Advice Strategies
          2.3.3.4 Consistency-Based Advice Strategies
            2.3.3.4.1 Consistency and Transitivity
            2.3.3.4.2 Formative Experiments
            2.3.3.4.3 "Snippets"
            2.3.3.4.4 Implemented Expert Coach
            2.3.3.4.5 Reimplementation in Java
          2.3.3.5 Coaching Status and Future Directions
        2.3.4 Design of Classroom Implementation
          2.3.4.1 Classroom Environment
          2.3.4.2 Teacher's Role
          2.3.4.3 Curriculum Materials
          2.3.4.4 Student Activities
          2.3.4.5 Supporting Peer Evaluation with Performance-based Assessment
        2.3.5 Collaborative Activities Funded
          2.3.5.1 Belz and Luckham: Architecture Abstraction Hierarchy Reference Model
          2.3.5.2 Forbus and Koedinger: Integrated Feasibility Demonstration of Science Learning Spaces
            2.3.5.2.1 Demonstration Architecture
            2.3.5.2.2 Lessons Learned from the Integration Demonstration
          2.3.5.3 Workshop: Architectures and Methods for Designing Cost-Effective and Reusable ITS
      2.4 Results
        2.4.1 Highlights of Project
          2.4.1.1 Cognitive/Education Highlights
          2.4.1.2 Technology Highlights
          2.4.1.3 Implementation Status and Platforms
            2.4.1.3.1 Belvedere
            2.4.1.3.2 Collaborative Database Server and Connection Manager
            2.4.1.3.3 Coach ("Idea Generator")
          2.4.1.4 Maturity and Deployability
          2.4.1.5 Openness and Interoperability
            2.4.1.5.1 Points of Interoperability:
            2.4.1.5.2 Limitations on Interoperability:
          2.4.1.6 (Re)Applicability
          2.4.1.7 Venues and Collaborations
          2.4.1.8 Future Research
            2.4.1.8.1 Technology Research
            2.4.1.8.2 Cognitive/Education Research
        2.4.2 Results of Independent Evaluations
          2.4.2.1 Evaluation Methodology
          2.4.2.2 Summary of Evaluation Results
      2.5 Conclusions
      2.6 Recommendations
        2.6.1 Follow-on Work for Belvedere
          2.6.1.1 Port Server to NT
          2.6.1.2 Refine and Extend the Belvedere Interface as Guided by User Feedback
          2.6.1.3 Future Work for Coaching
          2.6.1.4 Cognitively Motivated Design of Challenge Activities and Materials
          2.6.1.5 Realistic Classroom Implementation
          2.6.1.6 Assessment Rubrics
          2.6.1.7 Commercial Deployment
        2.6.2 Additional Research
          2.6.2.1 Designing open, interoperable educational software.
          2.6.2.2 Representational devices as "epistemic forms" for collaborative learning.
          2.6.2.3 Software participation in reflective learning interactions.
          2.6.2.4 Examining the cost-benefit tradeoff between knowledge engineering and coaching functionality.
          2.6.2.5 Designing hypermedia structures to scaffold critical inquiry skills.
          2.6.2.6 Scaffolding learning from simulations and visualizations.
          2.6.2.7 Comparing scientists' and learners' inquiry skills.
          2.6.2.8 Realistic school implementation of advanced educational technology.
    3. End Matter
      3.1 Appendices
      3.2 Bibliography


1.4.2 Tables and Figures

TABLE 1. EXAMPLES OF SYNTACTIC ADVICE PATTERNS.

FIGURE 1. BELVEDERE INQUIRY DIAGRAM AND ADVICE

FIGURE 2. THE BELVEDERE INTERFACE

FIGURE 3: ABSTRACT IMPLEMENTATION LAYER

FIGURE 4. EXAMPLE ADVICE CONSISTENCY PATH.

FIGURE 5. "SNIPPETS" MAY BE REFERENCED BY CLICKING ON DOCUMENT ICONS.

FIGURE 6. WEB-BASED MATERIALS FOR CHALLENGE PROBLEM

FIGURE 7. SAMPLE ASSESSMENT RUBRICS

FIGURE 8. ARCHITECTURE REFERENCE MODEL.

FIGURE 9. SCIENCE LEARNING SPACE DEMONSTRATION.

FIGURE 10. LEARNING SPACE DEMONSTRATION ARCHITECTURE.


1.5 List of Symbols, Abbreviations, and Acronyms

AAAS      American Association for the Advancement of Science
CAETI     Computer Aided Education and Training Initiative
CGI       Common Gateway Interface
CSCL      Computer Supported Collaborative Learning
DoD       Department of Defense
DoDDS     Department of Defense Dependent Schools
DoDEA     Department of Defense Education Activity
HTML      HyperText Markup Language
HTTP      HyperText Transfer Protocol
ITS       Intelligent Tutoring System
LRDC      Learning Research and Development Center
MOO       MUD, Object Oriented (see MUD)
MUD       Multi-User Domain
NSES      National Science Education Standards
SGML      Standard Generalized Markup Language
URL       Uniform Resource Locator
WYSIWIS   What you see is what I see
WWW       World Wide Web


1.6 Preface and Acknowledgements

This report describes work funded by the Advanced Research Projects Agency's Computer Aided Education and Training Initiative, under the title "Collaboration, Apprenticeship, and Critical Discussion: Groupware for Learning," Contract N66001-95-C-8621. Prior to CAETI, foundational work was supported by the National Science Foundation's Applications of Advanced Technology program under the title "Tools for Thinking About Complex Issues in Science and Public Policy," Grant MDR-9155715. Subsequent to CAETI, the project has been funded by the DoD Presidential Technology Initiative under the title "Scaling up Implementation of Collaborative Critical Inquiry in the DoDDS Curriculum."

Project PIs are indebted to Kirstie Bellman for her visionary leadership, to Neil Jacobstein for his facilitation as cluster leader, to Gary Bridgewater for the many ways in which he coordinated our work, to Bill Bewley, Sue Latzko, and their colleagues at ISX for coordination of site visits, to Lila Cheville for helping us coordinate with the DoDEA curriculum, to Lynne Gilfilian for evaluation, and to Frank Belz for useful discussions about architecture. The Belvedere client was programmed by Kim Harrigal and Dan Jones; the collaborative database server by Dan Jones; and automated coaching by Dan Suthers and Joe Toth. Science materials were prepared by Eva Toth and Arlene Weiner. On-line help and user interface testing were performed by John Connelly. Sandy Katz contributed to evaluation planning. Dan Suthers was technical lead and project manager.


2. Body of the Report

2.1 Summary

Under CAETI funding, the Advanced Cognitive Tools for Learning project designed, implemented and delivered one of the most ambitious and comprehensive packages of the CAETI program [Suthers & Jones, 1997; Toth, Suthers & Weiner, 1997]. This package begins with a sophisticated client-server software system for supporting collaborative critical inquiry. This software, known as "Belvedere," focuses and prompts students' cognitive activity by giving them a graphical language to express the steps of hypothesizing, data-gathering, and weighing of information. It provides apprenticeship in science by suggesting next possible steps, and by cognitively motivated structuring of materials and activities. It supports collaborative learning through the shareability of diagrams by students in same-time same-place, same-time distant or asynchronous modes, as well as through text-based "chat" windows. Belvedere is based on a client-server architecture that can deliver advanced educational technology on a variety of platforms, requiring only that user machines run Java and have a few standard tools such as a Web-browser, a word processor and a spreadsheet.

However, we went much further than handing raw technology to the schools: we also provided comprehensive support for implementation and integration in the classroom and curriculum. Situated in a well-known institute for the study of school learning, we are sensitive to the many demands and opportunities that school reform places on teachers and schools. Already experienced with classrooms, we listened carefully to DoDEA curriculum supervisors and teachers, developed curriculum materials keyed to the DoDDS curriculum standards and objectives, and modified materials and software in the development process. Our "science challenge" activities are designed to match and enrich the DoDDS curriculum, and are based on standards such as the National Research Council's NSES, LRDC's New Standards, and the AAAS Benchmarks for Science Literacy. We also provided students and teachers with assessment rubrics which serve to scaffold student activities and guide peer review, as well as help the teacher assess nontraditional learning objectives.


2.2 Introduction

Our fundamental belief is that learning is produced by extended engagement of students in complex cognitive activities, involving peers, experts, teachers, or intelligent systems as partners, which offer means for (1) generating new ideas, (2) reflecting on ideas and recent cognitive activity, (3) accessing useful information, (4) motivating participation, (5) scaffolding for performances somewhat beyond their competence, and (6) facilitation of a long-term agenda for learning. We are at a watershed at which a number of approaches that previously were beyond real world use are now feasible, but each approach seems to lack some of the six basic affordances just listed. Below we consider four major forms of learning systems, to motivate our own technical approach.

An important form is the intelligent tutoring system (ITS). Quite a number have been built, at least in prototype form. Most have emphasized student modeling and providing feedback in the form of statements about knowledge deemed to be missing from the student. Several labs, notably John Anderson's, have pursued a scheme called model tracing, in which an intelligent critic intervenes whenever the student deviates from an "expert" approach in solving a set problem [Anderson & Pelletier, 1991; Anderson et al., 1995]. A different approach is associated with the work in Roger Schank's lab, in which cases are indexed according to situational properties of impasses or breakdowns that occurred within them. When a case becomes relevant because it matches a current realized or potential breakdown, the system offers to tell the student a "story" that might be helpful. Finally, our coached apprenticeship scheme provides all six affordances, but, at the outset of this project, only for technical jobs and only in purely computer-human interactions, rather than collaborative interactions of students with each other and with teachers. Consequently, we set out to adapt our approach to the school curriculum. Consistent with research that shows that students learn better when they actively pursue understanding rather than passively receiving knowledge [Chi et al., 1989; Resnick & Chi, 1988; Webb, 1989], the classroom teacher is now being urged to become a "guide on the side" rather than "the sage on the stage." New roles have also been recommended for tutoring systems that parallel the teacher's new role in "decentered" classrooms [Chan & Baskin, 1988; Clancey, 1992; Roschelle, 1994]. Hence our work was directed towards tutoring systems that augment the learning processes of students engaged in collaborative critical inquiry [O'Neill & Gomez, 1994; Slavin, 1990].

A second form is the exploratory environment, such as a simulation of friction-free movement of objects. Exploratory environments provide many opportunities that might stimulate learning-producing activity among student peers and with teachers, but such systems characteristically lack any kind of advisor to explain, coach, and scaffold learning. They also tend to lack any structure for facilitating an explicit learning agenda. Still, we borrow from the exploratory microworlds form the notion of having kits that students can use to configure environments from which they might learn something.

A third form is systems to support authoring and, more generally, communications. Students routinely use word processing in schools now, for example, and modest beginnings of groupware for learning can be found. In the most innovative experiments, students author multimedia "publications," an approach we will be promoting as well. A variety of low-bandwidth interaction forums, MUDs, for example, are also being used in experimental learning research efforts and by some innovative teachers. However, the tools for developing substantial interactions and for publishing substantial bodies of work are just beginning to be developed to support learning activities. We planned to focus on a piece of this problem: tools to support (scaffold) the expansion from simple, narrative communications to communications of arguments, especially comparisons of alternative theories and debates over alternative public policies. In such dialectical activities, communications are more complex, there is more need for diagrammatic tools and other intelligent support for students, and there is great need to quickly recast subsets of discussion in different forms. For this reason, we saw the development of authoring, advising, and interacting capabilities for diagrammatic representations of arguments to be of central importance to stimulating a new level of high-order thinking and interaction over networks.

A fourth form is technology to support collaborative learning. There are at least three motivations driving the growing interest in this approach. The first is a practical one: having more than one student use instructional software is a cost saver. Also, there are seldom enough computer workstations for every student. The second (and perhaps primary) motivation is a response to the abundant research showing that collaborative learning correlates with a wide range of positive outcomes. Within non-computer-based classroom and laboratory settings, collaborative learning has been shown to correlate with greater learning, increased productivity, more time on task, transfer of knowledge to related tasks, higher motivation, and heightened sense of competence [Johnson & Johnson, 1989; Rysavy & Sales, 1991; Sharan, 1980; Slavin, 1990]. Similarly, Webb's [1987] review of the research on collaborative use of instructional software suggests that it is at least as effective as individual use, and is sometimes superior [e.g., Johnson et al., 1985; Justen et al., 1990]. However, there is also widely recognized room for improvement. Collaborative learning does not work for all learners, and the results of instructional outcome studies are mixed [Klein & Pridemore, in press; Webb, 1987]. Fruitful student interactions are simply not a given. We cannot expect learning gains to happen just because students are sitting together, at a desk or at a computer workstation. In Brown and Palincsar's [1989] words: "Social interactions do not always create new learning; peer interactions vary enormously; only some teaching environments actually create ideal learning experiences." (p. 397) Thus, the third motivation to develop CSCL systems is to improve the effectiveness of collaborative learning as an instructional format: i.e., to support peer interactions so as to increase learning gains.

These cognitive and social barriers to fruitful collaborative interactions raise an important challenge for teachers and developers of CSCL environments: to identify the features of collaborative learning situations that potentiate learning and to build learning environments which contain these features. At the outset of this work, we had been addressing this objective in two projects by designing and implementing tools which support question-asking, explanation, and critical discussion - i.e., the "knowledge articulation" activities which underlie social learning - particularly for collaboration across networked machines. In the Belvedere project, our goal is to support students learning to engage in critical discussion of competing scientific theories. This work has resulted in a graphical argumentation environment in which students articulate and compare alternate theories and their associated arguments, and change them in response to new evidence or criticism [Suthers et al., 1995; Suthers & Weiner, 1995]. The Sherlock troubleshooting environment for a complex electronics device contains tools to support peer critique of student solution traces [Lesgold et al., 1992; Katz et al., 1993; Katz & Lesgold, 1994]. Students can step through a peer's solution and select, from a menu of "troubleshooting standards," those standards that the peer did not follow at a particular step of the solution. The software also has the capability to attach a menu of questions to domain objects, such as components in circuit diagrams.

Several system developers have been building instructional software which is specially tailored for collaborative use - i.e., CSCL systems [McManus & Aiken, 1993; Newman, 1991; Scardamalia et al., 1992; Whitelock et al., 1991]. However, none of these systems, to our knowledge, were being built upon a foundation of empirical research on collaborative cognition in the target domain of instruction. We believe that it is critical to ground the development of CSCL systems in research which addresses issues such as the following: What types of knowledge do students seek from their peers while collaborating on problem-solving activities? Are there any patterns in the types of knowledge that students can and cannot readily explain to their peers? To what extent do students ask the right questions at the right time; i.e., do they know what they need to know in order to overcome an impasse during problem solving? What is the nature of human tutors' support for students, during collaborative problem solving and individual or collaborative critique of peer solutions? For example, what is the content and structure of human tutors' explanations for particular types of questions? Is human coaching primarily directive or question-driven? To address these questions, we had been working with small groups of students engaged in analysis of scientific debates in the Belvedere environment. We had also begun to carry out observational studies of students working together on Sherlock problems, and critiquing traces of peer solutions, with system coaching suppressed but a human tutor available to answer questions when they are stuck (i.e., unable to help each other). This work prepared us to inform the development of a computer model of support for peer collaboration during particular types of activities (e.g., peer critiquing, collaborative problem solving).

A major motivation was the concern that knowledge-based educational systems, such as intelligent tutoring systems, have historically been large, self-contained programs with specialized platform requirements. We saw that to make these technologies viable, we must be able to add component functionality incrementally, and enable systems to interoperate with commercial software and internet resources [Brusilovsky et al., 1996; Ritter & Koedinger, 1995; Roschelle & Kaput, 1995]. We knew that to reduce the cost of materials prepared by developers, and to enable greater collaboration between users, representations of educational materials should be shareable between diverse applications across the internet. Interoperability and reuse considerations suggested a "lowest common denominator" approach, yet we did not want to limit support for more advanced functionality such as domain-specific coaching. We also saw a great need to leverage productive software engineering technology, standard off-the-shelf commodities, and standard information packaging protocols in order to make educational technology more affordable. We had refined our software engineering capabilities toward this end, and planned to leverage SGML and especially its World-Wide Web expression in our work.

Thus, we started in a good position to develop and evaluate intelligent network-based technology for supporting critical discussion, problem solving, and use of on-line information resources, within both declarative work environments, such as texts and arguments, and procedural resources, such as simulations and problem-solving task environments.


2.3 Methods, Assumptions, and Procedures

In the following sections we begin by giving an overview of the software we developed, from the point of view of the interface, overall behavior, and design motivations for these. Then we describe the underlying architecture, focus more specifically on coaching functionality, and conclude with a discussion of the substantial supporting materials that we developed in addition to the software.

2.3.1 Overview of Belvedere

The "Belvedere" software is a complete redesign and reimplementation of a system of the same name, previously reported in [Suthers, et al. 1995; Suthers & Weiner, 1995]. Belvedere's core functionality is a shared workspace for constructing "inquiry diagrams," which relate data and hypotheses by evidential relations (consistency and inconsistency). The implemented system also includes groupware and associated tools that support students engaged in critical inquiry processes, such as investigating a scientific problem:

Figure 1. Belvedere Inquiry Diagram and Advice

The diagramming window is shown in Figure 1, with a student-generated "inquiry diagram" and a window (in the lower right corner) displaying advice from a coach. The default "palette" (the horizontal row of icons near the top of Figure 1) makes salient the most crucial distinctions we want students to acquire in order to conduct scientific inquiry. Left to right, the icons are "data" for empirical statements, "hypothesis" for theoretical statements, and "unspecified" for statements about which students disagree or are uncertain; then links representing "against" and "for" evidential relations, and a link for conjunction. Students use the palette by clicking on an icon, typing some text (in the case of statements) and optionally setting other attributes, and then clicking in the diagram to place the statement or create the link. The remaining icons at the right end of the palette provide sources of counsel and knowledge: they are a light bulb representing "ideas" from the coach, an "in-box" that can receive information from Web pages, and (optional and not shown in the figure) icons that start other applications such as a Web browser. A "Guide" menu (upper right of Figure 1) provides students with suggestions on how to use the software through five "phases of inquiry" (explore, hypothesize, investigate, evaluate, and report).
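
The palette described above can be read as a small data model: three kinds of statements and three kinds of links. The following Java fragment is a minimal sketch of that reading, not the Belvedere source. The class name InquiryDiagramSketch, the enum names, and the linkedTo query are all hypothetical and were chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: statements joined by evidential links. Not the Belvedere source. */
class InquiryDiagramSketch {
    /** The three statement icons on the default palette. */
    enum StatementKind { DATA, HYPOTHESIS, UNSPECIFIED }

    /** The three link icons: evidential "for" and "against", plus conjunction. */
    enum LinkKind { FOR, AGAINST, AND }

    static final class Statement {
        final StatementKind kind;
        final String text;

        Statement(StatementKind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static final class Link {
        final LinkKind kind;
        final Statement from;
        final Statement to;

        Link(LinkKind kind, Statement from, Statement to) {
            this.kind = kind;
            this.from = from;
            this.to = to;
        }
    }

    private final List<Statement> statements = new ArrayList<>();
    private final List<Link> links = new ArrayList<>();

    /** Mirrors "click an icon, type some text, click in the diagram". */
    Statement addStatement(StatementKind kind, String text) {
        Statement s = new Statement(kind, text);
        statements.add(s);
        return s;
    }

    /** Links may only join statements already placed in this diagram. */
    Link addLink(LinkKind kind, Statement from, Statement to) {
        if (!statements.contains(from) || !statements.contains(to)) {
            throw new IllegalArgumentException("both ends must be in the diagram");
        }
        Link l = new Link(kind, from, to);
        links.add(l);
        return l;
    }

    /** Statements that point at the target through a link of the given kind. */
    List<Statement> linkedTo(Statement target, LinkKind kind) {
        List<Statement> result = new ArrayList<>();
        for (Link l : links) {
            if (l.kind == kind && l.to == target) {
                result.add(l.from);
            }
        }
        return result;
    }
}
```

Under these assumptions, a student who places one hypothesis and one datum and joins them with a "for" link would produce a diagram in which linkedTo(hypothesis, LinkKind.FOR) returns that single datum. Attributes such as position and belief level are omitted from the sketch.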

We use a diagrammatic interface for cognitive, collaborative, and evaluative reasons. First, the cognitive: concrete representations of abstractions turn conceptual tasks into perceptual tasks. Thus the diagrams help students "see" and internalize these abstractions and keep track of them while working on complex issues. Second, the collaborative: diagrams support collaboration by providing a shared context and reference point. Third, the evaluative: student-constructed diagrams provide the teacher and the computer with a basis for assessing students' understanding of inquiry in general and of a topic area in particular. These three reasons are discussed further below.

2.3.1.1 Cognitive Support

Diagrams help students "see" and internalize abstractions and keep track of them while working on complex issues. The inquiry diagram serves both as a record of what the students have done, and an agenda of further work (especially with the help of coaching, discussed below). The representations help guide students' thinking and activity. We have found that the choice of representational primitives has a strong effect on the content of students' collaboration, since the first action one takes when expressing an idea is to choose a category from the primitives. The earlier version of Belvedere [Suthers et al. 1995] provided a large set of choices. However, in formative evaluations with dyads, the students' discussions of the choices interfered with continuation of the inquiry process. We therefore reduced the palette to the essential types, to help focus their discussion on the most essential distinctions. One of the menus provides the option of adding other primitives.

2.3.1.2 Collaborative Support

Diagrams support collaboration by providing a shared context and reference point. These advantages manifest in different ways depending on whether the students are co-present or collaborating over the network. When they are co-present, diagrams support collaboration by helping students keep track of and refer to ideas under discussion, whether using a single display, or two displays near each other. In these situations students often use gestures on the display to indicate prior statements and relationships. In some group configurations we have seen students work independently, then use gesturing on the display to re-coordinate their collaboration when one student finds relevant information. This can occur when information is brought to the group from off-line sources, such as hands-on experiments. Distally, students can work in parallel on the same workspace, as long as they are not editing the same object at the same time. On networked computers, all changes are propagated to others working with the same diagram in a "what you see is what I see" manner. In addition to the diagram, a "chat" facility and a remote pointing mechanism support unstructured natural-language conversations, needed to coordinate the more structured inquiry diagramming when collaborating at a distance.
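
The two collaboration rules in the preceding paragraph (every change reaches everyone viewing the same diagram, and two students may not edit the same object at once) can be illustrated with a small in-process sketch. The real system propagates changes through a client-server architecture over the network; the fragment below only assumes an observer-style broadcast with a per-object edit claim. The names SharedWorkspaceSketch, ClientView, RecordingView, beginEdit, and commitEdit are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-process sketch of "what you see is what I see" propagation. */
class SharedWorkspaceSketch {
    /** A client display that must be told about every committed change. */
    interface ClientView {
        void apply(String objectId, String newText);
    }

    /** Simple view that remembers the latest text shown for each object. */
    static final class RecordingView implements ClientView {
        final Map<String, String> shown = new HashMap<>();

        @Override
        public void apply(String objectId, String newText) {
            shown.put(objectId, newText);
        }
    }

    private final List<ClientView> views = new ArrayList<>();
    // objectId -> user currently editing it (assumed locking scheme)
    private final Map<String, String> editors = new HashMap<>();

    void join(ClientView view) {
        views.add(view);
    }

    /** Returns false when another user is already editing the object. */
    boolean beginEdit(String objectId, String user) {
        String holder = editors.get(objectId);
        if (holder != null && !holder.equals(user)) {
            return false;
        }
        editors.put(objectId, user);
        return true;
    }

    /** Releases the claim and pushes the change to every joined view. */
    boolean commitEdit(String objectId, String user, String newText) {
        if (!user.equals(editors.get(objectId))) {
            return false;
        }
        editors.remove(objectId);
        for (ClientView view : views) {
            view.apply(objectId, newText);
        }
        return true;
    }
}
```

In this sketch, two students can hold claims on different objects at the same time, which corresponds to working in parallel on the same workspace. A second claim on an object that is already being edited is simply refused until the first edit is committed.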

2.3.1.3 Evaluative Support

Student-constructed diagrams provide the teacher and the computer with a basis for assessing students' understanding of scientific inquiry, as well as of subject matter knowledge. This assessment can support computer coaching of the inquiry process. As described in section 2.3.3, we have constructed two types of coaches. One provides general advice on the structure of the inquiry diagram from the standpoint of scientific argumentation. It helps the students understand principles of inquiry such as: hypotheses are meant to explain data, and are not accepted merely by being stated; multiple lines of evidence converging on a hypothesis are better than one consistent datum; one should seek disconfirming evidence as well as confirming evidence (addressing the confirmation bias, as shown in Figure 1); discriminating evidence is needed when two hypotheses have identical support; circular arguments are problematic; etc. The other coach performs various comparisons between the students' diagrams and an inquiry diagram provided by a subject matter expert. This coach can provide students with feedback concerning correctness, or confront students with new information (found in the expert's diagram) that challenges students in some way.
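
Several of the structural principles listed above can be phrased as simple checks over which data support or oppose each hypothesis. The Java fragment below is a minimal sketch of four such checks, written for illustration only. It is not the coach described in section 2.3.3, and the class name SyntacticCoachSketch, the method names, and the advice labels are hypothetical. Circular arguments are not covered.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of structure-only advice; labels are illustrative. */
class SyntacticCoachSketch {
    // hypothesis -> data linked "for" it, and data linked "against" it
    private final Map<String, Set<String>> support = new LinkedHashMap<>();
    private final Map<String, Set<String>> opposition = new LinkedHashMap<>();

    void addHypothesis(String h) {
        support.putIfAbsent(h, new LinkedHashSet<String>());
        opposition.putIfAbsent(h, new LinkedHashSet<String>());
    }

    void addFor(String datum, String h) {
        addHypothesis(h);
        support.get(h).add(datum);
    }

    void addAgainst(String datum, String h) {
        addHypothesis(h);
        opposition.get(h).add(datum);
    }

    /** Returns one advice line per pattern found, in hypothesis order. */
    List<String> advise() {
        List<String> advice = new ArrayList<>();
        List<String> hyps = new ArrayList<>(support.keySet());
        for (int i = 0; i < hyps.size(); i++) {
            String h = hyps.get(i);
            Set<String> pro = support.get(h);
            Set<String> con = opposition.get(h);
            if (pro.isEmpty() && con.isEmpty()) {
                // Stated but not connected to any data.
                advice.add("NO_EVIDENCE: " + h);
                continue;
            }
            if (pro.size() == 1) {
                // Only one consistent datum; more lines of evidence are better.
                advice.add("SINGLE_SUPPORT: " + h);
            }
            if (!pro.isEmpty() && con.isEmpty()) {
                // Possible confirmation bias: nothing was sought against it.
                advice.add("NO_DISCONFIRMING: " + h);
            }
            for (int j = i + 1; j < hyps.size(); j++) {
                String other = hyps.get(j);
                if (pro.equals(support.get(other)) && con.equals(opposition.get(other))) {
                    // Same evidence on both sides; it cannot tell them apart.
                    advice.add("NEEDS_DISCRIMINATING: " + h + " vs " + other);
                }
            }
        }
        return advice;
    }
}
```

A real coach would also need to decide which of several applicable suggestions to present and how to word them for students; this sketch only detects the patterns.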

2.3.1.4 Other Features

Other features of Belvedere, briefly noted, include the following. Students can set different "belief levels" for the statements and relations, and display these as line thickness with a "filter." References to external objects can be sent from other applications to an "in-box" (right hand icon of Figure 1) for optional placement in the diagram at the students' convenience. We and our students regularly use this in-box mechanism to send references to Web pages containing relevant information. Once a reference is placed in an inquiry diagram, Belvedere provides a hyperlink back to the referenced Web page. Thus Belvedere can be used as a structured "hotlist" to organize Internet resources.
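
The in-box behavior described above (references arrive from other applications, wait until the student chooses to place them, and then remain linked to their source) can be sketched as a simple queue. This Java fragment is an illustration under that assumption and not the Belvedere implementation. InBoxSketch, Reference, receive, placeNext, and hotlist are hypothetical names, and any URL passed in is a placeholder supplied by the caller.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch of the "in-box": references wait here until placed. */
class InBoxSketch {
    /** A pointer to an external object, such as a Web page. */
    static final class Reference {
        final String title;
        final String url;

        Reference(String title, String url) {
            this.title = title;
            this.url = url;
        }
    }

    private final Deque<Reference> pending = new ArrayDeque<>();
    private final List<Reference> placed = new ArrayList<>();

    /** Called when another application sends a reference to the in-box. */
    void receive(String title, String url) {
        pending.addLast(new Reference(title, url));
    }

    int pendingCount() {
        return pending.size();
    }

    /** Moves the oldest pending reference into the diagram, or returns null. */
    Reference placeNext() {
        Reference r = pending.pollFirst();
        if (r != null) {
            placed.add(r);
        }
        return r;
    }

    /** Placed references keep their URL, so they can serve as a "hotlist". */
    List<Reference> hotlist() {
        return new ArrayList<>(placed);
    }
}
```

Because placement is a separate step from receipt, incoming references never interrupt diagramming; students decide when, and whether, each one enters the diagram.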

2.3.2 Architecture

This section describes the architecture underlying the Belvedere system. Our notion of "architecture" is multifaceted, encompassing all aspects of the design of the software. In this section, we use four levels of description for software systems proposed by CAETI colleagues Frank Belz and David Luckham [Luckham et al., 1997]: Interface Presentation, Concepts of Operations, Abstract Implementation, and Resource. In analyzing our own work we have found it useful to begin with a fifth level of description, Concepts of Application, that is independent of the software. This is necessary for design and evaluation with respect to intended objectives. Additional discussions with Tom Wheeler of Army CECOM further clarified our concepts of these abstraction layers. Along with Belz, Luckham, and Wheeler, we claim that clarity about level of description helps avoid misunderstandings due to talking at different levels, and enables one to choose to use an existing architecture at one level while rejecting or changing it at another level. (See section 2.3.5.1 for further discussion of this abstraction hierarchy and the collaborations that led to it.)

Each of the following sections begins with a general characterization of the corresponding level of description, followed by an informal description of our application or architecture at that level, and a summary of mappings to other levels of description. At each level we discuss reusability and interoperability concerns, and the advantages and disadvantages of our design.

2.3.2.1 Concepts of Application

At the level of concepts of application, one begins by describing the application domain largely in its own terms (as practitioners view it), and the educational objectives or other task objectives. Then, through cognitive task analysis or other methodology, one identifies barriers to these objectives, and chooses those which the software might be expected to help overcome.
2.3.2.1.1 Critical Inquiry in Science
The Belvedere application domain is learning critical inquiry skills, particularly in science. Since the focus of this paper is on viable architectures rather than this specific application domain, we describe the application only enough to provide background for subsequent discussion. Basic actions of learning critical inquiry in science include
  1. Familiarizing oneself with a field of study
  2. Identifying a problem of interest
  3. Proposing hypotheses (or solutions)
  4. Identifying and seeking evidence that bears on those hypotheses (or solutions)
  5. Drawing conclusions based on the evidence found
  6. Summarizing and reporting the inquiry to others
  7. Evaluating the status of the inquiry, with repeat at any of the steps above
  8. Discussing and coordinating the doing of 1-8 with others.
  9. Obtaining solicited and unsolicited guidance on how to conduct critical inquiry
We identified the following possible barriers to learning critical inquiry in science (Suthers et al. 1995; Suthers & Weiner 1995):
  1. Lack of motivation.
  2. Limited knowledge of scientific domains.
  3. Inability to recognize abstract relationships implicit in scientific theories and arguments about them.
  4. Difficulty keeping track of a complex debate.
  5. Lack of scientific argumentation criteria, and associated biases, e.g., confirmation bias.
We return to elements of both of the above lists in subsequent sections.
2.3.2.1.2 Generality
At the Concepts of Application level, "reusability" is a psychological concern rather than an engineering concern: we must ask how well the task analysis applies to other domains, and hence whether the pedagogical strategies and forms of scaffolding that are embodied in other levels of the system will transfer well. The generality of our particular analysis is not within the scope of this paper.

2.3.2.2 Interface Presentation

At the interface presentation level, one designs the perceptual/motor experience of the user. Here we describe the functionality available to the user in terms of representations of application objects and actions on these objects.
2.3.2.2.1 A Graphical Interface for Critical Inquiry
The Belvedere "inquiry diagram" interface (Figure 2) can be thought of as networked groupware for constructing representations of evidential relations between statements. It uses shapes for different types of statements and links for different kinds of relationships between these statements. Multiple clients can view the same inquiry diagram, with "what you see is what I see" (WYSIWIS) updating. An auxiliary "chat" window (upper left of Figure 2) supports unstructured natural language communication. Additionally, a software-based "coach" (lower right of Figure 1) provides assistance to students as they engage in their various inquiry activities [Toth et al., 1997]. To avoid interrupting students' thought processes, the coach is minimally intrusive, usually remaining quiet unless students ask for advice, and flashing its light bulb only when it has critical advice to offer. It coaches critical inquiry by asking questions students may not have thought of, based on criteria of inquiry and argumentation in science.

Figure 2. The Belvedere Interface

Belvedere is designed to be used in conjunction with materials presented in a Web browser. The materials are segmented into units at a granularity which a subject matter expert chooses for his or her own inquiry diagrams. "Reference This" buttons in the Web pages enable students to send "references" to these segments into the Belvedere "in-box" (upper right of Figure 2) from where they may be dragged into the inquiry diagram as needed. The small icons in the upper left of each shape indicate that hyperlinks can be followed back to the original document.

2.3.2.2.2 Relations to Concepts of Applications
A well designed interface should support Concepts of Application through a clear mapping of domain objects and actions to interface objects and actions. Furthermore, the interface should address the barriers identified at the superordinate level of analysis, for example by providing visual organizers.

Summarizing from Suthers et al. [1995] and Suthers & Weiner [1995], here is how the interface is designed to address barriers to learning critical inquiry:

  1. Lack of motivation: Belvedere is designed to support collaborative problem solving, providing peer motivation and engaging activities [O'Neill & Gomez 1994; Scardamalia & Bereiter 1991; Slavin 1990]. Support for collaboration includes networked WYSIWIS, the chat facility, and the diagram itself, which helps students switch between working independently and working together without losing track of what they are doing.
  2. Limited knowledge of scientific domains: This is addressed in part through on-line materials, and in part through "expert coaches" which can coach based on the knowledge of a particular domain.
  3. Inability to recognize abstract relationships and arguments: Belvedere's diagrammatic representations reify these relationships and make salient both weaknesses and points where further contributions can be made [Smolensky et al 1987; Streitz et al 1989].
  4. Difficulty keeping track of a complex debate: This is partially addressed by the concrete visual representation, which helps students keep track of main points and pending issues.
  5. Lack of scientific argumentation criteria, and associated biases: This is addressed by Belvedere's coach.
Following are some examples of the mapping of Concepts of Application actions to the Interface level:
  1. Familiarizing oneself with a field of study: Browsing the Web materials.
  2. Identifying a problem of interest: Starting a new inquiry diagram, labeled by a problem statement.
  3. Proposing hypotheses: Either selecting the "hypothesis" icon and typing in a statement of a hypothesis, or using a "reference this" button to bring a reference to an existing hypothesis into the diagram.
  4. Identifying and seeking evidence that bears on those hypotheses: The coach helps users identify when evidence is needed. The Web materials themselves along with hands-on activities suggested in those materials provide some sources of evidence. Evidence is recorded as for A3, except the "Data" icon is used.
  5. Drawing conclusions: Belvedere provides a facility for changing and viewing the relative "strength" of the different statements. The coach provides some guidance, but further support is needed here.
  6. Summarizing and reporting the inquiry to others: Currently support is inadequate. Users can print their inquiry diagrams, or convert them into HTML tables that summarize the evidence for and against each major hypothesis.
  7. Evaluating the status of the inquiry, with repeat at any of the steps above: The coach provides some local guidance. Also we provide an outline of phases of activity and a "Guide" menu to help students through these phases.
  8. Discussing and coordinating the doing of 1-8 with others: If not co-located, users can interact via the Chat window.
  9. Obtaining solicited and unsolicited guidance: The coach provides both.
This analysis has been simplified for this paper: our full analysis specifies the complete interface actions required to carry out each action in the concepts of application.
2.3.2.2.3 Comments on the Analysis
An analysis of this kind has helped us identify some limitations of the Belvedere interface. The mapping is not always clear, and it lacks scaffolding of the overall process. We have begun to address these concerns. An advantage of our approach is that the interface can easily be modified without affecting the other levels of the architecture. As we shall see in the next section, the Interface level of analysis can also be bypassed in favor of a direct mapping of Concepts of Application to Concepts of Operations.

2.3.2.3 Concepts of Operations

At this level one describes how the software models the application domain, in terms of classes of objects and the operations that can be performed on them. The specification can take the form of an object-oriented model, or a collection of abstract data types (ADTs).
2.3.2.3.1 Supporting Collaborative Coached Critical Inquiry
To illustrate, below are some objects and operations supported by our system. The numbers in brackets indicate which Concepts of Application actions are being supported.

Inquiry Diagrams. Inquiry diagrams consist of a problem statement, and a collection of statements and relationships between them. The operations abstract communications between the Belvedere interface and a persistent object store. Some of these are New Inquiry Diagram (A2), Open Inquiry Diagram (A2), Add Statement (A3, A4), Add Relationship (A3, A4), Update Statement (A5, A7), and Delete Statement or Relationship (A5) (we retain a complete history of all objects that existed).
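The inquiry diagram operations above can be pictured as a small abstract data type. The following is a minimal sketch in Python, not the Belvedere implementation: the class name, field names, and in-memory dictionaries are assumptions standing in for the persistent object store, and deletion is modeled as a flag so that a complete history of all objects is retained, as stated above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class InquiryDiagram:
    """Hypothetical in-memory stand-in for a diagram in the object store."""
    problem: str
    statements: Dict[int, dict] = field(default_factory=dict)
    relationships: Dict[int, dict] = field(default_factory=dict)
    history: List[Tuple[str, int]] = field(default_factory=list)
    _next_id: int = 1

    def _new_id(self) -> int:
        obj_id = self._next_id
        self._next_id += 1
        return obj_id

    def add_statement(self, kind: str, text: str) -> int:
        """Add Statement: kind is e.g. 'hypothesis' or 'data'."""
        obj_id = self._new_id()
        self.statements[obj_id] = {"kind": kind, "text": text, "deleted": False}
        self.history.append(("add-statement", obj_id))
        return obj_id

    def add_relationship(self, kind: str, source: int, target: int) -> int:
        """Add Relationship: kind is e.g. 'for' or 'against'."""
        if source not in self.statements or target not in self.statements:
            raise KeyError("both ends of a relationship must be statements")
        obj_id = self._new_id()
        self.relationships[obj_id] = {
            "kind": kind, "source": source, "target": target, "deleted": False}
        self.history.append(("add-relationship", obj_id))
        return obj_id

    def update_statement(self, obj_id: int, text: str) -> None:
        """Update Statement: replace the text of an existing statement."""
        self.statements[obj_id]["text"] = text
        self.history.append(("update-statement", obj_id))

    def delete(self, obj_id: int) -> None:
        """Delete Statement or Relationship: flag it, but keep the object."""
        store = self.statements if obj_id in self.statements else self.relationships
        store[obj_id]["deleted"] = True
        self.history.append(("delete", obj_id))

    def visible_statements(self) -> Dict[int, dict]:
        """Statements that have not been deleted."""
        return {i: s for i, s in self.statements.items() if not s["deleted"]}
```

Keeping deleted objects in the store, rather than removing them, is one simple way to preserve the history that the text mentions; the actual storage scheme is not described in this report.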

Information Search. Accomplished by Get Page (A1) and Send Reference (A3, A4), invoked via the Web browser.

Discussion with Others. A8 is accomplished by Send Message.

Advice Services. Objects include requests, replies, and interruptions; all in support of A9. The client can Request Advice; and the coach can Send Advice, which consists of the advice text and a list of the objects that the advice text refers to. The coach can also send an Interruption, which is a request to perform an interface action that notifies the user that advice is available.
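The three advice objects just described can be sketched as simple message types. This is an illustrative Python sketch under stated assumptions: the class names mirror the operation names in the text, but the field names and the `client_reaction` helper are hypothetical and are not taken from the Belvedere code.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RequestAdvice:
    """Sent by the client when the user asks the coach for advice."""
    client_id: str


@dataclass(frozen=True)
class SendAdvice:
    """Sent by the coach: advice text plus the objects the text refers to."""
    text: str
    object_ids: Tuple[int, ...]


@dataclass(frozen=True)
class Interruption:
    """Sent by the coach: a request to perform an interface action that
    tells the user advice is available."""
    interface_action: str


def client_reaction(message):
    """Hypothetical client-side dispatch on the two coach-to-client messages."""
    if isinstance(message, SendAdvice):
        return {"show": message.text, "highlight": list(message.object_ids)}
    if isinstance(message, Interruption):
        return {"perform": message.interface_action}
    raise TypeError(f"unexpected message type: {type(message).__name__}")
```

Carrying the referenced object identifiers alongside the text lets the client indicate which parts of the diagram the advice concerns without the coach knowing anything about the interface.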

Some important Concepts of Application activities are not supported by this model. These include performing data analysis and visualizations (A4, A5), asking the coach specific questions (A9), and abstracting summaries of the inquiry (A6). Extensions are being planned to address these concerns.

2.3.2.3.2 Relations to Other Levels
Concepts of Operations supports the User Interface level by providing primitives for creation of, access to, and state changes in objects. Concepts of Operations abstracts from Concepts of Application because the objects or ADTs could be reapplied to other application domains that have similar modeling requirements: a given application is an instance in the class of task domains covered. Hence, Concepts of Operations is the level at which we describe generic task domains. A shell is a collection of software that applies to a given generic task domain [Murray 1996].[1] For example, our generic task domain is collaborative critical inquiry with coaching, and our software can be thought of as a shell for such applications.
2.3.2.3.3 Interoperability and Reusability
At the level of Concepts of Operations, interoperability and reusability are aided by shared ontologies. Ontologies are formalized structures (such as hierarchies) that define abstract concepts and the relations between them.[2] The concepts abstract critical features of the particular objects of an application domain. Shared ontologies help people communicate the contents and capabilities of their systems, strategies, etc., for example helping us determine whether the modeling services of a particular piece of software will adequately support our needs in a new application, or whether we can reuse a pedagogical strategy. Shared ontologies also enable us to compose knowledge-based software components because they enable one component to "understand" the contents of data or messages it receives from another component. This is an area we have only begun to explore in our own work, but see Ikeda et al [1995].[3]

2.3.2.4 Abstract Implementation

At this level one describes the architectural elements and communication between these elements, including software modules such as interpreters, databases, event managers, etc., and data and control flow between them. Figure 3 details our abstract implementation level architecture at the granularity of modules that require network communications. All actions initiated by the user are accomplished via CGI and the response from the CGI call. The decision to use CGI was based upon the availability of the HTTP server (already needed for materials delivery); the ease of interfacing a Java application with the server via the openURL method; and ease of modification and maintenance. Messages for WYSIWIS, coaching, and chat come in asynchronously via a small listener server in the client. The listener runs as a separate thread in the client. The Connection Manager is written in Java. The interfaces are simple and robust: the communication architecture has performed extremely well during our laboratory "stress" testing. Other advantages include portability and low cost (most components are free). A major exception is the Coach, which was implemented in Lisp and Loom for ease of development. The Coach actually consists of several submodules: an argument pattern coach, an expert model coach, and an arbitrator that prioritizes advice from the coaches for presentation based on factors such as discourse history and type of advice [Toth et al, 1997]. Our architecture enables this use of "heavyweight" environments for advanced functionality, because client platforms need only run Netscape and Java applications. However, we have recently reimplemented the Coach in Java to enable lower cost and portable server delivery.
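To make "data embedded in CGI GET" concrete, the sketch below shows how a client might encode an application request in a URL query string and how a CGI script might decode it. It is written in Python for brevity rather than in the Java used by the client, and the base URL, the `op` parameter, and the other parameter names are assumptions for illustration; the actual request format is not given in this report.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_request_url(base_url, operation, **params):
    """Encode an operation and its arguments as a CGI GET URL.

    The 'op' key and the parameter names are hypothetical.
    """
    query = urlencode({"op": operation, **params})
    return f"{base_url}?{query}"


def parse_request_url(url):
    """Decode the query string the way a CGI script might, keeping the
    first value given for each parameter."""
    query = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in query.items()}


# Example: an Add Statement request against a placeholder server address.
url = build_request_url(
    "http://server.example/cgi-bin/session",
    "add-statement",
    diagram="42",
    kind="hypothesis",
    text="Birds evolved from dinosaurs",
)
request = parse_request_url(url)
```

Because the whole request travels in the URL, any client that can open a URL can issue it, which is consistent with the ease of interfacing noted above.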


  1. Browsing (Get Page)
    1. Client request (HTTP)
    2. Server reply (HTML with embedded Java)
    3. Access logging. (When implemented, Tracker will notify Coaches.)
  2. Referencing On-line Materials (Send Reference)
    1. Java applet sends reference to server (data embedded in CGI GET)
    2. Reference sent via socket in application specific protocol
  3. Application Requests and Updates (New Inquiry Diagram, Open Inquiry Diagram, Add Statement, Add Relationship, Update Statement, Delete Statement)
    1. Request or update sent to Session Server (data embedded in CGI GET)
    2. Server queries or updates Database (SQL requests)
    3. Database replies with results or return code
    4. Reply sent to client (response to CGI GET. The user can continue working before the reply is received.)
  4. Updates Propagated to Other Clients (WYSIWIS for events generated in #3)
    1. Update sent to Session Server (subset of 3.1)
    2. Connection Manager informed of update (TCP socket; application specific protocol)
    3. Connection Manager formats message and informs all clients that are using the same workspace (TCP socket; application specific protocol)
  5. Coaching (Request Advice, Send Advice)
    1. Update or advice request sent to Session Server (data embedded in CGI GET)
    2. Update or advice request sent to Coach dispatcher (TCP socket; application specific protocol)
    3. Coach queries Database if needed to determine state (SQL, read-only)
    4. Database replies
    5. Coach sends client advice, if requested. Coach sends interrupt when update activated high priority advice (TCP socket; application specific protocol)
  6. Chat Facility (Send Message)
    1. User's comment sent to Session Server (data embedded in CGI GET)
    2. Session Server sends comment to Connection Manager (TCP socket; application specific protocol)
    3. Connection Manager forwards to users in same workgroup (TCP socket; application specific protocol)
Figure 3: Abstract Implementation Layer
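
The update propagation and chat paths in Figure 3 both end with the Connection Manager forwarding a message to the clients that share a workspace. The following Python sketch illustrates only that routing step, under stated assumptions: lists stand in for the TCP sockets and the application specific protocol, and the class and method names are hypothetical (the actual Connection Manager is written in Java).

```python
class ConnectionManager:
    """Hypothetical in-memory model of workspace-based message routing."""

    def __init__(self):
        # workspace name -> {client id -> list standing in for a socket}
        self._workspaces = {}

    def join(self, workspace, client_id):
        """Register a client in a workspace and return its message list."""
        inbox = []
        self._workspaces.setdefault(workspace, {})[client_id] = inbox
        return inbox

    def leave(self, workspace, client_id):
        """Remove a client; later broadcasts no longer reach it."""
        self._workspaces.get(workspace, {}).pop(client_id, None)

    def broadcast(self, workspace, message):
        """Inform all clients using the same workspace; return the count."""
        clients = self._workspaces.get(workspace, {})
        for inbox in clients.values():
            inbox.append(message)
        return len(clients)
```

In this sketch the originating client also receives the message, following the wording "informs all clients that are using the same workspace"; whether the real system excludes the sender is not stated here.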
2.3.2.4.1 Relations to Concepts of Operations
Concepts of Operations abstracts functionality from structure in the Abstract Implementation, by indicating which subsets of Abstract Implementation layer are involved in a given functionality (as shown in the lists of Figure 3). Concepts of Operations provides the semantics of communications, and Abstract Implementation provides the syntax and protocol.
2.3.2.4.2 Interoperability and Reusability Issues
Communication is the key to interoperability and reusability at this level. Specifically, the use of standard protocols where they exist facilitates the interchange, addition, or reuse of components. Our current communication protocols and representations are summarized in Figure 3. Some of the advantages have already been discussed, including simplicity, robustness, portability, and low cost. Also, it is easy to add or change clients using CGI scripts. This was not true of the coach described above; however the recently finished Java-based Coach utilizes the same communication protocols (and Java networking code) as other clients. These changes facilitate the easy addition of new coach modules and the distribution of coach functionality across platforms: one can take a client, remove the GUI, and plug in a coach. Furthermore, the architecture permits interaction with other architectures and components. For example, in our MOO-based CAETI Integration Feasibility Demonstration, presented at the June 1997 meeting, a simulation by Ken Forbus sent simulation results as Data objects into the Belvedere in-box, and a tutor by Ken Koedinger commented on how these data objects are linked in to the inquiry diagram (see section 2.3.5.2).

The above design is limited in several ways. Some of the protocols are application specific. This is probably unavoidable; although some reuse may be facilitated by shared ontologies at the Concepts of Operations level. Under new PTI funding, we have begun another cycle of redesign to enable delivery using other databases and other server class machines. Prototype versions of RMI and CORBA server interfaces have been implemented and are currently undergoing testing and evaluation. Our new design will also greatly simplify the addition of new types of clients. (We plan to add clients that manipulate influence diagrams, causal loop diagrams, and concept maps.) Under the new design the protocols are data-driven, so that only minor modifications to the Session Manager (and no other existing components) are required to add a new type of client. Each client would load a data type table into the Database.
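
One way to read "data-driven" here is that the server consults a table describing each client's object types instead of hard-coding them. The Python sketch below illustrates that idea only; the table layout, the function names, and the example concept map types are assumptions, since the redesigned protocol is not specified in this report.

```python
# Hypothetical registry: client type -> {object type -> required field names}.
TYPE_TABLES = {}


def load_type_table(client_type, table):
    """Register the data type table a new kind of client supplies."""
    TYPE_TABLES[client_type] = {
        obj_type: frozenset(fields) for obj_type, fields in table.items()
    }


def missing_fields(client_type, obj_type, payload):
    """Return the required fields absent from a payload, sorted by name.

    Raises KeyError for a client type or object type that was never loaded.
    """
    required = TYPE_TABLES[client_type][obj_type]
    return sorted(required - set(payload))


# Adding a concept map client needs only a new table, not new server code.
load_type_table("concept-map", {
    "concept": ["label"],
    "link": ["source", "target", "label"],
})
```

Under this reading, the generic server code stays unchanged when a new client type appears, which matches the claim that only minor modifications are required.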

We are attempting to generalize the abstract implementation architecture to be configurable for any learning application that requires networked collaboration, coaching, and multimedia. Adaptive multimedia [Brusilovsky et al, 1996] could be included with scripts that automatically generate HTML pages from the database to meet users' needs. We have designed and implemented a prototype of this adaptive hypermedia extension but have not incorporated it into our released system. Student modeling facilities would be improved by informing the Coach of which materials students have examined via the Tracker.

2.3.2.5 Resource Layer

At this level one describes the system in terms of the resources used and their performance characteristics, including performance of both hardware and implemented software, as well as constraints on where that software resides. In Belz and Luckham's work this level of description is used primarily for performance modeling,[4] which is not a concern in this paper. For present purposes the most significant resource constraints on the implemented architecture are as follows:
Client platforms: Any platform supporting Java applications and Netscape 2.0 or better. We have tested on Mac OS, Solaris, Windows 95 and Windows NT. Installations in our lab and in DoDDS schools were on PowerMacintosh 8100 series and various Pentium platforms. Applications shown in the shaded box in Figure 3 must be running at the same IP location.
Server platforms: A Unix server was required. The redesigned version will deliver on Windows NT and other server class machines. The server was installed on a Sparcstation 20 MP in our lab and on Netras in each of the 4 DoDDS schools. The server components shown in shaded boxes must reside at the same location as others in the box.
Network requirements: With the possible exception of images embedded in HTML materials, all messages are small, so communication load is low. A 28.8 kbps connection was adequate. Installations were all 10BaseT.

2.3.2.6 Architectural Lessons

The advent of the Web has brought us widespread connectivity, shared protocols, and software languages that can migrate between platforms. These have enabled the development of client-server systems for delivery of interesting functionality as well as materials, on a variety of platforms. Such systems provide the knowledge-based educational technology community with more viable options for getting systems delivered in the "real world." During development we can choose to use sophisticated tools for knowledge-based systems, and to the extent that connectivity is available, deliver intelligent functionality without needing to scale down the intelligence. Furthermore, this new technology can help us address some of the pragmatic problems that have plagued those who are developing knowledge-based applications for education. We have begun to resolve some of the basic interoperability issues that will make it easier to reuse components of ITS and other knowledge-based systems. This reuse will enable researchers to allocate more effort to research rather than development of the infrastructure needed to test their ideas, as well as reducing cost of delivery. The real issues -- the hard ones -- are now shifting to a more conceptual level of analysis. We need to address the issue of how we can share content, including media, pedagogical strategies, and intelligent services such as user modeling. Section 2.3.5.2 pursues this issue further in the context of an Integrated Feasibility Demonstration in collaboration with other CAETI researchers.

2.3.3 Design of Computer Coaches

In this section we describe a prototype advisor for students using Belvedere. The advisor has two strategic components, syntactic and consistency-based. Syntactic strategies are based on structural and categorical patterns in argument representations constructed by the students, and suggest ways in which students can continue their inquiry. Consistency-based strategies check student-made links between pairs of statements against the pairwise relations specified between corresponding units in a knowledge base constructed by a teacher or expert, and identify information that may challenge or corroborate relationships proposed by the students.

2.3.3.1 Introduction

We have now prototyped an automated advisor that gives advice on demand concerning ways in which an argument in this environment can be extended or revised. Rather than supplying oracular advice whenever the student missteps, the advisor is on-demand, avoiding inappropriate intrusion into student discussion that may be taking place external to the computer environment. Advice is phrased as suggestions and questions because we cannot presume that an automated advisor has sufficient information to be imperative, and we want students to think about the advice, not just execute it.

In this section we discuss two methods of advice generation that we have implemented. Syntactic advice strategies make suggestions based solely on the syntactic structure of students' inquiry diagrams. Consistency-based advice strategies use a simple knowledge base of consistency relations between information units to identify information that may challenge or corroborate relationships postulated by the students. Before describing these advice giving methods, we first briefly describe the design constraints under which we operated.
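
As a first intuition for the consistency-based method, the Python sketch below compares each student-made link with the relation an expert recorded for the same pair of information units. It is based only on the summary given here, and every name in it is an assumption: the "consistent"/"inconsistent" labels, the "for"/"against" link kinds, and the `check_links` function are illustrative, not the advisor's actual representation.

```python
def check_links(student_links, expert_relations):
    """Compare student links with expert pairwise relations.

    student_links: iterable of (unit_a, unit_b, kind), kind 'for' or 'against'.
    expert_relations: dict mapping frozenset({unit_a, unit_b}) to
        'consistent' or 'inconsistent' (a hypothetical knowledge base format).
    Returns (unit_a, unit_b, verdict) for each link the knowledge base covers.
    """
    findings = []
    for unit_a, unit_b, kind in student_links:
        expert = expert_relations.get(frozenset((unit_a, unit_b)))
        if expert is None:
            continue  # the knowledge base says nothing about this pair
        expected = "for" if expert == "consistent" else "against"
        verdict = "corroborates" if kind == expected else "challenges"
        findings.append((unit_a, unit_b, verdict))
    return findings
```

Pairs that the expert did not annotate produce no finding in this sketch, which keeps the advisor silent where it has no information.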

2.3.3.2 Pedagogical Constraints on Advice

Our design of the advisors to be discussed was guided in part by the following constraints.

Maintain the student-initiated character of Belvedere's environment. Belvedere encourages reflection by allowing students to see their argumentation as an object. They can point to different parts of it and focus on areas that need attention. They can engage in a process of construction and revision, reciprocally explaining and confronting each other. An advisor should not intervene prematurely in their thinking process. It should be discreet, offering advice on request. Students should feel free to discard an advisor's suggestions when they believe them to be irrelevant or inappropriate.

Address certain parts of the task that are critical to the desired cognitive skill. Research on "confirmation bias" and hypothesis driven search suggests that students are likely to be concerned with the process of constructing an argument for a favored theory they are supporting, sometimes overlooking or discounting discrepant data [Klayman & Ha 1987; Chinn & Brewer 1993]. Also, they may not consider alternate explanations of the data they are using. An advisor should address these problems. For example, it should offer information that the student may not have sought, including information that is discrepant with the student's theory.

Be applicable to problems constructed by outside experts and teachers. The advisor should be able to give useful advice based on a simple knowledge base that an expert or a teacher might construct. So far Belvedere has been used to construct arguments in domains as different as theory of evolution, contrasting theories of mountain formation, cause of the Cretaceous extinctions, whether HIV causes AIDS, and theories in social psychology. It is not feasible to develop for each a representation of the knowledge needed to deal with the argumentation students potentially could engage in. We are instead interested in a general approach, applicable to all the cases, in which the knowledge base can be constructed by a teacher.

2.3.3.3 Syntactic Advice Strategies

The first approach we implemented gives advice in response to situations that can be defined on a purely syntactic basis, using only the structural and categorical features of the students' argument graphs. (The students' text is not interpreted.) Types of advice are defined in terms of patterns to be matched to the diagram, and textual advice to be given if there is a match. Example advice patterns are given in Table 1.

The advice applicable to a given inquiry diagram is often more than a student can be expected to absorb and respond to at one time. When more than one instance of advice is applicable, a preference-based quicksort algorithm is used, following a mechanism used by Suthers [1993] for selecting between alternate explanations. Advice instances are sorted in priority order, and the highest priority advice is given. Objects that bind to variables in the patterns are highlighted in yellow when the advice is given, so the user can easily identify what the advice is about. If further advice is requested before the diagram changes, subsequent advice instances on the sorted list are used without reanalysis. We are investigating preferences that take into account factors such as prior advice that has been given, how that advice has been responded to, how recently the object of advice was constructed, and various categorical attributes of the applicable advice.
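
The prioritization step can be sketched as a sort driven by an ordered list of pairwise preferences, where the first preference that distinguishes two advice instances decides their order. The Python sketch below is illustrative only: it uses Python's built-in sort with a comparison function rather than a quicksort, and the two preferences shown (advice not yet given first, then an assumed ranking of advice types) are hypothetical examples of the factors under investigation, not the preferences Belvedere uses.

```python
from functools import cmp_to_key


def prefer_not_yet_given(already_given):
    """Prefer advice whose name is not in the set of advice already given."""
    def preference(a, b):
        return (a["name"] in already_given) - (b["name"] in already_given)
    return preference


def prefer_types(type_order):
    """Prefer advice whose best-ranked type comes earliest in type_order."""
    rank = {advice_type: i for i, advice_type in enumerate(type_order)}
    def best(instance):
        return min((rank.get(t, len(rank)) for t in instance["types"]),
                   default=len(rank))
    def preference(a, b):
        return (best(a) > best(b)) - (best(a) < best(b))
    return preference


def prioritize(instances, preferences):
    """Sort advice instances; earlier preferences take precedence."""
    def compare(a, b):
        for preference in preferences:
            result = preference(a, b)
            if result:
                return result
        return 0  # no preference distinguishes them; keep original order
    return sorted(instances, key=cmp_to_key(compare))
```

The head of the sorted list is the advice to give first, and the remainder can be kept and served in order if the student asks again before the diagram changes.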

We believe that the most important kind of advice is that which stimulates and scaffolds constructive activity on the part of the students. To give this kind of "open world" advice, our first step was to identify partial argument patterns in the inquiry diagram the students had constructed so far and indicate how the student could complete these patterns. For example, the advisor might find theoretical claims that have no empirical support and suggest that support be sought (hypothesis-lacks-empirical-evidence), or it might find competing theories that are both supported by the same empirical evidence, and ask if there is a discriminating piece of evidence (discriminating-evidence-needed in Table 1).


(def-advice 'HYPOTHESIS-LACKS-EMPIRICAL-EVIDENCE
  :query '(retrieve (?h) (and (hypothesis ?h) (No-Evidencep ?h)))
  :advice ("Can you find data that are for or against this hypothesis?
A scientific hypothesis is put forward to explain observed data. Data
that a hypothesis explains or predicts count *for* it. Data that are
inconsistent with the hypothesis count *against* it.")

  :subsequent-advice ("Can you find some data for or against this
hypothesis?")
  :advice-types '(incompleteness))

(def-advice 'DISCRIMINATING-EVIDENCE-NEEDED
  :query '(remove-duplicate-binding-sets
           (retrieve (?h1 ?h2)
             (and (hypothesis ?h1) (hypothesis ?h2)
                  (:not (:same-as ?h1 ?h2))
                  (Exists-Consistent-DataP ?h1)
                  (Exists-Consistent-DataP ?h2)
                  (fail (Consistent-HypoP ?h1 ?h2))
                  (Identical-EvidenceP ?h1 ?h2))))
  :advice ("These hypotheses are supported by the same data. When this
happens, scientists look for more data as a \"tie breaker\" -- especially
data that is *against* one hypothesis. Can you produce some data that
would \"rule out\" one of the hypotheses?")
  :subsequent-advice ("Can you produce some data that might support just
one of the hypotheses?")
  :advice-types '(incompleteness evaluative))

(def-advice 'CONFIRMATION-BIAS
  :query '(retrieve (?h)
             (and (hypothesis ?h)
                  (Exists-Multiple-Consistent-DataP ?h)
                  (Multiply-LinkedP ?h)
                  (fail (Exists-Inconsistent-DataP ?h))))
  :advice
  ("You've done a nice job of finding data that is consistent with this
hypothesis. However, in science we must consider whether there is any
evidence *against* our hypothesis as well as evidence for it. Otherwise
we risk fooling ourselves into believing a false hypothesis. Is there
any evidence against this hypothesis?")
  :subsequent-advice ("Don't forget to look for evidence against this
hypothesis!")
  :advice-types '(cognitive-bias))

(def-advice 'ALTERNATE-HYPOTHESIS
  :query '(retrieve (?h) (hypothesis ?h))
  :lisp-test '(lambda (client query-result) 
           (declare (ignore client)) 
           (= (length query-result) 1))
  :advice ("Scientists consider many hypotheses to get the best
explanation of the data they are interested in. If they don't compare
their favorite idea to other ideas, somebody else will! Is there another
hypothesis that you could consider?")
  :subsequent-advice ("Is there another hypothesis that you could
consider?")
  :advice-types '(cognitive-bias incompleteness))

(def-advice 'SWALLOW-DOES-NOT-A-SUMMER-MAKE
  :query '(retrieve (?d ?h)
             (and (data ?d) (hypothesis ?h)
                  (Consistent-HypoP ?d ?h)
                  (fail (Exists-Multiple-Consistent-DataP ?h))
                  (fail (Exists-Inconsistent-DataP ?h))))
  :advice-template 
  ("Strong hypotheses and theories usually have a lot of data to support
them. However, this hypothesis has only one consistent data item. It
looks rather weak. Can you find more data for this hypothesis? Can you
find data that is against it?")
  :subsequent-advice
  ("This hypothesis has only one consistent data item. Could you find
more data for (or against) this hypothesis?")
  :advice-types '(evaluative incompleteness))
Table 1. Examples of Syntactic Advice Patterns.
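
For readers unfamiliar with the query notation in Table 1, the Python sketch below restates three of the patterns as plain functions over a toy diagram. It is a simplified reading, not a translation of the Loom queries: the diagram format is hypothetical, links are assumed to run from a data item to a hypothesis, the Multiply-LinkedP and Consistent-HypoP conditions are omitted because their definitions are not given here, and "identical evidence" is taken to mean the same set of supporting data.

```python
def hypotheses(diagram):
    """Identifiers of all hypothesis statements, in sorted order."""
    return sorted(s for s, kind in diagram["statements"].items()
                  if kind == "hypothesis")


def data_linked(diagram, hypothesis, link_kind):
    """Data items linked to a hypothesis by 'for' or 'against' links."""
    return {src for (src, dst, kind) in diagram["links"]
            if dst == hypothesis and kind == link_kind
            and diagram["statements"].get(src) == "data"}


def hypothesis_lacks_empirical_evidence(diagram):
    """Hypotheses with no data for or against them."""
    return [h for h in hypotheses(diagram)
            if not data_linked(diagram, h, "for")
            and not data_linked(diagram, h, "against")]


def confirmation_bias(diagram):
    """Hypotheses with several consistent data items and none against."""
    return [h for h in hypotheses(diagram)
            if len(data_linked(diagram, h, "for")) >= 2
            and not data_linked(diagram, h, "against")]


def discriminating_evidence_needed(diagram):
    """Pairs of distinct hypotheses supported by exactly the same data."""
    found = hypotheses(diagram)
    pairs = []
    for i, h1 in enumerate(found):
        for h2 in found[i + 1:]:
            support = data_linked(diagram, h1, "for")
            if support and support == data_linked(diagram, h2, "for"):
                pairs.append((h1, h2))
    return pairs
```

Each function returns the objects that would bind to the pattern variables, which is what allows the interface to highlight the statements an item of advice is about.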

The syntactic advisor also responds to illegal and incoherent constructions. "Illegal" constructions are those that use elements of the diagrammatic language in a manner inconsistent with their intended semantics. For example, a "support" link should not be used between data. "Incoherent" constructions are those in which the elements are each used legally, but in combinations that are semantically problematic. Examples include a loop of "support" links, or a datum that both supports and undermines the same claim. Some other advice given includes:

  1. Suggesting that a theory or hypothesis be formulated when none is present in the inquiry diagram.
  2. Asking whether the students can find empirical evidence relevant to a hypothesis that has no relations to empirical observations, or show that it predicts or explains an observed phenomenon.
  3. Asking whether the students can say how an unconnected statement relates to the rest of the diagram.
  4. Asking whether there is another theory that provides an alternate explanation for the empirical data when only one theory or hypothesis is involved in the inquiry diagram.
  5. Asking whether data can be found to discriminate between two theories that have identical support.
  6. Asking whether a "data" box that supports one hypothesis could possibly support another. ("Data" is a kind of "Empirical Observation." This is an instance of support-competitor.)
  7. Suggesting, under the "one swallow does not a summer make" rule, that a good hypothesis explains more than one datum, and asking whether the student can find more data.
The argument coach provides advice about relationships among statements, but has nothing to say about the contents of these statements. Thus, it provides feedback about the "grammar" of scientific discourse, but knows nothing about the meaning of the diagrams. Our system presently encodes twenty-three such structure-based rules, giving advice on a wide variety of situations involving statements about hypotheses and data, and relationships among these statements expressed with "for" and "against" links.
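
To make the purely structural character of these rules concrete, the following Python sketch re-expresses two of the Table 1 patterns (ALTERNATE-HYPOTHESIS and SWALLOW-DOES-NOT-A-SUMMER-MAKE) as plain functions over a diagram. It is an illustration only and is not the LOOM implementation: the Diagram class, the assumption that links run from a datum to a hypothesis, and all function names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Diagram:
    # Hypothetical representation: node id -> "hypothesis" or "data".
    nodes: dict = field(default_factory=dict)
    # Links are (source, target, kind) with kind in {"for", "against"}.
    links: list = field(default_factory=list)


def hypotheses(diagram):
    return [n for n, kind in diagram.nodes.items() if kind == "hypothesis"]


def data_linked(diagram, hypothesis, kind):
    """Data nodes attached to `hypothesis` by a link of the given kind."""
    return [s for (s, t, k) in diagram.links
            if t == hypothesis and k == kind and diagram.nodes.get(s) == "data"]


def alternate_hypothesis(diagram):
    """Fires when exactly one hypothesis is present in the diagram."""
    return ["ALTERNATE-HYPOTHESIS"] if len(hypotheses(diagram)) == 1 else []


def one_swallow(diagram):
    """Fires for a hypothesis with exactly one datum for it and none against."""
    return [("SWALLOW-DOES-NOT-A-SUMMER-MAKE", h)
            for h in hypotheses(diagram)
            if len(data_linked(diagram, h, "for")) == 1
            and not data_linked(diagram, h, "against")]


diagram = Diagram(nodes={"h1": "hypothesis", "d1": "data"},
                  links=[("d1", "h1", "for")])
fired = alternate_hypothesis(diagram) + one_swallow(diagram)
```

Neither function inspects the text of a statement; both depend only on node types and link kinds, which is why no domain knowledge engineering is needed for this class of advice.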

2.3.3.4 Consistency-Based Advice Strategies

Ideally, we would like to have an advisor that understands the students' text as well as the domain under discussion, and provides fully knowledge-based advice. This is not currently possible due to the difficulty of constructing domain knowledge bases and of understanding students' texts. Instead, we have adopted the strategy of investigating how much useful advice we can get out of a minimal semantic annotation before we move on to more complex approaches. In this manner we hope to better understand the cost/benefit tradeoff between knowledge engineering and added functionality.

The consistency-based advisor is our first step in this direction. It is intended to offer specific information that the student may not discover on her own. It makes two assumptions: students construct their inquiry diagrams from existing units of text, and these units are annotated with relationships recording whether they are consistent or inconsistent with each other, based on expert judgment. The advisor searches the latter "consistency graph" to find paths between units that students have used in their inquiry diagrams, and selects other units found along those paths which are brought to the students' attention. Our claim is that this enables us to point out information that is relevant at a given point in the inquiry process without needing to pay the cost of a more complete semantic model of that information.

2.3.3.4.1 Consistency and Transitivity
The consistency-based advisor is based on a comparison of the student's inquiry diagram with information derived from a teacher's or expert's inquiry diagram. For the purposes of advice-giving, Belvedere's various relations between argument components are classified simply as relations of inconsistency or consistency and are assumed to be symmetrical. Thus a "consistency" link means that the information in the connected nodes is at least compatible, and preferably one can be offered as evidence for the other. The link "and" is a concept-forming link: it defines an implicit node that is consistent with the conjuncts, and inconsistent with nodes that are inconsistent with either conjunct. More precisely, the consistency relation of and-links is based on the following rules:
  1. A&B is consistent with each of A and B individually.
  2. If C is inconsistent with A or inconsistent with B then C is inconsistent with A&B.
The converse of rule 2 does not hold: A&B can be inconsistent with C while both A and B are individually consistent with C.
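
The two and-link rules can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not the delivered code: relations are modeled as unordered pairs (frozensets) because the relations are assumed symmetrical, and the function name and the "A&B" naming of the implicit node are hypothetical.

```python
def expand_and_link(conjuncts, inconsistent_pairs):
    """Derive the relations of the implicit node defined by an and-link.

    conjuncts: the statements joined by the and-link, e.g. ("A", "B").
    inconsistent_pairs: known inconsistency relations, as 2-tuples.
    Returns (and_node, consistent_pairs, inconsistent_pairs_for_and_node).
    """
    conjuncts = tuple(conjuncts)
    and_node = "&".join(conjuncts)
    # Rule 1: the conjunction is consistent with each conjunct.
    consistent = {frozenset((and_node, c)) for c in conjuncts}
    # Rule 2: anything inconsistent with a conjunct is inconsistent
    # with the conjunction.
    inconsistent = set()
    for pair in inconsistent_pairs:
        outside = set(pair) - set(conjuncts)
        if len(set(pair)) == 2 and len(outside) == 1:
            inconsistent.add(frozenset((and_node, outside.pop())))
    return and_node, consistent, inconsistent


and_node, consistent, inconsistent = expand_and_link(
    ("A", "B"), [("C", "A"), ("D", "E")])
```

Here C is inconsistent with A, so it becomes inconsistent with A&B, while the unrelated pair (D, E) contributes nothing. Nothing is inferred in the converse direction, in keeping with the remark above.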

When the teacher or domain expert defines the task and the information needed, she can draw an inquiry diagram in Belvedere. The diagram is easily transformed into a set of consistency relations between pairs of texts, following the rules described above. These relations become the "expert model" or knowledge base for the consistency-based advisor.

During a student session, students can drag texts authored by the expert into their diagrams, and express argumentation relationships between them. The links in the student diagram are interpreted as consistency relationships in the manner described above. The advisor can then compare consistency relations defined by the students with consistency relations defined by the expert, and look for inconsistencies and other possible advice. The comparison is based on a graph-search algorithm, which has been implemented and tested as reported below. It searches in the expert's consistency graph for paths between nodes that have been related in the students' graph, constrained by the following rules:

  1. Only the shortest path between two nodes is considered.
  2. A "positive path" crosses only consistency links.
  3. A "negative path" ends with an inconsistency link.
  4. A path can cross only one inconsistency link. An inconsistency link ends a path.
  5. Conflicts between positive and negative paths are resolved in favor of the negative path.
The advisor can then select an item on the path found in the expert's graph that is not present in the students' graph, and present this to the student for consideration. If the path is of a different polarity than the students' link, the information presented could possibly contradict the relationship claimed by the students; if the polarity is the same the information would presumably support the relationship claimed by the students. The five rules presented above define a non-monotonic logic similar to Thomason's skeptical reasoning [Thomason 1992]. Rule 1 is used to control the search and to limit the length of meaningless paths. The "consistency relation" is weaker than logical implication. It can easily be the case that inconsistent statements are at the ends of a long chain of consistency links. Although limiting the search to the shortest path does not solve this problem, it greatly reduces the effect of long and meaningless paths.

Rules 2 to 4 are used to maintain the consistency of the path. Once a negative link is crossed, it is plausible that the two nodes at the end of the path are inconsistent. We cannot extend the path any farther, because whatever conclusion is drawn from there is quite arbitrary. For example, suppose that A is inconsistent with B, and B is inconsistent with C. We can't conclude that A is inconsistent with C: they could be consistent components of an argument against B. On the other hand, we can't assume that A is consistent with C: they could be arguing against B based on incompatible assumptions. Thus, Rule 4 forces the search to stop when a conflict is reached. Rule 5 is introduced to address the confirmation bias.
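
The following Python sketch shows one way the five search rules could be realized with a breadth-first search. It is an illustration under stated assumptions rather than the implemented algorithm: the adjacency-dictionary representation and all function names are hypothetical, and we assume that Rule 5 means a negative path is preferred whenever both a positive and a negative path exist.

```python
from collections import deque


def symmetric(pairs):
    """Build an undirected adjacency dict from a list of pairs."""
    adj = {}
    for a, b in pairs:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def shortest_paths(cons, incons, start, goal):
    """Return (positive_path, negative_path); either may be None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            continue  # a path ends at the goal, so do not search past it
        for nxt in sorted(cons.get(node, ())):  # Rule 2: consistency links only
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)

    def trace(node):
        path = []
        while node is not None:
            path.append(node)
            node = parent[node]
        return path[::-1]

    # Rule 1: breadth-first order yields a shortest positive path.
    positive = trace(goal) if goal in parent else None
    # Rules 3 and 4: a negative path crosses one inconsistency link, at its end.
    candidates = [trace(x) + [goal]
                  for x in sorted(incons.get(goal, ()))
                  if x in parent and x != goal]
    negative = min(candidates, key=len) if candidates else None
    return positive, negative


def expert_judgment(cons, incons, start, goal):
    positive, negative = shortest_paths(cons, incons, start, goal)
    if negative is not None:  # Rule 5 (as assumed here): negative wins
        return "inconsistent", negative
    if positive is not None:
        return "consistent", positive
    return "unknown", None


def unseen_items(path, student_nodes):
    """Units on the expert path that are absent from the students' diagram."""
    return [n for n in path if n not in student_nodes]


cons = symmetric([("A", "B"), ("B", "C"), ("A", "D")])
incons = symmetric([("C", "D")])
verdict, path = expert_judgment(cons, incons, "A", "D")
```

In this example the students have linked A and D. The expert graph contains both a direct positive path and a longer negative path through B and C, so under the assumed reading of Rule 5 the advisor would bring B and C to the students' attention.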

2.3.3.4.2 Formative Experiments
We conducted two preliminary experiments with the consistency-based advisor. In the first experiment we were interested in testing consistency relations that we expected to be difficult or that required some inferential power. We used a subset of the "iguana problem" knowledge base used in some of the studies with students. The subset comprised 19 nodes, 14 consistent and inconsistent relations, and 2 and-links. The three authors made judgments of consistency between pairs of statements corresponding to the nodes. Then we compared our judgments with the advisor's judgments. In all the relations about which all three authors agreed, the advisor made the same judgment. The only disagreements were on relations about which the authors disagreed. These cases were all characterized by the lack of a connecting path between the tested nodes. Either the search process was blocked by an inconsistency link, or a critical link was missing in an intermediate step of the process.



Figure 4. Example advice consistency path.

In the second experiment, we were concerned with the advice that would be given in a real interaction with students. We constructed a consistency graph of 90 statements and 73 relations from the materials used in one of the sessions with students, and performed consistency analysis on each link from two student sessions (see diagram in Figure 4). The performance was similar to the previous experiment. We always agreed with the system's judgment, and the intermediate steps were sequences of coherent proofs. On most of the links the advisor agreed with the students (these were among our best students). In one case only, the advisor gave a different judgment: see the support link in Figure 4. The path the advisor constructed starts at the "and" node, crosses the upper right and lower right nodes (not displayed in the students' graph), and ends at the lower left node. The advisor recognizes that this path (shaded) crosses an inconsistency link, and so conflicts with the students' support link. If the students asked the advisor for a critique of their argument, the advisor would highlight the link and display the node on the lower right (the only information on the path that they have not seen), confronting them with the conditions for land animals' migration which they overlooked.
2.3.3.4.3 "Snippets"
Students using Belvedere access information from a prepared domain-relevant database of HTML-based documents browsed with the Netscape application. As they do, they can add new statements to the diagram and show their relation to other statements by using "for", "against", and "and" links. Students have the choice of providing the contents of each statement either by typing in text or by selecting pre-coded HTML-based "snippets." A "snippet" is a unit of text that contains a statement or set of statements that can be used by students conducting a scientific inquiry. Typically, a snippet is a few sentences in length (Figure 5). To encourage deep processing, neither the contents of the snippet nor a predefined label is provided. Students must type in a short summary and identify the snippet as a "hypothesis", "data", or "principle" before it is sent to an "In-Box" in the inquiry diagram.



Figure 5. "Snippets" may be referenced by clicking on document icons.

We have presently encoded extensive domains of documents on two topics into HTML-based materials databases: What caused mass extinctions? and What advice should be given to a person with a family history of a genetic disorder? The Mass Extinctions database, for example, contains about one hundred snippets. A student or teacher would select a pre-configured "snippet" from an HTML document while browsing the materials database using the Netscape browsing tool. Once such a snippet has been selected, a Java-based "applet" is activated that notifies the Belvedere client tool that the snippet has been selected. A visual icon known as the "In-Box" in the Client is highlighted, indicating that the material from this snippet can be relocated into the inquiry diagram. A list of such selections is maintained in this In-Box. Students also have the option to discard snippets present on the In-Box list. A dialogue box is again provided in which the student can check the snippet summary and modify its type, if necessary. When the student retrieves the snippet from the In-Box, the student-written label (summary) appears in the appropriate hypothesis, data, or principle shape as defined by the student.

2.3.3.4.4 Implemented Expert Coach
The delivered version of Belvedere integrates the LISP-based argument and expert coaches which run on a Sun Microsystems-based server, with Java-based client software running on Intel and Macintosh platforms. These machines are connected via a TCP/IP-based communications link (Figure 3). The relationship between the server-based coach and clients is one-to-many. The expert coach requires two knowledge bases, the student inquiry diagram and a corresponding expert inquiry diagram. Both kinds of diagrams are maintained in a Postgres relational database. At load time, these diagrams are read from the Postgres server into a LISP-based LOOM knowledge base and instantiated as LOOM objects. LOOM is a knowledge representation system developed and maintained at the Information Sciences Institute [Bates 1987]. A set of expert diagrams is maintained by the expert (e.g., a teacher) and provides a canonical depiction of the teacher's mental model of the domain. The student and expert diagrams can consist of both snippets and "non-snippets"; i.e., text contents that are not predefined and so are not known to the expert coach.

The student cannot see the expert diagram during a session. The expert diagram is thought of as a "read-only" entity and is configured by the teacher before an inquiry session begins. The student diagram is dynamic; each time a change occurs in a student diagram, the change is noted by the expert coach and the LOOM knowledge base is updated with the new information.

As students construct an inquiry diagram, they may include snippets from the materials database. The expert coach is utilized only when a student assigns a relationship between two snippets with a "for", "against", or "and" link. The expert coach can only provide advice about what it knows. Thus, if non-snippets are introduced into the student diagram the expert coach will be unaware of their existence, and it will have no advice about non-snippets created by the students. In that case, the argument coach still subjects the non-snippet objects to analysis and can respond. Meta-rules manage the advice from two different coaches before the advice is passed on to the student. As advice is generated from each coach, it is maintained on a list which is then subject to a recursive multi-keyed "preference" sort. For instance, expert advice is provided first, multiple instances of the same advice are reduced to single instances, and argument advice is sorted according to type, ranging from "getting started" (e.g., the inquiry diagram is empty) to "advanced" (e.g., a swallow does not a summer make). In this fashion, we envision future arbitration schemes to manage several such knowledge sources.
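
The arbitration step can be sketched in Python as follows. This is a simplified stand-in for the meta-rules, not the delivered code: it uses a single stable sort on a tuple key in place of the recursive multi-keyed sort, and the dictionary fields, the LEVELS ordering, and the advice names are hypothetical.

```python
# Hypothetical ordering of argument-advice types, from basic to advanced.
LEVELS = ["getting-started", "intermediate", "advanced"]


def arbitrate(advice):
    """Merge advice from both coaches into one ordered, duplicate-free list."""
    seen = set()
    unique = []
    for item in advice:
        key = (item["source"], item["name"])
        if key not in seen:  # repeated advice is reduced to one instance
            seen.add(key)
            unique.append(item)

    def preference(item):
        is_argument = item["source"] != "expert"  # expert advice sorts first
        level = LEVELS.index(item.get("level", LEVELS[0]))
        return (is_argument, level)

    return sorted(unique, key=preference)


advice = [
    {"source": "argument", "name": "SWALLOW", "level": "advanced"},
    {"source": "argument", "name": "EMPTY-DIAGRAM", "level": "getting-started"},
    {"source": "expert", "name": "CHECK-PATH"},
    {"source": "argument", "name": "SWALLOW", "level": "advanced"},
]
ordered = [item["name"] for item in arbitrate(advice)]
```

Because the sort key is a tuple, further knowledge sources could be accommodated by adding key components, which is one way to read the arbitration schemes envisioned above.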

The expert coach has been implemented with a best-first heuristic search to determine the optimal path from the start node to the goal node in the expert diagram using the cost function

f(n) = g(n) + h(n)
where g is the distance of the path from the current node n in the graph back to the start node, and h is a heuristic estimate of the distance from the current node n to the goal [Cohen 1989]. The heuristic is articulated as follows: If the student has indicated a "for" link, all paths in the expert diagram which contain "against" links will be given shorter distances than paths with "for" links. Likewise, if a student has indicated an "against" link, all paths in the expert diagram which contain "for" links will be given shorter distances than paths with "against" links. In this fashion, as the best-first queue is sorted in non-decreasing order, the shorter paths will be sorted according to distance to the beginning of the queue and be favored at each new iteration of the search. The following rules also constrain the search of the expert coach:
  1. The start and goal statements in the student diagram must be snippets and must also exist in the expert diagram.
  2. The shortest path between the start and goal is always considered.
  3. If the student has connected two snippets with a "for" link, and there is a path from the start to the goal in the expert inquiry diagram with an "against" link in it, provide feedback about the contents of the two statements connected by the "against" link. The "against" link in the path closest to the start node is always preferred.
  4. If the student has connected two snippets with an "against" link, and there is a path from the start to the goal in the expert inquiry diagram that consists entirely of "for" links, provide feedback about the contents of each statement in that path.
  5. If either of the statements found in a path in the expert diagram is the start or the goal, do not provide feedback about it.
  6. If a student has indicated a "for" link between the start and goal and the expert diagram has a direct (i.e., no intervening links) "against" link between the start and goal, ask the student to consider the ramifications of an "against" link.
  7. If a student has indicated an "against" link between the start and goal and the expert diagram has a direct (i.e., no intervening links) "for" link between the start and goal, ask the student to consider the ramifications of a "for" link.
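
As an illustration of a search ordered by f(n) = g(n) + h(n), the Python sketch below uses a priority queue. The report does not give the exact form of h, so the heuristic here is our own assumption: h is 0 once a path contains a link contrary to the student's link and a small constant (bias) otherwise, which favors contrary paths among paths of similar length. All names are hypothetical, only rules 3 and 5 are covered by the feedback helper, and the enumeration of simple paths is practical only for small diagrams.

```python
import heapq
from itertools import count


def best_first(links, start, goal, student_link, bias=0.5):
    """links: dict node -> list of (neighbor, kind), kind "for" or "against".

    Returns a list of (node, kind_of_link_used_to_reach_it), or None.
    """
    contrary = "against" if student_link == "for" else "for"
    tie = count()  # tie-breaker so the heap never compares paths
    queue = [(bias, next(tie), 0, start, ((start, None),), False)]
    while queue:
        f, _, g, node, path, has_contrary = heapq.heappop(queue)
        if node == goal:
            return list(path)
        for nxt, kind in links.get(node, ()):
            if any(nxt == visited for visited, _ in path):
                continue  # keep paths simple (no cycles)
            flag = has_contrary or kind == contrary
            h = 0.0 if flag else bias  # assumed heuristic, see lead-in
            heapq.heappush(queue, (g + 1 + h, next(tie), g + 1, nxt,
                                   path + ((nxt, kind),), flag))
    return None


def feedback_nodes(path, start, goal):
    """Rule 3 with rule 5: statements joined by the first "against" link,
    excluding the start and goal statements."""
    for (a, _), (b, kind) in zip(path, path[1:]):
        if kind == "against":
            return [n for n in (a, b) if n not in (start, goal)]
    return []


links = {
    "A": [("B", "for"), ("C", "for")],
    "B": [("D", "for")],
    "C": [("D", "against")],
}
path = best_first(links, "A", "D", student_link="for")
```

Both routes from A to D have two links, but the one through C contains an "against" link, so it receives the lower f value when the student has drawn a "for" link. Statement C would then be offered as feedback.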
2.3.3.4.5 Reimplementation in Java
We subsequently ported the LOOM LISP-based coaching code to the Java programming language. The bulk of the student software was already written in Java, and we wanted to achieve the platform-independence that Java affords with as many components of the system as we could. The LOOM run-time image consumes tens of megabytes of memory and is platform-dependent. We selected the "Java Expert System Shell" (JESS), which is a rule-based inference engine utilizing the Rete pattern-matching algorithm [Friedman-Hill 1997]. JESS is written entirely in Java. Twenty-one of the twenty-three LOOM-based rules for the argument coach have been re-implemented in JESS.

2.3.3.5 Coaching Status and Future Directions

The syntax-based advisor can make suggestions to stimulate students' thinking with no knowledge engineering required on the part of the teacher or domain expert. However, the advice is very general and does not adequately address the confirmation bias. The consistency-based advisor can provide students with assistance in identifying relevant information which students may not have considered (perhaps due to the confirmation bias), and which may challenge their thinking. This advice cannot be provided by the syntax-based advisor, because the advice depends on knowledge of certain semantic relationships between the textual units involved. The level of "understanding" of the texts required on the part of the system is extremely minimal: this is an advantage, as it reduces the knowledge engineering demands on educators preparing materials for students. Clearly, a minimal semantic approach has limitations. For example, it cannot infer the goals of the student, in particular which theory she is trying to build or support. The advisor cannot help the student construct an argument, find a counterargument that attacks her theory, or engage the student in a scientific discussion. However, by investigating the utility of advice obtained from these minimal semantic annotations, we hope to gain insights that will help us move toward more complex approaches and to better understand the cost-benefit tradeoff between knowledge engineering and added functionality.

Although we have selected an appropriate level of representation, the snippet, to allow the student to access domain-relevant material, we are considering the pedagogical value of both a finer and a coarser grain size. A finer grain would reduce ambiguity and increase the accuracy of feedback. On the other hand, a coarser grain, i.e., at the level of a normal paragraph, or of a typical Web document, would enable quicker authoring of the Web-based materials described earlier. Currently the expert's specification of the relations is a major bottleneck for complex domains. The model of coaching with a larger grain size would be an "FYI" coach, which would function like a research librarian forwarding new information to those likely to be interested in it. It would still be possible to specify "for" and "against" relations in a general sense, just as a paper can give evidence for or against a particular view. However, coarse-grained representation has obvious limitations. For example, it is important for students to learn that one can often extract evidence for a view from a context that is generally unfavorable. Indeed, scientific papers are obliged to take note of divergent views and limitations. We are also considering exposing the student to sub-graphs of the expert diagram. We are exploring models of learning and cognitive/perceptual mapping for the novice and expert, regarding the information realized in the diagrams and the Web-based materials [e.g., Petre 1993].

2.3.4 Design of Classroom Implementation

Technology has the potential to transform education, not just by providing students with an opportunity to learn the tools of the modern workplace, nor simply by automating aspects of the educational process. Its greater potential lies in the ability to change the organization of classes, from teacher-centered didactic instruction to student-centered collaborative inquiry [Cummins, 1988; O'Neill & Gomez, 1994; Scardamalia & Bereiter, 1991]. Properly designed technology supports and facilitates collaborative approaches to learning that are recommended by numerous researchers [Johnson & Johnson, 1989; Kuhn, 1993; Slavin, 1990; Webb, 1989]. However, this potential is not an attribute of technology in itself. Computer supported collaborative learning (CSCL) technology will have an impact only if it is designed along with methodologies and materials that provide support for teachers who are learning to implement nontraditional activities in their classrooms, and address concerns such as integration with the curriculum and effective utilization of inadequate computer resources. In this section we describe the support we provided.

Consistent with these views, pilot studies with Belvedere [Suthers & Weiner, 1995] indicated that there was a need to structure the roles and activities of students working with Belvedere (see also [Waugh & Levin, 1988]). With DoDEA teacher colleagues, we developed a classroom implementation methodology focused on collaborative problem solving by small groups of students. The methodology calls for changes in the classroom environment, teacher's role, curriculum materials, student activities, and assessment methodology. Students work in teams to investigate real-world "challenge problems," designed to match and enrich the DoDDS curriculum with attention to National Science Education Standards [National Academy of Sciences, 1996]. The teams plan their investigation, perform hands-on experiments, analyze their results, and report their conclusions to others. Our classroom activity plans provide teachers with specific guidance on how to manage these activities with different levels of computer resources. Teachers and students are provided with assessment instruments designed as an integral part of the curriculum. Assessment rubrics are given to the students at the beginning of their project as criteria to guide their activities. They guide peer review, as well as helping the teacher assess nontraditional learning objectives. In this section we describe and comment on this methodology as it was carried out in our most exemplary case, in a Wurzburg general science class.

2.3.4.1 Classroom Environment

The traditional teacher-centered environment was changed to one that is more suitable for group work. Five computer stations and five tables for hands-on investigations were set up around the classroom. The computer stations became the center for collaborative exploration of Web-based curriculum materials, use of computer simulations and data analysis tools, and use of the Belvedere environment for recording results and their significance. The tables became centers for experiments with hands-on manipulatives and for paper-based work, including peer review. In less technology-rich environments, students can share work across time periods by successively working on and storing diagrams.

2.3.4.2 Teacher's Role

The teacher shifted toward the role of facilitator of student inquiry, moving among workstations, guiding student work and offering individual help. Teachers' transition into this new role was supported by involving them in the development of student activity plans for their classes during our STS2 teacher training workshop. Teacher involvement provides a sense of ownership, helping to motivate the change in how they facilitate learning, and customizes the plans for different classroom contexts. We provided additional support in a form of cognitive apprenticeship [Collins, et al. 1989], by conducting several classes with Belvedere activities ourselves. The teacher assumed increasing responsibility over time, both within each class and across classes. Where developer modeling is not available, electronic discussions and peer mentoring may help teachers support each other in new practices.

2.3.4.3 Curriculum Materials

Students learn to conduct critical inquiry by being presented with real-world problems. Towards this end, we developed Web-based curriculum modules,[5] treating controversial issues such as genetic testing, or scientific problems under active investigation such as mass extinctions. The modules take into account the National Science Education Standards [National Academy of Sciences, 1996], local curricular standards, and teacher suggestions. The modules present students with authentic problems in which good solutions require consideration of multiple viewpoints and the use of evidence collected from various sources of information.



Figure 6. Web-based Materials for Challenge Problem


As shown in Figure 6, two menus are provided with the Web-based materials. A domain-independent menu (left side) guides students through five phases of inquiry, providing suggestions on how to conduct scientific inquiry and how to use the Belvedere software in this process. Another menu (bottom) provides domain-specific links organized in a manner relevant to the phases of inquiry. For example, students are provided with a link to a glossary of terms; access to simplified versions of articles on scientists' hypotheses, methodology, and field reports; and a link to experiments involving both hands-on manipulatives and computer simulations. The Web pages contain "reference" icons resembling text pages (two are seen in Figure 6, one preceding each paragraph of text), which enable students to send text found on these pages into the inquiry diagram's "In-Box."

2.3.4.4 Student Activities

In our exemplary case, the activities began with ourselves or the teacher modeling the use of inquiry diagrams to the whole class, using a simple everyday example such as reasoning about why a friend's coat is wet. Then groups of 4-6 students were formed, each working with a computer. After exploring background information on the science problem and choosing hypotheses to investigate, each group was divided. One pair or triad of students conducted hands-on experiments, recorded their results, and discussed findings. The other pair or triad of students continued to investigate the computer based articles and simulations. The full group then reassembled in front of their computer to share the results of their work, and record the results and interpretation of their experiences in their inquiry diagrams (e.g., Figure 1). Finally, the student team prepared a written report to be presented to other teams. In a one-computer classroom, computer access can be interleaved with hands-on activities.


What you learn  How you learn it   How you tell how well you learned
to do

To formulate    Create Belvedere   Poor   The inquiry diagram contains one
and revise      inquiry diagrams          appropriate hypothesis and no
scientific      that record               related data.
explanations,   different          Fair   The inquiry diagram shows one
and to use      hypotheses about          appropriate hypothesis and one
evidence to     a problem,                data supporting it.
develop a       different data     Good   The inquiry diagram shows one
logical         that can help             hypothesis with the use of
argument.       you decide                evidence for it as well as against
                between the               it.
                hypotheses, and    Good   The inquiry diagram shows several
                the relationships         hypotheses each connected to
                between the data          multiple pieces of data.
                and hypotheses.    Great  The inquiry diagram shows multiple
                                          hypotheses with the use of
                                          evidence for as well as against
                                          each of these hypotheses.
                                   Great  The inquiry diagram indicates
                                          additional information the student
                                          would look for to support or to
                                          refute explanations.

To develop a    Find out what      Poor   The inquiry diagram only contains
model that      specialists in            information that is drawn from
integrates      different                 personal experience or speculation.
concepts from   disciplines        Good   The inquiry diagram contains
multiple        think of the              references to information from
domains with    problem.                  only one discipline, for example
different kinds Look for                  Geology, Physics, Chemistry, or
of data.        information from          Biology.
                different          Good   The information in the inquiry
                resources, such           diagrams comes from one kind of
                as on-line and            resource, for example only from
                library                   experiments, field observations,
                articles,                 or articles.
                experiments you    Great  The inquiry diagram contains
                do, and field             references to information from
                observations.             multiple disciplines such as
                                          Geology, Physics, Chemistry, and
                                          Biology.
                                   Great  The information in the inquiry
                                          diagrams is drawn from multiple
                                          resources, such as experiments,
                                          field observations, and articles.

Figure 7. Sample Assessment Rubrics

2.3.4.5 Supporting Peer Evaluation with Performance-based Assessment

The value of peer coaching in an unfamiliar practice can be limited by students' lack of knowledge of the criteria for excellent performance. Additionally, traditional assessment, considered to be the final step of instruction, does not measure inquiry skills effectively. We address both of these problems with performance-based assessment "rubrics" that we developed to guide self- and peer-assessment of critical inquiry, as well as to facilitate teacher assessment of student work. The rubrics are provided to students at the beginning of their research. They indicate expectations, show successful methods for progressing with inquiry, and give examples of excellent and poor performances, thus guiding peer assessment during collaboration. A sample is shown in Figure 7. The rubrics take into account NSES standards for content objectives and outcome skills to be measured, and use the methodology outlined in the New Standards: Performance Standards project [National Center of Education and the Economy, 1995] for evaluating student-generated artifacts and performances.

2.3.5 Collaborative Activities Funded

The CAETI program assembled the very best of our nation's researchers in technology for innovative approaches to education, affording a prime opportunity for collaborations. In this section we report briefly on three such collaborations that took place during and were supported by CAETI.

2.3.5.1 Belz and Luckham: Architecture Abstraction Hierarchy Reference Model

Under CAETI funding, participants David Luckham and Frank Belz developed and promoted an abstraction hierarchy for describing and modeling CAETI architectures, in particular a MOO architecture [Luckham et al. 1997]. Dan Suthers (the author of this report) recognized the value of this abstraction hierarchy in clarifying our own work on architectures, and brought it to the attention of the P1484 Working Groups on Standards for Learning Technology.[6] This led to a collaboration between Frank Belz, Dan Suthers, and Tom Wheeler of Army CECOM (one of the founders of P1484) towards a refined abstraction hierarchy. Our collaboration resulted in the addition of the Application Model layer, and in the refinement of Luckham et al.'s "User-Interface" and "Concepts of Operations" into the "Interaction Model" and "Conceptual Model." All other aspects of the following are to be credited to David Luckham and Frank Belz, not the present project.
                 Application Model                   
               Abstract System Model                 
    Interaction Model      Conceptual Model           
           Abstract Implementation Model             
                  Resource Model                     

Figure 8. Architecture Reference Model.

The Architecture Reference Model is a hierarchy of "architectural abstraction levels" that appear to be useful for developing and organizing architectures of advanced educational software applications, including computer-aided instruction (CAI), intelligent learning environments (ILE), and intelligent tutoring systems (ITS). It is a descriptive and modeling tool, but is not itself an architecture. Different kinds of concepts and components are used for defining architectures at each level of the hierarchy. Each level represents a different way of thinking about an architecture. A given system would have a complete description at each level. The levels are illustrated in Figure 8 and summarized below.
Application Model:
This level describes the application domain and identifies the concepts used to define the domain and to organize work in the domain. This level is described largely in its own terms (as practitioners view it), and includes theories and analysis of the domain, educational objectives (or other task objectives), task analysis, and identification of the objectives and tasks which the software is expected to support.
Abstract System Model:
This level describes the user's (and designer's) interactive and conceptual model of the system; the "high level" architecture of the system [Brooks], defining the overall effect of the system on the user, its function and its character. This level is divided into two distinct aspects to distinguish two essentially different sets of issues; for humans, things and models of them have not only an immediate, perceivable, manipulable part but also an indirect, understandable part.
Interaction Model:
A view describing the perceptual characteristics of the system, portraying the system and its functionality to the user, along with the corresponding facilities of the system to interact with the user. This level addresses all perceptual concepts pertinent to the system and defines the perceivable entities chosen to make up the interface and interactions available to the user with these entities. It is expected that this level will reify or otherwise reflect the pertinent perceptual and interactive aspects and components of the Application Model.
Conceptual Model:
A view describing the inherent functionality and character[7] of the system; the way that the user conceives of the system. It also delineates the means by which the software provides for modeling the application and application domain, i.e. the software strategy. It does this, for example, in terms of the style or type of system, along with the user "visible" classes of objects and the operations that can be performed on them. The Conceptual Model parallels the desired subset of a given Application Model, defining abstractly the automation strategy (theory) chosen to realize the system. It is the way the user conceives of the system and how it operates.
Abstract Implementation Model:
This level describes the structural framework of the implementation, delineating architectural elements and communication between these elements, including software modules such as interpreters, databases, event managers, etc., and data and control flow between them. This level describes the high level implementation architecture[8] of the system. The Interaction Model and Conceptual Model entities and interactions are allocated (mapped) to events and representations in portions of the Abstract Implementation Model during the modularization part of the software design process.
Resource Model:
This level describes the resources required to actually implement the system according to the Abstract Implementation Model. This level enables explicit description of implementation and performance issues (and models) used in designing the performance (and other non-functional characteristics) of the system. This level addresses implementation issues that affect design trade-offs, and therefore can profitably be addressed prior to implementing the system. Issues considered here include physical architecture, location of services, algorithm complexity, resource constraints, bandwidth, latency, etc.
We relied on this work in our architecture description of Section 2.3.2.

2.3.5.2 Forbus and Koedinger: Integrated Feasibility Demonstration of Science Learning Spaces

In collaboration with CAETI contractors Ken Forbus and Ken Koedinger, and assisted by Danny Bobrow, Mark Shirley, and Bob Balzer, we participated in the third Integrated Feasibility Demonstration, which we called the "Science Learning Space Demonstration." We demonstrated the feasibility of composing three different existing, independently developed components into a "Science Learning Environment" with integration at the semantic level. Two of the components were complete intelligent learning environments in their own right: Active Illustrations [Forbus, 1997] enables learners to experiment with simulations (in the demonstration, global climate), and to receive explanations concerning the causal influences behind the results. Belvedere [Suthers & Jones, 1997; Suthers et al., 1997] provides learners with an "evidence mapping" facility for recording relationships between statements labeled as "hypotheses" and "data". A Scientific Argumentation Coach [Paolucci et al., 1996] guides students to seek empirical support, consider alternate hypotheses, and avoid confirmation biases, among other things. The third component was an instance of a model-tracing Tutor Agent [Ritter & Koedinger, 1997] that contains a cognitive model of general experimentation and argumentation process skills. This model was used by the Tutor Agent to provide assistance and feedback to the learner through both a Message window and a Skillometer window showing performance on a number of subskills. Using a MOO [Bobrow and Shirley] as a communication infrastructure, we demonstrated a scenario in which a student poses a hypothesis in the Belvedere evidence-mapping environment, uses the simulation to test that hypothesis in the Active Illustration environment, and sends the results back to Belvedere for integration in the evidence map. Throughout this activity the Belvedere Argument Coach and the Experimentation Tutor Agent monitored student performance and provided assistance.
A screen from our demonstration is shown in Figure 9.


Figure 9. Science Learning Space Demonstration.

2.3.5.2.1 Demonstration Architecture
The architecture is shown in Figure 10. Forbus had already made use of the MOO for communication between the Active Illustration simulation engine and a simulation user interface (bottom right of Figure 10). This use of an open communication "bus" - the MOO - made it easy to add a Tutor Agent to monitor student performance as the basis for providing context-sensitive assistance. Koedinger employed the plug-in tutor agent architecture described in Ritter & Koedinger [1997], which recommends the use of a simple translator component (small box upper right of Figure 10) to manage the communication between the tools and tutor agents. The translator watched for messages between the Simulation Interface and the Active Illustration server, extracted messages indicating relevant student actions, and translated these student actions into a form suitable for the Tutor Agent's rule-based model tracing engine.



Figure 10. Learning Space Demonstration Architecture.

Subsequently, we added the Belvedere system and Argumentation Coach (left side of Figure 10). Belvedere's communication architecture (abstracted in Figure 10 as the "BORBI" but described more fully in section 2.3.2.4) is itself capable of supporting component-based composition of functionality. However, we decided to use the MOO due to its prior use in the first demonstration. Integration of the Belvedere subsystem into the MOO required the addition of one translator component (the other small box in the figure): no modification to Belvedere itself was required. The translator watched the MOO for Hypothesis and Simulation Run objects sent by the Simulation Interface. When seen, these were converted to Belvedere Hypothesis and Data objects and placed in the user's "in-box" for consideration.
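
The translator behavior described above can be illustrated with a short sketch. This is not the project's code: the message dictionaries, the `kind`, `label`, and `id` fields, and the names `BelvedereObject` and `BelvedereTranslator` are hypothetical stand-ins for the MOO objects and the Belvedere in-box, whose actual formats are not given in this report.

```python
from dataclasses import dataclass, field

# Hypothetical mapping from object kinds seen on the MOO to Belvedere
# statement types. The real object formats are not specified here.
KIND_TO_BELVEDERE = {"Hypothesis": "hypothesis", "SimulationRun": "data"}


@dataclass
class BelvedereObject:
    statement_type: str  # "hypothesis" or "data"
    text: str            # label shown to the learner
    source_id: str       # identity of the originating MOO object


@dataclass
class BelvedereTranslator:
    inbox: list = field(default_factory=list)  # stand-in for the user's in-box

    def on_moo_message(self, message: dict) -> bool:
        """Translate a relevant MOO object and place it in the in-box.

        Returns True if the message was translated, False if it was ignored.
        """
        statement_type = KIND_TO_BELVEDERE.get(message.get("kind"))
        if statement_type is None:
            return False  # not a Hypothesis or Simulation Run object
        self.inbox.append(
            BelvedereObject(statement_type, message["label"], message["id"])
        )
        return True
```

Because all conversion logic lives in the translator, neither the sender nor Belvedere needs to know the other's representation, which is consistent with the observation that no modification to Belvedere itself was required.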

2.3.5.2.2 Lessons Learned from the Integration Demonstration
From this experience we learned a number of lessons. First, the open client-server architectures of Active Illustrations and Belvedere greatly facilitated composition of the learning space. Second, semantic interoperability is a significant issue. Much of our communication during development was in effect a process of negotiating an informal shared ontology. The process might have been more efficient and involved fewer misunderstandings if a standard ontology or even a reference vocabulary had been available and known to all. However, we realized that ontologies could not address the fundamental mismatch between systems with different representational requirements. For example, the Active Illustrations simulation and Simulation Interface communicated in terms of individual parameter settings that define a simulation run, while the Belvedere evidence mapping facility needed to treat each entire simulation run as a single empirical unit. Third, translators are part of the solution: they enable components to use their own representational systems and still communicate. Fourth, there are semantic "coupling" issues that translators cannot solve. This requires more explanation.

We initially considered placing the burden of solving this problem on the Belvedere-MOO translator, to avoid the need to modify any of the components or their interfaces. The translator would aggregate individual parameter setting and simulation run events into "data" objects that record the results of the run. These data objects would then appear automatically in Belvedere's in-box. However, focusing on the needs of the learner, we elected to follow a different approach, for three major reasons. (1) Not all simulation runs will be informative enough to use. We wanted to avoid cluttering the in-box with many objects of little use. (2) We wanted to encourage the learner to reflect on which runs were worth recording, by requiring that the learner make the decision of which to record. (3) The learner needs to make the connection between her experiences in the simulation environment and the representational objects that she manipulates in Belvedere. Hence the aggregated objects representing simulation runs should be created and given visual identities recognizable to the learner while still in the simulation environment.

The Simulation Interface already enabled the user to provide textual labels for simulation runs. We modified the Simulation Interface to provide a facility for broadcasting labeled simulation run summary objects to the MOO (and hence to the Belvedere in-box), thereby enabling the learner to select relevant results without leaving the simulation context. We also added a similar facility for hypotheses created in the Simulation Interface.
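
The approach we adopted can be sketched as follows. This is an illustration under stated assumptions, not the code of the Simulation Interface: the class names, the `summary` and `broadcast` methods, and the use of a plain list as the communication bus are all hypothetical. The point shown is that aggregation happens at the source, and that a run summary is sent only when the learner selects it by its label.

```python
from dataclasses import dataclass, field


@dataclass
class SimulationRun:
    label: str  # textual label supplied by the learner
    parameters: dict = field(default_factory=dict)
    results: dict = field(default_factory=dict)

    def summary(self) -> dict:
        """Aggregate the whole run into a single labeled object."""
        return {
            "kind": "SimulationRun",
            "label": self.label,
            "parameters": dict(self.parameters),
            "results": dict(self.results),
        }


class SimulationInterface:
    def __init__(self, bus: list):
        self.bus = bus   # stand-in for the MOO (assumption)
        self.runs = {}   # runs recorded so far, keyed by label

    def record_run(self, run: SimulationRun) -> None:
        self.runs[run.label] = run

    def broadcast(self, label: str) -> bool:
        """Send only the run the learner chose; nothing is sent automatically."""
        run = self.runs.get(label)
        if run is None:
            return False
        self.bus.append(run.summary())
        return True
```

Runs that the learner never selects stay local to the simulation environment, so the in-box receives only objects the learner has already named and judged worth keeping.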

This experience illustrates some tradeoffs and limitations of a purely "plug and play" approach to component based systems, while also showing that there is hope given further research. We showed not only how to reduce the effort required to "hook up" diverse components, but also the value of sharing semantics between applications. Information objects created with empirical and theoretical identities in one application (Active Illustrations) retained that identity in how they were treated in another application (Belvedere and its Argumentation Coach). Furthermore, a third component, the Tutor Agent, treated these objects as having the same semantics in both situations. Consistent treatment of the learner's representations by different software agents reinforces the semantics that we want learners to reflect upon and manipulate. Perhaps most significantly, the contextual semantics of these objects accumulate as they are used: an object, viewed in one context (e.g., evidence maps), "stands for" the learning interactions centered on it in another context (e.g., the simulation). Critical to this accumulation of contextual semantics is persistence of identity. Special attention was required to ensure that the learner "sees" the thing that shows up in Belvedere as the same object she constructed in Active Illustrations.

2.3.5.3 Workshop: Architectures and Methods for Designing Cost-Effective and Reusable ITS

While supported by CAETI funding, the author of this report organized (with Brant Cheikes, Neil Jacobstein, and Tom Murray) a workshop at the 3rd International Conference on Intelligent Tutoring Systems (ITS'96), entitled "Architectures and Methods for Designing Cost-Effective and Reusable ITSs". This highly successful workshop attracted over 40 international participants, several of whom were CAETI-funded, and led to subsequent fruitful collaborations. Five working groups were formed: Exploring Industry-Standard Architectures; Communication Architectures for ITS Components; Shared Vocabularies for Representing Pedagogical Knowledge; ITS Shells and Generic Task Domains; and Using the World Wide Web to Support ITS. Further information on this workshop can be obtained from http://advlearn.lrdc.pitt.edu/.


2.4 Results

2.4.1 Highlights of Project

2.4.1.1 Cognitive/Education Highlights

2.4.1.2 Technology Highlights

2.4.1.3 Implementation Status and Platforms

At the conclusion of the CAETI program, the software was in a late-beta stage: robust enough for deployment in schools overseas, but not ready to be treated as a user-maintainable product.
2.4.1.3.1 Belvedere
Inquiry Diagram tool and associated Chat facility implemented in Java. Runs on Windows 95/NT, Solaris 2.4, and Mac OS.
2.4.1.3.2 Collaborative Database Server and Connection Manager
Implemented using the Postgres DBMS, Java, CGI, and the Netscape web server. Runs on Solaris 2.4.
2.4.1.3.3 Coach ("Idea Generator")
Implemented in Harlequin Lispworks. Runs on Solaris 2.4. Reimplementation in Java began at end of project.

2.4.1.4 Maturity and Deployability

2.4.1.5 Openness and Interoperability

2.4.1.5.1 Points of Interoperability:
2.4.1.5.2 Limitations on Interoperability:

2.4.1.6 (Re)Applicability

Current inquiry diagram and coach configuration is applicable to engaging students in the argument construction and evaluation phase of critical inquiry in a variety of science, social science, and other subject matter areas.

Extended configuration to be available early 1997, including inquiry diagrams, concept maps, causal loop diagrams, influence diagrams, and plan diagrams, will extend applicability to other phases of critical inquiry as well as other applications beyond education, including risk assessment, planning, and job skills analysis.

2.4.1.7 Venues and Collaborations

2.4.1.8 Future Research

2.4.1.8.1 Technology Research
2.4.1.8.2 Cognitive/Education Research

2.4.2 Results of Independent Evaluations

Belvedere was used in the first semester of 1997 by 5 teacher participants in 4 Department of Defense Dependents' Schools (DoDDS) in Germany and Italy. The classes included 9th grade Science, and 9-12th grade Physics, Chemistry, and Science and Technology. During this time, evaluation of the Belvedere classroom implementation was conducted by a third party evaluator, Dr. Lynne Gilfillan, who was under contract with the CAETI program to perform this evaluation in the DoDDS testbed.

2.4.2.1 Evaluation Methodology

Dr. Gilfillan used classroom observation forms focused on CAETI program objectives and the use of CAETI infrastructure. She also videotaped selected classroom sessions. We provided her with additional observation forms to record the activities of teachers and students, and their use of components of our software and methodology. The location of schools prevented extended observations on our part, but analysis of these forms, along with analysis of student-generated artifacts (such as inquiry diagrams, Excel graphs, and student reports) for learning gains, is ongoing [18].

2.4.2.2 Summary of Evaluation Results

The independent evaluator's report discusses effects of the Belvedere approach on the general nature of student activity, on teacher roles and on the classroom environment.

Observations of student activity show that students were engaged and on task during the collaborative problem-solving situations presented to them by the Belvedere comprehensive approach. Teachers indicated that the approach enhanced students' ability to engage in collaborative tasks.

"Classroom observations of teachers and students using Belvedere show that it is being used to support cooperative problem solving, with students working in groups of 2 to 4 students. Students appeared to be engaged and on task. Teachers report that it is easy to use, and they find that it enhances students ability to engage in cooperative work, and to address scientific hypothesis testing in an organized and analytical way."[9]
Students also found the activity structure easy to follow and helpful in integrating work with the use of various software tools and information resources such as the World Wide Web.
"Students report that working with Belvedere makes it easier for them to organize and review the arguments for and against a specific scientific hypothesis. They also report that they find it easy to integrate work in Belvedere with work in other applications like Word and Excel and Web Browsers. Students using Belvedere generated artifacts that demonstrated integration of the knowledge representation maps generated using Belvedere with text and graphic information taken from a variety of resources, including the Internet, and with numerical data generated as a result of classroom activities."
Teachers reported that the staff development activities provided were adequate for classroom implementation of the Belvedere approach.
"Data collected on the efficacy of staff development for teachers using Belvedere indicated that they were very satisfied with the training provided, and believed that they were well prepared to integrate use of the Belvedere software into their classrooms. The staff development provided for Belvedere compared very favorably with that provided by other application developers in the CAETI program."
The independent evaluator also reported a striking difference in classroom organization before and after the introduction of the Belvedere approach. The classroom changed from a traditional format, with students doing work at their desks in rows, to a group-centered organization, in which students were gathered around computers or hands-on activities "like campfires" and engaged in active discussion.


2.5 Conclusions

We have discussed how the Belvedere project integrated advanced technology with a classroom implementation methodology to support collaborative inquiry in the classroom. Networked groupware for collaborative inquiry and intelligent coaching aids were co-designed and delivered along with cognitively principled curriculum materials, activities designed to encourage collaborative inquiry, classroom implementation plans developed collaboratively with teachers, and instruments for assessment of nontraditional learning objectives that also scaffold peer coaching. A number of architectural and interoperability issues were explored.

Many of our conclusions are to be found throughout this report, especially in Section 2.3.2.6 (Architectural Lessons), 2.3.3.5 (Coaching Status and Future Directions), and 2.3.5.2.2 (Lessons Learned from the Integration Demonstration). Other conclusions are best reported in the form of recommendations for future work.


2.6 Recommendations

We present first recommendations for further work on the Belvedere software and supporting materials towards a deployable system, and then recommendations for future research of a more general nature.

2.6.1 Follow-on Work for Belvedere

The following work is recommended to further the development of Belvedere as a deployable application.

2.6.1.1 Port Server to NT

Although our Belvedere inquiry diagram software can run "stand-alone" without a server, a server is required to support (a) collaboration over the network, (b) a persistent database, and (c) a local Web-site containing customized materials and student and teacher pages. Hence we strongly recommend continuation of the client/server model. The currently delivered system uses a Netra (Unix) server. Unix was chosen for development because of its strength as a large-scale server platform, and because of the availability of free tools. However, recognizing DoDEA's need to deliver on platforms for which there is ongoing administrative support, we began to redesign and reimplement our server to enable delivery on other server-class machines such as Windows NT. This work should be continued:

2.6.1.2 Refine and Extend the Belvedere Interface as Guided by User Feedback

The Belvedere inquiry diagram software was evaluated in five DoDDS classrooms. As a result of these evaluations and our own experience, the following improvements to the interface would be recommended:

2.6.1.3 Future Work for Coaching

We recommend extending and enhancing the interaction between the argument and expert coaches. Presently, we only evaluate the status of "for" and "against" relations between adjoining nodes in the search of a path from the start node to a goal node. We would also like to include higher-order structures involving more than one relation and more than two statements. This would not be unlike the pattern-matching strategies that are presently employed in the argument coach, but would compare student and expert diagrams to find principled differences between the two. At the time of this writing, we have prototyped a few basic patterns for the expert coach. For example, the expert coach compares a basic data-for-hypothesis structure in the student diagram with the same structure (using the same snippets, etc.) in the expert diagram, and then notifies the student that other data or hypotheses exist that either support or refute the hypothesis or data in the basic structure.
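
The kind of comparison described above can be sketched as follows. This is a minimal illustration, not the coach's implementation (which is written in Lisp and is not reproduced in this report). It assumes, hypothetically, that each diagram is a set of `(data, relation, hypothesis)` triples with `relation` being `"for"` or `"against"`; the function name `missing_evidence` is likewise our own.

```python
def missing_evidence(student: set, expert: set) -> dict:
    """Find expert links the student has not yet considered.

    For each link that appears in both diagrams, collect the expert's other
    links to the same hypothesis that are absent from the student diagram.
    Returns a mapping from hypothesis to a sorted list of missing links.
    """
    advice = {}
    for _, _, hypothesis in student & expert:
        others = sorted(
            link for link in expert
            if link[2] == hypothesis and link not in student
        )
        if others:
            advice[hypothesis] = others
    return advice
```

A coach built on such a comparison would only need to tell the student that additional supporting or refuting material exists for a hypothesis. Whether to reveal the material itself is a pedagogical choice that this sketch leaves open.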

2.6.1.4 Cognitively Motivated Design of Challenge Activities and Materials

Long-term viability requires that we transition the development of curricular materials to other providers, such as publishers. Development of additional curricular topic areas serves both short-term and long-term goals. These topics will support the needs of current and future teacher collaborators, provide a demonstration of the generality of collaborative critical inquiry in other disciplines as well as science, and most importantly provide us with a broader base of case studies from which to abstract a reusable methodology for developing cognitively and socially motivated curricular materials. This abstracted methodology would then be the basis for technology transfer to publishers. We recommend work towards the following:

2.6.1.5 Realistic Classroom Implementation

One of the greatest challenges for realizing the potential of technology and of active inquiry-based learning is implementation in the real-world classroom. Such a classroom has many students, few computers, limited time, and in some cases, a teacher who is not experienced in the use of technology or in alternative approaches to organizing class activity [Schofield 1996]. Teachers need ways of organizing the class that other teachers have found to be successful. They also need methods for assessing the new learning objectives that inquiry-based learning activities address. Towards this end, we recommend refinement of our classroom implementation plans and assessment rubrics. Of particular value would be abstract "template" implementation plans that can be used for new challenge activities in different subject areas. We recommend a family of such templates, each targeted towards a different subject (science, language arts, mathematics) and its particular constellation of knowledge and cognitive skills.

2.6.1.6 Assessment Rubrics

Many of the competencies required by the modern workplace are not measured effectively with traditional standardized tests. New assessment tools are needed to measure students' understanding of the interrelationships within complex systems, their skills in conducting inquiry, and their ability to collaborate and communicate effectively. Traditionally, educational assessment is considered the "evil but necessary" final step of instruction. Our vision of assessment is that it should support and scaffold performance objectives which students and teachers set forth for themselves. Towards these ends, we recommend continued development of assessment rubrics designed to capture skills of critical inquiry and collaborative problem solving in general, and topic-specific aspects of the challenging curriculum activities in particular.

2.6.1.7 Commercial Deployment

It is vitally important to design a viable model for long term delivery, maintenance, and support of advanced educational technology such as the Belvedere family of tools. We see two competing delivery models today.

The more familiar of the two approaches is technology transfer to a commercial entity which would market and support a product derived from research software such as Belvedere. One barrier to such technology transfer is the lack of regular contact between researchers and relevant commercial entities. Research labs should be given assistance in finding appropriate commercialization options.

An alternate, non-proprietary approach to long term support of educational software is being developed. This approach is known variously as the Educational Object Economy (NSF) and the Object Economy Model (Apple Computer). Reusable, platform independent software objects are shared and maintained in object repositories. The Repository is based on an innovative licensing scheme which provides software for free, provided that improved versions are made available in the repository under the same license, or that a royalty-paying license be negotiated (royalties fund the repository). We realize that this model is too experimental at this point to be viewed as a primary delivery mechanism. However, considerations such as (1) the large number of specialized topics addressed in education, (2) the need for students to examine the objects of their study under multiple representations using a variety of tools, and (3) the propensity for teachers and schools to adapt materials to their own way of doing things suggest that the traditional commercial economic model will not support the diversity of functionality needed for education. Hence an EOE model should be supported.

2.6.2 Additional Research

Summarizing what has been said throughout this document, the following research directions are recommended.

2.6.2.1 Designing open, interoperable educational software.

Knowledge-based educational systems have historically been large, self-contained programs with specialized platform requirements. To make these technologies viable, we must be able to add component functionality incrementally, and enable systems to interoperate with commercial software and Internet resources. To reduce the cost of materials prepared by developers, and to enable greater collaboration between users, representations of educational materials should be shareable between diverse applications across the Internet. This suggests a "lowest common denominator" approach, yet we do not want to limit support for more advanced functionality such as domain-specific coaching. Several lines of work are suggested.

One involves the design of communication architectures for composing systems out of separately developed components. As suggested in this document, key issues lie in semantic interoperability. We recommend a testbed in which learning applications are composed of existing software that fills specific pedagogical needs. The research issues would be to investigate how to get adequate semantic coupling to support the pedagogical needs while minimizing changes needed to the components. Research and development could examine the roles of various solutions, including ontologies, translators, and alternate communication infrastructures. The work should be concerned with both software-level coupling (e.g., agents interpret objects from other components in an appropriate manner) and human-level coupling (e.g., objects retain their identity in the eye of the user as they move between components).

Another line of work involves the development of semantic annotations that can be embedded in more conventional materials, yet support advanced functionality. Much interest has been generated recently in the area of "metadata" for digital learning objects. Such efforts provide part of the groundwork for our vision, but are limited in three ways. First, the granularity of such efforts tends to be coarser than would be required for a semantic model of learner-constructed representational artifacts -- the dominant model is annotation of entire objects or documents. Second, metadata efforts do not go far enough to provide machine interpretable content semantics. Shared ontologies could eventually fill this need. (Metadata efforts are not to be faulted: there is an immediate need for other aspects of metadata that cannot be delayed while the research community develops ontologies.) The third limitation of metadata is also a limitation of any purely formal semantics for learner-constructed representations: it does not ensure that semantics will accrue for the user of the materials. Future work may be required in providing objects with persistence of appearance as well as of behavior before learners will perceive that which moves between applications as a single thing accumulating contextual semantics.

2.6.2.2 Representational devices as "epistemic forms" for collaborative learning.

The design of any software interface entails a large number of design decisions, more than any one research lab can back up with empirical investigations. However, some design features are so critical to the intended application that they should receive thorough study. Such is the case with representational devices used in software for socially-mediated learning (also known as "computer supported collaborative learning"). We view these representational devices as "epistemic forms": tools that guide and coordinate knowledge-building interactions. We have repeatedly observed that learners who are provided with a set of representational primitives for the construction of knowledge artifacts discuss the appropriate choice of primitive for a given constructive act. Thus, by manipulating the design of the primitives, it is possible to manipulate the discriminations that learners reflect on. Once learners have constructed representations, their learning interactions are further guided by the objects and relationships (expressed or potential) that these representations make salient. For example, some interfaces for evaluating theoretical claims with respect to empirical observations represent evidential relations between instances of these two categories only implicitly, such as through containment of one inside the other; while others (such as Belvedere) represent the relations explicitly, such as with arcs in graphs. We claim that this difference, namely whether the relationships are represented as first class objects, will have a significant effect on learners' discussions about these relationships.

These kinds of design considerations are critical, yet are insufficiently studied. We recommend a series of studies that vary features of representational systems, such as whether epistemological distinctions (theoretical claims versus empirical observations) must be attached to statements, and whether evidential relations between statements are represented as first class objects. Dependent variables include coding of discussion for certain discourse features shown by other literature to be correlated with positive learning outcomes, as well as direct measures of individual learning outcomes in both subject matter knowledge and inquiry skills.
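The contrast between implicit and explicit evidential relations can be made concrete with a small data-structure sketch. The following is an illustration only, not Belvedere's actual data model: the class names, the field names, and the labels "hypothesis", "data", "supports", and "contradicts" are all assumptions introduced here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Statement:
    text: str
    kind: str  # assumed labels: "hypothesis" or "data"


@dataclass
class ContainerClaim:
    """Variant A: relations are implicit. Observations simply sit inside a claim."""
    claim: Statement
    contained: List[Statement] = field(default_factory=list)


@dataclass(frozen=True)
class EvidentialLink:
    """Variant B: the relation is a first-class object, like an arc in a graph."""
    source: Statement
    target: Statement
    relation: str  # assumed labels: "supports" or "contradicts"


def evidence_for(links: List[EvidentialLink], claim: Statement, relation: str) -> List[Statement]:
    """Return the statements linked to `claim` by the given kind of relation."""
    return [link.source for link in links
            if link.target == claim and link.relation == relation]
```

In Variant A nothing records whether a contained observation supports or contradicts the claim, so neither the learners nor the software has an object to point at when discussing the relation. In Variant B each relation can be selected, queried, and commented on, which is the property the recommended studies would vary.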

2.6.2.3 Software participation in reflective learning interactions.

Collaborative learning can yield positive results such as increased motivation, greater learning, and transfer of knowledge to related tasks. However, collaboration alone does not guarantee learning gains. For example, learners cannot model expert knowledge and performance for each other. The design of effective representational tools, although helpful, does not solve this problem. Although we may design tools that make certain aspects of a problem more explicit, learners may yet fail to notice these aspects or know how to act on them in an appropriate way. A major advantage of software environments for the construction of representations (over paper, for example) is that this medium is interactive and computational. Software can be designed to selectively enter into the reflective dialogue, helping learners recognize the critical features displayed in the representations and respond with constructive activity. However, this requires that the design of representational devices and their formal semantics be coordinated with learners' understandings of those representations. For these reasons we view research on software for automated advice giving, coaching, etc. as an integral part of research on the design of representational tools for reflective learning interactions.

A related line of work might first begin with observations of peer coaching to answer questions such as: What support for learning do peer groups offer their members, and what support is lacking from peer groups that must be addressed by some mixture of human and automated mentoring? How can we recognize opportunities for coaching effective collaboration?

2.6.2.4 Examining the cost-benefit tradeoff between knowledge engineering and coaching functionality.

Knowledge-based techniques for advising or coaching typically require representations of the knowledge of a domain that are used to annotate the materials manipulated or created by students. However, "knowledge engineering" requires considerable work on the part of developers. Also, interactions intended to ascertain the meaning of users' materials may distract them from the learning task. Hence it is natural to ask what benefit is gained from automated coaching or advising and how the benefits compare to these costs. We have taken an incremental approach, investigating the utility of advice obtained from minimal semantic annotations before proceeding to more complex functionality. We recommend continuation of this work, using open architectures such as that reported herein to enable these coaches to be added or removed independently of each other for experimentation purposes.
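One way to picture coaches that can be added or removed independently is a small registry of advice functions that each inspect only minimal annotations on a diagram. This is a sketch under stated assumptions, not the architecture reported herein: the dictionary layout of the diagram, the two example coaches, and all names below are hypothetical.

```python
from typing import Callable, Dict, List

# Assumed diagram layout (hypothetical):
#   {"statements": {id: "hypothesis" | "data"},
#    "links": [(source_id, target_id, "supports" | "contradicts")]}
Diagram = dict
Coach = Callable[[Diagram], List[str]]


def unlinked_hypothesis_coach(diagram: Diagram) -> List[str]:
    """Flag hypotheses that no evidence has been linked to."""
    linked = {dst for _src, dst, _rel in diagram["links"]}
    return [f"Hypothesis '{sid}' has no evidence linked to it yet."
            for sid, kind in diagram["statements"].items()
            if kind == "hypothesis" and sid not in linked]


def disconfirmation_coach(diagram: Diagram) -> List[str]:
    """Flag hypotheses whose linked evidence is exclusively supporting."""
    advice = []
    for sid, kind in diagram["statements"].items():
        if kind != "hypothesis":
            continue
        relations = {rel for _src, dst, rel in diagram["links"] if dst == sid}
        if relations == {"supports"}:
            advice.append(f"Hypothesis '{sid}' has only supporting evidence; "
                          "is there any data against it?")
    return advice


class CoachRegistry:
    """Holds named coaches so each can be switched on or off for an experiment."""

    def __init__(self) -> None:
        self._coaches: Dict[str, Coach] = {}

    def add(self, name: str, coach: Coach) -> None:
        self._coaches[name] = coach

    def remove(self, name: str) -> None:
        self._coaches.pop(name, None)

    def advise(self, diagram: Diagram) -> List[str]:
        advice: List[str] = []
        for coach in self._coaches.values():  # insertion order
            advice.extend(coach(diagram))
        return advice
```

Because each coach depends only on the statement kinds and link labels, an experimental condition can be defined simply by which coaches are registered, and the cost of richer annotations can be weighed against the advice they make possible.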

2.6.2.5 Designing hypermedia structures to scaffold critical inquiry skills.

In our preparation of Web-based "field reports," "experiments," "conference papers," etc. we were faced with apparently conflicting requirements in designing the hypertext links by which these materials are indexed. On the one hand, research consistently shows the utility of reifying the cognitive structures of experts, so that learners can be guided by and more easily acquire these structures. On the other hand, students need to be faced with choices similar to those in the real world in order to engage in exploratory behavior and practice newly acquired cognitive skills. Is there a conflict between reifying expert structures and presenting students with real-world choices, perhaps requiring alternate link structures for different phases of the learning process? We recommend investigations into whether it is necessary to generate alternate link structures in real time to meet conflicting needs.

2.6.2.6 Scaffolding learning from simulations and visualizations.

This recommended line of work investigates how people learn and fail to learn from simulations and visualizations. These data analysis and communication methodologies are widely used by scientists, but perhaps special skills are needed to leverage their power in learning applications. We recommend a line of work to (1) identify how people fail to acquire information that is made available by simulations and visualizations, (2) identify the missing prerequisite knowledge or skills responsible for this failure, (3) design interface aids and coaching that assist with these prerequisites, and (4) evaluate whether this assistance improves the effectiveness of simulations and visualizations as a means towards other learning ends.

2.6.2.7 Comparing scientists' and learners' inquiry skills.

Our own work would have benefited from studies of whether scientists' expertise includes domain-independent inquiry skills that could be taught to school children. We recommend studies that separate domain-specific from domain-independent aspects of expertise by comparing scientists and novices working in domains unfamiliar to both. A hypertext information-gathering environment and Belvedere could be used to record subjects' use of the information they deem relevant. One might record and analyze information seeking and use in terms of dimensions such as systematicity of search, and whether and when subjects seek disconfirming as well as confirming evidence.

2.6.2.8 Realistic school implementation of advanced educational technology.

Our experience implementing Belvedere in four Department of Defense Dependent Schools has highlighted several problems concerning the scale-up of prototype advanced technology efforts to schools. Although some problems require political and economic solutions, others may be amenable to research in advanced technology and professional development supporting its use.

Designing teachers' mental models of our systems. While conducting teacher development workshops, we found ourselves engaged in cognitively demanding, rapid translation of our rich mental model of the software into a model that would be useful to teachers. We now believe that we need to design a mental model of the system oriented towards its classroom implementation, and to do so as part of the software design process rather than after the fact. This need fits nicely with the user-oriented layers of a hierarchical architecture reference model that we developed in collaboration with external colleagues Frank Belz and Tom Wheeler.

Large scale, distributed evaluation. In order to conduct our research in the context of large scale school implementations, we need ways to understand what is going on during possibly concurrent use in a number of geographically distributed classrooms. Methodologies for analyzing large sets of interaction data will be needed to augment what can be observed in person. Our networked technology provides an opportunity for distributed data collection and developing new evaluation methodologies.


3. End Matter

3.1 Appendices

Software and HTML materials are appended in the form of a CD-ROM (which also includes this report). The electronic version of this report will include hyperlinks to the following:


3.2 Bibliography

Anderson, J. R., Corbett, A. T., Koedinger, K. R., & Pelletier, R. (1995).
Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4 (2) 167-207.
Anderson, J. R. & Pelletier, R. (1991).
A development system for model-tracing tutors. In Proceedings of the International Conference of the Learning Sciences, 1-8. Evanston, IL.
Bates, R. & MacGregor, R. (1987).
The Loom knowledge representation language. ADA 183415 RS-87-188, Information Sciences Institute, University of Southern California, Marina del Rey, CA, 1987.
Brown, A.L., & Palincsar, A.S. (1989).
Guided, cooperative learning and individual knowledge acquisition. In L. Resnick (Ed.), Knowing, Learning, and Instruction: Essays in Honor of Robert Glaser. Hillsdale, NJ: Lawrence Erlbaum Associates.
Brusilovsky, P., Schwarz, E., & Weber, G. (1996).
ELM-ART: An intelligent tutoring system on world wide web. ITS'96, Third International Conference on Intelligent Tutoring Systems, Montreal, June 1996, pp. 261-269.
Chan, T.W., & Baskin, A.B. (1988).
Studying with the prince: The computer as a learning companion. In Proceedings of the International Conference on Intelligent Tutoring Systems, Montreal, pages 194-200, 1988.
Chi, M.T.H., Bassok, M., Lewis, M., Reimann, P., & Glaser, R. (1989).
Self-explanations: How students study and use examples in learning to solve problems. Cognitive Science, 13 :145-182, 1989.
Chinn, C. & Brewer, W. (1993).
Factors that influence how people respond to anomalous data. In Proc. 15th Annual Conf. of the Cognitive Science Society, pages 318--323, Hillsdale, NJ: Lawrence Erlbaum, 1993.
Clancey, W.J. (1992).
Guidon-manage revised: A socio-technical systems approach. In Intelligent Tutoring Systems, pages 21-36, 1992.
Cohen, P.R. & Feigenbaum, E.A. (1989).
The Handbook of Artificial Intelligence, Volume 1. Addison-Wesley, New York, 1989.
Collins, A., Brown, J. S., & Newman, S. (1989).
Cognitive Apprenticeship: Teaching the craft of reading, writing, and mathematics. Tech. Report No. 403, Palo Alto, CA: Institute for Research on Learning, 1989.
Collins, A. & Ferguson, W. (1993).
Epistemic Forms and Epistemic Games: Structures and Strategies to Guide Inquiry. Educational Psychologist, 28 (1), 25-42.
Cummins, J. (1988).
From the inner city to the global village: The microcomputer as a catalyst for collaborative interchange. Language, Culture and Curriculum, 1 (1), 1-13, 1988.
Forbus, K. (1997).
Using qualitative physics to create articulate educational software. IEEE Expert , 12 (3), May/June 1997.
Friedman-Hill, E. (1997).
Jess, the Java Expert System Shell. Sandia National Laboratories, Livermore, CA, 1997. http://herzberg.ca.sandia.gov/.
Ikeda, M., Hoppe, U., & Mizoguchi, R. (1995).
Ontological Issues of CSCL Systems Design. AI-Ed 95, the 7th World Conference on Artificial Intelligence in Education, August 16-19, 1995, Washington DC, pp. 242-249.
Johnson, D. & Johnson, R. (1989).
Cooperation and Competition: Theory and Research. Interaction Book Company, 1989.
Johnson, R.T., Johnson, D.W., & Stanne, M.B. (1985).
Effects of cooperative, competitive, and individualistic goal structures on computer-assisted instruction. Journal of Educational Psychology, 77 (6), 668-677.
Justen, III, J.E., Waldrop, P.B., & Adams, II, (1990, July).
Effects of paired versus individual user computer-assisted instruction and type of feedback on student achievement. Educational Technology, 30 (7), 51-53.
Katz, S., Lesgold, A., Eggan, G., & Gordin, M. (1993).
Modeling the student in Sherlock II. Journal of Artificial Intelligence in Education (Special issue on student modeling, G. McCalla & J. Greer, eds.), 3 , 495-518.
Katz, S., & Lesgold, A. (1994).
Implementing post-problem reflection within Coached Practice Environments. In P. Brusilovsky, S. Dikareva, J. Greer, and V. Petrushin (Eds.), Proceedings of the East-West International Conference on Computer Technologies in Education (pp. 125-30), Crimea, Ukraine.
Klayman, J. & Ha, Y.-W. (1987).
Confirmation, disconfirmation, and information in hypothesis testing. Psychological Review, 94 :211--228, 1987.
Klein, J.D., & Pridemore, D.R. (in press).
Effects of cooperative learning and need for affiliation on performance, time on task, and satisfaction. Educational Technology, Research & Development, 40 (4).
Kuhn, D. (1993).
Science as argument: Implications for teaching and learning scientific thinking skills. Science Education, 77 :319-337, 1993.
Lesgold, A. M., Lajoie, S. P., Bunzo, M., & Eggan, G. (1992).
SHERLOCK: A coached practice environment for an electronics troubleshooting job. In J. Larkin & R. Chabay (Eds.), Computer assisted instruction and intelligent tutoring systems: Shared issues and complementary approaches (pp. 201-238). Hillsdale, NJ: Lawrence Erlbaum Associates.
Luckham, D.C., Vera, J., and Belz, F. (1997).
Towards an abstraction hierarchy for CAETI architectures, and possible applications. Stanford University Computer Systems Laboratory Technical Report No. CSL-TR-97-727.
McManus, M.M., & Aiken, R.M. (1993).
The group leader paradigm in an intelligent collaborative learning system. In P. Brna, S. Ohlsson, and H. Pain (Eds.), Proceedings of the World Conference on Artificial Intelligence in Education (pp. 243-256). Edinburgh, Scotland, August, 1993.
Murray, T. (1996).
Having it all, maybe: Design tradeoffs in ITS authoring tools. ITS'96, Third International Conference on Intelligent Tutoring Systems, Montreal, June 1996, pp. 93-101.
National Academy of Sciences (1996).
National Science Education Standards. National Academy Press, 1996.
NCEE (1995).
New Standards: Performance Standards. National Center on Education and the Economy, 1995.
Newman, D. (1991).
Computer support for school work. In L. Birnbaum (Ed.), Proceedings of the 1991 Conference on the Learning Sciences (pp. 344-350). Charlottesville, VA: Association for the Advancement of Computing in Education.
O'Neill, D. K., & Gomez, L. M. (1994).
The collaboratory notebook: A distributed knowledge-building environment for project-enhanced learning. In Proceedings of Ed-Media '94, Vancouver, BC.
Paolucci, M., Suthers, D., & Weiner, A. (1996).
Automated advice-giving strategies for scientific inquiry. Intelligent Tutoring Systems, 3rd International Conference, Montreal, June 12-14, 1996.
Resnick, L. & Chi, M.T.H. (1988).
Cognitive psychology and science learning. In M.Druger, editor, Science for the Fun of it: A Guide to Informal Science Education, pages 24-31. National Science Teachers Association, 1988.
Ritter, S. & Koedinger, K. R. (1995).
Towards lightweight tutoring agents. In Proceedings of the World Conference on Artificial Intelligence in Education, AACE: Charlottesville, VA.
Ritter, S. & Koedinger, K. R. (1997).
An architecture for plug-in tutoring agents. In Journal of Artificial Intelligence in Education , 7 (3/4), 315-347. Charlottesville, VA: Association for the Advancement of Computing in Education.
Roschelle, J. (1994).
Designing for cognitive communication: Epistemic fidelity or mediating collaborative inquiry? The Arachnet Electronic Journal on Virtual Culture, 2 (2), 1994.
Roschelle, J. & Kaput, J. (1995).
Educational software architecture and systemic impact: The promise of component software. Presented at AERA Annual Meeting, San Francisco, April 19, 1995.
Rysavy, D.M. & Sales, G.C. (1991).
Cooperative learning in computer-based instruction. Educational Technology Research & Development , 39 (2), 70-79.
Scardamalia, M., & Bereiter, C. (1991).
Higher levels of agency for children in knowledge building: A challenge for the design of new knowledge media. The Journal of the Learning Sciences, 1 (1), 37-68.
Scardamalia, M., Bereiter, C., Brett, C., Burtis, P.J., Calhoun, C., & Smith Lea, N. (1992).
Educational applications of a networked communal database. Interactive Learning Environments , 2 (1), 45-71.
Schofield, J. W. (1996).
Computers and Classroom Culture. Cambridge: Cambridge University Press. 1996.
Sharan, S. (1980).
Cooperative learning in small groups: Recent methods and effects on achievement, attitudes, and ethnic relations. Review of Educational Research, 50, 241-272.
Slavin, R. E. (1990).
Cooperative learning: Theory, research, and practice . Englewood Cliffs, NJ: Prentice-Hall.
Smolensky, P., Fox, B., King, R., & Lewis, C. (1987).
Computer-aided reasoned discourse, or, how to argue with a computer. In R. Guindon (Ed.), Cognitive science and its applications for human-computer interaction (pp. 109-162). Hillsdale, NJ: Erlbaum.
Streitz, N. A., Hannemann, J., & Thuring, M. (1989).
From ideas and arguments to hyperdocuments: Traveling through activity spaces. In Hypertext '89 Proceedings, Pittsburgh, PA (pp. 343--364). New York: ACM.
Suthers, D. (1993).
Preferences for model selection in explanation. In Proc. 13th International Joint Conference on Artificial Intelligence (IJCAI-93), pages 1208--1213, Chambery, France, August 1993.
Suthers, D. & Jones, D. (1997).
An architecture for intelligent collaborative educational systems. AI-Ed 97, the 8th World Conference on Artificial Intelligence in Education, Kobe Japan, August 20-22, 1997.
Suthers, D., Toth, E., and Weiner, A. (1997).
An Integrated Approach to Implementing Collaborative Inquiry in the Classroom. Computer Supported Collaborative Learning (CSCL'97), Toronto, December, 1997.
Suthers, D., Weiner, A., Connelly, A. and Paolucci, M. (1995).
Belvedere: Engaging students in critical discussion of science and public policy issues. AI-Ed 95, the 7th World Conference on Artificial Intelligence in Education, August 16-19, 1995, Washington DC.
Suthers, D. and Weiner, A. (1995).
Groupware for developing critical discussion skills. CSCL '95, Computer Supported Cooperative Learning, Bloomington, Indiana, October 17-20, 1995.
Thomason, R.H. (1992).
Netl and subsequent path-based inheritance theories. Computers Math. Applic., 23 (2-5):179--204, 1992.
Toth, J., Suthers, D., and Weiner, A. (1997).
Providing expert advice in the domain of collaborative scientific inquiry. To appear in AI&ED97.
Waugh, M.L. & Levin, J. (1988).
Telescience activities: Educational uses of electronic networks. The Journal of Computers in Mathematics and Science Teaching , 8 (2): 29-33, 1988.
Webb, N. (1989).
Peer interaction and learning in small groups. International Journal of Educational Research, 13:21-40, 1989.
Whitelock, D., Taylor, J., O'Shea, T., & Scanlon, E. (1991).
How students construct a shared understanding of collisions in Newtonian mechanics. In L. Birnbaum (Ed.), Proceedings of the 1991 Conference on the Learning Sciences (pp. 430-441). Charlottesville, VA: Association for the Advancement of Computing in Education.

