AUTOMATIC LEARNING IN EXPERT SYSTEMS

Principal Investigators: Kurt Fedra and Lothar Winkelbauer

TABLE OF CONTENTS

PROJECT SUMMARY
PROJECT DESCRIPTION
GENERAL PROBLEM STATEMENT
BACKGROUND AND STATE OF THE ART
PROGRAM OF RESEARCH
  1. Objectives and Approach
  2. Project Participants and Related Research at IIASA
    2.1. IIASA's Advanced Computer Applications Projects (ACA)
    2.2. IIASA's Advantage
    2.3. Approach and Activities of the ACA Project
    2.4. Major Studies of the ACA Project
    2.5. ACA Results and Implementation
  3. Strategy for the Proposed Research
REFERENCES
BIBLIOGRAPHY
ATTACHMENTS
  1. Schedule of Activities
  2. Curriculum Vitae: Kurt Fedra
  3. Curriculum Vitae: Lothar Winkelbauer
  4. Project Publications

This project is one of 5, which, taken together, form the basis for IIASA's request to NSF for program support in FY 1988. Funds forthcoming from NSF would contribute in part toward furthering the work described here.

PROJECT SUMMARY

Expert systems are emerging as a new generation of software that holds great promise for practical applications. A major bottleneck in their development and widespread use is knowledge acquisition, i.e., the transfer of domain-specific knowledge from human experts to the machine. Supporting knowledge acquisition by the machine itself, i.e., automatic learning, could: continuously improve performance through knowledge gained during the system's use and automatically incorporated into its knowledge bases; speed up the development of domain-specific expert systems by providing a major productivity tool for knowledge engineers and system developers; and thus help to make expert system technology a valuable complement to the standard tools of Operations Research and Applied Systems Analysis.

Within the framework of IIASA's Advanced Computer Applications Project (ACA), a number of operational information and decision support systems for domains such as risk analysis, resource management, and regional development planning will provide test applications. Methodological research will concentrate on developing a hybrid approach to machine learning, combining concepts such as rote learning, learning by examples, learning by analogy, and learning by being told into a set of modular tools to enhance existing software systems.

PROJECT DESCRIPTION

GENERAL PROBLEM STATEMENT

Expert systems are emerging as a new generation of software that holds great promise for shaping new information technology. Two of their main potential benefits are new ways of tackling extremely complex and "soft" problems that defy classical formalization, e.g., through heuristic approaches, and a wider group of potential users, reached through new dimensions of user friendliness. The major bottleneck in their development and widespread use is knowledge acquisition, i.e., the transfer of domain-specific knowledge from the human expert to the machine. Supporting knowledge acquisition by the machine itself, i.e., automatic learning, could: continuously improve performance through knowledge gained during the system's use and automatically incorporated into its knowledge bases; and speed up the development of domain-specific expert systems by providing a major productivity tool for knowledge engineers and systems developers.

Systems that are application- and problem-oriented rather than methodology-oriented are most often hybrid systems, in which elements of Artificial Intelligence (AI) technology are combined with more classical techniques of information processing and with approaches from operations research and systems analysis. Here traditional numerical data processing is supplemented by symbolic elements, rules, and heuristics in the various forms of knowledge representation (for a selection of relevant books and articles see the Bibliography). There are numerous applications where adding quite a small amount of "knowledge" in the above sense, e.g., to an existing simulation model, may considerably extend its power and usefulness and at the same time make it much easier to use. Expert systems are not necessarily purely knowledge driven, relying on huge knowledge bases of thousands of rules. Applications containing only small knowledge bases of at most a few dozen to a hundred rules can dramatically extend the scope of standard computer applications, both in terms of application domains and in terms of an enlarged non-technical user community.

Expert systems are not meant to replace human expertise with a program. They should be advisory programs that bring expertise to the user, who can then use his own expertise and experience to recognize patterns and symptoms, recall history, and exercise judgement. Thus, for most expert systems, the users themselves are a rich source of knowledge and information.
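The point that a small amount of "knowledge" can extend an existing simulation model may be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the systems described in this proposal: the toy storage model (simulate_storage), the Rule structure, the two rules, and their thresholds are all hypothetical. The sketch only shows the hybrid pattern in which numerical output is summarized symbolically and then passed to a few heuristic rules that produce advice for a non-technical user.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Summary = Dict[str, float]


def simulate_storage(inflows: List[float], release: float,
                     capacity: float, initial: float) -> List[float]:
    """Toy mass balance: storage level after each time step (hypothetical model)."""
    levels = []
    level = initial
    for inflow in inflows:
        # Clamp the level between empty (0) and full (capacity).
        level = min(capacity, max(0.0, level + inflow - release))
        levels.append(level)
    return levels


@dataclass
class Rule:
    """One heuristic: if the condition holds for the run summary, give the advice."""
    condition: Callable[[Summary], bool]
    advice: str


# A deliberately tiny, invented rule base (thresholds are illustrative only).
RULES = [
    Rule(lambda s: s["peak_fill"] >= 1.0,
         "Storage reached capacity; consider a higher release."),
    Rule(lambda s: s["final_fill"] < 0.2,
         "Storage nearly depleted; consider a lower release."),
]


def advise(levels: List[float], capacity: float, rules: List[Rule]) -> List[str]:
    """Summarize the numeric output symbolically and fire every matching rule."""
    summary = {"peak_fill": max(levels) / capacity,
               "final_fill": levels[-1] / capacity}
    return [rule.advice for rule in rules if rule.condition(summary)]


levels = simulate_storage([5.0, 8.0, 12.0, 3.0], release=4.0,
                          capacity=20.0, initial=10.0)
print(levels)
print(advise(levels, 20.0, RULES))
```

The numerical model is left untouched; the rule base is a separate, replaceable list, which is what would allow rules to be added or revised later, whether by a knowledge engineer or by a learning component.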

BACKGROUND AND STATE OF THE ART

Artificial Intelligence aims at developing computer systems of high performance (in the sense of complexity rather than speed), flexibility, and "intelligent" behavior. Fields of research include natural language understanding, robotics, computer vision, expert systems, and machine learning. Machine learning was, and is, one of the central research areas of AI; a detailed discussion of the state of the art of learning in AI, with extensive references to further material and prototypical implementation examples, can be found in Cohen and Feigenbaum (1982), The Handbook of Artificial Intelligence, Volume 3, Chapter XIV: Learning and Inductive Inference [1].

Most examples of machine learning, however, are experimental or at best demonstration prototypes, where the domains of learning tend to be simple games or performance tasks of comparable complexity. Few operational expert systems include learning capabilities proper (e.g., Buchanan and Mitchell [2], Harmon and King [3]; see also Weigkricht and Winkelbauer [4]).

Starting from H. Simon's [5] definition of learning as any process by which a system improves its performance, levels or concepts of machine learning are usually grouped into:

Rote learning, where the information input only needs to be categorized and stored for later retrieval (e.g., Samuel [6, 7]);

Learning by example, where the information provided is at a low level of abstraction and requires inductive inference for generalization to a useful level (e.g., Tsypkin [8], Vere [9, 10], Buchanan et al. [11], Hayes-Roth and McDermott [12], Mitchell [13, 14], Dietterich and Michalski [15, 16]);

Learning by being told, in which the information provided is of a highly abstract and symbolic nature and requires deductive strategies to operationalize it (e.g., Davis [17, 18], Hayes-Roth et al. [19, 20], Mostow [21]);

Learning by analogies, in which the information provided applies to a related task or concept, requiring transformation strategies (e.g., Lenat [22, 23]).

The above classification and set of references is certainly incomplete. A further selective Bibliography, concentrating on approaches to machine learning, is given after the References.

In a large and composite practical application, however, where the performance tasks are varied and fairly complex, none of the above concepts alone will suffice. Only combined strategies promise to deliver the improved performance and "intelligent" machine behavior that should be the hallmark of all AI applications.

PROGRAM OF RESEARCH

1. Objectives and Approach

Within the framework of expert systems, automatic learning aims at quantitative and qualitative performance improvement over more rigid, traditional approaches. The knowledge and information basis of an expert system can be improved and extended not only in an explicit knowledge acquisition mode, but also during the use of the system itself in the consultancy mode. In that mode, the sources of information are internal sources (e.g., optimization routines) as well as the user himself, who defines strategic objectives, constraints, values, and judgement, or performs a complex set of tasks in an exemplary manner.

The hypothesis implicit in our proposal is that information provided by the (expert) user or by appropriate software modules (e.g., optimization routines) during routine operation of the system can be incorporated automatically on the basis of a generalized knowledge representation format, and will thus lead to a considerable performance improvement of a learning system. This hypothesis can only be tested empirically, i.e., by constructing a prototype and subjecting it to peer and domain-expert review.

Consequently, an important prerequisite for the proposed research on automatic learning concepts of practical applicability is their development within a realistic, problem-oriented framework, that is, an operational, domain-specific software system. The research will take advantage of ongoing information and decision support systems development projects that can provide practical test applications. These projects address problems of hazardous substances management, industrial risk analysis (Fedra [24, 25], Fedra and Otway [26], Fedra et al. [27]), and environmental impact assessment and natural resources management. The methodological research on machine learning will concentrate on: learning by examples in simulation- and optimization-based hybrid systems; automatic selection of the appropriate representation (e.g., frames comprising rules, symbolic values, numeric values, other frames) in an interactive knowledge acquisition process; and plausibility and consistency checking of new information in incremental learning situations.

Our central paradigm for developing hybrid strategies for machine learning is learning by examples. A learning system should acquire knowledge about various problem domains (e.g., hazardous substances) directly from model- or user-generated examples and the currently held information.
Based on concepts of pattern matching, similarities, and "Gestalt," the examples will be transformed to problem-specific forms of representation, e.g., frames including rules, from which generalized knowledge about the problem area will be derived in the form of higher order representations (e.g., meta rules). The generalization process can be repeated recursively for the next higher level of representation, as increasingly general views of the problem domain are constructed from a growing information base.

The knowledge the system holds about the problem areas will automatically be extended quantitatively and qualitatively, both in an explicit data entry and knowledge acquisition mode and during the system's functioning and use, e.g., by incorporating solutions from optimization programs. Explicit data entry and knowledge acquisition will be based on an intelligent, conversational questionnaire for specific data and knowledge, with dynamically updated rules for plausibility and consistency checking.

A rich set of model and knowledge bases that is also constantly enlarged through use of the system holds a practically unlimited number of combinatorial possibilities for "new," internally generated or deduced knowledge and for ways of organizing this knowledge. Automatic learning can also help to filter and organize this continuous stream of information, transforming a large set of examples (acquired through rote learning mechanisms) into more efficient forms of representation that cover this set of examples.
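The last step, turning a set of rote-stored examples into a more compact representation that covers them, can be sketched as follows. This is only an illustrative assumption about one possible technique, not the method proposed here: it computes the most specific conjunctive description that covers all examples (identical symbolic values are kept, differing ones become a wildcard, numeric values become a closed interval). The attribute names and values in EXAMPLES are invented for the illustration.

```python
from typing import Any, Dict, List

WILDCARD = "?"  # condition that accepts any value

Example = Dict[str, Any]


def is_number(value: Any) -> bool:
    """True for int/float, but not for bool (which is a subclass of int)."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def generalize(examples: List[Example]) -> Dict[str, Any]:
    """Most specific conjunctive description covering every example."""
    if not examples:
        raise ValueError("need at least one example")
    description: Dict[str, Any] = {}
    for attr in examples[0]:
        values = [ex[attr] for ex in examples]
        if all(is_number(v) for v in values):
            # Numeric attribute: smallest closed interval containing all values.
            description[attr] = (min(values), max(values))
        elif all(v == values[0] for v in values):
            # Symbolic attribute on which all examples agree: keep the value.
            description[attr] = values[0]
        else:
            # Examples disagree: drop the condition.
            description[attr] = WILDCARD
    return description


def covers(description: Dict[str, Any], example: Example) -> bool:
    """Check whether an example satisfies every condition of the description."""
    for attr, cond in description.items():
        value = example[attr]
        if isinstance(cond, tuple):
            low, high = cond
            if not is_number(value) or not (low <= value <= high):
                return False
        elif cond != WILDCARD and value != cond:
            return False
    return True


# Hypothetical rote-stored examples (invented attributes and values).
EXAMPLES = [
    {"state": "liquid", "flash_point_c": 12, "storage": "drum"},
    {"state": "liquid", "flash_point_c": -4, "storage": "tank"},
    {"state": "liquid", "flash_point_c": 21, "storage": "drum"},
]

description = generalize(EXAMPLES)
print(description)
print(all(covers(description, ex) for ex in EXAMPLES))
```

One description now stands in for three stored examples. In the recursive scheme described above, several such descriptions could themselves be treated as examples for the next higher level of representation; that step is not shown here.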