Knowledge Acquisition

Knowledge Acquisition COMP62342
Sean Bechhofer, University of Manchester
sean.bechhofer@manchester.ac.uk

Knowledge Acquisition (KA)
  Operational definition
    Given
      a source of (declarative) knowledge
      a sink
    KA is the transfer of declarative statements from source to sink
      We can generalise this to other sources, e.g., sensors
  We distinguish between KA and K refinement
    i.e., modification of the statements in our sink
    But this distinction is merely conceptual
      Actual processes are messy
  Range of automation
    Fully manual (what we're going to do!)
    (Fully) automated
      Possibly plus refinement
      e.g., machine learning, text extraction

From Knowing to Representation
  Source
    A person, typically called the domain expert (DE, or "expert")
      domain, subject matter, universe of discourse, area, ...
    Key features
      They know a lot about the domain (coverage)
      They are highly reliable about the domain (accuracy)
      They know how to articulate domain knowledge
        Though not always in the way we want!
      They have good metaknowledge
  Immediate Sink
    A document encoded in natural language or semi-NL
  Ultimate Sink
    A document encoded in a formal/actionable KR language
      I.e., an OWL Ontology!
  This KA is often called Knowledge Elicitation

Knowing to Representation
  Source, Immediate Sink: Margaret Grace Rever is the mother of Robert David Bright
  Ultimate Sink: Robert_David_Bright_1965 hasmother Margaret_Grace_Rever_1934
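The step from immediate sink (a natural-language sentence) to ultimate sink (a formal statement) can be sketched in a few lines of Python. This is only an illustration, assuming a single fixed sentence pattern; the function names and the BIRTH_YEARS lookup are hypothetical helpers, and the birth years are the ones visible in the slide's identifiers.

```python
import re

# Birth years as they appear in the slide's identifiers; used only to build IDs.
BIRTH_YEARS = {
    "Margaret Grace Rever": 1934,
    "Robert David Bright": 1965,
}


def identifier(name: str) -> str:
    """Turn 'Robert David Bright' into 'Robert_David_Bright_1965'."""
    return f"{name.replace(' ', '_')}_{BIRTH_YEARS[name]}"


def mother_sentence_to_triple(sentence: str) -> tuple[str, str, str]:
    """Map 'X is the mother of Y' to the triple (Y, hasmother, X)."""
    match = re.fullmatch(r"(.+) is the mother of (.+)", sentence.strip())
    if match is None:
        # Assumption: only this one sentence pattern is supported.
        raise ValueError(f"unsupported sentence pattern: {sentence!r}")
    mother, child = match.groups()
    return (identifier(child), "hasmother", identifier(mother))


triple = mother_sentence_to_triple(
    "Margaret Grace Rever is the mother of Robert David Bright"
)
print(triple)
```

Real elicitation output is far messier than one sentence pattern, which is why the slides treat this step as manual modelling work rather than parsing.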

Eliciting Knowledge
  Proposal 1: Ask the expert nicely to write it all down
  Problems:
    1. They know too much
    2. Much of what they know is tacit
         Perhaps they can give it on demand, but not spontaneously
           I.e., it's there but hard to access
         They can't describe it (well)
    3. They know too little
         E.g., application goals
         Target representation constraints
           E.g., the language
         Their knowledge is incomplete
           Though they may be able to acquire or generate it
    4. Expense
         Busy and valuable people
         They get bored

The Knowledge Engineer (KE)
  Key Role
    Expertise in KA
      E.g., elicitation
    Knows the target formalism
    Knows knowledge (and software) development
      Tools, methodologies, requirements management, etc.
  Does not necessarily know the domain!
    Though the KE may also be a DE
      Most DEs are not KEs
        Though they may be convertible
    May be able to become (enough of an) expert
      E.g., if autodidact or good learner with access to classes
  Investment in the representation itself

Elicitation Technique Requirements
  Minimise DE's time
    Assume DE scarcity
    Capture essential knowledge
      Including metaknowledge!
  Minimise DE's KE training and effort
    Assume loads of tacit knowledge
      Thus techniques must be able to capture it
  Support multiple sources
    Multiple experts (get consensus?)
    Experts might point to other sources (e.g., standard text)
  KEs must understand enough
    So, the techniques have to allow for KE domain learning
    KRs reasonably accessible to non-experts
  Always assume DE not invested
    I.e., that you care more about the KR, much more

Note on generalizability
  Many KA techniques are very specific
    Specific to source (e.g., learning from relational databases)
    Specific to targets (e.g., learning a schema)
  Elicitation techniques are generally flexible
    Arbitrary sources and sinks
      In both domain and form
    NL intermediaries help
    "Parameterisable" is perhaps more accurate

Elicitation Techniques
  Two major families
    Pre-representation
    Post-(initial)representation
  Pre-representation
    Starting point!
    Experts interact with a KE
    Focused on protocols
      A record of behavior
      Protocol-generation
      Protocol-analysis
  Post-representation (modelling)
    Experts interact with a (proto)representation (& KE)
    Testing and generating

Pre-representation Techniques
  Protocol-generation
    Often involves video or other recording
    Interviews
      Structured or unstructured (e.g., brainstorming)
    Observational
      Reporting
        Self or shadowing
      Any non-interview observation
  Protocol-analysis
    Typically done with transcripts or notes
      But direct video is fine
    Convert protocols into protorepresentations
      So, some modelling already!
  We can treat many things as protocols
    E.g., Wikipedia articles, textbooks, papers, etc.

Modelling Techniques
  (Often characterized by aspects of the target (OWL in our case))
  Being picky
    Pedantic refinement
  Sorting techniques are used for capturing the way people compare and order concepts, and can lead to the revelation of knowledge about classes, properties and priorities.
  Hierarchy-generation techniques such as laddering are used to build taxonomies or other hierarchical structures such as goal trees and decision networks.
  Matrix-based techniques involve the construction of grids indicating such things as problems encountered against possible solutions.
  Limited-information and constrained-processing tasks are techniques that limit the time and/or information available to the expert when performing tasks. For instance, the twenty-questions technique provides an efficient way of accessing the key information in a domain in a prioritised order.

Other Modelling Techniques
  Scenario descriptions
  Diagrams
  Problem solving
  Teaching
  Role Play
  Joint Observation
  Etc.

Example: An Animals Taxonomy
  Task: generate a controlled vocab for an index of a children's book
  Domain: Animals, including (think of these as CQ)
    Where they live
    What they eat
      Carnivores, herbivores and omnivores
    How dangerous they are
    How big they are
    A bit of basic anatomy
      legs, wings, fins? skin, feathers, fur?
    ... (read the book!)
  Representation aspects
    Hierarchical list with priorities

Protocol Analysis
  From interviews/behaviour to analysable items
    Text! Text is good!
  From a text,
    find key terms
    harmonise them
      capitalisation, pluralization (or not), orthography, etc.
  Keep track of
    Significance
      Core or peripheral terms
      Illustrative? Defining?
    Situation
      Sentences or sections
  Output: List of Terms

Animal taxonomy: Term Generation
  [Figure: screenshot of term generation]
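The "find key terms, harmonise them" step can be sketched as a small script. This is a rough illustration only: the stop list, the sample text and the plural-stripping rule are all assumptions made for the example, and the slides do this step by hand.

```python
import re
from collections import Counter

# Hypothetical stop list; a usable one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "or", "in", "on", "of", "are", "is"}


def harmonise(word: str) -> str:
    """Lower-case a word and crudely strip a plural 's' (a naive assumption)."""
    word = word.lower()
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def extract_terms(text: str) -> Counter:
    """Return harmonised candidate terms with their frequencies."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(harmonise(w) for w in words if w not in STOP_WORDS)


# Made-up sample protocol text in the animals domain.
text = "Lions are carnivores. A lion eats zebras. Zebras are herbivores."
terms = extract_terms(text)
for term, count in terms.most_common():
    print(term, count)
```

Frequency is only a weak hint of significance; whether a term is core, peripheral, illustrative or defining still needs the notes the slide asks you to keep.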

Sort of Knowledge
  Declarative Knowledge about Terms (or Concepts)
    Aka "Conceptual Knowledge"
  Initial steps
    Identify the domain and requirements
    Collect the terms
      Gather together the terms that describe the objects in the domain.
      Analyse relevant sources
        Documents
        Manuals
        Web resources
        Interviews with Expert
  We've done that!
  Now some modelling
    Two techniques today!
      Card sorting
      3 card trick

Card Sorting
  Card Sorting identifies similarities
  A relatively informal procedure
    Works best in small groups
  Write down each concept/idea on a card
    1. Organise them into piles
    2. Identify what the pile represents
         New concepts! New card!
    3. Link the piles together
    4. Record the rationale and links
    5. Reflect
  Repeat!
    Each time, note down the results of the sorting
    Brainstorm different initial piles
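One possible way to "note down the results of the sorting" across rounds is sketched below. The cards, pile labels and the idea of counting how often two cards share a pile are illustrative assumptions, not part of the procedure on the slide.

```python
from collections import Counter
from itertools import combinations

# Each round maps a pile label (a candidate new concept) to the cards in it.
# Cards and labels are made up for illustration.
rounds = [
    {"Carnivore": {"Lion", "Shark"}, "Herbivore": {"Zebra", "Cow"}},
    {"Lives on land": {"Lion", "Zebra", "Cow"}, "Lives in water": {"Shark"}},
]


def co_occurrence(sort_rounds):
    """Count, over all rounds, how often each pair of cards shared a pile."""
    counts = Counter()
    for piles in sort_rounds:
        for cards in piles.values():
            for pair in combinations(sorted(cards), 2):
                counts[pair] += 1
    return counts


pairs = co_occurrence(rounds)
# Every pile label is a candidate new term (a new card).
new_terms = sorted({label for piles in rounds for label in piles})

print(pairs.most_common(3))
print(new_terms)
```

Pairs that keep landing together across differently seeded rounds are candidates for siblings in the eventual hierarchy; the rationale for each pile still has to be recorded separately.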

Sorted Animal Cards
  [Figure: sorted animal cards]

Try 2 Rounds
  Initial ideas
    How we use them
    Ecology
    Anatomy
    ...

Generative
  For elicitation, more is (generally) better
    Within limits
    Brainstormy
  Is critical knowledge tacit?
    We can't easily know in advance
  Winnowing is crucial
    Sometimes we elicit things which should be discarded
      And trigger the discarding of other things!
    Better to know what we don't care to know!

Knowledge Acquisition (KA)
  Operational definition
    Given
      a source of (propositional) knowledge
      a sink
    KA is the transfer of propositions from source to sink
  Elicitation (for terminological knowledge)
    Initial Capture:
      Source: People, experts, domain experts (DE)
      Sink: Protocol (record of behavior)
    Term Extraction:
      Source: Text (e.g., transcript, textbook, Wikipedia article)
      Sink: List of terms (perhaps on cards)
    Initial Regimentation:
      Source: List of terms (on cards!)
      Sink: Proto-representation
        Hierarchy of categorized, harmonised terms (with notes!)

Triadic Elicitation: The 3 card trick
  Select 3 cards at random
    Identify which 2 cards are the most similar
    Write down why (a similarity)
      As a new term!
    Write down why not like the 3rd (a difference)
      Another new term!
  Helps to determine the characteristics of our classes
    Prompts us into identifying differences & similarities
  There will always be two that are closer together
    Although which two cards that is may differ
      From person to person
      From perspective to perspective
      From round to round

Example
  1. David Bright (1934)
  2. Margaret Grace Rever (1934)
  3. Robert David Bright (1965)
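A round of the 3 card trick can be recorded with a small data structure, sketched here. The TriadRecord fields and function names are hypothetical, and the similarity and difference recorded for the example cards are an invented answer; as the slide notes, a different person or round may well pick a different pair.

```python
import random
from dataclasses import dataclass


@dataclass
class TriadRecord:
    similar_pair: tuple[str, str]
    odd_one_out: str
    similarity: str  # new term: why the pair belong together
    difference: str  # new term: why the third card is unlike them


def draw_triad(cards, rng):
    """Select 3 distinct cards at random."""
    return rng.sample(cards, 3)


def record_triad(triad, odd_one_out, similarity, difference):
    """Store the expert's judgement about one triad."""
    if odd_one_out not in triad:
        raise ValueError("odd_one_out must be one of the three cards")
    pair = tuple(card for card in triad if card != odd_one_out)
    return TriadRecord(pair, odd_one_out, similarity, difference)


cards = [
    "David Bright (1934)",
    "Margaret Grace Rever (1934)",
    "Robert David Bright (1965)",
]
triad = draw_triad(cards, random.Random(0))
# Hypothetical judgement: the two people born in 1934 are grouped together.
record = record_triad(
    triad,
    odd_one_out="Robert David Bright (1965)",
    similarity="born in 1934",
    difference="born in 1965",
)
print(record)
```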

20 Questions
  Like the game!
    The KE picks an object/concept in the domain
    The DE tries to guess it and asks a series of yes/no questions
      Is it an animal? Is it a vegetable? Is it a mineral?
    KE notes the questions and their order
  Can help determine key concepts, properties, etc.
    Animals, vegetables, and minerals!
  Can help structure the domain
    Is it a living thing?, an animal?, a plant?
    [Diagram: Living Thing, with Animal and Plant beneath it]
  Note that the technique is not the game!
    Goals are different!
    We're very interested in the questions, not the answers per se

Key Goal: Laddering
  Terms vary in generality
    Tree vs. Plant
    Dog vs. Rover
    Each sort may be implicit!
  Goal: Flesh out the generality hierarchy
    Get more specific (if too general)
    Get more general (if mostly specific)
  How?
    1. Take a group and ask what they have in common
         During sorting or 3-card or directly
    2. Then investigate relations of new term
         Siblings, missing children, and (eventually) parents (back to 1)
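The generality hierarchy that laddering fleshes out can be held as a simple child-to-parent map. The sketch below assumes a strict tree (one parent per term) and uses terms from the slides; the function names are illustrative.

```python
# Child -> parent links gathered while laddering (assumes a strict tree).
parent_of = {
    "Dog": "Animal",
    "Animal": "Living Thing",
    "Tree": "Plant",
    "Plant": "Living Thing",
}


def ladder_up(term, parents):
    """Climb from a term to the most general term recorded above it."""
    path = [term]
    while path[-1] in parents:
        nxt = parents[path[-1]]
        if nxt in path:
            raise ValueError(f"cycle in hierarchy at {nxt!r}")
        path.append(nxt)
    return path


def children_of(term, parents):
    """List the direct children of a term, to check for missing siblings."""
    return sorted(child for child, parent in parents.items() if parent == term)


print(ladder_up("Dog", parent_of))
print(children_of("Living Thing", parent_of))
```

Listing the children of each term is a cheap way to prompt step 2 of the slide: a parent with a single child usually signals missing siblings.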

So! The Task
  Capture
    Look at the Menu
  Extract
    List of terms; put them on cards!
  Organise
    Hierarchy
  Encode
    OWL in Protégé

Coursework
  Take the KE done in class
    Feel free to refine it further
  Encode it using Protégé 4
    Each category term becomes a class
    Capture your hierarchy using subsumption/subclassing
  Submit your RDF/XML file
  Full description on Blackboard!
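To show what "each category term becomes a class" with subclassing looks like in RDF/XML, here is a minimal sketch that writes the file by hand with Python's standard library. The coursework itself asks for the encoding to be done in Protégé; this script is only an illustration, the BASE IRI and the sample hierarchy are hypothetical, and the output has not been checked against Protégé.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/animals#"  # hypothetical ontology namespace

for prefix, uri in (("rdf", RDF), ("rdfs", RDFS), ("owl", OWL)):
    ET.register_namespace(prefix, uri)


def iri(term):
    """Build a class IRI from a term, replacing spaces with underscores."""
    return BASE + term.replace(" ", "_")


def hierarchy_to_rdfxml(parent_of):
    """Serialise a child -> parent map as OWL classes with rdfs:subClassOf."""
    root = ET.Element(f"{{{RDF}}}RDF")
    ET.SubElement(root, f"{{{OWL}}}Ontology", {f"{{{RDF}}}about": BASE.rstrip("#")})
    terms = sorted(set(parent_of) | set(parent_of.values()))
    for term in terms:
        cls = ET.SubElement(root, f"{{{OWL}}}Class", {f"{{{RDF}}}about": iri(term)})
        if term in parent_of:
            ET.SubElement(
                cls,
                f"{{{RDFS}}}subClassOf",
                {f"{{{RDF}}}resource": iri(parent_of[term])},
            )
    return ET.tostring(root, encoding="unicode")


# Sample hierarchy, made up for the example.
xml_text = hierarchy_to_rdfxml({"Carnivore": "Animal", "Herbivore": "Animal"})
print(xml_text)
```

Only the subclass hierarchy is encoded here, which matches what the coursework asks for; properties and restrictions would need further OWL constructs.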