Relational Knowledge Discovery

What is knowledge and how is it represented? This book focuses on the idea of formalising knowledge as relations, interpreting knowledge represented in databases or logic programs as relational data, and discovering new knowledge by identifying hidden and defining new relations.

After a brief introduction to representational issues, the author develops a relational language for abstract machine learning problems. He then uses this language to discuss traditional methods such as clustering and decision tree induction, before moving on to two previously underestimated topics that are again coming to the fore: rough set data analysis and inductive logic programming.

Its clear and precise presentation is ideal for undergraduate computer science students. The book will also interest those who study artificial intelligence or machine learning at the graduate level. Exercises are provided and each concept is introduced using the same example domain, making it easier to compare the individual properties of different approaches.

M. E. MÜLLER is a Professor of Computer Science at the University of Applied Sciences, Bonn-Rhein-Sieg.

Relational Knowledge Discovery
M. E. Müller
University of Applied Sciences, Bonn-Rhein-Sieg

CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo, Delhi, Mexico City

Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

Information on this title: /9780521190213

© M. E. Müller 2012

This publication is in copyright. Subject to statutory exception and to the provisions of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.

First published 2012

Printed in the United Kingdom at the University Press, Cambridge

A catalogue record for this publication is available from the British Library

Library of Congress Cataloguing in Publication data
Müller, M. E., (Martin E.), 1970–
Relational knowledge discovery / M.E. Müller.
p. cm.
ISBN 978-0-521-19021-3 (hardback)
1. Computational learning theory. 2. Machine learning. 3. Relational databases. I. Title.
Q325.7.M85 2012
006.3'1-dc23 2011049968

ISBN 978-0-521-19021-3 Hardback
ISBN 978-0-521-12204-7 Paperback

Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents

About this book

1 Introduction
1.1 Motivation
1.2 Related disciplines

2 Relational knowledge
2.1 Objects and their attributes
2.2 Knowledge structures

3 From data to hypotheses
3.1 Representation
3.2 Changing the representation
3.3 Samples
3.4 Evaluation of hypotheses
3.5 Learning
3.6 Bias
3.7 Overfitting
3.8 Summary

4 Clustering
4.1 Concepts as sets of objects
4.2 k-nearest neighbours
4.3 k-means clustering
4.4 Incremental concept formation
4.5 Relational clustering

5 Information gain
5.1 Entropy
5.2 Information and information gain
5.3 Induction of decision trees
5.4 Gain again
5.5 Pruning
5.6 Conclusion

6 Rough set theory
6.1 Knowledge and discernability
6.2 Rough knowledge
6.3 Rough knowledge structures
6.4 Relative knowledge
6.5 Knowledge discovery
6.6 Conclusion

7 Inductive logic learning
7.1 From information systems to logic programs
7.2 Horn logic
7.3 Heuristic rule induction
7.4 Inducing Horn theories from data
7.5 Summary

8 Learning and ensemble learning
8.1 Learnability
8.2 Decomposing the learning problem
8.3 Improving by focusing on errors
8.4 A relational view on ensemble learning
8.5 Summary

9 The logic of knowledge
9.1 Knowledge representation
9.2 Learning
9.3 Summary

Notation
References
Index