Explanation and Simulation in Cognitive Science

Outline: simulation and computational modeling; symbolic models; connectionist models; comparing symbolism and connectionism; hybrid architectures; cognitive architectures.

Simulation and Computational Modeling. With detailed and explicit cognitive theories, we can implement the theory as a computational model, and then execute the model to simulate the cognitive capacity and to derive predictions from the theory. The predictions can then be compared to empirical data.

Questions: What kinds of theories are amenable to simulation? What techniques work for simulation? Is simulating the mind different from simulating the weather?

The Mind & the Weather. The mind may just be a complex dynamic system, but it isn't amenable to generic simulation techniques. The relation between theory and implementation is indirect: theories tend to be rather abstract. The relation between simulation results and empirical data is indirect: simulations tend to be incomplete. The need to simulate helps make theories more concrete, but improvement of the simulation must be theory-driven, not just an attempt to capture the data.

Symbolic Models. High-level functions (e.g., problem solving, reasoning, language) appear to involve explicit symbol manipulation. Example: chess and shopping seem to involve representation of aspects of the world, and systematic manipulation of those representations.

Central Assumptions: mental representations exist; representations are structured; representations are semantically interpretable.

What's in a representation? A representation must consist of symbols; symbols must have parts; parts must have independent meanings; and those meanings must contribute to the meanings of the symbols which contain them. E.g., 34 contains the parts 3 and 4, which have independent meanings; the meaning of 34 is a function of the meaning of 3 in the tens position and 4 in the units position.

In favor of structured mental representations (Fodor & Pylyshyn, 1988): Productivity: it is through structuring that thought is productive (a finite number of elements, an infinite number of possible combinations). Systematicity: if you can think John loves Mary, you can think Mary loves John. Compositionality: the meaning of John loves Mary is a function of its parts and their modes of combination. Rationality: if you know A and B is true, then you can infer A is true.

What do you do with them? Suppose we accept that there are symbolic representations: how can they be manipulated? By a computing machine. Any such approach has three components: a representational system, a processing strategy, and a set of predefined machine operations.

Automata Theory identifies a family of increasingly powerful computing machines: finite state automata, pushdown automata, Turing machines.

Automata, in brief (Figure 2.2 in Green et al., Chapter 2): this FSA takes as input a sequence of on and off messages, and accepts any sequence ending with an on. A PDA adds a stack: an infinite-capacity, limited-access memory, so that what the machine does depends on the input, the current state, plus the memory.
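
As a minimal sketch (the state names are illustrative, not from the figure), the FSA just described fits in a few lines of Python:

```python
# FSA accepting any sequence of "on"/"off" messages that ends with "on".
def accepts(sequence):
    state = "last_off"                  # start state
    for message in sequence:
        state = "last_on" if message == "on" else "last_off"
    return state == "last_on"           # accept iff the final message was "on"

assert accepts(["off", "on", "on"])
assert not accepts(["on", "off"])
```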

A Turing machine changes this memory to allow any location to be accessed at any time. The state transition function specifies read/write instructions, as well as which state to move to next. Any effective procedure can be implemented on an appropriately programmed Turing machine, and Universal Turing machines can emulate any Turing machine, via a description on the tape of the machine and its inputs. Hence the philosophical disputes: Is the brain Turing powerful? Does machine design matter or not?
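
A toy sketch of the step loop just described, assuming a dictionary tape and an illustrative transition table (the example machine is invented, not from the text):

```python
# Transitions map (state, symbol) -> (symbol to write, head move, next state),
# exactly the read/write-plus-next-state function described above.
def run(tape, transitions, state="start", head=0):
    while state != "halt":
        symbol = tape.get(head, "_")                  # "_" = blank cell
        write, move, state = transitions[(state, symbol)]
        tape[head] = write
        head += 1 if move == "R" else -1
    return tape

# Example machine (hypothetical): flip every 1 to 0 until a blank is reached.
flip = {("start", "1"): ("0", "R", "start"),
        ("start", "_"): ("_", "R", "halt")}
print(run({0: "1", 1: "1"}, flip))  # {0: '0', 1: '0', 2: '_'}
```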

More practical architectures: Von Neumann machines. They are strictly less powerful than Turing machines (finite memory), but a distinguished area of memory for stored programs makes them conceptually easier to use than TMs. A special memory location points to the next instruction; on each processing cycle: fetch the instruction, move the pointer to the next instruction, execute the current instruction.
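
A toy illustration of that fetch-execute cycle (the two-instruction set here is made up for the example):

```python
# Stored program and data share one memory; the program counter (pc)
# implements the cycle: fetch, advance the pointer, then execute.
def run(memory):
    pc, acc = 0, 0                       # program counter, accumulator
    while memory[pc][0] != "HALT":
        op, arg = memory[pc]             # fetch instruction
        pc += 1                          # move pointer to next instruction
        if op == "LOAD":  acc = arg      # execute current instruction
        elif op == "ADD": acc += arg
    return acc

program = [("LOAD", 2), ("ADD", 3), ("HALT", None)]
print(run(program))  # 5
```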

Production Systems. Introduced by Newell & Simon (1972): a cyclic processor with two main memory structures: long-term memory holding rules (productions), and working memory holding a symbolic representation of the current system state. Example: IF goal(sweeten(X)) AND available(sugar) THEN action(add(sugar, X)) AND retract(goal(sweeten(X))).

Recognize phase (pattern matching): find all rules in LTM that match elements in WM. Act phase (conflict resolution): choose one matching rule, execute it, update WM, and (possibly) perform an action. Complex sequences of behavior can thus result. The power of the pattern matcher can be varied, allowing different use of WM; the power of conflict resolution will influence behavior given multiple matches (e.g., choose the most specific rule). This works well for problem solving. Would it work for pole-balancing?
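
A minimal recognize-act loop, sketched under assumptions: the rule format is simplified to sets of symbols, and conflict resolution picks the most specific match (most conditions):

```python
def run(rules, wm):
    while True:
        matching = [r for r in rules if r["if"] <= wm]      # recognize phase
        if not matching:
            return wm
        rule = max(matching, key=lambda r: len(r["if"]))    # conflict resolution
        wm = (wm - rule["retract"]) | rule["add"]           # act phase: update WM

# The sweetening rule from the slide, in this simplified format.
rules = [{"if": {"goal:sweeten", "available:sugar"},
          "add": {"added:sugar"}, "retract": {"goal:sweeten"}}]
print(run(rules, {"goal:sweeten", "available:sugar"}))
```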

Connectionist Models. The basic assumption: there are many processors connected together, operating simultaneously. The processors are called units, nodes, or artificial neurons.

A connectionist network is a set of nodes, connected in some fashion. Nodes have varying activation levels, and interact via the flow of activation along the connections. Connections are usually directed (one-way flow) and weighted (strength and nature of interaction; positive weight = excitatory, negative = inhibitory). A node's activation is computed from the weighted sum of its inputs.
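
A sketch of that activation computation; the sigmoid squashing function is an assumption, since the text specifies only the weighted sum:

```python
import math

def activation(inputs, weights):
    net = sum(a * w for a, w in zip(inputs, weights))  # weighted sum of inputs
    return 1.0 / (1.0 + math.exp(-net))                # squash into (0, 1)

print(activation([1.0, 0.5], [0.8, -0.3]))  # one excitatory, one inhibitory input
```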

Local vs. Distributed Representation. Parallel Distributed Processing is a (the?) major branch of connectionism. In principle, a connectionist node could have an interpretable meaning, e.g., active when the input is red, or grandmother, or whatever. However, an individual PDP node will not have such an interpretable meaning: activation over the whole set of nodes corresponds to red, and an individual node participates in many such representations.

PDP. PDP systems lack systematicity and compositionality. There are three main types of networks: associative, feedforward, and recurrent.

Associative: to recognize and reconstruct patterns. Present an activation pattern to a subset of units, then let the network settle into a stable activation pattern (a reconstruction of a previously learned state).
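
A Hopfield-style sketch of settling; the sign-based update rule is an assumption, as the text only says the network settles into a stable state:

```python
import numpy as np

def settle(W, state, steps=100):
    for _ in range(steps):
        new = np.sign(W @ state)         # each unit: sign of its weighted input
        new[new == 0] = 1
        if np.array_equal(new, state):   # stable state = reconstructed pattern
            return state
        state = new
    return state

pattern = np.array([1, -1, 1, -1])
W = np.outer(pattern, pattern) - np.eye(4)   # store one pattern in the weights
print(settle(W, np.array([1, -1, -1, -1])))  # corrupted cue -> [1 -1 1 -1]
```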

Feedforward: not for reconstruction, but for mapping from one domain to another. Nodes are organized into layers, and activation spreads through the layers in sequence; a given layer can be thought of as an activation vector. Simplest case: an input layer (stimulus) and an output layer (response). Two-layer networks are very restricted in power; intermediate (hidden) layers provide most of the additional computational power needed.
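
A minimal forward pass through layered activation vectors (the layer sizes and the sigmoid are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights):
    for W in weights:            # one weight matrix per layer-to-layer step
        x = sigmoid(W @ x)       # activation vector of the next layer
    return x

rng = np.random.default_rng(0)
weights = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]  # 3-4-2 net
print(forward(np.array([1.0, 0.0, 1.0]), weights))            # output layer
```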

Recurrent: feedforward nets compute mappings given the current input only; recurrent networks allow the mapping to take previous input into account. Jordan (1986) and Elman (1990) introduced networks with feedback links from the output or hidden layers to context units, and feedforward links from the context units to the hidden units. A Jordan network's output depends on the current input and the previous output; an Elman network's output depends on the current input and the whole of the previous input history.
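
One Elman-style step, sketched under assumed dimensions: the context units hold a copy of the previous hidden layer, so the output reflects the input history:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elman_step(x, context, W_in, W_ctx, W_out):
    hidden = sigmoid(W_in @ x + W_ctx @ context)  # input + context feed hidden
    return sigmoid(W_out @ hidden), hidden        # new context = copy of hidden

rng = np.random.default_rng(0)
W_in, W_ctx, W_out = (rng.normal(size=(4, 3)),
                      rng.normal(size=(4, 4)), rng.normal(size=(2, 4)))
context = np.zeros(4)
for x in [np.array([1., 0., 0.]), np.array([0., 1., 0.])]:  # a short sequence
    output, context = elman_step(x, context, W_in, W_ctx, W_out)
print(output)
```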

Key Points about PDP. It's not just that a net can recognize a pattern or perform a mapping; it's the fact that it can learn to do so, on the basis of limited data. And the way that networks respond to damage is crucial.

Learning. Present the network with a series of training patterns, and adjust the weights on connections so that the patterns are encoded in the weights. Most training algorithms perform small adjustments to the weights per trial, but require many presentations of the training set to reach a reasonable degree of performance. There are many different learning algorithms.

Learning (contd.). Associative nets support the Hebbian learning rule: adjust the weight of a connection by an amount proportional to the correlation in activity of the corresponding nodes. So if both nodes are active, increase the weight; if both are inactive, increase the weight; if they differ, decrease the weight. This is important because it is biologically plausible and very effective.
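
The rule as stated, sketched with +/-1 activations so that matching activity raises the weight and mismatching activity lowers it (the learning rate eta is an assumed parameter):

```python
def hebb_update(w, pre, post, eta=0.1):
    return w + eta * pre * post   # weight change proportional to correlation

print(hebb_update(0.0,  1,  1))   # both active   -> weight up
print(hebb_update(0.0, -1, -1))   # both inactive -> weight up
print(hebb_update(0.0,  1, -1))   # they differ   -> weight down
```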

Learning (contd.). Feedforward and recurrent nets often exploit the backpropagation-of-error rule: the actual output is compared to the expected output, and the difference is computed and propagated back toward the input, layer by layer, driving weight adjustments. Note: unlike the Hebb rule, this is supervised learning.
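
A compressed backpropagation sketch for one hidden layer, assuming squared error and sigmoid units; note how the output error is propagated back to the earlier weights:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def backprop_step(x, target, W1, W2, eta=0.5):
    h = sigmoid(W1 @ x)                           # forward pass
    y = sigmoid(W2 @ h)
    delta_out = (y - target) * y * (1 - y)        # error at the output layer
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated back
    W2 -= eta * np.outer(delta_out, h)            # weight adjustments
    W1 -= eta * np.outer(delta_hid, x)
    return W1, W2

rng = np.random.default_rng(0)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
for _ in range(1000):   # many presentations, small adjustments per trial
    W1, W2 = backprop_step(np.array([1., 0., 1.]), np.array([1.]), W1, W2)
```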

Psychological Relevance. Given a network of fixed size, if there are too few units to encode the training set, then interference occurs. This is suboptimal, but better than nothing, since at least approximate answers are provided. And interference is the flipside of generalization, which provides output for unseen input, e.g., weep → wept; bid → bid.

Damage. Either remove a proportion of the connections, or introduce random noise into activation propagation. The resulting behavior can simulate that of people with various forms of neurological damage. Graceful degradation: impairment, but residual function.
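
Both lesioning methods, sketched with assumed proportions and noise scales:

```python
import numpy as np

rng = np.random.default_rng(0)

def remove_connections(W, proportion=0.2):
    mask = rng.random(W.shape) >= proportion   # drop ~20% of the connections
    return W * mask

def noisy_propagation(W, x, noise=0.1):
    return W @ x + rng.normal(scale=noise, size=W.shape[0])  # noisy activation flow
```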

Example of Damage. Hinton & Shallice (1991) and Plaut & Shallice (1993) on deep dyslexia: visual errors (cat read as cot) and semantic errors (cat read as dog). Networks constructed for orthography-to-phonology mapping were lesioned in various ways, producing behavior similar to that of human subjects.

Symbolic Networks. Though distributed representations have proved very important, some researchers prefer localist approaches. Semantic networks are frequently used in AI-based approaches, and in cognitive approaches which focus on conceptual knowledge: one node per concept; typed links between concepts; inference by link-following.
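
A toy semantic network (the concepts and link types are illustrative): one node per concept, typed links between them, and inference by following isa links to inherit properties:

```python
links = {("canary", "isa"): "bird", ("bird", "isa"): "animal",
         ("bird", "can"): "fly", ("animal", "can"): "move"}

def infer(concept, link_type):
    while concept is not None:
        if (concept, link_type) in links:      # property found at this node
            return links[(concept, link_type)]
        concept = links.get((concept, "isa"))  # otherwise climb the isa chain
    return None

print(infer("canary", "can"))  # "fly", inherited from bird
```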

Production systems with spreading activation: Anderson's work (ACT, ACT*, ACT-R) uses symbolic networks with continuous activation values. ACT-R never removes working memory elements; their activation instead decays over time. Productions are chosen on the basis of (co-)activation.

Interactive Activation Networks are essentially localist connectionist networks, featuring self-excitatory and lateral inhibitory links which ensure that there's always a winner in a competition (e.g., McClelland & Rumelhart's model of letter perception). Appropriate combinations of levels, with feedback loops between them, allow modeling of complex data-driven and expectation-driven behavior.
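
A sketch of one such competition, with an assumed simplified update rule: each node excites itself and inhibits its rivals, so repeated updates drive the network toward a single winner:

```python
import numpy as np

def compete(a, self_excite=1.1, inhibit=0.4, steps=20):
    for _ in range(steps):
        a = self_excite * a - inhibit * (a.sum() - a)  # lateral inhibition
        a = np.clip(a, 0.0, 1.0)                       # keep activations bounded
    return a

print(compete(np.array([0.50, 0.52, 0.40])))  # only the strongest node survives
```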

Comparing Symbolism & Connectionism. As is so often the case in science, the two approaches were initially presented as exclusive alternatives.

Connectionist strengths: interference, generalization, graceful degradation. Symbolists complain: connectionists don't capture structured information; network computation is opaque; networks are merely implementation-level.

Symbolic strengths: productivity, systematicity, compositionality. Connectionists complain: symbolists don't relate the assumed structures to the brain; they relate them to von Neumann machines.

Connectionists can claim: complex rule-oriented behavior *emerges* from the interaction of subsymbolic processes, so symbolic models describe, but do not explain.

Symbolists can claim: though PDP models can learn implicit rules, the learning mechanisms are usually not neurally plausible after all, and performance is highly dependent on the exact choice of architecture.

Hybrid Architectures. But really, the truth is that different tasks demand different technologies. Hybrid approaches explicitly assume that neither the connectionist nor the symbolic approach is flawed, and that their techniques are compatible.

Two main hybrid options. Physically hybrid models contain subsystems of both types; the issues are interfacing and modularity (e.g., using an Interactive Activation Network to integrate results). Non-physically hybrid models have subsystems of only one type, but described in two ways; the issue is levels of description (e.g., connectionist production systems).

Cognitive Architectures. Most modeling is aimed at specific processes or tasks. But it has been argued that most real tasks involve many cognitive processes, and most cognitive processes are used in many tasks. Hence, we need unified theories of cognition.

Examples: ACT-R (Anderson) and Soar (Newell). Both are based on production-system technology: task-specific knowledge is coded into the productions, with a single processing mechanism and a single learning mechanism.

Like computer architectures, cognitive architectures tend to make some tasks easy at the price of making others hard. Unlike computer architectures, cognitive architectures must include learning mechanisms. But note that the unified approaches sacrifice genuine task-appropriateness, and perhaps also biological plausibility.

A Cognitive Architecture is: a fixed arrangement of particular functional components, plus a processing strategy.