A Neural Network Model For Concept Formation


Jiawei Chen, Yan Liu, Qinghua Chen, Jiaxin Cui
Department of Systems Science, School of Management, Beijing Normal University, Beijing 100875, P.R. China. chenjiawei@bnu.edu.cn

Fukang Fang
State Key Laboratory of Cognitive Neuroscience and Learning, Beijing Normal University, Beijing 100875, P.R. China. fkfang@bnu.edu.cn

Abstract

The acquisition of abstract concepts is a key step in the development of human intelligence, but the neural mechanism of concept formation is not yet clear. Research on complexity and self-organization theory indicates that a concept is an emergent result of the neural system and should be represented by an attractor. Associative learning and hypothesis elimination are considered the mechanisms of concept formation, and we argue that the Hebbian learning rule can describe both operations. In this paper, a neural network is constructed based on the Hopfield model, and its weights are updated according to the Hebbian rule. The forming processes of a natural concept, a number concept and an addition concept are simulated with the model. Facts from neuroanatomy provide some evidence for the model.

1 Introduction

How a concept is abstracted from concrete instances that share common features is an important and difficult problem. Over the last 100 years, concept formation and concept development have been discussed by psychologists mainly through behavioral experiments [10], but the neural mechanisms of these processes remain unclear. Recently, neural network models [1, 2] have been used to study concept formation from the angle of word learning. This approach first requires explaining how a concept is represented in the neural system, i.e. how a cognitive state such as a concept or a memory is represented by a physical system consisting of neurons and the connections between them. Hopfield addressed this problem using ideas about emergence, and his model has been applied frequently in the field of cognition. The Hopfield network [7] showed that an attractor dominating a substantial region of phase space around it can represent a nominally assigned memory. Like memory, a concept is also a cognitive state and should be represented by an attractor of the physical system. Collective behaviors are clearly more appropriate than individual units for expressing cognitive states, because they are more robust and stable. Several attractor networks [6, 4] have been created to study questions about language learning. A concrete instance can be represented by many features, and language is perhaps the best way to describe these features. Word meanings carve up the world in complex ways, such that an entity, action, property, or relation can typically be labeled by multiple words [13]. Language can be seen as a simple and complete projection of the real world. The early stages of word learning are often used to study concept formation [13, 12]. Although these works mainly focus on the acquisition of concrete concepts, abstract concepts can also be studied using language as the research object. Some Chinese characters are pictographs, and their graphemes can express more meaning than the graphemes of alphabetic writing systems. From a modeling point of view, Chinese characters are therefore more suitable than other languages for exploring the neural mechanisms of concept formation.
In this article, the process by which features are extracted from samples is simulated and the underlying neural mechanisms are discussed. A model based on the Hopfield network is constructed, and the connection weights are updated using a variant of the Hebb learning rule. Samples that share some common features, while each also having its own special features, are used to train the model. The weight states and test results indicate that the common features of the samples can be extracted by the model and represented by an attractor of the system. In the next section, the model is constructed, including its architecture, weight update algorithm, samples and training procedure. In the following section, three groups of samples are used to train the model and the simulation results are presented. Finally, the neural mechanism of concept formation is summarized and explained.

2 Model

Let us consider a fully connected recurrent neural network, which is a variant of the Hopfield model. The details of the network, such as the architecture, weight adjustment, samples and training, are described below.

2.1 Architecture Of The Network

Our neural network has a single layer composed of N neurons. Each neuron i has two states, V_i = 0 or V_i = 1, which denote not firing and firing at maximum rate respectively. The instantaneous state of the system is specified by listing the N values of V_i, so it is represented by a binary word of N bits. The network is fully connected, i.e. every neuron is connected with every other neuron. The strength of the connection from neuron j to neuron i is denoted w_{ij}. We require 0 <= w_{ij} <= 1 for all i, j, and w_{ii} = 0 for all i. How the system processes information is determined by the current weight state. Because there is only one layer in our model, each neuron both receives the input vector from the environment and expresses the output result. An input vector must therefore be represented by a binary word of N bits so that it matches the neuron states. For example, the ith input vector can be written as X_i = [x_{i,1}, x_{i,2}, ..., x_{i,N}], with x_{i,j} = 0 or 1. The output of the network is represented by the states of all N neurons.

2.2 Weight Update Algorithm

All the neuron states must be determined before the weights are updated. Two cases are considered when calculating the neuron states. On the one hand, when a sample is input to the network, the neuron states are set equal to the input vector, i.e. V_i = x_{c,i}, where x_{c,i} denotes the ith component of the current input vector. On the other hand, when no external instance is provided, the neuron states change with time according to

V_i = \mathrm{hardlim}\left( \sum_{j=1}^{N} w_{ij} V_j - \theta_i \right)    (1)

where \theta_i denotes the threshold of the ith neuron. In a given system, we assume that all thresholds are equal to a constant \theta. In the formal theory of neural networks, the weight w_{ij} is considered a parameter that can be adjusted to optimize the performance of the network for a given task. In our model, we assume that the weights are updated according to the Hebbian learning rule [5], i.e. the network learns by strengthening the connection weights between neurons that are activated at the same time. This can be written as

\Delta w_{ij} = \begin{cases} \eta w_{ij} - d, & \text{if } V_i = 1, V_j = 1 \\ -\eta w_{ij} - d, & \text{if } V_i = 1, V_j = 0 \\ -\eta w_{ij} - d, & \text{if } V_i = 0, V_j = 1 \\ -d, & \text{if } V_i = 0, V_j = 0 \end{cases}    (2)

where 0 < \eta < 1 is a small constant called the learning rate, and d is a small positive constant describing the rate at which w_{ij} decays back to zero in the absence of stimulation. Of course, equation (2) is just one possible way to specify rules for the growth and decay of the weights, and it differs somewhat from other forms of the Hebb rule [3]. From formula (2) we can see that the synaptic efficacy w_{ij} would grow without limit if the same potentiating stimulus were applied over and over again, so a saturation of the weights must be considered. In addition, the synaptic efficacy w_{ij} should remain non-negative. These two restrictions are achieved by setting

w_{ij}(t+1) = \begin{cases} 1, & \text{if } w_{ij}(t) + \Delta w_{ij} > 1 \\ w_{ij}(t) + \Delta w_{ij}, & \text{if } 0 \le w_{ij}(t) + \Delta w_{ij} \le 1 \\ 0, & \text{if } w_{ij}(t) + \Delta w_{ij} < 0 \end{cases}    (3)
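For concreteness, the following Python sketch shows one way to implement the state update (1) and the weight update (2)-(3) as reconstructed above. It is an illustrative sketch rather than the authors' code: the vectorized form, function names and the exact sign pattern of the piecewise rule are our own reading of equation (2).

    # Minimal sketch (not the authors' code) of the state update (1) and the
    # Hebbian weight update with decay and clipping (2)-(3); all names are
    # illustrative, and the piecewise rule follows the reconstruction above.
    import numpy as np

    def update_states(V, W, theta):
        """Equation (1): V_i = hardlim(sum_j w_ij V_j - theta)."""
        return (W @ V - theta >= 0).astype(int)

    def update_weights(W, V, eta=0.25, d=0.05):
        """Equations (2)-(3): Hebbian growth/decay followed by clipping to [0, 1]."""
        both_on = np.outer(V, V)                             # 1 where both neurons fire
        one_on = np.outer(V, 1 - V) + np.outer(1 - V, V)     # exactly one of the pair fires
        dW = eta * W * both_on - eta * W * one_on - d        # the -d decay applies in every case
        W_new = np.clip(W + dW, 0.0, 1.0)                    # saturation and non-negativity (3)
        np.fill_diagonal(W_new, 0.0)                         # keep w_ii = 0
        return W_new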
2.3 Samples For Training

In our model, the features of a sample are represented by the dot matrix of Chinese characters, with each element of the matrix denoting one feature. The value of each element is 1 or 0, indicating whether or not the sample has the corresponding feature. Since each neuron in the network has only two states, any input vector must be represented by a binary word so that it matches the neuron states. A combination of a few Chinese characters is chosen as a sample for our model. Each sample is represented by an m x 16 x 16 dot matrix, where m denotes the number of Chinese characters. Matrix elements lying on a character's strokes are set to 1, and all others are set to 0. Finally, the input vector is obtained by converting the dot matrix into a vector; an example of an instance is shown in Figure 1. Note that different training sets are used for different experimental purposes. Every sample within a training set has the same number of characters, but samples belonging to different sets may have different numbers of characters. The number of neurons in the network is determined by the number of characters in each sample. For example, a network using two-character samples consists of 2 x 16 x 16 = 512 neurons.

Figure 1. The representation of an instance. (A) The instance includes 2 characters; (B) the instance represented by a dot matrix; (C) the input vector obtained by stacking the dot matrix into one column.
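A short sketch of the encoding step may help make this concrete. It assumes the m characters have already been rasterized to 16 x 16 binary grids; the rasterization itself is not shown, and the function name is ours.

    # Illustrative sketch of turning a sample's dot matrices into an input vector.
    import numpy as np

    def sample_to_input_vector(char_matrices):
        """char_matrices: list of m arrays of shape (16, 16) with entries 0/1.
        Returns a binary vector of length m * 16 * 16, matching the N neurons."""
        stacked = np.stack(char_matrices)          # shape (m, 16, 16)
        return stacked.reshape(-1).astype(int)     # flatten the dots into one column

    # Example: a two-character sample yields a vector with N = 2 * 16 * 16 = 512 components.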

2.4 Training And Testing

Training is a procedure in which the weights are updated iteratively according to the external input. In a given experiment, the training set for the network consists of k samples with some identical properties. During each epoch, an instance randomly selected from the training set is shown to the network. The neuron states are set equal to the input vector, and the weights are updated according to formulae (2) and (3). The network is trained repeatedly until the weight matrix changes only over a small range. After learning a training set in which the instances share some identical features, does the network know these features? We address this question by presenting the network with some input patterns and examining the output patterns of the network. If the network has learned these features, then it will evolve to a stable state that denotes the concept whenever a sample with all or most of the identical features is shown to it. In the testing procedure, all of the weights are fixed, a test sample is shown to the network, and the output of the network is calculated according to formula (1).
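The training and testing procedures just described can be sketched as follows, reusing the illustrative helpers defined above. The parameter values mirror the horse experiment in the next section (N = 512, eta = 0.25, d = 0.05, 150 epochs), but the convergence check via a fixed iteration cap is our own assumption.

    # Sketch of Section 2.4: clamp the states to a random training sample,
    # update the weights, repeat; then test with fixed weights by iterating (1).
    import numpy as np

    def train(samples, n_epochs=150, eta=0.25, d=0.05, seed=0):
        rng = np.random.default_rng(seed)
        N = samples[0].size
        W = rng.uniform(0.2, 1.0, size=(N, N))   # initial weights drawn in (0.2, 1)
        W = np.triu(W, 1)
        W = W + W.T                              # symmetric, with w_ii = 0
        for _ in range(n_epochs):
            V = samples[rng.integers(len(samples))]   # clamp neuron states to the sample
            W = update_weights(W, V, eta, d)
        return W

    def test(W, probe, theta=30, max_iters=50):
        """Iterate equation (1) with fixed weights until the state settles."""
        V = probe.copy()
        for _ in range(max_iters):
            V_next = update_states(V, W, theta)
            if np.array_equal(V_next, V):
                break
            V = V_next
        return V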

3 Simulation Results

First, we simulate how the concept horse is extracted from samples. For a horse sample, we consider two kinds of features: shape and color. The shape is the essential feature that all horses share, and the color is the special feature that each sample has solely on its own. The model simulates the process in which the abstract concept horse is extracted from several samples of horses with different colors by drawing out the common features and eliminating the unique characteristics. The concept should correspond to an attractor of our model.

Figure 2. The training set includes six instances.

The six samples shown in Figure 2 are used in the model, i.e. k = 6. The number of neurons, N = 512, is determined by counting the dots of any sample. Before training, we initialize the weights randomly between 0.2 and 1, i.e.

0.2 < w_{ij}(0) = w_{ji}(0) < 1 for all i, j with i \neq j, and w_{ii}(0) = 0 for all i.    (4)

The initial weights are shown in Figure 3(A). The other parameters are set to \eta = 0.25 and d = 0.05. The network is trained 150 times with samples selected randomly from the training set, and the resulting weights are shown in Figure 3(B).

Figure 3. The weight matrix evolves from a random initial state to a stable final state. (A) The initial state; (B) the final state.

By comparing the trained weights with the initial ones, we obtain the following results:
1. The weight matrix changes from a random distribution to a stable state during the training process, and no obvious change occurs once the weights reach the stable state.
2. From Figure 3 we can see that the number of connections between neurons is massively reduced, while the average connection strength changes from 0.600 to 0.999 during training. More connections indicate more plasticity of the network, while the strong and stable connections presumably denote particular cognitive patterns.
3. Because of the similarity between the individual characters of the samples, some elements in the top-left quarter of the weight matrix are not 0. All elements except those in the bottom-right quarter would change to 0 if the number of samples in the training set were increased.

The stable state of the weights is an attractor of the system, and the fixed point corresponds to a cognitive pattern, the concept horse, extracted from the samples. For the theoretical analysis of the collective behavior of the neurons, we refer to Hopfield's work. We can also examine the attractor and the cognitive state of the system directly using the three test samples displayed in Figure 4(A), (B) and (C). Here, we set the parameter \theta = 30. The network weights are fixed and the samples are input to the network one at a time. The output of the network is calculated by formula (1), and the test results are also shown in Figure 4. In phase space, the three samples are dominated by an attractor, which is the nominally assigned concept horse, and they eventually settle into the attractor state. On the other hand, a sample that does not lie in the basin of the attractor will not evolve to the stable state; an example is shown in Figure 4(D).

Figure 4. The attractor of the network is tested with three positive examples and one negative instance. (A) A sample arbitrarily selected from the training set; (B) an incomplete sample that includes most, but not all, features of the concept horse; (C) a sample that includes all features of horse, with its individual features given arbitrarily; (D) a sample that includes only a few features of the concept horse, although its individual features were used in the training process.

Our model simulates the forming process of a natural concept using the horse as an example. In fact, any class of concept formed by extracting common features from concrete instances can be simulated with our model, such as the concepts of natural number and addition. The simulation results are shown in Figure 5.

Figure 5. Two other examples of concept formation. (A) The number concept 3 is extracted from the six samples; (B) the addition concept 2+3=5 is extracted from the six samples.

Certainly, the concept of number has many connotations, including the concrete concept, the abstract concept, the ordering concept and number structure [9]. Our model only simulates the emergence of the abstract concept from the concrete concept.

4 Discussion

As mentioned above, concept formation has been discussed from the angle of word learning, and two broad classes of proposals for how word learning works have dominated the literature: hypothesis elimination and associative learning. We consider the union of these two operations to be the mechanism of concept formation. On the one hand, the features that all samples share are called essential features, and the connections between neurons that represent essential features are strengthened during training; this is associative learning at work. On the other hand, the connections among individual features, and between individual features and essential features, gradually become weaker and weaker; this is hypothesis elimination at work. Both operations can be precisely described by the Hebbian learning rule, so under certain conditions the Hebbian learning rule should be the neural mechanism of concept formation. The plausibility of our model is supported by facts from neuroanatomy. Experiments indicate that the number of connections between neurons is massively reduced in the adult compared to the infant. In the cat, for example, there is a huge decrease in the number of callosal axons during neonatal life, and a 90% reduction in the number of synapses and branches of the axonal arbors of the remaining fibres [8, 11]. This is similar to our simulation results. However, the process of concept formation is very complicated, and its essence is emergence.

The Hebbian learning rule can perhaps capture the formation mechanism of some simple concepts. For complex scientific and social concepts, more kinds of factors and more complex mechanisms should be considered.

Acknowledgement

This work is supported by NSFC under grants No.60534080, No.60374010 and No.70471080.

References

[1] E. Colunga, L. B. Smith, From the lexicon to expectations about kinds: A role for associative learning, Psychological Review 112 (2005) 347-382.
[2] M. Gasser, L. B. Smith, Learning nouns and adjectives: A connectionist approach, Language and Cognitive Processes 13 (1998) 269-306.
[3] W. Gerstner, W. M. Kistler, Spiking Neuron Models: Single Neurons, Populations, Plasticity, Cambridge University Press, Cambridge, 2002.
[4] M. W. Harm, M. S. Seidenberg, Phonology, reading acquisition, and dyslexia: Insights from connectionist models, Psychological Review 106 (1999) 491-528.
[5] D. O. Hebb, The Organization of Behavior, Wiley, New York, 1949.
[6] G. E. Hinton, T. Shallice, Lesioning an attractor network: Investigations of acquired dyslexia, Psychological Review 98 (1991) 74-95.
[7] J. J. Hopfield, Neural networks and physical systems with emergent collective computational abilities, Proc. Natl. Acad. Sci. USA 79 (1982) 2554-2558.
[8] G. M. Innocenti, Exuberant development of connections, and its possible permissive role in cortical evolution, Trends Neurosci. 18 (1995) 397-402.
[9] C. Lin, The study on the development of the number concept and operational ability in schoolchildren, Acta Psychologica Sinica 3 (1981) 289-298.
[10] E. Machery, 100 years of psychology of concepts: the theoretical notion of concept and its operationalization, Studies in History and Philosophy of Biological and Biomedical Sciences 38 (2007) 63-84.
[11] B. Payne, H. Pearson, P. Cornwell, Development of visual and auditory cortical connections in cat, Cerebral Cortex 7 (1988) 309-389.
[12] T. Regier, The emergence of words: Attentional learning in form and meaning, Cognitive Science 29 (2005) 819-865.
[13] F. Xu, J. B. Tenenbaum, Word learning as Bayesian inference, Psychological Review 114 (2007) 245-272.