NEURAL NETWORKS TO SIMULATE HUMAN LEARNING: A SHIFT TOWARDS MODULAR ARCHITECTURES


Syed Sibte Raza Abidi
School of Computer Sciences, Universiti Sains Malaysia, 11800 Penang, Malaysia. Email: sraza@cs.usm.my

ABSTRACT

Neural networks have a natural propensity for learning, whether from instruction or from experience. We believe that neural networks provide substantial opportunities for simulating various human learning activities, in that these networks emphasise an inherent adaptive learning ability, either supervised (based on instructions) or unsupervised (based on observations). However, to use neural networks for simulating aspects of human cognition and development, it remains of interest to examine the parallels, if any, between human learning and neural network learning mechanisms. To that end, we suggest a possible interpretation of traditional psychological notions of human learning in neural network terminology. We argue that in order to simulate aspects of human learning, it is important to use a modular neural network architecture that integrates a variety of neural networks in some principled fashion. We propose a framework for developing modular neural network architectures, and present ACCLAIM, an exemplar modular neural network architecture for simulating the development of early child language.

1. INTRODUCTION

Learning is an extensively studied subject, particularly in psychology and more recently in Artificial Intelligence and Neural Networks (or Connectionist networks). Neural networks emphasise an inherent adaptive learning ability, either supervised (based on instructions) or unsupervised (based on observations). Both the neural network community and psychologists widely suggest that, owing to these learning capabilities, neural networks provide substantial opportunities for simulating various human learning activities. However, it remains of interest to examine the parallels, if any, between human learning and neural network learning mechanisms.
The influence of psychology on neural network learning has always been very direct: Hebbian learning, that is, reinforcing the connection weights between simultaneously active units, was inspired by early Pavlovian learning models. More recently, some neural network researchers and philosophers of science, including McClelland (1988), Bechtel & Abrahamsen (1991) and Seidenberg (1993), have suggested a neural network based interpretation of psychological notions of learning, in particular the notions of human learning proposed by the eminent Swiss psychologist Jean Piaget. In this paper, we first provide an interpretation of psychological notions of human learning in neural network terminology. This exercise leads towards establishing the aptness of neural networks for simulating aspects of human learning and cognition. Next, we argue that to perform realistic simulations of human learning activities it is important to use modular neural network architectures, so as to incorporate the variety of constraints that need to be addressed by a realistic simulation. In this regard, we propose a framework for developing modular neural network architectures. Finally, we demonstrate how neural networks can be used to simulate, or more accurately be trained to mimic, the development of language amongst children during infancy. We present an exemplar modular neural network architecture, ACCLAIM, which not only simulates aspects of the development of child language at both the one-word and two-word stages, but also produces child-like one-word utterances and two-word sentences.
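The Hebbian rule mentioned above can be stated in a few lines of code: strengthen a connection in proportion to the product of pre- and post-synaptic activity. The following is a minimal sketch with toy values of our own choosing, not an implementation from any of the cited models.

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.1):
    # Reinforce connections between simultaneously active units:
    # the weight change is proportional to the product of pre- and
    # post-synaptic activity (Hebb's rule).
    return w + lr * np.outer(post, pre)

# Toy example (values are illustrative).
w = np.zeros((2, 2))            # weights: 2 input units -> 2 output units
pre = np.array([1.0, 0.0])      # only the first input unit is active
post = np.array([0.0, 1.0])     # only the second output unit is active
w = hebbian_update(w, pre, post)
# Only the connection between the co-active pair of units is strengthened.
```

Note that the rule is purely local: each weight changes using only the activity of the two units it connects, which is what made it attractive as a model of conditioning.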

2. AN INTERPRETATION OF HUMAN LEARNING IN NEURAL NETWORK TERMINOLOGY

According to the eminent developmental psychologist Jean Piaget, learning is the acquisition of knowledge due to some particular information provided by the environment. "Learning is inconceivable without a theoretically prior interior structure of equilibrium which provides the capacity to learn and the structuring of the learning process; in a wide sense, it includes both" (Furth, 1969: 294). Furthermore, cognitive development is made possible through an interaction between what Piaget calls assimilation, a process by which perceptual stimuli are absorbed and interpreted, and accommodation, a co-occurring process whereby the internal structures are adjusted to facilitate the assimilation of the new perceptual stimuli. It may be noted that Piaget's definition of human learning includes references to an environment, a prior interior structure, a capacity to learn and a learning process. Piagetian notions of learning, synthesising biological growth and environmental influence, have a computational interpretation, albeit a rather simplistic one. We argue that a computational interpretation of psychological notions of learning needs to incorporate data structures that can learn. By learning we mean that the data structures should have the tendency to modify or expand themselves to incorporate new information acquired by way of continuous interaction with the environment. Traditional AI structures may suffice to represent knowledge, but they lack the ability to learn in a developmental, time-varying manner. By contrast, neural networks have a natural propensity to learn, either from experience or from being told, and their learning mechanisms have some affinity with Piaget's notions of learning.
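The contrast with static AI data structures can be made concrete. A minimal example of a "data structure that can learn" is a layer of connection weights that adapts through continued exposure to a stimulus, reducing the discrepancy between the response it produces and the response it should produce. The values and learning rate below are our own toy illustration (a single-layer delta rule), not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=(2, 3))   # prior interior structure: 3 inputs -> 2 outputs

x = np.array([1.0, 0.5, -0.5])           # a perceptual stimulus (toy values)
target = np.array([1.0, 0.0])            # the expected response to that stimulus

for _ in range(200):
    observed = w @ x                      # the response actually produced
    error = target - observed             # discrepancy: expected vs. observed
    w += 0.1 * np.outer(error, x)         # adjust the structure to reduce it

# After repeated interaction with the environment the structure has adapted:
# w @ x is now close to the target pattern.
```

A symbolic frame or semantic network holds whatever is written into it; here, by contrast, the representation itself changes gradually with experience, which is the property the argument above turns on.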
Our attempt to reinterpret Piaget's notion of learning in neural network terminology takes into account James McClelland's (1989) seminal paper, in which he provides an introductory exposition of connectionism's[1] influence on modelling aspects of cognition. McClelland's essay includes a description of the learning principle governing cognitive development: "adjust the parameters of the mind in proportion to the extent to which their adjustments can produce a reduction in the discrepancy between the expected and observed events" (1989: 20). It is of interest to note that from McClelland's learning principle, which claims to capture the residue of Piaget's notions of human learning, emerges a neural network based interpretation of Piaget's notions of learning, as noted in Table 1.

Piagetian Constructs          Analogous Neural Network Notions
Parameters of the mind        Connections among units. Both entities are amenable to alteration due to experience.
Expected event                Desired pattern of activation over the network's output units.
Observed event                Actual pattern of activation produced over the network's output units.
Adjustment of the parameters  Connectionist learning processes that involve adjustment of connections.
Discrepancy reduction         'Error minimisation' process during connectionist learning, reducing error between expected and observed patterns of activation.

Table 1: Correspondence between Piagetian constructs and analogous neural network notions.

We believe that an advantage of suggesting a neural network based interpretation of Piagetian notions of learning is that these notions can now be implemented in neural networks and observed by simulating a variety of cognition-oriented scenarios, for instance the simulation of concept development, language development, language production and so on.

[1] We use the words 'connectionism' or 'Connectionist' as synonyms of 'neural networks'.

3. A SHIFT TOWARDS MODULAR NEURAL NETWORK ARCHITECTURES

Over the years neural network technology has certainly matured: in a theoretical sense, new neural network architectures and learning algorithms have been formulated, and the philosophical implications of neural networks now seem more well-grounded. We believe that, now that the efficacy of neural networks is widely accepted, the scope of cognitive modelling using neural networks needs to be expanded. Previously, many neural network researchers attempted to simulate aspects of human learning, in particular linguistic behaviour, using a single neural network and learning mechanism: the multilayered backpropagation network, a controlled feedback loop implementing a supervised learning algorithm. The success of such simulations was judged in terms of the ability of the neural network to associate a set of input patterns with a corresponding set of output patterns. Although the strategy employed by early researchers is seemingly valid for simulating low-level cognitive activities, it would certainly prove inadequate for simulating a high-level cognitive activity, which involves an interplay of a variety of cognitive aspects. Developmental psychologists have consistently argued that human development, which may include the development of language, sensori-motor control, visual recognition, object permanence, and so on, is achieved through different learning mechanisms, for instance error correction, classical conditioning, self-organisation, pattern classification and so on. Therefore, to perform a realistic simulation of some aspect of human cognitive development, in our case language development, one at least needs a unified framework that (a) incorporates a variety of learning mechanisms; (b) manipulates a variety of inputs (perceptual, verbal, functional, etc.); (c) includes both localist and distributed representation schemes; and (d) satisfies multiple simultaneous constraints.
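The single-network approach described above can be made concrete. The sketch below is a minimal multilayered backpropagation network that does exactly what such simulations were judged on: associating a set of input patterns with a set of output patterns. The task (logical AND), layer sizes and learning rate are our own toy illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # input patterns
Y = np.array([[0.], [0.], [0.], [1.]])                  # target output patterns (AND)

W1 = rng.normal(scale=1.0, size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(scale=1.0, size=(4, 1)); b2 = np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

for _ in range(5000):
    h = sigmoid(X @ W1 + b1)              # forward pass through the hidden layer
    out = sigmoid(h @ W2 + b2)
    d_out = (out - Y) * out * (1 - out)   # backpropagate the output error...
    d_h = (d_out @ W2.T) * h * (1 - h)    # ...down to the hidden layer
    W2 -= 0.5 * h.T @ d_out; b2 -= 0.5 * d_out.sum(axis=0)
    W1 -= 0.5 * X.T @ d_h;   b1 -= 0.5 * d_h.sum(axis=0)
```

Everything this network learns lives in one undifferentiated set of weights, which is precisely why a single such network struggles when a simulation must juggle several kinds of input, representation and learning mechanism at once.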
This brings into relief the need for a modular neural network architecture: an architecture that integrates both supervised and unsupervised learning algorithms in a unified framework, thus exploiting the collective strengths of a variety of neural networks to provide a more realistic simulation. In a modular neural network architecture the individual neural networks retain their structural and functional distinctness and can be viewed as independent 'modules' of a model. Developing a modular neural network architecture, in simple terms, requires mixing and matching the relative strengths of a variety of neural networks. We propose a framework for developing modular neural network architectures that distinguishes candidate neural networks on the basis of their intrinsic characteristics, such as learning mechanisms, input/output representation schemes, environmental considerations and so on (Abidi, 1994 & 1996). Our framework mainly emphasises (i) psychological and neurobiological distinctions between various neural networks when selecting neural networks to simulate certain tasks; (ii) architectural specifications: determining the number of layers, the number of units in a layer, activation update functions and learning parameters; (iii) a plausible connectivity scheme by which various neural networks can efficiently communicate with each other; and (iv) a variety of training strategies, including (a) one neural network learning its training data independently; (b) two or more neural networks learning their specified training data simultaneously; and (c) a co-operative training strategy where one or more neural networks transform the training data into a representation scheme that is interpretable by the principal neural network being trained.

We present below a seven-phase strategy for developing modular neural network architectures:

I. Identify the sub-tasks constituting a complex cognitive activity. Use an individual neural network to simulate each sub-task. Such a neural network can be regarded as an independent module of the modular architecture.

II. Design appropriate neural networks that can simulate the sub-tasks. The design metrics are the number of layers, the number of units in each layer, the connectivity pattern of the layers and the activation function of the units.

III. Develop a knowledge representation scheme that can be shared by other neural networks, i.e., the knowledge stored in one network is accessible to other networks in the modular architecture.

IV. Establish a communication mechanism among the neural networks so that information is accessible throughout the modular architecture.

V. Train each neural network with its respective stimuli, either separately or, if needed, in conjunction with other related networks.

VI. Represent explicitly the knowledge learnt by each neural network, such that it is understandable and has some significance to an external observer.

VII. Formulate a processing scheme that synchronises the overall operation of the modular architecture. The processing scheme should retain concurrency, enhance the processing strengths of the various networks and at the same time avoid unnecessary cross-talk (influence) between the neural networks.

4. ACCLAIM: A MODULAR NEURAL NETWORK ARCHITECTURE

Language development is an exemplar of high-level human cognitive development; a complex activity that seems improbable to simulate using just a single neural network. Rather, a realistic simulation of child language development requires a variety of multi-layered neural networks: for instance, one to learn to process lexical input and output, another to learn phonology, and more networks to learn concepts, semantic relations and word order. Based on earlier proposals advocating the modularity approach for simulating high-level cognitive activities, we present a modular neural network model, ACCLAIM (A Connectionist Child LAnguage Development & Imitation Model), which simulates child language development within the age group of 9-24 months. ACCLAIM systematically synthesises both supervised and unsupervised learning neural networks (including Kohonen Maps, Backpropagation networks, Hebbian Connections and the Spreading Activation mechanism), based on our psycholinguistic model of child language development. ACCLAIM (see Figure 1b) has been used to simulate the development and categorisation of concepts amongst children together with the lexicalisation of these concepts: the 'concept memory' and 'word lexicon' have been simulated using two independent Kohonen Maps that are linked together through a Hebbian Connection based naming connection network.
Backpropagation networks have been used to implement a 'conceptual relation' network (for one-word sentence production) and a 'word-order' network (for two-word sentence production). Children's evolving semantic performance has been simulated by a semantic relation network using a Hebbian Connection. The training data used for our simulation is based on 'realistic' child language data acquired from various child language studies.

[Figure 1: (a) Four neural network modules, each comprising more than one neural network and simulating some aspect of child language development; (b) The modular architecture of ACCLAIM: an integration of various neural networks, each simulating an aspect of child language development.]
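In miniature, the naming connection just described can be sketched by abstracting each Kohonen map to a set of labelled units and learning the Hebbian naming connections through co-presentation of a concept with its name. The concepts and words below are our own toy data, not ACCLAIM's training set, and the full self-organising dynamics of the maps are deliberately omitted.

```python
import numpy as np

# Stand-ins for the two maps: one unit per concept and one unit per word.
concepts = ["ball", "milk", "dog"]
words = ["ball", "milk", "dog"]
naming = np.zeros((len(concepts), len(words)))  # Hebbian naming connections

# Co-present each concept with its name: the link between the co-active
# concept unit and word unit is strengthened (Hebbian update).
for concept, word in [("ball", "ball"), ("milk", "milk"), ("dog", "dog")]:
    naming[concepts.index(concept), words.index(word)] += 1.0

def name_concept(concept):
    # Spread activation from the active concept unit into the word
    # lexicon and read off the most active word unit.
    activation = naming[concepts.index(concept)]
    return words[int(np.argmax(activation))]
```

The point of the sketch is the division of labour: the two maps hold the representations, while a separate associative pathway, trained only by co-occurrence, carries the mapping between them.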

Each of ACCLAIM's constituent neural networks can be envisaged as an individual entity, embodying a different kind of knowledge. These neural networks can then be configured, based on our psycholinguistic model, to realise a variety of neural network modules, where each module simulates an aspect of child language development. For instance, the naming connection module, which simulates concept naming, comprises three neural networks: the concept memory, the word lexicon and the naming connection network. It should be noted that within a neural network module the constituent neural networks retain their identity and interact with each other. In ACCLAIM, four different neural network modules (shown in Figure 1a) relevant to child language development are implemented by integrating various neural networks. One of the advantages of the modular design of ACCLAIM is that knowledge learnt by an individual neural network is utilised by more than one module; for instance, the concepts stored in the concept memory are used by three modules: the one-word module, the naming connection module and the semantic relation module. At a deeper level, each module can again be envisaged as an independent model, capable of simulating a psycholinguistic process on its own. For instance, a simulation of the child's development of semantic relations can be performed by employing just the semantic relation module. The modular approach of ACCLAIM makes it possible to work with one module at a time, enabling the simulation of a process with a variety of data without taking other modules into account. More attractively, at a later stage the results of the simulation obtained from one module can be used in simulations involving other modules.
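In software terms, this module-at-a-time workflow, and the reuse of one module's learnt knowledge by another, can be sketched as a skeleton. The class and method names below are hypothetical illustrations of the framework's phases (modules, shared representation, communication, training), not ACCLAIM's actual code.

```python
class NeuralModule:
    # One module per sub-task; each retains its own identity and knowledge.
    def __init__(self, name):
        self.name = name
        self.knowledge = {}         # representation shareable with other modules

    def train(self, stimuli):       # train the module on its own stimuli
        self.knowledge.update(stimuli)

    def query(self, key):           # expose learnt knowledge to an observer
        return self.knowledge.get(key)


class ModularArchitecture:
    # Holds the modules and routes information between them.
    def __init__(self):
        self.modules = {}

    def add(self, module):
        self.modules[module.name] = module

    def route(self, src, key, dst):
        # Knowledge learnt by one module is made available to another.
        self.modules[dst].train({key: self.modules[src].query(key)})


# Toy usage mirroring the reuse of the concept memory by other modules.
arch = ModularArchitecture()
for name in ("concept_memory", "naming_connection", "semantic_relation"):
    arch.add(NeuralModule(name))
arch.modules["concept_memory"].train({"ball": ("round", "small")})
arch.route("concept_memory", "ball", "naming_connection")
arch.route("concept_memory", "ball", "semantic_relation")
```

Because each module exposes only `train` and `query`, any module can be simulated and tested in isolation and its results later routed into simulations involving the others.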
Finally, the efficacy of our simulation of child language development carried out using ACCLAIM is demonstrated by the fact that ACCLAIM is able to produce, in a given situation, both one-word and two-word sentences that are similar to the kind of sentences produced by a child in the same real-life situation. Furthermore, ACCLAIM is able to handle novel, noisy and incomplete input situations by generalising to produce adequate and meaningful responses.

5. CONCLUSIONS

We have suggested a neural network interpretation of psychological notions of human learning which should assist researchers investigating the role of neural networks in simulating human learning and cognition. The move from single towards modular neural network architectures may form the basis for more elaborate and realistic simulations of human cognitive activities that involve an active interplay between a variety of processes. Furthermore, by way of ACCLAIM we have demonstrated how neural networks can be used to simulate high-level cognitive activities, in particular child language development. The architecture of ACCLAIM, and the resultant processing capabilities achieved, should be an indicator of how functionally and structurally divergent neural networks, when synthesised together in a meaningful manner, i.e. based on a psycholinguistic model, can simulate a high-level cognitive activity.

REFERENCES

Abidi, S.S.R. (1996) Neural Networks and Child Language Development: Towards a Conglomerate Neural Simulation Architecture. To be presented at the International Conference on Neural Information Processing 96, Hong Kong.

Abidi, S.S.R. & Ahmad, K. (1996) Child Language Development: A Connectionist Simulation of the Evolving Concept Memory. In M. Aldridge (Ed.) Child Language. Clevedon: Multilingual Matters Ltd.

Abidi, S.S.R. & Ahmad, K. (1994) Connectionism as a Model for Child Language Development. In Artificial Intelligence & Cognitive Science, Seventh Annual Irish Conference, Dublin.

Bechtel, W. & Abrahamsen, A. (1991) Connectionism and the Mind. Oxford: Basil Blackwell.

Furth, H. G. (1969) Piaget and Knowledge. Chicago: The University of Chicago Press.

McClelland, J. (1989) PDP: Implications for Cognition and Development. In R. Morris (Ed.) Parallel Distributed Processing: Implications for Psychology and Neurobiology. Oxford: Clarendon Press.

Seidenberg, M. (1993) Connectionist Models and Cognitive Theory. Psychological Science, Vol. 4, pp. 228-235.