Mark Girolami Self-Organising Neural Networks Independent Component Analysis and Blind Source Separation Springer
Mark Girolami, BSc (Hons), BA, MSc, PhD, CEng, MIMechE, MIEE
Department of Computing and Information Systems, University of Paisley, High Street, Paisley, PA1 2BE, UK

Series Editor
J.G. Taylor, BA, BSc, MA, PhD, FInstP
Centre for Neural Networks, Department of Mathematics, King's College, Strand, London WC2R 2LS, UK

ISBN-13: 978-1-85233-066-8

British Library Cataloguing in Publication Data
Girolami, Mark
Self-organising neural networks: independent component analysis and blind source separation. - (Perspectives in neural computing)
1. Neural networks (Computer science) 2. Self-organizing systems
I. Title
006.3'2
ISBN-13: 978-1-85233-066-8

Library of Congress Cataloging-in-Publication Data
Girolami, Mark, 1963-
Self-organising neural networks: independent component analysis and blind source separation / Mark Girolami.
p. cm. -- (Perspectives in neural computing)
ISBN-13: 978-1-85233-066-8
e-ISBN-13: 978-1-4471-0825-2
DOI: 10.1007/978-1-4471-0825-2
1. Neural networks (Computer science) I. Title. II. Series
QA76.87.S47 1999
99-29068
006.3'2--dc21
CIP

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licences issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

Springer-Verlag London Limited 1999

The use of registered names, trademarks etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use.

The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Typesetting: Camera ready by author
34/3830-543210 Printed on acid-free paper
SPIN 10689791
Perspectives in Neural Computing Springer London Berlin Heidelberg New York Barcelona Hong Kong Milan Paris Santa Clara Singapore Tokyo
Also in this series:

Adrian Shepherd
Second-Order Methods for Neural Networks
3-540-76100-4

Dimitris C. Dracopoulos
Evolutionary Learning Algorithms for Neural Adaptive Control
3-540-76161-6

John A. Bullinaria, David W. Glasspool and George Houghton (Eds)
4th Neural Computation and Psychology Workshop, London, 9-11 April 1997: Connectionist Representations
3-540-76208-6

Maria Marinaro and Roberto Tagliaferri (Eds)
Neural Nets - WIRN VIETRI-97
3-540-76157-8

Gustavo Deco and Dragan Obradovic
An Information-Theoretic Approach to Neural Computing
0-387-94666-7

Thomas Lindblad and Jason M. Kinser
Image Processing using Pulse-Coupled Neural Networks
3-540-76264-7

L. Niklasson, M. Boden and T. Ziemke (Eds)
ICANN98
3-540-76263-9

Maria Marinaro and Roberto Tagliaferri (Eds)
Neural Nets - WIRN VIETRI-98
1-85233-051-1

Dietmar Heinke, Glyn W. Humphreys and Andrew Olson (Eds)
Connectionist Models in Cognitive Neuroscience: The 5th Neural Computation and Psychology Workshop, Birmingham, 8-10 September 1998
1-85233-052-X

Amanda J.C. Sharkey (Ed.)
Combining Artificial Neural Nets
1-85233-004-X

Dirk Husmeier
Neural Networks for Conditional Probability Estimation
1-85233-095-3

Achilleas Zapranis and Apostolos-Paul Refenes
Principles of Neural Model Identification, Selection and Adequacy
1-85233-139-9
Contents

Foreword

1. Introduction
  1.1 Self-Organisation and Blind Signal Processing
  1.2 Outline of Book Chapters

2. Background to Blind Source Separation
  2.1 Problem Formulation
  2.2 Entropy and Information
    2.2.1 Entropy
    2.2.2 Kullback-Leibler Entropy and Mutual Information
    2.2.3 Invertible Probability Density Transformations
  2.3 A Contrast Function for ICA
  2.4 Cumulant Expansions of Probability Densities and Higher Order Statistics
    2.4.1 Moment Generating and Cumulant Generating Functions
    2.4.2 Properties of Moments and Cumulants
  2.5 Gradient Based Function Optimisation
    2.5.1 The Natural Gradient and Covariant Algorithms

3. Fourth Order Cumulant Based Blind Source Separation
  3.1 Early Algorithms and Techniques
  3.2 The Method of Contrast Minimisation
  3.3 Adaptive Source Separation Methods
  3.4 Conclusions

4. Self-Organising Neural Networks
  4.1 Linear Self-Organising Neural Networks
    4.1.1 Linear Hebbian Learning
    4.1.2 Principal Component Analysis
    4.1.3 Linear Anti-Hebbian Learning
  4.2 Non-Linear Self-Organising Neural Networks
    4.2.1 Non-Linear Anti-Hebbian Learning: The Herrault-Jutten Network
    4.2.2 Information Theoretic Algorithms
    4.2.3 Non-Linear Hebbian Learning Algorithms
      4.2.3.1 Signal Representation Error Minimisation
      4.2.3.2 Non-Linear Criterion Maximisation
  4.3 Conclusions

5. The Non-Linear PCA Algorithm and Blind Source Separation
  5.1 Introduction
  5.2 Non-Linear PCA Algorithm and Source Separation
  5.3 Non-Linear PCA Algorithm Cost Function
  5.4 Non-Linear PCA Algorithm Activation Function
    5.4.1 Asymptotic Stability Requirements
    5.4.2 Stability Properties of the Compound Activation Function
    5.4.3 Stability of Solution with Sub-Gaussian Sources
    5.4.4 Simulation: Separation of Mixtures of Sub-Gaussian Sources
    5.4.5 Stability of Solution with Super-Gaussian Sources
    5.4.6 Simulation: Separation of Mixtures of Super-Gaussian Sources
    5.4.7 Separation of Mixtures of Both Sub- and Super-Gaussian Sources
  5.5 Conclusions

6. Non-Linear Feature Extraction and Blind Source Separation
  6.1 Introduction
  6.2 Structure Identification in Multivariate Data
  6.3 Neural Network Implementation of Exploratory Projection Pursuit
  6.4 Neural Exploratory Projection Pursuit and Blind Source Separation
  6.5 Kurtosis Extrema
  6.6 Finding Interesting and Independent Directions
  6.7 Finding Multiple Interesting and Independent Directions Using Symmetric Feedback and Adaptive Whitening
    6.7.1 Adaptive Spatial Whitening
    6.7.2 Simulations
    6.7.3 An Extended EPP Network with Non-Linear Output Connections
  6.8 Finding Multiple Interesting and Independent Directions Using Hierarchic Feedback and Adaptive Whitening
  6.9 Simulations
  6.10 Adaptive BSS Using a Deflationary EPP Network
  6.11 Conclusions

7. Information Theoretic Non-Linear Feature Extraction and Blind Source Separation
  7.1 Introduction
  7.2 Information Theoretic Indices for EPP
  7.3 Maximum Negentropy Learning
    7.3.1 Single Neuron Maximum Negentropy Learning
    7.3.2 Multiple Output Neuron Maximum Negentropy Learning
    7.3.3 Maximum Negentropy Learning and Infomax Equivalence
    7.3.4 The Natural Gradient and Covariant Learning
  7.4 General Maximum Negentropy Learning
  7.5 Stability Analysis of Generalised Algorithm
  7.6 Simulation Results
  7.7 Conclusions

8. Temporal Anti-Hebbian Learning
  8.1 Introduction
  8.2 Blind Source Separation of Convolutive Mixtures
  8.3 Temporal Linear Anti-Hebbian Model
  8.4 Comparative Simulation
  8.5 Review of Existing Work on Adaptive Separation of Convolutive Mixtures
  8.6 Maximum Likelihood Estimation and Source Separation
  8.7 Temporal Anti-Hebbian Learning Based on Maximum Likelihood Estimation
  8.8 Comparative Simulations Using Varying PDF Models
  8.9 Conclusions

9. Applications
  9.1 Introduction
  9.2 Industrial Applications
    9.2.1 Rotating Machine Vibration Analysis
    9.2.2 A Multi-Tag Frequency Identification System
  9.3 Biomedical Applications
    9.3.1 Detection of Sleep Spindles in EEG
  9.4 ICA: A Data Mining Tool
  9.5 Experimental Results
    9.5.1 The Oil Pipeline Data
    9.5.2 The Swiss Banknote Data
  9.6 Conclusions

References

Index
Foreword

The conception of fresh ideas and the development of new techniques for Blind Source Separation and Independent Component Analysis have been rapid in recent years. It is also encouraging, from the perspective of the many scientists involved in this fascinating area of research, to witness the growing list of successful applications of these methods to a diverse range of practical everyday problems. This growth has been due, in part, to the number of promising, young and enthusiastic researchers who have committed their efforts to expanding the current body of knowledge within this field of research. The author of this book is one of their number.

I trust that the present book by Dr. Mark Girolami will provide a rapid and effective means of communicating some of these new ideas to a wide international audience, and that this in turn will further expand the growth of knowledge. In my opinion this book makes an important contribution to the theory of Independent Component Analysis and Blind Source Separation. It opens up a range of exciting methods, techniques and algorithms for applied researchers and practising engineers, especially from the perspective of artificial neural networks and information theory.

It has been interesting to see how rapidly the scientific literature in this area has grown. The present book comes at a good time, because it provides a well-reasoned introduction to the basic ideas for those who are curious about the theoretical derivation of unsupervised learning algorithms for blind source separation. It also provides a self-contained analysis of algorithms, with an emphasis on recent research results that include the well-balanced research works of the author. Due to the many promising applications, the subject of independent component analysis will continue to be a fruitful area of research.

Dr. Andrzej Cichocki
Head of Laboratory for Open Information Systems, Brain Science Institute, Riken, Japan
and Warsaw University of Technology, Poland
E-mail: cia@brain.riken.go.jp
April 1999