The Government of the Russian Federation
The Federal State Autonomous Institution of Higher Education
"National Research University Higher School of Economics"

Faculty of Business Informatics
Department of Innovation and Business in Information Technology

Master's Program "Big Data Systems"

Course Syllabus "Applied Machine Learning"

Author: Dr. Sci., Prof. Andrey Dmitriev, a.dmitriev@hse.ru

Moscow, 2014

This document may not be reproduced or redistributed by other Departments of the University without permission of the Authors.

Field of Application and Regulations

The course "Applied Machine Learning" syllabus lays down minimum requirements for students' knowledge and skills; it also provides a description of both the contents and the forms of training and assessment in use. The course is offered to students of the Master's Program "Big Data Systems" (area code ) in the Faculty of Business Informatics of the National Research University "Higher School of Economics". The course is part of the curriculum pool of elective courses (1st year, M.2.B.1 Optional courses, M.2 Courses required by the Master's program of the academic year's curriculum), and it is a two-module course (3rd and 4th modules). The duration of the course amounts to 48 class periods (lectures and practice), divided into 24 lecture hours and 24 practice hours. Besides, 96 academic hours are set aside for students' self-study.

The syllabus is prepared for teachers responsible for the course (or closely related disciplines), teaching assistants, and students enrolled in the course "Applied Machine Learning", as well as for experts and statutory bodies carrying out assigned or regular accreditations in accordance with the educational standards of the National Research University Higher School of Economics and the curriculum ("Business Informatics", area code ), Big Data Systems specialization, 1st year.

1 Course Objectives

The main objective of the course is to present, examine, and discuss with students the fundamentals and principles of machine learning. The course is focused on understanding the role of machine learning in big data analysis.

Generally, the objective of the course can be thought of as a combination of the following constituents:
- familiarity with the peculiarities of supervised learning, parametric and multivariate methods, dimensionality reduction, clustering, nonparametric methods, decision trees, linear discrimination, kernel machines, and Bayesian estimation as applied areas of big data analysis;
- understanding of the main notions of machine learning theory and of the framework of machine learning as the most significant areas of big data analysis;
- understanding of the role of machine learning in big data analysis;
- obtaining skills in utilizing machine learning in big data analysis.

2 Students' Competencies to Be Developed by the Course

While mastering the course material, the student will:
- know the main notions of supervised learning, parametric and multivariate methods, dimensionality reduction, clustering, nonparametric methods, decision trees, linear discrimination, kernel machines, and Bayesian estimation;
- acquire skills of big data analysis;
- gain experience in big data analysis using these notions and methods.

In short, the course contributes to the development of the following professional competencies:

Competence | FSES/HSE code | Descriptors (main mastering features) | Training forms and methods contributing to the formation and development of the competence
Ability to offer concepts and models, to invent and test methods and tools for professional work | SC-2 | Demonstrates | Lectures, practice, home tasks
Ability to apply the methods of system analysis and modeling to assess, design, and develop the strategy of an enterprise architecture | PC-13 | Owns and uses | Lectures, practice, home tasks
Ability to develop and implement economic and mathematical models to justify project solutions in the field of information and computer technology | PC-14 | Owns and uses | Lectures, practice, home tasks
Ability to organize individual and collective research work in the enterprise and to manage it | PC-16 | Demonstrates | Lectures, practice, home tasks

3 The Course within the Program's Framework

The course "Applied Machine Learning" syllabus lays down minimum requirements for students' knowledge and skills; it also provides a description of both the contents and the forms of training and assessment in use. The course is offered to students of the Master's Program "Big Data Systems" (area code ) in the Faculty of Business Informatics of the National Research University "Higher School of Economics". The course is part of the curriculum pool of elective courses (1st year, M.2.B.1 Optional courses, M.2 Courses required by the Master's program of the academic year's curriculum), and it is a two-module course (3rd and 4th modules). The duration of the course amounts to 48 class periods (lectures and practice), divided into 24 lecture hours and 24 practice hours. Besides, 96 academic hours are set aside for students' self-study.

Academic control forms include:
- two home tasks, done by students individually; each student has to prepare an electronic report (PDF format only); all reports have to be submitted in the LMS; all reports are checked and graded by the instructor on a ten-point scale by the end of the 3rd module and the 4th module;
- a pass-final examination, which implies a written test and computer-based problem solving.

The course is based on the acquisition of the following courses:
- Calculus
- Linear Algebra
- Probability Theory and Mathematical Statistics
- Data Analysis
- Economic and Mathematical Modeling
- Discrete Mathematics

The course requires the following students' competencies and knowledge:
- the main definitions, theorems, and properties from Calculus, Linear Algebra, Probability Theory and Mathematical Statistics, Data Analysis, Economic and Mathematical Modeling, and Discrete Mathematics;
- the ability to communicate both orally and in written form in English;

- the ability to search for, process, and analyze information from a variety of sources.

The main provisions of the course should be used in the further study of the following courses:
- Risk Analysis Based on Big Data
- Predictive Modeling
- Marketing Analytics Based on Big Data

4 Thematic Course Contents

3rd Module:
1. Supervised Learning
2. Bayesian Decision Theory
3. Parametric and Multivariate Methods
4. Dimensionality Reduction
5. Clustering

4th Module:
6. Nonparametric Methods
7. Decision Trees
8. Linear Discrimination
9. Multilayer Perceptrons
10. Kernel Machines
11. Bayesian Estimation
12. Design and Analysis of Machine Learning Experiments

Each topic combines lectures, practice classes, and independent work (48 class hours and 96 hours of self-study in total).

5 Forms and Types of Testing

Form of control | Week (1st year) | Department | Parameters
Home task 1 | 29 | Innovation and Business in Information Technology | problem solving, written report (paper)
Home task 2 | 40 | Innovation and Business in Information Technology | problem solving, written report (paper)
Pass-fail exam | 41 | Innovation and Business in Information Technology | written test (paper) and computer-based problem solving

Evaluation Criteria

Current and resultant grades are made up of the following components:
- Two home tasks are done by students individually; each student has to prepare an electronic report (PDF format only). All reports have to be submitted in the LMS; all reports are checked and graded by the instructor on a ten-point scale by the end of the 3rd and 4th modules. All home tasks (HT) are assessed on the ten-point scale in summary.
- The pass-final examination implies a written test (WT) and computer-based problem solving (CS).

Finally, the total course grade on the ten-point scale is obtained as

O(Total) = 0.6 * O(HT) + 0.1 * O(WT) + 0.3 * O(CS).

A grade of 4 or higher means successful completion of the course ("pass"), while a grade of 3 or lower means an unsuccessful result ("fail"). For example, O(HT) = 8, O(WT) = 6, and O(CS) = 7 give O(Total) = 0.6 * 8 + 0.1 * 6 + 0.3 * 7 = 7.5, rounded to 8 ("pass"). The concluding rounded grade O(Total) is then converted to a five-point scale grade.

6 Detailed Course Contents

Lecture 1. Supervised Learning
Examples of Machine Learning Applications. Learning Associations: Classification, Regression, Unsupervised Learning, Reinforcement Learning. Learning a Class from Examples. Vapnik-Chervonenkis (VC) Dimension. Probably Approximately Correct (PAC) Learning. Noise. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.

Practice 1. Probably Approximately Correct (PAC) Learning. Noise. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.

Reading:
Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
Angluin, D. Queries and Concept Learning. Machine Learning 2.
Blumer, A., A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM 36.
Dietterich, T. G. Machine Learning. In Nature Encyclopedia of Cognitive Science. London: Macmillan.
Hirsh, H. Incremental Version Space Merging: A General Framework for Concept Learning. Boston: Kluwer.

Lecture 2. Bayesian Decision Theory
Introduction. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.

Practice 2. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.

Reading:
Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
Agrawal, R., H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, ed. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Cambridge, MA: MIT Press.
Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
Li, J. On Optimal Rule Discovery. IEEE Transactions on Knowledge and Data Discovery 18.
Newman, J. R., ed. The World of Mathematics. Redmond, WA: Tempus.
Omiecinski, E. R. Alternative Interest Measures for Mining Associations in Databases. IEEE Transactions on Knowledge and Data Discovery 15.
Russell, S., and P. Norvig. Artificial Intelligence: A Modern Approach. New York: Prentice Hall.
Shafer, G., and J. Pearl, eds. Readings in Uncertain Reasoning. San Mateo,

CA: Morgan Kaufmann.
Zhang, C., and S. Zhang. Association Rule Mining: Models and Algorithms. New York: Springer.

Lecture 3. Parametric and Multivariate Methods
Maximum Likelihood Estimation: Bernoulli Density, Multinomial Density, Gaussian (Normal) Density. Evaluating an Estimator: Bias and Variance. The Bayes Estimator. Parametric Classification. Regression. Tuning Model Complexity: Bias/Variance Dilemma. Model Selection Procedures. Multivariate Data. Parameter Estimation. Estimation of Missing Values. Multivariate Normal Distribution. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.

Practice 3. Maximum Likelihood Estimation. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.

Reading:
Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
Friedman, J. H. Regularized Discriminant Analysis. Journal of the American Statistical Association 84.
Harville, D. A. Matrix Algebra from a Statistician's Perspective. New York: Springer.
Manning, C. D., and H. Schutze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
McLachlan, G. J. Discriminant Analysis and Statistical Pattern Recognition. New York: Wiley.
Rencher, A. C. Methods of Multivariate Analysis. New York: Wiley.
Strang, G. Linear Algebra and Its Applications, 3rd ed. New York: Harcourt Brace Jovanovich.

Lecture 4. Dimensionality Reduction
Subset Selection. Principal Components Analysis. Factor Analysis. Multidimensional Scaling. Linear Discriminant Analysis. Isomap. Locally Linear Embedding.

Practice 4. Principal Components Analysis. Factor Analysis. Multidimensional Scaling. Linear Discriminant Analysis.

Reading:
Balasubramanian, M., E. L. Schwartz, J. B. Tenenbaum, V. de Silva, and J. C. Langford. The Isomap Algorithm and Topological Stability. Science 295.
Chatfield, C., and A. J. Collins. Introduction to Multivariate Analysis. London: Chapman and Hall.
Cox, T. F., and M. A. A. Cox. Multidimensional Scaling. London: Chapman and Hall.
Devijer, P. A., and J. Kittler. Pattern Recognition: A Statistical Approach. New York: Prentice-Hall.
Flury, B. Common Principal Components and Related Multivariate Models. New York: Wiley.

Fukunaga, K., and P. M. Narendra. A Branch and Bound Algorithm for Feature Subset Selection. IEEE Transactions on Computers C-26.

Lecture 5. Clustering
Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm. Mixtures of Latent Variable Models. Supervised Learning after Clustering. Hierarchical Clustering. Choosing the Number of Clusters.

Practice 5. Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm. Mixtures of Latent Variable Models.

Reading:
Alpaydın, E. Soft Vector Quantization and the EM Algorithm. Neural Networks 11.
Barrow, H. B. Unsupervised Learning. Neural Computation 1.
Bezdek, J. C., and N. R. Pal. Two Soft Relatives of Learning Vector Quantization. Neural Networks 8.
Bishop, C. M. Latent Variable Models. In Learning in Graphical Models, ed. M. I. Jordan. Cambridge, MA: MIT Press.
Dempster, A. P., N. M. Laird, and D. B. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society B 39.
Gersho, A., and R. M. Gray. Vector Quantization and Signal Compression. Boston: Kluwer.
Ghahramani, Z., and G. E. Hinton. The EM Algorithm for Mixtures of Factor Analyzers. Technical Report CRG-TR-96-1, Department of Computer Science, University of Toronto.

Lecture 6. Nonparametric Methods
Nonparametric Density Estimation. Histogram Estimator: Kernel Estimator, k-Nearest Neighbor Estimator. Generalization to Multivariate Data. Nonparametric Classification. Condensed Nearest Neighbor. Nonparametric Regression: Smoothing Models, Running Mean Smoother, Kernel Smoother, Running Line Smoother. How to Choose the Smoothing Parameter.

Practice 6. Nonparametric Density Estimation. Nonparametric Regression.

Reading:
Aha, D. W., ed. Special Issue on Lazy Learning. Artificial Intelligence Review 11(1-5).
Aha, D. W., D. Kibler, and M. K. Albert. Instance-Based Learning Algorithms. Machine Learning 6.
Atkeson, C. G., A. W. Moore, and S. Schaal. Locally Weighted Learning. Artificial Intelligence Review 11.
Cover, T. M., and P. E. Hart. Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory 13.
Dasarathy, B. V. Nearest Neighbor Norms: NN Pattern Classification Techniques. Los Alamitos, CA: IEEE Computer Society Press.

Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
Geman, S., E. Bienenstock, and R. Doursat. Neural Networks and the Bias/Variance Dilemma. Neural Computation 4.

Lecture 7. Linear Discrimination
Generalizing the Linear Model. Geometry of the Linear Discriminant: Two Classes, Multiple Classes. Pairwise Separation. Parametric Discrimination Revisited. Gradient Descent. Logistic Discrimination: Two Classes, Multiple Classes. Discrimination by Regression.

Practice 7. Generalizing the Linear Model. Geometry of the Linear Discriminant. Logistic Discrimination.

Reading:
Aizerman, M. A., E. M. Braverman, and L. I. Rozonoer. Theoretical Foundations of the Potential Function Method in Pattern Recognition Learning. Automation and Remote Control 25.
Anderson, J. A. Logistic Discrimination. In Handbook of Statistics, Vol. 2, Classification, Pattern Recognition and Reduction of Dimensionality, ed. P. R. Krishnaiah and L. N. Kanal. Amsterdam: North Holland.
Bridle, J. S. Probabilistic Interpretation of Feedforward Classification Network Outputs with Relationships to Statistical Pattern Recognition. In Neurocomputing: Algorithms, Architectures and Applications, ed. F. Fogelman-Soulie and J. Herault. Berlin: Springer.
Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
McCullagh, P., and J. A. Nelder. Generalized Linear Models. London: Chapman and Hall.

Lecture 8. Multilayer Perceptrons
Introduction: Understanding the Brain, Neural Networks as a Paradigm for Parallel Processing. The Perceptron. Training a Perceptron. Learning Boolean Functions. Multilayer Perceptrons. MLP as a Universal Approximator. Backpropagation Algorithm: Nonlinear Regression, Two-Class Discrimination, Multiclass Discrimination, Multiple Hidden Layers. Training Procedures: Improving Convergence, Overtraining, Structuring the Network, Hints. Tuning the Network Size. Bayesian View of Learning. Dimensionality Reduction. Learning Time. Time Delay Neural Networks. Recurrent Networks.

Practice 8. Backpropagation Algorithm. Training Procedures.

Reading:
Abu-Mostafa, Y. Hints. Neural Computation 7.
Aran, O., O. T. Yıldız, and E. Alpaydın. An Incremental Framework Based on Cross-Validation for Estimating the Architecture of a Multilayer Perceptron. International Journal of Pattern Recognition and Artificial Intelligence 23.
Ash, T. Dynamic Node Creation in Backpropagation Networks. Connection Science 1.
Battiti, R. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method. Neural Computation 4.

Bishop, C. M. Neural Networks for Pattern Recognition. Oxford: Oxford University Press.
Bourlard, H., and Y. Kamp. Auto-Association by Multilayer Perceptrons and Singular Value Decomposition. Biological Cybernetics 59.

Lecture 9. Multilayer Perceptrons (continued)
Topics and reading as for Lecture 8.

Practice 9. Backpropagation Algorithm. Training Procedures.

Lecture 10. Kernel Machines
Optimal Separating Hyperplane. The Nonseparable Case: Soft Margin Hyperplane. ν-SVM. Kernel Trick. Vectorial Kernels. Defining Kernels. Multiple Kernel Learning. Multiclass Kernel Machines. Kernel Machines for Regression. One-Class Kernel Machines. Kernel Dimensionality Reduction.

Practice 10. The Nonseparable Case: Soft Margin Hyperplane. ν-SVM. Multiclass Kernel Machines. Kernel Machines for Regression.

Reading:
Allwein, E. L., R. E. Schapire, and Y. Singer. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. Journal of Machine Learning Research 1.
Burges, C. J. C. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2.
Chang, C.-C., and C.-J. Lin. LIBSVM: A Library for Support Vector Machines.
Cherkassky, V., and F. Mulier. Learning from Data: Concepts, Theory, and Methods. New York: Wiley.

Cortes, C., and V. Vapnik. Support Vector Networks. Machine Learning 20.
Dietterich, T. G., and G. Bakiri. Solving Multiclass Learning Problems via Error-Correcting Output Codes. Journal of Artificial Intelligence Research 2.
Gonen, M., and E. Alpaydın. Localized Multiple Kernel Learning. In 25th International Conference on Machine Learning, ed. A. McCallum and S. Roweis. Madison, WI: Omnipress.

Lecture 11. Bayesian Estimation
Estimating the Parameter of a Distribution: Discrete Variables, Continuous Variables. Bayesian Estimation of the Parameters of a Function: Regression, The Use of Basis/Kernel Functions, Bayesian Classification. Gaussian Processes.

Practice 11. Estimating the Parameter of a Distribution. Bayesian Estimation of the Parameters of a Function. Gaussian Processes.

Reading:
Bishop, C. M. Pattern Recognition and Machine Learning. New York: Springer.
Figueiredo, M. A. T. Adaptive Sparseness for Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 25.
Gelman, A. Objections to Bayesian Statistics. Bayesian Statistics 3.
MacKay, D. J. C. Introduction to Gaussian Processes. In Neural Networks and Machine Learning, ed. C. M. Bishop. Berlin: Springer.
MacKay, D. J. C. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press.
Rasmussen, C. E., and C. K. I. Williams. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press.
Tibshirani, R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society B 58.

Lecture 12. Design and Analysis of Machine Learning Experiments
Factors, Response, and Strategy of Experimentation. Response Surface Design. Randomization, Replication, and Blocking. Guidelines for Machine Learning Experiments. Cross-Validation and Resampling Methods: K-Fold Cross-Validation, 5×2 Cross-Validation, Bootstrapping. Measuring Classifier Performance. Interval Estimation. Hypothesis Testing. Assessing a Classification Algorithm's Performance: Binomial Test, Approximate Normal Test, t Test. Comparing Two Classification Algorithms: McNemar's Test, K-Fold Cross-Validated Paired t Test, 5×2 cv Paired t Test, 5×2 cv Paired F Test. Comparing Multiple Algorithms: Analysis of Variance. Comparison over Multiple Datasets: Comparing Two Algorithms, Multiple Algorithms.

Practice 12. Cross-Validation and Resampling Methods. Assessing a Classification Algorithm's Performance.

Reading:
Alpaydın, E. Combined 5×2 cv F Test for Comparing Supervised Classification Learning Algorithms. Neural Computation 11.

Bouckaert, R. R. Choosing between Two Learning Algorithms Based on Calibrated Tests. In Twentieth International Conference on Machine Learning, ed. T. Fawcett and N. Mishra. Menlo Park, CA: AAAI Press.
Demsar, J. Statistical Comparison of Classifiers over Multiple Data Sets. Journal of Machine Learning Research 7.
Dietterich, T. G. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation 10.
Fawcett, T. An Introduction to ROC Analysis. Pattern Recognition Letters 27.
Montgomery, D. C. Design and Analysis of Experiments, 6th ed. New York: Wiley.
Ross, S. M. Introduction to Probability and Statistics for Engineers and Scientists. New York: Wiley.

7 Educational Technology

Classes use various active-learning methods: analysis of practical problems, group work, computer simulations in the computational software program Mathematica 10.0, and distance learning via the LMS.

8 Methods and Materials for Current Control and Attestation

8.1 Examples of Problems for Home Tasks

Problem 1. Imagine you have two possibilities: you can fax a document, that is, send the image, or you can use an optical character reader (OCR) and send the text file. Discuss the advantages and disadvantages of the two approaches in a comparative manner. When would one be preferable over the other?

Problem 2. Somebody tosses a fair coin, and if the result is heads, you get nothing; otherwise you get $5. How much would you pay to play this game? What if the win is $500 instead of $5?

Problem 3. Show that as we move an item from the consequent to the antecedent, confidence can never increase: confidence(ABC → D) ≥ confidence(AB → CD).

Problem 4. Write the code that generates a normal sample with given μ and σ, and the code that calculates m and s from the sample. Do the same using the Bayes estimator, assuming a prior distribution for μ.

Problem 5. In Isomap, instead of using Euclidean distance, we can also use Mahalanobis distance between neighboring points. What are the advantages and disadvantages of this approach, if any?

Problem 6. In image compression, k-means can be used as follows: the image is divided into nonoverlapping c × c windows, and these c²-dimensional vectors make up the sample. For a given k, which is generally a power of two, we do k-means clustering. The reference vectors and the indices for each window are sent over the communication line. At the receiving end, the image is then reconstructed by reading from the table of reference vectors using the indices. Write a computer program that does this for different values of k and c. For each case, calculate the reconstruction error and the compression rate.

Problem 7. In the running smoother, we can fit a constant, a line, or a higher-degree polynomial at a test point. How can we choose between them?

Problem 8. What is the implication of the use of a single η for all x_j in gradient descent?
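For Problem 4 above, a minimal sketch of what such code might look like, written in Python with NumPy rather than the course's Mathematica. The prior parameters mu0 and sigma0 are hypothetical choices, and for the Bayes estimator the variance σ² is treated as known, with a Gaussian prior on μ:

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate a normal sample with given mu and sigma.
mu, sigma, n = 2.0, 1.5, 1000
x = rng.normal(mu, sigma, size=n)

# Maximum likelihood estimates m and s from the sample.
m = x.mean()
s = x.std()  # MLE uses the biased estimator (divides by n)

# Bayes estimator for the mean, assuming a N(mu0, sigma0^2) prior on mu
# and known sigma: the posterior mean is a precision-weighted average of
# the sample mean and the prior mean.
mu0, sigma0 = 0.0, 10.0  # hypothetical prior parameters
w = (n / sigma**2) / (n / sigma**2 + 1.0 / sigma0**2)
m_bayes = w * m + (1.0 - w) * mu0

print(f"m = {m:.3f}, s = {s:.3f}, m_bayes = {m_bayes:.3f}")
```

With a broad prior (large sigma0) the Bayes estimate stays close to the sample mean; shrinking sigma0 pulls it toward mu0.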

Problem 9. Consider an MLP architecture with one hidden layer where there are also direct weights from the inputs to the output units. Explain when such a structure would be helpful and how it can be trained.

Problem 10. Incremental learning of the structure of an MLP can be viewed as a state-space search. What are the operators? What is the goodness function? What types of search strategies are appropriate? Define these in such a way that dynamic node creation and cascade correlation are special instantiations.

8.2 Questions for the Pass-Final Examination

Theoretical Questions
1. Examples of Machine Learning Applications.
2. Learning Associations: Classification, Regression, Unsupervised Learning, Reinforcement Learning. Learning a Class from Examples. Vapnik-Chervonenkis (VC) Dimension. Probably Approximately Correct (PAC) Learning. Noise.
3. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.
4. Introduction. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.
5. Maximum Likelihood Estimation: Bernoulli Density, Multinomial Density, Gaussian (Normal) Density.
6. Evaluating an Estimator: Bias and Variance. The Bayes Estimator. Parametric Classification. Regression.
7. Tuning Model Complexity: Bias/Variance Dilemma.
8. Model Selection Procedures.
9. Multivariate Data. Parameter Estimation. Estimation of Missing Values. Multivariate Normal Distribution. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.
10. Subset Selection. Principal Components Analysis.
11. Factor Analysis.
12. Multidimensional Scaling.
13. Linear Discriminant Analysis. Isomap. Locally Linear Embedding.
14. Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm.
15. Mixtures of Latent Variable Models.
16. Supervised Learning after Clustering. Hierarchical Clustering. Choosing the Number of Clusters.
17. Nonparametric Density Estimation. Histogram Estimator: Kernel Estimator, k-Nearest Neighbor Estimator.
18. Generalization to Multivariate Data. Nonparametric Classification. Condensed Nearest Neighbor.
19. Nonparametric Regression: Smoothing Models, Running Mean Smoother, Kernel Smoother, Running Line Smoother. How to Choose the Smoothing Parameter.
20. Generalizing the Linear Model. Geometry of the Linear Discriminant: Two Classes, Multiple Classes.
21. Pairwise Separation. Parametric Discrimination Revisited. Gradient Descent. Logistic Discrimination: Two Classes, Multiple Classes. Discrimination by Regression.

22. Understanding the Brain, Neural Networks as a Paradigm for Parallel Processing.
23. The Perceptron. Training a Perceptron. Learning Boolean Functions. Multilayer Perceptrons. MLP as a Universal Approximator.
24. Backpropagation Algorithm: Nonlinear Regression, Two-Class Discrimination, Multiclass Discrimination, Multiple Hidden Layers.
25. Training Procedures: Improving Convergence, Overtraining, Structuring the Network, Hints.
26. Tuning the Network Size. Bayesian View of Learning. Dimensionality Reduction.
27. Learning Time. Time Delay Neural Networks. Recurrent Networks.
28. Optimal Separating Hyperplane.
29. The Nonseparable Case: Soft Margin Hyperplane, ν-SVM.
30. Kernel Trick. Vectorial Kernels. Defining Kernels.
31. Multiple Kernel Learning. Multiclass Kernel Machines.
32. Kernel Machines for Regression. One-Class Kernel Machines.
33. Kernel Dimensionality Reduction.
34. Estimating the Parameter of a Distribution: Discrete Variables, Continuous Variables.
35. Bayesian Estimation of the Parameters of a Function: Regression, The Use of Basis/Kernel Functions, Bayesian Classification. Gaussian Processes.
36. Factors, Response, and Strategy of Experimentation. Response Surface Design. Randomization, Replication, and Blocking.
37. Guidelines for Machine Learning Experiments.
38. Cross-Validation and Resampling Methods: K-Fold Cross-Validation, 5×2 Cross-Validation, Bootstrapping.
39. Measuring Classifier Performance. Interval Estimation. Hypothesis Testing. Assessing a Classification Algorithm's Performance: Binomial Test, Approximate Normal Test, t Test.
40. Comparing Two Classification Algorithms: McNemar's Test, K-Fold Cross-Validated Paired t Test, 5×2 cv Paired t Test, 5×2 cv Paired F Test.
41. Comparing Multiple Algorithms: Analysis of Variance. Comparison over Multiple Datasets: Comparing Two Algorithms, Multiple Algorithms.

Examples of Problems

Problem 1. In a two-class problem, let us say we have the loss matrix where λ11 = λ22 = 0, λ21 = 1, and λ12 = α. Determine the threshold of decision as a function of α.

Problem 2. The K-fold cross-validated t test only tests for the equality of error rates. If the test rejects, we do not know which classification algorithm has the lower error rate. How can we test whether the first classification algorithm does not have a higher error rate than the second one? Hint: We have to test H0: μ ≤ 0 vs. H1: μ > 0.

Problem 3. If we have two variants of algorithm A and three variants of algorithm B, how can we compare the overall accuracies of A and B, taking all their variants into account?

9 Teaching Methods and Information Provision

9.1 Core Textbook
Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.

9.2 Required Reading
Han, J., and M. Kamber. Data Mining: Concepts and Techniques, 2nd ed. San Francisco: Morgan Kaufmann.
Leahey, T. H., and R. J. Harris. Learning and Cognition, 4th ed. New York: Prentice Hall.
Witten, I. H., and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. San Francisco: Morgan Kaufmann.

9.3 Supplementary Reading
Dietterich, T. G. Machine Learning. In Nature Encyclopedia of Cognitive Science. London: Macmillan.
Hirsh, H. Incremental Version Space Merging: A General Framework for Concept Learning. Boston: Kluwer.
Kearns, M. J., and U. V. Vazirani. An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
Mitchell, T. Machine Learning. New York: McGraw-Hill.
Valiant, L. A Theory of the Learnable. Communications of the ACM 27.
Vapnik, V. N. The Nature of Statistical Learning Theory. New York: Springer.
Winston, P. H. Learning Structural Descriptions from Examples. In The Psychology of Computer Vision, ed. P. H. Winston. New York: McGraw-Hill.

9.4 Handbooks
Handbook of Statistics, Vol. 2, Classification, Pattern Recognition and Reduction of Dimensionality, ed. P. R. Krishnaiah and L. N. Kanal. Amsterdam: North Holland, 1982.

9.5 Software
Mathematica 10.0

9.6 Distance Learning
MIT Open Course (Machine Learning)
HSE Learning Management System

10 Technical Provision
Computer, projector (for lectures or practice), computer class.


More information

Switchboard Language Model Improvement with Conversational Data from Gigaword

Switchboard Language Model Improvement with Conversational Data from Gigaword Katholieke Universiteit Leuven Faculty of Engineering Master in Artificial Intelligence (MAI) Speech and Language Technology (SLT) Switchboard Language Model Improvement with Conversational Data from Gigaword

More information

Universidade do Minho Escola de Engenharia

Universidade do Minho Escola de Engenharia Universidade do Minho Escola de Engenharia Universidade do Minho Escola de Engenharia Dissertação de Mestrado Knowledge Discovery is the nontrivial extraction of implicit, previously unknown, and potentially

More information

Artificial Neural Networks

Artificial Neural Networks Artificial Neural Networks Andres Chavez Math 382/L T/Th 2:00-3:40 April 13, 2010 Chavez2 Abstract The main interest of this paper is Artificial Neural Networks (ANNs). A brief history of the development

More information

Massachusetts Institute of Technology Tel: Massachusetts Avenue Room 32-D558 MA 02139

Massachusetts Institute of Technology Tel: Massachusetts Avenue  Room 32-D558 MA 02139 Hariharan Narayanan Massachusetts Institute of Technology Tel: 773.428.3115 LIDS har@mit.edu 77 Massachusetts Avenue http://www.mit.edu/~har Room 32-D558 MA 02139 EMPLOYMENT Massachusetts Institute of

More information

MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE

MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE Master of Science (M.S.) Major in Computer Science 1 MASTER OF SCIENCE (M.S.) MAJOR IN COMPUTER SCIENCE Major Program The programs in computer science are designed to prepare students for doctoral research,

More information