The Government of the Russian Federation

The Federal State Autonomous Institution of Higher Education "National Research University - Higher School of Economics"

Faculty of Business Informatics
Department of Innovation and Business in Information Technology
Master's Program "Big Data Systems"

Author: Dr. Sci., Prof. Andrey Dmitriev, a.dmitriev@hse.ru

Moscow, 2014

This document may not be reproduced or redistributed by other Departments of the University without permission of the Authors.
Field of Application and Regulations

The syllabus of the course "Applied Machine Learning" lays down minimum requirements for students' knowledge and skills; it also describes the contents and forms of training and assessment in use. The course is offered to students of the Master's Program "Big Data Systems" (area code ) in the Faculty of Business Informatics of the National Research University "Higher School of Economics". The course belongs to the curriculum pool of elective courses (1st year, M.2.B.1 Optional courses, M.2 Courses required by the Master's program of the academic year's curriculum) and is a two-module course (3rd and 4th modules). The course comprises 48 class periods (24 lecture hours and 24 practice hours); in addition, 96 academic hours are set aside for students' self-study.

The syllabus is intended for teachers responsible for the course (or closely related disciplines), teaching assistants, students enrolled in the course "Applied Machine Learning", as well as experts and statutory bodies carrying out assigned or regular accreditations in accordance with the educational standards of the National Research University Higher School of Economics, curriculum ("Business Informatics", area code ), Big Data Systems specialization, 1st year.

1 Course Objectives

The main objective of the course is to present, examine and discuss with students the fundamentals and principles of machine learning, with a focus on the role of machine learning in big data analysis.

More specifically, the objective can be thought of as a combination of the following constituents:
- familiarity with the peculiarities of supervised learning, parametric and multivariate methods, dimensionality reduction, clustering, nonparametric methods, decision trees, linear discrimination, kernel machines, and Bayesian estimation as applied areas of big data analysis;
- understanding of the main notions of machine learning theory and of the machine learning framework as one of the most significant areas of big data analysis;
- understanding of the role of machine learning in big data analysis;
- obtaining skills in applying machine learning to big data analysis.

2 Students' Competencies to be Developed by the Course

While mastering the course material, the student will:
- know the main notions of supervised learning, parametric and multivariate methods, dimensionality reduction, clustering, nonparametric methods, decision trees, linear discrimination, kernel machines, and Bayesian estimation;
- acquire skills of big data analysis;
- gain experience in big data analysis using these methods.

In short, the course contributes to the development of the following professional competencies:
- Ability to offer concepts and models, and to invent and test methods and tools for professional work (FSES/HSE code SC-2; descriptor: demonstrates; formed by lectures, practice and home tasks).
- Ability to apply the methods of system analysis and modeling to assess, design and develop the strategy of enterprise architecture (PC-13; descriptor: owns and uses; formed by lectures, practice and home tasks).
- Ability to develop and implement economic and mathematical models to justify project solutions in the field of information and computer technology (PC-14; descriptor: owns and uses; formed by lectures, practice and home tasks).
- Ability to organize individual and collective research work in the enterprise and manage it (PC-16; descriptor: demonstrates; formed by lectures, practice and home tasks).

3 The Course within the Program's Framework

The course "Applied Machine Learning" is offered to students of the Master's Program "Big Data Systems" (area code ) in the Faculty of Business Informatics of the National Research University "Higher School of Economics". The course is part of the curriculum pool of required courses (1st year, M.2.B.1 Optional courses, M.2 Courses required by the Master's program of the academic year's curriculum) and is a two-module course (3rd and 4th modules). The course comprises 48 class periods (24 lecture hours and 24 practice hours); in addition, 96 academic hours are set aside for students' self-study.
Academic control takes the following forms:
- Two home tasks, done by students individually. Each student prepares an electronic report (PDF format only); all reports are submitted in the LMS and are checked and graded by the instructor on a ten-point scale by the end of the 3rd and 4th modules respectively.
- A pass-final examination, which comprises a written test and computer-based problem solving.

The course builds on the following courses:
- Calculus
- Linear Algebra
- Probability Theory and Mathematical Statistics
- Data Analysis
- Economic and Mathematical Modeling
- Discrete Mathematics

The course requires the following student competencies and knowledge:
- the main definitions, theorems and properties from Calculus, Linear Algebra, Probability Theory and Mathematical Statistics, Data Analysis, Economic and Mathematical Modeling, and Discrete Mathematics;
- the ability to communicate both orally and in writing in English;
- the ability to search for, process and analyze information from a variety of sources.

The main provisions of the course support the further study of the following courses:
- Risk Analysis Based on Big Data
- Predictive Modeling
- Marketing Analytics Based on Big Data

4 Thematic Course Contents

Each topic comprises lectures, practice classes and independent work; in total the course includes 24 lecture hours, 24 practice hours and 96 hours of independent work.

3rd module:
1. Supervised Learning
2. Bayesian Decision Theory
3. Parametric and Multivariate Methods
4. Dimensionality Reduction
5. Clustering

4th module:
6. Nonparametric Methods
7. Decision Trees
8. Linear Discrimination
9. Multilayer Perceptrons
10. Kernel Machines
11. Bayesian Estimation
12. Design and Analysis of Machine Learning Experiments

5 Forms and Types of Testing

- Home task 1 (current control, week 29): problem solving, written report (paper).
- Home task 2 (current control, week 40): problem solving, written report (paper).
- Pass-fail examination (resultant control, week 41): written test (paper) and computer-based problem solving.

All forms of control are administered by the Department of Innovation and Business in Information Technology.

Evaluation Criteria

Current and resultant grades are made up of the following components. The two home tasks are done by students individually; each student prepares an electronic report (PDF format only), submits it in the LMS, and the instructor checks and grades it on a ten-point scale by the end of the corresponding module. All home tasks (HT) are assessed on the ten-point scale. The pass-final examination comprises a written test (WT) and computer-based problem solving (CS). Finally, the total course grade on the ten-point scale is obtained as
O(Total) = 0.6 * O(HT) + 0.1 * O(WT) + 0.3 * O(CS).

A grade of 4 or higher means successful completion of the course ("pass"), while a grade of 3 or lower means an unsuccessful result ("fail"). For example, O(HT) = 8, O(WT) = 10 and O(CS) = 7 give O(Total) = 0.6 * 8 + 0.1 * 10 + 0.3 * 7 = 7.9, which rounds to 8 ("pass"). The concluding rounded grade O(Total) is also converted to a five-point scale grade.

6 Detailed Course Contents

Lecture 1. Supervised Learning
Examples of Machine Learning Applications. Learning Associations: Classification, Regression, Unsupervised Learning, Reinforcement Learning. Learning a Class from Examples. Vapnik-Chervonenkis (VC) Dimension. Probably Approximately Correct (PAC) Learning. Noise. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.

Practice 1. Probably Approximately Correct (PAC) Learning. Noise. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.

References:
1. Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
2. Angluin, D. Queries and Concept Learning. Machine Learning 2.
3. Blumer, A., A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis Dimension. Journal of the ACM 36.
4. Dietterich, T. G. Machine Learning. In Nature Encyclopedia of Cognitive Science. London: Macmillan.
5. Hirsh, H. Incremental Version Space Merging: A General Framework for Concept Learning. Boston: Kluwer.

Lecture 2. Bayesian Decision Theory
Introduction. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.

Practice 2. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.

References:
1. Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
2. Agrawal, R., H. Mannila, R. Srikant, H. Toivonen, and A. Verkamo. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, ed. U. M. Fayyad, G. Piatetsky-Shapiro, P. Smyth, and R. Uthurusamy. Cambridge, MA: MIT Press.
3. Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
4. Li, J. On Optimal Rule Discovery. IEEE Transactions on Knowledge and Data Discovery 18.
5. Newman, J. R., ed. The World of Mathematics. Redmond, WA: Tempus.
6. Omiecinski, E. R. Alternative Interest Measures for Mining Associations in Databases. IEEE Transactions on Knowledge and Data Discovery 15.
7. Russell, S., and P. Norvig. Artificial Intelligence: A Modern Approach. New York: Prentice Hall.
8. Shafer, G., and J. Pearl, eds. Readings in Uncertain Reasoning. San Mateo, CA: Morgan Kaufmann.
9. Zhang, C., and S. Zhang. Association Rule Mining: Models and Algorithms. New York: Springer.

Lecture 3. Parametric and Multivariate Methods
Maximum Likelihood Estimation: Bernoulli Density, Multinomial Density, Gaussian (Normal) Density. Evaluating an Estimator: Bias and Variance. The Bayes Estimator. Parametric Classification. Regression. Tuning Model Complexity: Bias/Variance Dilemma. Model Selection Procedures. Multivariate Data. Parameter Estimation. Estimation of Missing Values. Multivariate Normal Distribution. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.

Practice 3. Maximum Likelihood Estimation. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.

References:
1. Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
2. Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
3. Friedman, J. H. Regularized Discriminant Analysis. Journal of the American Statistical Association 84.
4. Harville, D. A. Matrix Algebra from a Statistician's Perspective. New York: Springer.
5. Manning, C. D., and H. Schutze. Foundations of Statistical Natural Language Processing. Cambridge, MA: MIT Press.
6. McLachlan, G. J. Discriminant Analysis and Statistical Pattern Recognition. New York: Wiley.
7. Rencher, A. C. Methods of Multivariate Analysis. New York: Wiley.
8. Strang, G. Linear Algebra and its Applications, 3rd ed. New York: Harcourt Brace Jovanovich.

Lecture 4. Dimensionality Reduction
Subset Selection. Principal Components Analysis. Factor Analysis. Multidimensional Scaling. Linear Discriminant Analysis. Isomap. Locally Linear Embedding.

Practice 4. Principal Components Analysis. Factor Analysis. Multidimensional Scaling. Linear Discriminant Analysis.

References:
1. Balasubramanian, M., E. L. Schwartz, J. B. Tenenbaum, V. de Silva, and J. C. Langford. The Isomap Algorithm and Topological Stability. Science 295.
2. Chatfield, C., and A. J. Collins. Introduction to Multivariate Analysis. London: Chapman and Hall.
3. Cox, T. F., and M. A. A. Cox. Multidimensional Scaling. London: Chapman and Hall.
4. Devijer, P. A., and J. Kittler. Pattern Recognition: A Statistical Approach. New York: Prentice-Hall.
5. Flury, B. Common Principal Components and Related Multivariate Models. New York: Wiley.
6. Fukunaga, K., and P. M. Narendra. A Branch and Bound Algorithm for Feature Subset Selection. IEEE Transactions on Computers C-26.

Lecture 5. Clustering
Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm. Mixtures of Latent Variable Models. Supervised Learning after Clustering. Hierarchical Clustering. Choosing the Number of Clusters.

Practice 5. Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm. Mixtures of Latent Variable Models.

References:
1. Alpaydın, E. Soft Vector Quantization and the EM Algorithm. Neural Networks 11.
2. Barrow, H. B. Unsupervised Learning. Neural Computation 1.
3. Bezdek, J. C., and N. R. Pal. Two Soft Relatives of Learning Vector Quantization. Neural Networks 8.
4. Bishop, C. M. Latent Variable Models. In Learning in Graphical Models, ed. M. I. Jordan. Cambridge, MA: MIT Press.
5. Dempster, A. P., N. M. Laird, and D. B. Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society B 39.
6. Gersho, A., and R. M. Gray. Vector Quantization and Signal Compression. Boston: Kluwer.
7. Ghahramani, Z., and G. E. Hinton. The EM Algorithm for Mixtures of Factor Analyzers. Technical Report CRG-TR-96-1, Department of Computer Science, University of Toronto.

Lecture 6. Nonparametric Methods
Nonparametric Density Estimation. Histogram Estimator: Kernel Estimator, k-Nearest Neighbor Estimator. Generalization to Multivariate Data. Nonparametric Classification. Condensed Nearest Neighbor. Nonparametric Regression: Smoothing Models, Running Mean Smoother, Kernel Smoother, Running Line Smoother. How to Choose the Smoothing Parameter.

Practice 6. Nonparametric Density Estimation. Nonparametric Regression.

References:
1. Aha, D. W., ed. Special Issue on Lazy Learning. Artificial Intelligence Review 11(1-5).
2. Aha, D. W., D. Kibler, and M. K. Albert. Instance-Based Learning Algorithms. Machine Learning 6.
3. Atkeson, C. G., A. W. Moore, and S. Schaal. Locally Weighted Learning. Artificial Intelligence Review 11.
4. Cover, T. M., and P. E. Hart. Nearest Neighbor Pattern Classification. IEEE Transactions on Information Theory 13.
5. Dasarathy, B. V. Nearest Neighbor Norms: NN Pattern Classification Techniques. Los Alamitos, CA: IEEE Computer Society Press.
6. Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
7. Geman, S., E. Bienenstock, and R. Doursat. Neural Networks and the Bias/Variance Dilemma. Neural Computation 4.

Lecture 7. Linear Discrimination
Generalizing the Linear Model. Geometry of the Linear Discriminant: Two Classes, Multiple Classes. Pairwise Separation. Parametric Discrimination Revisited. Gradient Descent. Logistic Discrimination: Two Classes, Multiple Classes. Discrimination by Regression.

Practice 7. Generalizing the Linear Model. Geometry of the Linear Discriminant. Logistic Discrimination.

References:
1. Aizerman, M. A., E. M. Braverman, and L. I. Rozonoer. Theoretical Foundations of the Potential Function Method in Pattern Recognition Learning. Automation and Remote Control 25.
2. Anderson, J. A. Logistic Discrimination. In Handbook of Statistics, Vol. 2: Classification, Pattern Recognition and Reduction of Dimensionality, ed. P. R. Krishnaiah and L. N. Kanal. Amsterdam: North Holland.
3. Bridle, J. S. Probabilistic Interpretation of Feedforward Classification Network Outputs with Relationships to Statistical Pattern Recognition. In Neurocomputing: Algorithms, Architectures and Applications, ed. F. Fogelman-Soulie and J. Herault. Berlin: Springer.
4. Duda, R. O., P. E. Hart, and D. G. Stork. Pattern Classification, 2nd ed. New York: Wiley.
5. McCullagh, P., and J. A. Nelder. Generalized Linear Models. London: Chapman and Hall.

Lecture 8. Multilayer Perceptrons
Introduction: Understanding the Brain, Neural Networks as a Paradigm for Parallel Processing. The Perceptron. Training a Perceptron. Learning Boolean Functions. Multilayer Perceptrons. MLP as a Universal Approximator. Backpropagation Algorithm: Nonlinear Regression, Two-Class Discrimination, Multiclass Discrimination, Multiple Hidden Layers. Training Procedures: Improving Convergence, Overtraining, Structuring the Network, Hints. Tuning the Network Size. Bayesian View of Learning. Dimensionality Reduction. Learning Time. Time Delay Neural Networks. Recurrent Networks.

Practice 8. Backpropagation Algorithm. Training Procedures.

References:
1. Abu-Mostafa, Y. Hints. Neural Computation 7.
2. Aran, O., O. T. Yıldız, and E. Alpaydın. An Incremental Framework Based on Cross-Validation for Estimating the Architecture of a Multilayer Perceptron. International Journal of Pattern Recognition and Artificial Intelligence 23.
3. Ash, T. Dynamic Node Creation in Backpropagation Networks. Connection Science 1.
4. Battiti, R. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method. Neural Computation 4.
5. Bishop, C. M. Neural Networks for Pattern Recognition. Oxford: Oxford University Press.
6. Bourlard, H., and Y. Kamp. Auto-Association by Multilayer Perceptrons and Singular Value Decomposition. Biological Cybernetics 59.

Lecture 9. Multilayer Perceptrons
Introduction: Understanding the Brain, Neural Networks as a Paradigm for Parallel Processing. The Perceptron. Training a Perceptron. Learning Boolean Functions. Multilayer Perceptrons. MLP as a Universal Approximator. Backpropagation Algorithm: Nonlinear Regression, Two-Class Discrimination, Multiclass Discrimination, Multiple Hidden Layers. Training Procedures: Improving Convergence, Overtraining, Structuring the Network, Hints. Tuning the Network Size. Bayesian View of Learning. Dimensionality Reduction. Learning Time. Time Delay Neural Networks. Recurrent Networks.

Practice 9. Backpropagation Algorithm. Training Procedures.

References:
1. Abu-Mostafa, Y. Hints. Neural Computation 7.
2. Aran, O., O. T. Yıldız, and E. Alpaydın. An Incremental Framework Based on Cross-Validation for Estimating the Architecture of a Multilayer Perceptron. International Journal of Pattern Recognition and Artificial Intelligence 23.
3. Ash, T. Dynamic Node Creation in Backpropagation Networks. Connection Science 1.
4. Battiti, R. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method. Neural Computation 4.
5. Bishop, C. M. Neural Networks for Pattern Recognition. Oxford: Oxford University Press.
6. Bourlard, H., and Y. Kamp. Auto-Association by Multilayer Perceptrons and Singular Value Decomposition. Biological Cybernetics 59.

Lecture 10. Kernel Machines
Optimal Separating Hyperplane. The Nonseparable Case: Soft Margin Hyperplane. ν-SVM. Kernel Trick. Vectorial Kernels. Defining Kernels. Multiple Kernel Learning. Multiclass Kernel Machines. Kernel Machines for Regression. One-Class Kernel Machines. Kernel Dimensionality Reduction.

Practice 10. The Nonseparable Case: Soft Margin Hyperplane. ν-SVM. Multiclass Kernel Machines. Kernel Machines for Regression.

References:
1. Allwein, E. L., R. E. Schapire, and Y. Singer. Reducing Multiclass to Binary: A Unifying Approach for Margin Classifiers. Journal of Machine Learning Research 1.
2. Burges, C. J. C. A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2.
3. Chang, C.-C., and C.-J. Lin. LIBSVM: A Library for Support Vector Machines.
4. Cherkassky, V., and F. Mulier. Learning from Data: Concepts, Theory, and Methods. New York: Wiley.
5. Cortes, C., and V. Vapnik. Support-Vector Networks. Machine Learning 20.
6. Dietterich, T. G., and G. Bakiri. Solving Multiclass Learning Problems via Error-Correcting Output Codes. Journal of Artificial Intelligence Research 2.
7. Gonen, M., and E. Alpaydın. Localized Multiple Kernel Learning. In 25th International Conference on Machine Learning, ed. A. McCallum and S. Roweis. Madison, WI: Omnipress.

Lecture 11. Bayesian Estimation
Estimating the Parameter of a Distribution: Discrete Variables, Continuous Variables. Bayesian Estimation of the Parameters of a Function: Regression, The Use of Basis/Kernel Functions, Bayesian Classification. Gaussian Processes.

Practice 11. Estimating the Parameter of a Distribution. Bayesian Estimation of the Parameters of a Function. Gaussian Processes.

References:
1. Bishop, C. M. Pattern Recognition and Machine Learning. New York: Springer.
2. Figueiredo, M. A. T. Adaptive Sparseness for Supervised Learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 25.
3. Gelman, A. Objections to Bayesian Statistics. Bayesian Statistics 3.
4. MacKay, D. J. C. Introduction to Gaussian Processes. In Neural Networks and Machine Learning, ed. C. M. Bishop. Berlin: Springer.
5. MacKay, D. J. C. Information Theory, Inference, and Learning Algorithms. Cambridge, UK: Cambridge University Press.
6. Rasmussen, C. E., and C. K. I. Williams. Gaussian Processes for Machine Learning. Cambridge, MA: MIT Press.
7. Tibshirani, R. Regression Shrinkage and Selection via the Lasso. Journal of the Royal Statistical Society B 58.

Lecture 12. Design and Analysis of Machine Learning Experiments
Factors, Response, and Strategy of Experimentation. Response Surface Design. Randomization, Replication, and Blocking. Guidelines for Machine Learning Experiments. Cross-Validation and Resampling Methods: K-Fold Cross-Validation, 5×2 Cross-Validation, Bootstrapping. Measuring Classifier Performance. Interval Estimation. Hypothesis Testing. Assessing a Classification Algorithm's Performance: Binomial Test, Approximate Normal Test, t Test. Comparing Two Classification Algorithms: McNemar's Test, K-Fold Cross-Validated Paired t Test, 5×2 cv Paired t Test, 5×2 cv Paired F Test. Comparing Multiple Algorithms: Analysis of Variance. Comparison over Multiple Datasets: Comparing Two Algorithms, Multiple Algorithms.

Practice 12. Cross-Validation and Resampling Methods. Assessing a Classification Algorithm's Performance.

References:
1. Alpaydın, E. Combined 5×2 cv F Test for Comparing Supervised Classification Learning Algorithms. Neural Computation 11.
2. Bouckaert, R. R. Choosing between Two Learning Algorithms Based on Calibrated Tests. In Twentieth International Conference on Machine Learning, ed. T. Fawcett and N. Mishra. Menlo Park, CA: AAAI Press.
3. Demsar, J. Statistical Comparison of Classifiers over Multiple Data Sets. Journal of Machine Learning Research 7.
4. Dietterich, T. G. Approximate Statistical Tests for Comparing Supervised Classification Learning Algorithms. Neural Computation 10.
5. Fawcett, T. An Introduction to ROC Analysis. Pattern Recognition Letters 27.
6. Montgomery, D. C. Design and Analysis of Experiments, 6th ed. New York: Wiley.
7. Ross, S. M. Introduction to Probability and Statistics for Engineers and Scientists. New York: Wiley.

7 Educational Technology

Classes use various types of active methods: analysis of practical problems, group work, computer simulations in the computational software Mathematica 10.0, and distance learning via the LMS.

8 Methods and Materials for Current Control and Attestation

8.1 Examples of Problems for Home Tasks

Problem 1. Imagine you have two possibilities: you can fax a document, that is, send the image, or you can use an optical character reader (OCR) and send the text file. Discuss the advantages and disadvantages of the two approaches in a comparative manner. When would one be preferable over the other?

Problem 2. Somebody tosses a fair coin; if the result is heads, you get nothing, otherwise you get $5. How much would you pay to play this game? What if the win is $500 instead of $5?

Problem 3. Show that as we move an item from the antecedent to the consequent, confidence can never increase: confidence(ABC → D) ≥ confidence(AB → CD).

Problem 4. Write code that generates a normal sample with given μ and σ, and code that calculates m and s from the sample. Do the same using the Bayes estimator, assuming a prior distribution for μ.

Problem 5. In Isomap, instead of using the Euclidean distance, we can also use the Mahalanobis distance between neighboring points. What are the advantages and disadvantages of this approach, if any?

Problem 6. In image compression, k-means can be used as follows: the image is divided into non-overlapping c×c windows, and these c²-dimensional vectors make up the sample. For a given k, which is generally a power of two, we run k-means clustering. The reference vectors and the index for each window are sent over the communication line. At the receiving end, the image is reconstructed by reading from the table of reference vectors using the indices. Write a computer program that does this for different values of k and c. For each case, calculate the reconstruction error and the compression rate.

Problem 7. In the running smoother, we can fit a constant, a line, or a higher-degree polynomial at a test point. How can we choose among them?

Problem 8. What is the implication of using a single η for all xj in gradient descent?
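As an illustration of the kind of solution expected for Problem 4, a minimal Python sketch follows (NumPy is assumed to be available; the prior parameters mu0 and sigma0 are illustrative choices, not specified by the problem):

```python
import numpy as np

rng = np.random.default_rng(0)

def generate_normal_sample(mu, sigma, n):
    """Draw n observations from N(mu, sigma^2)."""
    return rng.normal(mu, sigma, size=n)

def ml_estimates(x):
    """Maximum likelihood estimates m and s of the mean and standard deviation."""
    m = x.mean()
    s = np.sqrt(((x - m) ** 2).mean())  # ML variance divides by n, not n - 1
    return m, s

def bayes_mean_estimate(x, mu0, sigma0, sigma):
    """Bayes estimator of the mean under a N(mu0, sigma0^2) prior on mu,
    with the data variance sigma^2 assumed known: the posterior mean is a
    precision-weighted average of the sample mean and the prior mean."""
    n = len(x)
    w = (n / sigma**2) / (n / sigma**2 + 1 / sigma0**2)
    return w * x.mean() + (1 - w) * mu0

x = generate_normal_sample(mu=2.0, sigma=1.5, n=1000)
m, s = ml_estimates(x)
mb = bayes_mean_estimate(x, mu0=0.0, sigma0=1.0, sigma=1.5)
```

With a large sample, the Bayes estimate is dominated by the data and lies close to the ML estimate; with a small sample it is pulled toward the prior mean mu0.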
Problem 9. Consider an MLP architecture with one hidden layer where there are also direct weights from the inputs to the output units. Explain when such a structure would be helpful and how it can be trained.

Problem 10. Incremental learning of the structure of an MLP can be viewed as a state-space search. What are the operators? What is the goodness function? What types of search strategies are appropriate? Define these in such a way that dynamic node creation and cascade correlation are special instantiations.

8.2 Questions for the Pass-Final Examination

Theoretical Questions
1. Examples of Machine Learning Applications.
2. Learning Associations: Classification, Regression, Unsupervised Learning, Reinforcement Learning. Learning a Class from Examples. Vapnik-Chervonenkis (VC) Dimension. Probably Approximately Correct (PAC) Learning. Noise.
3. Learning Multiple Classes. Regression. Model Selection and Generalization. Dimensions of a Supervised Machine Learning Algorithm.
4. Introduction. Classification. Losses and Risks. Discriminant Functions. Utility Theory. Association Rules.
5. Maximum Likelihood Estimation: Bernoulli Density, Multinomial Density, Gaussian (Normal) Density.
6. Evaluating an Estimator: Bias and Variance. The Bayes Estimator. Parametric Classification. Regression.
7. Tuning Model Complexity: Bias/Variance Dilemma.
8. Model Selection Procedures.
9. Multivariate Data. Parameter Estimation. Estimation of Missing Values. Multivariate Normal Distribution. Multivariate Classification. Tuning Complexity. Discrete Features. Multivariate Regression.
10. Subset Selection. Principal Components Analysis.
11. Factor Analysis.
12. Multidimensional Scaling.
13. Linear Discriminant Analysis. Isomap. Locally Linear Embedding.
14. Mixture Densities. k-Means Clustering. Expectation-Maximization Algorithm.
15. Mixtures of Latent Variable Models.
16. Supervised Learning after Clustering. Hierarchical Clustering. Choosing the Number of Clusters.
17. Nonparametric Density Estimation. Histogram Estimator: Kernel Estimator, k-Nearest Neighbor Estimator.
18. Generalization to Multivariate Data. Nonparametric Classification. Condensed Nearest Neighbor.
19. Nonparametric Regression: Smoothing Models, Running Mean Smoother, Kernel Smoother, Running Line Smoother. How to Choose the Smoothing Parameter.
20. Generalizing the Linear Model. Geometry of the Linear Discriminant: Two Classes, Multiple Classes.
21. Pairwise Separation. Parametric Discrimination Revisited. Gradient Descent. Logistic Discrimination: Two Classes, Multiple Classes. Discrimination by Regression.
22. Understanding the Brain, Neural Networks as a Paradigm for Parallel Processing.
23. The Perceptron. Training a Perceptron. Learning Boolean Functions. Multilayer Perceptrons. MLP as a Universal Approximator.
24. Backpropagation Algorithm: Nonlinear Regression, Two-Class Discrimination, Multiclass Discrimination, Multiple Hidden Layers.
25. Training Procedures: Improving Convergence, Overtraining, Structuring the Network, Hints.
26. Tuning the Network Size. Bayesian View of Learning. Dimensionality Reduction.
27. Learning Time. Time Delay Neural Networks. Recurrent Networks.
28. Optimal Separating Hyperplane.
29. The Nonseparable Case: Soft Margin Hyperplane, ν-SVM.
30. Kernel Trick. Vectorial Kernels. Defining Kernels.
31. Multiple Kernel Learning. Multiclass Kernel Machines.
32. Kernel Machines for Regression. One-Class Kernel Machines.
33. Kernel Dimensionality Reduction.
34. Estimating the Parameter of a Distribution: Discrete Variables, Continuous Variables.
35. Bayesian Estimation of the Parameters of a Function: Regression, The Use of Basis/Kernel Functions, Bayesian Classification. Gaussian Processes.
36. Factors, Response, and Strategy of Experimentation. Response Surface Design. Randomization, Replication, and Blocking.
37. Guidelines for Machine Learning Experiments.
38. Cross-Validation and Resampling Methods: K-Fold Cross-Validation, 5×2 Cross-Validation, Bootstrapping.
39. Measuring Classifier Performance. Interval Estimation. Hypothesis Testing. Assessing a Classification Algorithm's Performance: Binomial Test, Approximate Normal Test, t Test.
40. Comparing Two Classification Algorithms: McNemar's Test, K-Fold Cross-Validated Paired t Test, 5×2 cv Paired t Test, 5×2 cv Paired F Test.
41. Comparing Multiple Algorithms: Analysis of Variance. Comparison over Multiple Datasets: Comparing Two Algorithms, Multiple Algorithms.

Examples of Problems

Problem 1. In a two-class problem, let us say we have the loss matrix where λ11 = λ22 = 0, λ21 = 1 and λ12 = α. Determine the threshold of decision as a function of α.

Problem 2. The K-fold cross-validated t test only tests for the equality of error rates. If the test rejects, we do not know which classification algorithm has the lower error rate. How can we test whether the first classification algorithm does not have a higher error rate than the second one? Hint: we have to test H0: μ ≤ 0 vs. H1: μ > 0.

Problem 3. If we have two variants of algorithm A and three variants of algorithm B, how can we compare the overall accuracies of A and B taking all their variants into account?

9 Teaching Methods and Information Provision

9.1 Core Textbook
Alpaydin, E. Introduction to Machine Learning, 2nd ed. Cambridge, MA: MIT Press, 2010.
9.2 Required Reading
Han, J., and M. Kamber. Data Mining: Concepts and Techniques, 2nd ed. San Francisco: Morgan Kaufmann.
Leahey, T. H., and R. J. Harris. Learning and Cognition, 4th ed. New York: Prentice Hall.
Witten, I. H., and E. Frank. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. San Francisco: Morgan Kaufmann.

9.3 Supplementary Reading
Dietterich, T. G. Machine Learning. In Nature Encyclopedia of Cognitive Science. London: Macmillan.
Hirsh, H. Incremental Version Space Merging: A General Framework for Concept Learning. Boston: Kluwer.
Kearns, M. J., and U. V. Vazirani. An Introduction to Computational Learning Theory. Cambridge, MA: MIT Press.
Mitchell, T. Machine Learning. New York: McGraw-Hill.
Valiant, L. A Theory of the Learnable. Communications of the ACM 27.
Vapnik, V. N. The Nature of Statistical Learning Theory. New York: Springer.
Winston, P. H. Learning Structural Descriptions from Examples. In The Psychology of Computer Vision, ed. P. H. Winston. New York: McGraw-Hill.

9.4 Handbooks
Handbook of Statistics, Vol. 2: Classification, Pattern Recognition and Reduction of Dimensionality, ed. P. R. Krishnaiah and L. N. Kanal. Amsterdam: North Holland, 1982.

9.5 Software
Mathematica v10.0

9.6 Distance Learning
MIT Open Course (Machine Learning)
HSE Learning Management System

10 Technical Provision
Computer, projector (for lectures or practice), computer class