CS6375: Recap. Nicholas Ruozzi University of Texas at Dallas

Supervised Learning
  Regression & classification
  Discriminative methods
  k-NN
  Decision trees
  Perceptron
  SVMs & kernel methods
  Logistic regression
  Parameter learning
  Maximum likelihood estimation
  Expectation maximization

Bayesian Approaches
  MAP estimation
  Prior/posterior probabilities
  Bayesian networks
  Naive Bayes
  Hidden Markov models
  Structure learning via Chow-Liu trees
  Latent Dirichlet Allocation (LDA)

Unsupervised Learning
  Clustering
  k-means
  Spectral clustering
  Hierarchical clustering
  Expectation maximization
  Soft clustering
  Mixtures of Gaussians

Learning Theory
  PAC learning
  VC dimension
  Bias/variance tradeoff
  Chernoff bounds
  Sample complexity

Optimization Methods
  Gradient descent
  Stochastic gradient descent
  Subgradient methods
  Coordinate descent
  Lagrange multipliers and duality

Matrix-Based Methods
  Dimensionality reduction
  PCA
  Matrix factorizations
  Collaborative filtering
  Semisupervised learning

Ensemble Methods
  Bootstrap sampling
  Bagging
  Boosting

Other Learning Topics
  Active learning
  Reinforcement learning
  Learning to rank
  Neural networks
  Perceptron and sigmoid neurons
  Backpropagation

Questions about the course content? (Reminder: I do not have office hours this week)

For the final...
  You should understand the basic concepts and theory of all of the algorithms and techniques that we have discussed in the course.
  There is no need to memorize complicated formulas, etc.
  For example, if I ask for the sample complexity of a scheme, I will give you the generic formula.
  However, you should be able to derive the algorithms and updates (e.g., Lagrange multipliers and SVMs, the EM algorithm, etc.).

For the final...
  No calculators, books, notes, etc. will be permitted.
  As before, if you need a calculator, you have done something terribly wrong.
  The exam will be in roughly the same format.
  Expect true/false questions, short answers, and two to three long-answer questions.
  The exam will emphasize the new material, but ALL material will be tested.
  Take a look at the practice exams!

Final Exam
  Wednesday, 12/16/2015
  11:00AM - 1:45PM
  ECSS 2.410

Related Courses at UTD
  Natural Language Processing (CS 6320)
  Statistical Methods in Artificial Intelligence and Machine Learning (CS 6347)
  Artificial Intelligence (CS 6364)
  Information Retrieval (CS 6322)
  Intelligent Systems Analysis (ACN 6347)
  Intelligent Systems Design (ACN 6349)

ML-Related People
  Vincent Ng (NLP)
  Yang Liu (NLP)
  Vibhav Gogate (MLNs, Sampling, Graphical Models)
  Sanda Harabagiu (NLP & Health)
  Dan Moldovan (NLP)
  Nicholas Ruozzi (Graphical Models & Approx. Inference)

Matrix Decomposition
  PCA is a dimensionality reduction technique that is based on matrix factorizations.
  Drawback: PCA returns the eigenvectors of a matrix as the most relevant vectors (many applications need subsets of the data that best describe it).
  Feature selection / matrix factorization using Bayesian networks
  Input: data points as rows of an m × n matrix X
  Output: X ≈ CU, where C is an m × k matrix of columns selected from X and U is an arbitrary matrix
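The factorization X ≈ CU above can be illustrated with a minimal sketch. Note the assumptions: the column-selection rule used here (keep the k columns with the largest Euclidean norm) is a simple stand-in chosen only for illustration, not the Bayesian-network-based selection named on the slide, and the function name `column_subset_factorization` is hypothetical. Given the chosen columns C, the matrix U is fit by ordinary least squares.

```python
import numpy as np

def column_subset_factorization(X, k):
    """Approximate X (m x n) as C @ U, where C holds k actual columns of X.

    Assumption: columns are chosen by largest Euclidean norm, a simple
    heuristic used only for illustration (not the method from the slide).
    """
    norms = np.linalg.norm(X, axis=0)
    cols = np.sort(np.argsort(norms)[-k:])   # indices of the k selected columns
    C = X[:, cols]                           # m x k, columns copied from X
    # U minimizes the Frobenius norm ||X - C U|| (least squares, column by column).
    U, *_ = np.linalg.lstsq(C, X, rcond=None)  # k x n
    return cols, C, U

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
cols, C, U = column_subset_factorization(X, k=2)
rel_err = np.linalg.norm(X - C @ U) / np.linalg.norm(X)
print("selected columns:", cols, "relative error:", rel_err)
```

Unlike PCA, every column of C here is an original column of X, so the selected features stay directly interpretable; the price is usually a larger reconstruction error than the best rank-k approximation.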

Airplane Health
  Collaboration with Southwest Airlines
  Pilots/maintenance crews perform physical inspections of planes and are asked to translate observations into maintenance codes.
  The observations (symptoms) and the codes (diagnoses) typically are mismatched (inspections performed quickly and too expensive to train everyone).
  Multiclass classification problem: given as input correctly labeled training data, learn to predict the codes for new symptoms.

Parameter Tying
  We saw ℓ2 regularization as a way to prefer simpler models.
  Another type of simple model might be a Bayesian network in which many of the parameters (i.e., the conditional probability distributions) are the same.
  This type of parameter tying is used in neural networks as well (though it is typically done by hand).
  Study the design of regularization-based methods for parameter tying and improved inference/sampling methods for models with tied parameters.
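As a reminder of what "preferring simpler models" means for ℓ2 regularization, here is a minimal sketch using ridge regression on synthetic data. The data, the helper name `ridge_fit`, and the choice of penalty strengths are all assumptions made for illustration; the sketch covers only the ℓ2 penalty recalled above, not the parameter-tying methods proposed for study.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form minimizer of ||X w - y||^2 + lam * ||w||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic regression problem (illustrative values only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
w_true = np.array([2.0, -1.0, 0.0, 0.5, 3.0])
y = X @ w_true + 0.1 * rng.normal(size=50)

# A larger penalty pulls the learned weights toward zero.
lams = (0.0, 1.0, 10.0, 100.0)
norms = [np.linalg.norm(ridge_fit(X, y, lam)) for lam in lams]
for lam, n in zip(lams, norms):
    print(f"lambda = {lam:6.1f}   ||w|| = {n:.3f}")
```

The weight norm shrinks as the penalty grows, which is the sense in which the penalty favors "simpler" solutions.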

Graphical Models
  Generalization of Bayesian networks, very popular in the machine learning community (take the class!)
  Lower bounds for continuous partition functions
  Theoretical guarantees on the exactness of inference in continuous graphical models
  Faster algorithms (via Frank-Wolfe) for learning in latent variable models

Please evaluate the course! eval.utdallas.edu