Hot Topics in Machine Learning


Winter Term 2016/2017
Prof. Marius Kloft, Florian Wenzel
October 19, 2016

Organization

The seminar is organized by Prof. Marius Kloft and Florian Wenzel (PhD student). For questions regarding the seminar, please contact me:

Contact: Florian Wenzel, wenzelfl@hu-berlin.de, www.florian-wenzel.de

Course Website: can be found on my website, www.florian-wenzel.de
Doodle: please pick a slot! The link to the Doodle poll is on the course website.

Each participant should choose (at least) one topic which she/he wants to present. Topics can be anything regarding ML (as long as it's hot):
- an interesting paper
- an interesting ML method or algorithm
- a Bachelor's or Master's thesis (work in progress is totally fine)
- your own ML project
- a topic from our list of potential topics

- Doodle for open slots
- The presentation should be around 45 min + Q&A
- 2 weeks before the presentation: meet with Marius and discuss / rehearse the presentation (10 min meeting)
- We will meet each week (exceptions will be announced via the email list)
- Credit points for a successful presentation and active participation

Possible Topics: Dimensionality Reduction

Isomap
- nonlinear dimensionality reduction method
- estimates the intrinsic geometry of a data manifold based on a rough estimate of each data point's neighbors
Sources: http://isomap.stanford.edu/
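The neighborhood idea can be sketched in a few lines. This is a minimal illustration, not the reference implementation: it only covers the first stage (a k-nearest-neighbor graph plus shortest paths as approximate geodesic distances) and omits the final embedding step. The function name, the half-circle toy data and the choice k=2 are assumptions made for the example.

```python
import math

def geodesic_distances(points, k):
    """Approximate geodesic distances by shortest paths in a k-NN graph."""
    n = len(points)
    euclid = [[math.dist(p, q) for q in points] for p in points]
    inf = float("inf")
    graph = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for i in range(n):
        # Connect each point to its k nearest neighbors (index 0 is the point itself).
        neighbors = sorted(range(n), key=lambda j: euclid[i][j])[1:k + 1]
        for j in neighbors:
            graph[i][j] = graph[j][i] = euclid[i][j]
    # Floyd-Warshall all-pairs shortest paths.
    for m in range(n):
        for i in range(n):
            for j in range(n):
                via = graph[i][m] + graph[m][j]
                if via < graph[i][j]:
                    graph[i][j] = via
    return graph

# Toy manifold (assumption): 50 points on a half circle of radius 1.
points = [(math.cos(math.pi * t / 49), math.sin(math.pi * t / 49)) for t in range(50)]
geo = geodesic_distances(points, k=2)
straight = math.dist(points[0], points[-1])
```

Here `straight` is the Euclidean distance between the two end points (about 2), while `geo[0][-1]` follows the curve and comes out close to the arc length pi.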

t-SNE
- nonlinear dimensionality reduction method
- t-SNE constructs a probability distribution over pairs of high-dimensional objects in such a way that similar objects are assigned a high probability
- very popular and used in a wide range of applications
Sources: http://jmlr.org/papers/volume9/vandermaaten08a/vandermaaten08a.pdf
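As a rough illustration of the pairwise distribution mentioned above, the sketch below computes Gaussian conditional similarities between high-dimensional points. It is a simplification: a single fixed bandwidth `sigma` is assumed for all points, whereas t-SNE tunes it per point via a perplexity target, and the low-dimensional embedding step is not shown. Function name and toy data are hypothetical.

```python
import math

def conditional_similarities(points, sigma=1.0):
    """Return rows P[i][j]: probability that point i picks point j as a neighbor."""
    n = len(points)
    rows = []
    for i in range(n):
        weights = [
            0.0 if i == j
            else math.exp(-math.dist(points[i], points[j]) ** 2 / (2 * sigma ** 2))
            for j in range(n)
        ]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

# Toy data (assumption): two nearby points and one distant point in 3-D.
points = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (3.0, 3.0, 3.0)]
P = conditional_similarities(points)
```

Each row of `P` sums to one, and the nearby pair receives almost all of the probability mass in its rows.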

GP-LVM
- really cool nonlinear dimensionality reduction method based on Gaussian Processes
- embeds data points in a latent variable space (equipped with a probability measure)
- simultaneously gives probabilities for data points on the learned manifold for belonging to the true (unknown) latent space
Sources: https://www.youtube.com/watch?v=l98lw9khzfc
Paper: Gaussian Process Latent Variable Models for Visualisation of High Dimensional Data

Other Dim Reduction Related Topics
- NMF (Nonnegative Matrix Factorization)
- LLE (Locally Linear Embedding)

Possible Topics: Inference

Markov Chain Monte Carlo (MCMC)
- aim: sample from an (intractable) posterior
- construct a Markov chain that converges to the target distribution (as equilibrium distribution)
- for the seminar you can focus on the popular Metropolis-Hastings algorithm
- other (more advanced) MCMC algorithms: Hamiltonian Monte Carlo (HMC), SGD-based MC (next slide)
Sources: Paper: An Introduction to MCMC for Machine Learning
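A minimal sketch of random-walk Metropolis-Hastings, assuming a one-dimensional target known only up to a normalizing constant. The target (a standard normal), the Gaussian proposal and all parameter values are choices made for this example.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_samples, step=1.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        # The proposal is symmetric, so the acceptance ratio is target(proposal) / target(x).
        log_alpha = log_target(proposal) - log_target(x)
        if log_alpha >= 0 or rng.random() < math.exp(log_alpha):
            x = proposal
        samples.append(x)  # a rejected proposal repeats the current state
    return samples

def log_standard_normal(x):
    """Unnormalized log density of N(0, 1); the constant is not needed."""
    return -0.5 * x * x

samples = metropolis_hastings(log_standard_normal, x0=0.0, n_samples=20000, step=2.0)
```

After discarding an initial burn-in, the empirical mean and variance of `samples` should be close to 0 and 1. Only differences of the log density are used, which is why an intractable normalizing constant is not a problem.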

Scalable Bayesian Inference
- most MCMC algorithms need to sweep through the whole dataset for each sample
- SGD-based sampling uses only a small fraction (a so-called mini-batch) of the dataset for each sample
- based on Stochastic Gradient Descent (SGD)
- suitable for the seminar: SGLD (Langevin Dynamics) or SGFS (improved version of SGLD)
Sources: https://www.youtube.com/watch?v=qbf5ebdew7q
Paper: Bayesian Learning via Stochastic Gradient Langevin Dynamics
Paper: Bayesian Posterior Sampling via Stochastic Gradient Fisher Scoring
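The mini-batch idea can be sketched with an SGLD-style update for a toy model: a Gaussian likelihood with unit variance and unknown mean, plus a Gaussian prior. This is a simplified illustration under stated assumptions. In particular it uses a constant step size, whereas the SGLD paper works with decreasing step sizes, and the model, names and parameter values are made up for the example.

```python
import math
import random

def sgld_gaussian_mean(data, n_steps, batch_size, step_size, prior_var=100.0, seed=0):
    """SGLD samples for the mean theta of a unit-variance Gaussian likelihood."""
    rng = random.Random(seed)
    n = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_log_prior = -theta / prior_var
        # Rescale the mini-batch gradient so it estimates the full-data gradient.
        grad_log_lik = (n / batch_size) * sum(x - theta for x in batch)
        noise = rng.gauss(0.0, math.sqrt(step_size))
        theta += 0.5 * step_size * (grad_log_prior + grad_log_lik) + noise
        samples.append(theta)
    return samples

# Synthetic data (assumption): 1000 draws from N(2, 1).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(1000)]
samples = sgld_gaussian_mean(data, n_steps=5000, batch_size=50, step_size=1e-4)
kept = samples[500:]
posterior_mean_estimate = sum(kept) / len(kept)
```

Each update touches only 50 of the 1000 data points, yet the averaged samples should land near the data mean, which is approximately the posterior mean under this weak prior.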

Variational Inference
- approximate the posterior with another (easier) distribution
- aim: minimize the Kullback-Leibler divergence
- this is equivalent to maximizing the ELBO (evidence lower bound)
- different assumptions lead to different algorithms; for the seminar the mean field VI algorithm is suitable
- scalable version: Stochastic Variational Inference
Sources: Books: Bishop, Murphy
Paper: Blei et al: Variational Inference: A Review for Statisticians
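The equivalence between minimizing the KL divergence and maximizing the ELBO follows from the identity log p(x) = ELBO(q) + KL(q || p(z|x)). The sketch below checks it numerically for a discrete latent variable with three states; the joint probabilities and the distribution `q` are made-up numbers.

```python
import math

# joint[z] = p(x, z) for one fixed observation x and latent z in {0, 1, 2} (made-up values).
joint = [0.10, 0.25, 0.05]
evidence = sum(joint)                      # p(x)
posterior = [p / evidence for p in joint]  # p(z | x)

def elbo(q):
    """Evidence lower bound: E_q[log p(x, z) - log q(z)]."""
    return sum(qz * (math.log(pz) - math.log(qz)) for qz, pz in zip(q, joint) if qz > 0)

def kl(q, p):
    """Kullback-Leibler divergence KL(q || p) for discrete distributions."""
    return sum(qz * math.log(qz / pz) for qz, pz in zip(q, p) if qz > 0)

q = [0.2, 0.5, 0.3]  # an arbitrary approximating distribution
gap = math.log(evidence) - elbo(q)
```

The value `gap` equals `kl(q, posterior)` up to rounding. Since log p(x) does not depend on `q`, raising the ELBO lowers the KL divergence by the same amount, and the bound is tight exactly when `q` is the posterior.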

Expectation Propagation
- similar idea to Variational Inference, but now minimize the reverse KL divergence
- this leads to a completely different algorithm
- find the approximating distribution by moment matching
Sources: Books: Bishop, Murphy
Paper: Minka: Expectation Propagation for Approximate Bayesian Inference
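To illustrate moment matching on its own: when a Gaussian q is fitted to a distribution p by minimizing KL(p || q), the optimum has the same mean and variance as p. The sketch below does this for a Gaussian mixture. It shows only this projection step, not the full iterative EP algorithm, and the function name and mixture values are assumptions for the example.

```python
def moment_match(weights, means, variances):
    """Mean and variance of the single Gaussian matching a 1-D Gaussian mixture."""
    mean = sum(w * m for w, m in zip(weights, means))
    # E[x^2] of each component is variance + mean^2.
    second_moment = sum(w * (v + m * m) for w, m, v in zip(weights, means, variances))
    return mean, second_moment - mean * mean

# Symmetric two-component mixture (made-up values).
mean, var = moment_match([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])
```

For this mixture the matched Gaussian has mean 0 and variance 2: it spreads over both modes instead of locking onto one of them, which is the typical behaviour of this direction of the KL divergence.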

Possible Topics: Multi Stuff

Multi Class Learning
- present different generalizations of binary class to multi class models
- compare different strategies (one-vs-rest, one-vs-one)
- focus on Multi Class SVM (present different formulations)
- extreme classification (thousands of classes)
Sources: Papers by Marius Kloft
Book: Bishop
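The two reduction strategies differ in which binary problems they create. The following sketch builds only the relabelled training sets, without training any classifier; the helper names and the toy labels are hypothetical.

```python
from itertools import combinations

def one_vs_rest_problems(labels):
    """One binary labelling per class: +1 for that class, -1 for all others."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else -1 for y in labels] for c in classes}

def one_vs_one_problems(labels):
    """One binary problem per class pair, restricted to examples of those two classes."""
    classes = sorted(set(labels))
    problems = {}
    for a, b in combinations(classes, 2):
        problems[(a, b)] = [
            (i, 1 if y == a else -1) for i, y in enumerate(labels) if y in (a, b)
        ]
    return problems

labels = ["cat", "dog", "bird", "fish", "dog", "cat"]
ovr = one_vs_rest_problems(labels)
ovo = one_vs_one_problems(labels)
```

With K classes, one-vs-rest yields K problems that each use all examples, while one-vs-one yields K(K-1)/2 smaller problems. This difference becomes significant in extreme classification with thousands of classes.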

Multi Task Learning
- transfer knowledge from mastering one task to the other
- idea: solve related problems at the same time, using a shared representation
- present an MTL framework (e.g. Multi Task SVM)
Sources: Papers by Marius Kloft
Paper: Caruana: Multitask Learning

Multiple Kernel Learning
- we have a (large) set of predefined kernels and want to combine them into one
- aim: find the best weights of the linear combination
- present an MKL framework (e.g. Multi Kernel SVM)
- ℓp-norm kernel learning (Kloft)
Sources: PhD thesis and papers by Marius Kloft
Paper: Caruana: Multitask Learning
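The linear combination of kernels can be sketched as follows. Note the simplification: the weights here are fixed by hand and merely rescaled to unit ℓp norm, whereas an MKL method learns them together with the classifier. The two base kernels, the helper names and the data are assumptions for the example.

```python
import math

def linear_kernel(x, y):
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    return math.exp(-gamma * math.dist(x, y) ** 2)

def normalize_lp(weights, p):
    """Rescale nonnegative kernel weights to unit l_p norm."""
    norm = sum(abs(w) ** p for w in weights) ** (1.0 / p)
    return [w / norm for w in weights]

def combined_gram(points, kernels, weights):
    """Gram matrix of the combined kernel sum_m weights[m] * kernels[m](x, y)."""
    return [
        [sum(w * k(x, y) for w, k in zip(weights, kernels)) for y in points]
        for x in points
    ]

points = [(0.0, 1.0), (1.0, 1.0), (2.0, 0.0)]
beta = normalize_lp([3.0, 4.0], p=2)  # hand-picked weights, not learned
gram = combined_gram(points, [linear_kernel, rbf_kernel], beta)
```

A nonnegative combination of valid kernels is again a valid kernel, so `gram` can be passed to any kernel method, for instance an SVM that accepts a precomputed Gram matrix.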

Possible Topics: Other Cool Possibilities

Other Topics
- CRF (Conditional Random Fields)
- Gradient Boosting
- RNNs (Recurrent Neural Networks)
- NLP Topics: Topic Models, Word Embeddings, Sentiment Analysis
- Bandits
- Online Learning
- Theory

Possible Topics: Your Own Ideas

- Please feel free to come up with your own topics; they are explicitly welcome.
- You can meet or contact me via mail if you have questions.