Transfer Learning. Pei-Hao (Eddy) Su and Yingzhen Li. January 29, 2015.
Transfer Learning
Pei-Hao (Eddy) Su (Dialogue Systems Group) and Yingzhen Li (Machine Learning Group)
January 29, 2015
Outline
1 Motivation
2 Historical points
3 Definition
4 Case studies

Standard Supervised Learning Task
Most ML tasks assume the training and test data are drawn from the same data space and the same distribution.
NLP tasks: POS tagging, NER, category labelling (modified from Gao et al.'s presentation at KDD 08)

Combine the tasks and get a better result (modified from Gao et al.'s presentation at KDD 08)
Motivation
Traditional ML tasks assume the training and test data are drawn from the same data space and the same distribution. Insufficient labelled data results in poor prediction performance. There is a lot of (un-)related existing data from various sources, and starting from scratch is always time-consuming. Transferring knowledge from other sources may help!
Motivation (Taylor et al., JMLR 09)
Psychology and Education
In 1901, Thorndike and Woodworth explored how individuals transfer similar characteristics shared by different contexts. In 1992, Perkins and Salomon published Transfer of Learning, which defined different types of transfer. Examples: skill learning (C/C++ to Python); language acquisition (German to English).
Machine Learning
Explanation-Based Neural Network Learning: A Lifelong Learning Approach [Thrun PhD 95, NIPS 96]
Multitask Learning [Caruana ICML 93 & 96, PhD 97]
Workshops:
Learning to Learn: Knowledge Consolidation and Transfer in Inductive Systems [NIPS 95]
Inductive Transfer: 10 Years Later [NIPS 05]
Structural Knowledge Transfer for Machine Learning [ICML 06]
Transfer Learning for Complex Tasks [AAAI 08]
Lifelong Learning [AAAI 11]
Theoretically Grounded Transfer Learning [ICML 13]
Second Workshop on Transfer and Multi-Task Learning: Theory meets Practice [NIPS 14]
...
Definition: Notations
Domain D:
1 a data space 𝒳
2 a marginal distribution P(X), where X ∈ 𝒳
Task T (given D = {𝒳, P(X)}):
1 a label space 𝒴
2 learn a function f : 𝒳 → 𝒴 to approximate the underlying P(Y | X), where X ∈ 𝒳 and Y ∈ 𝒴
Definition
Assume we have only one source S and one target T.
Transfer Learning (TL): given a source domain D_S with learning task T_S, and a target domain D_T with learning task T_T, transfer learning aims to improve the learning of the target predictive function f_T(·) in D_T using the knowledge in D_S and T_S, where D_S ≠ D_T (either 𝒳_S ≠ 𝒳_T or P_S(X) ≠ P_T(X)) or T_S ≠ T_T (either 𝒴_S ≠ 𝒴_T or P(Y_S | X_S) ≠ P(Y_T | X_T)).
Example: Category labelling (three illustration slides)
ML vs. TL (Langley 06, Yang et al. 13)
Transfer in practice
The rest of the talk gives you an intuition, with examples, on: when to transfer, what to transfer, and how to transfer.
When to transfer: domain relatedness
Transfer learning is applicable when there is relatedness between domains. Standard machine learning assumes source = target. Transferring knowledge from an unrelated domain can be harmful ("negative transfer" [Rosenstein et al., NIPS 05 Workshop]). Ben-David et al. proposed a bound on the target-domain error.
Reference: Ben-David et al. Analysis of Representation for Domain Adaptation. NIPS 06
When to transfer (Ben-David et al.)
In a standard binary classification task: given 𝒳, 𝒴 = {0, 1} and samples from P(x, y), we aim to learn f : 𝒳 → [0, 1] which captures P(y | x). We often decompose the problem into:
1 determine a feature mapping Φ : 𝒳 → 𝒵
2 learn a hypothesis h : 𝒵 → {0, 1} on the dataset {(Φ(x), y)}
In the transfer learning scenario:
Theorem (simplified version of Thm. 1 & 2). Given 𝒳 = 𝒳_S = 𝒳_T and P_S(x), P_T(x) the distributions of the source and target domains, let Φ : 𝒳 → 𝒵 be a fixed mapping function and H a hypothesis space. For any hypothesis h ∈ H trained on the source domain,
ε_T(h) ≤ ε_S(h) + d_H(P̃_S, P̃_T) + ε_S(h*) + ε_T(h*)
where P̃_S, P̃_T are the induced distributions on 𝒵 w.r.t. P_S and P_T, and h* = argmin_{h ∈ H} (ε_S(h) + ε_T(h)) is the best hypothesis obtainable by joint training.
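The divergence term d_H in the bound is often estimated empirically via the "proxy A-distance": train a classifier to tell source samples from target samples and set d_A = 2(1 − 2ε), where ε is its held-out error (this is the quantity quoted later on the EA++ results slide). A minimal sketch with a nearest-centroid domain classifier; the function name, data, and seeds are illustrative, not from the talk:

```python
import numpy as np

def proxy_a_distance(xs, xt):
    """Estimate the proxy A-distance 2*(1 - 2*err) between two samples,
    using a random half/half split and a nearest-centroid domain classifier."""
    X = np.vstack([xs, xt])
    y = np.concatenate([np.zeros(len(xs)), np.ones(len(xt))])  # 0 = source, 1 = target
    rng = np.random.default_rng(0)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train, test = idx[:half], idx[half:]
    # nearest-centroid rule: which domain's training mean is closer?
    c0 = X[train][y[train] == 0].mean(axis=0)
    c1 = X[train][y[train] == 1].mean(axis=0)
    d0 = np.linalg.norm(X[test] - c0, axis=1)
    d1 = np.linalg.norm(X[test] - c1, axis=1)
    pred = (d1 < d0).astype(float)
    err = np.mean(pred != y[test])
    return 2.0 * (1.0 - 2.0 * err)

rng = np.random.default_rng(1)
# well-separated domains give a large distance; identical domains give a value near 0
far = proxy_a_distance(rng.normal(0, 1, (500, 5)), rng.normal(5, 1, (500, 5)))
near = proxy_a_distance(rng.normal(0, 1, (500, 5)), rng.normal(0, 1, (500, 5)))
```

A high proxy A-distance warns that the domains are easy to tell apart, so the d_H term in the bound is large and naive transfer may hurt.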
Domain adaptation
Approach 1: a mixture of general and specific components. Can we learn hypotheses for both the general and the specific components?
References:
Daume III. Frustratingly easy domain adaptation. ACL 07
Daume III et al. Co-regularization Based Semi-supervised Domain Adaptation. NIPS 10
EasyAdapt (Daume III)
Binary classification problem: 𝒳_S = 𝒳_T = R^d, 𝒴_S = 𝒴_T = {−1, +1}. Goal: obtain a classifier f_T : 𝒳_T → 𝒴_T; in the SVM context, learn a hypothesis h_T ∈ R^d. However, too little training data is available on (X_T, Y_T) for robust training, and P(x_S) ≠ P(x_T) and P(x_S, y_S) ≠ P(x_T, y_T), so directly applying a trained hypothesis h_S returns bad results. How can we use x_S, y_S ~ P(x_S, y_S) to improve the learning of h_T?
EasyAdapt algorithm
Define two mappings Φ_S, Φ_T : R^d → R^{3d}:
Φ_S(x_S) = (x_S, x_S, 0), Φ_T(x_T) = (x_T, 0, x_T)
Training: learn a hypothesis h = (w_g, w_s, w_t) ∈ R^{3d} on the transformed dataset {(Φ_S(x_S), y_S)} ∪ {(Φ_T(x_T), y_T)}.
Test: apply h_T = w_g + w_t on x_T (similarly h_S = w_g + w_s).
EA++ (Daume III et al.)
Use unlabelled data to improve training: we want h_S and h_T to agree on unlabelled data x_U:
h_S · x_U = h_T · x_U ⇔ w_s · x_U = w_t · x_U ⇔ h · (0, x_U, −x_U) = 0
So we define a mapping Φ_U : R^d → R^{3d} for the unlabelled data,
Φ_U(x_U) = (0, x_U, −x_U)    (1)
and train the hypothesis h on the augmented, transformed dataset {(Φ_S(x_S), y_S)} ∪ {(Φ_T(x_T), y_T)} ∪ {(Φ_U(x_U), 0)}.
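The three feature maps above are only a few lines of array code. A minimal numpy sketch (the function names and toy weight vector are mine, not from the papers):

```python
import numpy as np

def phi_s(x):
    # labelled source example -> (general, source, target) blocks
    return np.concatenate([x, x, np.zeros_like(x)])

def phi_t(x):
    # labelled target example
    return np.concatenate([x, np.zeros_like(x), x])

def phi_u(x):
    # EA++: unlabelled target example, paired with pseudo-label 0
    return np.concatenate([np.zeros_like(x), x, -x])

x = np.array([1.0, 2.0])
h = np.array([0.5, 0.5,   # w_g
              1.0, 0.0,   # w_s
              0.0, 1.0])  # w_t
w_g, w_s, w_t = h[:2], h[2:4], h[4:]

# test-time target predictor h_T = w_g + w_t applied to x
score_t = (w_g + w_t) @ x
# agreement constraint: h . phi_u(x) equals w_s.x - w_t.x, driven toward 0 in training
gap = h @ phi_u(x)
```

Training then reduces to running any linear learner (e.g. an SVM) on the transformed examples; the shared block w_g is what carries knowledge across domains.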
EA++ (Daume III et al.): results
(a) DVD → BOOKS (proxy A-distance = 0.7616); (b) KITCHEN → APPAREL (proxy A-distance = 0.0459).
SOURCEONLY / TARGETONLY(-FULL): trained on source/target (full) labelled samples
ALL: trained on the combined labelled samples
EA/EA++: trained in the augmented feature space (EA++ additionally uses unlabelled target data)
Feature transfer
Approach 2: shared lower-level features. A DNN's first layer learns Gabor filters or colour blobs when trained on images; do instances in the source and target domains share the same lower-level features?
Reference: Yosinski et al. How transferable are features in deep neural networks? NIPS 14
Feature transfer
Lee et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. ICML 09 (figure adapted from Ruslan Salakhutdinov's tutorial at MLSS 14 Beijing)
Feature transfer (Yosinski et al.)
Test 1 (similar datasets): random A/B splits of the ImageNet dataset (similar source- and target-domain training/testing instances)
Feature transfer (Yosinski et al.)
Test 2 (very different datasets): man-made/natural object split (dissimilar source- and target-domain training/testing instances)
Joint representation
Approach 3: a joint feature representation. Data has many domain-specific characteristics, but the domains may be related at a high level; our brains might work like this as well.
Reference: Srivastava and Salakhutdinov. Multimodal Learning with Deep Boltzmann Machines. NIPS 12, JMLR 15 (2014)
Joint representation (Srivastava et al.): MIR Flickr dataset
Images: 1M data points; 25K labelled instances in 38 classes (10K for training, 5K for validation, 10K for testing); inputs are the concatenation of PHOW and MPEG-7 features.
Texts: word-count vectors over the 2K most frequently used tags (very sparse); 18% of the training images have missing text.
Joint representation (Srivastava et al.)
For images: a 2-layer deep Boltzmann machine (DBM) with Gaussian input units (v_mi ∈ R; abbreviate W_m^(k)(i, j) as W^(k)_ij):
P(v_m, h_m^(1), h_m^(2)) ∝ exp( −Σ_i (v_mi − b_i)² / (2σ_i²) + Σ_{i,j} (v_mi / σ_i) W^(1)_ij h^(1)_mj + Σ_{j,l} h^(1)_mj W^(2)_jl h^(2)_ml )
For texts: a 2-layer DBM with a replicated softmax model (v_ti counts the occurrences of word i; abbreviate W_t^(k)(i, j) as W^(k)_ij):
P(v_t, h_t^(1), h_t^(2)) ∝ exp( Σ_i v_ti b_i + Σ_{i,j} v_ti W^(1)_ij h^(1)_tj + Σ_{j,l} h^(1)_tj W^(2)_jl h^(2)_tl )
Combining the domain-specific models into a multimodal DBM:
P(v_m, v_t, h; θ) ∝ exp( −E(h_m^(2), h_t^(2), h^(3)) − E(v_m, h_m^(1), h_m^(2)) − E(v_t, h_t^(1), h_t^(2)) )
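Typeset cleanly, my reconstruction of the slide's three factors reads:

```latex
P(v_m, h_m^{(1)}, h_m^{(2)}) \propto
  \exp\Big( -\sum_i \frac{(v_{mi}-b_i)^2}{2\sigma_i^2}
  + \sum_{i,j} \frac{v_{mi}}{\sigma_i} W^{(1)}_{ij} h^{(1)}_{mj}
  + \sum_{j,l} h^{(1)}_{mj} W^{(2)}_{jl} h^{(2)}_{ml} \Big)

P(v_t, h_t^{(1)}, h_t^{(2)}) \propto
  \exp\Big( \sum_i v_{ti} b_i
  + \sum_{i,j} v_{ti} W^{(1)}_{ij} h^{(1)}_{tj}
  + \sum_{j,l} h^{(1)}_{tj} W^{(2)}_{jl} h^{(2)}_{tl} \Big)

P(v_m, v_t, h; \theta) \propto
  \exp\Big( -E(h_m^{(2)}, h_t^{(2)}, h^{(3)})
  - E(v_m, h_m^{(1)}, h_m^{(2)})
  - E(v_t, h_t^{(1)}, h_t^{(2)}) \Big)
```

The third line shows the key design choice: the two modality-specific DBMs interact only through a shared top layer h^(3), which serves as the joint representation.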
Joint representation (Srivastava et al.): training
First pre-train the domain-specific DBMs with contrastive divergence (CD), then co-train the joint model with persistent contrastive divergence (PCD); use a mean-field variational approximation when computing the data-driven hidden-unit moments.
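To make the pre-training step concrete, here is a one-step contrastive divergence (CD-1) update for a small binary RBM in numpy. This is a generic sketch of CD, not the authors' code: the layer sizes, learning rate, and seed are made up, and bias terms are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cd1_update(W, v0, lr=0.1):
    """One CD-1 step for a binary RBM with weight matrix W (visible x hidden).
    Biases are omitted to keep the sketch short."""
    # positive phase: hidden activations driven by the data
    ph0 = sigmoid(v0 @ W)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # negative phase: one Gibbs step back to the visibles, then to the hiddens
    pv1 = sigmoid(h0 @ W.T)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W)
    # gradient estimate: data moment minus one-step reconstruction moment
    grad = (v0.T @ ph0 - v1.T @ ph1) / len(v0)
    return W + lr * grad

W = rng.normal(0, 0.01, (6, 4))                 # 6 visible, 4 hidden units
v = (rng.random((32, 6)) < 0.5).astype(float)   # toy binary batch
for _ in range(10):
    W = cd1_update(W, v)
```

PCD, used for the joint model, differs only in that the negative-phase chain persists across updates instead of restarting from the data each step.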
Joint representation (Srivastava et al.): results
Figure: classification with data from both the image and text domains; classification with data from the image domain only.
Figure: retrieval results for multi-domain and image-domain queries.
Conclusions
In this talk, we showed that transfer learning adapts knowledge from other sources to improve performance on a target task, and that domains can be related to each other in different ways. In the future: managing large-scale data that do not lack size but may lack quality, and managing data that may continuously change over time.
Open Questions (from the NIPS 14 workshop)
What are the limits of existing multi-task learning methods when the number of tasks grows while each task is described by only a small number of samples ("big T, small n")?
What is the right way to leverage noisy data gathered from the Internet as reference for a new task?
How can an automatic system process a continuous stream of information in time and progressively adapt for life-long learning?
Can deep learning help to learn the right representation (e.g., a task-similarity matrix) in kernel-based transfer and multi-task learning?
How can similarities across languages help us adapt to different domains in natural language processing tasks?
...
Source: nips.cc/conferences/2014/program/event.php?id=4282
Thank you
References
1 Pan and Yang. A Survey on Transfer Learning. IEEE TKDE.
2 Pan and Yang. Transfer Learning. MLSS.
3 Taylor et al. Transfer Learning for Reinforcement Learning Domains: A Survey. JMLR.
4 Langley. Transfer of Learning in Cognitive Systems. ICML.
5 Perkins et al. Transfer of Learning. IEE.
6 Thrun. Explanation-Based Neural Network Learning: A Lifelong Learning Approach. PhD thesis.
7 Caruana. Multitask Learning. PhD thesis.
8 Ben-David et al. Analysis of Representation for Domain Adaptation. NIPS.
9 Daume III. Frustratingly easy domain adaptation. ACL.
10 Daume III et al. Co-regularization Based Semi-supervised Domain Adaptation. NIPS.
11 Yosinski et al. How transferable are features in deep neural networks? NIPS.
12 Lee et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. ICML 2009.
13 Srivastava and Salakhutdinov. Multimodal Learning with Deep Boltzmann Machines. NIPS 2012, JMLR 2014.
Feature Selection Based on Term Frequency and T-Test for Text Categorization Deqing Wang dqwang@nlsde.buaa.edu.cn Hui Zhang hzhang@nlsde.buaa.edu.cn Rui Liu, Weifeng Lv {liurui,lwf}@nlsde.buaa.edu.cn arxiv:1305.0638v1
More informationLanguage Acquisition Fall 2010/Winter Lexical Categories. Afra Alishahi, Heiner Drenhaus
Language Acquisition Fall 2010/Winter 2011 Lexical Categories Afra Alishahi, Heiner Drenhaus Computational Linguistics and Phonetics Saarland University Children s Sensitivity to Lexical Categories Look,
More informationWeb as Corpus. Corpus Linguistics. Web as Corpus 1 / 1. Corpus Linguistics. Web as Corpus. web.pl 3 / 1. Sketch Engine. Corpus Linguistics
(L615) Markus Dickinson Department of Linguistics, Indiana University Spring 2013 The web provides new opportunities for gathering data Viable source of disposable corpora, built ad hoc for specific purposes
More informationADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION
ADVANCES IN DEEP NEURAL NETWORK APPROACHES TO SPEAKER RECOGNITION Mitchell McLaren 1, Yun Lei 1, Luciana Ferrer 2 1 Speech Technology and Research Laboratory, SRI International, California, USA 2 Departamento
More informationLearning From the Past with Experiment Databases
Learning From the Past with Experiment Databases Joaquin Vanschoren 1, Bernhard Pfahringer 2, and Geoff Holmes 2 1 Computer Science Dept., K.U.Leuven, Leuven, Belgium 2 Computer Science Dept., University
More informationPredicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks
Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com
More informationSummarizing Answers in Non-Factoid Community Question-Answering
Summarizing Answers in Non-Factoid Community Question-Answering Hongya Song Zhaochun Ren Shangsong Liang hongya.song.sdu@gmail.com zhaochun.ren@ucl.ac.uk shangsong.liang@ucl.ac.uk Piji Li Jun Ma Maarten
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationWord learning as Bayesian inference
Word learning as Bayesian inference Joshua B. Tenenbaum Department of Psychology Stanford University jbt@psych.stanford.edu Fei Xu Department of Psychology Northeastern University fxu@neu.edu Abstract
More informationA Neural Network GUI Tested on Text-To-Phoneme Mapping
A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis
More informationProduct Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments
Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments Vijayshri Ramkrishna Ingale PG Student, Department of Computer Engineering JSPM s Imperial College of Engineering &
More informationSemantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma
Semantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma Adam Abdulhamid Stanford University 450 Serra Mall, Stanford, CA 94305 adama94@cs.stanford.edu Abstract With the introduction
More informationCultivating DNN Diversity for Large Scale Video Labelling
Cultivating DNN Diversity for Large Scale Video Labelling Mikel Bober-Irizar mikel@mxbi.net Sameed Husain sameed.husain@surrey.ac.uk Miroslaw Bober m.bober@surrey.ac.uk Eng-Jon Ong e.ong@surrey.ac.uk Abstract
More informationarxiv: v1 [cs.cl] 2 Apr 2017
Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,
More informationTaxonomy-Regularized Semantic Deep Convolutional Neural Networks
Taxonomy-Regularized Semantic Deep Convolutional Neural Networks Wonjoon Goo 1, Juyong Kim 1, Gunhee Kim 1, Sung Ju Hwang 2 1 Computer Science and Engineering, Seoul National University, Seoul, Korea 2
More informationMining Student Evolution Using Associative Classification and Clustering
Mining Student Evolution Using Associative Classification and Clustering 19 Mining Student Evolution Using Associative Classification and Clustering Kifaya S. Qaddoum, Faculty of Information, Technology
More informationDistant Supervised Relation Extraction with Wikipedia and Freebase
Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational
More informationUsing Deep Convolutional Neural Networks in Monte Carlo Tree Search
Using Deep Convolutional Neural Networks in Monte Carlo Tree Search Tobias Graf (B) and Marco Platzner University of Paderborn, Paderborn, Germany tobiasg@mail.upb.de, platzner@upb.de Abstract. Deep Convolutional
More informationDetecting English-French Cognates Using Orthographic Edit Distance
Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National
More informationIssues in the Mining of Heart Failure Datasets
International Journal of Automation and Computing 11(2), April 2014, 162-179 DOI: 10.1007/s11633-014-0778-5 Issues in the Mining of Heart Failure Datasets Nongnuch Poolsawad 1 Lisa Moore 1 Chandrasekhar
More informationIEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, VOL XXX, NO. XXX,
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH AND LANGUAGE PROCESSING, VOL XXX, NO. XXX, 2017 1 Small-footprint Highway Deep Neural Networks for Speech Recognition Liang Lu Member, IEEE, Steve Renals Fellow,
More informationTeam Formation for Generalized Tasks in Expertise Social Networks
IEEE International Conference on Social Computing / IEEE International Conference on Privacy, Security, Risk and Trust Team Formation for Generalized Tasks in Expertise Social Networks Cheng-Te Li Graduate
More informationA deep architecture for non-projective dependency parsing
Universidade de São Paulo Biblioteca Digital da Produção Intelectual - BDPI Departamento de Ciências de Computação - ICMC/SCC Comunicações em Eventos - ICMC/SCC 2015-06 A deep architecture for non-projective
More informationCopyright by Sung Ju Hwang 2013
Copyright by Sung Ju Hwang 2013 The Dissertation Committee for Sung Ju Hwang certifies that this is the approved version of the following dissertation: Discriminative Object Categorization with External
More informationEnsemble Technique Utilization for Indonesian Dependency Parser
Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id
More informationMULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY
MULTILINGUAL INFORMATION ACCESS IN DIGITAL LIBRARY Chen, Hsin-Hsi Department of Computer Science and Information Engineering National Taiwan University Taipei, Taiwan E-mail: hh_chen@csie.ntu.edu.tw Abstract
More informationComparison of network inference packages and methods for multiple networks inference
Comparison of network inference packages and methods for multiple networks inference Nathalie Villa-Vialaneix http://www.nathalievilla.org nathalie.villa@univ-paris1.fr 1ères Rencontres R - BoRdeaux, 3
More informationArtificial Neural Networks written examination
1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14
More information