CAP 6412 Advanced Computer Vision
|
|
- Sherilyn Wood
- 6 years ago
- Views:
Transcription
1 CAP 6412 Advanced Computer Vision Boqing Gong Feb 23, 2016
2 Today Administrivia Neural networks & Backpropagation (IX) Pose estimation, by Amar
3 This week: Vision and language Tuesday (02/23) Suhas Nithyanandappa Thursday (02/25) Nandakishore Puttashamachar [VQA-1] Antol, Stanislaw, AishwaryaAgrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh. VQA: Visual Question Answering. arxiv preprint arxiv: (2015). & Secondary papers [VQA-2] Malinowski, Mateusz, and Mario Fritz. A multi-world approach to question answering about real-world scenes based on uncertain input. In Advances in Neural Information Processing Systems, pp & Secondary papers
4 Next week: Vision and language Tuesday (03/01) Javier Lores Thursday (03/03) Aisha Urooji [Relation Phrases] Sadeghi, Fereshteh, Santosh K. Divvala, and Ali Farhadi. VisKE: Visual Knowledge Extraction and Question Answering by Visual Verification of Relation Phrases. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp & Secondary papers [OCR in the wild] Jaderberg, Max, Karen Simonyan, Andrea Vedaldi, and Andrew Zisserman. "Reading text in the wild with convolutional neural networks." International Journal of Computer Vision 116, no. 1 (2016): & Secondary papers
5 Project 1: Due on 02/28 If you have discussed option 2 with me Send me the meeting minutes / slides --- grading criteria If you take option 1 In total, >6,000 validation images Test 3 images per class of the validation set
6 Travel plan At Washington DC on 03/01, Tuesday Guest lecture by Dr. Ulas Bagci
7 Today Administrivia Neural networks & Backpropagation (IX) VQA-1, by Suhas
8 Recap Data: (x i,y i ) 2 X Y,i=1, 2,,n Goal: Find the labeling function c : X 7! Y,c(x) =y Hypotheses: net(x; ) Expected risk: R( ) =E (x,y) [net(x; ) 6= y] 1 Empirical risk: ˆR( ) = n nx L(x i,y i ; ) i=1
9 Recap 2 6 Empirical risk: ˆR( ) = 1 n nx i=1 L(x i,y i ; ) Parameter estimation: ˆ arg min ˆR( ) Optimization by stochastic gradient descent (SGD) : : learning rate w t w t 1 rl(x i,y i ; t 1 )
10 Checking for convergence 1 ˆR What happened? 1. Bug in the program 2. Learning rate too high 3. What if c(x)=y is totally random? 4. No appropriate pre-processing of the Input dataàdummy gradients iteration t
11 Checking for convergence 2 What happened? Different learning rateà different local optimum & decreasing rate ˆR iteration t Why oscillation: 1. Skipped the optimum because of large learning rate 2. Gradient of a single data point is noisy 3. Not computing the real R, instead we approximate it (see P12)
12 Checking for convergence 3 Turning down the learning rate! After some iterations: ˆR = const 1 t + const 2 iteration t
13 Overfitting Error 1 2 iteration t Training, validation, test What shall we do? Early stopping Data augmentation Regularization (Dropout) Reduce the network complexity Which place should we (early) stop? Theta_1
14 Under-fitting Training, validation Possible reasons Error iteration t
15 Recap Neuron Two basic neural network (NN) structures Convolutional NN (CNN): a special feedforward network Using NN to approximate concepts (underlying labeling function) Training a NN (how to determine the weights of neurons)? Gradient descent, stochastic gradient descent (SGD) Backpropagation (for efficient gradient computation) Debugging tricks: learning rate, momentum, early stopping, dropout, weight decay, etc.
16 Next: Recurrent neural networks (RNN) Feed-forward networks Recurrent neural networks Image credit:
17 Why RNN? Feed-forward networks Model static input-output concept No time series Exists a single forward direction CNN Recurrent neural networks Model dynamic state transition Time & sequence data Exists feedback connections LSTM
18 Why RNN? (cont d) Markov models Model dynamic state transition Time & sequence data Markov (short-range) dependency Moderately sized states Recurrent neural networks Model dynamic state transition Time & sequence data Long-range dependency Exponentially expressive states
19 Next: Recurrent neural networks (RNN) RNN Vanishing and exploding gradients Long short-term memory (LSTM) Bi-directional RNN, GRU, Training algorithms, Applications (tentative)
20 Today Administrivia Neural networks & Backpropagation (IX) VQA-1, by Suhas
21 Upload slides before or after class See Paper Presentation on UCF webcourse Sharing your slides Refer to the originals sources of images, figures, etc. in your slides Convert them to a PDF file Upload the PDF file to Paper Presentation after your presentation
22 VQA: Visual Question Answering Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, Devi Parikh; The IEEE International Conference on Computer Vision (ICCV), 2015 Presented by Suhas Nithyanandappa February 23, 2016
23 Presentation Outline Motivation Problem Statement VQA Dataset Collection VQA Dataset Analysis VQA Baselines and Methods Results Related Work
24 Motivation Using word n-gram statistics to generate reasonable image captions isn t Artificial Intelligence Current state of the art doesn t capture the common sense reasoning present in the image Image credits:(c) Dhruv Batra
25 Why is AI hard? Slide credits:(c) Dhruv Batra
26 Is it useful? - VizWiz Slide credits:(c) Dhruv Batra (C) Dhruv Batra 5
27 Problem Statement In order to such common sense knowledge we need a big enough dataset Open-ended questions require a potentially vast set of AI capabilities to answer fine-grained recognition object detection activity recognition knowledge base reasoning In order to building such a system we need blocks from various fields: CV, NLP and KR VQA Good place to start
28 VQA Dataset Collection It is collect using Amazon Mechanical Turk service : >10,000 Turkers ~ >41,000 Human Hours ~ 4.7 Human Years ~ Person-Job-Years! Real Images 123,287 training and validation images and 81,434 test images from the newly-released Microsoft Common Objects in Context (MS COCO) dataset Images contain multiple objects and rich contextual information As they are visual complex, they are well-suited for VQA task Dataset contains five single-sentence captions for all images. Image credits:(c) MS COCO
29 VQA Dataset Collection Abstract Images In order to focus on reasoning, instead of the low-level vision tasks, a new abstract scenes dataset containing 50K scenes has been created The dataset contains 20 paperdoll human models : spanning genders, races, and ages with 8 different expressions The limbs are adjustable to allow for continuous pose variations. The clipart may be used to depict both indoor and outdoor scenes The set contains over 100 objects and 31 animals in various poses. Image credits : Appendix of the paper
30 VQA Dataset Collection Image credits : Appendix of the paper
31 VQA Dataset Collection Questions Collecting interesting, diverse, and well-posed questions is a significant challenge. Many questions require low-level computer vision knowledge, such as What color is the cat? or How many chairs are present in the scene? However, importance was given to collect questions that require common sense knowledge about the scene, such as What sound does the pictured animal make? Goal: Having wide variety of question types and difficulty, we may be able to measure the continual progress of both visual understanding and common sense reasoning Image credits : Appendix of the paper
32 AMT Data Collection Interface Slide credits:(c) Dhruv Batra
33 Slide credits:(c) Dhruv Batra
34 Slide credits:(c) Dhruv Batra
35 VQA Dataset Collection Answers Open-ended questions result in a diverse set of possible answers. Human subjects may also disagree on the correct answer, e.g., some saying yes while others say no To handle these discrepancies, we gather 10 answers from unique workers Two types of questions: open-answer and multiple-choice. For the open-answer task accuracy metric: an answer is deemed 100% accurate if at least 3 workers provided that exact answer Confidence level in answering By taking in responses yes, no and maybe
36 AMT Interface for Answer Collection (C) Dhruv Batra Slide credits:(c) Dhruv Batra 15
37
38 Slide credits:(c) Dhruv Batra
39 VQA Dataset Analysis The dataset includes : MS COCO dataset : 614,163 questions and 7,984,119 answers for 204,721 images Abstract scenes dataset: 150,000 questions with 1,950,000 answers for 50, 000 images Questions: We can cluster questions into different types based on the words that start the question Interestingly, the distribution of questions is quite similar for both real images and abstract scenes Length : We see that most questions range from four to ten words. Answers: Many questions have yes or no answers Questions such as What is... and What type... have a rich diversity of responses Question types such as What color... or Which... have more specialized responses Length : The distribution of answers containing one, two, or three words, 89.32%, 6.91%, and 2.74% for real images 90.51%, 5.89%, and 2.49% for abstract scenes This is in contrast with image captions that generically describe the entire image and hence tend to be longer
40 VQA Dataset Analysis Yes/ No and Number Answers Among these yes/no questions, there is a bias towards yes 58:83% and 55:86% of yes/no answers are yes for real images and abstract scenes Subject Confidence A majority of the answers were labelled as confident for both real images and abstract scenes Inter Human Agreement Does the self-judgment of confidence correspond to the answer agreement between subjects? Image credits:(c) Dhruv Batra
41 VQA Dataset Analysis Is the Image Necessary? Questions can sometimes be answered correctly using commonsense knowledge This issue by asking three subjects to answer the questions without seeing the image
42 VQA Dataset Analysis Captions vs. Questions Do generic image captions provide enough information to answer the questions? Answer : Statistically different from those mentioned in our questions + answers (Kolmogorov-Smirnov test, p <.001) for both real images and abstract scenes Table credits:(c) Dhruv Batra
43 VQA Dataset Analysis Which Questions Require Common Sense? To capture required knowledge external to the image, three categories were formed Toddler (3-4), younger child (5-8), older child (9-12), teenager (13-17), adult (18+) Statistics After data analysis it was observed that For 47.43% of question 3 or more subjects voted yes to common sense, (18:14%: 6 or more). Image credits:(c) Dhruv Batra
44 Least commonsense questions Slide credits:(c) Dhruv Batra 23
45 Most commonsense questions Slide credits:(c) Dhruv Batra 24
46 VQA Baseline and Analysis To establish baselines, difficulty of the VQA dataset for the MS COCO images is explored If we randomly choose an answer from the top 1K answers of the VQA train/ validation dataset, the teststandard accuracy is 0.12% If we always select the most popular answer ( yes ), the accuracy is 29.72% Picking the most popular answer per question type does 36.18% Nearest neighbour approach does 40.61% on validation dataset
47 2-Channel VQA Model Image Embedding 1k output units Neural Network Softmax over top K answers Convolution Layer + Non-Linearity Pooling Layer Convolution Layer + Non-Linearity Pooling Layer Fully-Connected MLP Question Embedding (BoW) How many horses are in this image? what 0 where 0 how 1 is 0 could 0 are 0 are 1 horse 1 image 1 Beginning of question words Slide credits:(c) Dhruv Batra 26
48 2-Channel VQA Model Image Embedding 1k output units Neural Network Softmax over top K answers Convolution Layer + Non-Linearity Pooling Layer Convolution Layer + Non-Linearity Pooling Layer Fully-Connected MLP Question Embedding (LSTM) How many horses are in this image? Slide credits:(c) Dhruv Batra 27
49 VQA Baseline and Analysis Results Slide credits:(c) Dhruv Batra
50 VQA Baseline and Analysis Results Image credits:(c) Dhruv Batra
51 Future Work Training on task-specific datasets may help enable practical VQA applications. Creation of Video datasets performing basic actions might help in capturing commonsense
52 Related Work: Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering Haoyuan Gao, Junhua Mao, Jie Zhou, Zhiheng Huang, Lei Wang, Wei Xu
53 Introduction It contains over 150,000 images and 310,000 freestyle Chinese question-answer pairs and their English translations. The quality of the generated answers of our mqa model on this dataset is evaluated by human judges through a Turing Test They will also provide a score (i.e. 0, 1, 2, the larger the better) indicating the quality of the answer
54 Model Image credits: Paper -Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
55 Fusing: The activation of the fusing layer f(t) for the (t)th word in the answer can be calculated as follows: where + denotes element-wise addition rq stands for the activation of the LSTM(Q) memory cells of the last word in the question I denotes the image representation Ra(t) and Wa(t) denotes the activation of the LSTM(A) memory cells and the word embedding of the t th word in the answer respectively Vrq, Vi, Vra and Vw are the weight matrices that need to be learned g(.) is an element-wise non-linear function Image credits: Paper -Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
56 Results Table credits: Paper -Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question Answering
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationA Simple VQA Model with a Few Tricks and Image Features from Bottom-up Attention
A Simple VQA Model with a Few Tricks and Image Features from Bottom-up Attention Damien Teney 1, Peter Anderson 2*, David Golub 4*, Po-Sen Huang 3, Lei Zhang 3, Xiaodong He 3, Anton van den Hengel 1 1
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationarxiv: v1 [cs.cv] 10 May 2017
Inferring and Executing Programs for Visual Reasoning Justin Johnson 1 Bharath Hariharan 2 Laurens van der Maaten 2 Judy Hoffman 1 Li Fei-Fei 1 C. Lawrence Zitnick 2 Ross Girshick 2 1 Stanford University
More informationDeep search. Enhancing a search bar using machine learning. Ilgün Ilgün & Cedric Reichenbach
#BaselOne7 Deep search Enhancing a search bar using machine learning Ilgün Ilgün & Cedric Reichenbach We are not researchers Outline I. Periscope: A search tool II. Goals III. Deep learning IV. Applying
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationPOS tagging of Chinese Buddhist texts using Recurrent Neural Networks
POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important
More informationTraining a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski
Training a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski Problem Statement and Background Given a collection of 8th grade science questions, possible answer
More informationarxiv: v1 [cs.lg] 15 Jun 2015
Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy arxiv:1506.04477v1 [cs.lg] 15 Jun 2015 Sang-Woo Lee Min-Oh Heo School of Computer Science and
More informationA Neural Network GUI Tested on Text-To-Phoneme Mapping
A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationarxiv: v1 [cs.lg] 7 Apr 2015
Transferring Knowledge from a RNN to a DNN William Chan 1, Nan Rosemary Ke 1, Ian Lane 1,2 Carnegie Mellon University 1 Electrical and Computer Engineering, 2 Language Technologies Institute Equal contribution
More informationHIERARCHICAL DEEP LEARNING ARCHITECTURE FOR 10K OBJECTS CLASSIFICATION
HIERARCHICAL DEEP LEARNING ARCHITECTURE FOR 10K OBJECTS CLASSIFICATION Atul Laxman Katole 1, Krishna Prasad Yellapragada 1, Amish Kumar Bedi 1, Sehaj Singh Kalra 1 and Mynepalli Siva Chaitanya 1 1 Samsung
More informationUnsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model
Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.
More informationAutoregressive product of multi-frame predictions can improve the accuracy of hybrid models
Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models Navdeep Jaitly 1, Vincent Vanhoucke 2, Geoffrey Hinton 1,2 1 University of Toronto 2 Google Inc. ndjaitly@cs.toronto.edu,
More informationWord Segmentation of Off-line Handwritten Documents
Word Segmentation of Off-line Handwritten Documents Chen Huang and Sargur N. Srihari {chuang5, srihari}@cedar.buffalo.edu Center of Excellence for Document Analysis and Recognition (CEDAR), Department
More informationKnowledge Transfer in Deep Convolutional Neural Nets
Knowledge Transfer in Deep Convolutional Neural Nets Steven Gutstein, Olac Fuentes and Eric Freudenthal Computer Science Department University of Texas at El Paso El Paso, Texas, 79968, U.S.A. Abstract
More informationA New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation
A New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation SLSP-2016 October 11-12 Natalia Tomashenko 1,2,3 natalia.tomashenko@univ-lemans.fr Yuri Khokhlov 3 khokhlov@speechpro.com Yannick
More informationArtificial Neural Networks written examination
1 (8) Institutionen för informationsteknologi Olle Gällmo Universitetsadjunkt Adress: Lägerhyddsvägen 2 Box 337 751 05 Uppsala Artificial Neural Networks written examination Monday, May 15, 2006 9 00-14
More informationModel Ensemble for Click Prediction in Bing Search Ads
Model Ensemble for Click Prediction in Bing Search Ads Xiaoliang Ling Microsoft Bing xiaoling@microsoft.com Hucheng Zhou Microsoft Research huzho@microsoft.com Weiwei Deng Microsoft Bing dedeng@microsoft.com
More informationFramewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures
Framewise Phoneme Classification with Bidirectional LSTM and Other Neural Network Architectures Alex Graves and Jürgen Schmidhuber IDSIA, Galleria 2, 6928 Manno-Lugano, Switzerland TU Munich, Boltzmannstr.
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationIntroduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition
Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and
More informationSoftprop: Softmax Neural Network Backpropagation Learning
Softprop: Softmax Neural Networ Bacpropagation Learning Michael Rimer Computer Science Department Brigham Young University Provo, UT 84602, USA E-mail: mrimer@axon.cs.byu.edu Tony Martinez Computer Science
More informationarxiv: v1 [cs.cl] 27 Apr 2016
The IBM 2016 English Conversational Telephone Speech Recognition System George Saon, Tom Sercu, Steven Rennie and Hong-Kwang J. Kuo IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598 gsaon@us.ibm.com
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationHuman Emotion Recognition From Speech
RESEARCH ARTICLE OPEN ACCESS Human Emotion Recognition From Speech Miss. Aparna P. Wanare*, Prof. Shankar N. Dandare *(Department of Electronics & Telecommunication Engineering, Sant Gadge Baba Amravati
More informationAxiom 2013 Team Description Paper
Axiom 2013 Team Description Paper Mohammad Ghazanfari, S Omid Shirkhorshidi, Farbod Samsamipour, Hossein Rahmatizadeh Zagheli, Mohammad Mahdavi, Payam Mohajeri, S Abbas Alamolhoda Robotics Scientific Association
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description
Journal of Intelligent Learning Systems and Applications, 2010, 2: 12-18 doi:10.4236/jilsa.2010.21002 Published Online February 2010 (http://www.scirp.org/journal/jilsa) Evolutive Neural Net Fuzzy Filtering:
More informationTruth Inference in Crowdsourcing: Is the Problem Solved?
Truth Inference in Crowdsourcing: Is the Problem Solved? Yudian Zheng, Guoliang Li #, Yuanbing Li #, Caihua Shan, Reynold Cheng # Department of Computer Science, Tsinghua University Department of Computer
More informationSpeech Emotion Recognition Using Support Vector Machine
Speech Emotion Recognition Using Support Vector Machine Yixiong Pan, Peipei Shen and Liping Shen Department of Computer Technology Shanghai JiaoTong University, Shanghai, China panyixiong@sjtu.edu.cn,
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationQuickStroke: An Incremental On-line Chinese Handwriting Recognition System
QuickStroke: An Incremental On-line Chinese Handwriting Recognition System Nada P. Matić John C. Platt Λ Tony Wang y Synaptics, Inc. 2381 Bering Drive San Jose, CA 95131, USA Abstract This paper presents
More informationГлубокие рекуррентные нейронные сети для аспектно-ориентированного анализа тональности отзывов пользователей на различных языках
Глубокие рекуррентные нейронные сети для аспектно-ориентированного анализа тональности отзывов пользователей на различных языках Тарасов Д. С. (dtarasov3@gmail.com) Интернет-портал reviewdot.ru, Казань,
More informationCalibration of Confidence Measures in Speech Recognition
Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE
More informationLearning Methods for Fuzzy Systems
Learning Methods for Fuzzy Systems Rudolf Kruse and Andreas Nürnberger Department of Computer Science, University of Magdeburg Universitätsplatz, D-396 Magdeburg, Germany Phone : +49.39.67.876, Fax : +49.39.67.8
More informationSecond Exam: Natural Language Parsing with Neural Networks
Second Exam: Natural Language Parsing with Neural Networks James Cross May 21, 2015 Abstract With the advent of deep learning, there has been a recent resurgence of interest in the use of artificial neural
More informationTHE world surrounding us involves multiple modalities
1 Multimodal Machine Learning: A Survey and Taxonomy Tadas Baltrušaitis, Chaitanya Ahuja, and Louis-Philippe Morency arxiv:1705.09406v2 [cs.lg] 1 Aug 2017 Abstract Our experience of the world is multimodal
More informationSemantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma
Semantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma Adam Abdulhamid Stanford University 450 Serra Mall, Stanford, CA 94305 adama94@cs.stanford.edu Abstract With the introduction
More informationPredicting Future User Actions by Observing Unmodified Applications
From: AAAI-00 Proceedings. Copyright 2000, AAAI (www.aaai.org). All rights reserved. Predicting Future User Actions by Observing Unmodified Applications Peter Gorniak and David Poole Department of Computer
More informationSARDNET: A Self-Organizing Feature Map for Sequences
SARDNET: A Self-Organizing Feature Map for Sequences Daniel L. James and Risto Miikkulainen Department of Computer Sciences The University of Texas at Austin Austin, TX 78712 dljames,risto~cs.utexas.edu
More informationDropout improves Recurrent Neural Networks for Handwriting Recognition
2014 14th International Conference on Frontiers in Handwriting Recognition Dropout improves Recurrent Neural Networks for Handwriting Recognition Vu Pham,Théodore Bluche, Christopher Kermorvant, and Jérôme
More informationDual-Memory Deep Learning Architectures for Lifelong Learning of Everyday Human Behaviors
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence (IJCAI-6) Dual-Memory Deep Learning Architectures for Lifelong Learning of Everyday Human Behaviors Sang-Woo Lee,
More informationCSL465/603 - Machine Learning
CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am
More informationOCR for Arabic using SIFT Descriptors With Online Failure Prediction
OCR for Arabic using SIFT Descriptors With Online Failure Prediction Andrey Stolyarenko, Nachum Dershowitz The Blavatnik School of Computer Science Tel Aviv University Tel Aviv, Israel Email: stloyare@tau.ac.il,
More informationarxiv: v4 [cs.cl] 28 Mar 2016
LSTM-BASED DEEP LEARNING MODELS FOR NON- FACTOID ANSWER SELECTION Ming Tan, Cicero dos Santos, Bing Xiang & Bowen Zhou IBM Watson Core Technologies Yorktown Heights, NY, USA {mingtan,cicerons,bingxia,zhou}@us.ibm.com
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationExperiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling
Experiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling Notebook for PAN at CLEF 2013 Andrés Alfonso Caurcel Díaz 1 and José María Gómez Hidalgo 2 1 Universidad
More informationThe Good Judgment Project: A large scale test of different methods of combining expert predictions
The Good Judgment Project: A large scale test of different methods of combining expert predictions Lyle Ungar, Barb Mellors, Jon Baron, Phil Tetlock, Jaime Ramos, Sam Swift The University of Pennsylvania
More informationSEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING
SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING Sheng Li 1, Xugang Lu 2, Shinsuke Sakai 1, Masato Mimura 1 and Tatsuya Kawahara 1 1 School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501,
More informationMachine Learning and Data Mining. Ensembles of Learners. Prof. Alexander Ihler
Machine Learning and Data Mining Ensembles of Learners Prof. Alexander Ihler Ensemble methods Why learn one classifier when you can learn many? Ensemble: combine many predictors (Weighted) combina
More informationImage based Static Facial Expression Recognition with Multiple Deep Network Learning
Image based Static Facial Expression Recognition with Multiple Deep Network Learning ABSTRACT Zhiding Yu Carnegie Mellon University 5000 Forbes Ave Pittsburgh, PA 1521 yzhiding@andrew.cmu.edu We report
More informationDialog-based Language Learning
Dialog-based Language Learning Jason Weston Facebook AI Research, New York. jase@fb.com arxiv:1604.06045v4 [cs.cl] 20 May 2016 Abstract A long-term goal of machine learning research is to build an intelligent
More informationLecture 1: Basic Concepts of Machine Learning
Lecture 1: Basic Concepts of Machine Learning Cognitive Systems - Machine Learning Ute Schmid (lecture) Johannes Rabold (practice) Based on slides prepared March 2005 by Maximilian Röglinger, updated 2010
More informationA Review: Speech Recognition with Deep Learning Methods
Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 5, May 2015, pg.1017
More informationCourse Outline. Course Grading. Where to go for help. Academic Integrity. EE-589 Introduction to Neural Networks NN 1 EE
EE-589 Introduction to Neural Assistant Prof. Dr. Turgay IBRIKCI Room # 305 (322) 338 6868 / 139 Wensdays 9:00-12:00 Course Outline The course is divided in two parts: theory and practice. 1. Theory covers
More informationarxiv: v1 [cs.cv] 2 Jun 2017
Temporal Action Labeling using Action Sets Alexander Richard, Hilde Kuehne, Juergen Gall University of Bonn, Germany {richard,kuehne,gall}@iai.uni-bonn.de arxiv:1706.00699v1 [cs.cv] 2 Jun 2017 Abstract
More informationA Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval
A Latent Semantic Model with Convolutional-Pooling Structure for Information Retrieval Yelong Shen Microsoft Research Redmond, WA, USA yeshen@microsoft.com Xiaodong He Jianfeng Gao Li Deng Microsoft Research
More informationAsk Me Anything: Dynamic Memory Networks for Natural Language Processing
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing Ankit Kumar*, Ozan Irsoy*, Peter Ondruska*, Mohit Iyyer*, James Bradbury, Ishaan Gulrajani*, Victor Zhong*, Romain Paulus, Richard
More informationDevice Independence and Extensibility in Gesture Recognition
Device Independence and Extensibility in Gesture Recognition Jacob Eisenstein, Shahram Ghandeharizadeh, Leana Golubchik, Cyrus Shahabi, Donghui Yan, Roger Zimmermann Department of Computer Science University
More informationAttributed Social Network Embedding
JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, MAY 2017 1 Attributed Social Network Embedding arxiv:1705.04969v1 [cs.si] 14 May 2017 Lizi Liao, Xiangnan He, Hanwang Zhang, and Tat-Seng Chua Abstract Embedding
More informationarxiv: v1 [cs.cl] 2 Apr 2017
Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,
More informationLearning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models
Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za
More informationGeorgetown University at TREC 2017 Dynamic Domain Track
Georgetown University at TREC 2017 Dynamic Domain Track Zhiwen Tang Georgetown University zt79@georgetown.edu Grace Hui Yang Georgetown University huiyang@cs.georgetown.edu Abstract TREC Dynamic Domain
More informationTwitter Sentiment Classification on Sanders Data using Hybrid Approach
IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationProposal of Pattern Recognition as a necessary and sufficient principle to Cognitive Science
Proposal of Pattern Recognition as a necessary and sufficient principle to Cognitive Science Gilberto de Paiva Sao Paulo Brazil (May 2011) gilbertodpaiva@gmail.com Abstract. Despite the prevalence of the
More informationLaboratorio di Intelligenza Artificiale e Robotica
Laboratorio di Intelligenza Artificiale e Robotica A.A. 2008-2009 Outline 2 Machine Learning Unsupervised Learning Supervised Learning Reinforcement Learning Genetic Algorithms Genetics-Based Machine Learning
More informationPredicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks
Predicting Student Attrition in MOOCs using Sentiment Analysis and Neural Networks Devendra Singh Chaplot, Eunhee Rhim, and Jihie Kim Samsung Electronics Co., Ltd. Seoul, South Korea {dev.chaplot,eunhee.rhim,jihie.kim}@samsung.com
More informationA Reinforcement Learning Variant for Control Scheduling
A Reinforcement Learning Variant for Control Scheduling Aloke Guha Honeywell Sensor and System Development Center 3660 Technology Drive Minneapolis MN 55417 Abstract We present an algorithm based on reinforcement
More informationLecture 2: Quantifiers and Approximation
Lecture 2: Quantifiers and Approximation Case study: Most vs More than half Jakub Szymanik Outline Number Sense Approximate Number Sense Approximating most Superlative Meaning of most What About Counting?
More informationSummarizing Answers in Non-Factoid Community Question-Answering
Summarizing Answers in Non-Factoid Community Question-Answering Hongya Song Zhaochun Ren Shangsong Liang hongya.song.sdu@gmail.com zhaochun.ren@ucl.ac.uk shangsong.liang@ucl.ac.uk Piji Li Jun Ma Maarten
More informationWebLogo-2M: Scalable Logo Detection by Deep Learning from the Web
WebLogo-2M: Scalable Logo Detection by Deep Learning from the Web Hang Su Queen Mary University of London hang.su@qmul.ac.uk Shaogang Gong Queen Mary University of London s.gong@qmul.ac.uk Xiatian Zhu
More informationarxiv: v2 [cs.cv] 4 Mar 2016
MULTI-SCALE CONTEXT AGGREGATION BY DILATED CONVOLUTIONS Fisher Yu Princeton University Vladlen Koltun Intel Labs arxiv:1511.07122v2 [cs.cv] 4 Mar 2016 ABSTRACT State-of-the-art models for semantic segmentation
More informationSORT: Second-Order Response Transform for Visual Recognition
SORT: Second-Order Response Transform for Visual Recognition Yan Wang 1, Lingxi Xie 2( ), Chenxi Liu 2, Siyuan Qiao 2 Ya Zhang 1( ), Wenjun Zhang 1, Qi Tian 3, Alan Yuille 2 1 Cooperative Medianet Innovation
More informationResidual Stacking of RNNs for Neural Machine Translation
Residual Stacking of RNNs for Neural Machine Translation Raphael Shu The University of Tokyo shu@nlab.ci.i.u-tokyo.ac.jp Akiva Miura Nara Institute of Science and Technology miura.akiba.lr9@is.naist.jp
More informationModeling function word errors in DNN-HMM based LVCSR systems
Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford
More informationLip Reading in Profile
CHUNG AND ZISSERMAN: BMVC AUTHOR GUIDELINES 1 Lip Reading in Profile Joon Son Chung http://wwwrobotsoxacuk/~joon Andrew Zisserman http://wwwrobotsoxacuk/~az Visual Geometry Group Department of Engineering
More informationSegmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition
Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Yanzhang He, Eric Fosler-Lussier Department of Computer Science and Engineering The hio
More informationTesting A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA
Testing A Moving Target: How Do We Test Machine Learning Systems? Peter Varhol Technology Strategy Research, USA Testing a Moving Target How Do We Test Machine Learning Systems? Peter Varhol, Technology
More informationADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY DOWNLOAD EBOOK : ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY PDF
Read Online and Download Ebook ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY DOWNLOAD EBOOK : ADVANCED MACHINE LEARNING WITH PYTHON BY JOHN HEARTY PDF Click link bellow and free register to download
More informationarxiv: v4 [cs.cv] 13 Aug 2017
Ruben Villegas 1 * Jimei Yang 2 Yuliang Zou 1 Sungryull Sohn 1 Xunyu Lin 3 Honglak Lee 1 4 arxiv:1704.05831v4 [cs.cv] 13 Aug 17 Abstract We propose a hierarchical approach for making long-term predictions
More informationLecture 10: Reinforcement Learning
Lecture 1: Reinforcement Learning Cognitive Systems II - Machine Learning SS 25 Part III: Learning Programs and Strategies Q Learning, Dynamic Programming Lecture 1: Reinforcement Learning p. Motivation
More information12- A whirlwind tour of statistics
CyLab HT 05-436 / 05-836 / 08-534 / 08-734 / 19-534 / 19-734 Usable Privacy and Security TP :// C DU February 22, 2016 y & Secu rivac rity P le ratory bo La Lujo Bauer, Nicolas Christin, and Abby Marsh
More informationGenerative models and adversarial training
Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?
More informationThe 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, / X
The 9 th International Scientific Conference elearning and software for Education Bucharest, April 25-26, 2013 10.12753/2066-026X-13-154 DATA MINING SOLUTIONS FOR DETERMINING STUDENT'S PROFILE Adela BÂRA,
More informationDiscriminative Learning of Beam-Search Heuristics for Planning
Discriminative Learning of Beam-Search Heuristics for Planning Yuehua Xu School of EECS Oregon State University Corvallis,OR 97331 xuyu@eecs.oregonstate.edu Alan Fern School of EECS Oregon State University
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationA Vector Space Approach for Aspect-Based Sentiment Analysis
A Vector Space Approach for Aspect-Based Sentiment Analysis by Abdulaziz Alghunaim B.S., Massachusetts Institute of Technology (2015) Submitted to the Department of Electrical Engineering and Computer
More informationEvolution of Symbolisation in Chimpanzees and Neural Nets
Evolution of Symbolisation in Chimpanzees and Neural Nets Angelo Cangelosi Centre for Neural and Adaptive Systems University of Plymouth (UK) a.cangelosi@plymouth.ac.uk Introduction Animal communication
More informationA study of speaker adaptation for DNN-based speech synthesis
A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,
More informationAnalysis of Emotion Recognition System through Speech Signal Using KNN & GMM Classifier
IOSR Journal of Electronics and Communication Engineering (IOSR-JECE) e-issn: 2278-2834,p- ISSN: 2278-8735.Volume 10, Issue 2, Ver.1 (Mar - Apr.2015), PP 55-61 www.iosrjournals.org Analysis of Emotion
More informationhave to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,
A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994
More informationTRANSFER LEARNING OF WEAKLY LABELLED AUDIO. Aleksandr Diment, Tuomas Virtanen
TRANSFER LEARNING OF WEAKLY LABELLED AUDIO Aleksandr Diment, Tuomas Virtanen Tampere University of Technology Laboratory of Signal Processing Korkeakoulunkatu 1, 33720, Tampere, Finland firstname.lastname@tut.fi
More informationData Integration through Clustering and Finding Statistical Relations - Validation of Approach
Data Integration through Clustering and Finding Statistical Relations - Validation of Approach Marek Jaszuk, Teresa Mroczek, and Barbara Fryc University of Information Technology and Management, ul. Sucharskiego
More informationClassification Using ANN: A Review
International Journal of Computational Intelligence Research ISSN 0973-1873 Volume 13, Number 7 (2017), pp. 1811-1820 Research India Publications http://www.ripublication.com Classification Using ANN:
More informationPhrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues
Phrase Localization and Visual Relationship Detection with Comprehensive Image-Language Cues Bryan A. Plummer Arun Mallya Christopher M. Cervantes Julia Hockenmaier Svetlana Lazebnik University of Illinois
More informationActive Learning. Yingyu Liang Computer Sciences 760 Fall
Active Learning Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven,
More informationThe Importance of Social Network Structure in the Open Source Software Developer Community
The Importance of Social Network Structure in the Open Source Software Developer Community Matthew Van Antwerp Department of Computer Science and Engineering University of Notre Dame Notre Dame, IN 46556
More information