Knowledge Representation and Reasoning with Deep Neural Networks. Arvind Neelakantan
1 Knowledge Representation and Reasoning with Deep Neural Networks
- Arvind Neelakantan
- UMass Amherst: David Belanger, Rajarshi Das, Andrew McCallum and Benjamin Roth
- Google Brain: Martin Abadi, Dario Amodei, Quoc Le and Ilya Sutskever
2 Knowledge Representation and Reasoning
- Represent world knowledge so that computers can use it
- Manipulate available knowledge to produce desired behavior
- Language understanding, robotics, ...
3 Early Systems
- Symbolic representation
- Reasoning/inference with search
- General Problem Solver (Simon et al., 1959), Cyc (Lenat et al., 1986), ...
- Precise
4 Early Systems
- Knowledge: permissible transformations
- Reasoning: search algorithm
5 Example
6 Example: Which venue has the biggest turnout?
7 Example: Which venue had the biggest turnout?
1. Pick column Attendance
2. Get position of max entry
3. Print corresponding entry from column Site
8 Example: Which venue had the biggest turnout?
- select site (max attendance)
- Manipulating symbols and discrete processing!
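The three steps on the slide above can be sketched in plain Python. This is only an illustration: the table contents (venue names and attendance figures) are made up because the talk's table is not in the transcript, and `select_max` is a hypothetical helper name.

```python
# Hypothetical table: the slide's real table is not part of the transcript.
table = {
    "Site": ["Stadium A", "Stadium B", "Stadium C"],
    "Attendance": [41000, 56500, 38250],
}


def select_max(table, print_col, max_col):
    """Return the print_col entry on the row where max_col is largest."""
    column = table[max_col]                                      # 1. pick column
    position = max(range(len(column)), key=lambda i: column[i])  # 2. position of max entry
    return table[print_col][position]                            # 3. corresponding entry


answer = select_max(table, "Site", "Attendance")
print(answer)
```

Every step here is a discrete operation on symbols (column names, row positions), which is the kind of processing the slide contrasts with neural networks.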
9 Early Systems: Issues
- Real-world data is challenging
- Lack of generalization to a large number of symbols
- No learning
10 Recent Work
- Markov Logic Networks (Richardson & Domingos, 2006), Probabilistic Soft Logic (Kimmig et al., 2012), Semantic Parsers (Zelle & Mooney, 1996), ...
- Some components are learned
- Symbolic; most of the problems remain
11 Deep Neural Networks (figure: network mapping Input to Output)
- Speech recognition: ~5% absolute accuracy improvement (Dahl et al., 2012)
- Image recognition: ~10% absolute accuracy improvement (Krizhevsky et al., 2012)
12 Deep Neural Networks (figure: network mapping Input to an Output real-valued vector, i.e. a distributed representation)
- Continuous data and processing through real numbers
- Transformation from input to output is learned from data using the backpropagation algorithm
13 Perception vs Reasoning
- Input: continuous data (perception) vs discrete symbols (reasoning)
- Processing: fuzzy (perception) vs programs containing discrete operations, rules, ... (reasoning)
14 Deep Neural Networks for Knowledge Representation and Reasoning
15 Deep Neural Networks for Knowledge Representation and Reasoning
1. Can we represent symbols with distributed representations and learn them?
2. Can we learn neural networks to perform reasoning with these representations?
16 Deep Neural Networks for Knowledge Representation and Reasoning
1. Generalization via distributed representations
2. Powerful non-linear models
3. Learn end-to-end, handle messy real-world data
17 Deep Neural Networks for Knowledge Representation and Reasoning
1. Can we represent symbols with distributed representations and learn them?
2. Can we train neural networks to perform reasoning with these representations?
- Massive structured knowledge base
- Semi-structured web tables
18 Knowledge Graphs (figure: Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle)
19 Knowledge Graph Path Queries (Task 1)
- Figure: Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle, with the query edge Melinda Gates -LivesIn-> Seattle
- ChairOf(A, X) ^ Headquarters(X, B) => LivesIn(A, B)
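As a point of reference for the symbolic view of this task, the rule on the slide can be applied by brute force over a set of triples. The sketch below is an assumption-laden illustration, not the talk's method: `apply_path_rule` is a hypothetical helper, and the graph holds only the two edges shown on the slide.

```python
# Toy knowledge graph as (head, relation, tail) triples, taken from the slide.
triples = {
    ("Melinda Gates", "ChairOf", "Gates Foundation"),
    ("Gates Foundation", "Headquarters", "Seattle"),
}


def apply_path_rule(triples, first_rel, second_rel, target_rel):
    """Infer target_rel(A, B) whenever first_rel(A, X) and second_rel(X, B) hold."""
    inferred = set()
    for a, r1, x in triples:
        if r1 != first_rel:
            continue
        for x2, r2, b in triples:
            if r2 == second_rel and x2 == x:
                inferred.add((a, target_rel, b))
    return inferred


inferred = apply_path_rule(triples, "ChairOf", "Headquarters", "LivesIn")
print(inferred)
```

A rule like this only fires on the exact relation names it was written for, so a path that uses a different surface form for the same relation is missed.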
20 Program Induction/Semantic Parsing
21 Program Induction/Semantic Parsing
- Which venue had the biggest turnout? => select site (max attendance)
- how many games were telecasted in CBS? => count(location == CBS)
22 Program Induction/Semantic Parsing (Task 2)
- Which venue had the biggest turnout? => select site (max attendance)
- how many games were telecasted in CBS? => count(location == CBS)
23 Related Work in Reasoning
- Natural Language Inference/Textual Entailment
- Visual Question Answering
- Reading Comprehension
24 Task 1: Knowledge Graph Path Queries
- Arvind Neelakantan, Benjamin Roth, and Andrew McCallum. Knowledge base completion using compositional vector space models. Workshop on Automated Knowledge Base Construction at NIPS, 2014
- Arvind Neelakantan, Benjamin Roth, and Andrew McCallum. Compositional vector space models for knowledge base completion. ACL, 2015
- Rajarshi Das, Arvind Neelakantan, David Belanger, and Andrew McCallum. Chains of reasoning over entities, relations, and text using recurrent neural networks. EACL,
25 Path Queries
- Single-hop, labeled heads: Melinda Gates -ChairOf-> Gates Foundation
- Multi-hop, labeled LivesIn: Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle
26 Motivation (figure: Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle, with target relation LivesIn)
- Textual relations shown on the edges: heads, headquartered in, leads, headquarters located in, leader of, founded in, chairperson of, based in
- Previous work is symbolic: Path Ranking Algorithm (Lao et al., 2011) & Sherlock (Schoenmackers et al., 2010)
- Combinatorial explosion => poor generalization
27 Multi-hop Reasoning
- Current methods do not generalize to unseen paths
28 Model (Neelakantan, Roth, McCallum, 2014)
- Figure: an RNN runs over the relations on the path Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle, and its output is compared with the target relation LivesIn by vector similarity
- Generalizes to unseen paths!
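The idea on this slide can be sketched with NumPy: compose the relation vectors along a path with a recurrent step, then score the result against the target relation's vector. Everything specific here is an assumption made for illustration (the embedding size, the tanh nonlinearity, the dot-product similarity, the random untrained parameters, and the names `encode_path` and `score`); it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8  # embedding size (assumed)

# Randomly initialised, untrained parameters; in practice these are learned.
relations = ["ChairOf", "Headquarters", "LivesIn"]
emb = {r: rng.normal(scale=0.1, size=dim) for r in relations}
W_h = rng.normal(scale=0.1, size=(dim, dim))  # recurrent weights
W_x = rng.normal(scale=0.1, size=(dim, dim))  # input weights


def encode_path(path):
    """Fold the relation vectors on a path into one vector with a simple RNN."""
    h = np.zeros(dim)
    for rel in path:
        h = np.tanh(W_h @ h + W_x @ emb[rel])
    return h


def score(path, target):
    """Similarity between the composed path vector and the target relation vector."""
    return float(encode_path(path) @ emb[target])


path_score = score(["ChairOf", "Headquarters"], "LivesIn")
print(path_score)
```

Because the path is represented by composing per-relation vectors, any sequence of known relations gets a score, including sequences never seen as a whole during training.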
29 Selection/Attention
1. Max
- Figure: several paths between Melinda Gates and Seattle (Spouse Bill Gates, Friends Warren Buffett visited, ChairOf Gates Foundation Headquarters), each with a similarity score against the target relation
- Train with backprop!
30 Data
- Entity pairs: 3.2M
- Facts: 52M
- Relations: 51K
- Relation types tested: 46
- Total # paths: 191M
- Average path length: 4.7
- Maximum path length: 7
31 Results - Attention (Mean Average Precision)
- Path Ranking Algorithm: 64.4
- Path Ranking Algorithm + bigram: 64.9
- RNN (max)
32 Selection/Attention (Das, Neelakantan, Belanger, McCallum, 2016)
1. Max
2. Average
3. Top-k
4. LogSumExp
- Figure: several paths between Melinda Gates and Seattle (Spouse Bill Gates, Friends Warren Buffett visited, ChairOf Gates Foundation Headquarters), each with a similarity score against the target relation
- Train with backprop!
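The four pooling choices listed on this slide can be compared on a handful of per-path scores. The sketch below is a hedged illustration: the scores are made up, `pool_scores` is a hypothetical helper, and "top-k" is assumed to mean the mean of the k highest scores.

```python
import numpy as np


def pool_scores(scores, method="logsumexp", k=2):
    """Combine the similarity scores of all paths between one entity pair."""
    s = np.asarray(scores, dtype=float)
    if method == "max":
        return float(s.max())
    if method == "avg":
        return float(s.mean())
    if method == "top-k":
        # Assumption: average of the k highest-scoring paths.
        return float(np.sort(s)[-k:].mean())
    if method == "logsumexp":
        m = s.max()  # subtract the max for numerical stability
        return float(m + np.log(np.exp(s - m).sum()))
    raise ValueError(f"unknown pooling method: {method}")


path_scores = [2.0, 0.5, -1.0]  # made-up scores for three paths
pooled = {m: pool_scores(path_scores, m) for m in ("max", "avg", "top-k", "logsumexp")}
print(pooled)
```

Max passes gradient only to the single best path, while LogSumExp is a smooth upper bound on the max, so every path receives some gradient in proportion to its score.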
33 Results - Attention (Mean Average Precision)
- Path Ranking Algorithm: 64.4
- Path Ranking Algorithm + bigram: 64.9
- RNN (max): 65.2
- RNN (avg): 55.0
- RNN (top-k): 68.2
- RNN (logsumexp)
34 Predictive Paths for /people/person/place_of_birth(A, B)
- Seen paths
  - A -was born in-> X -/location/mailing_address/citytown-> Y -/location/mailing_address/state_province_region-> B
  - A -from-> X -/location/location/contains^-1-> B
- Unseen paths
  - A -born in-> X -near-> B
  - A -was born in-> X -commonly known as-> B
35 Multi-hop Reasoning
- Current methods do not generalize to unseen paths
- Recurrent Neural Networks achieve state-of-the-art results on answering path queries
36 Zero-Shot (figure: the RNN path model scores Melinda Gates -ChairOf-> Gates Foundation -Headquarters-> Seattle against LivesIn by vector similarity)
- Predict relations not explicitly trained on!
37 Results (Mean Average Precision)
- Random: 7.6
- RNN (zero-shot): 20.6
- RNN (supervised)
38 Multi-hop Reasoning
- Current methods do not generalize to unseen paths
- Recurrent Neural Networks achieve state-of-the-art results on answering path queries
- RNNs can perform zero-shot learning!
39 Deep Neural Networks for Knowledge Representation and Reasoning
- Recurrent Neural Networks achieve state-of-the-art results on answering knowledge graph path queries
40 Task 2: Program Induction/Semantic Parsing
- Arvind Neelakantan, Quoc V. Le, and Ilya Sutskever. Neural Programmer: Inducing latent programs with gradient descent. ICLR, 2016
- Arvind Neelakantan, Quoc V. Le, Martin Abadi, Andrew McCallum, and Dario Amodei. Learning a natural language interface with neural programmer. ICLR,
41 Program Induction/Semantic Parsing
42 Program Induction/Semantic Parsing
- Lookup question: Which venue had the biggest turnout?
- Number question: how many games were telecasted in CBS?
43 Program Induction/Semantic Parsing
- Which venue had the biggest turnout? => select site (max attendance)
- how many games were telecasted in CBS? => count(location == CBS)
44 Challenges
- Multi-step reasoning: Which section is the longest? => select name (max kilometers)
- Weak supervision: only the answer is given, not the program: Which section is the longest? => IDF Checkpoint
45 Motivation: Non-Neural Network approaches
- Strong supervision: Zelle & Mooney (1996); Zettlemoyer & Collins (2005)
- Weak supervision (dataset-specific rules to guide program search): Liang et al. (2011); Kwiatkowski et al. (2013); Pasupat & Liang (2015)
46 End-to-End Neural Networks
- Learning discrete functions is notoriously challenging! (Joulin & Mikolov, 2015)
47 Semantic Parsing
- Multi-step reasoning with discrete functions; weak supervision
48 Neural Programmer (Neelakantan, Le, Sutskever, 2016) (figure)
- Input question: What was the total number of goals scored in 2005
- Neural network runs for timesteps t = 1, ..., T, performing column selection and operation selection
- Operations: Count, Select, ArgMax, ArgMin, >, <, Print
- Operations take data from the table and the row selector from t-1
- Outputs: scalar answer, lookup answer, row selector
49 Neural Programmer (figure: one timestep t, for t = 1, 2, ..., T)
- Question RNN produces q
- History RNN step takes h_{t-1} and the input at step t, c_t, and produces h_t
- [ ; ] (concatenation) feeds the Col Selector and the Op Selector
- Operations are applied to the table and produce Output_t, which is part of the input at step t+1
- Final Output = Output_T
- Output: Scalar Answer, Lookup Answer, Row Selector
50 Operations
- Row Selector: vector with size equal to number of rows
  - Comparison: >, <, >=, <=
  - Superlative: argmax, argmin
  - Table Ops: select, first, last, prev, next, group_by_max
  - Reset/No-Op
- Scalar Answer: real number
  - Aggregation: count
- Lookup Answer: matrix with same dimension as table
  - Print
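To make the three output types concrete, here is a minimal NumPy sketch of a few of the listed operations acting on a row selector. It assumes hard (0/1) row selectors and a made-up two-column table; the function names and signatures are hypothetical and are not taken from the Neural Programmer code.

```python
import numpy as np

# Hypothetical table (rows x columns); the talk's table is not in the transcript.
columns = ["season", "goals"]
values = np.array([[2004.0, 7.0],
                   [2005.0, 12.0],
                   [2006.0, 9.0]])


def col(name):
    return values[:, columns.index(name)]


def select(name, value):
    """Row selector: 1.0 for rows whose entry in the column equals value."""
    return (col(name) == value).astype(float)


def greater(name, pivot):
    """Row selector: 1.0 for rows whose entry in the column exceeds pivot."""
    return (col(name) > pivot).astype(float)


def argmax(name, row_select):
    """Row selector: one-hot on the largest entry among currently selected rows."""
    masked = np.where(row_select > 0, col(name), -np.inf)
    out = np.zeros(len(values))
    out[int(np.argmax(masked))] = 1.0
    return out


def count(row_select):
    """Scalar answer: number of selected rows."""
    return float(row_select.sum())


def print_op(name, row_select):
    """Lookup answer: table-shaped matrix marking the selected cells of one column."""
    lookup = np.zeros_like(values)
    lookup[:, columns.index(name)] = row_select
    return lookup


rows_2005 = select("season", 2005.0)
lookup = print_op("goals", rows_2005)
answer = values[lookup > 0]
print(answer)
```

Chaining `select` on season with `print_op` on goals mirrors the two non-trivial steps of the "goals scored in 2005" example program.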
51 Example: What was the total number of goals scored in 2005
- Step 1: operation No-Op, column -
- Step 2: operation No-Op, column -
- Step 3: operation select, column season
- Step 4: operation print, column goals
52 Weak Supervision: What was the total number of goals scored in 2005
- Step 1: operation No-Op, column -
- Step 2: operation No-Op, column -
- Step 3: operation select, column season
- Step 4: operation print, column goals
- Final Answer: 12
53 Soft Selection/Attention (Bahdanau, Cho, Bengio, 2014)
- Average the outputs of the different operations, weighted by the probabilities from the model
- Train with backprop!
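54 Soft Selection/Attention (figure: Column A and Column B, Operation A and Operation B; the Output is a sum of operation results, each weighted by the product of its column and operation probabilities, e.g. 0.6 x 0.7)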
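A small numeric sketch of soft selection follows. The probabilities 0.6, 0.7 and 0.3 appear on the slide; the 0.4 is assumed so that the column probabilities sum to one, and the column contents and the two stand-in operations (sum and max) are invented for illustration.

```python
import numpy as np

col_probs = np.array([0.6, 0.4])  # Column A, Column B (0.4 is an assumption)
op_probs = np.array([0.7, 0.3])   # Operation A, Operation B

# Hypothetical column contents and stand-in operations.
table_columns = np.array([[1.0, 2.0, 3.0],   # Column A
                          [4.0, 5.0, 6.0]])  # Column B
operations = [np.sum, np.max]                # Operation A, Operation B

# outputs[c, o] = result of operation o applied to column c
outputs = np.array([[op(c) for op in operations] for c in table_columns])

# Soft selection: every (column, operation) result weighted by p(column) * p(operation).
soft_output = float(col_probs @ outputs @ op_probs)

# Hard selection: commit to the single most probable column and operation.
hard_output = float(outputs[col_probs.argmax(), op_probs.argmax()])

print(soft_output, hard_output)
```

The soft output is differentiable with respect to both probability vectors, which is what allows training with backpropagation; the experiments slide notes that hard selection is used at test time.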
55 Training Objective
- Final answer
  - Number answer: square loss
  - Lookup answer: average of loss on each entry
- An answer that is simply written down introduces ambiguity
  - Number could be generated or a table entry
  - Multiple table entries match the answer
  - Minimum of individual losses
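The objective on this slide can be sketched as follows. The slide names a square loss for number answers and a per-entry average for lookup answers; using binary cross-entropy as the per-entry loss is an assumption, as are the function names and the example predictions.

```python
import numpy as np


def scalar_loss(pred, target):
    """Square loss between the predicted and the gold number."""
    return float((pred - target) ** 2)


def lookup_loss(pred_probs, target_mask, eps=1e-9):
    """Average per-entry loss (binary cross-entropy assumed) over the table."""
    p = np.clip(pred_probs, eps, 1.0 - eps)
    per_entry = -(target_mask * np.log(p) + (1.0 - target_mask) * np.log(1.0 - p))
    return float(per_entry.mean())


def example_loss(scalar_pred, lookup_pred, answer, candidate_masks):
    """Minimum over every way the written-down answer could have been produced."""
    losses = [scalar_loss(scalar_pred, answer)]
    losses += [lookup_loss(lookup_pred, mask) for mask in candidate_masks]
    return min(losses)


# Hypothetical 3x2 table in which the gold answer 12 matches exactly one cell.
mask = np.zeros((3, 2))
mask[1, 1] = 1.0
lookup_pred = np.array([[0.01, 0.02],
                        [0.03, 0.97],
                        [0.02, 0.01]])

loss = example_loss(scalar_pred=3.0, lookup_pred=lookup_pred,
                    answer=12.0, candidate_masks=[mask])
print(loss)
```

Taking the minimum lets the model be credited for whichever interpretation of the answer (generated number or matching table entry) it currently explains best.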
56 Semantic Parsing
- Multi-step reasoning with discrete functions; weak supervision
- Neural Programmer can be trained end-to-end with backpropagation using weak supervision
57 Previous Work
- Strong supervision, non-neural network: Zelle & Mooney (1996); Zettlemoyer & Collins (2005)
- Strong supervision, neural network: Jia & Liang (2016); Neural Programmer Interpreter (Reed & De Freitas, 2015); Neural Enquirer (Yin et al., 2016)
- Weak supervision, non-neural network: Liang et al. (2011); Kwiatkowski et al. (2013); Pasupat & Liang (2015)
- Weak supervision, neural network: Dynamic Neural Module Network (Andreas et al., 2016), not end-to-end
58 Experiments
- WikiTableQuestions dataset (Pasupat & Liang, 2015)
- Databases at test time are unseen during training
- 10k training examples with weak supervision
- Hard selection at test time
- 4 timesteps and 15 operations
59 Neural Networks
- Seq2Seq (Sutskever, Vinyals & Le, 2014): 8.9% accuracy
- Pointer Networks (Vinyals, Fortunato & Jaitly, 2015): 4.0% accuracy on lookup questions
60 Results (Neelakantan, Le, Abadi, McCallum, Amodei, 2017). Columns: Method, Dev Accuracy, Test Accuracy
- Information Retrieval System
- Simple Semantic Parser
- Semantic Parser (Pasupat & Liang, 2015)
- Neural Programmer - {dropout, weight decay}
- Neural Programmer
- Ensemble of 15 Neural Programmers
61 Training Data Size (# training examples)
- Textual Entailment: 4.5k
- Textual Entailment: 550k
- Reading Comprehension: 86k
- Rows: Non-Neural Network, Neural Network
62 Conversational QA (Iyyer, Yih, Chang, 2017): Test Accuracy
- Semantic Parser (Pasupat & Liang, 2015): 33.2
- Neural Programmer: 40.2
- DynSP (Iyyer, Yih, Chang, 2017)
63 Semantic Parsing
- Multi-step reasoning with discrete functions; weak supervision
- Neural Programmer can be trained end-to-end with backpropagation using weak supervision
- Neural Programmer works surprisingly well on a small real-world dataset
64 Example Programs (1) (each step is written as operation column)
- What is the total number of teams? => count -
- how many games had greater than 1500 in attendance? => >= attendance; count -
- what is the total number of runnerups listed on the chart? => select outcome; count -
65 Example Programs (2) (each step is written as operation column)
- which section is longest?? => argmax kilometers; print name
- Which engine(s) has the least amount of power? => argmin power; print engine
- Who had more silver medals, cuba or brazil? => argmax nation; select nation; argmax silver; print nation
66 Example Programs (3) (each step is written as operation column)
- who was the next appointed director after lee p. brown? => select name; next -; last -; print name
- what team is listed previous to belgium? => select team; previous -; first -; print team
67 Summary
68 Deep Neural Networks for Knowledge Representation and Reasoning
- Recurrent Neural Networks achieve state-of-the-art results on answering knowledge graph path queries
- Neural Programmer achieves competitive results on a small real-world question answering dataset
69 Key Components
- Recurrent Neural Networks
- Attention/Selection Mechanism
- Backpropagation
70 Deep Neural Networks for Knowledge Representation and Reasoning
- Recurrent Neural Networks achieve state-of-the-art results on answering knowledge graph path queries
- Neural Programmer achieves competitive results on a small real-world question answering dataset
- Code and data are publicly available!
71 Acknowledgements: Google PhD Fellowship, UMass Amherst and Google Brain. Thank You!
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks
System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering
More informationSecond Exam: Natural Language Parsing with Neural Networks
Second Exam: Natural Language Parsing with Neural Networks James Cross May 21, 2015 Abstract With the advent of deep learning, there has been a recent resurgence of interest in the use of artificial neural
More informationLecture 1: Machine Learning Basics
1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3
More informationUnsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model
Unsupervised Learning of Word Semantic Embedding using the Deep Structured Semantic Model Xinying Song, Xiaodong He, Jianfeng Gao, Li Deng Microsoft Research, One Microsoft Way, Redmond, WA 98052, U.S.A.
More informationarxiv: v4 [cs.cl] 28 Mar 2016
LSTM-BASED DEEP LEARNING MODELS FOR NON- FACTOID ANSWER SELECTION Ming Tan, Cicero dos Santos, Bing Xiang & Bowen Zhou IBM Watson Core Technologies Yorktown Heights, NY, USA {mingtan,cicerons,bingxia,zhou}@us.ibm.com
More informationA Neural Network GUI Tested on Text-To-Phoneme Mapping
A Neural Network GUI Tested on Text-To-Phoneme Mapping MAARTEN TROMPPER Universiteit Utrecht m.f.a.trompper@students.uu.nl Abstract Text-to-phoneme (T2P) mapping is a necessary step in any speech synthesis
More informationGenerative models and adversarial training
Day 4 Lecture 1 Generative models and adversarial training Kevin McGuinness kevin.mcguinness@dcu.ie Research Fellow Insight Centre for Data Analytics Dublin City University What is a generative model?
More informationModule 12. Machine Learning. Version 2 CSE IIT, Kharagpur
Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should
More informationCSL465/603 - Machine Learning
CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am
More informationCS Machine Learning
CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing
More informationarxiv: v1 [cs.lg] 15 Jun 2015
Dual Memory Architectures for Fast Deep Learning of Stream Data via an Online-Incremental-Transfer Strategy arxiv:1506.04477v1 [cs.lg] 15 Jun 2015 Sang-Woo Lee Min-Oh Heo School of Computer Science and
More informationSemi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.
Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link
More informationPython Machine Learning
Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled
More informationLecture 10: Reinforcement Learning
Lecture 1: Reinforcement Learning Cognitive Systems II - Machine Learning SS 25 Part III: Learning Programs and Strategies Q Learning, Dynamic Programming Lecture 1: Reinforcement Learning p. Motivation
More informationQuestion Answering on Knowledge Bases and Text using Universal Schema and Memory Networks
Question Answering on Knowledge Bases and Text using Universal Schema and Memory Networks Rajarshi Das Manzil Zaheer Siva Reddy and Andrew McCallum College of Information and Computer Sciences, University
More informationarxiv: v1 [cs.cv] 10 May 2017
Inferring and Executing Programs for Visual Reasoning Justin Johnson 1 Bharath Hariharan 2 Laurens van der Maaten 2 Judy Hoffman 1 Li Fei-Fei 1 C. Lawrence Zitnick 2 Ross Girshick 2 1 Stanford University
More informationA Case Study: News Classification Based on Term Frequency
A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center
More informationAutoregressive product of multi-frame predictions can improve the accuracy of hybrid models
Autoregressive product of multi-frame predictions can improve the accuracy of hybrid models Navdeep Jaitly 1, Vincent Vanhoucke 2, Geoffrey Hinton 1,2 1 University of Toronto 2 Google Inc. ndjaitly@cs.toronto.edu,
More informationTraining a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski
Training a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski Problem Statement and Background Given a collection of 8th grade science questions, possible answer
More informationAsk Me Anything: Dynamic Memory Networks for Natural Language Processing
Ask Me Anything: Dynamic Memory Networks for Natural Language Processing Ankit Kumar*, Ozan Irsoy*, Peter Ondruska*, Mohit Iyyer*, James Bradbury, Ishaan Gulrajani*, Victor Zhong*, Romain Paulus, Richard
More informationSemantic and Context-aware Linguistic Model for Bias Detection
Semantic and Context-aware Linguistic Model for Bias Detection Sicong Kuang Brian D. Davison Lehigh University, Bethlehem PA sik211@lehigh.edu, davison@cse.lehigh.edu Abstract Prior work on bias detection
More informationCal s Dinner Card Deals
Cal s Dinner Card Deals Overview: In this lesson students compare three linear functions in the context of Dinner Card Deals. Students are required to interpret a graph for each Dinner Card Deal to help
More informationDialog-based Language Learning
Dialog-based Language Learning Jason Weston Facebook AI Research, New York. jase@fb.com arxiv:1604.06045v4 [cs.cl] 20 May 2016 Abstract A long-term goal of machine learning research is to build an intelligent
More informationResidual Stacking of RNNs for Neural Machine Translation
Residual Stacking of RNNs for Neural Machine Translation Raphael Shu The University of Tokyo shu@nlab.ci.i.u-tokyo.ac.jp Akiva Miura Nara Institute of Science and Technology miura.akiba.lr9@is.naist.jp
More informationAQUA: An Ontology-Driven Question Answering System
AQUA: An Ontology-Driven Question Answering System Maria Vargas-Vera, Enrico Motta and John Domingue Knowledge Media Institute (KMI) The Open University, Walton Hall, Milton Keynes, MK7 6AA, United Kingdom.
More informationLip Reading in Profile
CHUNG AND ZISSERMAN: BMVC AUTHOR GUIDELINES 1 Lip Reading in Profile Joon Son Chung http://wwwrobotsoxacuk/~joon Andrew Zisserman http://wwwrobotsoxacuk/~az Visual Geometry Group Department of Engineering
More informationGeorgetown University at TREC 2017 Dynamic Domain Track
Georgetown University at TREC 2017 Dynamic Domain Track Zhiwen Tang Georgetown University zt79@georgetown.edu Grace Hui Yang Georgetown University huiyang@cs.georgetown.edu Abstract TREC Dynamic Domain
More informationPOS tagging of Chinese Buddhist texts using Recurrent Neural Networks
POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important
More informationExperiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling
Experiments with SMS Translation and Stochastic Gradient Descent in Spanish Text Author Profiling Notebook for PAN at CLEF 2013 Andrés Alfonso Caurcel Díaz 1 and José María Gómez Hidalgo 2 1 Universidad
More informationOPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS
OPTIMIZATINON OF TRAINING SETS FOR HEBBIAN-LEARNING- BASED CLASSIFIERS Václav Kocian, Eva Volná, Michal Janošek, Martin Kotyrba University of Ostrava Department of Informatics and Computers Dvořákova 7,
More informationAssignment 1: Predicting Amazon Review Ratings
Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for
More informationGrade 6: Correlated to AGS Basic Math Skills
Grade 6: Correlated to AGS Basic Math Skills Grade 6: Standard 1 Number Sense Students compare and order positive and negative integers, decimals, fractions, and mixed numbers. They find multiples and
More informationCompositional Semantics
Compositional Semantics CMSC 723 / LING 723 / INST 725 MARINE CARPUAT marine@cs.umd.edu Words, bag of words Sequences Trees Meaning Representing Meaning An important goal of NLP/AI: convert natural language
More informationA study of speaker adaptation for DNN-based speech synthesis
A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,
More informationГлубокие рекуррентные нейронные сети для аспектно-ориентированного анализа тональности отзывов пользователей на различных языках
Глубокие рекуррентные нейронные сети для аспектно-ориентированного анализа тональности отзывов пользователей на различных языках Тарасов Д. С. (dtarasov3@gmail.com) Интернет-портал reviewdot.ru, Казань,
More informationarxiv: v1 [cs.lg] 7 Apr 2015
Transferring Knowledge from a RNN to a DNN William Chan 1, Nan Rosemary Ke 1, Ian Lane 1,2 Carnegie Mellon University 1 Electrical and Computer Engineering, 2 Language Technologies Institute Equal contribution
More informationDeep Neural Network Language Models
Deep Neural Network Language Models Ebru Arısoy, Tara N. Sainath, Brian Kingsbury, Bhuvana Ramabhadran IBM T.J. Watson Research Center Yorktown Heights, NY, 10598, USA {earisoy, tsainath, bedk, bhuvana}@us.ibm.com
More informationChallenges in Deep Reinforcement Learning. Sergey Levine UC Berkeley
Challenges in Deep Reinforcement Learning Sergey Levine UC Berkeley Discuss some recent work in deep reinforcement learning Present a few major challenges Show some of our recent work toward tackling
More informationAn investigation of imitation learning algorithms for structured prediction
JMLR: Workshop and Conference Proceedings 24:143 153, 2012 10th European Workshop on Reinforcement Learning An investigation of imitation learning algorithms for structured prediction Andreas Vlachos Computer
More informationProbabilistic Latent Semantic Analysis
Probabilistic Latent Semantic Analysis Thomas Hofmann Presentation by Ioannis Pavlopoulos & Andreas Damianou for the course of Data Mining & Exploration 1 Outline Latent Semantic Analysis o Need o Overview
More informationarxiv: v2 [cs.ir] 22 Aug 2016
Exploring Deep Space: Learning Personalized Ranking in a Semantic Space arxiv:1608.00276v2 [cs.ir] 22 Aug 2016 ABSTRACT Jeroen B. P. Vuurens The Hague University of Applied Science Delft University of
More informationhave to be modeled) or isolated words. Output of the system is a grapheme-tophoneme conversion system which takes as its input the spelling of words,
A Language-Independent, Data-Oriented Architecture for Grapheme-to-Phoneme Conversion Walter Daelemans and Antal van den Bosch Proceedings ESCA-IEEE speech synthesis conference, New York, September 1994
More informationA deep architecture for non-projective dependency parsing
Universidade de São Paulo Biblioteca Digital da Produção Intelectual - BDPI Departamento de Ciências de Computação - ICMC/SCC Comunicações em Eventos - ICMC/SCC 2015-06 A deep architecture for non-projective
More informationActive Learning. Yingyu Liang Computer Sciences 760 Fall
Active Learning Yingyu Liang Computer Sciences 760 Fall 2017 http://pages.cs.wisc.edu/~yliang/cs760/ Some of the slides in these lectures have been adapted/borrowed from materials developed by Mark Craven,
More informationIntroduction to Simulation
Introduction to Simulation Spring 2010 Dr. Louis Luangkesorn University of Pittsburgh January 19, 2010 Dr. Louis Luangkesorn ( University of Pittsburgh ) Introduction to Simulation January 19, 2010 1 /
More informationSemantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma
Semantic Segmentation with Histological Image Data: Cancer Cell vs. Stroma Adam Abdulhamid Stanford University 450 Serra Mall, Stanford, CA 94305 adama94@cs.stanford.edu Abstract With the introduction
More informationLearning Microsoft Office Excel
A Correlation and Narrative Brief of Learning Microsoft Office Excel 2010 2012 To the Tennessee for Tennessee for TEXTBOOK NARRATIVE FOR THE STATE OF TENNESEE Student Edition with CD-ROM (ISBN: 9780135112106)
More informationQuickStroke: An Incremental On-line Chinese Handwriting Recognition System
QuickStroke: An Incremental On-line Chinese Handwriting Recognition System Nada P. Matić John C. Platt Λ Tony Wang y Synaptics, Inc. 2381 Bering Drive San Jose, CA 95131, USA Abstract This paper presents
More informationRadius STEM Readiness TM
Curriculum Guide Radius STEM Readiness TM While today s teens are surrounded by technology, we face a stark and imminent shortage of graduates pursuing careers in Science, Technology, Engineering, and
More informationStatewide Framework Document for:
Statewide Framework Document for: 270301 Standards may be added to this document prior to submission, but may not be removed from the framework to meet state credit equivalency requirements. Performance
More informationA DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF GRAPH DATA
International Journal of Semantic Computing Vol. 5, No. 4 (2011) 433 462 c World Scientific Publishing Company DOI: 10.1142/S1793351X1100133X A DISTRIBUTIONAL STRUCTURED SEMANTIC SPACE FOR QUERYING RDF
More informationOn-the-Fly Customization of Automated Essay Scoring
Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,
More informationUsing Web Searches on Important Words to Create Background Sets for LSI Classification
Using Web Searches on Important Words to Create Background Sets for LSI Classification Sarah Zelikovitz and Marina Kogan College of Staten Island of CUNY 2800 Victory Blvd Staten Island, NY 11314 Abstract
More informationOnline Updating of Word Representations for Part-of-Speech Tagging
Online Updating of Word Representations for Part-of-Speech Tagging Wenpeng Yin LMU Munich wenpeng@cis.lmu.de Tobias Schnabel Cornell University tbs49@cornell.edu Hinrich Schütze LMU Munich inquiries@cislmu.org
More information(Sub)Gradient Descent
(Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include
More informationA New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation
A New Perspective on Combining GMM and DNN Frameworks for Speaker Adaptation SLSP-2016 October 11-12 Natalia Tomashenko 1,2,3 natalia.tomashenko@univ-lemans.fr Yuri Khokhlov 3 khokhlov@speechpro.com Yannick
More informationA Vector Space Approach for Aspect-Based Sentiment Analysis
A Vector Space Approach for Aspect-Based Sentiment Analysis by Abdulaziz Alghunaim B.S., Massachusetts Institute of Technology (2015) Submitted to the Department of Electrical Engineering and Computer
More informationTextGraphs: Graph-based algorithms for Natural Language Processing
HLT-NAACL 06 TextGraphs: Graph-based algorithms for Natural Language Processing Proceedings of the Workshop Production and Manufacturing by Omnipress Inc. 2600 Anderson Street Madison, WI 53704 c 2006
More informationDeep search. Enhancing a search bar using machine learning. Ilgün Ilgün & Cedric Reichenbach
#BaselOne7 Deep search Enhancing a search bar using machine learning Ilgün Ilgün & Cedric Reichenbach We are not researchers Outline I. Periscope: A search tool II. Goals III. Deep learning IV. Applying
More informationEvolutive Neural Net Fuzzy Filtering: Basic Description