SI425: NLP. Missing Topics and the Future


Who cares about NLP? NLP has expanded quickly. Most top-tier universities now have NLP faculty (Stanford, Cornell, Berkeley, MIT, UPenn, CMU, Hopkins, etc.). Commercial NLP hiring: Google, Microsoft, IBM, Amazon, LinkedIn, Yahoo. Web startups in Silicon Valley are eating up NLP students. Navy, DoD, NSA, NIH: all are funding NLP research. 2

What NLP topics did we miss? Speech Recognition 3

What NLP topics did we miss? Machine Translation 5

What NLP topics did we miss? Machine Translation. Start at ~6 min in: http://www.youtube.com/watch?feature=player_embedded&v=nu-nlqqfckg 6

What NLP topics did we miss? Machine Translation IBM Models (1 through 5) Neural Network Translation 7

Machine Translation. How do we model translations? Words: P(casa | house). Spurious words: P(a | NULL). Fertility: Pn(1 | house): the English word translates to one Spanish word. Distortion: Pd(5 | 2): the 2nd English word maps to the 5th Spanish word.
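As a toy illustration of how these factors combine, the sketch below multiplies word-translation, fertility, and distortion probabilities for one candidate alignment. All probability tables here are invented for the example, not values from a trained IBM model.

```python
# Invented toy probability tables (not learned from data).
t_prob = {("casa", "house"): 0.8, ("a", None): 0.1}    # word translation P(f | e)
fert_prob = {(1, "house"): 0.9}                        # fertility Pn(n | e)
dist_prob = {(5, 2): 0.05, (4, 4): 0.3, (5, 5): 0.3}   # distortion Pd(j | i)

def score_alignment(word_pairs, fertilities, jumps):
    """Multiply word-translation, fertility, and distortion probabilities
    for one candidate alignment; unseen events get a tiny floor value."""
    p = 1.0
    for pair in word_pairs:
        p *= t_prob.get(pair, 1e-6)
    for fert in fertilities:
        p *= fert_prob.get(fert, 1e-6)
    for jump in jumps:
        p *= dist_prob.get(jump, 1e-6)
    return p

# Score a toy alignment: casa<->house, fertility 1, two near-diagonal jumps.
print(score_alignment([("casa", "house")], [(1, "house")], [(4, 4), (5, 5)]))
```

Diagonal-following jumps like (4, 4) and (5, 5) get higher distortion probability than a jump like (5, 2), so monotone alignments score better, as the next slide suggests.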

Distortion. Encourage translations to follow the diagonal: Pd(4 | 4) * Pd(5 | 5) * ...

Learning Translations. Use a huge corpus of aligned sentences, e.g. the Europarl Corpus of European Parliament proceedings. The EU is mandated to translate into all 21 official languages, (semi-)aligned to each other. P(casa | house): count all casa/house pairs! Pd(5 | 2): count all sentences where the 2nd word went to the 5th word.
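The counting described above can be sketched as a simple maximum-likelihood estimate over word-aligned pairs; the tiny "corpus" of alignments below is invented for illustration:

```python
from collections import Counter

# Invented word-aligned pairs (Spanish word, English word).
aligned_pairs = [
    ("casa", "house"),
    ("casa", "house"),
    ("hogar", "house"),
    ("perro", "dog"),
]

pair_counts = Counter(aligned_pairs)            # count(f, e)
eng_counts = Counter(e for _, e in aligned_pairs)  # count(e)

def t_prob(f, e):
    """P(f | e) = count(f, e) / count(e)."""
    return pair_counts[(f, e)] / eng_counts[e]

print(t_prob("casa", "house"))  # "house" aligned to "casa" in 2 of 3 cases
```

The same counting pattern estimates Pd: tally how often position i maps to position j, divided by how often position i appears.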

Machine Translation Technology. Hand-held devices for the military: speak English -> recognition -> translation -> generate Urdu. Translate web documents. Education technology? It doesn't yet receive much focus.

What NLP topics did we miss? Dialogue Systems. Do you think Anakin likes me? I don't care. 12

What NLP topics did we miss? Dialogue Systems. Why? Heavy interest in human-robot communication. UAVs require teams of 5+ people for each operating machine; the goal is to reduce the number of people. Give the computer high-level dialogue commands, rather than low-level system commands. 13

What NLP topics did we miss? Dialogue Systems. Dialogue is a fascinating topic. Not only do we need to understand language, but now also discourse cues: questions require replies; imperatives/commands; acknowledgments ("ok"); back-channels ("uh huh", "mm hmm"). Belief-Desire-Intention (BDI) Model. Beliefs: you maintain a set of facts about the world. Desires: things you want to become true in the world. Intentions: desires that you are taking action on. 14
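A minimal sketch of the BDI bookkeeping described above, with illustrative class and method names (not from any standard BDI library):

```python
class BDIAgent:
    """Toy container for the three BDI components described on the slide."""

    def __init__(self):
        self.beliefs = set()      # facts the agent holds about the world
        self.desires = set()      # states the agent wants to become true
        self.intentions = set()   # desires the agent is actively pursuing

    def adopt_intention(self, desire):
        """Promote a desire to an intention once the agent commits to it."""
        if desire in self.desires:
            self.desires.discard(desire)
            self.intentions.add(desire)

agent = BDIAgent()
agent.beliefs.add("door is closed")
agent.desires.add("door is open")
agent.adopt_intention("door is open")
print(agent.intentions)
```

A real dialogue manager would update beliefs from understood utterances and generate utterances (commands, acknowledgments) in service of its intentions.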

Neural Networks. The field started shifting around 2012, with real movement in just the past two years. Neural networks are now becoming the default type of classifier. 15

Word Embeddings. Our features in this class were largely binary on/off: [he=1, said=1, pizza=0, denied=1, ate=0, ...]. Now we represent a unigram as a vector of real numbers. Just like distributional learning! But: the vector is learned instead of derived from context word counts. 16

Distributional to Embeddings. Distributional word vector vs. word embedding for "cell": [1.47, 0.42, 1.44, 0.53, 0.69, 1.02, 0.42, -0.43]

Word Embeddings as Features. The features are no longer n-grams, but embeddings of the n-grams. Before: [he=1, said=1, pizza=0, denied=1, ate=0, ...]. Now: [1.47, 0.42, 1.44, 0.53, 0.69, 1.02, 0.42, -0.43]. Sum up the word embeddings for each unigram, OR just concatenate them into an even longer vector! 18
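Both feature-building options can be sketched in a few lines; the embedding values below are made up for illustration:

```python
import numpy as np

# Invented 2-dimensional embeddings (real systems use 100-1000 dimensions).
emb = {
    "he": np.array([0.2, -0.1]),
    "said": np.array([0.5, 0.3]),
    "denied": np.array([-0.4, 0.7]),
}

words = ["he", "said", "denied"]

# Option 1: sum the embeddings -> fixed length no matter how many words.
summed = sum(emb[w] for w in words)

# Option 2: concatenate them -> longer vector that preserves word order.
concat = np.concatenate([emb[w] for w in words])

print(summed, concat.shape)
```

Summing gives one 2-dimensional feature vector; concatenating three 2-dimensional embeddings gives a 6-dimensional one.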

Basic Neural Network. We learned multi-class logistic regression (MaxEnt). This is a one-layer neural network! (Diagram: input words "thou hast opened the box quicketh"; output classes Dickens, Austen, Shakespeare.) 19

Basic Neural Network. Do several regressions at once; decide later. New hidden layer!!! (Diagram: input words "thou hast opened the box quicketh"; hidden layer; output classes Dickens, Austen, Shakespeare.) 20

Basic Neural Network. Do several regressions at once; decide later. (Diagram: word embeddings as input; hidden layer; output classes Dickens, Austen, Shakespeare.) 21
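The contrast between the one-layer (MaxEnt) network and the version with a hidden layer can be sketched as a forward pass; the weights are random, so this only shows the shapes involved, not a trained model:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=4)  # input features, e.g. summed word embeddings

def softmax(z):
    """Turn raw scores into a probability distribution over classes."""
    e = np.exp(z - z.max())
    return e / e.sum()

# One-layer network (MaxEnt): inputs map directly to 3 author classes.
W = rng.normal(size=(3, 4))
p_maxent = softmax(W @ x)

# Add a hidden layer: several "regressions" first, then decide on top of them.
W_hidden = rng.normal(size=(5, 4))   # 5 hidden regressions over the inputs
W_out = rng.normal(size=(3, 5))      # final decision over the hidden outputs
hidden = np.tanh(W_hidden @ x)
p_deep = softmax(W_out @ hidden)

print(p_maxent.sum(), p_deep.sum())  # each sums to 1: a distribution
```

Each hidden unit is its own regression over the inputs; the output layer then "decides later" by combining the hidden activations.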

El Fin. Secret 1: I intentionally made some of our labs ambiguous: under-defined tasks with unclear expected results. Secret 2: I tried to teach you skills that have nothing to do with NLP: experimentation and error analysis. Secret 3: I appreciate the hard work you put into the class. 29

What NLP topics did we miss? Unsupervised Learning 31

What NLP topics did we miss? Unsupervised Learning. Most of this semester used data that had human labels. Bootstrapping was our main counterexample: it is mostly unsupervised. Many, many algorithms are being researched to learn language and knowledge without humans, using only text. 32
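A toy bootstrapping loop, in the spirit of the mostly-unsupervised approach mentioned above, might alternate between harvesting contextual patterns from known names and harvesting new names from those patterns; the corpus and seed below are invented:

```python
# Grow a lexicon of city names from one seed, using only raw text.
corpus = [
    "flights to Paris are cheap",
    "flights to Tokyo are booked",
    "trains to Tokyo run daily",
    "trains to Lyon run daily",
]

seeds = {"Paris"}
patterns = set()

for _ in range(3):  # alternate: names -> patterns -> new names
    # Step 1: harvest patterns -- the two words preceding each known name.
    for sent in corpus:
        words = sent.split()
        for i, w in enumerate(words):
            if w in seeds and i >= 2:
                patterns.add((words[i - 2], words[i - 1]))
    # Step 2: harvest new names appearing after any known pattern.
    for sent in corpus:
        words = sent.split()
        for i in range(2, len(words)):
            if (words[i - 2], words[i - 1]) in patterns:
                seeds.add(words[i])

print(sorted(seeds))
```

From the single seed "Paris", the pattern "flights to ___" yields "Tokyo", and "Tokyo" in turn yields the pattern "trains to ___" and the new name "Lyon": knowledge learned from text alone, with no labeled data.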