Natural Language Processing: Introduction
Matthias Gallé @mgalle, Naver Labs Europe
8th January 2018
© 2017 NAVER LABS. All rights reserved.
Natural Language Processing: Definition
"Natural language processing (NLP) is a collective term referring to automatic computational processing of human languages."
— Yoav Goldberg, Neural Network Methods for NLP
vs Computational Linguistics
"scientific study of language from a computational perspective" [aclweb.org]
Like any "computational X": use computers to understand X
In general: more focused on understanding/supporting a theory than on solving a problem
Natural vs Formal languages
https://xkcd.com/1090
[audience looks around] 'What just happened?' 'There must be some context we're missing.'
Noam Chomsky
- Key figure in turning linguistics into a science
- Focused on the structure of human language
- Created the link with mathematical models & learnability
- His results are used every time you use formal grammars
(also my academic great-great-great-great-grandfather)
Why natural language is hard
"It's difficult to extract sense from strings, but they're the only communication coin we can count on." — Alan Perlis (first Turing Award recipient)
Within a computer, natural language is unnatural.
We all know NLP is not "solved"
Why text is hard
- Ambiguous: "Alice saw Bob using the telescope"
- Exceptions: try to understand the gender of ships
- Noisy: "OMG did U c how teen wrte dez days? LOL pizza coke.."
- Contextualized: "It was cold!" (https://xkcd.com/1576/)
- Coreference/Synonyms: "The US president greeted the German prime minister. She critiqued his stand on."
- Common-sense: "The trophy would not fit in the brown suitcase because it was too big [small]"
- Evolves: 1000 new words added to the Oxford Dictionary each year [https://blog.oxforddictionaries.com/august-2013-update/]
ML methods in NLP
1. Rule-based models (inference rules, planning) [1960s-1990s]: hard to adapt (to noise or new environments)
2. Data-driven, probabilistic linear models [1990s-2014]: feature engineering
3. Learned features [2014-]: non-convex models
To know more: https://video.ias.edu/machinelearning/2017/1115-christophermanning
Neural Networks in NLP: early successes
- Yoshua Bengio et al. "A neural probabilistic language model." JMLR 2003
- James Henderson. "Inducing history representations for broad coverage statistical parsing." NAACL 2003
- Ronan Collobert and Jason Weston. "A unified architecture for natural language processing: Deep neural networks with multitask learning." ICML 2008
Impact of DL in NLP: at least not so clear (yet)
- Smart feature engineering often (still) outperforms DL: Chen, Danqi, Jason Bolton, and Christopher D. Manning. "A thorough examination of the CNN/Daily Mail reading comprehension task." ACL 2016
- Parts of traditional ML are still used: Huang, Zhiheng, Wei Xu, and Kai Yu. "Bidirectional LSTM-CRF models for sequence tagging." arXiv 2015; Lample, Guillaume, et al. "Neural Architectures for Named Entity Recognition." NAACL 2016
https://qz.com/1034972/the-data-that-changed-the-direction-of-ai-researchand-possibly-the-world/
Impact of DL in NLP
1. Wide adoption of continuous representations
   - Mikolov, Chen, Corrado, and Dean. "Efficient estimation of word representations in vector space." ICLR Workshop 2013
   - Pennington, Jeffrey, Richard Socher, and Christopher Manning. "GloVe: Global vectors for word representation." EMNLP 2014
2. Encoder/decoder framework (sequence models without the Markovian assumption) + differentiable extensions
   - Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." NIPS 2014
3. Removal of feature engineering, replaced by architecture engineering: the classical pipeline (text → tokenization → lemmatization → POS-tagging → NER → classification) is replaced by a neural network that learns higher-level representations for prediction
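The key operation behind continuous representations is comparing words by the angle between their vectors. A minimal sketch with made-up 3-d "embeddings" (real word2vec/GloVe vectors are trained and have 100-300 dimensions):

```python
import numpy as np

# Toy vectors, invented for illustration only; trained embeddings
# place related words close together in the vector space.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.7, 0.3]),
    "pizza": np.array([0.1, 0.2, 0.9]),
}

def cosine(u, v):
    """Cosine similarity: the standard closeness measure for embeddings."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(cosine(emb["king"], emb["queen"]))  # high: related words
print(cosine(emb["king"], emb["pizza"]))  # low: unrelated words
```

The same measure underlies analogy tasks (king - man + woman ≈ queen) popularized by the word2vec papers cited above.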
This course: Boundaries
Assuming you know about:
- Basics of optimization (SGD) & linear algebra (SVD)
- Basics of supervised learning (logistic regression, regularization)
- Python
Limits:
- No spoken language (neither ASR nor TTS)
- No multi-modality
- No OCR
Only an introduction!
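As a reminder of the assumed background, a minimal sketch of SGD for L2-regularized logistic regression on invented toy data (numpy only; hyperparameters are illustrative, not tuned):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)  # linearly separable toy labels

w, b = np.zeros(2), 0.0
lr, lam = 0.1, 1e-3                         # learning rate, L2 strength
for epoch in range(20):
    for i in rng.permutation(len(X)):       # one example at a time = SGD
        p = 1.0 / (1.0 + np.exp(-(X[i] @ w + b)))  # sigmoid
        g = p - y[i]                        # gradient of log-loss w.r.t. logit
        w -= lr * (g * X[i] + lam * w)      # SGD step + L2 penalty
        b -= lr * g

acc = ((1.0 / (1.0 + np.exp(-(X @ w + b))) > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```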
This course: Approach
Problem-specific:
- Language modelling
- Representation of words & documents
- Part-of-speech (POS) tagging
- Named entity recognition
- Parsing
- Social media analysis
- Machine translation
- Question answering
- Dialogue
Methods will be introduced as needed, trying to strike a balance between historical importance and modern approaches.
This course: Lecturers Matthias Gallé 8, 15, 22/01 Salah Ait-Mokhtar 29/01 Caroline Brun 5/02 Marc Dymetman 26/02 Julien Perez 5, 12/03 <firstname>.<lastname>@naverlabs.com
This course: Evaluation
3 programming exercises:
- word embeddings: mid-January
- information extraction: beginning of February
- seq2seq: beginning of March
Recommended language: Python
This course: Bibliography
- MAIN: Speech and Language Processing. Jurafsky & Martin. 2017 (3rd edition, ongoing) https://web.stanford.edu/~jurafsky/slp3/
- Neural Network Methods in Natural Language Processing. Y. Goldberg. 2017
© 2017 NAVER Corp.
NAVER LABS Europe: Artificial Intelligence · Computer Vision · Machine Learning & Optimization · Natural Language Processing · Knowledge and Processes · Geospatial Data Science · UX and Ethnography
Internships: http://www.europe.naverlabs.com/naver-labs-europe/internships
Jobs: http://www.europe.naverlabs.com/naver-labs-europe/jobs