Deep Learning for Educational Innovations
Yuchi Huang {yuchi.huang@act.org}
ACTNext
October 4th, 2018
Outline
- From AI to Machine Learning to Deep Learning
- Why we need Deep Learning (DL)
- Different Deep Learning models
- Deep Learning in educational applications
From AI to ML to DL
Artificial Intelligence
- Dartmouth Summer Workshop on Artificial Intelligence (proposed in 1955, held in 1956)
- General AI: machines capable of sensing and reasoning that think just like we do
- Narrow AI: technologies that can perform specific tasks as well as, or better than, humans
- Even Narrow AI was mostly out of reach with early machine learning approaches
Most effort in early ML (esp. before 2010)
- Pipeline: input signal -> preprocessing -> feature selection & extraction -> inference, prediction, recognition
- Feature selection & extraction: most critical for accuracy, most time-consuming in the development cycle, often hand-crafted
Limitations of hand-crafted features
- Low adaptability to various applications, which limits performance
- How do we hand-engineer features for new domains (Kinect, video, multispectral)?
- Feature computation time is getting prohibitive for large datasets (several sec/image)
- Instead of designing features, can we train an end-to-end system (a parameterized function) in which features are extracted and learned efficiently and implicitly?
Why not shallow learning?
- Shallow models: kernel learning, boosting
- BAD: a shallow model may require an exponential number of templates
Why do we need deep structure?
- Function composition is at the core of deep learning methods
- The composition makes a highly non-linear system
- Each simple function has parameters subject to training
- GOOD: (exponentially) more efficient, because intermediate computations can be re-used and distributed representations are shared across classes
Biological inspiration
- The visual cortex is hierarchical
- The brain uses billions of slow and unreliable processors (neurons) acting in parallel
- Thousands of incoming connections per neuron
[Figure: a biological neuron (dendrites, axon, terminal branches of axon) next to an artificial network (input nodes, connections with weights, hidden nodes (neurons), output nodes)]
Output of neuron i given inputs x_1, ..., x_m:
$y_i = f(w_{i1} x_1 + w_{i2} x_2 + w_{i3} x_3 + \dots + w_{im} x_m + b) = f\big(\sum_{j} w_{ij} x_j + b\big) = f(X, W, b)$
f could be: sigmoid, tanh, rectified linear
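The weighted-sum-plus-activation formula above can be sketched in a few lines of plain Python. This is a minimal illustration, not code from the talk: the names layer_forward, sigmoid and relu and all numbers are made up for the example, and one bias per neuron is assumed.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Rectified linear activation
    return max(0.0, z)

def layer_forward(x, W, b, f):
    """Return [f(sum_j W[i][j] * x[j] + b[i])] for every neuron i."""
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]

# Hypothetical example: m = 3 inputs feeding 2 neurons.
x = [1.0, 2.0, 3.0]
W = [[0.5, -1.0, 0.25],   # weights of neuron 0
     [-0.5, 0.0, 0.5]]    # weights of neuron 1
b = [1.0, 0.0]            # assumed: one bias per neuron

y_relu = layer_forward(x, W, b, relu)        # pre-activations are 0.25 and 1.0
y_tanh = layer_forward(x, W, b, math.tanh)
y_sigmoid = layer_forward(x, W, b, sigmoid)
```

Swapping f changes only the non-linearity; the weighted sum is the same for all three choices.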
Deep Learning
- Parameters: weights and biases in all layers
- Learning consists of minimizing the loss w.r.t. the parameters over the whole training set
What is Deep Learning:
- Cascade of non-linear transformations
- End-to-end learning
- Distributed representations
- Compositionality
Deep neural networks contain a large number of neurons, each a simple non-linear function, which can be computed in a distributed or parallel fashion, e.g. on a GPU (Graphics Processing Unit).
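To make "minimizing the loss w.r.t. parameters over the whole training set" concrete, here is a minimal sketch of batch gradient descent. It assumes the simplest possible model (one weight w and one bias b, mean squared error), which is far smaller than a deep network; the function name, data, learning rate and step count are all illustrative choices.

```python
def train_linear(xs, ys, lr=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of L(w, b) = (1/n) * sum_k (w * x_k + b - y_k)^2
        grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Hypothetical training set generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = train_linear(xs, ys)
```

In a deep network the same update is applied to the weights and biases of every layer, with the gradients obtained by backpropagation.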
The DL Boom around 2010
The deep learning family is BIG
- Convolutional Neural Networks (CNN): GoogLeNet, residual networks
- Generative Adversarial Networks (GAN)
- Recurrent Neural Networks (RNN): LSTM
- Deep Reinforcement Learning: AlphaGo
Convolutions in CNN
- Convolution with a kernel
- Convolution with multiple kernels
- These kernels are learned during training
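A minimal sketch of what "convolution with a kernel" computes, written in plain Python for readability. The image and the two kernels are hand-picked for illustration; in a CNN the kernel values would be learned. As in most deep learning libraries, the kernel is not flipped, so strictly this is cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

# Hypothetical 3x4 image: dark left half, bright right half.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]

vertical_edge = [[-1, 1],
                 [-1, 1]]
horizontal_edge = [[1, 1],
                   [-1, -1]]

# One kernel gives one feature map.
feature_map = conv2d_valid(image, vertical_edge)

# Multiple kernels give one feature map per kernel.
feature_maps = [conv2d_valid(image, k) for k in (vertical_edge, horizontal_edge)]
```

The vertical-edge kernel responds only in the column where the image goes from dark to bright, while the horizontal-edge kernel stays at zero because the image has no horizontal edge.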
Applications of CNN
- Sensing what, where, and when from visual, acoustic, text, and depth data
Recurrent Networks
- Variants of the RNN architecture: long short-term memory (LSTM) or gated recurrent unit (GRU)
- Attention model
Applications of RNN
- Applications: word/sentence completion, translation, time series prediction, and image captioning, among others
- The input is a sequence; the output can be a single unit (e.g. predicting the next movement of a human) or a sequence (e.g. translation: sequence of words -> sequence of words)
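The sequence-in, single-unit-or-sequence-out behaviour can be sketched with a vanilla recurrent cell. This is a simplified illustration, not an LSTM or GRU: the update rule h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h) and all names and weights below are assumptions chosen for the example.

```python
import math

def rnn_forward(inputs, W_xh, W_hh, b_h):
    """Run a vanilla RNN over a sequence and return every hidden state."""
    hidden_size = len(b_h)
    h = [0.0] * hidden_size          # initial hidden state
    states = []
    for x in inputs:
        # The new state mixes the current input with the previous state.
        h = [math.tanh(sum(W_xh[i][k] * x[k] for k in range(len(x)))
                       + sum(W_hh[i][j] * h[j] for j in range(hidden_size))
                       + b_h[i])
             for i in range(hidden_size)]
        states.append(h)
    return states

# Hypothetical sequence of three 1-dimensional inputs, hidden size 2.
inputs = [[1.0], [0.0], [-1.0]]
W_xh = [[0.5], [-0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]

states = rnn_forward(inputs, W_xh, W_hh, b_h)
sequence_output = states       # one state per step (sequence -> sequence)
single_output = states[-1]     # last state only (sequence -> single unit)
```

An output layer would normally map each hidden state to a word or a value; it is left out here to keep the recurrence itself visible.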
Deep Reinforcement Learning
- Reinforcement learning (RL) is about an agent interacting with the environment and learning, by trial and error, an optimal policy for sequential decision making
- Deep neural networks + reinforcement learning = deep reinforcement learning
- Future of AI: board games such as Go and chess (AlphaGo, AlphaZero), computer games, self-driving cars, robotics
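The agent-environment loop can be illustrated with tabular Q-learning on a toy problem. This is a sketch under stated assumptions: the 5-cell corridor environment, the reward of 1 at the goal, and all hyperparameters are invented for the example, and a table replaces the deep network that deep RL would use to approximate Q.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # action 0 = step left, action 1 = step right

def step(state, action):
    """Environment: move along the corridor; reward 1 on reaching the goal."""
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Trial and error: act uniformly at random (Q-learning is off-policy).
            a = rng.randrange(2)
            next_state, reward, done = step(state, ACTIONS[a])
            # Move Q(state, a) toward reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
    return q

q = q_learning()
# Greedy policy for the non-terminal cells: index of the best action.
policy = [max(range(2), key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
```

After training, the greedy policy picks "step right" (action 1) in every non-terminal cell, which is the shortest route to the goal.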
Generative Adversarial Networks (GANs)
- Generator G: turns random noise into imitations of the real data, in an attempt to fool D
- Discriminator D: distinguishes genuine data from forgeries created by G
- The two networks compete with each other!
- Conditional information can be added to both G and D
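The competition between G and D is commonly written as a two-player minimax game. The formula below is the standard objective from the original GAN paper (Goodfellow et al., 2014) rather than something shown in this deck; p_data (the data distribution) and p_z (the noise distribution) are notation introduced here.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_z(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

D is trained to push V up by scoring real data high and forgeries low, while G is trained to push V down by making D(G(z)) large. In the conditional variant, the extra information c is fed to both networks, so D(x) becomes D(x | c) and G(z) becomes G(z | c).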
Open source libraries
Reference and computational resources
- A full reading list: http://deeplearning.net/reading-list/
- Evaluation of deep learning toolkits: https://github.com/zer0n/deepframeworks/blob/master/readme.md
- Tutorials: http://deeplearning.net/tutorial/ http://deeplearning.stanford.edu/tutorial/ https://www.tensorflow.org
- High-end computers with decent CPU, RAM, and GPU
- Online deep learning platforms: AWS deep learning instances, Google Cloud AI
Limitations
- Main criticism: the lack of theory surrounding many of the methods
  - Most of the learning is just some form of gradient descent
  - Often looked at as a black box, with most confirmations done empirically
- Lack of mechanisms for complex reasoning, search, and inference
  - How to generate structured predictions (a long text, or a label map)?
- Lack of memory
  - Some applications require a way to store isolated facts (natural language understanding)
  - LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNN: far from mature
- Lack of the ability to perform unsupervised learning
  - Animals and humans learn the perceptual world in an unsupervised manner
Deep Learning in educational applications
- Facial expression generation in dyadic interactions: Generative Adversarial Networks
- Facial biometrics for test centers: Convolutional Neural Networks
- Generation of micro multimodal content (videos): Convolutional Neural Networks, Recurrent Networks
- Automatic passage generation: Convolutional Neural Networks
Facial Expression Generation in Dyadic Interactions
- Given the facial expressions of humans, generate facial expressions of agents
- Applications: autonomous realistic avatars for interviews
Generate dynamic facial expression for one agent
In International Conference on Multimodal Interaction 2018
[Figure: one panel per interviewee expression (Joy, Anger, Surprise, Fear, Contempt, Disgust, Sad, Neutral), each followed by the generated result]
Generate facial expression for multiple agents
In British Machine Vision Conference 2018
Facial biometrics for test centers
- Test-taking fraud (cheating) happens at all levels of testing
- Solution: identity verification based on deep-learning face and speech recognition
- With an Equal Error Rate (EER) of 5.6% on a tester dataset, our algorithm outperforms a third-party face recognition system (EER of 7.4%)
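For readers unfamiliar with the metric: the Equal Error Rate is the operating point where the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below approximates it by sweeping a threshold over the observed scores. The function name and the score lists are hypothetical and unrelated to the 5.6% and 7.4% figures reported above.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A comparison is accepted when its score >= threshold.
    FAR = fraction of impostor scores accepted.
    FRR = fraction of genuine scores rejected.
    """
    best_gap, eer = None, None
    for threshold in sorted(set(genuine_scores) | set(impostor_scores)):
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2.0
    return eer

# Hypothetical similarity scores (higher = more likely the same person).
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]
eer = equal_error_rate(genuine, impostor)   # 0.2 for these made-up scores
```

A lower EER means the verifier separates genuine and impostor comparisons better, which is why it suits a system-to-system comparison like the one on this slide.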
Generation of micro multimodal content (videos)
- Content is key: it promotes engagement, increases interaction, and boosts efficacy
- But developing good content at scale is difficult
- Use AI/machine learning to generate effective content from existing material
Scale up generation of educational video content for precision learning
[Figure: pipeline. Video segmentation -> archive of atomic construct video clips -> semantic ordering (based on text captions and visual features) + visual effect alignment. Example video titles shown: "Wild Africa", "Animals of Africa", "Natural scenery of Africa"]
Video Segmentation
- S: sentences of the video caption
- Caption text features (e.g. word2vec embeddings)
- Visual features extracted from frames (e.g. CNN-based features)
[Figure: caption sentences S1 ... S6 with outputs p1 ... p5]
Reference: Omri Koshorek et al., "Text Segmentation as a Supervised Learning Task", 2018
(Semi-) Automated Passage Generation (APG/SAPG)
Goals:
- Help writers create testing passages more efficiently
- Provide adaptively searched and summarized material to learners
APG/SAPG - Framework
- Searching related passages
- Passage summarization (extractive, abstractive)
- Semantic ordering and integration
[Figure: multi-source passages -> extractive summarization -> coherence measuring of extracted sentences -> merge & order -> paraphrasing (abstractive)]
Reference: Rada Mihalcea, "TextRank: Bringing Order into Texts"
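The extractive summarization step can be illustrated with a small TextRank-style sentence ranker. This is a sketch of the general idea in the cited TextRank paper (sentences as graph nodes, word overlap as edge weights, PageRank-like scoring), not the APG/SAPG implementation. The whitespace tokenizer, the example sentences, and the function names are assumptions made for the example.

```python
import math

def similarity(words_a, words_b):
    """Word overlap between two sentences, normalised by their log lengths."""
    denom = math.log(len(words_a)) + math.log(len(words_b))
    if denom <= 0.0:
        return 0.0   # guard for one-word sentences
    return len(set(words_a) & set(words_b)) / denom

def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    tokens = [s.lower().split() for s in sentences]   # crude tokenizer
    n = len(sentences)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        # Each sentence receives score from neighbours in proportion to edge weight.
        scores = [(1 - damping) + damping * sum(
                      weights[j][i] / out_sum[j] * scores[j]
                      for j in range(n) if out_sum[j] > 0)
                  for i in range(n)]
    return scores

# Hypothetical passages: three on one topic, one unrelated.
sentences = [
    "Lions hunt zebras on the African savanna",
    "Zebras graze on the savanna grass",
    "Lions and zebras share the savanna",
    "The stock market closed higher today",
]
scores = textrank(sentences)
ranking = sorted(range(len(sentences)), key=lambda i: -scores[i])
summary = [sentences[i] for i in ranking[:2]]   # keep the two top-ranked sentences
```

The off-topic sentence shares almost no words with the others, so it ends up with the lowest score and is dropped from the extract.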
Seq2seq Model
- Model: LSTM + Attention
- Old sentence: "Why nobody answer my questions?"
- New sentence: "Why does no one response to my questions?"
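The attention part of an LSTM + Attention model can be sketched as follows. The slide does not say which attention variant is used, so this example assumes simple dot-product attention; the vectors stand in for LSTM hidden states and are made up for illustration.

```python
import math

def softmax(values):
    """Turn arbitrary scores into positive weights that sum to 1."""
    m = max(values)                        # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attend(decoder_state, encoder_states):
    """Dot-product attention: weight encoder states by similarity to the decoder state."""
    scores = [sum(d * e for d, e in zip(decoder_state, enc))
              for enc in encoder_states]
    weights = softmax(scores)
    # Context vector: weighted average of the encoder states.
    context = [sum(w * enc[k] for w, enc in zip(weights, encoder_states))
               for k in range(len(decoder_state))]
    return weights, context

# Hypothetical encoder states, one per word of the old sentence.
encoder_states = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
decoder_state = [2.0, 0.0]   # state of the decoder while producing one new word

weights, context = attend(decoder_state, encoder_states)
```

At each output step the decoder recomputes these weights, so every word of the new sentence can focus on a different part of the old sentence.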
Sum Up
- Deep learning is a powerful tool in machine learning, producing the best results in most sub-fields of applied machine learning
- Deep learning has not yet been widely used in education; opportunities include creating new/smart content, personalized/customized learning, supporting teachers, virtual lecturers and learning environments, and automating administrative tasks
- Deep learning is far from mature: it works, but it lacks theory
Thanks! yuchi.huang@act.org