Deep Learning for Educational Innovations. Yuchi Huang, ACTNext, October 4th, 2018

1 Deep Learning for Educational Innovations. Yuchi Huang, ACTNext, October 4th, 2018

2 Outline
- From AI to Machine Learning to Deep Learning
- Why we need Deep Learning (DL)
- Different Deep Learning models
- Deep Learning in educational applications

3 From AI to ML to DL

4 Artificial Intelligence
- Dartmouth Summer Workshop on Artificial Intelligence (proposed in 1955, held in 1956)
- General AI: machines capable of sensing and reasoning that think just like we do
- Narrow AI: technologies that are able to perform specific tasks as well as, or better than, humans can
- Even Narrow AI was mostly out of reach with early machine learning approaches

5 Most efforts in early ML (esp. before 2010)
- Pipeline: input signal -> preprocessing -> feature selection & extraction -> inference, prediction, recognition
- Feature selection & extraction: most critical for accuracy, most time-consuming in the development cycle, often hand-crafted

6 Limitations of hand-crafted features
- Low adaptability to various applications, which limits performance
- How to hand-engineer features for new domains (Kinect, video, multispectral)?
- Feature computation time is getting prohibitive for large datasets (several sec/image)
- Instead of designing features, can we train an end-to-end system (a parameterized function) in which features are extracted and learnt efficiently and implicitly?

7 Why not shallow learning?
- Shallow models: kernel learning, boosting
- BAD: may require an exponential number of templates

8 Why do we need deep structure?
- Function composition is at the core of deep learning methods
- The composition makes a highly non-linear system
- Each simple function has parameters subject to training
- GOOD: (exponentially) more efficient, because intermediate computations can be re-used and distributed representations are shared across classes
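
As a sketch of what function composition means for a network, an L-layer model can be written as nested simple functions, each with its own trainable parameters. The symbols f_l, theta_l, and L are notation assumed here for illustration; they do not appear on the slides.

```latex
y = f_L\bigl(f_{L-1}(\cdots f_2(f_1(x;\theta_1);\theta_2)\cdots;\theta_{L-1});\theta_L\bigr)
```

The output of each inner function is an intermediate computation that every later layer can re-use.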

9 Biological inspiration
- The visual cortex is hierarchical
- The brain uses billions of slow and unreliable processors (neurons) acting in parallel
- Thousands of incoming connections per neuron
- Neuron diagram labels: dendrites, axon, terminal branches of axon
- Neuron output:
  $$y_i = f(w_{i1} x_1 + w_{i2} x_2 + w_{i3} x_3 + \dots + w_{im} x_m + b_i) = f\Bigl(\sum_{j=1}^{m} w_{ij} x_j + b_i\Bigr) = f(X, W, b)$$
- f could be: sigmoid, tanh, rectified linear
- Network diagram labels: input nodes, connections (with weights), hidden nodes (neurons), output nodes
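
The single-neuron formula on this slide can be sketched in a few lines of plain Python. The function names and the example inputs, weights, and bias below are assumptions chosen for illustration (picked so that the weighted sum is exactly zero); they are not from the slides.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectified linear activation."""
    return max(0.0, z)


def neuron(x, w, b, f):
    """Return f(sum_j w[j] * x[j] + b) for one neuron."""
    if len(x) != len(w):
        raise ValueError("x and w must have the same length")
    z = sum(wj * xj for wj, xj in zip(w, x)) + b
    return f(z)


# Hypothetical inputs, weights, and bias: 0.5 - 2.0 + 0.75 + 0.75 = 0.0
x = [1.0, 2.0, 3.0]
w = [0.5, -1.0, 0.25]
b = 0.75

print(neuron(x, w, b, sigmoid))    # sigmoid of the weighted sum
print(neuron(x, w, b, math.tanh))  # tanh of the weighted sum
print(neuron(x, w, b, relu))       # rectified linear of the weighted sum
```

Swapping `f` changes only the non-linearity; the weighted sum is the same in all three calls.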

10 Deep Learning
- Parameters: weights and biases in all layers
- Learning consists of minimizing the loss w.r.t. the parameters over the whole training set
- What is Deep Learning: a cascade of non-linear transformations, end-to-end learning, distributed representations, compositionality
- Deep Neural Networks contain a large number of neurons, which can be computed in a distributed or parallel fashion, e.g. on a GPU (Graphics Processing Unit)
- Each neuron is a simple non-linear function
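
To make "minimizing the loss w.r.t. the parameters over the whole training set" concrete, here is a minimal sketch of full-batch gradient descent. It assumes a single linear unit with a mean-squared-error loss and a tiny made-up dataset; a deep network would apply the same loop to many more parameters, with gradients obtained by backpropagation.

```python
def mse_loss(w, b, data):
    """Mean squared error of the model y = w * x + b over the whole dataset."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def gradient_step(w, b, data, lr):
    """One full-batch gradient descent update of (w, b)."""
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return w - lr * grad_w, b - lr * grad_b


def train(data, lr=0.05, steps=2000):
    """Start from zero parameters and repeat the update a fixed number of times."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        w, b = gradient_step(w, b, data, lr)
    return w, b


# Hypothetical training set generated from y = 2x + 1
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
print(round(w, 3), round(b, 3), mse_loss(w, b, data))
```

The learning rate and step count are illustrative choices that happen to converge on this dataset; they are not recommendations.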

11 The DL Boom around 2010

12 Family of deep learning: BIG
- Convolutional Neural Networks (CNN): GoogLeNet, Residual networks
- Generative Adversarial Networks (GAN)
- Recurrent Neural Networks (RNN): LSTM
- Deep Reinforcement Learning: AlphaGo

13 Convolutions in CNN
- Convolution with a kernel
- Convolution with multiple kernels
- These kernels are learned during training
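
A minimal sketch of convolving an image with one kernel and with several kernels follows. It uses the sliding-window form without kernel flipping (cross-correlation), which is how common deep learning libraries typically implement their convolution layers. The image and the two edge kernels are hand-set assumptions for illustration; in a CNN the kernel values would be learned.

```python
def conv2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1) and sum products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        out.append(row)
    return out


def conv2d_multi(image, kernels):
    """Apply several kernels to the same image, giving one feature map each."""
    return [conv2d(image, k) for k in kernels]


# Hypothetical 3x4 image: dark left half, bright right half
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
vertical_edge = [[-1, 1], [-1, 1]]     # responds to left-to-right brightness change
horizontal_edge = [[-1, -1], [1, 1]]   # responds to top-to-bottom brightness change

feature_maps = conv2d_multi(image, [vertical_edge, horizontal_edge])
print(feature_maps[0])  # strongest response in the column where brightness changes
print(feature_maps[1])  # no response: the image has no horizontal edge
```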

14 Applications of CNN
- Sensing what, where, and when from visual, acoustic, text, and depth data

15 Recurrent Networks
- Variants of the RNN architecture: long short-term memory (LSTM) or gated recurrent unit (GRU)
- Attention model

16 Applications of RNN
- Applications: word/sentence completion, translation, time series prediction, and image captioning, among others
- Input is a sequence; output can be a single unit (e.g. predicting the next movement of a human) or a sequence (e.g. translation: sequence of words -> sequence of words)
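
The sequence-in, single-output or sequence-out behaviour can be sketched with a plain (Elman-style) recurrent cell. This is a simplified stand-in, not an LSTM or GRU, and the weight matrices, sizes, and input sequence are hypothetical values chosen for illustration.

```python
import math


def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):
    """One recurrent update: h_t = tanh(W_xh x_t + W_hh h_prev + b_h)."""
    h_t = []
    for k in range(len(b_h)):
        z = b_h[k]
        z += sum(W_xh[k][i] * x_t[i] for i in range(len(x_t)))
        z += sum(W_hh[k][j] * h_prev[j] for j in range(len(h_prev)))
        h_t.append(math.tanh(z))
    return h_t


def run_rnn(sequence, W_xh, W_hh, b_h):
    """Feed a sequence through the cell and return the hidden state at every step."""
    h = [0.0] * len(b_h)
    states = []
    for x_t in sequence:
        h = rnn_step(x_t, h, W_xh, W_hh, b_h)
        states.append(h)
    return states


# Hypothetical weights: 1 input feature, 2 hidden units
W_xh = [[0.5], [-0.5]]
W_hh = [[0.1, 0.0], [0.0, 0.1]]
b_h = [0.0, 0.0]
sequence = [[1.0], [0.0], [-1.0]]

states = run_rnn(sequence, W_xh, W_hh, b_h)
last_state = states[-1]  # sequence -> single output (e.g. predict the next item)
all_states = states      # sequence -> sequence (one output per input step)
print(last_state)
```

The same weights are applied at every time step; only the hidden state carries information forward.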

17 Deep Reinforcement Learning
- Reinforcement learning (RL) is about an agent interacting with the environment and learning an optimal policy, by trial and error, for sequential decision making
- Deep reinforcement learning = the combination of deep neural networks and reinforcement learning
- Future of AI: board games such as Go and chess (AlphaGo, AlphaZero), computer games, self-driving cars, robotics
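
As a rough illustration of learning a policy by trial and error, the sketch below runs tabular Q-learning on a made-up five-state corridor. The environment, reward, and hyperparameters are all assumptions for illustration. It uses a lookup table where deep reinforcement learning would use a neural network to approximate the same action values.

```python
import random

N_STATES = 5      # states 0..4; state 4 is the terminal goal
ACTIONS = (0, 1)  # 0 = move left, 1 = move right


def step(state, action):
    """Hypothetical corridor environment: reward 1.0 only on reaching the goal."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=100, seed=0):
    """Learn action values from random exploration (Q-learning is off-policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _episode in range(episodes):
        state = 0
        for _t in range(max_steps):
            action = rng.choice(ACTIONS)
            nxt, reward, done = step(state, action)
            # Bootstrapped target: immediate reward plus discounted best next value
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


q = q_learning()
# Greedy policy for the non-terminal states: pick the action with the larger value
policy = [max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The agent behaves randomly while learning, yet the greedy policy read from the table afterwards heads for the goal.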

18 Generative Adversarial Networks (GANs)
- Generator G: turns random noise into imitations of the real data, in an attempt to fool D
- Discriminator D: distinguishes genuine data from forgeries created by G
- The two networks compete with each other
- Conditional information can be added to G and D
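
The competition between the two networks is usually written as a two-player minimax game. The formulation below is the standard GAN objective; the symbols p_data (data distribution), p_z (noise distribution), and V are notation assumed here and do not appear on the slide.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\bigl[\log D(x)\bigr]
  + \mathbb{E}_{z \sim p_z(z)}\bigl[\log\bigl(1 - D(G(z))\bigr)\bigr]
```

D is trained to push both terms up (label real data as real and forgeries as fake), while G is trained to push the second term down by making D(G(z)) large.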

19 Open source libraries

20 Reference and computational resources
- A full reading list:
- Evaluation of deep learning toolkits
- Tutorial:
- High-end computers with decent CPU, RAM, GPU
- Online deep learning platforms: AWS deep learning instances, Google Cloud AI

21 Limitations
- Main criticism: the lack of theory surrounding many of the methods
  - Most of the learning is just some form of gradient descent
  - Often looked at as a black box, with most confirmations done empirically
- Lack of mechanisms for complex reasoning, search, and inference
  - How to generate structured predictions (a long text, or a label map)?
- Lack of memory
  - Some applications require a way to store isolated facts (natural language understanding)
  - LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNN: far from mature
- Lack of the ability to perform unsupervised learning
  - Animals and humans learn the perceptual world in an unsupervised manner

22 Deep Learning in educational applications
- Facial expression generation in dyadic interactions: Generative Adversarial Networks
- Facial biometrics for test centers: Convolutional Neural Networks
- Generation of micro multimodal content (videos): Convolutional Neural Networks, Recurrent Networks
- Automatic passage generation: Convolutional Neural Networks

23 Facial Expression Generation in Dyadic Interactions
- Given the facial expressions of humans, generate facial expressions of agents
- Applications: autonomous realistic avatars for interviews

24 Generate dynamic facial expression for one agent
- Figure: agent expressions generated in response to interviewee joy, anger, surprise, fear, contempt, disgust, sadness, and neutral expressions
- In International Conference on Multimodal Interaction 2018

25 Generate facial expressions for multiple agents
- In British Machine Vision Conference 2018

26 Facial biometrics for test centers
- Test-taking fraud (cheating) happens at all levels of testing
- Solution: identity verification based on deep-learning face and speech recognition
- With an Equal Error Rate (EER) of 5.6% on a tester dataset, our algorithm outperforms a third-party face recognition system (which has an EER of 7.4%)
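
For readers unfamiliar with the metric, the Equal Error Rate is the operating point where the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The sketch below estimates it by sweeping thresholds over similarity scores. The scores are made up for illustration and have nothing to do with the dataset or systems reported on the slide.

```python
def error_rates(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted as a match."""
    frr = sum(s < threshold for s in genuine) / len(genuine)
    far = sum(s >= threshold for s in impostor) / len(impostor)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds and return (EER estimate, threshold).

    The estimate is taken where |FAR - FRR| is smallest; with few scores the
    two rates may not cross exactly, so their average is reported.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = error_rates(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Hypothetical similarity scores (higher = more likely the same person)
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]

eer, threshold = equal_error_rate(genuine, impostor)
print(eer, threshold)
```

A lower EER means the accept and reject errors balance at a lower rate, which is why it is a convenient single number for comparing two verification systems.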

27 Generation of micro multimodal content (videos)
- Content is key: it promotes engagement, increases interaction, and boosts efficacy
- But developing good content at scale is difficult
- Utilize AI/Machine Learning to generate effective content from existing material

28 Scale up generation of educational video content for precision learning
- Figure labels: Animals of Africa; Natural scenery of Africa; Wild Africa; Video Segmentation; atomic clips; Archive of atomic construct video clips; Semantic Ordering (based on text caption and visual features) + visual effect alignment

29 Video Segmentation
- S: sentences of video caption
- Figure labels: S1 to S6, p1 to p5
- Caption text features (e.g. word2vec embedding)
- Extracted visual features from frames (e.g. CNN-based features)
- Reference: Omri Koshorek et al., "Text Segmentation as a Supervised Learning Task", 2018

30 (Semi) Automated Passage Generation (APG/SAPG)
- Goals:
  - Help writers create testing passages more efficiently
  - Provide adaptively searched and summarized material to learners

31 APG/SAPG - Framework
- Steps: searching related passages; passage summarization; semantic ordering and integration
- Extractive: multi-source passages, extractive summarization, coherence measuring of extracted sentences, merge & order, paraphrasing
- Abstractive: abstractive paraphrasing
- Reference: Rada Mihalcea and Paul Tarau, "TextRank: Bringing Order into Texts"
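
Since the slide cites TextRank for extractive summarization, here is a simplified sketch of the idea: build a graph whose nodes are sentences, weight edges by word overlap, and rank sentences with a weighted PageRank iteration. The tokenizer, iteration count, and example sentences are assumptions for illustration, and this is not presented as the APG/SAPG implementation.

```python
import math
import re


def words(sentence):
    """Lower-cased set of word tokens in a sentence."""
    return set(re.findall(r"\w+", sentence.lower()))


def similarity(a, b):
    """Word overlap normalized by the log lengths of both sentences."""
    wa, wb = words(a), words(b)
    if not wa or not wb:
        return 0.0
    denom = math.log(len(wa)) + math.log(len(wb))
    if denom <= 0:
        return 0.0
    return len(wa & wb) / denom


def textrank(sentences, d=0.85, iterations=50):
    """Score sentences with a weighted PageRank over the similarity graph."""
    n = len(sentences)
    w = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in w]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(w[j][i] / out_sum[j] * scores[j]
                              for j in range(n) if out_sum[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Hypothetical input passage split into sentences
sentences = [
    "Deep learning models learn features from data.",
    "Convolutional networks learn features from images.",
    "Recurrent networks learn features from sequences of data.",
    "The weather was pleasant yesterday.",
]
scores = textrank(sentences)
summary = summarize(sentences, k=2)
print(summary)
```

The off-topic sentence shares no words with the others, so it receives no votes in the graph and is left out of the summary.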

32 Seq2seq Model
- Old sentence: Why nobody answer my questions?
- Model: LSTM + Attention
- New sentence: Why does no one respond to my questions?

33 Sum Up
- Deep learning is a powerful tool in machine learning, producing the best results in most sub-fields of applied machine learning
- Deep learning has not been widely used for education; opportunities include creating new/smart content, personalized/customized learning, supporting teachers, virtual lecturers and learning environments, and the automation of administrative tasks
- Deep learning is far from mature: it works, but lacks theory

34 Thanks!
