Babu Madhav Institute of Information Technology, UTU : Machine Learning


Babu Madhav Institute of Information Technology, UTU 060010907 : Machine Learning 2017

Unit 1. Introduction
1. Define: Machine learning.
2. How is a machine learning algorithm applied in Facebook?
3. Which machine learning algorithm is used in the Xbox?
4. Which machine learning algorithm is used in a robot dog?
5. What do you mean by data exhaust?
6. Draw the chart representing the year-wise data generated in the world.
7. "Shivi, a 1-year-old girl, wants to identify colors." Which kind of learning is used here?
8. "Shivi identified that Ram is a bad boy." Which kind of learning is used here?
9. "Kunal has concluded that Hiten's performance in the laboratory is the finest." Which kind of learning algorithm has Kunal used?
10. "Kunal has concluded that Piyush deserves the trophy out of 100 employees based on 6 parameters." Which kind of learning algorithm has Kunal used?
11. Define: Training data.
12. What is testing data?
13. List any two solutions for handling missing data.
14. Which Python library is used for taking care of missing data?
15. Which Python class is used for taking care of missing data?
16. What do you mean by categorical variables? Give two examples.
17. Which Python library is used for encoding categorical data?
18. Which Python class is used for encoding categorical data?
19. x = 5y + z. Identify the dependent and independent variables.
20. Define: Euclidean distance.
21. List any two feature scaling methods.
22. How is standardization calculated for feature scaling?
23. How is normalization calculated for feature scaling?
24. List any four applications of machine learning.
25. How are machine learning algorithms useful in the following scenarios? Amazon, keyboards, virtual reality headsets, maps.
26. How are machine learning algorithms useful in the following scenarios? Xbox, robot dog, Amazon/Netflix, space.
27. Why is machine learning the future?
28. Using a chart, discuss how machine learning is useful for data scientists.
29. Differentiate supervised and unsupervised learning.
30. Differentiate testing data and test data.
31. How should training and testing data be chosen?
32. Write a Python code snippet for taking care of missing data using the mean of the column.
33. Write a Python code snippet for encoding categorical data.
34. What kind of problem occurs if I encode categorical data as below? How can it be solved?
35. Write a Python code snippet for creating dummy variables for categorical data.
36. Write a Python code snippet for splitting a dataset into training and testing parts.
37. What is feature scaling? Why do we need it?
38. Do we need to apply feature scaling to dummy variables? Why?
39. What is a dummy variable? Why create it? Give one example.
40. Write a Python code snippet for feature scaling.

SAPAN NAIK
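As a study aid for questions 22 and 23 of Unit 1, the two feature scaling calculations can be sketched in plain Python. This is a minimal sketch, not a reference answer: the function names and the `ages` column are hypothetical, and using the population standard deviation is an assumed convention.

```python
from statistics import mean, pstdev


def standardize(values):
    """Standardization: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (assumed convention)
    return [(x - mu) / sigma for x in values]


def normalize(values):
    """Min-max normalization: (x - min) / (max - min), giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    return [(x - lo) / (hi - lo) for x in values]


ages = [20, 30, 40, 50, 60]  # hypothetical sample column
standardized_ages = standardize(ages)  # centred on 0 with unit spread
normalized_ages = normalize(ages)      # rescaled to the range [0, 1]
```

Standardized values have mean 0 and standard deviation 1, while normalized values always fall between 0 and 1, which is the practical difference between the two methods.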

Unit 2. Regression
1. What do you mean by random forest regression?
2. "Random forest regression is a non-continuous model." True or false? Justify.
3. Write a Python code snippet for fitting random forest regression to a dataset.
4. Write a short note on simple linear regression.
5. Write a short note on the backward elimination method for multiple linear regression.
6. Write a short note on the bidirectional elimination method for multiple linear regression.
7. In Y = b0 + b1x, what is the importance of b0 and b1? What do you mean by the best-fitting line? Discuss using a graph.
8. Explain ordinary least squares. Is feature scaling needed in simple linear regression? Why?
9. Write down the sample equation and assumptions for multiple linear regression. Give two examples where one can use multiple linear regression. Write a Python code snippet for visualising the regression model.
10. Below are the 10 records for the column State. How many dummy variables need to be added to the multiple linear regression equation, and why? What are the dummy variable trap and multicollinearity?
11. (State: Gujarat, Maharashtra, Bihar, Goa, Gujarat, Goa, Maharashtra, Gujarat, Goa, Bihar)
12. Can we add all dummy variables to the multiple linear regression equation? Why? What do you mean by stepwise regression?
13. Why does one need to remove some of the independent variables while creating a model? Write down 5 methods for building a model in multiple linear regression. When does one need to use the "All In" method for building a multiple linear regression model?
14. List the steps of the backward elimination and forward selection methods for building a multiple linear regression model.
15. List the steps of the bidirectional elimination method for building a multiple linear regression model. If we consider all possible multiple linear regression models, how many models are possible for a dataset having 10 columns?
16. Which is the fastest method to build a multiple linear regression model? Why do you need backward elimination? What is the importance of the lines of Python script below in backward elimination?
17. If the p-value of variable X is 0.06 and your significance level is 5%, will you keep variable X in the model? Why? When does one need to use polynomial regression? Why is it called linear?
18. What is CART? What is information entropy? In which situations is the decision tree regression model best suited, and why?
19. Using a chart, explain the creation of a decision tree regression model.
20. What do you mean by ensemble learning? Is it stable? Why? Is random forest a continuous or non-continuous regression model?
21. Discuss the random forest regression model with an example.
22. Prepare a regression template as a Python script that can be used for the decision tree and random forest regression models.
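For questions 7 and 8 of Unit 2, the following minimal sketch shows how ordinary least squares picks b0 and b1 for the best-fitting line Y = b0 + b1x. The function name and the experience/salary figures are hypothetical illustrations, not data from the course.

```python
def fit_simple_linear_regression(xs, ys):
    """Return (b0, b1) that minimise the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: co-variation of x and y divided by the variation of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx
    # Intercept: the fitted line passes through the point of means.
    b0 = mean_y - b1 * mean_x
    return b0, b1


years_experience = [1, 2, 3, 4, 5]   # hypothetical independent variable
salary = [35, 40, 45, 50, 55]        # hypothetical dependent variable (thousands)
b0, b1 = fit_simple_linear_regression(years_experience, salary)
predicted_salary = b0 + b1 * 6       # prediction for 6 years of experience
```

Here b1 is the change in Y for a one-unit change in x, and b0 is the value of Y where the line crosses x = 0.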

Unit 3. Classification
1. Prepare a classification template as a Python script that can be used for K-NN and logistic regression.
2. Write a code snippet for K-NN and logistic regression. Also write a Python script for creating the confusion matrix.
3. What is Euclidean distance? Write down and discuss the steps of the K-NN algorithm.
4. How does a support vector machine work? What are support vectors? Why is SVM different from other classifiers?
5. Write a short note on the K-NN classifier.
6. Write a short note on the SVM classifier.
7. Write a short note on the logistic regression classifier.
8. Write down the equations of the sigmoid function and logistic regression. Also draw a chart showing the difference between linear regression and logistic regression.
9. Write a code snippet for a logistic regression classifier and for visualizing training/testing results. Is logistic regression a linear classifier? Why?
10. Discuss Bayes' theorem with examples.
11. Write Bayes' theorem. Explain it with an example.
12. Ram has 700 Kesar and 300 Rajapuri mangoes. Of these, 600 are ripe and 400 are unripe. Of all the unripe mangoes, 20% are Kesar.
13. Find the probability that a selected Kesar mango is ripe using Bayes' theorem.
14. In Eru village, there are 40 Neem trees and 60 Peepal trees. Across the 100 trees, 20% of the leaves are green and the others are brown. Of all the brown leaves, 30% are from Peepal trees. Calculate the probability that a selected leaf is green and from a Neem tree.
15. Write a short note on the naive Bayes classification method.
16. What do you mean by prior probability, marginal likelihood, likelihood and posterior probability? Show the calculation for all of them using one example.
17. Why is the term naive used in the Bayes classification method? What is P(X), and what if more than two features are available in the naive Bayes classification method?
18. Write a code snippet for a naive Bayes classifier.
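For question 16 of Unit 3, the four quantities in Bayes' theorem can be illustrated with a small sketch. The counts below are hypothetical and deliberately differ from the figures in questions 12 to 14; they were chosen only so that the arithmetic is easy to follow.

```python
# Hypothetical counts for 1000 mangoes (not the figures from the questions above).
kesar_ripe, kesar_unripe = 500, 200
rajapuri_ripe, rajapuri_unripe = 100, 200
total = kesar_ripe + kesar_unripe + rajapuri_ripe + rajapuri_unripe

# Prior probability: P(Kesar), before looking at ripeness.
prior = (kesar_ripe + kesar_unripe) / total
# Marginal likelihood: P(Ripe), over all mangoes.
marginal = (kesar_ripe + rajapuri_ripe) / total
# Likelihood: P(Ripe | Kesar).
likelihood = kesar_ripe / (kesar_ripe + kesar_unripe)

# Bayes' theorem: P(Kesar | Ripe) = P(Ripe | Kesar) * P(Kesar) / P(Ripe).
posterior = likelihood * prior / marginal
```

The posterior can be cross-checked by counting directly: 500 of the 600 ripe mangoes are Kesar, which gives the same value.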

Unit 4. Clustering
1. What is the use of k-means clustering?
2. List the steps performed during k-means clustering.
3. The centroids in k-means clustering are selected from the given points. True or false? Why?
4. In the figure below, are the points on the green line nearer to the blue point or the red point?
5. What is 'K' in k-means clustering?
6. Apply k-means clustering to the above figure. Assume data and approximate whenever needed.
7. What would happen if we had a bad random initialisation in k-means clustering?
8. What is the random initialisation trap?
9. Can random initialisation affect your clustering? How?
10. What is the solution to the random initialisation trap?
11. How is the number of clusters decided for the k-means algorithm?
12. What is the full form of WCSS? How is its optimal value found?
13. Define WCSS. How is it calculated?
14. What is the use of WCSS? Write down its equation.
15. For the same scatter plot, show two different WCSS calculations.
16. What is the use of the Elbow method?
17. What does HC (hierarchical clustering) do for the user?
18. Compare k-means and HC.
19. List the two types of HC. Give one difference between them.
20. Write down the steps of the agglomerative HC algorithm.
21. What do you mean by the closest cluster in agglomerative HC?
22. What do you mean by the distance between two clusters?
23. What are dendrograms?
24. What are the values on the x axis and y axis of a dendrogram?
25. Using one example, demonstrate dendrogram construction.
26. Define: dissimilarity threshold.
27. What is the use of dendrograms? Give an example.
28. How can one decide the optimal number of clusters using dendrograms?
29. Find the number of clusters from the dendrograms below.
30. Give an example where the Apriori algorithm can be used.
31. "People who bought also bought". Which algorithm supports the given statement?
32. What do you mean by support in the context of Apriori?
33. What do you mean by confidence in the context of Apriori?
34. What do you mean by lift in the context of Apriori?
35. Using one example, describe support, confidence and lift in the context of the Apriori algorithm.
36. List the steps of the Apriori algorithm.
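For questions 13 to 15 of Unit 4, the sketch below computes WCSS (within-cluster sum of squares) for two different groupings of the same points. The `wcss` helper and the six points are hypothetical, chosen only to show that a poorer grouping gives a larger WCSS.

```python
def wcss(clusters):
    """Sum, over all clusters, of squared Euclidean distances to the cluster centroid."""
    total = 0.0
    for points in clusters:
        dims = len(points[0])
        # Centroid: the coordinate-wise mean of the points in this cluster.
        centroid = [sum(p[d] for p in points) / len(points) for d in range(dims)]
        total += sum(
            sum((p[d] - centroid[d]) ** 2 for d in range(dims)) for p in points
        )
    return total


# The same six hypothetical points, grouped in two different ways.
good_grouping = [[(1, 1), (1, 2), (2, 1)], [(8, 8), (8, 9), (9, 8)]]
poor_grouping = [[(1, 1), (1, 2), (8, 8)], [(2, 1), (8, 9), (9, 8)]]

wcss_good = wcss(good_grouping)
wcss_poor = wcss(poor_grouping)
```

The Elbow method repeats this calculation for increasing numbers of clusters and looks for the point where WCSS stops dropping sharply.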

Unit 5. Reinforcement and Deep Learning
1. Why was deep learning not appreciated initially? Describe read-write speed, data retention, power usage and data density for different storage media. Discuss processing capacity in the context of a timeline.
2. Who is Geoffrey Hinton? Discuss the reasons for the popularity of deep learning nowadays.
3. What is a neuron? Discuss in detail. Differentiate standardization and normalization.
4. Draw the structure of a single neuron. What is a weight in the context of a neural network?
5. What is an activation function? List any four of them and explain any two in detail.
6. If the dependent variable has the value 0 or 1, which activation function is more suitable? Why?
7. Which activation functions are commonly applied in the hidden layer and the output layer? Discuss them in detail.
8. How do neural networks work? Discuss with an example.
9. What is a perceptron? What is a cost function? Give one example of it. What is one epoch?
10. How do neural networks learn? Discuss with an example.
11. Write a short note on gradient descent.
12. What do you mean by the curse of dimensionality?
13. In which situation is stochastic gradient descent needed? Write two basic differences between normal gradient descent and stochastic gradient descent. How does the mini-batch gradient descent method work?
14. What do you mean by backpropagation? What are its advantages? List the steps for training an ANN.
15. Discuss the TensorFlow, Theano and Keras libraries of Python in detail.
16. Discuss data preprocessing for an ANN and its importance.
17. Write a Python code snippet for importing the Keras libraries and packages. Also discuss the classifier.add() method with all its arguments for an ANN as a classifier.
18. One needs to classify the output into more than 2 categories. What changes are needed in the Python script that generates the ANN's output layer? Discuss all parameters in detail.
19. Write a Python script for making an ANN.
20. What do you mean by compiling an ANN? Write and discuss a Python script for the same.
21. Write and discuss a Python script for predicting test results and making the confusion matrix in the context of an ANN.
22. Differentiate gray/black-and-white images from color images. List the steps of a convolutional NN.
23. Explain the convolution operation in detail with one 7x7 binary image and a 3x3 kernel/feature detector.
24. How is a feature map useful for image understanding? Also discuss any four filters/kernels/feature detectors.
25. Discuss the ReLU layer of a CNN in detail.
26. What is pooling and why does one need it? Discuss in detail.
27. Discuss pooling with the help of an example feature map.
28. Discuss the Flattening and Full Connection steps of a CNN. How is the hidden layer of a CNN different from that of an ANN?
29. Draw the basic architecture of a CNN and explain it in detail.
30. What do you mean by softmax and cross-entropy? How are they useful in a CNN?
31. List the Keras libraries and packages needed for a CNN. Write the function to add a convolution layer with all its arguments and explain it.
32. Why do we need flattening? Why can't we apply flattening directly to the input image? Write and discuss a Python script for adding pooling and flattening layers.
33. What do you mean by compiling a CNN? Write and discuss a Python script for the same.
34. What is image augmentation? Why does one need to use it?
35. Write and discuss a Python script for fitting the CNN to the images.
36. How can one improve test result accuracy in a CNN? Write down a Python script for the same and discuss.
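For question 23 of Unit 5, the convolution step can be sketched in plain Python. The 7x7 binary image and the 3x3 feature detector below are hypothetical, and the sketch assumes stride 1 with no padding; as is usual in CNNs, the kernel is applied without flipping.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Element-wise multiply the kernel with the window whose top-left is (i, j).
            value = sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            row.append(value)
        feature_map.append(row)
    return feature_map


image = [  # hypothetical 7x7 binary image
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
kernel = [  # hypothetical 3x3 feature detector
    [0, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
]
feature_map = convolve2d(image, kernel)  # 5x5 result for a 7x7 input and 3x3 kernel
```

Each entry of the feature map counts how many of the kernel's 1s line up with 1s in the image window, so the largest entries mark where the pattern matches best.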

Unit 6. Dimensionality Reduction
1. List any two dimensionality reduction techniques and explain any one in detail.
2. What makes PCA an unsupervised model? What are the advantages of PCA? At which stage does one need to apply PCA in a classification problem?
3. Write a Python script for implementing PCA and discuss it.
4. What makes LDA a supervised model? Differentiate PCA and LDA.
5. Write a Python script for implementing Linear Discriminant Analysis and discuss it.
6. Which is the better dimensionality reduction technique, and why? Write a Python script for it.
7. Which feature extraction technique for dimensionality reduction works on non-linear data? Explain it in detail.
8. Differentiate PCA and Kernel PCA.
9. Write a Python script for implementing Kernel PCA and discuss it.
10. Compare PCA, LDA and Kernel PCA.
11. Differentiate LDA and Kernel PCA.
12. How does Kernel PCA work? Why can't we use simple PCA in place of Kernel PCA?
13. Compare and explain the Python scripts for PCA and LDA.
14. Compare and explain the Python scripts for PCA and Kernel PCA.
15. Why does one need dimensionality reduction? List linear and non-linear methods. Which one is better in which situation, and why?
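As an illustration of the idea behind PCA in Unit 6, the sketch below computes the share of variance explained by each principal component for two-feature data. It is a minimal sketch that uses the closed-form eigenvalues of a 2x2 covariance matrix rather than a library; the function name and the data points are hypothetical.

```python
import math


def explained_variance_ratio_2d(points):
    """Return the variance share of the two principal components of 2-D data."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    c = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix, largest first.
    half_trace = (a + c) / 2
    spread = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    eigenvalues = [half_trace + spread, half_trace - spread]
    total_variance = a + c
    return [value / total_variance for value in eigenvalues]


# Hypothetical data: the second feature is roughly twice the first.
data = [(1, 2.1), (2, 3.9), (3, 6.2), (4, 7.8)]
ratios = explained_variance_ratio_2d(data)
```

Because the two features are almost perfectly correlated, the first component carries nearly all the variance, which is the situation in which dropping the second component loses little information.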