Data Analytics for Business

Data analysis is the need of the hour. Today, organizations generate huge amounts of data without knowing how to use it for their benefit. To change this, machine learning and statistical techniques are now being used to develop predictive models from existing data to forecast future outcomes.

Objective

This course is designed to build a solid foundation in business analytics by imparting knowledge of machine learning and statistical methods for data analysis. It also provides sufficient knowledge of the Python programming language for machine learning algorithms, and of Python/R programming for statistical methods. A brief introduction to neural networks and deep learning is also covered.

Target Audience

1. Students who have passed 10+2 examinations with Mathematics.
2. Professionals having knowledge of Mathematics.

Course Duration: 125 Hours
Fee: Rs. 40000/- plus taxes, if any.

Probable Resource Persons:

1. Prof. Sanjeet Singh, IIM, Kolkata
2. Prof. R. K. Agrawal, JNU, Delhi
3. Prof. Aparna Mehra, IIT Delhi
4. Mr. Premnath Dalai, Co-founder, Stepup Analytics (Practitioner)
5. Experts from industries and institutions of repute like the University of Delhi, etc.

Course Contents

Module 1: Introduction to Business Analytics (35 hours)

Descriptive Analytics: Describing and summarizing data sets, measures of central tendency, dispersion, skewness, kurtosis, correlation.

Probability: Measures of probability, conditional probability, independent events, Bayes' theorem, random variables, discrete (binomial, Poisson, geometric, hypergeometric, negative binomial) and continuous (uniform, exponential, normal, gamma) distributions. Expectation and variance, Markov inequality, Chebyshev's inequality, central limit theorem.

Inferential Statistics: Sampling & Confidence Interval, Inference & Significance. Estimation and Hypothesis Testing, Goodness of fit, Test of Independence, Permutations and Randomization Test, t-test/z-test (one sample, independent, paired), ANOVA, chi-square.

Module 2: Data Manipulation using Python (25 hours)

Introduction to Python editors & IDEs (Jupyter, Spyder, PyCharm, etc.), custom environment settings, basic data types (numeric, string, float) and their operations, control flow (if-elif-else), loops (for, while), inbuilt functions for data conversion, writing user-defined functions.

Concepts of packages/libraries: important packages like NumPy, SciPy, scikit-learn, Pandas, Matplotlib, seaborn, etc., installing and loading packages, reading and writing data from/to different formats, tuples, sets, dictionaries, simple plotting, functions, list comprehensions, database connectivity.

Module 3: Data Analysis (15 hours)

Relevance in industry, statistical learning vs. machine learning, types and phases of analytics.

Data pre-processing and cleaning: data manipulation steps (sorting, filtering, duplicates, merging, appending, subsetting, derived variables, data type conversions, renaming, formatting, etc.), normalizing data, sampling, missing value treatment, outliers.
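
As an illustration of three of the pre-processing steps listed above (duplicates, missing value treatment, normalizing data), a minimal sketch in plain Python follows. The syllabus names Pandas for this kind of work; the standard library is used here only to keep the example self-contained. The records, the field names, and the choice of median imputation and min-max scaling are assumptions made for the example, not part of the course material.

```python
from statistics import median

# Hypothetical raw records: (customer_id, monthly_spend); None marks a missing value.
raw = [
    ("C1", 120.0),
    ("C2", None),
    ("C1", 120.0),  # exact duplicate of the first record
    ("C3", 80.0),
    ("C4", 200.0),
]

def drop_duplicates(rows):
    """Keep the first occurrence of each record, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique

def impute_median(rows):
    """Replace missing spend values with the median of the observed values."""
    observed = [spend for _, spend in rows if spend is not None]
    fill = median(observed)
    return [(cid, fill if spend is None else spend) for cid, spend in rows]

def min_max_normalize(rows):
    """Rescale spend to the range [0, 1]."""
    values = [spend for _, spend in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(cid, (spend - lo) / span if span else 0.0) for cid, spend in rows]

cleaned = min_max_normalize(impute_median(drop_duplicates(raw)))
for cid, scaled in cleaned:
    print(cid, round(scaled, 3))
```

The order of the steps matters: duplicates are removed before imputation so that a repeated record does not bias the median, and scaling comes last so that the filled-in values are rescaled along with the rest.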
Exploratory data analysis: Data visualization using the matplotlib and seaborn libraries, creating graphs (bar/line/pie/boxplot/histogram, etc.), summarizing data, descriptive statistics, univariate analysis (distribution of data), bivariate analysis (cross tabs, distributions and relationships, graphical analysis).

Module 4: Machine Learning Part 1 (20 hours)

Introduction, Applications of Machine Learning, Key elements of Machine Learning, Supervised vs. Unsupervised Learning.

Supervised Machine Learning: Linear Regression, Multiple Linear Regression, Polynomial Regression.

Classification: Using Logistic Regression, Logistic Regression vs. Linear Regression, Logistic Regression with one variable and with multiple variables, Application to multiclass classification. The problem of Overfitting, Application of Regularization in Linear and Logistic Regression. Regularization and Bias/Variance. Classification using K-NN, Naive Bayes classifier, Decision Trees (CHAID Analytics), Random Forest, Support Vector Machines.

Model Evaluation: Cross validation types (train & test, bootstrapping, k-fold validation), parameter tuning, confusion matrices, basic evaluation metrics, precision-recall, ROC curves. Case study.

Module 5: Machine Learning Part 2 (20 hours)

Neural Networks: Introduction, Model Representation, Gradient Descent vs. Perceptron Training, Stochastic Gradient Descent, Multiclass Representation, Multilayer Perceptrons, Backpropagation Algorithm for Learning, Introduction to Deep Learning.

Association Rule Mining: Mining frequent itemsets, Apriori algorithm, market basket analysis. Case study.

Unsupervised Machine Learning: Introduction, Clustering, K-Means algorithm, Affinity Propagation, Agglomerative Hierarchical, DBSCAN, Dimensionality Reduction using Principal Component Analysis. Case study: Application of PCA.
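
To give a flavour of the K-Means algorithm named under Unsupervised Machine Learning, here is a minimal sketch of the standard assign-and-update loop (Lloyd's algorithm) in plain Python. The function name `kmeans`, the sample points, and the hand-picked starting centroids are assumptions for the example; they are not taken from the course material, and in practice a library such as scikit-learn would normally be used.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain K-Means (Lloyd's algorithm) on 2-D points.

    `centroids` holds the starting centres; fixing them by hand keeps the
    example deterministic (libraries usually choose them randomly).
    """
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its old centroid
        if updated == centroids:  # converged: no centroid moved
            break
        centroids = updated
    return labels, centroids

# Hypothetical data: two visibly separated groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

With these starting centroids the first three points fall into cluster 0 and the last three into cluster 1. Different starting centroids can lead to different clusterings, which is one reason the result of K-Means is usually checked over several initializations.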

Time Series Forecasting: Trends and seasonality in time series data, identifying trends, seasonal patterns, first-order differencing, periodicity and autocorrelation, rolling window estimations, stationarity vs. non-stationarity, ARIMA and ARIMAX modeling. Case study.

Module 6: Optimization in Analytics (10 hours)

Introduction to Operations Research (OR), Linear Programming Problems (LPP), Geometry of linear programming, Sensitivity and Post-optimal analysis, Duality and its economic interpretation. Network models and project planning, Non-linear Programming, KKT conditions, Introduction to Stochastic models, Markov models, Classification of states, Steady-state probability, Dynamic Programming.

References:

1. Kumar, U.D.: Business Analytics - The Science of Data Driven Decision Making, Wiley.
2. Gert, H.N., Thorlund, L. and Thorlund, J.: Business Analytics for Managers - Taking Business Intelligence Beyond Reporting, Wiley.
3. Johnson, R.A., Miller, I. and Freund, J.: Probability and Statistics for Engineers, Pearson.
4. Jose, J. and Lal, S.P.: Introduction to Computing & Problem Solving with Python, Khanna Publishers.
5. Bowles, M.: Machine Learning in Python - Essential Techniques for Predictive Analysis, Wiley.
6. Larose, D.T. and Larose, C.T.: Data Mining and Predictive Analytics, Wiley.
7. Bishop, C.M.: Pattern Recognition & Machine Learning, Springer New York.
8. Falch, P.: Machine Learning, Wiley.
9. Deepa, S.N. and Sivanandam, S.N.: Principles of Soft Computing, Wiley.
10. Taha, A.H.: Operations Research - An Introduction, Prentice Hall.
11. Raschka, S.: Python Machine Learning.

Course Co-coordinators:

1. Dr. Sameer Anand
2. Dr. Ajay Jaiswal