About This Specialization

The 5 courses in this University of Michigan specialization introduce learners to data science through the Python programming language. This skills-based specialization is intended for learners who have a basic Python or programming background and want to apply statistical, machine learning, information visualization, text analysis, and social network analysis techniques through popular Python toolkits such as pandas, matplotlib, scikit-learn, nltk, and networkx to gain insight into their data.

Introduction to Data Science in Python (course 1), Applied Plotting, Charting & Data Representation in Python (course 2), and Applied Machine Learning in Python (course 3) should be taken in order and prior to any other course in the specialization. After completing those, courses 4 and 5 can be taken in any order. All 5 are required to earn a certificate.

Introduction to Data Science in Python

Upcoming Session: Dec 18
Subtitles: English, Vietnamese, Chinese (Traditional)

About the Course

This course will introduce the learner to the basics of the Python programming environment, including how to download and install Python, expected fundamental Python programming techniques, and how to find help with Python programming questions. The course will also introduce data manipulation and cleaning techniques using the popular pandas data science library and introduce the abstraction of the DataFrame as the central data structure for data analysis. The course will end with a statistics primer, showing how various statistical measures can be applied to pandas DataFrames. By the end of the course, students will be able to take tabular data, clean it, manipulate it, and run basic inferential statistical analyses.

This course should be taken before any of the other Applied Data Science with Python courses: Applied Plotting, Charting & Data Representation in Python, Applied Machine Learning in Python, Applied Text Mining in Python, Applied Social Network Analysis in Python.

Week 1

In this week you'll get an introduction to the field of data science, review common Python functionality and features which data scientists use, and be introduced to the Coursera Jupyter Notebook for the lectures. All of the course information on grading, prerequisites, and expectations is in the course syllabus, and you can find more information about the Jupyter Notebooks on our Course Resources page.

Video: Introduction to Specialization
Video: Python Demonstration: Reading and Writing CSV files
Reading: Syllabus
Video: Python Dates and Times
Reading: Help us learn more about you!
Video: Advanced Python Objects, map()
Video: Data Science
Reading: 50 years of Data Science, David Donoho (optional)
Video: The Coursera Jupyter Notebook System
Reading: Notice for Auditing Learners: Assignment Submission
Other: Week 1 Lectures Jupyter Notebook
Video: Python Functions
Video: Python Types and Sequences
Video: Python More on Strings
Video: Advanced Python Lambda and List Comprehensions
Video: Advanced Python Demonstration: The Numerical Python Library (NumPy)
Quiz: Week One Quiz

Week 2

In this week of the course you'll learn the fundamentals of pandas, one of the most important toolkits Python has for data cleaning and processing. You'll learn how to read data into DataFrame structures, how to query these structures, and the details of how such structures are indexed. The module ends with a programming assignment and a discussion question.

Video: Introduction
Other: Week 2 Lectures Jupyter Notebook
Video: The Series Data Structure
Video: Querying a Series
Video: The DataFrame Data Structure
Video: DataFrame Indexing and Loading
Video: Querying a DataFrame
Video: Indexing Dataframes
Video: Missing Values
Other: Hacked Data
Other: Assignment 2
Programming Assignment: Assignment 2 Submission
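
The Week 2 topics (DataFrame structure, querying, indexing, and missing values) can be illustrated with a minimal sketch. The store names, items, and costs below are made up for illustration and are not taken from the course materials.

```python
import numpy as np
import pandas as pd

# Hypothetical purchase records; the index labels repeat on purpose.
purchases = pd.DataFrame(
    {
        "item": ["Dog Food", "Kitty Litter", "Bird Seed"],
        "cost": [22.50, np.nan, 5.00],
    },
    index=["Store 1", "Store 1", "Store 2"],
)

# Label-based lookup: every row whose index label is "Store 1".
store1 = purchases.loc["Store 1"]

# Position-based lookup: the first row, whatever its label.
first_row = purchases.iloc[0]

# Boolean mask query; comparisons against NaN are False, so that row drops out.
cheap = purchases[purchases["cost"] < 10]

# One simple way to handle missing values: replace them with a default.
filled = purchases["cost"].fillna(0.0)
```

Using `.loc` for labels and `.iloc` for positions keeps the two kinds of indexing explicit, which matters when index labels are not unique.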

Week 3

In this week you'll deepen your understanding of the pandas library by learning how to merge DataFrames, generate summary tables, group data into logical pieces, and manipulate dates. We'll also refresh your understanding of scales of data, and discuss issues with creating metrics for analysis. The week ends with a more significant programming assignment.

Other: Week 3 Lectures Jupyter Notebook
Video: Merging Dataframes
Video: Pandas Idioms
Video: Group by
Video: Scales
Video: Pivot Tables
Video: Date Functionality
Other: Goodhart's Law
Other: Assignment 3
Programming Assignment: Assignment 3 Submission
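
As a rough sketch of the Week 3 operations (merging, grouping, pivot tables, and date handling), the following uses small invented tables. All names and figures are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; names and values are invented for illustration.
staff = pd.DataFrame(
    {"name": ["Kelly", "Sally", "James"], "role": ["Director", "Liaison", "Grader"]}
)
students = pd.DataFrame(
    {"name": ["James", "Mike", "Sally"], "school": ["Business", "Law", "Engineering"]}
)

# Inner merge keeps only the names that appear in both frames.
both = pd.merge(staff, students, how="inner", on="name")

sales = pd.DataFrame(
    {
        "region": ["East", "East", "West", "West"],
        "quarter": ["Q1", "Q2", "Q1", "Q2"],
        "revenue": [100, 150, 80, 120],
    }
)

# Group rows by region and total the revenue within each group.
revenue_by_region = sales.groupby("region")["revenue"].sum()

# Pivot table: one row per region, one column per quarter.
pivot = sales.pivot_table(
    values="revenue", index="region", columns="quarter", aggfunc="sum"
)

# Parse date strings into timestamps so they can be subtracted and compared.
dates = pd.to_datetime(["2017-12-11", "2017-12-18"])
gap_in_days = (dates[1] - dates[0]).days
```

Changing `how="inner"` to `"outer"`, `"left"`, or `"right"` selects the other standard join types.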

Week 4

In this week of the course you'll be introduced to a variety of statistical techniques such as distributions, sampling, and t-tests. The majority of the week will be dedicated to your course project, where you'll engage in a real-world data cleaning activity and provide evidence for (or against!) a given hypothesis. This project is suitable for a data science portfolio, and will test your knowledge of cleaning, merging, manipulating, and testing for significance in data. The week ends with two discussions of science and the rise of the fourth paradigm: data-driven discovery.

Other: Week 4 Lectures Jupyter Notebook
Video: Introduction
Video: Distributions
Video: More Distributions
Video: Hypothesis Testing in Python
Other: End of Theory
Other: Science Isn't Broken: p-hacking activity
Other: Assignment 4 - Project
Programming Assignment: Assignment 4 Submission
Reading: Post-course Survey
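
A t-test of the kind mentioned for Week 4 might look like the sketch below. The outline does not say which library the course uses, so SciPy is an assumption here, and the two samples are invented.

```python
from scipy import stats

# Hypothetical measurements from two groups (values invented for illustration).
group_a = [5.1, 4.9, 5.0, 5.2, 4.8, 5.1]
group_b = [5.9, 6.1, 6.0, 5.8, 6.2, 6.0]

# Welch's two-sample t-test: does not assume the groups share a variance.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

# A small p-value is evidence against the hypothesis of equal means.
alpha = 0.05
reject_equal_means = result.pvalue < alpha
```

The threshold `alpha` is a convention chosen before looking at the data. Picking it afterwards is one form of the p-hacking this week's activity discusses.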

Applied Plotting, Charting & Data Representation in Python

Upcoming Session: Dec 11
Subtitles: English, Vietnamese, Chinese (Traditional)

About the Course

This course will introduce the learner to information visualization basics, with a focus on reporting and charting using the matplotlib library. The course will start with a design and information literacy perspective, touching on what makes a good and bad visualization, and how statistical measures translate into visualizations. The second week will focus on matplotlib, the technology used to make visualizations in Python, and introduce users to best practices when creating basic charts and how to realize design decisions in the framework. The third week will describe the gamut of functionality available in matplotlib, and demonstrate a variety of basic statistical charts, helping learners to identify when a particular method is good for a particular problem. The course will end with a discussion of other forms of structuring and visualizing data.

This course should be taken after Introduction to Data Science in Python and before the remainder of the Applied Data Science with Python courses: Applied Machine Learning in Python, Applied Text Mining in Python, and Applied Social Network Analysis in Python.

Week 1 Module 1: Principles of Information Visualization

In this module, you will get an introduction to principles of information visualization. You will be introduced to tools for thinking about design and graphical heuristics for thinking about creating effective visualizations. All of the course information on grading, prerequisites, and expectations is in the course syllabus, which is included in this module.

Video: Introduction
Video: Graphical heuristics: Chart junk (Edward Tufte)
Reading: Syllabus
Reading: Help us learn more about you!
Reading: Useful Junk?: The Effects of Visual Embellishment on Comprehension and Memorability of Charts
Video: About the Professor: Christopher Brooks
Video: Tools for Thinking about Design (Alberto Cairo)
Reading: Notice for Coursera Learners: Assignment Submission
Video: Graphical heuristics: Lie Factor and Spark Lines (Edward Tufte)
Video: The Truthful Art (Alberto Cairo)
Other: Must a visual be enlightening?
Other: Hands-on Visualization Wheel
Video: Graphical heuristics: Data-ink ratio (Edward Tufte)
Reading: Graphics Lies, Misleading Visuals
Peer Review: Graphics Lies, Misleading Visuals
Reading: Dark Horse Analytics (Optional)

Week 2 Module 2: Basic Charting

In this module, you will delve into basic charting. For this week's assignment, you will work with real-world CSV weather data. You will manipulate the data to display the minimum and maximum temperature for a range of dates and demonstrate that you know how to create a line graph using matplotlib. Additionally, you will demonstrate the procedure of composite charts by overlaying a scatter plot of record-breaking data for a given year.

Other: Module 2 Jupyter Notebook
Video: Line Plots
Video: Introduction
Video: Bar Charts
Video: Matplotlib Architecture
Video: Dejunkifying a Plot
Reading: Matplotlib
Other: Plotting Weather Patterns
Reading: Ten Simple Rules for Better Figures
Peer Review: Plotting Weather Patterns
Video: Basic Plotting with Matplotlib
Video: Scatterplots
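
The composite chart described for this module (line plots of temperature extremes with a scatter overlay of record-breaking points) could be sketched roughly as follows. The temperatures are invented, and the real assignment data and its column layout are not shown in this outline.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical temperatures in degrees Celsius for seven days.
days = [1, 2, 3, 4, 5, 6, 7]
record_high = [12, 14, 13, 15, 16, 14, 13]
record_low = [-5, -4, -6, -3, -2, -4, -5]
this_year_high = [10, 15, 12, 14, 17, 13, 12]

# Days on which this year's high exceeded the earlier record.
broken = [
    (day, temp)
    for day, temp, record in zip(days, this_year_high, record_high)
    if temp > record
]

fig, ax = plt.subplots()
ax.plot(days, record_high, label="Record high")
ax.plot(days, record_low, label="Record low")

# Overlay the record-breaking points as a scatter plot on the same axes.
ax.scatter(
    [day for day, _ in broken],
    [temp for _, temp in broken],
    color="red",
    label="Record broken this year",
    zorder=3,
)

ax.set_xlabel("Day")
ax.set_ylabel("Temperature (°C)")
ax.legend()
```

Drawing both layers on one `Axes` object is what makes the chart composite. Calling `fig.savefig(...)` would write it to a file.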

Week 3 Module 3: Charting Fundamentals

In this module you will explore charting fundamentals. For this week's assignment you will work to implement a new visualization technique based on academic research. This assignment is flexible: you can address it at a range of difficulty levels, from an easy static image to an interactive chart where users can set the ranges of values to be used.

Other: Module 3 Jupyter Notebook
Practice Peer Review: Practice Assignment: Understanding Distributions Through Sampling
Video: Subplots
Other: Building a Custom Visualization
Video: Histograms
Reading: Assignment Reading
Reading: Selecting the Number of Bins in a Histogram: A Decision Theoretic Approach (Optional)
Peer Review: Building a Custom Visualization
Video: Box Plots
Video: Heatmaps
Video: Animation
Video: Interactivity
Other: Practice Assignment: Understanding Distributions Through Sampling

Week 4 Module 4: Applied Visualizations

In this module, everything starts to come together. Your final assignment is entitled Becoming an Independent Data Scientist. This assignment requires that you identify at least two publicly accessible datasets from the same region that are consistent across a meaningful dimension. You will state a research question that can be answered using these datasets and then create a visual using matplotlib that addresses your stated research question. You will then be asked to justify how your visual addresses your research question.

Other: Module 4 Jupyter Notebook
Video: Plotting with Pandas
Video: Seaborn
Reading: Spurious Correlations
Video: Becoming an Independent Data Scientist
Other: Project Description
Peer Review: Becoming an Independent Data Scientist
Reading: Post-course Survey

Applied Machine Learning in Python

Upcoming Session: Dec 18
Subtitles: English

About the Course

This course will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods. The course will start with a discussion of how machine learning is different from descriptive statistics, and introduce the scikit-learn toolkit. The issue of dimensionality of data will be discussed, and the task of clustering data, as well as evaluating those clusters, will be tackled. Supervised approaches for creating predictive models will be described, and learners will be able to apply the scikit-learn predictive modelling methods while understanding process issues related to data generalizability (e.g. cross validation, overfitting). The course will end with a look at more advanced techniques, such as building ensembles, and practical limitations of predictive models. By the end of this course, students will be able to identify the difference between a supervised (classification) and unsupervised (clustering) technique, identify which technique they need to apply for a particular dataset and need, engineer features to meet that need, and write Python code to carry out an analysis.

This course should be taken after Introduction to Data Science in Python and Applied Plotting, Charting & Data Representation in Python and before Applied Text Mining in Python and Applied Social Network Analysis in Python.

Week 1 Module 1: Fundamentals of Machine Learning - Intro to SciKit Learn

This module introduces basic machine learning concepts, tasks, and workflow using an example classification problem based on the K-nearest neighbors method, and implemented using the scikit-learn library.

Reading: Course Syllabus
Video: Introduction
Reading: Help us learn more about you!
Video: Key Concepts in Machine Learning
Video: Python Tools for Machine Learning
Reading: Notice for Auditing Learners: Assignment Submission
Other: Module 1 Notebook
Video: An Example Machine Learning Problem
Video: Examining the Data
Video: K-Nearest Neighbors Classification
Reading: Zachary Lipton: The Foundations of Algorithmic Bias (optional)
Quiz: Module 1 Quiz
Other: Assignment 1
Programming Assignment: Assignment 1 Submission
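
To make the K-nearest neighbors workflow concrete, here is a minimal scikit-learn sketch on a toy dataset. The points and labels are invented. The course's own example problem is not reproduced in this outline.

```python
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical 2-D training points forming two well-separated clusters.
X_train = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
y_train = [0, 0, 0, 1, 1, 1]

# Classify each new point by majority vote among its 3 nearest training points.
knn = KNeighborsClassifier(n_neighbors=3)
knn.fit(X_train, y_train)

# One query point near each cluster.
predictions = knn.predict([[0.5, 0.5], [5.5, 5.5]])
```

The same fit-then-predict pattern applies to the other scikit-learn estimators listed for the later modules. On real data the model would be scored on a held-out test set rather than on hand-picked points.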

Week 2 Module 2: Supervised Machine Learning - Part 1

This module delves into a wider variety of supervised learning methods for both classification and regression. You will learn about the connection between model complexity and generalization performance, the importance of proper feature scaling, and how to control model complexity by applying techniques like regularization to avoid overfitting. In addition to k-nearest neighbors, this week covers linear regression (least-squares, ridge, lasso, and polynomial regression), logistic regression, support vector machines, the use of cross-validation for model evaluation, and decision trees.

Other: Module 2 Notebook
Video: Multi-Class Classification
Video: Introduction to Supervised Machine Learning
Video: Kernelized Support Vector Machines
Video: Overfitting and Underfitting
Video: Cross-Validation
Video: Supervised Learning: Datasets
Video: Decision Trees
Video: K-Nearest Neighbors: Classification and Regression
Reading: A Few Useful Things to Know about Machine Learning
Video: Linear Regression: Least-Squares
Reading: Ed Yong: Genetic Test for Autism Refuted (optional)
Video: Linear Regression: Ridge, Lasso, and Polynomial Regression
Quiz: Module 2 Quiz
Video: Logistic Regression
Other: Classifier Visualization Playspace
Video: Linear Classifiers: Support Vector Machines
Other: Assignment 2
Programming Assignment: Assignment 2 Submission

Week 3 Module 3: Evaluation

This module covers evaluation and model selection methods that you can use to help understand and optimize the performance of your machine learning models.

Other: Module 3 Notebook
Video: Model Evaluation & Selection
Video: Confusion Matrices & Basic Evaluation Metrics
Video: Classifier Decision Functions
Video: Precision-recall and ROC curves
Video: Multi-Class Evaluation
Video: Regression Evaluation
Reading: Practical Guide to Controlled Experiments on the Web (optional)
Video: Model Selection: Optimizing Classifiers for Different Evaluation Metrics
Quiz: Module 3 Quiz
Other: Assignment 3
Programming Assignment: Assignment 3 Submission
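
The basic evaluation metrics named in this module can be computed with scikit-learn as in the sketch below. The true and predicted labels are invented for illustration.

```python
from sklearn.metrics import confusion_matrix, precision_score, recall_score

# Hypothetical binary labels: 1 is the positive class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

# Rows are true classes, columns are predicted classes, in label order [0, 1].
cm = confusion_matrix(y_true, y_pred)
tn, fp, fn, tp = cm.ravel()

# Precision: of the predicted positives, how many were right?
precision = precision_score(y_true, y_pred)

# Recall: of the actual positives, how many were found?
recall = recall_score(y_true, y_pred)
```

Which metric to optimize depends on the cost of each kind of error, the subject of the module's lecture on optimizing classifiers for different evaluation metrics.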

Week 4 Module 4: Supervised Machine Learning - Part 2

This module covers more advanced supervised learning methods that include ensembles of trees (random forests, gradient boosted trees), and neural networks (with an optional summary on deep learning). You will also learn about the critical problem of data leakage in machine learning and how to detect and avoid it.

Other: Module 4 Notebook
Programming Assignment: Assignment 4 Submission
Video: Naive Bayes Classifiers
Other: Unsupervised Learning Notebook
Video: Random Forests
Video: Introduction
Video: Gradient Boosted Decision Trees
Video: Dimensionality Reduction and Manifold Learning
Video: Neural Networks
Video: Clustering
Reading: Neural Networks Made Easy (optional)
Reading: How to Use t-SNE Effectively
Reading: Play with Neural Networks: TensorFlow Playground (optional)
Reading: How Machines Make Sense of Big Data: an Introduction to Clustering Algorithms
Video: Deep Learning (Optional)
Video: Conclusion
Reading: Deep Learning in a Nutshell: Core Concepts (optional)
Reading: Post-course Survey
Reading: Assisting Pathologists in Detecting Cancer with Deep Learning (optional)
Video: Data Leakage
Reading: The Treachery of Leakage (optional)
Reading: Leakage in Data Mining: Formulation, Detection, and Avoidance (optional)
Reading: Data Leakage Example: The ICML 2013 Whale Challenge (optional)
Reading: Rules of Machine Learning: Best Practices for ML Engineering (optional)
Quiz: Module 4 Quiz
Other: Assignment 4

Applied Text Mining in Python

Upcoming Session: Dec 18
Subtitles: English

About the Course

This course will introduce the learner to text mining and text manipulation basics. The course begins with an understanding of how text is handled by Python, the structure of text both to the machine and to humans, and an overview of the nltk framework for manipulating text. The second week focuses on common manipulation needs, including regular expressions (searching for text), cleaning text, and preparing text for use by machine learning processes. The third week will apply basic natural language processing methods to text, and demonstrate how text classification is accomplished. The final week will explore more advanced methods for detecting the topics in documents and grouping them by similarity (topic modelling).

This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.

Week 1 Module 1: Working with Text in Python

Reading: Course Syllabus
Reading: Help us learn more about you!
Video: Introduction to Text Mining
Video: Handling Text in Python
Reading: Notice for Auditing Learners: Assignment Submission
Other: Working with Text
Video: Regular Expressions
Other: Regex with Pandas and Named Groups
Video: Demonstration: Regex with Pandas and Named Groups
Practice Quiz: Practice Quiz
Video: Internationalization and Issues with Non-ASCII Characters
Other: Introduce Yourself
Reading: Resources: Common issues with free text
Quiz: Module 1 Quiz
Other: Assignment 1
Programming Assignment: Assignment 1 Submission
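
The regular expression and named group material listed for this module can be sketched with Python's built-in `re` module. The note text and the date format are assumptions made for illustration. The course demonstration applies the same idea through pandas.

```python
import re

# Hypothetical free-text note containing two dates in month/day/year form.
notes = "Seen on 10/21/2017; follow-up booked for 11/4/2017."

# Named groups label each captured part so matches read like records.
date_pattern = re.compile(
    r"(?P<month>\d{1,2})/(?P<day>\d{1,2})/(?P<year>\d{4})"
)

# One dictionary per match, keyed by group name.
dates = [match.groupdict() for match in date_pattern.finditer(notes)]
```

Here `dates` holds one dictionary for each date found, with the keys `month`, `day`, and `year`. With a pandas Series of strings, `Series.str.extract` accepts the same kind of pattern and turns the group names into column names.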

Week 2 Module 2: Basic Natural Language Processing

Video: Basic Natural Language Processing
Other: Module 2 (Python 3)
Video: Basic NLP tasks with NLTK
Video: Advanced NLP tasks with NLTK
Practice Quiz: Practice Quiz
Other: Finding your own prepositional phrase attachment
Quiz: Module 2 Quiz
Other: Assignment 2
Programming Assignment: Assignment 2 Submission

Week 3 Module 3: Classification of Text

Video: Text Classification
Video: Identifying Features from Text
Video: Naive Bayes Classifiers
Video: Naive Bayes Variations
Video: Support Vector Machines
Video: Learning Text Classifiers in Python
Other: Case Study - Sentiment Analysis
Video: Demonstration: Case Study - Sentiment Analysis
Quiz: Module 3 Quiz
Other: Assignment 3
Programming Assignment: Assignment 3 Submission

Week 4 Module 4: Topic Modeling

Video: Semantic Text Similarity
Video: Topic Modeling
Video: Generative Models and LDA
Practice Quiz: Practice Quiz
Video: Information Extraction
Reading: Additional Resources & Readings
Quiz: Module 4 Quiz
Other: Assignment 4
Programming Assignment: Assignment 4 Submission
Reading: Post-Course Survey

Applied Social Network Analysis in Python

Upcoming Session: Jan 1
Subtitles: English

About the Course

This course will introduce the learner to network analysis through the NetworkX library. The course begins with an understanding of what network analysis is and motivations for why we might model phenomena as networks. The second week introduces the concept of connectivity and network robustness. The third week will explore ways of measuring the importance or centrality of a node in a network. The final week will explore the evolution of networks over time and cover models of network generation and the link prediction problem.

This course should be taken after: Introduction to Data Science in Python, Applied Plotting, Charting & Data Representation in Python, and Applied Machine Learning in Python.

Week 1: Why Study Networks and Basics on NetworkX

Module One introduces you to different types of networks in the real world and why we study them. You'll learn about the basic elements of networks, as well as different types of networks. You'll also learn how to represent and manipulate networked data using the NetworkX library. The assignment will give you an opportunity to use NetworkX to analyze a networked dataset of employees in a small company.

Reading: Course Syllabus
Other: Loading Graphs in NetworkX
Reading: Help us learn more about you!
Video: TA Demonstration: Loading Graphs in NetworkX
Video: Networks: Definition and Why We Study Them
Quiz: Module 1 Quiz
Video: Network Definition and Vocabulary
Other: Assignment 1
Video: Node and Edge Attributes
Programming Assignment: Assignment 1 Submission
Video: Bipartite Graphs
Reading: Notice for Auditing Learners: Assignment Submission

Week 2: Network Connectivity

In Module Two you'll learn how to analyze the connectivity of a network based on measures of distance, reachability, and redundancy of paths between nodes. In the assignment, you will practice using NetworkX to compute measures of connectivity of a network of communication among the employees of a mid-size manufacturing company.

Video: Clustering Coefficient
Video: Distance Measures
Video: Connected Components
Video: Network Robustness
Other: Simple Network Visualizations in NetworkX
Video: TA Demonstration: Simple Network Visualizations in NetworkX
Quiz: Module 2 Quiz
Other: Assignment 2
Programming Assignment: Assignment 2 Submission
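
The connectivity measures covered in Module Two (clustering coefficient, distance, connected components, robustness) can be tried on a toy graph. The node names and edges below are invented and do not come from the assignment data.

```python
import networkx as nx

# Hypothetical undirected network: a triangle A-B-C with a pendant node D,
# plus a separate pair E-F.
G = nx.Graph()
G.add_edges_from(
    [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "F")]
)

# Reachability: how many disconnected pieces does the network have?
n_components = nx.number_connected_components(G)

# Local clustering: fraction of a node's neighbor pairs that are themselves linked.
clustering_a = nx.clustering(G, "A")
clustering_c = nx.clustering(G, "C")

# Distance: number of edges on a shortest path between two nodes.
dist_a_d = nx.shortest_path_length(G, "A", "D")

# Robustness: minimum number of nodes whose removal disconnects the
# largest component (here removing C cuts D off).
largest = G.subgraph(max(nx.connected_components(G), key=len))
min_nodes_to_disconnect = nx.node_connectivity(largest)
```

A robustness value of 1 means a single node's failure can split the component. Networks with redundant paths score higher.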

Week 3: Influence Measures and Network Centralization

In Module Three, you'll explore ways of measuring the importance or centrality of a node in a network, using measures such as Degree, Closeness, and Betweenness centrality, Page Rank, and Hubs and Authorities. You'll learn about the assumptions each measure makes, the algorithms we can use to compute them, and the different functions available in NetworkX to measure centrality. In the assignment, you'll practice choosing the most appropriate centrality measure in a real-world setting.

Video: Degree and Closeness Centrality
Video: Betweenness Centrality
Video: Basic Page Rank
Video: Scaled Page Rank
Video: Hubs and Authorities
Video: Centrality Examples
Quiz: Module 3 Quiz
Other: PageRank and Centrality in a real-life network
Other: Assignment 3
Programming Assignment: Assignment 3 Submission
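
Three of the centrality measures named in Module Three can be compared on a small example with NetworkX. The five-person network is invented for illustration. Page Rank and Hubs and Authorities are left out to keep the sketch short.

```python
import networkx as nx

# Hypothetical friendship network: Bob links most people, Dan bridges to Eve.
G = nx.Graph()
G.add_edges_from(
    [("Ann", "Bob"), ("Bob", "Cat"), ("Bob", "Dan"), ("Dan", "Eve")]
)

# Degree centrality: share of the other nodes a node is directly tied to.
degree = nx.degree_centrality(G)

# Closeness centrality: inverse of the average distance to all other nodes.
closeness = nx.closeness_centrality(G)

# Betweenness centrality: share of shortest paths that pass through a node.
betweenness = nx.betweenness_centrality(G)

# The node ranked highest by each measure.
top_by_degree = max(degree, key=degree.get)
top_by_betweenness = max(betweenness, key=betweenness.get)
```

In this toy graph all three measures rank Bob first, but Dan scores well on betweenness despite a modest degree, because every path to Eve runs through him. Disagreements like this are why the choice of measure depends on the question being asked.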

Week 4: Network Evolution

In Module Four, you'll explore the evolution of networks over time, including the different models that generate networks with realistic features, such as the Preferential Attachment Model and Small World Networks. You will also explore the link prediction problem, where you will learn useful features that can predict whether a pair of disconnected nodes will be connected in the future. In the assignment, you will be challenged to identify which model generated a given network. Additionally, you will have the opportunity to combine different concepts of the course by predicting the salary, position, and future connections of the employees of a company using their logs of exchanges.

Video: Preferential Attachment Model
Reading: Power Laws and Rich-Get-Richer Phenomena (Optional)
Video: Small World Networks
Video: Link Prediction
Other: Extracting Features from Graphs
Quiz: Module 4 Quiz
Reading: The Small-World Phenomenon (Optional)
Other: Assignment 4
Programming Assignment: Assignment 4 Submission
Reading: Post-Course Survey


Lesson Plan. Preparation. Data Mining Basics BIM 1 Business Management & Administration Data Mining Basics BIM 1 Business Management & Administration Lesson Plan Performance Objective The student understands and is able to recall information on data mining basics. Specific Objectives The

More information

COMP 527: Data Mining and Visualization. Danushka Bollegala

COMP 527: Data Mining and Visualization. Danushka Bollegala COMP 527: Data Mining and Visualization Danushka Bollegala Introductions Lecturer: Danushka Bollegala Office: 2.24 Ashton Building (Second Floor) Email: danushka@liverpool.ac.uk Personal web: http://danushka.net/

More information

Evaluation and Comparison of Performance of different Classifiers

Evaluation and Comparison of Performance of different Classifiers Evaluation and Comparison of Performance of different Classifiers Bhavana Kumari 1, Vishal Shrivastava 2 ACE&IT, Jaipur Abstract:- Many companies like insurance, credit card, bank, retail industry require

More information

Machine Learning and Applications in Finance

Machine Learning and Applications in Finance Machine Learning and Applications in Finance Christian Hesse 1,2,* 1 Autobahn Equity Europe, Global Markets Equity, Deutsche Bank AG, London, UK christian-a.hesse@db.com 2 Department of Computer Science,

More information

Lecture 1. Introduction. Probability Theory

Lecture 1. Introduction. Probability Theory Lecture 1. Introduction. Probability Theory COMP90051 Machine Learning Sem2 2017 Lecturer: Trevor Cohn Adapted from slides provided by Ben Rubinstein Why Learn Learning? 2 Motivation We are drowning in

More information

Prediction of Useful Reviews on Yelp Dataset

Prediction of Useful Reviews on Yelp Dataset Prediction of Useful Reviews on Yelp Dataset Final Report Yanrong Li, Yuhao Liu, Richard Chiou, Pradeep Kalipatnapu Problem Statement and Background Online reviews play a very important role in information

More information

Linear Regression: Predicting House Prices

Linear Regression: Predicting House Prices Linear Regression: Predicting House Prices I am big fan of Kalid Azad writings. He has a knack of explaining hard mathematical concepts like Calculus in simple words and helps the readers to get the intuition

More information

(Sub)Gradient Descent

(Sub)Gradient Descent (Sub)Gradient Descent CMSC 422 MARINE CARPUAT marine@cs.umd.edu Figures credit: Piyush Rai Logistics Midterm is on Thursday 3/24 during class time closed book/internet/etc, one page of notes. will include

More information

Stay Alert!: Creating a Classifier to Predict Driver Alertness in Real-time

Stay Alert!: Creating a Classifier to Predict Driver Alertness in Real-time Stay Alert!: Creating a Classifier to Predict Driver Alertness in Real-time Aditya Sarkar, Julien Kawawa-Beaudan, Quentin Perrot Friday, December 11, 2014 1 Problem Definition Driving while drowsy inevitably

More information

TOWARDS DATA-DRIVEN AUTONOMICS IN DATA CENTERS

TOWARDS DATA-DRIVEN AUTONOMICS IN DATA CENTERS TOWARDS DATA-DRIVEN AUTONOMICS IN DATA CENTERS ALINA SIRBU, OZALP BABAOGLU SUMMARIZED BY ARDA GUMUSALAN MOTIVATION 2 MOTIVATION Human-interaction-dependent data centers are not sustainable for future data

More information

Word Sense Determination from Wikipedia. Data Using a Neural Net

Word Sense Determination from Wikipedia. Data Using a Neural Net 1 Word Sense Determination from Wikipedia Data Using a Neural Net CS 297 Report Presented to Dr. Chris Pollett Department of Computer Science San Jose State University By Qiao Liu May 2017 Word Sense Determination

More information

18 LEARNING FROM EXAMPLES

18 LEARNING FROM EXAMPLES 18 LEARNING FROM EXAMPLES An intelligent agent may have to learn, for instance, the following components: A direct mapping from conditions on the current state to actions A means to infer relevant properties

More information

Beating the Odds: Learning to Bet on Soccer Matches Using Historical Data

Beating the Odds: Learning to Bet on Soccer Matches Using Historical Data Beating the Odds: Learning to Bet on Soccer Matches Using Historical Data Michael Painter, Soroosh Hemmati, Bardia Beigi SUNet IDs: mp703, shemmati, bardia Introduction Soccer prediction is a multi-billion

More information

Supervised learning can be done by choosing the hypothesis that is most probable given the data: = arg max ) = arg max

Supervised learning can be done by choosing the hypothesis that is most probable given the data: = arg max ) = arg max The learning problem is called realizable if the hypothesis space contains the true function; otherwise it is unrealizable On the other hand, in the name of better generalization ability it may be sensible

More information

Machine Learning in Practice/ Applied Machine Learning ,11-663,05-834,05-434

Machine Learning in Practice/ Applied Machine Learning ,11-663,05-834,05-434 Machine Learning in Practice/ Applied Machine Learning 11-344,11-663,05-834,05-434 Instructor: Dr. Carolyn P. Rosé, cprose@cs.cmu.edu Office Hours: Gates-Hillman Center 5415, Time TBA Teaching Assistants:

More information

10-702: Statistical Machine Learning

10-702: Statistical Machine Learning 10-702: Statistical Machine Learning Syllabus, Spring 2010 http://www.cs.cmu.edu/~10702 Statistical Machine Learning is a second graduate level course in machine learning, assuming students have taken

More information

Introduction to Pattern Recognition

Introduction to Pattern Recognition Introduction to Pattern Recognition Selim Aksoy Department of Computer Engineering Bilkent University saksoy@cs.bilkent.edu.tr CS 551, Fall 2017 CS 551, Fall 2017 c 2017, Selim Aksoy (Bilkent University)

More information

White Paper. Using Sentiment Analysis for Gaining Actionable Insights

White Paper. Using Sentiment Analysis for Gaining Actionable Insights corevalue.net info@corevalue.net White Paper Using Sentiment Analysis for Gaining Actionable Insights Sentiment analysis is a growing business trend that allows companies to better understand their brand,

More information

INTRODUCTION TO DATA SCIENCE

INTRODUCTION TO DATA SCIENCE DATA11001 INTRODUCTION TO DATA SCIENCE EPISODE 6: MACHINE LEARNING TODAY S MENU 1. WHAT IS ML? 2. CLASSIFICATION AND REGRESSSION 3. EVALUATING PERFORMANCE & OVERFITTING WHAT IS MACHINE LEARNING? Definition:

More information

A Few Useful Things to Know about Machine Learning. Pedro Domingos Department of Computer Science and Engineering University of Washington" 2012"

A Few Useful Things to Know about Machine Learning. Pedro Domingos Department of Computer Science and Engineering University of Washington 2012 A Few Useful Things to Know about Machine Learning Pedro Domingos Department of Computer Science and Engineering University of Washington 2012 A Few Useful Things to Know about Machine Learning Machine

More information

About This Specialization

About This Specialization About This Specialization In 2020 the world will generate 50 times the amount of data as in 2011. And 75 times the number of information sources (IDC, 2011). Being able to use this data provides huge opportunities

More information

Automatic Text Summarization for Annotating Images

Automatic Text Summarization for Annotating Images Automatic Text Summarization for Annotating Images Gediminas Bertasius November 24, 2013 1 Introduction With an explosion of image data on the web, automatic image annotation has become an important area

More information

P(A, B) = P(A B) = P(A) + P(B) - P(A B)

P(A, B) = P(A B) = P(A) + P(B) - P(A B) AND Probability P(A, B) = P(A B) = P(A) + P(B) - P(A B) P(A B) = P(A) + P(B) - P(A B) Area = Probability of Event AND Probability P(A, B) = P(A B) = P(A) + P(B) - P(A B) If, and only if, A and B are independent,

More information

Natural Language Processing

Natural Language Processing Natural Language Processing Sentiment Analysis Potsdam, 7 June 2012 Saeedeh Momtazi Information Systems Group based on the slides of the course book Sentiment Analysis 2 --------------- ---------------

More information

Overview COEN 296 Topics in Computer Engineering Introduction to Pattern Recognition and Data Mining Course Goals Syllabus

Overview COEN 296 Topics in Computer Engineering Introduction to Pattern Recognition and Data Mining Course Goals Syllabus Overview COEN 296 Topics in Computer Engineering to Pattern Recognition and Data Mining Instructor: Dr. Giovanni Seni G.Seni@ieee.org Department of Computer Engineering Santa Clara University Course Goals

More information

Performance Analysis of Various Data Mining Techniques on Banknote Authentication

Performance Analysis of Various Data Mining Techniques on Banknote Authentication International Journal of Engineering Science Invention ISSN (Online): 2319 6734, ISSN (Print): 2319 6726 Volume 5 Issue 2 February 2016 PP.62-71 Performance Analysis of Various Data Mining Techniques on

More information

Lahore University of Management Sciences. DISC 420 Business Analytics Fall Semester 2017

Lahore University of Management Sciences. DISC 420 Business Analytics Fall Semester 2017 DISC 420 Business Analytics Fall Semester 2017 Instructors Zainab Riaz Room No. SDSB 4 38 Office Hours TBA Email zainab.riaz@lums.edu.pk Telephone 5130 Secretary/TA Sec: Muhammad Umer Manzoor, TA: TBA

More information

Lecture 1. Introduction Bastian Leibe Visual Computing Institute RWTH Aachen University

Lecture 1. Introduction Bastian Leibe Visual Computing Institute RWTH Aachen University Advanced Machine Learning Lecture 1 Introduction 20.10.2015 Bastian Leibe Visual Computing Institute RWTH Aachen University http://www.vision.rwth-aachen.de/ leibe@vision.rwth-aachen.de Organization Lecturer

More information

CSL465/603 - Machine Learning

CSL465/603 - Machine Learning CSL465/603 - Machine Learning Fall 2016 Narayanan C Krishnan ckn@iitrpr.ac.in Introduction CSL465/603 - Machine Learning 1 Administrative Trivia Course Structure 3-0-2 Lecture Timings Monday 9.55-10.45am

More information

Concession Curve Analysis for Inspire Negotiations

Concession Curve Analysis for Inspire Negotiations Concession Curve Analysis for Inspire Negotiations Vivi Nastase SITE University of Ottawa, Ottawa, ON vnastase@site.uottawa.ca Gregory Kersten John Molson School of Business Concordia University, Montreal,

More information

Pattern Classification and Clustering Spring 2006

Pattern Classification and Clustering Spring 2006 Pattern Classification and Clustering Time: Spring 2006 Room: Instructor: Yingen Xiong Office: 621 McBryde Office Hours: Phone: 231-4212 Email: yxiong@cs.vt.edu URL: http://www.cs.vt.edu/~yxiong/pcc/ Detailed

More information

STA 225: Introductory Statistics (CT)

STA 225: Introductory Statistics (CT) Marshall University College of Science Mathematics Department STA 225: Introductory Statistics (CT) Course catalog description A critical thinking course in applied statistical reasoning covering basic

More information

STID Statistics and Business Intelligence

STID Statistics and Business Intelligence STID Statistics and Business Intelligence IUT Roubaix Lille 2 University France Sylvia CANONNE Description of teaching modules. September 2014 3 Course descriptions subject to change Term 1 M1101A -Mathematics

More information

Machine Learning for SAS Programmers

Machine Learning for SAS Programmers Machine Learning for SAS Programmers The Agenda Introduction of Machine Learning Supervised and Unsupervised Machine Learning Deep Neural Network Machine Learning implementation Questions and Discussion

More information

Machine Learning for NLP

Machine Learning for NLP Natural Language Processing SoSe 2014 Machine Learning for NLP Dr. Mariana Neves April 30th, 2014 (based on the slides of Dr. Saeedeh Momtazi) Introduction Field of study that gives computers the ability

More information

Modelling Student Knowledge as a Latent Variable in Intelligent Tutoring Systems: A Comparison of Multiple Approaches

Modelling Student Knowledge as a Latent Variable in Intelligent Tutoring Systems: A Comparison of Multiple Approaches Modelling Student Knowledge as a Latent Variable in Intelligent Tutoring Systems: A Comparison of Multiple Approaches Qandeel Tariq, Alex Kolchinski, Richard Davis December 6, 206 Introduction This paper

More information

LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE

LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE LEHMAN COLLEGE OF THE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE CURRICULUM CHANGE Name of Program and Degree Award: Mathematics, BA Hegis Number: 1701.00 Program Code:

More information

Unit title: Analysis of Scientific Data and Information

Unit title: Analysis of Scientific Data and Information Unit title: Analysis of Scientific Data and Information Unit code: F/601/0220 QCF level: 4 Credit value: 15 Aim This unit develops skills in mathematical and statistical techniques used in the analysis

More information

Stochastic Gradient Descent using Linear Regression with Python

Stochastic Gradient Descent using Linear Regression with Python ISSN: 2454-2377 Volume 2, Issue 8, December 2016 Stochastic Gradient Descent using Linear Regression with Python J V N Lakshmi Research Scholar Department of Computer Science and Application SCSVMV University,

More information

STRAND 3: QUADRATIC FUNCTIONS

STRAND 3: QUADRATIC FUNCTIONS 1 STRAND 3: QUADRATIC FUNCTIONS TOPIC 3.3: APPLICATIONS Topic Notes Mathematical focus Participants put quadratic functions to use in maximization problems, learn how to extend standard textbook max/min

More information

CS519: Deep Learning. Winter Fuxin Li

CS519: Deep Learning. Winter Fuxin Li CS519: Deep Learning Winter 2017 Fuxin Li Course Information Instructor: Dr. Fuxin Li KEC 2077, lif@eecs.oregonstate.edu TA: Mingbo Ma: mam@oregonstate.edu Xu Xu: xux@oregonstate.edu My office hour: TBD

More information

The content is based on the National Council of Teachers of Mathematics (NCTM) standards and is aligned with state standards.

The content is based on the National Council of Teachers of Mathematics (NCTM) standards and is aligned with state standards. Core Algebra I provides a curriculum focused on the mastery of critical skills and the recognition and understanding of key algebraic concepts. Through a "Discovery-Confirmation-Practice"-based exploration

More information

Government of Russian Federation. Federal State Autonomous Educational Institution of High Professional Education

Government of Russian Federation. Federal State Autonomous Educational Institution of High Professional Education Government of Russian Federation Federal State Autonomous Educational Institution of High Professional Education National Research University Higher School of Economics Syllabus for the course Advanced

More information

Statistics for the Life Sciences, 5/e, Samuels, Witmer and Schaffner ISBN:

Statistics for the Life Sciences, 5/e, Samuels, Witmer and Schaffner ISBN: v Credits 4 credits Course Title Statistics Course Number STA 3100 Pre-requisite None Co-requisite (s) None (s) Hours 60 theory hours/60 clock hours Total Outside Hours 120 hours Note: A minimum of 2 hours

More information

Classification of News Articles Using Named Entities with Named Entity Recognition by Neural Network

Classification of News Articles Using Named Entities with Named Entity Recognition by Neural Network Classification of News Articles Using Named Entities with Named Entity Recognition by Neural Network Nick Latourette and Hugh Cunningham 1. Introduction Our paper investigates the use of named entities

More information

CS540 Machine learning Lecture 1 Introduction

CS540 Machine learning Lecture 1 Introduction CS540 Machine learning Lecture 1 Introduction Administrivia Overview Supervised learning Unsupervised learning Other kinds of learning Outline Administrivia Class web page www.cs.ubc.ca/~murphyk/teaching/cs540-fall08

More information

Optimal Task Assignment within Software Development Teams Caroline Frost Stanford University CS221 Autumn 2016

Optimal Task Assignment within Software Development Teams Caroline Frost Stanford University CS221 Autumn 2016 Optimal Task Assignment within Software Development Teams Caroline Frost Stanford University CS221 Autumn 2016 Introduction The number of administrative tasks, documentation and processes grows with the

More information

Agile Geoscience Ltd PO Box 336 Mahone Bay B0J 2E0 Canada

Agile Geoscience Ltd PO Box 336 Mahone Bay B0J 2E0 Canada Agile Geoscience Ltd PO Box 336 Mahone Bay B0J 2E0 Canada hello@agilegeoscience.com 1-902-980-0130 Custom training courses in creative geocomputing Did you know you can work with Agile to create the perfect

More information

MASTERING PYTHON FOR DATA SCIENCE BY SAMIR MADHAVAN DOWNLOAD EBOOK : MASTERING PYTHON FOR DATA SCIENCE BY SAMIR MADHAVAN PDF

MASTERING PYTHON FOR DATA SCIENCE BY SAMIR MADHAVAN DOWNLOAD EBOOK : MASTERING PYTHON FOR DATA SCIENCE BY SAMIR MADHAVAN PDF Read Online and Download Ebook MASTERING PYTHON FOR DATA SCIENCE BY SAMIR MADHAVAN DOWNLOAD EBOOK : MASTERING PYTHON FOR DATA SCIENCE BY SAMIR Click link bellow and free register to download ebook: MASTERING

More information

ECE-271A Statistical Learning I

ECE-271A Statistical Learning I ECE-271A Statistical Learning I Nuno Vasconcelos ECE Department, UCSD The course the course is an introductory level course in statistical learning by introductory I mean that you will not need any previous

More information

Linear Regression. Chapter Introduction

Linear Regression. Chapter Introduction Chapter 9 Linear Regression 9.1 Introduction In this class, we have looked at a variety of di erent models and learning methods, such as finite state machines, sequence models, and classification methods.

More information

CS Machine Learning

CS Machine Learning CS 478 - Machine Learning Projects Data Representation Basic testing and evaluation schemes CS 478 Data and Testing 1 Programming Issues l Program in any platform you want l Realize that you will be doing

More information

MAT 12O ELEMENTARY STATISTICS I

MAT 12O ELEMENTARY STATISTICS I LAGUARDIA COMMUNITY COLLEGE CITY UNIVERSITY OF NEW YORK DEPARTMENT OF MATHEMATICS, ENGINEERING, AND COMPUTER SCIENCE MAT 12O ELEMENTARY STATISTICS I 3 Lecture Hours, 1 Lab Hour, 3 Credits Pre-Requisite:

More information

Computer Vision for Card Games

Computer Vision for Card Games Computer Vision for Card Games Matias Castillo matiasct@stanford.edu Benjamin Goeing bgoeing@stanford.edu Jesper Westell jesperw@stanford.edu Abstract For this project, we designed a computer vision program

More information

10701: Intro to Machine Learning. Instructors: Pradeep Ravikumar, Manuela Veloso, Teaching Assistants:

10701: Intro to Machine Learning. Instructors: Pradeep Ravikumar, Manuela Veloso, Teaching Assistants: 10701: Intro to Machine Instructors: Pradeep Ravikumar, pradeepr@cs.cmu.edu Manuela Veloso, mmv@cs.cmu.edu Teaching Assistants: Shaojie Bai shaojieb@andrew.cmu.edu Adarsh Prasad adarshp@andrew.cmu.edu

More information

Course 395: Machine Learning - Lectures

Course 395: Machine Learning - Lectures Course 395: Machine Learning - Lectures Lecture 1-2: Concept Learning (M. Pantic) Lecture 3-4: Decision Trees & CBC Intro (M. Pantic & S. Petridis) Lecture 5-6: Evaluating Hypotheses (S. Petridis) Lecture

More information

Learning dispatching rules via an association rule mining approach. Dongwook Kim. A thesis submitted to the graduate faculty

Learning dispatching rules via an association rule mining approach. Dongwook Kim. A thesis submitted to the graduate faculty Learning dispatching rules via an association rule mining approach by Dongwook Kim A thesis submitted to the graduate faculty in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

More information

LEARNING AGENTS IN ARTIFICIAL INTELLIGENCE PART I

LEARNING AGENTS IN ARTIFICIAL INTELLIGENCE PART I Journal of Advanced Research in Computer Engineering, Vol. 5, No. 1, January-June 2011, pp. 1-5 Global Research Publications ISSN:0974-4320 LEARNING AGENTS IN ARTIFICIAL INTELLIGENCE PART I JOSEPH FETTERHOFF

More information

Stanford NLP. Evan Jaffe and Evan Kozliner

Stanford NLP. Evan Jaffe and Evan Kozliner Stanford NLP Evan Jaffe and Evan Kozliner Some Notable Researchers Chris Manning Statistical NLP, Natural Language Understanding and Deep Learning Dan Jurafsky sciences Percy Liang Natural Language Understanding,

More information

Machine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University. January 12, 2015

Machine Learning Tom M. Mitchell Machine Learning Department Carnegie Mellon University. January 12, 2015 Machine Learning 10-601 Tom M. Mitchell Machine Learning Department Carnegie Mellon University January 12, 2015 Today: What is machine learning? Decision tree learning Course logistics Readings: The Discipline

More information

Linear Models Continued: Perceptron & Logistic Regression

Linear Models Continued: Perceptron & Logistic Regression Linear Models Continued: Perceptron & Logistic Regression CMSC 723 / LING 723 / INST 725 Marine Carpuat Slides credit: Graham Neubig, Jacob Eisenstein Linear Models for Classification Feature function

More information

Algebra 1, Quarter 3, Unit 3.1. Line of Best Fit. Overview

Algebra 1, Quarter 3, Unit 3.1. Line of Best Fit. Overview Algebra 1, Quarter 3, Unit 3.1 Line of Best Fit Overview Number of instructional days 6 (1 day assessment) (1 day = 45 minutes) Content to be learned Analyze scatter plots and construct the line of best

More information

USING DATA MINING METHODS KNOWLEDGE DISCOVERY FOR TEXT MINING

USING DATA MINING METHODS KNOWLEDGE DISCOVERY FOR TEXT MINING USING DATA MINING METHODS KNOWLEDGE DISCOVERY FOR TEXT MINING D.M.Kulkarni 1, S.K.Shirgave 2 1, 2 IT Department Dkte s TEI Ichalkaranji (Maharashtra), India Abstract Many data mining techniques have been

More information

Analysis of Different Classifiers for Medical Dataset using Various Measures

Analysis of Different Classifiers for Medical Dataset using Various Measures Analysis of Different for Medical Dataset using Various Measures Payal Dhakate ME Student, Pune, India. K. Rajeswari Associate Professor Pune,India Deepa Abin Assistant Professor, Pune, India ABSTRACT

More information

Classifying Breast Cancer By Using Decision Tree Algorithms

Classifying Breast Cancer By Using Decision Tree Algorithms Classifying Breast Cancer By Using Decision Tree Algorithms Nusaibah AL-SALIHY, Turgay IBRIKCI (Presenter) Cukurova University, TURKEY What Is A Decision Tree? Why A Decision Tree? Why Decision TreeClassification?

More information

Hot Topics in Machine Learning

Hot Topics in Machine Learning Hot Topics in Machine Learning Winter Term 2016 / 2017 Prof. Marius Kloft, Florian Wenzel October 19, 2016 Organization Organization The seminar is organized by Prof. Marius Kloft and Florian Wenzel (PhD

More information

Database Systems Group Prof. Dr. Thomas Seidl. Topics. Praktikum Big Data Science SS 2017

Database Systems Group Prof. Dr. Thomas Seidl. Topics. Praktikum Big Data Science SS 2017 Database Systems Group Prof. Dr. Thomas Seidl Topics Overview Topics 1. Subspace Clustering 2. Search Engine 3. Graph Learning 4. Small Data Groups 2 Topic 1: Subspace Clustering In KDD1 and KDD2: learned

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

AP Statistics Course Syllabus

AP Statistics Course Syllabus AP Statistics Course Syllabus Textbook and Resource materials The primary textbook for this class is Yates, Moore, and McCabe s Introduction to the Practice of Statistics (TI 83 Graphing Calculator Enhanced)

More information

Introduction to Machine Learning

Introduction to Machine Learning 1, DATA11002 Introduction to Machine Learning Lecturer: Teemu Roos TAs: Ville Hyvönen and Janne Leppä-aho Department of Computer Science University of Helsinki (based in part on material by Patrik Hoyer

More information

Big Data Classification using Evolutionary Techniques: A Survey

Big Data Classification using Evolutionary Techniques: A Survey Big Data Classification using Evolutionary Techniques: A Survey Neha Khan nehakhan.sami@gmail.com Mohd Shahid Husain mshahidhusain@ieee.org Mohd Rizwan Beg rizwanbeg@gmail.com Abstract Data over the internet

More information

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition

Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Introduction to Ensemble Learning Featuring Successes in the Netflix Prize Competition Todd Holloway Two Lecture Series for B551 November 20 & 27, 2007 Indiana University Outline Introduction Bias and

More information

ANNA UNIVERSITY SUBJECT NAME : ARTIFICIAL INTELLIGENCE SUBJECT CODE : CS2351 YEAR/SEM :III / VI QUESTION BANK UNIT I PROBLEM SOLVING 1. What is Intelligence? 2. Describe the four categories under which

More information