STATISTICAL PROGRAMMING - PYTHON

Professor: IGNACIO LARRU MARTÍNEZ
E-Mail: ilarru@faculty.ie.edu

Ignacio Larrú is a freelance Python developer. His work involves developing new products and advising tech startups on automating processes with Python. Ignacio is also a tech lead investor and CFO at K Fund, a venture capital fund specializing in technology startups. He is a former VP of investment banking at Credit Agricole and also works as a freelance consultant on the use of Big Data technologies to implement new business models. Previously he worked as an entrepreneur and founder of 6 different start-ups, ranging from online retailers to complex software in the civil sector. He started his career as an IT consultant with PricewaterhouseCoopers, developing software applications for leading financial institutions.

Published by IE Publishing Department. Last revised, November 2016.

WHY THIS COURSE?

It is easy to fall in love with Python given its simple syntax and coder-friendly structures. Since its appearance in 1991, Python has become one of the most popular interpreted programming languages. Among these interpreted languages, Python has distinguished itself by its large scientific computing community and useful libraries for data manipulation. Python provides excellent capabilities for data analysis, interactive exploratory computing and data visualization through its dedicated libraries (primarily pandas, but also NumPy and matplotlib).

This course gives a general overview of the Python programming language coupled with specific use cases for data analysis.

OBJECTIVES

The objectives of this course are as follows:
- Learn how to write Python programs
- Apply solid computer science design principles to our programs
- Learn the various data-analysis-specific functionalities in Python

METHODOLOGY

This course is organized around presentation of concepts, active discussions, programming assignments and class participation. Class participation is mandatory. Your voice is indispensable. It is important that you come to class prepared in order to enrich class discussions.

BIBLIOGRAPHY

There is no required book for the course, but students who want a reference guide to the Python language can use any of the various available resources, for example:
- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, by Wes McKinney
- Python Programming: An Introduction to Computer Science, 2nd Edition, by John Zelle

PROGRAM

SESSION 1
In this session we will discuss general course principles while we write our first program (hello world). We will also discuss simple data types and variable assignment in Python.
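For orientation, a first program of the kind Session 1 describes might look like the sketch below. It is not taken from the course material; the variable names are illustrative, and the values simply reuse facts stated elsewhere in this syllabus.

```python
# A first program: print a greeting to the screen
print("Hello, world!")

# Variable assignment with Python's simple built-in data types
course = "Statistical Programming"   # str: text
sessions = 20                        # int: whole number
participation_weight = 0.20          # float: number with a decimal part
is_mandatory = True                  # bool: True or False

# type() reports the data type of the value a variable refers to
for value in (course, sessions, participation_weight, is_mandatory):
    print(repr(value), "is of type", type(value).__name__)
```

Python infers each type from the assigned value, so no type declarations are needed.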

SESSION 2 CONDITIONAL EXECUTION, ITERATION AND PROGRAM PLANNING
In this session we will learn how to code conditions (if/else) in our programs, together with while/for loops, and how to write pseudocode to help us plan our software designs.

SESSION 3 LISTS, TUPLES, SETS AND DICTIONARIES
Once we know how to manage the execution flow of our programs, we will learn advanced data structures like lists and dictionaries.

SESSION 4 FUNCTIONS
We don't want to write the same code every time we need to perform a certain action. In this session we will learn how to encapsulate code in functions so we can reuse it efficiently.

SESSION 5 FILES AND EXCEPTIONS
In this session we will learn how to use files to persist the state of our program and how to manage exceptions to capture unexpected behavior in our execution flow.

SESSION 6 STRING, DATE MANIPULATION & REGULAR EXPRESSIONS
String manipulation is heavily used in any data analysis project. In this session we will learn how to manage textual data and dates in Python.

SESSION 7 OBJECT ORIENTED PROGRAMMING
We can take code encapsulation one step further by using object oriented design in our programs. In this session we will learn the basic principles of object oriented programming (objects, classes, properties and methods), as we will work with objects for the rest of the course.

SESSION 8 OBJECT ORIENTED PROGRAMMING
We can take code encapsulation one step further by using object oriented design in our programs. In this session we will learn how

SESSION 9 ADVANCED OBJECT ORIENTED PROGRAMMING
In this session we will continue with advanced OOP topics like inheritance, duck typing and interfaces.
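As an illustration of the inheritance and duck typing named in Session 9, here is a minimal sketch. It is not part of the course material, and the class and function names (Shape, Square, Circle, Field, total_area) are hypothetical.

```python
import math


class Shape:
    """Base class: subclasses inherit describe() and must provide area()."""

    def area(self):
        raise NotImplementedError("subclasses must implement area()")

    def describe(self):
        # Inherited by every subclass; calls whichever area() the object has
        return f"{type(self).__name__} with area {self.area():.2f}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


class Circle(Shape):
    def __init__(self, radius):
        self.radius = radius

    def area(self):
        return math.pi * self.radius ** 2


class Field:
    """Not a Shape subclass, but it also offers an area() method."""

    def __init__(self, width, length):
        self.width = width
        self.length = length

    def area(self):
        return self.width * self.length


def total_area(items):
    # Duck typing: any object with an area() method is accepted,
    # regardless of its class or its ancestors
    return sum(item.area() for item in items)


print(Square(2).describe())
print(total_area([Square(2), Circle(1), Field(3, 4)]))
```

Square and Circle reuse describe() through inheritance, while Field works in total_area() only because it happens to expose the same method.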

SESSIONS 10 & 11 GUI DEVELOPMENT
In these sessions we will learn how to develop a graphical user interface using the tkinter module, so our users can interact with our program more easily through a window-based interface.

SESSION 12 NUMPY BASICS: ARRAYS AND VECTORIZED COMPUTATION
In this session we will start with the data analysis functionalities of Python, from the multidimensional array object to linear algebra operations and random numbers.

SESSION 13
Pandas data structures are a powerful tool for data management in Python. In this session we will start working with them.

SESSION 14 PLOTTING AND VISUALIZATION
Data visualization is a very important step in any data analysis project. In this session we will review plotting functions in Python (matplotlib module) to help us achieve our goals.

SESSION 15 DESCRIPTIVE AND INFERENTIAL STATISTICS
In this session we will learn the electronic data analysis capabilities of Python, together with the most common statistical tests to validate our hypotheses.

SESSION 16 REGRESSION ANALYSIS IN PYTHON
In this session we will start the review of the SciPy module with its capabilities regarding regression analysis (linear and logistic).

SESSION 17 CLASSIFICATION ALGORITHMS
In this session we will discuss the various classification algorithms in the SciPy module and how to evaluate them.

SESSION 18 CLUSTERING
In this session we will discuss the various clustering alternatives available in Python.
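To give a flavor of the descriptive statistics listed under Session 15, the sketch below computes a few summary measures. It deliberately uses Python's standard-library statistics module, not the NumPy/pandas stack the course covers, so that it runs without extra installs. The sample data is invented for illustration.

```python
import statistics

# Hypothetical sample: exam scores for a small group of students
scores = [62, 75, 75, 81, 90, 94]

mean = statistics.mean(scores)       # arithmetic mean
median = statistics.median(scores)   # middle value of the sorted data
mode = statistics.mode(scores)       # most frequent value
stdev = statistics.stdev(scores)     # sample standard deviation (n - 1 denominator)

print(f"mean={mean} median={median} mode={mode} stdev={stdev:.2f}")
```

The same measures are available on NumPy arrays and pandas Series; note that such libraries may default to a different denominator for the standard deviation, so the convention should be checked before comparing results.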

SESSION 19 RECOMMENDATION
In this session we will review how to implement collaborative filtering algorithms in Python.

SESSION 20 FINAL EXAM

EVALUATION METHOD

Final evaluation will be based on (1) engagement in the classroom, (2) final exam and (3) a course assignment to be prepared in groups, with a breakdown of percentage contribution as follows:

Criteria               Score %
Class Participation    20%
Individual work        50%
Workgroups             30%
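The breakdown above amounts to a weighted average of the three components. A minimal sketch follows, assuming every component is scored on the same scale (the syllabus does not state one). The function name, the dictionary keys and the example scores are hypothetical.

```python
# Weights taken from the evaluation table above
WEIGHTS = {
    "class_participation": 0.20,
    "individual_work": 0.50,
    "workgroups": 0.30,
}


def final_grade(scores):
    """Return the weighted average of the component scores.

    `scores` maps each criterion name in WEIGHTS to a score;
    all scores are assumed to share one scale (for example 0 to 10).
    """
    return sum(weight * scores[name] for name, weight in WEIGHTS.items())


# Hypothetical student: 8 for participation, 7 individual, 9 group work
example = {"class_participation": 8, "individual_work": 7, "workgroups": 9}
print(round(final_grade(example), 2))  # 0.2*8 + 0.5*7 + 0.3*9
```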