Python for Predictive Data Analytics

A specialist course in Canberra

Audience: This is a course for data scientists, financial analysts, researchers, statisticians, and engineers interested in learning to use Python for analysing and visualising data.

Outcome: By the end of the course, you will have all the knowledge you need to start using Python competently for processing, analysing, modelling, and visualising various kinds of data, with a focus on time series. You will have had experience with using Python for various scripting, data-manipulation and plotting tasks with data in a variety of formats, including CSV, Excel spreadsheets, SQL databases, JSON, and API endpoints, as well as unstructured text. You will have applied powerful machine learning tools for classification, regression, and clustering, in useful practical settings on a variety of data sets. You will understand the elegance and power of the Python language and its powerful ecosystem of packages for data analytics, and you will be well-placed to continue learning more as you use it day-to-day.

Duration: 4 days

Dates: 21-24 May 2018

Venue: Training Choice, Level 3, 54 Marcus Clarke Street, Canberra ACT 2601

Format: Each topic is a mixture of hands-on exercises and expert instruction.

Instructor: Dr Edward Schofield and/or Dr Robert Layton

Prerequisites: Some experience with programming (in any language) would be beneficial. A quantitative background and familiarity with basic probability and statistics would also be beneficial.

Email: info@pythoncharmers.com
Web: pythoncharmers.com

Course Outline

Day 1: Python Basics

Day 1 covers how to use Python for basic scripting and automation tasks, including tips and tricks for making this easy. The syllabus is as follows:

- Why use Python? What's possible?
- Setting up your Python development environment (IDE, Jupyter)
- The Jupyter notebook and shell for rapid prototyping
- Modules and packages
- Python concepts: an introduction through examples
- Essential data types, tips and tricks
- Raising and handling exceptions
- Worked example: fetching and ranking real-time temperature data for global cities from a web API

Day 2: Handling, Analysing, and Presenting Data in Python

The Pandas package is an amazingly productive tool for working with and analysing data in Python. Day 2 gives a thorough introduction to analysing data with Pandas and visualising it easily:

- Reading and writing essential data formats: CSV, Excel, SQL databases, JSON, time-series
- Indexing and selecting data in Pandas
- Data fusion: joining & merging datasets
- Summarisation with "group by" operations; pivot tables
- Visualisation and statistical graphics with Seaborn
- Worked example: creating automated reports with Jupyter, Pandas and nbconvert

Day 3: Time-series, simulation, inference and modelling

Day 3 demonstrates more advanced features of Pandas for working with data, including time-series data. It then describes Monte Carlo simulation methods and walks you through using powerful Bayesian methods of inference and modelling for different kinds of data in Python:

- Time-series analysis: parsing dates, resampling, handling time-zones
- Secret weapons for Pandas: searchsorted, hierarchical indices, unstack, categorical, qgrid
- Introduction to NumPy for linear algebra and Monte Carlo simulation methods
- Classical statistics with scipy.stats and statsmodels
- Density estimation with scikit-learn
- Bayesian inference with PyMC3: parameter and model selection; incorporating prior information
- Bayesian regression; assessing reliabilities
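To give a flavour of the Day 3 time-series topics (parsing dates, handling time-zones, resampling), here is a minimal sketch using Pandas. It is not taken from the course materials: the readings, the column names, and the choice of the Australia/Sydney time zone are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical temperature readings (made-up values, for illustration only)
raw = pd.DataFrame({
    "timestamp": ["2018-05-21 00:00", "2018-05-21 12:00",
                  "2018-05-22 00:00", "2018-05-22 12:00"],
    "temp_c": [4.0, 14.0, 6.0, 16.0],
})

# Parse the date strings and use them as the index
raw["timestamp"] = pd.to_datetime(raw["timestamp"])
series = raw.set_index("timestamp")["temp_c"]

# Attach a local time zone (an assumption), then convert to UTC
local = series.tz_localize("Australia/Sydney")
utc = local.tz_convert("UTC")

# Resample the local-time series to daily means
daily_mean = local.resample("D").mean()
print(daily_mean)
```

The same pattern applies to data read from CSV or a database: parse the timestamps once, set them as the index, and let `resample` do the aggregation.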

Day 4: Machine learning

Day 4 introduces a more automated approach to modelling real-world data with several powerful machine learning algorithms using scikit-learn. The datasets are selected from a range of industries: financial, geospatial, medical, and social sciences. The syllabus is:

- Classification with scikit-learn: Naive Bayes, logistic regression, SVMs, random forests, with application to diagnosis, AI systems, and time-series prediction
- Nonlinear regression, with application to forecasting
- Clustering data with DBSCAN, with application to outlier detection
- Dimensionality reduction with PCA
- Validation and model selection
- Deploying machine learning models in production

We encourage you to bring your own data sets to the course where relevant.

Supplemental materials

We will supply you with printed course notes, cheat sheets, and a USB stick containing kitchen-sink Python installers for multiple platforms, solutions to the programming exercises, several written tutorials, and reference documentation on Python and the third-party packages covered in the course.

Instructor bio

Your trainers for the course will be selected from:

Edward Schofield

Ed has consulted to or trained over 1500 people in Python for data analytics from dozens of organisations, including AGL, the Australian Federal Police, A*STAR, Barclays, the Bureau of Meteorology, Cisco, CSIRO, Dolby, DSTG, IMC, Macquarie Bank, Shell, Telstra, Toyota, and Verizon. Ed is the co-chair of the Python for Data Science miniconf for PyCon AU, co-organises the Python user group in Melbourne, and regularly presents at conferences related to Python and data analytics in Australia and internationally. He is a former release manager of SciPy and the author of the future package. Ed holds a PhD in machine learning from Imperial College London, with application to speech and image recognition technologies. He also holds BA and MA (Hons) degrees in maths and computer science from Cambridge University. He has 20+ years of experience in programming, teaching, and public speaking.

Robert Layton

Robert is a data scientist who works across several industries including finance, information security, and transport. He is the author of the book Learning Data Mining in Python (O'Reilly Press, 2015), which has received significant praise. He is a developer of the scikit-learn package for machine learning and the author of the website Learning Tensorflow. He has presented at the last four PyCon AU conferences, at multiple international research conferences, and has given training in Python to groups of staff from companies including Cisco, Lumascape, IMC, Optus, Sportsbet, and Woolworths. Robert has a PhD in cybercrime analytics from the Internet Commerce Security Laboratory at Federation University Australia, where he was the inaugural Young Alumni of the Year in 2014 and is now an Honorary Research Fellow. Robert is also an Official Member of the Ballarat Hackerspace, where he helps grow the future-tech sector in regional Victoria.

Other information

Computer: A computer will be provided for you during the course.

Exercises: There will be practical programming exercises throughout the course. These will be challenging and fun, and the solutions will be discussed after each exercise and provided as source code on the USB sticks. During the exercises, the trainer will offer individual help and suggestions.

Timing: The course will run from 9:00 to roughly 17:00 each day, with breaks of 1 hour for lunch and 15 minutes each for morning and afternoon tea.

Personal help: Your trainer(s) will be available after the course each day for you to ask any one-on-one questions you like, whether about the course material and exercises or about specific problems you face in your work and how to use Python to solve them. We encourage you to have your own data sets ready to use if this is relevant.

Certificate of completion: We will provide you with a certificate if you complete the course and successfully answer the majority of the exercise questions.

Food and drink: We will provide lunch, morning and afternoon tea, and drinks.

Price

$825 per day per person, including GST.

Booking

To book places on the course, please contact us, or visit: https://pythoncharmers.com/training/python-for-predictive-analytics

Testimonials

Testimonials from past participants of similar courses are available at pythoncharmers.com/testimonials.

Questions?

You are welcome to contact us if you have any questions before the course. You can reach us at info@pythoncharmers.com.

About Python Charmers

Python Charmers is the leading provider of Python training in the Asia-Pacific region, based in Australia and Singapore. Python Charmers specialises in teaching programming to scientists, engineers, financial engineers, data analysts, and computer scientists in the Python language. Python Charmers' delighted training clients include the ABC, Australian Federal Police, Barclays, Bureau of Meteorology, Cisco, CSIRO, Dolby, Geoscience Australia, IMC, Primary Health Care, Shell, Toyota Technical Centre, and Verizon.

Contact

Phone: +61 1300 963 160
Email: info@pythoncharmers.com
Web: pythoncharmers.com