Business Analytics and Information Tech COURSE NUMBER: 33:136:494 COURSE TITLE: Data Mining and Business Intelligence

COURSE DESCRIPTION

This course presents computing tools and concepts for all stages of dealing with the modern data deluge: statistical computing at the center, but also the essential surrounding tasks, including data organization, presentation of results, and the user interface. This approach is needed to deal with the challenges posed by modern technology, challenges that are also opportunities for better use of data. The size and complexity of data sources have increased enormously, while the importance of learning from the data has been recognized as never before. New modes of computing such as large-scale parallelism and cloud computing can help, but require new approaches to programming. The key challenge, however, is to use our own time effectively by choosing the best programming approach for each stage of a project.

The course also covers linear and polynomial regression, logistic regression, and linear discriminant analysis; cross-validation and the bootstrap; model selection and regularization methods (ridge and lasso); nonlinear models, splines, and generalized additive models; tree-based methods, random forests, and boosting; support vector machines; and some unsupervised learning: principal components and clustering (k-means and hierarchical). Computing is done in R, through tutorial sessions and homework assignments.

We present a range of computing paradigms and corresponding languages, each designed for ease of use but also providing a rich set of tools. We use the R language and the thousands of packages written for it for core statistical computing. This course also presents concepts and techniques related to big data analytics. Big Data Analytics with R and Hadoop exposes students to the paradigm of Mining of Massive Datasets.

COURSE MATERIALS

Recommended Textbooks:
1. Software for Data Analysis by John Chambers, Springer, 2008 (PDF downloadable from Rutgers Library for free).
2. An Introduction to Statistical Learning with Applications in R by Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani: Springer.
3. Big Data Analytics with R and Hadoop by Vignesh Prajapati: Packt Publishing Open Source (http://www.packtpub.com/big-data-analytics-with-r-and-hadoop/book).
4. Mining of Massive Datasets by Anand Rajaraman, Jure Leskovec, Jeffrey D. Ullman: downloadable free from the Stanford University InfoLab (http://infolab.stanford.edu/~ullman/mmds/book.pdf).

Reference Textbooks:
ggplot2: Elegant Graphics for Data Analysis, Hadley Wickham, Springer, 2009.
Learning Python, Mark Lutz, O'Reilly.
Parallel R, O'Reilly.
Advanced R Development (forthcoming) by Hadley Wickham. See Advanced R Wiki.
Visualizing Data, Ben Fry, O'Reilly.

Prerequisites: No previous knowledge of programming languages is required. However, those of you who are familiar with some other language, particularly C or a C derivative, will have an easier ride in the first few weeks. You need to have access to a personal computer (Windows, Mac, or Unix will all work). You need to be able to download and install software on this machine. You also need to have access to the internet.

CLASS ORGANIZATION & ADMINISTRATION

Attention: This course is fundamentally different from other courses you have taken or will take in this program. It is not about learning a few formulas, principles, and definitions and applying them using the inventory of skills you have already acquired in your previous education. This course is about expanding exactly this inventory of skills, which forms the underlying basis of your education, to a totally new area, and developing a way of thinking that is unlike those you are employing in other coursework. Programming is not easy for those who have no prior experience with it, yet it becomes easy as you practice. Programming projects and homework are the heart and soul of this course. You have to do them in order to learn. Therefore, you may very well need to spend more time working on this course than on any other, practicing how to write programs. This is the only way you can acquire a skill essentially different from others that you already have.

Attendance: Regular attendance is compulsory.
You are not allowed to check your email, access websites not related to the course, or work on anything beyond the scope of this course during class time.

Assignments: You may have discussions with your class members, but you have to submit your own work. Please be sure to keep a copy of each assignment in case there is any problem with your hand-in/online submission or you need to use it later this semester. Assignments must be submitted before the beginning of class on the specified due date. No late submissions will be accepted.

Exams: There will be no make-up exams. You are required to present written proof for situations such as going to an emergency room due to unexpected and serious illness. Chatting during the exam is not allowed. Email communication during the exam will be considered cheating. No collaboration between class members will be allowed during any exam. There will be no extra-credit project.

Collaboration and Cheating: Collaboration of any kind is strictly forbidden on all exams and quizzes. Any violations that I detect will be formally prosecuted. Students should familiarize themselves with the RBS honor code pledge, "I pledge, on my honor, that I have neither received nor given any unauthorized assistance on this examination (assignment)." See http://academicintegrity.rutgers.edu/academic-integrityat-rutgers for more information.

FINAL GRADE ASSIGNMENT

In-class work, Assignments   10%
Exam I                       20%
Exam II                      20%
Project                      15%
Final                        35%

I reserve the right to make changes to the grade calculation scheme.
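As a rough illustration of how the weights above combine, here is a minimal Python sketch. It assumes each category is scored on a 0-100 scale and that the final grade is a plain weighted average; the syllabus does not state either, and the function name and the example scores are hypothetical.

```python
# Category weights as listed in the syllabus (fractions of the final grade).
WEIGHTS = {
    "In-class work, Assignments": 0.10,
    "Exam I": 0.20,
    "Exam II": 0.20,
    "Project": 0.15,
    "Final": 0.35,
}


def final_grade(scores, weights=WEIGHTS):
    """Return the weighted average of per-category scores (assumed 0-100 scale)."""
    if set(scores) != set(weights):
        raise ValueError("scores must cover exactly the weighted categories")
    return sum(weights[name] * scores[name] for name in weights)


# Hypothetical scores, for illustration only.
example_scores = {
    "In-class work, Assignments": 90,
    "Exam I": 80,
    "Exam II": 85,
    "Project": 95,
    "Final": 88,
}

if __name__ == "__main__":
    print(round(final_grade(example_scores), 2))
```

Because the Final carries 35% of the grade, a change of ten points on the Final moves the overall result by 3.5 points under this assumed scheme.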

COURSE SCHEDULE

Business Analytics and Information Tech (33:136:494)

Week of        Week   Topic
01/20          1      Introduction to Data Mining and Business Intelligence
                      Functional programming and R; objects in R
01/27-02/24    2-6    Introduction to Statistical Learning: chapters 1 & 2
                      Dataframes in R
                      R packages: design, checks, publishing
                      Introduction to Statistical Learning: chapter 3
                      S4 Classes and Methods
                      Introduction to Statistical Learning: chapter 4
                      OOP computing model in R, Reference Classes and other languages
                      Introduction to Statistical Learning: chapter 5
                      Databases, SQL, ODBC, drivers and interfaces from R (DBI)
                      XML, Xschema, XSL
                      Introduction to Statistical Learning: chapter 6
                      Intersystem interfaces: R and C, C++, Python, Java, etc.
                      Spreadsheet model of computing, interface to R
                      Introduction to Statistical Learning: chapter 7
03/03          7      Exam I
03/10          8      Data Visualization: R graphics, ggplot2, graphs
                      Introduction to Statistical Learning: chapter 8
03/17          9      SPRING BREAK
03/24          10     Debugging R, interactively in R and at the C level
                      Introduction to Statistical Learning: chapter 9
03/31          11     Large computations and large data; vectorizing; measuring efficiency
                      Cluster Computing, MPI, R facilities, CUDA examples if time permits
                      Introduction to Statistical Learning: chapter 10
04/07          12     Cluster Computing, MPI, R facilities, CUDA examples if time permits
04/14          13     Exam II
04/21-04/28    14-15  Map-reduce computations, Hadoop
                      Web-based interfaces, libraries, publishing. Examples
                      Review
05/08          16     Final Exam Period 05/08 to 05/14

Scholastic Dishonesty Policy

The University defines academic dishonesty as cheating, plagiarism, unauthorized collaboration, falsifying academic records, and any act designed to avoid participating honestly in the learning process. Scholastic dishonesty also includes, but is not limited to, providing false or misleading information to receive a postponement or an extension on assignments, and submission of essentially the same written assignment for two different courses without the permission of faculty members. The purpose of assignments is to provide individual feedback as well as to get you thinking. Interaction for the purpose of understanding a problem is not considered cheating and will be encouraged. However, the actual solution to problems must be one's own.