SYLLABUS
DSCI 4520.001 Introduction to Data Mining
Fall 2017

CLASS (DAY/TIME): Wednesdays 6:30-9:20, BLB 070
INSTRUCTOR: Dr. Nick Evangelopoulos
OFFICE HRS: TW 1:00-2:00pm at BLB 365D
CONTACT INFO:
OFFICE PHONE: 940-565-3056
E-MAIL (preferred): Nick.Evangelopoulos@unt.edu

Textbooks in printed and PDF file format
Kattamuri Sarma, Predictive Modeling with SAS Enterprise Miner, Second Edition, SAS Press 2013, ISBN: 978-1-60764-767-6 (required printed text)
Data Mining Using SAS Enterprise Miner, A Case Study Approach, 3rd Edition, SAS Publishing 2013 (required free PDF), or ISBN 978-1-61290-638-6 (optional printed text)
Getting Started with SAS Enterprise Miner 14.1, SAS Publishing 2016 (required free PDF)
Getting Started with SAS Text Miner 13.2, SAS Publishing 2014 (required free PDF)
Getting Started with SAS Enterprise Miner 5.3, SAS Pub. 2008 (optional/recommended free PDF)

Software
IBM SPSS Statistics 24, IBM SPSS Modeler 15/18, SAS Enterprise Miner 14.1, SAS Text Miner 14.1. All of these are available at the CoB lab, physically and via VMware.

Blackboard Learn
The course is on Blackboard Learn. Please check frequently for updates.

Course Description
Introduction to Data Mining. 3 hours. Knowledge discovery in large databases, using data mining tools and techniques. Topics include data exploration, modeling, and model evaluation. Decision making in a case-embedded business environment is emphasized.
Prerequisite(s): DSCI 3710; BCIS 3610; 2.7 UNT GPA (2.7 transfer GPA if no courses taken at UNT); a grade of C or better in each previously taken DSCI course.

Purpose of the Course
This course deals with the problem of extracting information from large databases and designing data-based decision support systems. The extracted knowledge is subsequently used to support human decision-making in the areas of summarization, prediction, and the explanation of observed phenomena (e.g. patterns, trends, and customer behavior).
Techniques such as visualization, statistical analysis, decision trees, and neural networks can be used to discover relationships and patterns that shed light on business problems. This course will examine methods for transforming massive amounts of data into new and useful information, uncovering factors that affect purchasing patterns, and identifying potential profitable investments and opportunities.

© 2017 Nicholas Evangelopoulos

Learning Objectives
1. Understand the problems and opportunities when dealing with extremely large databases.
2. Review data visualization software used for interpreting complex patterns in multidimensional data. Learn to identify what information is useful and what is not.
3. Provide an understanding of predictive models and algorithms, as well as exploratory algorithms.
4. Examine all phases of decision making, including discovery and data query, data analysis and confirmation, presentation, and implementation of results.

Class Attendance
Regular class attendance and informed participation are expected.

Academic Integrity
This course adheres to the UNT policy on academic integrity. The policy can be found at http://vpaa.unt.edu/academic-integrity.htm. If you engage in academic dishonesty related to this class, you will receive a failing grade on the test or assignment, or a failing grade in the course. In addition, the case may be referred to the Dean of Students for appropriate disciplinary action.

Students with Disabilities
The College of Business complies with the Americans with Disabilities Act in making reasonable accommodations for qualified students with disability. If you have an established disability and would like to request accommodation, please see your instructor as soon as possible. You will need to register with the UNT Office for Disability Accommodation.

Deadlines
Dates of drop deadlines, final exams, etc., are published in the university catalog and the schedule of classes. Please be sure you stay informed about these dates.

Student Perceptions of Teaching (SPOT)
Student Perceptions of Teaching (SPOT) utilizes IASystem and is a requirement for all organized classes at UNT. This short Web-based survey will be available to you at the end of the semester, providing you a chance to comment on how this class is taught.
I am very interested in this feedback from my students, as I work to continually improve my teaching. I consider SPOT to be an important part of your class participation.

Cell Phones
As a courtesy to your instructor and to your fellow classmates, you are asked to set your cell phone to vibrate. In case of a personal emergency, if you must use your cell phone, you are asked to step out of the classroom.

Incomplete Grade (I)
The grade of "I" is not given except for rare and very unusual emergencies, as per University guidelines. An "I" grade cannot be used as a substitute for poor performance in class. If you think you will not be able to complete the class satisfactorily, please drop the course.

Campus Closures
Should UNT close campus, it is your responsibility to keep checking your official UNT e-mail account (EagleConnect) to learn if your instructor plans to modify class activities, and how. This may include changing assignment due dates, rescheduling quizzes and exams, etc.

Point Allocation DSCI 4520
Homework exercises (8 exercises)                  25%
In-class quizzes (8 quizzes, 3 dropped)            5%
Mid-term Exam (in-class)                          25%
Final Exam (take-home)                            20%
Project (4 individual parts and 1 group part)     25%
TOTAL                                            100%

Letter Grades:
90% or more = A
80% or more = B
70% or more = C
60% or more = D
Below 60% = F

Homework Exercises
There will be 8 homework exercises that you will have to turn in. The exercises use IBM SPSS Statistics, IBM SPSS Modeler, SAS Enterprise Miner, and SAS Text Miner. The homework exercises ask you to perform certain types of analysis, capture screen shots, and answer questions. Related handouts and PowerPoint slides with data descriptions, step-by-step instructions, and assignment details will be available on Blackboard. HW7 closely follows the text in Getting Started with SAS Text Miner, referred to as the GSTM text below. Homework is turned in electronically using Blackboard, in the form of a report document. If you turn in your HW report late, 50% of HW credit is awarded.

HW1. Multiple Regression for TargetD using IBM SPSS Statistics. MYRAW data.
HW2. Logistic Regression for TargetB using IBM SPSS Statistics. Small sample effects. MYRAW data.
HW3. Overview of SEMMA process in SAS Enterprise Miner. Decision Tree and Logistic Regression. Model comparison. HMEQ data.
HW4. Scoring, Reporting in SAS EM. HMEQ data.
HW5. Clustering in SAS EM. SHOESTORE data.
HW6. Association Analysis in SAS EM. ASSOCS data.
HW7. Text Analytics in SAS Text Miner. Text cleanup, synonyms, stop list, topic extraction, and predictive modeling using text data. VAEREXT data. Based on the GSTM text.
HW8. Introduction to IBM SPSS Modeler. Decision Tree and Logistic Regression. HMEQ data.
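
For students who want to estimate their standing, the point allocation, late-homework rule, and letter-grade cutoffs stated above can be sketched in Python. This is only an illustrative sketch, not an official grade calculator: all function and variable names are hypothetical, every component is assumed to be scored on a 0-100 scale, the 3 dropped quizzes are assumed to be the 3 lowest scores, and the late rule is assumed to mean that a late report earns half of the score it would otherwise receive.

```python
# Weights taken from the Point Allocation table (fractions of the course grade).
WEIGHTS = {
    "homework": 0.25,
    "quizzes": 0.05,
    "midterm": 0.25,
    "final": 0.20,
    "project": 0.25,
}

LATE_HW_CREDIT = 0.5   # assumption: a late HW report earns 50% of its score
QUIZZES_DROPPED = 3    # assumption: the 3 lowest quiz scores are the ones dropped


def homework_average(scores, late_flags):
    """Average HW scores (0-100), halving any score flagged as late."""
    adjusted = [
        score * LATE_HW_CREDIT if late else score
        for score, late in zip(scores, late_flags)
    ]
    return sum(adjusted) / len(adjusted)


def quiz_average(scores, dropped=QUIZZES_DROPPED):
    """Average quiz scores (0-100) after dropping the lowest `dropped` scores."""
    kept = sorted(scores)[dropped:]
    return sum(kept) / len(kept)


def course_percentage(components):
    """Weighted sum of component averages; `components` maps name -> 0-100 score."""
    return sum(WEIGHTS[name] * components[name] for name in WEIGHTS)


def letter_grade(percentage):
    """Map a course percentage to a letter using the cutoffs in the syllabus."""
    for cutoff, letter in ((90, "A"), (80, "B"), (70, "C"), (60, "D")):
        if percentage >= cutoff:
            return letter
    return "F"
```

A student would compute `homework_average` and `quiz_average` first, then pass those two values together with the mid-term, final, and project scores to `course_percentage`, and finally call `letter_grade` on the result.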

Term Project
This course has a term project. You will be asked to analyze data related to the KDD-cup 98, an international competition for professional data miners. The data set will be available on Blackboard. Handouts describing what you have to do will be distributed in class. During the first 4 parts you will work individually and submit your work as a Word document that includes screen shots from Enterprise Miner and answers to various questions as described on the handouts. You will turn in your reports by uploading them on Blackboard. Grading and late penalty policies for PR1-PR4 are the same as with HW1-HW8.

During the last part of the project you will form groups. The maximum group size will be 6. Groups will be self-managed. If the group is not satisfied with some member's contribution, they may choose to dismiss that person from the group. In such a case, an alternative individual assignment will be given to the dismissed group member. The group will turn in a single PR5 report, listing all group member names, in printed hard copy format (i.e., brought to class, not uploaded on Blackboard). A summary of the project parts follows below.

Topic                                                             Work type
PR1. Open the data, produce statistics and graphs                 Individual
PR2. Decision Trees                                               Individual
PR3. Regression                                                   Individual
PR4. Neural Networks                                              Individual
PR5. Final Written Report. Comparison and evaluation of           Group
     3 models (Decision tree, logistic regression, neural net)

DSCI 4520 TIME SCHEDULE Fall 2017

The schedule below is a tentative outline for the semester. It is meant to be a guide and several items are subject to change. Certain topics may be stressed more or less than indicated.

Date      Topics                                                  Assignment due
Aug. 30   Intro to Data Mining, Ch1; Multiple Linear Regression
Sept. 6   Logistic Regression, Ch 6; Stepwise Procedure           HW1 (regression in SPSS, MYRAW)
Sept. 13  SEMMA, CRISP-DM, Model comparison, Ch7                  HW2 (Log Reg in SPSS, MYRAW)
Sept. 20  Scoring & Deployment                                    HW3 (SEMMA, HMEQ)
Sept. 27  Decision Trees, Sarma Ch 4                              HW4 (scoring, HMEQ)
Oct. 4    Decision Trees, Sarma Ch 4                              PR1 (data explor., DONOR_RAW1)
Oct. 11   Neural Networks, Sarma Ch 5                             PR2 (trees, DONOR_RAW1, 2, 3)
Oct. 18   Exam Review                                             PR3 (reg, DONOR_RAW)
Oct. 25   NO CLASS (team time)                                    PR4 (neural nets, DONOR_RAW)
Nov. 1    *** Exam 1 (in-class) ***
Nov. 8    Clustering, Association Analysis
Nov. 15   Text Mining, Sarma Ch. 9                                HW6 (market basket, ASSOCS); HW5 (clustering, SHOESTORE)
Nov. 22   Buffer lecture (use as needed)                          HW7 (text mining, VAEREXT)
Nov. 29   Buffer lecture (use as needed)                          HW8 (SPSS Modeler, HMEQ)
Dec. 6    Course review; Take-home final handed out               PR5 (project report, hard copy)
Dec. 13   *** Final Exam (take-home, due 11:59PM on Blackboard) ***