CSCI 6366.01, Data Mining and Warehousing Spring 2015

CSCI 6366.01, Data Mining and Warehousing Spring 2015

Instructor: Zhixiang Chen
Office: ENGR 3.272
Phone: 665-3520
Email: zchen@utpa.edu
WWW Home Page: faculty.utpa.edu/zchen/
Office Hours: Monday Tuesday Wednesday Thursday Friday 10:45 PM -- 11:45 PM 4:35 AM -- 5:35 AM
Lectures: CSCI 6366 Thursday 5:45 -- 8:25 PM ENGR 1.272

Course Description:
UTPA Graduate Catalog: CSCI 6366 Data Mining and Warehousing. As a multidisciplinary field, data mining draws on work from areas including database technology, artificial intelligence, machine learning, neural networks, statistics, information retrieval, and data visualization. Theoretical and practical methods will be presented on knowledge discovery and systems design and implementation.

Text and Materials:
The textbook is "Introduction to Data Mining", Pang-Ning Tan, Michael Steinbach, Vipin Kumar, 2nd (or first) edition, Pearson/Addison Wesley.

Other suggested materials:

Will be given in class as the semester progresses.

Prerequisites:
CSCI 6305 Foundation of Algorithms, Data and Programming Languages in Computer Science: In-depth analysis of computing algorithms and data structures for implementation in the context of software engineering design using structured programming languages.

Course Topics:

Introduction
  o What Is Data Mining?
  o Motivating Challenges
  o The Origins of Data Mining
  o Data Mining Tasks

Data
  o Types of Data
  o Data Quality
  o Data Preprocessing
  o Measures of Similarity and Dissimilarity

Exploring Data
  o The Iris Data Set
  o Summary Statistics
  o Visualization
  o OLAP and Multidimensional Data Analysis

Classification: Basic Concepts, Decision Trees, and Model Evaluation
  o Preliminaries
  o General Approach to Solving a Classification Problem
  o Decision Tree Induction
  o Model Overfitting
  o Evaluating the Performance of a Classifier
  o Methods for Comparing Classifiers

Classification: Alternative Techniques
  o Rule-Based Classifier
  o Nearest-Neighbor Classifiers
  o Bayesian Classifiers
  o Artificial Neural Network (ANN)
  o Support Vector Machine (SVM)
  o Ensemble Methods
  o Class Imbalance Problem

Association Analysis: Basic Concepts and Algorithms
  o Problem Definition
  o Frequent Itemset
  o Rule Generation
  o Compact Representation of Frequent Itemsets
  o Alternative Methods for Generating Frequent Itemsets
  o FP-Growth Algorithm
  o Evaluation of Association Patterns
  o Effect of Skewed Support Distribution

Association Analysis: Advanced Concepts
  o Handling Categorical Attributes
  o Handling Continuous Attributes
  o Handling a Concept Hierarchy
  o Sequential Patterns
  o Subgraph Patterns
  o Infrequent Patterns

Cluster Analysis: Basic Concepts and Algorithms
  o Overview
  o K-means
  o Agglomerative Hierarchical Clustering
  o DBSCAN
  o Cluster Evaluation

Cluster Analysis: Additional Issues and Algorithms
  o Characteristics of Data, Clusters, and Clustering Algorithms
  o Prototype-Based Clustering
  o Density-Based Clustering
  o Graph-Based Clustering
  o Scalable Clustering Algorithms
  o Which Clustering Algorithm?

Anomaly Detection
  o Preliminaries
  o Statistical Approaches
  o Proximity-Based Outlier Detection
  o Density-Based Outlier Detection
  o Clustering-Based Techniques

Course Objectives:
After completing this course, you should be able to:
  o understand algorithms and methods of data mining
  o develop data mining programs and applications
  o program using available data mining tools and general-purpose languages
  o understand analysis, metrics, visualization and navigation of data mining results
  o use a few commercial data mining tools
  o know basic techniques for both directed and undirected knowledge discovery
  o know and use software package techniques for mining
  o have a good understanding of data mining techniques: association rules, clustering, anomaly detection, etc.
  o design data schemas for a warehouse environment

Student Learning Outcomes:

Upon successful completion of the course, students are able to:
  o understand the basic principles of the primary data mining techniques
  o understand the difference between data mining, data warehousing, machine learning, etc.
  o design mining models and manage databases to enable data mining technologies as part of larger systems

Exam, Assignment and Grading:

  Midterm                         20%
  Final                           30%
  Project one                     10%
  Project two                     10%
  Project three                   10%
  Term paper (just one)           10%
  Presentation of term paper       5%
  Attendance                       5%
  Total                          100%

The letter grade will be determined as follows:
  A: 90-100%
  B: 80-89%
  C: 70-79%
  D: 60-69%
  F: 0-59%

Assignment Policies:
All assignments must be in the instructor's hands before class on the due date, which will be specified on each assignment. Late assignments will be accepted up to two days late with a one-time 30% late penalty. Any work submitted more than two days past the deadline will not be accepted. Assignments will be graded on the basis of correctness, logic, clarity, motivation, and style. Unless stated otherwise, all assignments are individual assignments and are expected to be a student's own work. General discussions regarding understanding problems are encouraged, but giving or receiving major sections of solutions to problems will be considered cheating and will be dealt with on an individual basis.

Attendance:
You are responsible for all materials covered in class, the textbook, and homework assignments.

Integrity:

Cheating of any kind will not be tolerated. Any assignment or exam that is handed in must be your own work. However, talking with one another to understand the material better is strongly encouraged. Recognizing the distinction between cheating and cooperation is very important.

If you copy someone else's solution, you are cheating. If you let someone else copy your solution, you are cheating. We will not distinguish between the person who copied a solution and the person whose solution was copied. Both people will be treated as cheaters. If someone dictates a solution to you, you are cheating. Everything you hand in must be in your own words, based on your own understanding of the solution.

If someone helps you understand the problem during a high-level discussion, you are not cheating. We strongly encourage you to help one another understand the material presented in class, in the book, and general issues relevant to the assignments.

When taking an exam, you must work independently. Any collaboration during an exam will be considered cheating. When cheating is caught, zero marks will be given for the work involved, and the case will be forwarded to the Department Chair and beyond if necessary.

ADA Announcement:
If you have a documented disability which will make it difficult for you to carry out the work as I have outlined here and/or if you need special accommodation/assistance due to a disability, please contact the Office of Services for Persons with Disabilities (OSPD), Emilia Ramirez-Schunior Hall, Room 1.101, immediately, or the Associate Director at MAUREEN@UTPA.EDU, Ext. 7005. Appropriate arrangements/accommodations can be arranged. Verification of disability and processing of special services required, such as note takers, extended test time, and separate accommodations for testing, will be determined by OSPD. Please do not assume adjustments/accommodations are impossible. Please consult with the Associate Director, OSPD, at Ext. 7005.

Additional Policies:

Collaboration:
All assignments in this course are to be done individually. This does not mean that you cannot discuss anything about this course with others. What it does mean is that anything that you hand in must accurately represent your knowledge and work.

Plagiarism:
This class will heavily involve the use of the written works of others. Your own written work will involve discussing the ideas of others. When using the ideas of others, it is important to acknowledge whose ideas you are using, and to clearly distinguish the ideas of others from your own. To convey the impression, whether inadvertently or deliberately, that another's work is your own, is called plagiarism. Plagiarism is a serious offense in the university.
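
Worked example (illustrative only, not part of the official syllabus): the weights and letter-grade ranges listed under "Exam, Assignment and Grading" can be combined as in the short Python sketch below. The names WEIGHTS, course_percentage, and letter_grade are hypothetical. The sketch assumes every component is scored on a 0-100 scale and that a fractional total such as 89.5 stays in the lower band; the syllabus does not state a rounding rule, so the instructor's actual practice may differ.

```python
# Weights as listed in the syllabus, in percent of the final grade.
WEIGHTS = {
    "midterm": 20,
    "final": 30,
    "project_one": 10,
    "project_two": 10,
    "project_three": 10,
    "term_paper": 10,
    "presentation": 5,
    "attendance": 5,
}


def course_percentage(scores):
    """Combine per-component scores (each assumed to be 0-100) into one percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a percentage to a letter using the ranges in the syllabus.

    Assumption: no rounding is applied, so 89.5 falls below the 90 cutoff.
    """
    if percentage >= 90:
        return "A"
    if percentage >= 80:
        return "B"
    if percentage >= 70:
        return "C"
    if percentage >= 60:
        return "D"
    return "F"


if __name__ == "__main__":
    example = {
        "midterm": 80,
        "final": 90,
        "project_one": 100,
        "project_two": 100,
        "project_three": 100,
        "term_paper": 70,
        "presentation": 90,
        "attendance": 100,
    }
    total = course_percentage(example)
    print(total, letter_grade(total))
```

With these made-up scores the weighted total is 89.5, which the sketch maps to a B because of the no-rounding assumption.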