Syllabus
Data Mining (LIS 4070)

1. Course Information

Course #/Title: LIS 4070: Data Mining (3 Credits)
Quarter: Summer 2014, MTW, 4:00pm-6:20pm, KAR 305
Meetings: June 16, 2014 - July 2, 2014

2. Faculty Information

Instructor: Shimelis Assefa, PhD.
Contact Information: P. 303-871-6072; Email: sassefa@du.edu; Office: KAR 244
Office Hours: Monday 2:00-4:00pm, Wed 2:00-4:00pm, and other times by appointment.

3. Course Description

The explosion of available data has revolutionized how research is done and decisions are made. This course introduces popular data mining methods for discovering knowledge from data. The principles and theories of data mining methods will be discussed, while the focus of the course will be on applications of data mining techniques to problem solving and decision making. Topics will include data preparation, concept description, classification, prediction, clustering, association, and visualization. Students will also acquire hands-on experience using state-of-the-art software to develop data mining solutions. Through the exploration of the concepts and techniques of data mining and practical exercises, students will develop skills that can be applied to business, science, or other organizational problems.

This course will use library and educational datasets and questions as examples, and thus is specially tailored to LIS/MCE students. However, the knowledge, tools, and techniques covered in the course are equally applicable to other domains. We welcome anyone who is interested in finding solutions and answers from data.

4. Course Materials

Required textbook (available via DU AAC EBL database):
Witten, I.H., Frank, E., & Hall, M.A. (2011). Data Mining: Practical Machine Learning Tools and Techniques. 3rd edition. Burlington: Elsevier Science.

5. Learning Outcomes

- Explain the fundamental processes and concepts of data mining.
- Apply common data mining techniques to real-world problems.
- Explore and evaluate contemporary data mining systems.

6. Method of Instruction

We conclude the face-to-face instruction on July 2. However, the class remains available and students continue to work on their assignments through the end of the summer quarter, August 14. Presentation slides, lab exercise files, and datasets are available online on Canvas. We will have lectures and labs: lectures and discussions cover the material, and labs investigate small examples for each topic. The textbook supplemental materials will be available online. Weka 3.6.11 should be downloaded and installed on your computer from this source: http://www.cs.waikato.ac.nz/~ml/weka/downloading.html.

7. Methods of Assessment

All assignments are to be completed from within the Canvas site. Please review the descriptions of individual assignments that are available on the Assignments page.

Points Possible:

Assignments            Weight (percentage)    Points
Class participation    20%                    100
Lab assignments        50%                    100
Final project          30%                    100
Total                  100%                   300

Evaluation: Grades will be based on points accumulated and converted to a 100-point percentage scale according to the following:

Grade    Points (%)
A        95-100
A-       90-94
B+       86-89
B        83-85
B-       80-82
C+       76-79
C        73-75
C-       70-72
D        60-69
F        <60
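As an illustration of how the weights and the grade scale above might combine, here is a minimal sketch. It assumes each component is scored out of 100 and then weighted by its percentage, and that the lower bound of each band serves as the cutoff for fractional percentages; the syllabus does not spell out either rule, so confirm with the instructor. The function and variable names are hypothetical.

```python
# Weights (in percent) from the "Points Possible" table.
# Assumption: each component is scored out of 100 before weighting.
WEIGHTS = {"participation": 20, "labs": 50, "final_project": 30}

# Lower bound of each letter-grade band from the syllabus scale.
# Assumption: a fractional percentage falls into the band whose lower
# bound it meets or exceeds (the syllabus lists whole numbers only).
GRADE_FLOORS = [
    (95, "A"), (90, "A-"), (86, "B+"), (83, "B"), (80, "B-"),
    (76, "C+"), (73, "C"), (70, "C-"), (60, "D"),
]


def weighted_percentage(scores):
    """Combine per-component scores (each 0-100) into one percentage."""
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


def letter_grade(percentage):
    """Map a percentage to a letter grade using the band lower bounds."""
    for floor, letter in GRADE_FLOORS:
        if percentage >= floor:
            return letter
    return "F"


# Hypothetical scores, for illustration only.
example = {"participation": 90, "labs": 85, "final_project": 80}
pct = weighted_percentage(example)
print(pct, letter_grade(pct))
```

With the hypothetical scores shown, the lab component dominates the result because it carries half of the total weight.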
8. Course and Related DU Policies

Student Responsibilities. As a student, you are expected to challenge yourself, to actively participate in your education, and to search both inside and outside of the classroom for answers to your questions. Answers are rarely black and white at this level of study. I expect you to actively participate in the classroom, to listen, and to discuss ideas with your colleagues. I expect you to read all assigned materials and to research additional sources for more information. The sources I have chosen are only some of those available in the field; you are encouraged to find other resources and share them with the class. Most importantly, you are expected to learn, and to leave this course with new ideas. My goal is to provide you with the foundation to continue to explore these ideas when you leave the classroom.

Faculty Responsibilities. My primary role is to serve as a facilitator in a manner that supports meaningful learning. I will present information related to the topics covered and help you synthesize the materials assigned for the course. I will both ask and answer questions; this class is your opportunity to discuss the issues. I am available outside of class time to answer questions concerning lab assignments and topics covered in class. I will also give you a grade. My expectations for your performance are clearly outlined in this syllabus, with further descriptions on Canvas. If anything appears unclear, or if you have any questions, please ask me. Most of all, my role is to encourage you to learn -- encourage, not force. You will take from this course what you put into it. I hope you will take advantage of the opportunity to learn in this class, from me, from the materials on the subject, and from your colleagues.
If you have special needs as addressed by the Americans with Disabilities Act and need any test or course materials provided in an alternative format, notify the instructor.

HONOR CODE STATEMENT

All members of the University community are entrusted with the responsibility of observing certain ethical goals and values as they relate to academic integrity. Essential to the fundamental purpose of the University is the commitment to the principles of truth and honesty. The Honor Code is designed so that responsibility for upholding these principles lies with the individual as well as the entire community.

The Honor Code fosters and advances an environment of ethical conduct in the academic community of the University, the foundation of which includes the pursuit of academic honesty and integrity. Through an atmosphere of mutual respect we enhance the value of our education and bring forth the highest standard of academic excellence. Members of the University community, including students, faculty, staff, administrators, and trustees, must not commit any intentional misrepresentation or deception in academic or professional matters.
9. Course Schedule

Week    Date
1       6/16
2       6/17
3       6/18
4       6/23
5       6/24
6       6/25
7       6/30
8       7/1
9       7/2

Topic/Readings**
** See respective modules on Canvas for readings and other materials.

- Introduction to data mining.
- Introduction to Weka and other data mining software.
- Data exploration, preparation, and evaluation.
- Classification, decision trees.
- Association rules.
- Clustering.
- Text mining.
- Literature mining.
- Visualization.

Assignments and due dates

- Lab 1: Install Weka, explore the Weka interface, explore the Explorer and the Experimenter, and answer the questions posted on Canvas.
- Lab 2: k-means and hierarchical clustering.
- Lab 3: Association rules. Create processes of association rule mining, tune parameters, and interpret results.
- Lab 4: Text categorization. Create processes of text categorization.
- Lab 5: Decision trees. Open datasets in Weka, transform data type, set attribute role, and build a decision tree.
- Final project: Detailed description on Canvas.