IST 718 Advanced Information Analytics. Course: Advanced Information Analytics Semester: Summer 2016

Instructor: Gary Krudys
Email: gekrudys@syr.edu
Office: Hinds 114
Phone: 315-857-7243 (cell)
Office Hours: by Appointment/Online
Meeting Place: Online

Catalog Description
A broad introduction to analytical processing tools and techniques for information professionals. Students will develop a portfolio of resources, demonstrations, recipes, and examples of various analytical techniques.

Prerequisite Skills
There is no required prerequisite for this course, but you will find it much easier to succeed if you have completed IST687, IST777, or both. This course makes extensive use of the open source R package as a framework for analytical computing. In addition to algebra, Boolean algebra, and probability, this course makes extensive use of complex data structures. If you have not taken either IST687 or IST777, then having some background in scripting or programming languages will be helpful.

Course Description
Analytics is a huge topic, comprising quantitative analysis, systematic and automated analysis of qualitative material (such as text), parsing, measurement, missing data mitigation, data reduction, data mining, descriptive statistics, modeling, machine learning, visualization, and a variety of other areas. As this list suggests, it would be impossible to cover all of these topics in a single semester. Rather than attempt to cover all of these areas badly, this course focuses on becoming familiar and comfortable with a range of the available tools in the context of challenging, data-focused problems. Addressing these problems in creative ways, by connecting datasets and tools, can provide a practical understanding of analytics as a whole while allowing students to develop specialization in one or more areas of interest to them.
The primary goal of this course is for you to become familiar and comfortable with a variety of methods for obtaining, screening, cleaning, linking, manipulating, analyzing, and displaying data. This is not a course on data visualization per se, but you will learn to create summaries, overviews, models, analyses, and basic displays such as tables, histograms, trees, and scattergrams. Upon successful completion of this course, you will have developed some or all of the following areas of skill and knowledge:

- Review of data repositories, sources of archival data, database structures, and metadata
- Essential quantitative analysis including descriptive statistics, summarization, and a brief review of inferential statistics (a complete treatment of inferential statistics occurs in IST777)
- Linked data and data mashups
- Scripting methods for handling data in R and other tools
- Translating the provenance and structure of a linked data set into a set of reasonable analyses and displays
- Matching available analyses to the information needs of clients and users
- Debugging problems in data processing and results
- Drawing conclusions and presenting data

Learning Objectives
During the course, we will emphasize:
- Experiential learning through reading and practical exercises.
- Collaborative learning through online discussions between instructors and peers.
- Self-learning with appropriate instructional support and timely feedback using analytical case studies.

In order to be successful in this course, the student will:
- Proactively research solution options rather than relying solely on textbook content.
- Actively code while completing the reading assignments.
- Present results in a professional manner (comments, clarity, correctness).
- Submit their assignments on time.

Upon completion of the course, the student will be able to:
- Understand complex data structures, transformation of data structures, and manipulation of data elements.
- Understand essential analysis techniques including descriptive statistics, summarization, and elementary modeling.
- Understand scripting methods, including debugging methodologies, for handling data in R and other tools.
- Appreciate the range of applicability of information analytics to real problems in areas such as business, science, and engineering.
- Match available analytical methodologies to the information needs of clients and users and present results in a meaningful way.

Course Materials
- Hogan, Thomas P., Bare Bones R: A Brief Introductory Guide, Sage, 2010. (Optional for those who have taken IST687 or IST777)
- Stanton, Jeffrey M., Introduction to Data Science, 2013. (Free to download at http://jsresearch.net; optional for those who have taken IST687 or IST777)
- Leipzig, Jeremy and Xiao-Yi Li, Data Mashups in R, O'Reilly, 2011. (Required)
- Matloff, Norman, The Art of R Programming: A Tour of Statistical Software Design, No Starch Press, 2011. (Required)

Student Evaluation:
1. Five Laboratory Exercises: 30%, due biweekly
2. Linked Dataset: 10%, due mid-semester
3. Tool Exploration Case Study: 20%, due by week 10
4. Final Project: 30%, due at end of semester
5. Discussion: 10%, all semester long

Laboratory Exercises
Laboratory exercises provide problem-solving experiences that reinforce the material covered in the readings. The laboratory exercises facilitate the first learning objective of the course by providing the opportunity to apply techniques from class to realistic problem-solving situations. A separate laboratory template document will be provided with specific instructions for each assignment. There are 5 graded laboratory exercises in this course worth a total of 30% of the course grade, or about 6% apiece. The exercises come at about two-week intervals. Maximum points are possible if the submission is on time, complete, and correct. Late exercises will only be accepted within 1 week of the due date.
- 5 = Solid / no mistakes (or really minor), well commented/documented
- 4 = Good / some mistakes
- 3 = Fair / some major conceptual errors
- 2 = Poor / did not finish
- 0 = Did not participate / did not hand in
- On time: +1
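The arithmetic behind the laboratory grade can be checked with a short sketch. It is written in Python using only the standard library so that it is self-contained, although the course itself works in R. It rests on two readings of the rubric that the syllabus does not state outright: that the "On time +1" bonus is added to the quality score (so a full-mark exercise earns 6 points, matching "about 6% apiece"), and that work that is never handed in earns no bonus. The function names and the sample student are hypothetical.

```python
# Weights copied from the Student Evaluation list (percent of the course grade).
WEIGHTS = {
    "laboratory_exercises": 30,
    "linked_dataset": 10,
    "tool_exploration_case_study": 20,
    "final_project": 30,
    "discussion": 10,
}

# Quality levels copied from the laboratory rubric.
RUBRIC_LEVELS = {5, 4, 3, 2, 0}
ON_TIME_BONUS = 1  # "On time +1" in the rubric


def lab_points(quality, on_time):
    """Points for one laboratory exercise under the assumptions stated above."""
    if quality not in RUBRIC_LEVELS:
        raise ValueError(f"quality must be one of {sorted(RUBRIC_LEVELS)}")
    # Assumption: work that was never handed in earns no on-time bonus.
    if quality == 0:
        return 0
    return quality + (ON_TIME_BONUS if on_time else 0)


def lab_component(results):
    """Total lab points for a list of (quality, on_time) pairs, one per exercise."""
    return sum(lab_points(quality, on_time) for quality, on_time in results)


# Hypothetical student: five exercises, the third one submitted late.
example = [(5, True), (4, True), (5, False), (3, True), (5, True)]
print(lab_component(example), "of", WEIGHTS["laboratory_exercises"])
```

Under this reading, five on-time exercises with no mistakes reach exactly the 30 points allotted to the laboratory component.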

Linked Data Set
One of the most critical and difficult tasks that analysts face lies in bringing together disparate data sets to create analytical possibilities that do not exist with simpler arrangements. Some examples here include crime data joined with maps; census data joined with health outcome records; national economic data joined with cultural factors; and polling data joined with social media activity. A linked dataset suitable for subsequent analysis offers a successful join, robust checks for accuracy, missing data mitigation, and metadata fully describing the contents and provenance of the new dataset. A separate document will be provided with specifications for creating the linked dataset. The code, documentation, and data will be used in later phases of the semester, so on-time completion of this assignment is essential.

Tool Exploration Case Study
Data science is a young and fast-moving professional field. Vendors continually develop new tools and capabilities for analysts. In this case study project, you will locate, explore, and learn a new technology tool of your own choosing. The tool must provide an interface or connection to R, must be open source or available in a free educational version, and must provide a demonstrable or visible result that can be shared with other members of the class. An example in this category is the RHadoop toolset provided by Revolution Analytics as an interface between R and Hadoop.

Final Project
For the final project, students will identify a set of questions that pertain to their linked data set, will conduct analysis to explore those questions, will draw conclusions based on the outputs of those analyses, and will produce a readable report explaining the results. Maximum points are possible if the submission is on time, complete, and demonstrates the student's ability to match the appropriate analytical methods to the chosen problem, draw appropriate conclusions, and present the results in a meaningful way.
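The four properties asked of a linked dataset (a successful join, accuracy checks, missing data handling, and metadata) can be illustrated with a small sketch. It uses Python's standard library only so that it runs anywhere; in the course's own environment, base R's `merge()` would play the role of the join. The county names, figures, source labels, and function names below are invented for illustration and are not part of the assignment specification.

```python
def link_datasets(left, right, key):
    """Inner-join two lists of dict rows on `key` and report keys that failed to match."""
    right_index = {}
    for row in right:
        # Accuracy check: a duplicated key would silently multiply or drop rows.
        if row[key] in right_index:
            raise ValueError(f"duplicate key in right table: {row[key]!r}")
        right_index[row[key]] = row

    joined, unmatched_left = [], []
    for row in left:
        match = right_index.get(row[key])
        if match is None:
            unmatched_left.append(row[key])
        else:
            joined.append({**match, **row})

    left_keys = {row[key] for row in left}
    unmatched_right = [k for k in right_index if k not in left_keys]
    return joined, unmatched_left, unmatched_right


def count_missing(rows):
    """Count missing (None) values per column so they can be mitigated or reported."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + int(value is None)
    return counts


# Invented sample data: crime counts and census population by county.
crime = [
    {"county": "Onondaga", "incidents": 120},
    {"county": "Oswego", "incidents": None},
    {"county": "Cayuga", "incidents": 45},
]
census = [
    {"county": "Onondaga", "population": 460000},
    {"county": "Oswego", "population": 120000},
    {"county": "Madison", "population": 71000},
]

joined, unmatched_left, unmatched_right = link_datasets(crime, census, "county")
missing = count_missing(joined)

# Metadata describing contents and provenance of the new dataset.
metadata = {
    "sources": ["crime (hypothetical)", "census (hypothetical)"],
    "join_key": "county",
    "rows": len(joined),
    "unmatched_keys": {"crime": unmatched_left, "census": unmatched_right},
    "missing_values": missing,
}
print(metadata)
```

Reporting unmatched keys and missing-value counts alongside the joined rows keeps the provenance of the linked dataset visible to whoever analyzes it later.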
Class-Wide Phone Conferences: For the online version of this course, the instructor will answer student questions during periodic toll-free phone conference calls. There will be an introductory call early in the semester, and then one call prior to each of the three non-lab assignments. The phone conferences are optional but participation is highly encouraged as course learning objectives, specific concepts, and upcoming assignments will be discussed.

Course Grading: Grades for specific assignments and the course final grade will be assigned by the instructor. There are 1000 possible grade points in this course and each assignment's grade value goes directly toward the total earned by each student. The numeric final point total will translate to the final letter grade for the course as follows:

A = 95-100
A- = 90-94.9
B+ = 85-89.9
B = 80-84.9
B- = 75-79.9
C+ = 70-74.9
C = 65-69.9
C- = 60-64.9
F = below 60

Grades will be available for viewing in the Grade Book section of the course's online site.

Academic Integrity
The academic community of Syracuse University and of the School of Information Studies requires the highest standards of professional ethics and personal integrity from all members of the community. Violations of these standards are violations of a mutual obligation characterized by trust, honesty, and personal honor. As a community, we commit ourselves to standards of academic conduct, impose sanctions against those who violate these standards, and keep appropriate records of violations. The academic integrity statement can be found at http://supolicies.syr.edu/ethics/acad_integrity.htm.

Blackboard
The iSchool uses Syracuse University's Blackboard system to facilitate distance learning and main campus resources. The environment is composed of a number of elements that will help you be successful in both your current coursework and your lifelong learning opportunities. To access Blackboard, go to the following URL: http://blackboard.syr.edu. Use your Syracuse University NetID and password to log into Blackboard. For questions regarding technical aspects of Blackboard, please submit a help ticket to the iSchool dashboard at My.iSchool.Dashboard (https://my.ischool.syr.edu). Log in with your NetID, select Submit a Helpdesk Ticket, and select Blackboard as the request type. The iSchool Blackboard support team will assist you.
Students with Disabilities
In compliance with Section 504 of the Americans with Disabilities Act (ADA), Syracuse University is committed to ensure that no otherwise qualified individual with a disability shall, solely by reason of disability, be excluded from participation in, be denied the benefits of, or be subjected to discrimination under any program or activity. If you feel that you are a student who may need academic accommodations due to a disability, you should immediately register with:

Office of Disability Services (ODS)
804 University Avenue, Room 308, 3rd Floor
315.443.4498 or 315.443.1371 (TTD only)

ODS is the Syracuse University office that authorizes special accommodations for students with disabilities.

Course Schedule as of 3/16/2016

Week 0 (5/16/16): Course Introduction / Aligning
Topics: Align class with the methods, goals, and expectations of the course. Syllabus Walk Through (course navigation, subject content, exercises/assignments, discussion threads, grading, course communication). Final Project Walk Through.
Readings: Introduction Lecture; Final Project Lecture
Activities/Assignments: Complete and post Student Profile. Introduce Yourself. Access the R-Bloggers site (http://www.r-bloggers.com/) and follow the instructions to subscribe to the daily newsletter.

Week 1 (5/23/16): Setting Up Data/Bare Bones R
Topics: Installation, Data Sets, Workspace, Functions, Graphics
Readings: Hogan Ch. 1
Activities/Assignments: Install the R open source software package on your computer. Install RStudio. Exercise 1.

Week 2 (5/30/16): Describing
Topics: Demonstrating ability to describe a data set via summary statistics and visualization.
Readings: Hogan Ch. 2; Matloff Ch. 12 (Brief Overview)
Activities/Assignments: Exercise 2

Week 3 (6/6/16): Modeling
Topics: Model patterns in data to better understand a business process.
Readings: Matloff Intro, Ch. 1, 2
Activities/Assignments: Discussion: Ideas for Linked Data Set

Week 4 (6/13/16): Building
Topics: Expand our initial modeling efforts to build information from data.
Readings: Matloff Ch. 3, 4
Activities/Assignments: Exercise 3

Week 5 (6/20/16): Scripting
Topics: Script our initial methods in order to deal with the volume and velocity of data.
Readings: Matloff Ch. 5, 6
Activities/Assignments: Discussion: Dealing with Messy Data

Week 6 (6/27/16): Inferring
Topics: Use analytics to infer the unknown given a set of knowns.
Readings: Matloff Ch. 7, 8
Activities/Assignments: Exercise 4. Submit 1 page proposal for the dataset or data source you plan to use for your Final Project. Follow Final Project framework guidelines.

Week 7 (7/4/16): Mapping
Topics: Explore how to gain information from geospatial data.
Readings: Matloff Ch. 9, 10, 11
Activities/Assignments: Linked Data Set Submission

Week 8 (7/11/16): Mashups
Topics: Use scripting skills to combine (or mashup) data sets and produce meaningful analysis.
Readings: TBD
Activities/Assignments: Exercise 5. Linked Data Set Discussion Board Commentary. Submit 2 page proposal outlining data analysis plan. Follow Final Project framework guidelines.

Week 9 (7/18/16): Mashups
Readings: TBD
Activities/Assignments: Discussion: Good Research Questions

Week 10 (7/25/16): Mashups
Readings: TBD
Activities/Assignments: Tool Exploration Case Study Submission. Submit 3 page project report describing results of data screening, cleaning, and linking. Follow Final Project framework guidelines.

Week 11 (8/1/16): Presenting
Topics: Examine how to present results in a meaningful way.
Readings: Hogan Ch. 3; Matloff Ch. 12
Activities/Assignments: Final Project submissions due. Follow Final Project framework guidelines.

Week 12 (8/8/16): Debugging
Topics: Examine the process for testing for and removing defects from a system.
Readings: Matloff Ch. 13
Activities/Assignments: Discussion: Efficient Debugging and Problem Solving

Additional Information: Read More About It:
- Bivand, R. S., Pebesma, E. J., & Gomez-Rubio, V. (2008). Applied Spatial Data Analysis with R. New York: Springer.
- Davenport, T. H., & Harris, J. G. (2007). Competing on Analytics. Boston: Harvard Business School Press.
- Faraway, J. J. (2006). Extending the Linear Model with R. Boca Raton: Chapman & Hall / CRC.
- Provost, F., & Fawcett, T. (2013). Data Science for Business. Sebastopol, CA: O'Reilly Media, Inc.