CS4491/CS 7265 BIG DATA ANALYTICS INTRODUCTION TO THE COURSE. Mingon Kang, PhD Computer Science, Kennesaw State University


Self Introduction
  Mingon Kang, PhD
  Homepage: http://ksuweb.kennesaw.edu/~mkang9 (or search Google for "Mingon Kang" and pick the top result)
  Research interests: Bioinformatics, Machine Learning, Data Mining, and Big Data Analytics
  Projects you may be interested in:
    Several genomic projects in bioinformatics
    Facial Image Recognition: Gender/Age/Emotion

Now it's your turn
  Name, program/year, where you are from
  Your interests in Computer Science
  Your favorites
  What do you expect from Big Data Analytics?
  If you are in the online course, introduce yourself in D2L (Discussions, Self-Introduction)

Course Information
  Instructor: Dr. Mingon Kang
  Office: J-339
  Email: mkang9@kennesaw.edu (I only reply to e-mails that are sent from KSU student email accounts and list the course number)
  Office Hours: Tue/Wed, 1-5pm, or anytime my door is open
Course Materials
  Homework assignments, lecture slides, and other materials will be posted in D2L.
  All lectures will be recorded.

Choice of Language
  You can use your favorite language, but R, Matlab, and Python are highly recommended.
  The course will briefly introduce R/Python in case you have no experience with these scripting languages.
  Why?
    Better for file I/O of textual data
    Better for matrix manipulation
    Fast prototyping

Topics in Machine Learning
  Classification Problems
    Decision Trees
    Linear Models
    Naïve Bayes Classifiers
    Logistic Regression
    Fisher Linear Discriminant Analysis
    K-Nearest Neighbors
  Data Presentation
    Principal Component Analysis
  Clustering Problems
    K-Means

Topics in Big Data Analytics
  MapReduce Framework
  Hadoop
  Apache Spark
  Applications in Big Data Analytics
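To give a feel for the MapReduce idea listed among the topics, here is a minimal single-machine sketch of a word count in plain Python. It does not use Hadoop or Spark; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample documents are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence on a list of strings."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    groups = shuffle_phase(pairs)
    return dict(reduce_phase(key, values) for key, values in groups.items())


# Illustrative input only; a real job would read from distributed storage.
docs = ["big data analytics", "big data with Spark"]
counts = word_count(docs)
print(counts)
```

In a real cluster the map and reduce calls would run in parallel on different machines; the sketch only shows how the three phases fit together.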

Textbook
  Advanced Analytics with Spark: Patterns for Learning from Data at Scale, by Sandy Ryza, Uri Laserson, Sean Owen, and Josh Wills. O'Reilly Media, 2015

Reference Books in Machine Learning Pattern Recognition and Machine Learning, Christopher M. Bishop, 6-edition, Springer-Verlag New York, 2006

Reference Book
  The Elements of Statistical Learning, Hastie, Tibshirani, Friedman, 2nd edition, Springer, 2009

Evaluation (tentative)
  Attendance (CS4491 & CS7265: 5%)
    If a student misses more than 4 sessions (class meetings), the student's final grade for the course may be reduced by 5%.
  Homework assignments (4-5 assignments; CS4491 & CS7265-W01: 45%, CS7265: 40%)
    All programming assignments
  Exams (40%)
    Exam 1 (20%) and Exam 2 (20%)
  Project (10%)
    Machine Learning (6%)
    Spark (4%)
  Presentation (5%), only for CS7265
    PhD students will be asked to give additional presentations.
  Late submission policy: see syllabus

Grade Evaluation
  Grade   CS4491          CS7265
  A       90% - 100%      90% - 100%
  B       75% - 89%       80% - 89%
  C       60% - 74%       70% - 79%
  D       45% - 59%       60% - 69%
  F       44% or below    59% or below
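As a rough illustration of how the tentative weights and the grade cutoffs combine, the sketch below computes a weighted final score and maps it to a letter. It rests on several assumptions not stated in the slides: each component is scored from 0 to 100, attendance is treated as an ordinary 5% component, the online section CS7265-W01 is not modelled, and no rounding rule is applied. The names `WEIGHTS`, `CUTOFFS`, `final_score`, and `letter_grade` and the sample scores are hypothetical.

```python
# Component weights in percent, taken from the tentative evaluation slide.
WEIGHTS = {
    "CS4491": {"attendance": 5, "homework": 45, "exams": 40, "project": 10},
    "CS7265": {"attendance": 5, "homework": 40, "exams": 40, "project": 10,
               "presentation": 5},
}

# Lower bounds for each letter, taken from the grade table.
CUTOFFS = {
    "CS4491": [(90, "A"), (75, "B"), (60, "C"), (45, "D")],
    "CS7265": [(90, "A"), (80, "B"), (70, "C"), (60, "D")],
}


def final_score(course, scores):
    """Weighted average of component scores (each assumed to be 0-100)."""
    weights = WEIGHTS[course]
    return sum(weights[name] * scores[name] for name in weights) / 100


def letter_grade(course, score):
    """Return the first letter whose lower bound the score reaches."""
    for cutoff, letter in CUTOFFS[course]:
        if score >= cutoff:
            return letter
    return "F"


# Made-up scores, for illustration only.
scores = {"attendance": 100, "homework": 80, "exams": 70, "project": 90,
          "presentation": 80}
undergrad = final_score("CS4491", scores)
grad = final_score("CS7265", scores)
print(undergrad, letter_grade("CS4491", undergrad))
print(grad, letter_grade("CS7265", grad))
```

With these sample scores both courses produce the same numeric score, but the letter differs because the graduate cutoffs are stricter.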

Two components in Project
  One component in Machine Learning/Data Mining algorithms and one in Big Data Analytics
  Individual work
  You can choose one ML project from:
    Kaggle datasets (www.kaggle.com)
    UC Irvine Machine Learning Repository (http://archive.ics.uci.edu/ml/)
  One component in Big Data Analytics from the textbook

Academic Integrity
  Academic dishonesty includes:
    Cheating
    Plagiarism
    Collusion
    The submission for credit of any work or materials that are attributable in whole or in part to another person
    Taking an examination for another person
    Any act designed to give unfair advantage to a student, or the attempt to commit such an act

How to succeed in this class
  THINK hard, not WORK hard
  Scientific thinking
  Passion to learn something NEW
  ASK ME questions (office hours)
  Begin homework assignments EARLY

Programming? It looks like building a house
  Timeline to build a house:
    Layout: floor plan
    Excavation
    Footing/Foundation
    Framing
    Mechanicals
    Insulation
    Drywall
    Paint

House!!

A work of art, like Antoni Gaudí's?

Before beginning the course
  Let's discuss the origins of Computer Science

Philosophy
  Definition of the word: "The study of the fundamental nature of knowledge, reality, and existence, especially when considered as an academic discipline." (Oxford Dictionary)
  Literally means "love of wisdom" or "friend of wisdom"

Philosophy
  Flooding of the Nile
  Logic: logically describe the world (around 500 BC)
  From God to Human
  Ancient Graeco-Roman philosophy: Socrates, Plato, Aristotle, etc.

Philosophers
  Aristotle
  Gottfried Wilhelm Leibniz
  George Boole
  Bertrand Russell
  Alan Turing

Aristotle (384-322 BC)
  So many different roles: Physics, Biology, Music, Linguistics, Zoology, Economics, Politics
  How to understand such a varied world? LOGIC

Gottfried Wilhelm Leibniz
  German philosopher (1646-1716)
  Known as one of the founding fathers of calculus
  Wanted to prove all phenomena using binary logic
  Convert the world to binary logic

George Boole
  English mathematician, philosopher, and logician (1815-1864)
  Author of The Laws of Thought
  Inventor of Boolean logic
  Note that Boolean logic can be used to implement binary arithmetic
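The remark that Boolean logic can implement binary arithmetic can be made concrete with a small sketch. The code below builds a half adder and a full adder from AND, OR, and XOR on single bits (XOR itself is just (a OR b) AND NOT (a AND b)), then chains full adders to add two numbers. The function names and the little-endian bit-list representation are assumptions made for this illustration.

```python
def half_adder(a, b):
    """Add two bits; return (sum bit, carry bit)."""
    return a ^ b, a & b


def full_adder(a, b, carry_in):
    """Add two bits plus an incoming carry using two half adders and an OR."""
    partial_sum, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial_sum, carry_in)
    return total, carry1 | carry2


def add_bits(x_bits, y_bits):
    """Add two equal-length bit lists, least significant bit first."""
    carry = 0
    result = []
    for a, b in zip(x_bits, y_bits):
        bit, carry = full_adder(a, b, carry)
        result.append(bit)
    result.append(carry)  # the final carry becomes the top bit
    return result


# 3 is [1, 1, 0] and 5 is [1, 0, 1] when written least significant bit first.
total_bits = add_bits([1, 1, 0], [1, 0, 1])
print(total_bits)
```

Reading the output list from right to left gives the usual binary notation of the sum.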

Bertrand Russell
  British philosopher, logician, mathematician, historian, writer, social critic, and political activist
  Wanted to make perfect mathematics from perfect logic
  Author of Principia Mathematica, published in 1910, 1912, and 1913. Total of 1994 pages!!

Principia Mathematica
  *54.43: "From this proposition it will follow, when arithmetical addition has been defined, that 1+1=2." (Volume I, 1st edition, page 379)

Alan Turing (1912-1954)
  Automate logic: if everything can be explained by logic, we may carry out the logic automatically rather than manually.
  Introduced the Turing test: https://www.csee.umbc.edu/courses/471/papers/turing.pdf
  Turing Machine: a model of a general-purpose computer
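To make the idea of a Turing machine a little more tangible, here is a minimal simulator sketch. It assumes a one-tape machine described by a transition table that maps (state, symbol) to (symbol to write, head move, next state). The names `run_turing_machine` and `INVERT`, the `"_"` blank symbol, and the `max_steps` safety limit are all choices made for this illustration, not part of the course material.

```python
def run_turing_machine(transitions, tape, state="start", blank="_", max_steps=1000):
    """Run a one-tape machine until it reaches the 'halt' state or the step limit."""
    cells = dict(enumerate(tape))  # sparse tape: position -> symbol
    head = 0
    for _ in range(max_steps):
        if state == "halt":
            break
        symbol = cells.get(head, blank)
        write, move, state = transitions[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    # Read the tape back in position order and drop surrounding blanks.
    return "".join(cells[i] for i in sorted(cells)).strip(blank)


# Example machine: sweep right, flipping every bit, and halt at the first blank.
INVERT = {
    ("start", "0"): ("1", "R", "start"),
    ("start", "1"): ("0", "R", "start"),
    ("start", "_"): ("_", "R", "halt"),
}

result = run_turing_machine(INVERT, "1011")
print(result)
```

The same simulator can run a different machine by swapping in another transition table, which is the sense in which the model is general purpose.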

Summary
  Aristotle (384-322 BC): modern disciplines
  Gottfried Wilhelm Leibniz (1646-1716): binary logic
  George Boole (1815-1864): Boolean logic
  Bertrand Russell (1872-1970): Principia Mathematica
  Alan Turing (1912-1954): automated logic
  See http://www.datesandevents.org/events-timelines/07-computer-history-timeline.htm