Creative Data Mining. Lecture 1: Introduction. Spring 2018. Dr. Varun Ojha, Danielle Griego


Creative Data Mining, Spring 2018. Lecture 1: Introduction, 19.02.2018. Dr. Varun Ojha (ojha@arch.ethz.ch), Danielle Griego (griego@arch.ethz.ch)

What we'll cover today: Background; Data Mining for Architects and Urban Planners; Learning objectives & Course schedule; Semester project; Discussion; Homework (install Python and Spyder)

Background: What is Data Mining? [Diagram: Typical Knowledge Discovery Diagram (KDD): Data collection → Data selection → Processing → Transformation → Machine Learning → Visualization & Interpretation]
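
The stages named on this slide can be read as a chain of small functions. The following is only a minimal sketch under stated assumptions: the sensor readings are made up, the function names `select`, `transform`, and `analyze` are hypothetical, and a plain mean stands in for the machine learning step.

```python
from statistics import mean

# Data collection: hypothetical readings as (sensor_id, temperature in °C).
# None marks a missing value.
raw = [("a", 21.0), ("a", None), ("b", 19.5), ("b", 20.5), ("a", 23.0)]

def select(records):
    """Data selection / processing: keep only records with a usable value."""
    return [(sid, t) for sid, t in records if t is not None]

def transform(records):
    """Transformation: group the values by sensor."""
    grouped = {}
    for sid, t in records:
        grouped.setdefault(sid, []).append(t)
    return grouped

def analyze(grouped):
    """Analysis: a plain mean per sensor, standing in for machine learning."""
    return {sid: mean(vals) for sid, vals in grouped.items()}

summary = analyze(transform(select(raw)))

# Visualization & interpretation: here just a printed table.
for sid, avg in sorted(summary.items()):
    print(f"sensor {sid}: mean temperature {avg:.1f}")
```

In practice each stage is revisited several times as the question and the data become clearer.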

Background: It is an exploratory and iterative process. [Diagram: Typical Knowledge Discovery Diagram (KDD), as on the previous slide]

Background: What is machine learning? [Diagram: the Machine Learning step of the KDD diagram expanded into a tree. Labels: ML; Supervised Learning; Unsupervised Learning; Regression; Classification; SOM; Clustering; Neural Networks; Linear; Non-Linear; SVM]

Background: Data mining does not always include machine learning. Many time-series analyses and geo-referenced data visualizations, for example, do not use it. [Diagram: Typical Knowledge Discovery Diagram (KDD)]

Background: How can data mining be creative? What do we want to know? [Diagram: Typical Knowledge Discovery Diagram (KDD)]

Background: How can data mining be creative? Domain-specific data source(s). [Diagram: Typical Knowledge Discovery Diagram (KDD)]

Background: The not-so-creative, but essential part of data mining: is the data usable? [Diagram: Typical Knowledge Discovery Diagram (KDD)]

Background: Types of data. Original data sources: images (pixels); categorical (labels); numeric (integers and floats); binary (0/1), useful for yes/no and true/false; metadata, i.e. data descriptors for multi-dimensional data sets. These are processed for analysis.
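
As a rough illustration of how these data types look in Python, the sketch below builds one made-up record. All field names and values are hypothetical and chosen only to show one example per type.

```python
# One hypothetical record mixing the data types listed on the slide.
record = {
    "image": [[0, 255], [128, 64]],  # image: a 2x2 grid of grayscale pixel values
    "land_use": "residential",       # categorical: a label
    "floors": 4,                     # numeric: integer
    "floor_area_m2": 312.5,          # numeric: float
    "has_elevator": True,            # binary: yes/no
}

# Metadata: descriptors of the data rather than data values themselves.
metadata = {"floor_area_m2": {"unit": "square metres", "source": "hypothetical survey"}}

for name, value in record.items():
    print(f"{name}: {type(value).__name__}")

# Processing for analysis: categorical labels are often encoded as integers.
labels = ["residential", "office", "residential", "retail"]
codes = {label: i for i, label in enumerate(sorted(set(labels)))}
encoded = [codes[label] for label in labels]
print(encoded)
```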

Background: Types of analysis, visualization & interpretation: time-series and geo-referenced data visualization
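
A common first step before plotting a time series is to smooth it. The sketch below shows a simple moving average. The slide does not name a specific technique, so this is an assumed example, and the hourly values are made up.

```python
def moving_average(values, window):
    """Return the mean of each run of `window` consecutive values."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical hourly sensor readings with one spike.
hourly = [12.0, 15.0, 9.0, 30.0, 12.0, 18.0]
smoothed = moving_average(hourly, window=3)
print(smoothed)
```

A wider window gives a smoother curve but hides short events such as the spike in this example.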

Background: Types of analysis, visualization & interpretation: hierarchical clustering. Source: Zünd D. (2016). A Meso-Scale Framework to Support Urban Planning (Doctoral dissertation).
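
To make the idea of hierarchical clustering concrete, here is a minimal agglomerative sketch for one-dimensional data. It assumes single linkage (distance between the closest members of two clusters), which is one of several possible choices and is not taken from the cited dissertation. The building heights are made up.

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between the closest pair of members.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical building heights in metres.
heights = [3.0, 3.5, 4.0, 20.0, 21.0, 60.0]
result = single_linkage(heights, n_clusters=3)
print(result)
```

Recording the order of the merges, instead of stopping at a fixed number of clusters, yields the tree (dendrogram) that gives the method its name.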

Background: Types of analysis, visualization & interpretation: SOM (Self-Organizing Maps). SOM clustering map of participants (indicated by numbers). Changes in participants' behavior and biofeedback responses. Source: Ojha V. ESUM: Analyzing Tradeoffs between Energy and Social Performance of Urban Morphology
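
The core of a self-organizing map fits in a few lines. The following is a simplified sketch, not the implementation used in the ESUM project: it assumes a one-dimensional row of units, a Gaussian neighbourhood, and linearly decaying learning rate and radius. The data points and all parameter values are made up.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )

def train_som(data, n_units=4, epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a 1-D SOM: pull the winning unit and its neighbours toward each sample."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # learning rate decays over time
        radius = max(radius0 * (1 - frac), 0.5)  # neighbourhood shrinks over time
        for x in data:
            bmu = best_matching_unit(weights, x)
            for i in range(n_units):
                # Units close to the winner on the map are updated more strongly.
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
    return weights

# Hypothetical two-feature samples scaled to the range 0..1.
data = [(0.10, 0.10), (0.15, 0.05), (0.90, 0.95), (0.85, 0.90)]
weights = train_som(data)
for x in data:
    print(x, "-> unit", best_matching_unit(weights, x))
```

Samples that map to the same or neighbouring units are similar, which is what the clustering map on the slide visualizes.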

Conceptual diagram: Integrating the creative aspects of data mining. [Diagram axes: Analysis, visualization & interpretation (Manual, Automated) and Data Source (Manual, Automated)]

Conceptual diagram: Elaborating on the traditional architectural process. [Diagram labels: Analysis, visualization & interpretation (Manual, Automated): "Iterative evaluations"; Data Source (Manual, Automated): "Hand-drawn sketches"] http://www.stamfordbuildingandconstruction.co.uk/our-services/architectural-drawings

Conceptual diagram: Process taught in previous semesters. [Diagram labels: Analysis, visualization & interpretation (Manual, Automated): "Machine Learning: SOM"; Data Source (Manual, Automated): "Hand-drawn sketches"] Final Project from Moritz Berchtold, Creative Data Mining FS2015

Conceptual diagram: Time-series & geo-referenced data visualizations. [Diagram labels: Analysis, visualization & interpretation (Manual, Automated): "Time-series & geo-referenced data visualization"; Data Source (Manual, Automated): "Sensor data"] ESUM project: experimental equipment set-up and data analysis techniques

Conceptual diagram: Machine Learning. [Diagram labels: Analysis, visualization & interpretation (Manual, Automated): "Machine Learning Techniques"; Data Source (Manual, Automated): "Sensor data"] ESUM project: experimental equipment set-up and data analysis techniques

Data Mining for Architects and Urban Planners? A few examples

National data collection project Geo-referenced sensor data visualization

Chicago OpenGrid: Geo-referenced data visualization http://chicago.opengrid.io/opengrid/#

Newcastle University Urban Observatory Geo-referenced and time-series data visualization http://uoweb1.ncl.ac.uk/

Urban Morphology meets big data Urban network classification using nearest neighbor clustering https://vahidmoosavi.com/2017/01/20/gitpitch-sevamooroadsarereadmaster/

Data canvas project: Sense your city Geo-referenced and time-series data visualization http://datacanvas.org/sense-your-city/

Data Canvas project output Nearest neighbor clustering with images and time-series/geo-referenced weather https://vimeo.com/nikolamarincic/it-feels-like/

Data-driven buildings: Clustering and anomaly detection. Miller C., & Schlueter A. (2015, April). Forensically Discovering Simulation Feedback Knowledge from a Campus Energy Information System. In Proceedings of the Symposium on Simulation for Architecture and Urban Design (SimAUD) (pp. 136-143). Society for Computer Simulation International. datadrivenbuilding.org
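
As a rough idea of what anomaly detection on building energy data can look like, the sketch below flags values that lie far from the mean. This z-score rule is an assumed, very simple technique and is not the method of the cited paper. The daily consumption values and the threshold are made up.

```python
from statistics import mean, pstdev

def find_anomalies(values, threshold=2.0):
    """Return the indices of values more than `threshold` standard deviations from the mean."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]

# Hypothetical daily energy use in kWh with one unusual day.
daily_kwh = [100, 102, 98, 101, 99, 100, 180, 101]
anomalies = find_anomalies(daily_kwh)
print(anomalies)
```

Because the outlier itself inflates the mean and the standard deviation, robust alternatives (for example, based on the median) are often preferred on real data.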

Other Examples? [Diagram axes: Analysis, visualization & interpretation (Manual, Automated) and Data Source (Manual, Automated)]

Course Structure
                    Supervised learning (labeled data)    Unsupervised learning (unlabeled data)
Discrete output     Classification                        Clustering
Continuous output   Regression                            Clustering and dimensionality reduction
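
The difference between the supervised and unsupervised sides of this overview can be shown with two small examples. This is a minimal sketch on made-up data: a least-squares line fit stands in for regression (labeled data, continuous output), and a one-dimensional k-means stands in for clustering (unlabeled data, discrete output). The function names are hypothetical.

```python
def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx

def kmeans_1d(values, centers, iterations=10):
    """Unsupervised: alternate between assigning values and moving centers."""
    groups = [[] for _ in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # An empty group keeps its previous center.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Labeled data: every x comes with a known target y.
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)

# Unlabeled data: only the values are known; the groups are discovered.
centers, groups = kmeans_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centers=[0.0, 5.0])
print(centers, groups)
```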

Course Schedule What to Expect

Semester Project: Something to start thinking about
1. Formulate 1-2 specific question(s) of interest to you.
2. State your hypothesis/expected outcome based on supporting literature (minimum one source), your expertise, and intuition.
3. Answer that question through your analysis. For this: select the best available data sources for your question (minimum of 2 data sources); include a time-series and/or clustering analysis.
4. Summarize your results. Show a clear conclusion: does your analysis answer your question(s)?
5. Conclusions & lessons learned.
6. Include motivation and references.

Learning objectives: We encourage you to be creative!
1. Become familiar with programming and integrating new tools in your work.
2. Come up with an interesting research question and learn how to answer it by: selecting appropriate data source(s); applying the relevant analysis and visualization techniques; interpreting and refining your results.
http://ac297r.org/

Short discussion: Your expectations? [Diagram labels: Stop; Target; Inputs; Learning System; Output; Comparator; feedback loop]

Homework: You can stick around and install the programs now if you'd like.
1. Install Python from https://www.python.org/downloads/
2. Install Spyder from https://pythonhosted.org/spyder/
3. Research other examples of urban data mining and make 2 slides about the most interesting project/application/research group(s) that you find. These will be presented at the beginning of the next lecture.
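
After installing, a short script like the following can confirm that Python runs and whether Spyder is visible to it. This is only a suggested check, not part of the homework. The helper name `is_installed` is hypothetical, and the check assumes Spyder was installed into the same Python environment under the module name `spyder`.

```python
import importlib.util
import sys

def is_installed(module_name):
    """True if the top-level module can be found by this Python installation."""
    return importlib.util.find_spec(module_name) is not None

print("Python version:", sys.version.split()[0])
print("Spyder found:", is_installed("spyder"))
```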

Resources for the course
Course material posted to: http://www.ia.arch.ethz.ch/category/fs2018-creative-data-mining/
Tutorials: https://www.tutorialspoint.com/python/python_basic_operators.htm and http://www.informatics.indiana.edu/rocha/academics/ibic/lab1/python%20review.pdf
References: A Byte of Python, https://python.swaroopch.com/; Coelho, Luis Pedro; Richert, Willi. Building Machine Learning Systems with Python, Packt Publishing (Adobe Editions Library)

"Science without philosophy is blind, and philosophy without science is paralyzed" (Paul Cilliers, Complexity and Postmodernism). Lecture 1: Introduction, 19.02.2018. Questions?