Sanjoy Dasgupta Professor, Computer Science and Engineering Faculty-Affiliate, Calit2

Prior to joining the UCSD Jacobs School in 2002, Sanjoy Dasgupta was a senior member of the technical staff at AT&T Labs-Research, where his work focused on algorithms for data mining, with applications to speech recognition and to the analysis of business data. Prof. Dasgupta received a Ph.D. in Computer Science in 2000 from UC Berkeley and a B.A. in Computer Science from Harvard in 1993. He is a member of the editorial boards of the Journal of Machine Learning Research, the Journal of Artificial Intelligence Research, and the Machine Learning Journal.

High-dimensional statistics, clustering, algorithms for finding underlying patterns in high-dimensional data, machine learning

Professor Sanjoy Dasgupta develops algorithms for the statistical analysis of high-dimensional data. Such data is now widespread, in domains ranging from environmental modeling to genomics to web search. The geometry of high-dimensional spaces presents unusual challenges; many traditional statistical procedures were developed with one- or two-dimensional data in mind and do not scale well to this modern context. Some of them are very inefficient; others give poor results because of counter-intuitive effects in high dimension.

Dasgupta has developed the first provably correct, efficient algorithms for a variety of canonical statistical tasks, especially related to clustering (grouping) data. He is one of the few machine learning researchers whose work combines algorithmic theory with geometry and mathematical statistics. He adds a strong theoretical focus to UCSD's CSE artificial intelligence and bioinformatics groups.

DATA SCIENCE IN THE JACOBS SCHOOL OF ENGINEERING
Sanjoy Dasgupta
Computer Science and Engineering

Data + Methods
  The data: from all over campus (neural, atmospheric/oceanic, medical, personal health, internet, genetic, ...)
  The methods: concentrated in the Jacobs School

Research in core methodologies
  Machine learning
  Big data algorithmics
  Security and privacy
  Interpretability and confidence assessment
  Yoav Freund, Daniel Kane, Mihir Bellare, Kamalika Chaudhuri

Goals
  1. Spread the expertise
  2. Simplify the interface between domain experts and methods experts
  3. The view beyond campus

Spreading the expertise
  Starting fall 2017: Undergraduate major in data science
  Starting fall 2017: MSc in data science (through ECE dept)
  Starting summer 2017: Micro-MSc in data science

Undergraduate data science
  Application domains
  Machine learning / data mining
  Algorithms
  Visualization
  Database management
  Distributed computing
  Linear algebra
  Probability and statistics
  Programming
  Discrete structures

Major in data science
  Core classes, lower division: Overview of data science; Introduction to programming; Introduction to data structures; Representations of data; Linear algebra; Discrete math for data science; Networked life; Calculus, Physics/Chemistry/Biology
  Core classes, upper division: Probability and statistics; Exploratory data analysis; Databases; Distributed computation; Data visualization; Probabilistic reasoning and decision making; Machine learning; Data mining
  Electives: 8 classes; ideally, develop a domain of specialization
  Senior project

Domains of specialization
  Computer science
  Cognitive science
  Signal processing
  Theory
  In planning:
    Computational social science
    Digital humanities / arts / music
    Neuroscience
    Biology/medicine
    Business analytics
    Climate and environmental science

Interface: {domain, methods} experts
  Recent faculty hires: engineering + application domain
  Another idea: help desk, drop-in consultation with methods experts

Looking beyond campus
  Two-year Master of Advanced Study program (since 2014)
  Full-day classes, every second Friday and Saturday
  Taught mostly by UCSD faculty
  Small class sizes (under 30)
  Significant TA support outside class
  Total cost (for two years): roughly $36K

MAS: the curriculum
  Terms: Fall, Winter, Spring, Fall, Winter, Spring
  Courses:
    Python for data analysis
    Case studies in data science
    Probability and statistics using Python
    Data management systems
    Machine learning
    Big data analysis using Hadoop and Spark
    Beyond relational data models
    Unsupervised learning
    Data visualization
    Capstone project
    Capstone project