CSC200: Lecture 4. Allan Borodin

Announcements

My apologies for the tutorial room mixup on Wednesday. The room SS 1088 is only reserved for Fridays and I forgot that.

My office hours: Tuesdays 2-4 (SF 2303B) or by appointment; note that I may have to move office hours some weeks but will notify the class. You can also drop in and, if I am not busy, I am happy to meet.

First quiz on Friday, October 2. Next week, I will announce the scope of the quiz. It is timed for 15 minutes and the remaining part of the tutorial will follow. We will be using SS 1069 and SS 1088, so please attend the appropriate tutorial.

Today's agenda

Last lecture: we finished our introduction of basic graph concepts.
- Basic graph definitions (see Chapter 2) and some additional concepts:
  - forests and trees for undirected graphs
  - directed paths, cycles, and trees for directed graphs
- Briefly discussed the use of node- and edge-weighted graphs.
- Briefly discussed embeddedness and dispersion with regard to the romantic relation prediction problem.
- We also just began discussing Chapter 3 of the text.

This lecture: continued discussion of Strong and Weak Ties (Chapter 3 of the textbook).

Chapter 3: Strong and Weak Ties

There are two themes that run throughout this chapter.

1. Strong vs. weak ties and the strength of weak ties is the specific defining theme of the chapter. The chapter also starts the discussion of how networks evolve.
2. The larger theme is, in some sense, the scientific method: formalize concepts, construct models of behaviour and relationships, and test hypotheses.

Models are not meant to be the same as reality but to abstract the important aspects of a system so that it can be studied and analyzed. See the discussion of the strong triadic closure property on pages 49-50 of the textbook.

Notes
- strong ties: stronger links, corresponding to friends
- weak ties: weaker links, corresponding to acquaintances

Triadic closure (undirected graphs)

[Figure 3.1: (a) Before B-C edge forms. (b) After B-C edge forms. The formation of the edge between B and C illustrates the effects of triadic closure, since they have a common neighbor A.]

Triadic closure: mutual friends of, say, A are more likely (than "normally") to become friends over time.

How do we measure the extent to which triadic closure is occurring?

How can we know why a new friendship is formed? (Such ties can range from just knowing someone to friendship.)

Measuring the extent of triadic closure

The clustering coefficient of a node A is a way to measure (over time) the extent of triadic closure (perhaps without understanding why it is occurring).

Let E be the set of undirected edges of a network graph. For a node A, the clustering coefficient is the following ratio:

$$\frac{|\{(B, C) \in E : (B, A) \in E \text{ and } (C, A) \in E\}|}{|\{\{B, C\} : (B, A) \in E \text{ and } (C, A) \in E\}|}$$

The numerator is the number of all edges (B, C) in the network such that B and C are adjacent to A. The denominator is the number of all unordered pairs {B, C} such that B and C are adjacent to A.

Example of clustering coefficient

[Figure 3.2: (a) Before new edges form. (b) After new edges form. If we watch a network for a longer span of time, we can see multiple edges forming; some form through triadic closure while others (such as the D-G edge) form even though the two endpoints have no neighbors in common.]

The clustering coefficient of node A in Fig. (a) is 1/6 (since there is only the single edge (C, D) among the six pairs of friends {B, C}, {B, D}, {B, E}, {C, D}, {C, E}, and {D, E}).

The clustering coefficient of node A in Fig. (b) increased to 1/2 (because there are three edges (B, C), (C, D), and (D, E)).

Note that another edge (D, G) has also formed for some other reason.
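The ratio can be checked mechanically. The following is a minimal Python sketch, not part of the original slides. The function name `clustering_coefficient` and the convention of returning 0 for a node with fewer than two neighbours are assumptions made here. The edge lists contain only the part of Figure 3.2 that the slide describes (A's four neighbours, the edges among them, and the new D-G edge), not the whole figure.

```python
from fractions import Fraction
from itertools import combinations


def clustering_coefficient(edges, node):
    """Fraction of pairs of `node`'s neighbours that are themselves adjacent."""
    # Store each undirected edge as a frozenset so (B, C) and (C, B) coincide.
    edge_set = {frozenset(e) for e in edges}
    neighbours = {v for e in edge_set if node in e for v in e if v != node}
    pairs = list(combinations(sorted(neighbours), 2))
    if not pairs:
        # Assumed convention: the ratio is undefined here, so report 0.
        return Fraction(0)
    closed = sum(1 for p in pairs if frozenset(p) in edge_set)
    return Fraction(closed, len(pairs))


# Neighbourhood of A before and after the new edges form (as stated on the slide).
before = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("C", "D")]
after = before + [("B", "C"), ("D", "E"), ("D", "G")]

print(clustering_coefficient(before, "A"))  # 1/6
print(clustering_coefficient(after, "A"))   # 1/2
```

The D-G edge does not change the result for A because G is not a neighbour of A, which matches the slide's remark that this edge formed for some other reason.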

Interpreting triadic closure

Does a low clustering coefficient suggest anything?

Bearman and Moody reported finding that a low clustering coefficient amongst teenage girls is associated with a higher probability of suicide (compared to those with a high clustering coefficient). How can we understand this finding?

What are the reasons for triadic closure? Opportunity to meet, trust, incentive; it can be awkward to have good friends (i.e. with strong ties) who are not themselves friends. The implication is that a low clustering coefficient implies few good friends.

Granovetter's thesis: the strength of weak ties

In 1960s interviews: many people learn about new jobs from personal contacts (not surprising), and often these contacts were acquaintances rather than friends (surprising?).

Upon a little reflection, this intuitively makes sense. The idea is that weak ties link together tightly knit communities, each containing a large number of strong ties.

Can we say anything more quantitative about such phenomena? To gain some understanding of this phenomenon, we need some additional concepts relating to structural properties of a graph.

Recall
- strong ties: stronger links, corresponding to friends
- weak ties: weaker links, corresponding to acquaintances

Bridges and local bridges

One measure of connectivity is the number of edges (or nodes) that have to be removed to disconnect a graph.

A bridge (if one exists) is an edge whose removal will disconnect a (connected) graph. We expect that large social networks will have a giant component and few bridges.

A local bridge is an edge (A, B) whose removal would cause A and B to have graph distance (called the span of this edge) greater than two.

Note: Span is a dispersion measure as informally defined in Lecture 3. Would the span of an edge be useful for the detection of the romantic relation as discussed in Lecture 3?

Local bridges (A, B) play a role similar to bridges, providing access for A and B to parts of the network that would otherwise be (in a useful sense) inaccessible.
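The definitions of bridge, local bridge, and span translate directly into a breadth-first search. Below is a rough Python sketch, not part of the original slides. The helper names (`build_adjacency`, `span`, `classify`) and the toy graph are hypothetical and chosen only for illustration. A bridge is reported separately here, although it also satisfies the local-bridge condition, since the distance after removal is infinite.

```python
from collections import deque


def build_adjacency(edges):
    """Adjacency sets for an undirected graph given as a list of pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def span(edges, a, b):
    """Distance between a and b once edge (a, b) is removed; None if disconnected."""
    adj = build_adjacency(edges)
    adj[a].discard(b)
    adj[b].discard(a)
    dist = {a: 0}
    queue = deque([a])
    while queue:  # plain breadth-first search from a
        u = queue.popleft()
        if u == b:
            return dist[u]
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return None


def classify(edges, a, b):
    s = span(edges, a, b)
    if s is None:
        return "bridge"
    return "local bridge" if s > 2 else "ordinary edge"


# Hypothetical toy graph: a triangle p-q-r, a 4-cycle s-t-u-v, joined by r-s.
toy = [("p", "q"), ("q", "r"), ("p", "r"), ("r", "s"),
       ("s", "t"), ("t", "u"), ("u", "v"), ("v", "s")]

for edge in [("p", "q"), ("r", "s"), ("s", "t")]:
    print(edge, span(toy, *edge), classify(toy, *edge))
```

In this toy graph the triangle edge has span 2, the edge r-s is a bridge, and the cycle edge s-t is a local bridge of span 3.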

Local bridge (A, B)

[Figure 3.4: The edge (A, B) is a local bridge of span 4, since the removal of this edge would increase the distance between A and B to 4. (E&K Figure 3.4)]

Strong triadic closure property: connecting tie strength and local bridges

Strong triadic closure property: Whenever (A, B) and (A, C) are strong ties, then there will be a tie (possibly only a weak tie) between B and C.

Such a strong property is not likely true in a large social network (that is, holding for every node A). However, it is an abstraction that may lend insight.

Theorem: Assuming the strong triadic closure property, for a node involved in at least two strong ties, any local bridge it is part of must be a weak tie.

Informally, local bridges must be weak ties since otherwise strong triadic closure would produce shorter paths between the end points.
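To make the property concrete, here is a small Python sketch, not from the slides, that lists every violation of strong triadic closure in a labelled graph. The function name `stc_violations` and the example ties are hypothetical. Edges are assumed to be undirected pairs of distinct nodes.

```python
from itertools import combinations


def stc_violations(strong, weak):
    """Triples (A, B, C) where A-B and A-C are strong but B and C share no tie."""
    strong_set = {frozenset(e) for e in strong}
    all_ties = strong_set | {frozenset(e) for e in weak}
    # Map each node to the set of nodes it has a strong tie with.
    strong_nbrs = {}
    for e in strong_set:
        u, v = tuple(e)
        strong_nbrs.setdefault(u, set()).add(v)
        strong_nbrs.setdefault(v, set()).add(u)
    violations = []
    for a, nbrs in strong_nbrs.items():
        for b, c in combinations(sorted(nbrs), 2):
            if frozenset((b, c)) not in all_ties:
                violations.append((a, b, c))
    return sorted(violations)


# A has strong ties to B and C, but B and C have no tie at all: one violation.
print(stc_violations(strong=[("A", "B"), ("A", "C")], weak=[]))
# Adding even a weak B-C tie restores the property.
print(stc_violations(strong=[("A", "B"), ("A", "C")], weak=[("B", "C")]))
```

This also illustrates the informal argument for the theorem: if (A, B) were a strong local bridge and A had another strong tie (A, C), the property would force a B-C tie, giving the path A-C-B of length two and contradicting a span greater than two.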

Strong triadic closure property continued

Again we emphasize (as the text states) that "Clearly the strong triadic closure property is too extreme to expect to hold across all nodes ... But it is a useful step as an abstraction to reality, ..."

Sintos and Tsaparas give evidence that assuming the strong triadic closure property can help in determining whether a link is a strong or weak tie. (www.cs.uoi.gr/ tsap/publications/frp0625-sintos.pdf)

More specifically, for a social network where the edges are not labelled, they define the following two computational problems. Label the graph edges (by strong and weak) so as to satisfy the strong triadic closure property and either

1. maximize the number of strong edges, or
2. minimize the number of weak edges.

For computational reasons, it is not usually possible to optimize, and it is best to approximately minimize the number of weak edges.

Their computational results (labeling the edges) are validated with 5 network data sets for which the strength of ties can be determined.
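As an illustration of what approximately minimizing weak edges can look like, the Python sketch below uses one standard idea. Every open wedge B-A-C (B and C not adjacent) needs at least one of its two edges to be weak, so the weak edges form a vertex cover of a "wedge graph", and a maximal matching gives the usual factor-2 approximation. This is an assumption made here for illustration; it is not claimed to reproduce the paper's algorithms or experiments, and the function name `label_ties` is hypothetical.

```python
from itertools import combinations


def label_ties(edges):
    """Label each edge 'strong' or 'weak' so strong triadic closure holds."""
    edge_list = sorted(tuple(sorted(e)) for e in edges)
    edge_set = set(edge_list)
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    weak = set()
    for a in sorted(adj):
        for b, c in combinations(sorted(adj[a]), 2):
            if (b, c) in edge_set:
                continue  # B and C are adjacent: this wedge imposes no constraint
            e1 = tuple(sorted((a, b)))
            e2 = tuple(sorted((a, c)))
            # Open wedge b-a-c: at least one of e1, e2 must be weak.
            if e1 not in weak and e2 not in weak:
                # Maximal-matching step: take both endpoints of the wedge.
                weak.update((e1, e2))
    return {e: ("weak" if e in weak else "strong") for e in edge_list}


# Hypothetical example: a triangle A-B-C with a pendant node D attached to C.
labels = label_ties([("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")])
for edge, label in labels.items():
    print(edge, label)
```

On this example the sketch marks two edges weak, while labelling only (C, D) weak would also satisfy the property. That gap is the price of the approximation.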

Large scale experiment supporting strength of weak ties and triadic closure

Onnela et al. [2007] study a who-talks-to-whom network maintained by a cell phone provider: a large network of cell users where an edge exists if there were calls in both directions within 18 weeks.

First observation: a giant component with 84% of nodes.

We need to quantify the tie strength and the closeness to being a local bridge.

Tie strength is measured in terms of the total number of minutes spent on phone calls between the two ends of an edge.

Closeness to being a local bridge is measured by the neighborhood overlap of an edge (A, B), defined as the ratio

$$\frac{\text{number of nodes adjacent to both } A \text{ and } B}{\text{number of nodes } C \neq A, B \text{ adjacent to at least one of } A \text{ or } B}$$

Local bridges are precisely edges having overlap 0.
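The overlap ratio is easy to compute from adjacency sets. The following Python sketch is not from the slides. It returns both the embeddedness (the numerator) and the neighborhood overlap of an edge. The function name `embeddedness_and_overlap`, the toy graph, and the convention of returning 0 when the denominator is empty are all assumptions made for illustration.

```python
from fractions import Fraction


def embeddedness_and_overlap(edges, a, b):
    """Return (embeddedness, neighborhood overlap) of the edge (a, b)."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    common = adj[a] & adj[b]             # adjacent to both a and b
    either = (adj[a] | adj[b]) - {a, b}  # adjacent to at least one, excluding a and b
    if not either:
        return 0, Fraction(0)            # assumed convention for an isolated edge
    return len(common), Fraction(len(common), len(either))


# Hypothetical toy graph, not Figure 3.4.
toy = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("A", "E")]

print(embeddedness_and_overlap(toy, "A", "B"))  # one common neighbour (C) out of C, D, E
print(embeddedness_and_overlap(toy, "B", "D"))  # no common neighbours: overlap 0
```

In the toy graph the edge (B, D) has overlap 0, so it is a local bridge (in fact a bridge), consistent with the last statement above.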

Example: Embeddedness and neighborhood overlap

[Figure 3.4: The A-B edge is a local bridge of span 4, since the removal of this edge would increase the distance between A and B to 4.]

The edge (A, B) has embeddedness 0 and hence is a local bridge; its span is 4, since the removal of this edge would increase the distance between A and B to 4.

The edge (B, H) has embeddedness 1 and neighborhood overlap 1/6.

Onnela et al. study continued

[Figure 3.7: A plot of the neighborhood overlap of edges as a function of their percentile in the sorted order of all edges by tie strength. The fact that overlap increases with increasing tie strength is consistent with the theoretical predictions from Section 3.2. (E&K Fig 3.7)]

The figure shows the relation between tie strength and overlap.

Quantitative evidence supporting the theorem: as tie strength decreases, the overlap decreases, with weak ties becoming almost local bridges.

Onnela et al. study continued

To support the hypothesis that weak ties tend to link together more tightly knit communities, Onnela et al. perform two simulations:

1. Removing edges in decreasing order of tie strength, the giant component shrank gradually.
2. Removing edges in increasing order of tie strength, the giant component shrank more rapidly and at some point then started fragmenting into several components.
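The two removal experiments can be mimicked on a tiny example. The Python sketch below is an illustration only: the graph, the weights (standing in for call minutes), and the function names are hypothetical and have nothing to do with the actual Onnela et al. data. It records the size of the largest component after each edge removal, in either order of tie strength.

```python
def giant_component_size(nodes, edges):
    """Size of the largest connected component."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), 0
    for start in nodes:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best


def removal_curve(nodes, weighted_edges, strongest_first):
    """Largest-component size before any removal and after each removal."""
    order = sorted(weighted_edges, key=lambda e: e[2], reverse=strongest_first)
    remaining = [(u, v) for u, v, _ in order]
    sizes = [giant_component_size(nodes, remaining)]
    while remaining:
        remaining.pop(0)  # drop the next edge in the chosen order
        sizes.append(giant_component_size(nodes, remaining))
    return sizes


# Two tightly knit triangles (strong ties) joined by a single weak tie c-d.
nodes = ["a", "b", "c", "d", "e", "f"]
ties = [("a", "b", 10), ("b", "c", 9), ("a", "c", 8),
        ("d", "e", 10), ("e", "f", 9), ("d", "f", 8), ("c", "d", 1)]

print(removal_curve(nodes, ties, strongest_first=True))
print(removal_curve(nodes, ties, strongest_first=False))
```

In this toy network, removing the weakest tie first immediately splits the network in half, whereas removing strong ties first leaves the largest component intact for the first removals. This is the qualitative pattern the slide describes, shown here on an example constructed to exhibit it.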

Word of caution in text regarding such studies

Easley and Kleinberg (end of Section 3.3): "Given the size and complexity of the (who calls whom) network, we cannot simply look at the structure ... Indirect measures must generally be used and, because one knows relatively little about the meaning or significance of any particular node or edge, it remains an ongoing research challenge to draw richer and more detailed conclusions ..."

Yogi Berra (1925-2015): "In theory there is no difference between theory and practice. In practice there is."

Strong vs. weak ties in large online social networks (Facebook and Twitter)

The meaning of "friend" as in Facebook is not the same as one might have traditionally interpreted the word "friend". Online social networks give us the ability to qualify the strength of ties in a useful way.

For an observation period of one month, Marlow et al. (2009) consider 4 Facebook networks defined by (in increasing order of strength):
- all friends
- maintained (passive) relations of following a user
- one-way communication
- reciprocal communication

1. These networks thin out when links represent stronger ties.
2. As the number of total friends increases, the number of reciprocal communication links levels out at slightly more than 10.
3. How many Facebook friends did you have for which you had a reciprocal communication in the last month?

Different Types of Facebook Friendships

[Figure from textbook Section 3.4 (Tie Strength, Social Media, and Passive Engagement), four panels: All Friends, Maintained Relationships, One-way Communication, Mutual Communication.]

A limit to the number of strong ties

[Figure 3.9, textbook: The number of links corresponding to maintained relationships, one-way communication, and reciprocal communication as a function of the total neighborhood size for users on Facebook.]

Twitter: Limited Strong Ties vs Followers

[Figure 3.10, textbook: The total number of a user's strong ties (defined by multiple directed messages) as a function of the number of followees he or she has on Twitter.]