The Impact of Group Contract and Governance Structure on Performance: Evidence from College Classrooms

Zeynep Hansen*
Boise State University and NBER

Hideo Owan
University of Tokyo

Jie Pan
Loyola University Maryland

Shinya Sugawara
University of Tokyo

In this article, we empirically analyze the effect of team characteristics on a team's choice of group contract type (its governance structure) and examine the combined impact of team characteristics and the group contract choice on group and individual performance in a classroom setting. We utilize endogenous dummy variable models in both group-level and individual-level analyses due to the expected endogeneity of the contract choice. The estimation results confirm a statistically significant positive effect of one governance structure, the democratic contract, which includes a mechanism to punish free-riders, on both group and individual performance. We also estimate switching regression models to account for possible heterogeneous treatment effects but do not find any significant difference between the treated and the nontreated in the effect of the democratic contract option, implying that the contract choice is not necessarily motivated by its performance-enhancing effect. (JEL: D7, D86, I2)

*Email: zeynephansen@boisestate.edu. We thank Bill Bottom, Stuart Bunderson, Jed DeVaro, Bart Hamilton, Caroline Hoxby, Hidehiko Ichimura, Takao Kato, Marc Law, Jackson Nickerson, Peter Reiss, Jacob Vigdor, Masaru Sasaki, and seminar participants at the Olin School of Business and the Department of Economics at Washington University in St. Louis, the National Bureau of Economic Research, Higher Education Meetings, the Midwest Economics Association Conference, and the Japanese Economic Association semi-annual meetings for helpful comments and suggestions. Part of this research was completed while the authors were at Washington University in St. Louis. The usual disclaimer applies.

The Journal of Law, Economics, and Organization, Vol. 0, No. 0. doi:10.1093/jleo/ewt007. © The Author 2013. Published by Oxford University Press on behalf of Yale University. All rights reserved. For Permissions, please email: journals.permissions@oup.com

1. Introduction

In this article, we empirically analyze the effect of team characteristics on a team's choice of governance structure and the combined impact of group characteristics and contract choice on team and individual performance in a college classroom setting. Our contributions to the existing literature are two-fold. First, we contribute to the contract literature by demonstrating that a group governance structure that incorporates a proper incentive scheme, such as punishment for free-riders and rewards for hard workers, influences group and individual performance positively. Second, we empirically show that certain characteristics of a team influence its contract choice, but the treatment effect is not significantly different between the treated and the nontreated, suggesting that the contract choice is not necessarily motivated by the expected positive effect on team performance.

Teamwork can facilitate skill specialization and knowledge transfer by utilizing members' comparative advantages. [1] Thus, diversity in skills and knowledge may improve productivity through knowledge sharing and task coordination. On the other hand, free-riding [2] and coordination problems [3] can arise and hinder the efficiency of a team. A team's governance structure, which specifies team decision rules and the way individual payoffs are determined based on teamwork outcomes, significantly affects how well the team deals with task coordination and free-riding. If diversity makes it difficult to coordinate or build trust among team members, diverse teams may especially benefit from a governance structure with an incentive mechanism that facilitates task coordination and deters free-riding.

Our empirical study is based on performance data from an undergraduate introductory management course, together with academic background and demographic data from the students' personal records at the registrar's office of a private US university.
The implications of our study, however, go beyond the domain of the economics of education because teamwork in the study includes reading and collecting materials, identifying problems, devising solutions, organizing findings in a paper, and making an oral presentation, an array of tasks similar to group projects observed in many workplaces. Moreover, the scope of the tasks and requisite skills is extensive and makes collaboration and coordination essential elements for the successful completion of a group project.

Student groups are exogenously assigned at the beginning of the course, and thus bias that can arise from self-selection of team members is largely eliminated. In addition, our study is unique in that teams' governance choices (explained below) and individuals' choices regarding their effort levels are made in a context where individuals care about their reputations, a condition that is not achievable in laboratory experiments.

1. Some of the benefits and tradeoffs of teamwork are discussed in Lazear (1999) and Garicano (2000).

2. See, for example, Alchian and Demsetz (1972), Holmstrom (1982), and Kandel and Lazear (1992) for a thorough discussion of free-riding. Free-riding problems arise especially when team members' actions are unobservable or team members do not have means to punish free-riders (e.g., through social pressure).

3. For a theoretical model of how coordination costs affect specialization, see Becker and Murphy (1992).

A novel aspect of our study is that each group chooses its own form of governance, and these forms vary in how peers may punish free-riding and reward dedication. Each group formalizes its governance form by signing a contract. Teams that opt for an autonomous contract agree that all members will receive the same grade for their group work regardless of the extent of their individual contributions. Teams selecting a democratic contract grade each of their members internally and, subject to the approval of a majority of members, these internal team grades can then be used to alter the project grades given by the instructor by up to one full letter grade, either up or down.

Since a group's contract choice is influenced by the group's characteristics and other unobservable factors that may also affect group performance, the contract choice has to be treated as endogenous when estimating its impact. Thus, in order to determine the link between governance structure and group performance, we estimate dummy endogenous variable models at both the group and individual levels using maximum likelihood estimation (MLE). We surveyed teaching assistants (TAs) and used their responses as an instrument to identify the unbiased treatment effects. Our results indicate a statistically significant positive impact of the democratic group contract on both group and individual performance.

Our main findings in this analysis of how group contract choice and its governance structure affect group performance can be evaluated in comparison to related research on team incentive pay.
Several studies, using workplace data or data from laboratory experiments, find that teamwork under team incentive pay is generally characterized by improved productivity and performance, and is not subject to serious free-riding (see, e.g., Dijk et al. 2001; Knez and Simester 2001; Hamilton et al. 2003; Falk and Ichino 2006). [4]

In addition, to the extent that demographic diversity affects the effectiveness of peer pressure and the threat of punishment, our study is related to the literature on diversity. Some earlier work has implied that demographic diversity, in particular age and ethnic diversity within teams, may hinder cooperation due to higher communication costs, decreased social ties, and weakened peer pressure (e.g., Hamilton et al. 2004; Bandiera et al. 2005). If some dimensions of demographic diversity encourage free-riding, as suggested by these studies, team composition should also affect both the efficacy and the necessity of mechanisms to deter free-riding. In our analysis, however, we did not find any significant difference between the treated and the nontreated in the effect of choosing the democratic group contract. This may be because the choice of self-governing mechanism is influenced by perceived bargaining costs and asymmetric information about other members' personality traits. Thus, groups that benefit more from the democratic contract may not be more likely to adopt it. For example, we find that groups with a higher proportion of male students perform worse in their group work than groups with a higher proportion of female students. If male students are more prone to free-ride, introducing a mechanism to punish free-riders should generate a greater benefit for male-dominant groups than for others. However, our results indicate that male-dominant groups are not more likely than female-dominant groups to adopt the democratic contract.

4. Other relevant literature on the economics of education defines peer groups at either the class or the school level, for example Hoxby (2000), Arcidiacono and Nicholson (2002), and Angrist and Lang (2004). These studies provide mixed evidence on the impact and significance of peer effects in various educational settings.

The rest of the article is organized as follows. In Section 2, we describe the nature of the group work assigned to the students. Section 3 provides an overview of the data used in our study. In Section 4, we explain our empirical strategy and discuss the estimation results and implications in detail. We conclude in Section 5.

2. Group Work in Management 100

Using class performance data from an undergraduate introductory management course (MGT100) taught during 2003-04, we evaluate the relationship between a group's composition, its choice of governance structure, and these factors' impact on individual and group performance. MGT100 is designed to offer a thematic introduction to the world of business and to provide an overview of many key economic concepts. The course is required for all students whose major or minor field of study is in business and, since it is designed and taught in cooperation with other faculty in the management field, its instruction format has been largely constant over time.
MGT100 requires participation in class lectures as well as weekly subsection sessions (with 12-25 students) where pairs of undergraduate TAs lead discussions and run simulation games. [5] In the second week of classes, TAs divide their subsections into teams of four or five students. TAs are instructed to assign teams using a random procedure, although they are simultaneously advised to reassign team members to create gender diversity within groups. Despite this selection process, gender makeup varies substantially across groups due to differences in gender composition across subsections and some attrition of group members.

MGT100 requires a substantial amount of group work, including three assignments and a group project. The group work requirement in the course accounts for 28% of the total course grade (the project accounts for 25% and each assignment accounts for 1%). For the group project, each team is required to write a paper and make an oral presentation on a company described in a business book chosen from a list provided by the instructor. Students are asked to use course concepts to analyze a firm's market environment, discuss the features of its organization as a response to that market environment, and examine the strategic options that are available to the firm. This group project is completed by the end of the semester and evaluated by the instructor. Instructors made the evaluation forms used for grading the paper and the presentation available to students early in the semester, and used the same evaluation criteria in all three classes in our sample. In addition to this major project, each group works together on three long assignments during the semester. Despite the relatively heavy workload, each assignment counts for only 1% of the total course grade; hence the group work grades in our study mostly reflect the students' performance on their group projects.

5. TAs were all undergraduate students who had previously taken MGT100 and successfully completed it (with at least an A-grade in class). They were chosen based on their success in class, enthusiasm for topics covered in MGT100, expected leadership and communication skills, and their availability in number of hours per week.

Clear logical and analytical thinking, a good understanding of class concepts, and good exam preparation and exam-taking skills are necessary and largely sufficient for successful performance in exams in MGT100, but success in group work requires additional capabilities. The group paper requires good writing skills in addition to understanding and applying class concepts to a specific firm and industry. The group presentation requires considerable practice and the ability to speak at ease in front of an audience. Most importantly, successful completion of the group project requires responsibility, effective organization, coordination, and communication among group members, as well as good time management skills.
A novel aspect of the group work in MGT100 is that each group selects one of three contracts, each with a specific set of voting rules that team members must follow to coordinate their actions and possibly punish free-riders and reward those who contribute the most. After TAs explain the rules of each team structure and discuss their implications, teams choose their governance form in the second week of the semester. Once all members agree and sign the team contract, they cannot change their contract type under any circumstance.

A consensus-based autonomous team contract requires that all decisions made within the team receive unanimous support, and all team members are awarded the same grade for their group projects. A majority rule-based democratic team contract requires that all decisions have majority support. [6] At the end of the semester, members of democratic teams vote on how to allocate grades to individual team members. The instructor assigns an overall grade for the group project, but the individual grades of group members can then be modified for better or worse by at most one letter grade. [7] This differentiated point allocation rule allows democratic team members to punish free-riders and reward hard-working members.

A third contract type, the manager-led team contract, assigns coordination responsibility to one member who is elected by the team at the beginning of the course. This contract type is not discussed further in the paper because only two teams chose it, and we dropped them from our analysis. [8]

6. The contract label autonomous is chosen because most autonomous work teams in businesses make decisions based on consensus to facilitate smooth implementation; the label democratic is used because the majority voting rule is the basic principle in democratic societies. These labels were chosen to avoid any negative connotations possibly associated with other terms such as consensus or majority-rule.

3. Data Description

In this article, we use a novel data set constructed from the academic and demographic records of 340 undergraduate students who took MGT100 over three semesters in 2003-04 (spring 2003, spring 2004, and fall 2004). All entering business majors as well as all interested freshmen from other schools in the university take the course together in the fall of their first year. Students with a minor in business and students who are transferring to the business school generally take the course in spring semesters. [9]

Our original sample contained 81 teams. We dropped three two-member teams from the sample because the group dynamics of two-member teams would be very different from those of larger teams and the two-member groups might be subject to severe self-selection. [10] Next, we dropped two teams with the manager-led group structure that we observed in fall 2004. There were no manager-led groups in the spring classes. These exclusions left a total of 325 students in 76 teams: 34 from the spring semesters and 42 from the fall 2004 class.

Our data include grade sheets that provide detailed information on students' grades on two midterm exams and a final exam as well as on group assignments and the group project, which includes a paper and an oral presentation. In addition, each student is matched to his or her corresponding group, and each group's contract choice is also recorded.
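The grade adjustment permitted under the democratic contract (Section 2 and footnote 7) can be made concrete with a small sketch. The function name `adjust_grades`, the majority check, and the choice of 10 points as the width of one letter grade (read off the 85/95/75 example in footnote 7) are illustrative assumptions, not the course's actual grading procedure.

```python
# Hypothetical sketch of the democratic contract's grade adjustment.
# Assumption: one letter grade corresponds to 10 points on a 100-point scale,
# as in the footnote example where a B is 85, an A is 95, and a C is 75.
LETTER_GRADE_POINTS = 10


def adjust_grades(project_grade, adjustments, votes_in_favor):
    """Return one grade per member given the instructor's project grade.

    adjustments: proposed point changes per member (positive or negative).
    votes_in_favor: number of members approving the proposed allocation.
    """
    n = len(adjustments)
    # Without majority approval, everyone keeps the instructor's grade.
    if votes_in_favor * 2 <= n:
        return [project_grade] * n
    grades = []
    for adj in adjustments:
        # Cap each change at one full letter grade, up or down.
        capped = max(-LETTER_GRADE_POINTS, min(LETTER_GRADE_POINTS, adj))
        grades.append(max(0, min(100, project_grade + capped)))
    return grades


# Four-member team with a project grade of B (85); three members approve.
example = adjust_grades(85, [10, 0, 0, -10], votes_in_favor=3)
```

Under an autonomous contract the same team would simply receive 85 for every member, which is what the function returns when the allocation fails to win a majority.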
We augmented this class performance data with the records we obtained on students' academic and demographic backgrounds from the university registrar. Information on a student's academic background includes year level in college (freshman, sophomore, etc.), SAT or ACT scores, courses taken prior to MGT100, and the primary division of the student. [11,12] Demographic information includes age, gender, ethnic origin, and current residence type (dormitory, fraternity, or off-campus).

7. For example, a team could receive a B (or an 85 on a 100-point scale) from the instructor. If the team is autonomous, all members would receive a B (85) for the team project. If a four-member team is democratic or manager-led, one member could conceivably receive an A (95), two members a B (85), and one member a C (75).

8. In a manager-led team, the manager evaluates other members' contributions and assigns grades to them at the end of the course. Similarly, team members evaluate their manager's performance and assign a grade to the manager. The grades are then used to adjust the project grade assigned by the instructor in a similar way to the democratic team contract.

9. MGT100 had 50, 89, and 201 students in spring 2003, spring 2004, and fall 2004, respectively. There were only a few upperclassmen in the fall 2004 class, who were given special permission to take the class due to their scheduling conflicts and graduation requirements.

10. Two-member teams arise only as a result of attrition where at least two members drop the course.

We aggregated individual information to construct group-level variables measuring demographic and knowledge/skill compositions. These group-level variables are then used to analyze the effect of group composition on performance and on groups' choice of contract. Group-level variables include average age, proportion of male students, average SAT score, and proportion of domestic Caucasian students, as well as standard deviations in age and SAT scores. Group-level dispersion measures in age and SAT scores are used to capture a group's knowledge and skill diversity.

Table 1 presents descriptive statistics for the primary variables used in the empirical analysis. It should be noted that, although the statistics for SAT scores in the table are calculated using the original values, the SAT scores used in our empirical analyses are normalized so that the mean and the standard deviation are 0 and 1, respectively. Summary statistics are shown separately for spring and fall classes in addition to the pooled sample, since certain demographic characteristics are expected to differ between fall and spring classes. As described earlier, the fall class is taken by freshmen who are mostly business majors, while the spring class includes upperclassmen, many with nonbusiness majors. Therefore, those in the latter class are older and have more diverse skills and knowledge backgrounds than those in the former. One more notable difference is that the average group size is smaller in the spring class, reflecting its relatively higher attrition rate. [13]

11. For those students who took the ACT instead of the SAT, the ACT scores were converted into SAT scores according to the Comparison Table provided by College Educational Board. The Comparison Table was constructed based on 103,525 test-takers who took both tests between October 1994 and December 1996. Equivalent SAT and ACT scores are those with the same percentile ranks for a common group of test-takers.

12. The primary divisions of the students in our sample pool are the Schools of Art and Sciences, Fine Art, Architecture, Business, and Engineering.

13. Although we do not have a precise figure, the average attrition rate is around 5% in the fall class and 10-15% in the spring classes. This self-selection could cause bias in our analysis if dropping the course is related to the expected performance of teams or is correlated with some unobservable TA characteristics that could be captured by our instrumental variable (described later in the text). Attrition is unlikely to be associated with the expected performance of teams or TA characteristics, however, because most attrition took place during the first few weeks of the semester, before students were assigned to teams or before they started working on their group projects. In addition, it is highly unlikely that strict TAs caused the attrition of unproductive workers, as suggested by a reviewer. TAs do not have such a policing role. TAs led classroom games and discussions and occasionally helped some groups to complete their group homework assignment when the group had trouble solving the problems. The students were encouraged to come forward to the instructor to report any serious within-group conflicts, and TAs sometimes reported such incidents to the instructor.
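The construction of the group-level variables and the normalization of SAT scores described in this section can be sketched as follows. This is a minimal illustration on made-up records: the field names, the helper names `standardize` and `group_level_variables`, and the use of the sample standard deviation are all assumptions, since the text does not state which formula was used. Only a subset of the variables is shown.

```python
from collections import defaultdict
from statistics import mean, stdev


def standardize(values):
    """Rescale values to mean 0 and (sample) standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def group_level_variables(students):
    """Aggregate student records (dicts) into one record per group."""
    by_group = defaultdict(list)
    for student in students:
        by_group[student["group"]].append(student)
    result = {}
    for group, members in by_group.items():
        ages = [m["age"] for m in members]
        sats = [m["sat"] for m in members]
        result[group] = {
            "size": len(members),
            "avg_age": mean(ages),
            "sd_age": stdev(ages),        # dispersion in age
            "prop_male": mean(m["male"] for m in members),
            "avg_sat": mean(sats),
            "sd_sat": stdev(sats),        # dispersion in SAT scores
        }
    return result


# Made-up records for two three-member groups (not data from the study).
students = [
    {"group": "A", "age": 18.5, "sat": 1300, "male": 1},
    {"group": "A", "age": 19.0, "sat": 1400, "male": 0},
    {"group": "A", "age": 19.5, "sat": 1500, "male": 1},
    {"group": "B", "age": 20.0, "sat": 1350, "male": 0},
    {"group": "B", "age": 21.0, "sat": 1450, "male": 1},
    {"group": "B", "age": 19.0, "sat": 1400, "male": 0},
]
groups = group_level_variables(students)
z_sat = standardize([s["sat"] for s in students])
```

Standardizing over the pooled sample, as done here, is one reading of the normalization; standardizing within each semester would be an equally plausible alternative that the text does not rule out.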

Table 1. Summary Statistics

Individual-level variables (number of observations: 325)

Variable                                            Mean (SD)        Minimum   Maximum
Individual exam score                               81.44 (8.37)     45.23     98.31
Age                                                 19.21 (1.12)     17.17     24.83
SAT score                                           1391.2 (90.2)    1010      1600
Gender (male = 1)                                   0.58 (0.49)      0         1
Engineering student                                 0.09 (0.29)      0         1
College status (a)                                  1.46 (0.85)      1         4
Contract dummy (democratic = 1)                     0.67 (0.47)      0         1

Group-level variables (number of observations: 76)

Variable                                            Mean (SD)        Minimum   Maximum
Group performance score                             88.74 (5.17)     70.26     100
Group size                                          4.30 (0.65)      3         5
Average age                                         19.27 (0.81)     18.25     21.48
Standard deviation of age                           0.69 (0.45)      0.06      2.46
Gender composition (proportion of male students)    0.57 (0.17)      0         1
Average SAT score                                   1389.1 (50.0)    1175      1480
Standard deviation of SAT scores                    72.1 (31.1)      17.9      169.2
Engineering student dummy (b)                       0.33 (0.47)      0         1
Contract (democratic = 1)                           0.67 (0.47)      0         1

Variable                                            Spring semester    Fall semester
                                                    Mean (SD)          Mean (SD)
Group performance score                             88.40 (6.48)       89.02 (3.88)
Group size                                          3.94 (0.60)        4.60 (0.54)
Average age                                         19.99 (0.65)       18.69 (0.29)
Standard deviation of age                           0.92 (0.48)        0.50 (0.32)
Gender composition (proportion of male students)    0.51 (0.20)        0.62 (0.12)
Average SAT score                                   1362.89 (53.82)    1410.38 (34.83)
Standard deviation of SAT scores                    77.47 (34.23)      67.82 (28.05)
Engineering student dummy (b)                       0.44 (0.50)        0.24 (0.43)
Contract (democratic = 1)                           0.68 (0.47)        0.67 (0.48)

(a) The variable College status takes values 1-4, corresponding to the ranks of freshman to senior.
(b) The variable Engineering student dummy takes the value of 1 if a group has at least one engineering student.

The biggest challenge we encounter in our estimations is the identification of the causal impact of the group governance structure on group performance, especially since its choice is clearly influenced by the characteristics of group members and other circumstantial factors.
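The identification problem can be written compactly in the textbook form of a dummy endogenous variable model. The notation below is an assumption introduced for illustration and is not taken from the paper: $y_g$ is the performance of group $g$, $D_g$ indicates the democratic contract, $X_g$ collects group characteristics, and $Z_g$ is an instrument excluded from the performance equation.

```latex
\begin{align}
  y_g   &= X_g'\beta + \delta D_g + \varepsilon_g, \\
  D_g^* &= X_g'\gamma + \theta Z_g + u_g, \qquad D_g = \mathbf{1}\{D_g^* > 0\}, \\
  \begin{pmatrix} \varepsilon_g \\ u_g \end{pmatrix}
        &\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
           \begin{pmatrix} \sigma^2 & \rho\sigma \\ \rho\sigma & 1 \end{pmatrix} \right).
\end{align}
```

In this formulation the contract choice is endogenous whenever $\rho \neq 0$, that is, whenever unobservables in the choice equation are correlated with unobservables in the performance equation, and $\delta$ is the treatment effect of interest. Joint maximum likelihood over both equations is one standard way to estimate such a system.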
Note that the same unobservable factors that influence group performance are likely to also influence a group's contract choice. For example, an individual who expects to work less for the course may insist on using the autonomous contract, under which free-riders are not punished; at the same time, groups with such potential free-riders are likely to underperform. Our empirical strategy is to use an instrumental variable that affects group performance solely through its impact on the governance choice. The instrumental variable for the group contract choice, TA influence, uses TAs' responses to the following brief questionnaire, which we sent in October 2005 to all the TAs of the MGT100 classes in 2003 and 2004: [14]

    During Friday sessions, did you discuss the positive aspects of the democratic or manager-led group contract (contract choice that allows unequal point allocation within groups) or talk about your positive experience with these types of contracts (e.g., you actually punished your teammate who did not work well)?

TAs were asked to rate their answers on a 5-point Likert scale from 0 (strongly disagree) to 5 (strongly agree). The TA influence variable was constructed by adding up the responses from the two TAs who were in charge of each subsection. Hence, the variable takes an integer value in the range between 0 and 10, with a higher value indicating greater encouragement by TAs to choose the democratic contract. TA influence was shown to be a significant and robust determinant of groups' contract choices in our unreported Probit model estimations.

Another important issue that needs further clarification is TAs' use of their discretion in assigning students to groups. As indicated earlier, TAs were instructed to make team assignments randomly, but no formal randomization mechanism was specified. It was up to them how to form teams: some divided their students in alphabetical order and others used Excel's random number generator to assign teams. Since our instrument is potentially correlated with unobserved characteristics of TAs, it is necessary to consider whether unobserved TA characteristics could somehow be correlated with unobserved team characteristics as a result of this discretionary team assignment process.
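The construction of the TA influence variable described above amounts to summing two ratings per subsection. A minimal sketch, assuming a hypothetical mapping from subsection identifiers to the pair of ratings given by its two TAs (the function name and the subsection labels are illustrative only):

```python
def ta_influence(responses_by_subsection):
    """Sum the two TAs' ratings (each an integer from 0 to 5) per subsection."""
    result = {}
    for subsection, ratings in responses_by_subsection.items():
        if len(ratings) != 2:
            raise ValueError(f"{subsection}: expected ratings from two TAs")
        for r in ratings:
            if not (isinstance(r, int) and 0 <= r <= 5):
                raise ValueError(f"{subsection}: rating {r!r} outside 0..5")
        # The sum ranges from 0 to 10; higher means stronger encouragement
        # to choose the democratic contract.
        result[subsection] = sum(ratings)
    return result


# Made-up ratings for two subsections (not data from the study).
influence = ta_influence({"sec1": (5, 3), "sec2": (0, 0)})
```

Every group in a subsection would then share that subsection's value, so the instrument varies across subsections rather than across groups within a subsection.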
We argue that, for several reasons, it is highly unlikely that TAs took into account personality traits of students in team assignments or gave favorable treatment to some students (e.g., let friends form teams).[15] If, however, TAs took into account other factors in forming teams (e.g., placing friends on the same team) that are unobservable to us, then the estimated impact of team characteristics on contract choice and performance could be biased to the extent that these unobserved factors are correlated with both team characteristics and contract choice or performance. In order to ensure that there was no systematic sorting that might bias the estimated impact of team characteristics, we tested the null hypothesis of a truly random group formation process by comparing actual group compositions with a simulated distribution of group characteristics. The results shown and discussed in Hansen et al. (2013) indicate no sign of systematic sorting.

A concern in using TA influence as an instrument is that unobservable TA characteristics may be affecting both TA influence and the group performance directly.[16] For example, TAs who encourage students to adopt the democratic contract may be more likely to be good TAs who guide group work and facilitate learning better than those who are more neutral or negative toward the adoption of the democratic contract. In order to evaluate the validity of such concerns, we conducted some preliminary analyses (not reported in the paper) and found that: (1) TA influence had no correlation with TA working hours and team characteristics; and (2) TA influence had no correlation with the group performance once team characteristics and the contract choice are accounted for. These findings suggest that the way TAs described the possible team governance choices was not correlated with their efforts and the quality of their help or with the observed characteristics of groups.[17] Therefore, we conclude that the measured TA influence affected group performance primarily through the contract choice.

14. All former TAs responded to this short questionnaire.

15. First, the chance that TAs find their friends in their section is minimal because: (1) TAs rarely knew any of their students prior to the first meeting of the semester, especially since all students are freshmen in the fall semester and those taking the class in the spring semester are all from different schools and programs; and (2) the instructors assigned TAs to sessions after students signed up for the sessions to attend; thus, students could not choose their TAs by selecting a specific session time. Second, TAs did not have much time to learn the personality traits of the students and use this information in assigning teams because TAs assigned students into groups in the second week of the semester after only brief interactions with students playing a classroom game that they led during the first Friday session. Third, the instructors told TAs never to take requests from the students regarding team membership. It was very unlikely that they disobeyed this instruction given that TAs were the best students from previous classes; they were chosen for their honesty and commitment as well as their comprehension of the class material; and they cared about their reputation in the Business School. In addition, most students never knew that the group assignments were made by TAs, and TAs submitted the group assignment sheets to the instructor before they were announced to the students. Once groups were assigned, the students were never allowed to switch teams.

16. Another concern raised during the review process is that TA influence may reflect the needs of the students to adopt the democratic contract. Good TAs may discern for which groups the democratic contract is a better choice. TA influence, however, is the measure of whether TAs discussed the positive aspects of the democratic contract on the day the teams were making contract decisions, which is in the second week of the semester. Based on our observation as the instructors, there is no way that TAs can get any meaningful signals about each group's needs on the day it was announced, especially given that a majority of students are freshmen (235 out of 325), and all upperclassmen came from different schools outside the business school.

17. The results discussed later in the article also indicate that: (a) the section random effect, which is supposed to capture the quality of TA and the peer effect in the performance equation, is found to be quite small and almost negligible (see footnote 20 for details); and (b) our results did not change after accounting for TA working hours, a proxy variable for TA effort levels (Tables 2 and 4). These additional findings further eliminate the possibility that unobservable TA characteristics are affecting both TA influence and the group performance directly.
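The test of random group formation mentioned above compares actual group compositions with a simulated distribution. The details are in Hansen et al. (2013) and are not reproduced here, so the following is only a minimal sketch of one way such a comparison could be set up. The test statistic (the variance of group means of a single student characteristic), the function names, and the example data are all assumptions made for illustration.

```python
import random
import statistics


def between_group_variance(values, groups):
    """Variance of group means for one student characteristic.

    `groups` is a list of lists of student indices into `values`.
    """
    means = [statistics.fmean(values[i] for i in group) for group in groups]
    return statistics.pvariance(means)


def random_partition(n_students, sizes, rng):
    """Randomly split student indices into groups of the given sizes."""
    indices = list(range(n_students))
    rng.shuffle(indices)
    groups, start = [], 0
    for size in sizes:
        groups.append(indices[start:start + size])
        start += size
    return groups


def sorting_p_value(values, actual_groups, n_sim=2000, seed=0):
    """Share of simulated random assignments whose between-group variance
    is at least as large as the observed one (an upper-tail p-value).

    A small value suggests that similar students were grouped together
    more often than random assignment would produce.
    """
    rng = random.Random(seed)
    sizes = [len(group) for group in actual_groups]
    observed = between_group_variance(values, actual_groups)
    hits = 0
    for _ in range(n_sim):
        simulated = random_partition(len(values), sizes, rng)
        if between_group_variance(values, simulated) >= observed:
            hits += 1
    return observed, hits / n_sim
```

With perfectly sorted hypothetical data (for example, three groups of four students who share the same score within each group), the upper-tail share is close to zero, whereas a genuinely random assignment would typically give a value well away from zero.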

4. Empirical Analysis

In this section, we examine the link between group characteristics, the choice of governance structure, and individual and group performance. First, we investigate how group characteristics influenced the choice of team governance structure. Second, we ask how the group's governance structure affected its performance. Third, we ask which groups ended up actually punishing their members. Fourth, we investigate how the governance type influenced learning and knowledge spillovers among group members as measured by individuals' exam grades. In addition, we briefly discuss how group characteristics such as heterogeneity in gender, age, race and capabilities affected group performance and individual learning; a more detailed discussion of this issue is presented in Hansen et al. (2013).

In the subsections that follow, we estimate a model with a dummy endogenous regressor to evaluate the causal effect of the democratic contract. We first discuss the impact of contract type on performance in group work and then address the impact of contract choice on success in group members' individual exams.

4.1 Groups' Governance Choice and its Impact on Group Performance

4.1.1 Discussion of Results. The dummy endogenous regressor model we estimate is expressed as:

$$\text{Contract}^{*}_{jk} = X_j \beta_D + \alpha_D\,\text{TA influence}_j + \gamma^{D}_{k} + \varepsilon^{D}_{j} \tag{1}$$

$$\text{Contract}_{jk} = 1 \ \text{if} \ \text{Contract}^{*}_{jk} \geq 0; \quad = 0 \ \text{otherwise}$$

$$Y_{jk} = X_j \beta_Y + \delta_Y\,\text{Contract}_{jk} + \gamma^{Y}_{k} + \varepsilon^{Y}_{j} \tag{2}$$

Subscript $j$ is the index for the $j$-th study group and subscript $k$ indicates the class to which the $j$-th study group belongs. $\text{Contract}_{jk}$ denotes the contract choice of group $j$ in class $k$ and is an indicator for the democratic contract, and $\text{Contract}^{*}_{jk}$ is its latent variable. $Y_{jk}$ denotes group $j$'s performance measured by its group work grade: the weighted sum of the grades on three group assignments and the group project. $\text{TA influence}_j$ is the instrumental variable that measures the TAs' expected influence over the group's choice of contract as explained in the previous section, whereas $X_j$ is a vector of group control variables including average SAT score, proportion of male students, a dummy for five-member groups, a dummy for having an engineering student in a group, as well as standard deviations in age and SAT scores in each group.[18]

18. We dropped average age from $X_j$ because the average age and the deviation of age are highly correlated and this collinearity made it difficult to get convergence in our MLE estimation.

$\gamma^{D}_{k}$ and $\gamma^{Y}_{k}$ are vectors of class dummies that are included to account for possible differences in assignments, grading cutoffs, and teaching styles across professors and

years.[19] $\varepsilon^{D}_{j}$ and $\varepsilon^{Y}_{j}$ are unobserved variables affecting the contract choice and group performance, respectively. They are potentially correlated with each other but are assumed to be independent across groups.[20] The error terms reflect group-specific uncertainties, individual unobserved characteristics, and their interactive influence over decision-making and efforts. We employ maximum likelihood estimation (MLE) to identify the causal impact of the group governance structure.

We first discuss how group characteristics affect the contract choice. $\text{Contract}_{jk} = 1$ indicates the democratic contract for group $j$, a governance structure that implements a mechanism to punish free-riders and reward hard workers within a group. Autonomous groups have no such punishment and reward mechanism.

Results of the first stage of the endogenous dummy variable model from equation (1) are reported in the bottom half of Table 2. Columns 1 and 2 show the MLE results for the pooled sample that includes all three classes, and Columns 3 and 4 present results for the fall class only. The results for the subsample are reported due to the difference in student composition between fall and spring classes, as the former consists primarily of freshmen and is more homogeneous.

Models presented in Columns 2 and 4 include the variable TA working hours. This variable identifies the number of hours worked by TAs during an average week, and it is an indicator of TAs' efforts to help students with course work and the group project. Although it is confirmed that TA influence and TA working hours are not significantly correlated, this fact may not completely rule out the possibility that some unobservable TA characteristics affect both contract choice and TAs' effort levels in a nonlinear form and/or as a result of interactions with team characteristics.
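The likelihood behind the MLE of the dummy endogenous regressor model is not written out in the text. As a minimal sketch, the standard form for such a model can be coded as below, assuming that the two error terms are bivariate normal with a unit variance in the contract equation, an outcome standard deviation `sigma`, and a correlation `rho` (the two variance components reported in Table 2). The function names and the way the linear indices are passed in are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def loglik_obs(y, d, xb, index, delta, sigma, rho):
    """Log-likelihood contribution of one group.

    y     : group performance
    d     : contract indicator (1 = democratic, 0 = autonomous)
    xb    : outcome index without the treatment term (controls + class effect)
    index : contract-equation index (controls + TA influence + class effect)
    delta : coefficient on the contract indicator
    """
    resid = (y - xb - delta * d) / sigma
    # Density of the outcome equation error.
    ll_outcome = math.log(norm_pdf(resid)) - math.log(sigma)
    # Conditional on the outcome error, the contract error is normal with
    # mean rho * resid and variance 1 - rho^2, which gives P(D = d | y).
    sign = 2 * d - 1
    arg = sign * (index + rho * resid) / math.sqrt(1.0 - rho * rho)
    ll_choice = math.log(norm_cdf(arg))
    return ll_outcome + ll_choice
```

Summing `loglik_obs` over groups and maximizing over the coefficients, `sigma`, and `rho` would give the estimates; with `rho = 0` the expression separates into an ordinary regression likelihood and a probit likelihood, which is the case in which contract choice is exogenous.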
If TA influence is correlated with the group performance through TAs' help efforts, accounting for such information in the performance regression will make the exclusion restriction satisfied. Although we do not have an exact measure of TAs' provision of help, we do have a very good proxy: TA working hours, a measure of TAs' efforts.

19. As described in Section 2, the course is designed and taught in cooperation with other faculty in the management department and the instructors of the course use the same grading sheet and share all teaching materials. Therefore, differences are limited across classes and semesters.

20. Groups in the same subsection have the same TAs and are likely to have some knowledge spillover among them. To account for a possible correlation among $\varepsilon^{Y}_{j}$ within each subsection, we also estimated a random-effect model where the $\varepsilon^{Y}_{j}$ share the same random component within each subsection for the pooled sample. The estimated standard deviation of this random component is small (0.75 and 1.26 with and without TA working hours, respectively) and insignificant. The estimated treatment effect of contract choice is approximately 8 points, a few points higher than in the original model. However, due to the small sample size, the treatment effect became insignificant. This small random effect and still sizable treatment effect (although insignificant due to a large standard deviation) obtained in the random effect model indicate that it is very unlikely that our qualitative results are significantly affected by possible misspecification.

If the channel

through TAs' provision of help is driving our results, we expect that including TA working hours would reduce the estimated treatment effect drastically.

Table 2. Endogenous dummy variable model: MLE

| Variable | (1) Pooled | (2) Pooled | (3) Fall class | (4) Fall class |
|---|---|---|---|---|
| Dependent variable: group performance | | | | |
| Average SAT | 1.281* (0.665) | 1.268* (0.678) | 0.512 (0.978) | 0.121 (0.972) |
| Standard deviation of SAT | 0.165 (0.619) | 0.092 (0.640) | 0.530 (0.797) | 0.499 (0.752) |
| Standard deviation of age | 0.703 (1.801) | 0.512 (1.732) | 1.643 (1.985) | 1.401 (1.883) |
| Gender composition | 5.172* (2.858) | 5.841** (2.781) | 11.824** (4.941) | 12.794*** (4.712) |
| Group size (five-member) dummy | 0.925 (1.140) | 0.939 (1.018) | 0.967 (1.322) | 0.998 (1.247) |
| Engineering student dummy | 0.755 (0.777) | 0.594 (0.774) | 0.746 (1.583) | 0.391 (1.518) |
| TA working hours | | 0.027** (0.012) | | 0.026 (0.021) |
| Contract choice | 6.595*** (1.712) | 6.056*** (1.544) | 4.229** (2.141) | 3.259 (2.236) |
| Dependent variable: contract choice | | | | |
| Average SAT | 0.311* (0.162) | 0.304* (0.168) | 0.163 (0.355) | 0.214 (0.379) |
| Standard deviation of SAT | 0.288* (0.157) | 0.288* (0.160) | 0.041 (0.303) | 0.003 (0.311) |
| Standard deviation of age | 0.852*** (0.282) | 0.861*** (0.291) | 0.506 (0.744) | 0.535 (0.754) |
| Gender composition | 1.698* (0.989) | 1.747* (1.041) | 1.125 (1.819) | 1.056 (1.806) |
| Group size (five-member) dummy | 0.895*** (0.296) | 0.804** (0.341) | 0.808 (0.563) | 0.738 (0.560) |
| Engineering student dummy | 1.035** (0.415) | 0.935** (0.451) | 0.776 (0.635) | 0.681 (0.661) |
| TA influence | 0.618*** (0.155) | 0.577*** (0.127) | 0.504** (0.208) | 0.453** (0.216) |
| TA working hours | | 0.005 (0.006) | | 0.004 (0.008) |
| Variance components | | | | |
| rho | 0.541** (0.180) | 0.510** (0.186) | 0.595* (0.254) | 0.501 (0.304) |
| sigma | 4.537*** (0.461) | 4.413*** (0.432) | 3.779*** (0.561) | 3.556*** (0.512) |
| Number of observations | 76 | 76 | 42 | 42 |

Note: *p < 0.1; **p < 0.05; ***p < 0.01; class fixed effects are included. Standard errors are clustered within session.
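As a reading aid for Table 2, the significance stars on the Contract Choice coefficients can be approximately reproduced from each coefficient and its standard error. The sketch below assumes a two-sided test against the standard normal distribution; the article does not state which reference distribution was used with the clustered standard errors, so stars elsewhere in the table need not match this approximation. The function names are hypothetical.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value of coef / se under a standard normal reference."""
    z = abs(coef / se)
    return math.erfc(z / math.sqrt(2.0))  # equals 2 * (1 - Phi(z))


def stars(p):
    """Significance stars using the thresholds in the note to Table 2."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.1:
        return "*"
    return ""


# Contract Choice row of Table 2: (coefficient, standard error), Columns 1-4.
contract_choice = [(6.595, 1.712), (6.056, 1.544), (4.229, 2.141), (3.259, 2.236)]
contract_choice_stars = [stars(two_sided_p(c, s)) for c, s in contract_choice]
```

Under this approximation the first three columns come out significant and the fourth does not, in line with the stars printed in the table.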
First-stage results show that TA influence was an important factor in the choice of group governance structure both in the pooled sample and in the

fall class. When TAs discussed the positive aspects of a democratic type of governance structure before students finalized their choice, groups were more likely to choose the democratic governance form.

Standard deviation of age has a negative and weakly significant effect in the pooled sample and a negative but insignificant effect in the fall class, where most students are freshmen and much closer in age. The results imply that groups that are more heterogeneous in age are less likely to choose the democratic team contract. Note that we do not control for average age because of its high correlation with the standard deviation of age; groups with more age diversity typically include more upperclassmen. One plausible explanation for this result is that students of similar age and year level are more likely to have similar expectations about how much effort they would exert, and therefore the risk of unexpectedly suffering a grade sanction or failing to coordinate in imposing a sanction under the democratic contract appears smaller. Another possible explanation is that upperclassmen can exert more peer pressure and leadership, which effectively substitutes for the punishment mechanism in the democratic contract. If the latter interpretation were true, we should observe a positive correlation between age diversity and group performance under the autonomous contract. A result from our estimation of a switching regression model, which is not reported in the article, indicates that there is a positive but statistically insignificant correlation between the two, so we cannot reject either interpretation.[21]

21. There is an alternative explanation, suggested by an anonymous referee, that we examined. Groups with more upperclassmen are more likely to have some members who know each other, which in turn may help the groups to form higher trust. In other words, the finding may simply reflect the tendency that trust formed through pre-existing acquaintance makes the democratic contract unnecessary. In order to examine whether this alternative explanation could be supported, we created a new dummy variable, Same Program Dummy, indicating whether the group has two or more upperclassmen from the same year level and the same school/college. (If a group has friends or those who have known each other, they are likely to be in the same year level and the same school/college.) There were 14 such groups out of 76. If the age diversity effect picks up this acquaintance effect (or friend effect), controlling for this particular group composition will reduce the magnitude of the age diversity effect. The result, which is not included in the article but available upon request, shows that adding the dummy variable barely changed the coefficients of all variables including the standard deviation of age. In addition, Same Program Dummy is not significantly associated with either contract choice or performance, contradicting this alternative interpretation.

Additional results for the first-stage equation show that groups with five members (group size dummy = 1) were more likely to choose the autonomous governance form, holding all other factors constant, although the coefficient is insignificant in the fall class, presumably due to the small sample size (note that the magnitude of the coefficient is roughly the same). This result is inconsistent with a natural prediction that members of a larger group are more likely to select the contract type that punishes free-riding, because such behavior tends to be more prominent in larger

groups. The result, however, is in accordance with the argument that free-riding behavior by one or two members is more detrimental to a smaller group. A smaller group may prefer to choose the explicit punishment mechanism that safeguards its members against potential free-riding. In contrast, in a five-member team, students might feel that they have more members than necessary and free-riding by one or even two members is not as detrimental as for those in smaller groups.[22] Another interpretation is that the psychological cost of supporting the democratic contract may be higher because it may signal students' distrust of other members. Thus, some students may veto the choice of the democratic team contract in favor of the autonomous team contract. Larger teams are more likely to have someone opposed to the democratic team contract and, therefore, end up adopting the autonomous team contract.

The upper half of Table 2 reports the results for equation (2), which includes the same set of control variables as in the first equation plus the contract choice as the treatment variable. These results show the possible impacts of group contract choice and group composition characteristics on group performance.

As shown in the Contract Choice row, the coefficient of the endogenous indicator variable for the democratic contract is statistically significant at the 1% and 5% levels for the pooled sample and the fall class, respectively, when TA working hours is not controlled for. On average, a democratic group scored 6.6 points and 4.2 points higher (out of 100 points) on group project work than an autonomous group in the pooled and fall samples, respectively, holding all other group characteristics constant.
Including TA working hours does not seem to make the treatment effect significantly smaller, especially for the pooled sample, implying that the estimated impact of contract choice is not a spurious one caused by unobserved TA characteristics that are correlated with their working hours (e.g., effort, help, commitment, etc.). These important results indicate that group members have extra motivation to work harder under the democratic team contract that punishes free-riders and rewards hard workers, whereas in autonomous groups such motivation does not exist and individual productivity could be discouraged by potential free-riding activities.

22. Footnote 31 provides support for this interpretation.

We also find that groups with higher average SAT scores perform better in their group projects than those with lower average SAT scores in the pooled sample. We do not find such an effect in the fall class, in which cross-team variation in students' SAT scores is much smaller and averages are higher relative to spring classes. Moreover, we find that groups with a higher proportion of male students perform worse in their group work than those groups with a higher proportion of female students. Interestingly, this gender effect seems to appear primarily in the fall

class where the average age of students is lower. These findings are consistent with prior work in the education literature indicating that gender effects gradually disappear as students move up in the year level (see, e.g., Richter and Trede 2006).

Finally, groups with all male students scored significantly lower than groups with all female students on average, holding all other group characteristics constant.[23] This poor average performance of male-dominant groups is interesting, especially given the result (shown in the lower half of Table 2) that groups with a higher proportion of male students are less likely to choose the democratic contract than those groups with a higher proportion of female students. If male students are more likely to free-ride, introducing a mechanism to punish free-riders should generate a greater benefit for male-dominant groups than others. One potential explanation for this is that the contract choice of male-dominant groups may have been influenced by potential free-riders' desire to avoid punishment. Knowing that they might shirk, some male students presumably voted against a democratic contract.

In Appendix A, we briefly discuss the results from alternative specifications in our modeling. In particular, we look more carefully at the possible nonlinear effects of the gender composition and group size variables on the group performance. In addition, we use a different knowledge and skill measure of a group (other than the average SAT) as a robustness check. By so doing, we address the following two concerns. First, gender composition and group size may affect how group members interact with each other in a more complex way than a simple linear measurement can capture. Second, due to possible spillover effects, the most capable and knowledgeable person in a group may have a much greater impact on the group performance than all others.
In this case, the average SAT may not be a good proxy for the group skill or knowledge level, and instead we use the highest SAT in a group. Overall, using these different specifications did not significantly improve the fit of the estimation or produce qualitatively different results compared with the one presented in Table 2. See Appendix A for the details.

4.1.2 Grade Adjustment Process. In this section, we take a close look at the incidents of grade adjustment (or punishment) among 51 democratic groups to understand which groups choose to punish. Table 3 summarizes the frequency of punishment across certain group characteristics, namely group size and gender composition.

23. See Hansen et al. (2013) for a more detailed discussion on group characteristics and their impact on group performance.

As summarized in Table 3, many groups made only moderate adjustments in grades (on average 2.7 points out of 100), whereas severe punishment of a single member was extremely rare. In fact, there were no cases in which one member's grade was lowered

by 10 points, and there were only three cases in which the maximum penalty was 5 points or more. We counted punishment at two levels: the number of groups that implemented any punishment and the number of shirkers, that is, students whose grade was lowered by at least 1 point. It is clear from Table 3 that grade adjustment is not a mere threat, because two out of every five democratic groups actually resorted to it. Therefore, some students do free-ride and others detect it and punish the free-rider.

Another important finding is that the punishment of free-riders apparently is not too difficult for larger groups to coordinate. We find that 10 out of 19 of the five-member groups under the democratic contract actually made a grade adjustment, and this ratio is higher than the incidence of punishment among three- and four-member groups, as shown in Table 3. Furthermore, whereas slightly less than half of male-dominant and balanced groups chose to punish, we find that only two out of nine female-dominant democratic groups resorted to punishment.

Table 3. Frequency of Grade Adjustment/Punishment in Democratic Teams

| By group size | Three-member | Four-member | Five-member | Total |
|---|---|---|---|---|
| Total number | 5 | 27 | 19 | 51 |
| No. of groups with grade adjustment | 1 | 9 | 10 | 20 |
| No. of shirkers (those with 1 point or more subtracted) | 0 | 9 | 10 | 19 |
| Average penalty on shirkers | NA | 3.1 | 2.4 | 2.7 |

| By gender composition | Male-dominant | Balanced | Female-dominant | Total |
|---|---|---|---|---|
| Total number | 14 | 28 | 9 | 51 |
| No. of groups with grade adjustment | 6 | 12 | 2 | 20 |
| No. of shirkers (those with 1 point or more subtracted) | 7 | 10 | 2 | 19 |
| Average penalty on shirkers | 3.3 | 2.6 | 1.3 | 2.7 |

In order to examine more rigorously what kind of groups are more likely to experience free-riding and subsequently punishment, we estimated two probit models (contract choice and punishment) simultaneously by MLE to avoid a selection bias.
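The shares quoted in the discussion of Table 3 (two out of every five democratic groups, 10 out of 19 five-member groups, and so on) can be recomputed directly from the counts in the table. The following is a small sketch; the dictionary layout and the function name are illustrative choices, while the counts are those reported in Table 3.

```python
# (total democratic groups, groups with a grade adjustment), from Table 3.
by_group_size = {
    "three-member": (5, 1),
    "four-member": (27, 9),
    "five-member": (19, 10),
}
by_gender_composition = {
    "male-dominant": (14, 6),
    "balanced": (28, 12),
    "female-dominant": (9, 2),
}


def punishment_rates(panel):
    """Share of democratic groups in each category that adjusted grades."""
    return {name: adjusted / total for name, (total, adjusted) in panel.items()}


size_rates = punishment_rates(by_group_size)
gender_rates = punishment_rates(by_gender_composition)

# Overall share of democratic groups that made any grade adjustment.
overall_rate = sum(a for _, a in by_group_size.values()) / sum(
    t for t, _ in by_group_size.values()
)
```

The overall share is 20/51, or about 0.39, and the five-member share (10/19, about 0.53) exceeds the three-member (1/5) and four-member (9/27) shares, which is the comparison the text draws.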
Based on these results, which are not reported in the article, punishment does not seem to be significantly associated with any of the group characteristics. There may be two reasons why we did not find any significant and robust relationships between group characteristics and the likelihood of punishment. First, our sample size of democratic groups (51 groups) is too

small to obtain any reliable estimates. Second, there may not be a clear association between the incidents of free-riding and grade adjustment because the personality mix of members also matters in determining the nature and the consequences of conflicts.

4.1.3 Switching Regression Model and the Discussion of Results. A characteristic of the model with a dummy endogenous regressor is that the treatment effect is assumed to be constant across all groups, whether treated or nontreated. This model is misspecified if there exists substantial heterogeneity in the treatment effect across groups. Suppose the effectiveness of each governance form varies and depends on factors that are observable to group members but not necessarily observable to researchers, such as the personalities of members, qualities of TAs, differences in course loads across team members, and so forth. If students observe all relevant information and choose the democratic team contract only when the expected benefit of the contract exceeds that of the autonomous team contract, the treatment effect should differ across groups, especially between the treated and the nontreated. To account for the possible heterogeneous treatment effects, we next estimate the following switching regression model:[24]

$$\text{Contract}^{*}_{jk} = X_j \beta_D + \alpha_D\,\text{TA influence}_j + \gamma^{D}_{k} + \varepsilon^{D}_{j} \tag{3}$$

$$\text{Contract}_{jk} = 1 \ \text{if} \ \text{Contract}^{*}_{jk} \geq 0; \quad = 0 \ \text{otherwise}$$

$$Y_{jk} = (1 - \text{Contract}_{jk})\,Y^{0}_{jk} + \text{Contract}_{jk}\,Y^{1}_{jk} \tag{4}$$

$$Y^{0}_{jk} = X_j \beta_0 + \gamma^{0}_{k} + \varepsilon^{0}_{j} \quad \text{if} \ \text{Contract}_{jk} = 0 \tag{5}$$

$$Y^{1}_{jk} = X_j \beta_1 + \gamma^{1}_{k} + \varepsilon^{1}_{j} \quad \text{if} \ \text{Contract}_{jk} = 1 \tag{6}$$

$$\begin{pmatrix} \varepsilon^{D}_{j} \\ \varepsilon^{0}_{j} \\ \varepsilon^{1}_{j} \end{pmatrix} \ \text{follows} \ N(0, \Sigma) \quad \text{where} \quad \Sigma = \begin{pmatrix} 1 & \sigma_{D0} & \sigma_{D1} \\ \sigma_{D0} & \sigma_{0}^{2} & \sigma_{01} \\ \sigma_{D1} & \sigma_{01} & \sigma_{1}^{2} \end{pmatrix} \tag{7}$$

where the same variables are used for group $j$ in class $k$ as in equations (1) and (2).
24. See Maddala (1983: 223). Amemiya (1985) refers to the model as a Type 5 Tobit Model. The model we estimate is similar to the one in DeVaro (2006), where he estimates the effect of teams and team autonomy, although his unit of observation is the workplace rather than the team and the team autonomy level is generally chosen by employers rather than workers themselves.

Using the estimation results in the switching regression model, we can obtain the conditional expectation of the treatment effect given the characteristics of group $j$, $X_j = x_j$, and its contract choice $\text{Contract}_{jk} = l$ ($l = 0$ or $1$) by estimating

$$E[Y^{1} - Y^{0} \mid X_j = x_j, \text{Contract}_{jk} = l] = x_j(\beta_1 - \beta_0) + \gamma^{1}_{k} - \gamma^{0}_{k} + E[\varepsilon^{1}_{j} - \varepsilon^{0}_{j} \mid \text{Contract}_{jk} = l].$$

Further, we can approximate the average treatment effect for the treated and the nontreated by taking the following sample averages, respectively:

$$E[Y^{1} - Y^{0} \mid \text{Contract}_{jk} = 1] \approx \frac{1}{n_1} \sum_{j:\,\text{Contract}_{jk} = 1} \left[ x_j(\hat{\beta}_1 - \hat{\beta}_0) + \hat{\gamma}^{1}_{k} - \hat{\gamma}^{0}_{k} + (\hat{\sigma}_{D1} - \hat{\sigma}_{D0})\, \frac{\phi(x_j \hat{\beta}_D + \hat{\alpha}_D\,\text{TA influence}_j + \hat{\gamma}^{D}_{k})}{\Phi(x_j \hat{\beta}_D + \hat{\alpha}_D\,\text{TA influence}_j + \hat{\gamma}^{D}_{k})} \right]$$

$$E[Y^{1} - Y^{0} \mid \text{Contract}_{jk} = 0] \approx \frac{1}{n_0} \sum_{j:\,\text{Contract}_{jk} = 0} \left[ x_j(\hat{\beta}_1 - \hat{\beta}_0) + \hat{\gamma}^{1}_{k} - \hat{\gamma}^{0}_{k} + (\hat{\sigma}_{D0} - \hat{\sigma}_{D1})\, \frac{\phi(x_j \hat{\beta}_D + \hat{\alpha}_D\,\text{TA influence}_j + \hat{\gamma}^{D}_{k})}{1 - \Phi(x_j \hat{\beta}_D + \hat{\alpha}_D\,\text{TA influence}_j + \hat{\gamma}^{D}_{k})} \right]$$

where $n_l$ denotes the number of groups with $\text{Contract} = l$, parameters with a hat indicate the estimated parameter values, and $\phi$ and $\Phi$ denote the standard normal density and distribution functions. An additional benefit of using this model is that we can investigate how the impact of group characteristics on group performance differs between the treated and the nontreated. In short, a switching regression model allows for: (1) treatment effects to vary across groups and (2) group characteristics to affect group performance differently in the two governance regimes.

The MLEs of switching regression models tend to require a larger sample for identification. Given our small sample size, we were unable to obtain the MLEs for this model. As a second-best solution, we computed Heckman's two-step estimators (Heckit hereafter) for the switching regression model. Figure 1 illustrates the kernel density estimates of treatment effects from this model.[25] The average treatment effect is 10.72, and there is only one observation out of 76 for which the estimated conditional treatment effect is negative. Therefore, the expected impact of the democratic team contract is positive for almost all groups and quite large for most.
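The conditional treatment effect formulas above can be sketched in code as follows. This is an illustration under the stated joint normality assumption, not the authors' estimation code: the function names and argument layout are hypothetical, and the inputs would come from the (unreported) two-step estimates. The last function checks how the subgroup averages reported in the next paragraph (11.94 for the democratic groups and 8.24 for the autonomous groups) combine into the overall average of 10.72, taking 51 democratic groups out of 76 and hence 25 autonomous groups.

```python
import math


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_effect(x_diff, class_diff, sigma_d1, sigma_d0, index, treated):
    """Expected treatment effect for one group given its contract choice.

    x_diff     : x_j * (beta_1 - beta_0)
    class_diff : class effect under regime 1 minus class effect under regime 0
    index      : estimated contract-equation index for the group
    treated    : True for a democratic group, False for an autonomous group
    """
    if treated:
        selection = (sigma_d1 - sigma_d0) * norm_pdf(index) / norm_cdf(index)
    else:
        selection = (sigma_d0 - sigma_d1) * norm_pdf(index) / (1.0 - norm_cdf(index))
    return x_diff + class_diff + selection


def pooled_average(effect_treated, n_treated, effect_nontreated, n_nontreated):
    """Group-count-weighted average of the two subgroup average effects."""
    total = n_treated + n_nontreated
    return (effect_treated * n_treated + effect_nontreated * n_nontreated) / total


overall = pooled_average(11.94, 51, 8.24, 25)
```

The weighted average rounds to 10.72, matching the reported overall average treatment effect. When the two covariances are equal, the selection term vanishes and treated and nontreated groups with the same characteristics have the same expected effect.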
In addition, we do not find a substantial difference between the average treatment effects for groups choosing the democratic and the autonomous contract, which are 11.94 and 8.24, respectively. As shown in Figure 1, the difference in the peak of the distribution between the treated and the nontreated is relatively small compared to the dispersion.[26]

25. The estimated coefficients from the two-step estimators are not reported but available from the authors upon request. Most coefficients for the variables of our interest are insignificant, partly due to the small sample size. One notable finding is that the average SAT is positively associated with the group performance for autonomous teams only. This may imply that, for those who choose the democratic team contract, increased and coordinated effort and knowledge inflow to the group may be substituting for pre-existing knowledge within the group.

26. We also attempted to estimate bootstrapped standard errors for the treatment effects, but failed to obtain stable estimates due to the small sample.

This implies that groups that would benefit more from the democratic team contract are not necessarily more likely to adopt it. This result raises the question of why groups with certain characteristics are more likely to

choose the democratic contract, as we find in the models presented in Table 2. One potential explanation is that the bargaining costs of agreeing on the democratic contract are greater for larger groups, male-dominant groups, and groups with upperclassmen, presumably because there is greater heterogeneity and asymmetric information regarding each member's commitment to contribute to the group work in such groups.

Figure 1. Kernel Density of Treatment Effect: Group Performance.

4.2 Impact of a Group's Governance Choice on Individual Learning

In this section, we examine the extent of knowledge spillovers that take place within groups. We investigate what group characteristics may have contributed to the learning of individual members and whether individual learning is associated with the governance choice. Similar to our analysis of the group performance data, we estimate two models: the model with a dummy endogenous regressor and the switching regression model at the individual level. One caveat is that the decision to choose the team governance form is made collectively at the group level while performance is measured at the individual level. In addition, we expect that the performances of group members on exams are likely to be correlated with each other due to knowledge spillovers: groups worked on lengthy and challenging assignments together and likely studied together for their exams as well. This presumption, however, turns out not to hold true, as we show below.

4.2.1 Dummy Endogenous Regressor Model and Individual Learning.

We estimate the following model:

$$\text{Contract}^{*}_{jk} = X_j \beta^{D} + \gamma^{D}\,\text{TAinfluence}_j + \delta^{D}_{k} + \varepsilon^{D}_{j}, \qquad \text{Contract}_{jk} = \begin{cases} 1 & \text{if } \text{Contract}^{*}_{jk} > 0 \\ 0 & \text{otherwise} \end{cases} \tag{8}$$

$$Y_{ijk} = Z_i \alpha^{Y} + X_j \beta^{Y} + \gamma^{Y}\,\text{Contract}_{jk} + \delta^{Y}_{k} + \varepsilon^{Y}_{ij} \tag{9}$$

Subscript $i$ is the index for the $i$-th student in the sample, while the subscripts $j$ and $k$ serve as the indices for the $j$-th group and the $k$-th class in which the $i$-th student participates. Equation (8) is identical to equations (1) and (3), whereas equation (9) determines the performance of student $i$, $Y_{ijk}$, which is the weighted average of the three exam scores: two midterms and one final.[27,28] The vector of control variables $Z_i$ includes individual characteristics such as year level in college, gender, SAT score, and whether students lived off campus or were engineering students. $X_j$ is the same vector of group characteristics used in previous models. The team contract choice and individual performance measures are assumed to be independent across groups, but we allow the error terms of the team contract decision and the individual performance measures within the same group to correlate with each other. Thus, the error terms are assumed to follow a normal distribution with the following covariance matrix:

$$\left(\varepsilon^{D}_{j}, \varepsilon^{Y}_{1j}, \ldots, \varepsilon^{Y}_{n_j j}\right)' \sim N(0, \Sigma), \qquad \Sigma = \begin{pmatrix} 1 & \rho_D \sigma & \cdots & \rho_D \sigma \\ \rho_D \sigma & \sigma^2 & \cdots & \rho_Y \sigma^2 \\ \vdots & \vdots & \ddots & \vdots \\ \rho_D \sigma & \rho_Y \sigma^2 & \cdots & \sigma^2 \end{pmatrix} \tag{10}$$

The results of the second stage of the endogenous dummy variable model, specified by equation (9), are reported in Table 4. We omit the results for the team contract choice, that is, the estimates of equation (8), since they are very similar to the earlier estimates presented in Table 2. We present the results for the pooled sample that includes students from all classes in Columns 1-3 and show the results for the fall semester subsample in Column 4.
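As a concrete reading of equation (10), the covariance matrix for a single group can be assembled from three parameters. The sketch below is illustrative only: the function name `build_sigma`, its argument names, and the example values are assumptions, not the authors' code. It takes the variance of the contract-choice error to be normalized to one, with each individual error having variance sigma squared, correlation rho_d with the contract error, and correlation rho_y with the other members' errors.

```python
def build_sigma(n_members, sigma, rho_d, rho_y):
    """Assemble the (n_members + 1) x (n_members + 1) covariance matrix.

    Row/column 0 is the group's contract-choice error; rows/columns
    1..n_members are the individual performance errors of its members.
    """
    size = n_members + 1
    cov = [[0.0] * size for _ in range(size)]
    cov[0][0] = 1.0  # assumed normalization of the contract-choice error
    for i in range(1, size):
        # covariance between the contract error and member i's error
        cov[0][i] = cov[i][0] = rho_d * sigma
        for j in range(1, size):
            # own variance on the diagonal, within-group covariance elsewhere
            cov[i][j] = sigma ** 2 if i == j else rho_y * sigma ** 2
    return cov


# Hypothetical values chosen only to make the structure easy to read.
example = build_sigma(n_members=3, sigma=2.0, rho_d=0.5, rho_y=0.25)
for row in example:
    print(row)
```

The matrix grows with group size, which is why the group's joint likelihood has to be evaluated one group at a time rather than one student at a time.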
Of the three columns showing estimation results for the pooled sample, Column 1 presents the Heckit results, whereas Columns 2 (our baseline model) and 3 (baseline model plus the explanatory variable for TA working hours) show results from MLE estimations.

Our most striking result is the substantial and statistically significant impact of the group governance choice on individual learning. Students in groups with the democratic team contract apparently learned more (or at least were better prepared for exams). These students obtained exam scores 4.3 to 7.0 points (out of 100) higher than those with the autonomous team contract. Among the MLE results, the coefficient on the democratic contract is larger in the fall class, implying that the impact of the contract choice is greater for the more homogeneous class of freshmen, who are mostly in business school. Among this group of students, possibly due to more similar interests and schedules, the incentive mechanism of the democratic governance structure apparently led to increased knowledge spillovers and group learning.[29]

The coefficient of SAT is statistically significant at the 1% level in all columns, indicating that an individual's SAT score is a very good indicator of how well he or she performed in MGT100 exams. None of the other individual characteristics is significant in determining student performance in exams.

Among variables indicating group characteristics, students from five-member groups performed consistently better in their exams relative to students in smaller groups. This is likely a result of either the greater pool of knowledge and skill sets in larger groups or the greater likelihood that at least one group member can help others in studying for exams. But, since the five-member group dummy also indicates that nobody in the group dropped out of the course, it may simply be correlated with unobservable group qualities such as the personality mix of group members.

27. The two midterm exams each account for 20% of the course grade and the final exam accounts for 25%. IndScore is calculated based on these weights, that is, IndScore = (20*midterm1 + 20*midterm2 + 25*final)/65.

28. We investigated the possibility of nonlinearity between performance measures and group and individual characteristics by estimating a simple ordered probit. These unreported results reveal that the grade scale does not play any important role here. In other words, the changes in characteristics that cause an improvement in grade from a B to an A are similar to those that induce an improvement from a C to a B.
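The exam weighting described in footnote 27 is a weighted average of the three exam scores, rescaled so that the weights (20, 20, and 25) sum to one. A minimal sketch follows; the function name `ind_score` and the example scores are hypothetical.

```python
def ind_score(midterm1, midterm2, final):
    """Weighted average of exam scores using the course-grade weights 20/20/25."""
    weights = (20, 20, 25)
    scores = (midterm1, midterm2, final)
    # Dividing by the sum of the weights (65) keeps the result on the 0-100 scale.
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


# Hypothetical student: 80 and 90 on the midterms, 70 on the final.
example_score = ind_score(80, 90, 70)
print(round(example_score, 2))
```

Because the final carries the largest weight, a one-point change on the final moves IndScore by 25/65 of a point, compared with 20/65 for either midterm.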
Exam performance of students from groups with larger variation in age is substantially higher (2 to 3.4 points) and statistically significant, although only weakly so in models where we control for TA working hours. This result may show that there is more learning and a higher level of knowledge spillovers when students have different knowledge levels and skill sets, as indicated by the age distribution within groups.

The negative and statistically significant effect of TA working hours is somewhat puzzling because it implies that those who get more help from TAs perform worse. There are two possible explanations. First, it may be the case that groups whose members expect to perform worse in the exam are more likely to ask their TAs for help. Thus, more TA working hours may simply indicate more students having trouble completing homework and solving sample exam questions. Second, TAs often spend time advising groups on their group projects. If working on a group project and preparing for individual exams compete for one's time, students who spent more time with TAs working on their group projects may not have performed as well as those students who did not work as much on their group projects.

29. In unreported results we find that shirkers (i.e., students who were punished by their group members in democratic groups) score on average 5 points lower in their exams. This difference in exam scores may be partially due to a lack of knowledge spillovers from group work to individual exams for some students.

Table 4. Endogenous Dummy Variable Model (Second Stage) for Individual Performance
Dependent variable: individual performance

                                  (1)           (2)           (3)           (4)
                                  Pooled sample Pooled sample Pooled sample Fall class sample
                                  Heckit        MLE           MLE           MLE
Individual characteristics
  Gender                          0.226         0.146         0.152         0.473
                                  (0.974)       (0.952)       (0.950)       (1.331)
  SAT                             2.916***      2.805***      2.818***      2.874***
                                  (0.478)       (0.501)       (0.501)       (0.665)
  Off campus                      1.799         1.351         1.586         0.638
                                  (1.715)       (1.511)       (1.505)       (2.164)
  Engineering student             1.756         1.783         1.764         0.684
                                  (1.508)       (1.668)       (1.663)       (2.688)
  Sophomore student               0.256         0.161         0.146
                                  (1.224)       (1.488)       (1.481)
  Junior student                  0.132         0.180         0.076
                                  (1.774)       (2.115)       (2.102)
  Senior student                  1.659         2.140         1.890
                                  (2.322)       (2.493)       (2.480)
Group characteristics
  Average SAT                     0.245         0.151         0.218         1.032
                                  (0.569)       (0.569)       (0.563)       (0.654)
  Standard deviation of SAT       0.166         0.241         0.156         0.309
                                  (0.396)       (0.460)       (0.454)       (0.626)
  Gender composition              4.573         5.631*        4.727         14.066***
                                  (3.648)       (3.038)       (3.026)       (4.729)
  Standard deviation of age       2.364*        1.993         2.172*        3.426*
                                  (1.204)       (1.301)       (1.259)       (1.818)
  Group size (five-member) dummy  2.158**       2.056**       2.012**       3.395***
                                  (1.036)       (0.982)       (0.968)       (1.191)
  Engineering student dummy       0.535         0.748         0.438         0.727
                                  (1.110)       (1.026)       (1.018)       (1.540)
  TA working hours                -0.040**                    -0.037**      -0.041**
                                  (0.019)                     (0.017)       (0.019)
Treatment
  Contract choice                 5.702*        4.330***      4.968***      6.984***
                                  (3.024)       (1.609)       (1.437)       (1.414)
Variance components
  sigma                                         7.331***      7.334***      7.742***
                                                (0.325)       (0.325)       (0.441)
  rho_d                                         0.170*        0.200**       0.273***
                                                (0.103)       (0.081)       (0.062)
  rho_y                                         -0.013        -0.014        0.001
                                                (0.043)       (0.043)       (0.049)
Number of individuals             325           325           325           192

Note: *p < 0.1; **p < 0.05; ***p < 0.01. Class fixed effects for (1)-(3) and a constant term for (4) are included but omitted from the table. Clustered standard errors for (1) and standard errors for (2)-(4) are in parentheses.

The gender composition of groups also had a noticeable influence on individual performance, especially in the fall class. Specifically, a student in a group with a higher proportion of males tended to do much worse on exams than students in other groups, after controlling for everything else including the gender of the student. This finding is consistent with the previous results obtained using group-level data.[30]

Finally, another notable finding is the lack of positive correlation in performance within groups after controlling for individual and group characteristics and contract choice. The estimated correlation coefficient $\rho_Y$ is even shown to be slightly negative in the pooled sample in Table 4. This indicates that the size of within-group peer effects is largely explained by observable group characteristics and contract choice, and that there is little knowledge spillover associated with unobservable group characteristics.

To sum up, after controlling for most observable individual and group characteristics, students with more knowledge resources, as measured by the age diversity and the size of their groups, and those with an effective punishment mechanism performed better in exams.

4.2.2 Switching Regression Model and Individual Learning.

We estimate a switching regression model for individual learning similar to the one we estimated in our group-level analysis (see equations (3)-(7)) to account for heterogeneous treatment effects. We again allow the error terms of the contract choice and the individual performance measures within the same group to correlate with each other, so a covariance matrix similar to equation (10) is assumed here, too.
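Because a switching regression allows the treatment effect to differ across observations, results of this kind are usually summarized by averaging the effects over the full sample, over the treated, and over the nontreated, and by reporting how many effects are negative. The sketch below assumes the per-observation effects have already been computed from the estimated model (that step is not shown and is not reproduced from the paper); the function name, dictionary keys, and input values are hypothetical.

```python
def summarize_effects(effects, treated):
    """Summarize heterogeneous treatment effects.

    effects: per-observation treatment effects (assumed to be given)
    treated: matching 0/1 flags, 1 if the observation received the treatment
    """
    def mean(values):
        return sum(values) / len(values)

    on = [e for e, t in zip(effects, treated) if t == 1]
    off = [e for e, t in zip(effects, treated) if t == 0]
    return {
        "ate": mean(effects),   # average effect over everyone
        "att": mean(on),        # average effect among the treated
        "atnt": mean(off),      # average effect among the nontreated
        "share_negative": sum(1 for e in effects if e < 0) / len(effects),
    }


# Hypothetical inputs, not data from the study.
effects = [3.0, -1.0, 4.0, 2.0]
treated = [1, 1, 0, 0]
summary = summarize_effects(effects, treated)
print(summary)
```

When the averages for the treated and the nontreated are close, as in this kind of comparison, there is little sign that units selected into treatment on the basis of their expected gain.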
Since the sample size is larger, we are able to obtain MLE results for individual performance.[31] The kernel density of treatment effects is presented in Figure 2. The average treatment effect is 2.08 points, much smaller than the result from the endogenous dummy variable model. Because the treatment effects are so dispersed, 80 out of 325 observations (25% of the pooled sample) exhibit negative treatment effects. Further, the average treatment effects for the treated and the nontreated are 2.07 points and 2.10 points, respectively, showing little difference between the two groups, similar to our group-level analysis. This indicates that the expected benefit of the democratic contract for individual performance was not perceived or considered when choosing a contract. This interpretation is further reinforced by the result that the MLE for $\rho_D$ is not significant, which implies that there is no endogeneity of the contract choice on individual performance.

Figure 2. Kernel Density of Treatment Effects: Individual Performance.

30. Hansen et al. (2013) offer an explanation for this gender effect: men and women have complementary skills, perspectives, and personalities. For example, a general finding in the psychology and education literature is that men tend to outperform women at various quantitative tasks, and women appear to outperform men at a variety of verbal tasks. Male-dominant groups are also more likely to suffer coordination failures in assigning tasks among members and organizing findings because men are more single-minded, task-oriented, and less compromising than women according to the social psychology literature.

31. The coefficient results again are not reported because many variables of interest do not have significant coefficients, but they are available from the authors upon request. One notable finding is that students learn more and score higher in their exams if they are in larger, five-member groups than in smaller groups when the autonomous team contract is chosen. However, students in democratic groups do not similarly benefit from being part of a larger team. This finding may indicate that three- or four-member groups are more vulnerable to free-riding than five-member groups because small groups have fewer people to parcel out work to, and it becomes more difficult to complete homework and group project assignments when one or two members choose not to cooperate. Since the democratic contract prevents free-riding behavior, this small-group disadvantage does not show up in democratic groups. In order to examine this hypothesis that the treatment effect should be smaller for larger groups, we calculated the sample average treatment effects for groups with three, four, and five members, which are 4.85, 2.45, and 1.30, respectively. Such an inverse relationship supports our hypothesis.

5. Conclusion

In this article, we have empirically analyzed the effect of team characteristics on the team's choice of governance structure and the combined impact of group characteristics and the group contract choice on team and individual performance. In order to properly deal with the endogeneity of the contract choice, we estimated endogenous dummy variable