Running Head: ATTITUDE TOWARD NON-INTELLIGENT AGENT SCALE

Reliability and Factor Structure of the Attitude Toward Non-Intelligent Agent Scale

Richard Van Eck, University of Memphis
Amy Adcock, University of Memphis
Abstract

This paper describes the principal components analysis of a scale designed to measure the affective impact of non-intelligent pedagogical agents. Pedagogical agents are increasingly used to deliver computer-based instruction. Because of programming constraints, many researchers use non-intelligent agents, which deliver preprogrammed instruction but still appear interactive to the user. The Attitude Toward Non-Intelligent Agent Scale (ATNAS) was designed to measure user perceptions of an agent's affective impact during instruction. Scale items were derived from research on student ratings of human teachers. Principal components and reliability analyses show a scale with three identifiable constructs and high alpha coefficients.
Reliability and factor structure of the Attitude Toward Non-Intelligent Agent Scale (ATNAS)

Introduction

Statement of the problem

Computer-based learning environments that use pedagogical agents are rapidly coming to the forefront of learning technologies (Lester et al., 1997; Graesser, Wiemer-Hastings, Wiemer-Hastings, Kreuz, & Tutoring Research Group, 1999; Johnson, Rickel, & Lester, 1999; Moreno, Mayer, Spires, & Lester, 2001; Baylor & Ryu, 2003). Pedagogical agents allow designers to create an environment in which learners can interact with an agent to get advice, feedback, or instruction. Some of these agents are capable of interacting with learners in a conversational, dynamic format, while others are used to present relatively static content and feedback. Existing research in agent environments has predominantly focused on learning gains and on describing the roles agents can take in an instructional scenario (e.g., Baylor, 2000; Cassell & Thorisson, 1999; Link, Kreuz, Graesser, & Tutoring Research Group, 2001; Moreno, Mayer, & Lester, 2000; Moreno et al., 2001; Rickel, 2001; Whatley, Staniford, Beer, & Scown, 1999). While learning gains are a primary variable of interest in the study of any pedagogical tool, it is also important to examine affective components. The role that affective variables such as mood, motivation, attitude toward instruction, and attitude toward content can play in the learning process has been well documented, and many believe these factors are at least as important as direct measures of learning gains (e.g., Anderson, 1995; Bardwell, 1984; CTGV, 1992c; Lent, Brown, & Larkin, 1984; Marsh, Cairns, Relich, Barnes, & Debus, 1984; Sedighian & Sedighian, 1996; Shaw & Costanza, 1970; Smead & Chase, 1981). Some researchers have begun to examine these kinds of affective variables with pedagogical agents (Baylor & Ryu, 2003; Lester et al., 1997; Lester & Stone, 1997). It is important for researchers who are working
with agent learning environments to be able to analyze learning gains independently of affective reaction to the agent. While current research indicates that the effect of agents on learning gains is nominal, affect remains an important variable in the design of appealing animated pedagogical agent environments. This paper describes the initial development and reliability tests of an instrument to measure learners' affective response toward non-conversational pedagogical agents.

The evolution of pedagogical agent environments

Although not a brand-new technology, the use of software agents has increased markedly over the last few years. Initially, the term software agent referred to software programmed with some type of agent communication language, in turn defined as a complex algorithm designed for effective communication of statements back and forth between agent and program (Genesereth, 1994). In this context, agents were best used as "software interoperators" (p. 53), e.g., as communication tools for the integration of databases. No thought was given to communication with human users. More recently in the field of education, the focus has been on what are termed animated pedagogical agents. This new breed of intelligent agent combines knowledge-based engineering principles with animated interface agents, merging the roles of animated and autonomous agents (Johnson, Rickel, & Lester, 1999). Although there is little doubt that dynamic, intelligent tutoring agents hold the greatest potential for deep learning, developing such systems is a resource-intensive task. For this reason, and because of technology limitations, many researchers continue to develop non-intelligent agents (i.e., those that deliver pre-specified content and feedback). These agents are easier to develop and deploy, and by adhering to sound instructional design considerations can be effective in
coaching learners and delivering instruction and feedback. Such agents have also been called advisors.

The importance of affect

"To date, researchers trying to create intelligent computers have focused on problem solving, reasoning, learning, perception, language, and other cognitive tasks considered essential to intelligence. Most of them have not been aware that emotion influences these functions in humans" (Picard, 1997, p. 47).

The role of affect is an important component in agent environments. As the literature above suggests, new technologies and developers are moving toward the use of agents in a dynamic, conversational interface. Several studies directly addressing the affective aspect of pedagogical agent learning environments have been conducted (Baylor & Ryu, 2003; Lester et al., 1997; Lester & Stone, 1997). The educational uses of agents have, to this point, centered on the agent as a guide or mentor for students. This is consistent with the tenets of learner-centered, rather than content-centered, education. The role of learners' emotions and attitudes toward the agent should also be examined if we are to create truly learner-centered environments. Researchers have begun to examine how these variables can be assessed and incorporated into the agent environment.

The need for an affective pedagogical agent scale

Even with the limited number of empirical studies, it seems that agents can serve as effective mentors or guides for students. It also appears that the central goal of designers is to create a student-centered learning environment applying the constructivist paradigm of education. Part of this requires attending to affective variables such as emotion, motivation, and attitude, and the literature to date shows that researchers are beginning to examine this (e.g., Lester et al., 1997; Baylor & Ryu, 2003). There is a need to measure affective response to both conversational
(dynamic) and non-conversational (non-intelligent) pedagogical agents. The researchers have developed a scale to measure affective response to conversational pedagogical agents based on current research on student ratings of human instructors (Adcock & Van Eck, unpublished). That scale is not appropriate for non-intelligent agents, however. The purpose of this study was to develop and validate an instrument to measure affective response to non-intelligent pedagogical agents.

Method

Most existing scales for rating attitude toward instruction (assuming the agent takes the role of the instructor) are domain specific, and it is difficult to abstract general attitudinal measures from them; they offer little guidance for the construction of an attitudinal scale. There is an extensive body of research on instructor rating scales (e.g., Abrami, d'Apollonia, & Rosenfield, 1997); however, these scales are all predicated on the dynamic interaction between student and instructor. While agent environments that mimic instructors are possible and available (e.g., Graesser, Wiemer-Hastings, Wiemer-Hastings, Kreuz, & Tutoring Research Group, 1999), the purpose of the current scale is to assess affective ratings of the agent in isolation from pedagogical efficacy and other constructs inherent in the student-teacher learning dyad. Because student instructor rating scales often include affective measures, and since agents serve in the role of instructor (whether they are intelligent agents or not), the researchers elected to identify some of the more common, non-interactive/pedagogical constructs from items on student teacher rating scales, which were then modified as necessary for an agent environment. The researchers reviewed existing literature on student instructor ratings, most notably Abrami, d'Apollonia, and Rosenfield (1997), to identify general constructs and items that are used to rate instructors.
Additional items were also added based on the researchers' past experience with agent
environments. The initial scale consisted of thirty items divided into five constructs: Intelligence, Attractiveness/Likeability, Teaching Ability, Trustworthiness, and Interestingness. Each item was scored on a six-point Likert-type scale (1 = strongly disagree to 6 = strongly agree). Appendix A lists each question with its corresponding construct in this initial scale.

Participants

Participants for this study were eighty-four undergraduate students at a large private urban college. Participants were volunteers who were offered extra course credit in psychology courses in exchange for participation in the experiment.

Data Analysis

Data were collected from participants taking part in a study of stereotyping of pedagogical agents in a lesson on how to take a blood pressure reading, created with Macromedia Authorware 6.0. Participants were randomly assigned to one of four conditions created by crossing the two independent variables: gender and race of agent. Participants viewed the agents (male African American, male Caucasian, female African American, or female Caucasian) as they delivered preprogrammed instruction. Participants completed a posttest and the attitude scale items described in this paper. After data collection, the scale results were extracted from the overall data set and a principal components analysis was run on the items to determine whether the items for each of the five subscales did indeed load on five factors. This analysis was used to evaluate the validity of the constructs and to determine whether each item was strongly enough related to its subscale to justify its inclusion. The results of the initial item analysis and the steps taken to revise and construct the final version of the scale are discussed in the following section.

Results
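Before turning to the numbers, the extraction step just described can be sketched in a few lines of Python. This is an illustration only: the response matrix is simulated (the study's raw data are not reproduced in this paper), and the Oblimin rotation used in the actual analysis is omitted here, since rotation redistributes loadings but does not change the eigenvalues used for factor retention.

```python
# Minimal sketch of principal components extraction for Likert item data.
# All values are illustrative; the matrix below is a random stand-in for
# the study's 84-participant-by-30-item response data.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in: 84 participants rating 30 six-point Likert items.
responses = rng.integers(1, 7, size=(84, 30)).astype(float)

# Principal components are the eigenvectors of the item correlation matrix;
# each eigenvalue gives the variance explained by its component.
corr = np.corrcoef(responses, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(corr))[::-1]

# Kaiser criterion: retain components with eigenvalues greater than one.
retained = eigenvalues[eigenvalues > 1.0]
pct_explained = 100 * retained.sum() / eigenvalues.sum()
print(len(retained), round(pct_explained, 1))
```

Because the correlation matrix of k items has trace k, the eigenvalues always sum to the number of items, which is why each eigenvalue divided by k gives its "% of variance" as reported in Tables 1 and 2.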
The results of the initial principal components analysis, using Oblimin rotation with Kaiser normalization, indicated seven factors with eigenvalues greater than one (9.99, 4.05, 1.72, 1.50, 1.37, 1.17, and 1.07, respectively). These factors explained 69.5% of the total variance (see Table 1).

Table 1. Eigenvalues from the initial principal components analysis

         Initial Eigenvalues
Factor   Total   % of Variance   Cumulative %
1        9.99    33.29           33.29
2        4.05    13.49           46.78
3        1.72     5.72           52.50
4        1.50     4.99           57.49
5        1.37     4.56           62.05
6        1.17     3.90           65.94
7        1.07     3.58           69.52

The structure and pattern matrices showed that eleven items loaded on the first factor, four on the second, five on the third, five on the fourth, five on the fifth, four on the sixth, and four on the seventh. Seven items loaded on more than one factor. Problematic items (e.g., those that loaded on more than one factor or had weak strength of association) were removed. The researchers then examined each construct to determine what common threads tied the items together. Factors 3, 5, 6, and 7 did not make sense as subscales, and because each retained only two or three items once problematic items were removed, they were cut from the instrument. The resulting factors were named Interestingness (factor 1, eight items), Trustworthiness (factor 2, four items), and Appearance (factor 4, two items). A second principal components analysis using Oblimin rotation with Kaiser normalization was run on these final scale items. The three components had eigenvalues greater than 1 (5.41, 2.72, and 1.35, respectively) and accounted for 67.71% of the total variance. Table 2 presents these results. Appendix B presents the final scale items.

Table 2. Results of the second principal components analysis
         Initial Eigenvalues
Factor   Total   % of Variance   Cumulative %
1        5.41    38.62           38.62
2        2.72    19.44           58.06
3        1.35     9.64           67.71

The three constructs were then analyzed for reliability. The alpha coefficient for the overall scale was .87. Reliability for subscale 1, Interestingness, was .91; for subscale 2, Trustworthiness, .83; and for subscale 3, Appearance, .71.

Discussion

The finalized version of the Attitude Toward Non-Intelligent Agent Scale is composed of fourteen items in three subscales: Interestingness (eight items), Trustworthiness (four items), and Appearance (two items). Principal components analysis and reliability analysis support the validity and reliability of this instrument and its subscales. The rationale for the development of this scale was to provide a reliable, valid, and generalizable way to measure the affective response of learners toward a variety of pedagogical agents. Previously, researchers measured affect in pedagogical agent environments by creating new items each time an experiment was designed (e.g., Lester et al., 1997; Baylor & Ryu, 2003). The ATNAS was created to meet the need for a standardized measure of affect in these environments, and it appears to provide a reliable, valid measure of the learner's affective response toward non-intelligent agents. The ATNAS will allow researchers interested in measuring affect to do so consistently and reliably, and will permit meaningful comparison across studies. Although work remains to be done, this initial analysis is encouraging. The researchers intend to develop additional items for the Appearance subscale and to collect additional data for analysis in the near future.
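Researchers who wish to recompute the subscale reliabilities on new samples can apply the standard Cronbach's alpha formula directly to raw item responses. The sketch below is a minimal illustration, assuming responses are arranged as a respondents-by-items matrix; the function name and the toy check data are hypothetical, not part of the study.

```python
# Cronbach's alpha: alpha = k/(k-1) * (1 - sum of item variances / variance
# of the total score), for k items. Function and example data are
# illustrative only.
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_respondents, n_items) response matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)          # per-item sample variance
    total_var = items.sum(axis=1).var(ddof=1)      # variance of total score
    return k / (k - 1) * (1 - item_vars.sum() / total_var)

# Sanity check: perfectly redundant items give alpha = 1.
base = np.arange(1, 7, dtype=float)
perfect = np.column_stack([base, base, base])
print(round(cronbach_alpha(perfect), 2))  # -> 1.0
```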
APPENDIX A

Initial items for the Attitude Toward Non-Intelligent Agent Scale

Agent Intelligence
1. "The character seemed intelligent."
2. "The character didn't know very much."
3. "The character seemed competent."
4. "The character seemed to know the material."
5. "The character did not seem to have a good grasp of the material."
6. "The character did not seem very smart."

Agent Attractiveness/Likeability
7. "The character was likeable."
8. "The character had a pleasing appearance."
9. "I liked the character."
10. "I don't think most people would find the character likeable."
11. "The character was ugly."
12. "The character seemed friendly."

Agent Ability to Teach
13. "The character presented the material in a way that was easy to understand."
14. "I learned a lot from the character."
15. "The character was a good teacher."
16. "It was hard to understand the material because of how the character presented it."
17. "I would NOT like to learn from this character if I studied other topics the same way."
18. "The character could not teach the material well."
Agent Trustworthiness
19. "I could trust what the character was saying."
20. "I believed what the character had to say."
21. "Most people would have a hard time believing what the character had to say."
22. "The character seemed untrustworthy."
23. "I wouldn't trust what the character had to say about anything."
24. "I believed what the character said."

Agent Interestingness
25. "The character was interesting."
26. "The character got my attention."
27. "The character was boring."
28. "I did not find the character interesting."
29. "The character made the material more interesting."
30. "Most people would find the character uninteresting."
APPENDIX B

Results of final factor analysis on revised scale items

Revised factors (overall reliability = .87)

1. Interestingness (reliability = .91)
"Most people would find the character uninteresting."
"The character made the material more interesting."
"I did not find the character interesting."
"The character was boring."
"The character got my attention."
"The character was interesting."
"I liked the character."
"The character seemed friendly."

2. Trustworthiness (reliability = .83)
"I wouldn't trust what the character had to say about anything."
"The character seemed untrustworthy."
"Most people would have a hard time believing what the character had to say."
"The character didn't know very much."

3. Appearance (reliability = .71)
"The character was ugly."
"The character had a pleasing appearance."
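Several of the final items above are negatively worded (e.g., "The character was boring"), so scoring the six-point ratings typically requires reverse-coding those items before summing a subscale. The paper does not publish a scoring key, so the sketch below is purely hypothetical: the function, the toy data, and the choice of which items to reverse are assumptions for illustration, not the authors' procedure.

```python
# Hypothetical subscale scoring for a six-point Likert scale like the ATNAS.
# Reverse-coding maps 1<->6, 2<->5, 3<->4 for negatively worded items.
import numpy as np

SCALE_MAX = 6  # 1 = strongly disagree ... 6 = strongly agree

def subscale_score(responses, reverse_coded=()):
    """Sum each respondent's items after reverse-coding negative items.

    responses     : (n_respondents, n_items) array of ratings in 1..6
    reverse_coded : column indices of negatively worded items (assumed here)
    """
    r = np.asarray(responses, dtype=float).copy()
    for j in reverse_coded:
        r[:, j] = (SCALE_MAX + 1) - r[:, j]
    return r.sum(axis=1)

# Toy example: two respondents, three items, item index 2 negatively worded.
resp = np.array([[6, 5, 1],
                 [1, 2, 6]])
print(subscale_score(resp, reverse_coded=[2]))  # sums: 17.0 and 4.0
```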
REFERENCES

Abrami, P. C., d'Apollonia, S., & Rosenfield, S. (1997). The dimensionality of student ratings of instruction: What we know and what we do not. In R. P. Perry & J. C. Smart (Eds.), Effective teaching in higher education: Research and practice. Edison: Agathon Press.

Anderson, J. R. (1995). Cognitive psychology and its implications (4th ed.). New York: W. H. Freeman.

Bardwell, R. (1984). The development and motivational function of expectations. American Educational Research Journal, 21(2), 461-472.

Bates, J. (1994). The role of emotion in believable agents. Communications of the ACM, 37(7), 122-125.

Baylor, A. (1999). Intelligent agents as cognitive tools for education. Educational Technology, 39(2), 36-40.

Baylor, A. (2000). Beyond butlers: Intelligent agents as mentors. Journal of Educational Computing Research, 22(4), 373-382.

Baylor, A., & Ryu, J. (2003). Does the presence of image and animation enhance pedagogical agent persona? Journal of Educational Computing Research, 28(4), 373-394.

Bloom, B. S. (1984). The 2-sigma problem: The search for methods of group instruction as effective as one-to-one tutoring. Educational Researcher, 13(6), 4-16.

Cassell, J., & Thorisson, K. R. (1999). The power of a nod and a glance: Envelope vs. emotional feedback in animated conversational agents. Applied Artificial Intelligence, 13, 519-538.
Chappell, K. K., & Taylor, C. S. (1997). Evidence for the reliability and factorial validity of the Computer Game Attitude Scale. Journal of Educational Computing Research, 17(1), 67-77.

Cognition and Technology Group at Vanderbilt. (1992c). The Jasper series as an example of anchored instruction: Theory, program description, and assessment data. Educational Psychologist, 27(3), 291-315.

Dempsey, J., Litchfield, B., & Van Eck, R. (2002). Use of pedagogical advisement in technology-assisted learning environments. Proceedings of the annual meeting of the International Conference on Computers in Education, December 3-6, Auckland, New Zealand. ICCE Society Press.

Genesereth, M. R. (1994). Software agents. Communications of the ACM, 37(7), 48-53.

Graesser, A. C., Wiemer-Hastings, K., Wiemer-Hastings, P., Kreuz, R., & Tutoring Research Group. (1999). AutoTutor: A simulation of a human tutor. Journal of Cognitive Systems Research, 1, 35-51.

Lent, R. W., Brown, S. D., & Larkin, K. E. (1984). Relation of self-efficacy expectations to academic achievement and persistence. Journal of Counseling Psychology, 31(3), 356-362.

Lester, J. C., Converse, S. A., Kahler, S. E., Barlow, S. T., Stone, B. A., & Bhogal, R. S. (1997). The persona effect: Affective impact of animated pedagogical agents. Association for Computing Machinery. Retrieved 2001 from http://www.acm.org/sigchi/chi97/proceedings/paper/j1.htm

Lester, J. C., & Stone, B. A. (1997). Increasing believability in animated pedagogical agents. Paper presented at the First International Conference on Autonomous Agents, Marina del Rey, CA.
Link, K., Kreuz, R., Graesser, A. C., & Tutoring Research Group. (2001). Factors that influence the perception of feedback delivered by a pedagogical agent. International Journal of Speech Technology, 4, 145-153.

Marsh, H. W., Cairns, L., Relich, J., Barnes, J., & Debus, R. L. (1984). The relationship between dimensions of self-attribution and dimensions of self-concept. Journal of Educational Psychology, 76(1), 3-32.

Moreno, R., Mayer, R., & Lester, J. C. (2000). Life-like pedagogical agents in constructivist multimedia environments: Cognitive consequences of their interaction. Paper presented at the World Conference on Educational Multimedia, Hypermedia and Telecommunications, Montreal, Canada.

Moreno, R., Mayer, R., Spires, H. A., & Lester, J. C. (2001). The case for social agency in computer-based teaching: Do students learn more deeply when they interact with animated pedagogical agents? Cognition and Instruction, 19(2), 177-213.

Norman, D. (1994). How people might interact with agents. Communications of the ACM, 37(7), 68-71.

Picard, R. (1997). Affective computing. Cambridge, MA: The MIT Press.

Rickel, J. (2001). Intelligent virtual agents for education and training: Opportunities and challenges. Paper presented at Intelligent Virtual Agents (IVA 2001).

Rickel, J., & Johnson, L. (1997). Integrating pedagogical capabilities in a virtual environment agent. Paper presented at the First International Conference on Autonomous Agents, Marina del Rey, CA.

Sedighian, K., & Sedighian, A. S. (1996). Can educational computer games help educators learn about the psychology of learning mathematics in children? Paper presented at the 18th Annual Meeting of the International Group for the Psychology of Mathematics Education -- the North American Chapter, Florida, USA. Retrieved September 1998 from http://www.cs.ubc.ca/nest/egems/papers.html

Shaw, M. E., & Costanza, P. R. (1970). Theories of social psychology. New York: McGraw-Hill.

Smead, V. S., & Chase, C. I. (1981). Student expectations as they relate to achievement in eighth grade mathematics. Journal of Educational Research, 75, 115-120.

Van Eck, R., & Dempsey, J. V. (in press). The effect of competition and contextualized advisement on the transfer of mathematics skills in a computer-based instructional simulation game. Educational Technology Research and Development.