UNIVERSITY OF CALIFORNIA SANTA CRUZ TOWARDS A UNIVERSAL PARAMETRIC PLAYER MODEL

A thesis submitted in partial satisfaction of the requirements for the degree of
DOCTOR OF PHILOSOPHY
in
COMPUTER SCIENCE
by
JOSEPH OSBORN
November 2012

The Thesis of Joseph Osborn is approved:
Professor X, Chair
Professor Y
Professor Z
Tyrus Miller, Vice Provost and Dean of Graduate Studies

Table of Contents
List of Figures
List of Tables
Abstract
Dedication
Acknowledgments
1 Introduction
2 Related Work
3 Method
4 Experiments
5 Results
6 Discussion
7 Figure/Table/Ref Competency
References

List of Figures
1 Combat in Dragon Quest: A toy environment

List of Tables
1 Edges in a DFS forest

Abstract
Towards A Universal Parametric Player Model
by Joseph Osborn

1. State the problem briefly.
2. Describe the methodology.
3. Summarize the findings.

ProQuest recommends that the Abstract be no longer than 350 words, as it may be posted to sites with limited file size.

DEDICATION!

ACKNOWLEDGMENTS!

1 Introduction

Automated design support for goal-oriented dynamic systems with human users (e.g., games, software, or amusement parks) relies on three types of model to make useful recommendations: a model of the task under consideration, a model of the human doing the design, and a model of the human(s) using the designed system. In the domain of game design, this last model is called a player model [1]. There are many types of player model, used for different analysis or user-adaptation tasks. Here I am concerned with generalizing a set of Individual Generative Action models, each tied to an individual game, into a Universal Induced Generative Action model applicable to any of a class of games. These terms indicate that the first models exist to Generate game Actions representative of an Individual player; if generalization succeeds, I gain a model Induced from the data that describes the Universe of players, from which I can synthesize any individual by instantiating a parameter vector. Specifically, I intend to lift a fingerprint from agents trained in the same style across multiple toy environments: hidden parameters that inform the specific policies in each case. For the sake of data gathering and labeling, I make the simplifying assumption that I may use Synthetic models (justified by reference to an internal belief or external theory) in the data to be classified, rather than requiring that the Individual models be Induced from actual human play traces; this is not a requirement of the theory, but an expedient. In other words, I want to identify whether it is possible to classify distinct play traces, possibly from distinct games, as possessing the same style or some other identifying characteristics, and to describe those characteristics.
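The idea of synthesizing an individual player by instantiating a parameter vector can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the model this thesis proposes: the three parameter names, the Outcome fields, the form of the utility function, and the toy combat actions and their numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple


@dataclass(frozen=True)
class PlayerParams:
    """Hypothetical parameter vector describing one synthetic player."""
    risk_aversion: float  # weight on expected negative reward (loss)
    greed: float          # weight on expected positive reward
    discount: float       # per-turn discount in (0, 1]; closer to 1 means more forethought


class Outcome(NamedTuple):
    """Assumed summary of what an action is expected to yield."""
    expected_gain: float
    expected_loss: float
    delay: int  # turns until the outcome is realized


def utility(params: PlayerParams, outcome: Outcome) -> float:
    """Score an action as a discounted, weighted gain minus a weighted loss."""
    net = params.greed * outcome.expected_gain - params.risk_aversion * outcome.expected_loss
    return (params.discount ** outcome.delay) * net


def choose_action(params: PlayerParams, actions: Dict[str, Outcome]) -> str:
    """Return the name of the highest-utility action for this player."""
    return max(actions, key=lambda name: utility(params, actions[name]))


# Toy turn-based combat actions; all numbers are invented for illustration.
ACTIONS = {
    "attack": Outcome(expected_gain=10.0, expected_loss=6.0, delay=0),
    "defend": Outcome(expected_gain=2.0, expected_loss=1.0, delay=0),
    "charge": Outcome(expected_gain=18.0, expected_loss=10.0, delay=2),
}

# Three parameter vectors instantiate three different synthetic players.
cautious = PlayerParams(risk_aversion=2.0, greed=1.0, discount=0.5)
reckless = PlayerParams(risk_aversion=0.5, greed=1.0, discount=0.5)
patient = PlayerParams(risk_aversion=0.5, greed=1.0, discount=1.0)

for player in (cautious, reckless, patient):
    print(choose_action(player, ACTIONS))  # prints defend, attack, charge
```

In this sketch the same action set yields different choices purely because the parameter vector changes, which is the sense in which a fingerprint could inform a policy without being tied to one game.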
I expect from intuition that candidate parameters include, but are not limited to, risk-aversion, greed, forethought, stubbornness, and curiosity, all of which can be operationalized in terms of expected negative reward (loss), expected positive reward, time discounting factors, and selected actions over time. It is also possible that additional parameters such as reaction time or social consciousness might be exhibited in real-time or multi-user environments, but this is an area for future research. It is enough for this project if the parameters cover the players of the class of turn-taking games.

Figure 1: Combat in Dragon Quest: A toy environment.

2 Related Work

Könik et al. have studied how an agent can transfer task knowledge across environments [2], but I am more interested in how an agent with the same generic parameters but no knowledge transfer would learn in the new environment. Researchers have used play metrics [3] and specific actions [4, 5] to classify players into groups, and some have identified game-specific player parameters [6]. Of the parametric approaches, Spronck and den Teuling's [6] considers hours of play and requires that players signal changes in intention, and Gold's [5] considers only a single task in the game under test and, further, finds that non-synthetic players cut across all of the preconceived categories in the model. Thue et al. [4] use an online and incremental approach which hand-codes the parameter shifts based on in-game events, whereas I hope to induce those parameter shifts by other means. All of these classifications are closely and pragmatically tied to the games or genres under test.

3 Method

...

4 Experiments

...

5 Results

Results information will go in here...

6 Discussion

Discussion information will go in here...

7 Figure/Table/Ref Competency

Above is a test figure (Figure 1), and below are a test table (Table 1) and an aligned equation environment (I prefer align to the \[\] environment). And here's a URL: http://ndseg.asee.org/application_instructions/summary_of_goals.

Edge  Order  Type
GD    1      Tree
DA    2      Tree
AB    3      Tree
BC    4      Tree
BD    5      Back
AC    6      Descendant
AF    7      Tree
FA    8      Back
FC    9      Cross
DC    10     Descendant
GE    11     Tree
EC    12     Cross
EG    13     Back

Table 1: Edges in a DFS forest.
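The edge types in Table 1 can be reproduced with a short depth-first search. This is a sketch under stated assumptions: the adjacency lists and their ordering are not given in the text and are reconstructed here from the order in which the table lists the edges, the search is assumed to start at G, and the function name classify_edges is hypothetical. The label "Descendant" follows the table's wording; many textbooks call this a forward edge.

```python
from typing import Dict, List, Tuple

# Assumed adjacency lists, ordered to match the edge order in Table 1.
GRAPH: Dict[str, List[str]] = {
    "G": ["D", "E"],
    "D": ["A", "C"],
    "A": ["B", "C", "F"],
    "B": ["C", "D"],
    "C": [],
    "F": ["A", "C"],
    "E": ["C", "G"],
}


def classify_edges(graph: Dict[str, List[str]], roots: List[str]) -> List[Tuple[str, int, str]]:
    """Run DFS from each root and return (edge, order, type) rows."""
    discovered: Dict[str, int] = {}  # discovery time of each visited vertex
    finished: Dict[str, int] = {}    # finish time of each fully explored vertex
    rows: List[Tuple[str, int, str]] = []
    clock = 0

    def visit(u: str) -> None:
        nonlocal clock
        clock += 1
        discovered[u] = clock
        for v in graph[u]:
            if v not in discovered:
                # First time v is seen: the edge joins the DFS tree.
                rows.append((u + v, len(rows) + 1, "Tree"))
                visit(v)
            elif v not in finished:
                # v is still on the recursion stack, so it is an ancestor of u.
                rows.append((u + v, len(rows) + 1, "Back"))
            elif discovered[u] < discovered[v]:
                # v was discovered after u and is already done: a descendant of u.
                rows.append((u + v, len(rows) + 1, "Descendant"))
            else:
                # v finished before u was discovered: a different subtree.
                rows.append((u + v, len(rows) + 1, "Cross"))
        clock += 1
        finished[u] = clock

    for root in roots:
        if root not in discovered:
            visit(root)
    return rows


ROWS = classify_edges(GRAPH, ["G"])
for edge, order, kind in ROWS:
    print(f"{edge:<5} {order:<6} {kind}")
```

With the assumed graph, the printed rows match Table 1 entry for entry, which is a useful sanity check on the reconstructed table.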

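An equation:

\begin{align}
w &= (X^{T} X)^{-1} X^{T} t \\
w &= \left(
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 3 & 6 & 7 & 8 & 1 \\ 9 & 9 & 7 & 6 & 0 \\ 2 & 1 & 7 & 4 & 8 \end{bmatrix}
\begin{bmatrix} 1 & 3 & 9 & 2 \\ 1 & 6 & 9 & 1 \\ 1 & 7 & 7 & 7 \\ 1 & 8 & 6 & 4 \\ 1 & 1 & 0 & 8 \end{bmatrix}
\right)^{-1}
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 3 & 6 & 7 & 8 & 1 \\ 9 & 9 & 7 & 6 & 0 \\ 2 & 1 & 7 & 4 & 8 \end{bmatrix} t \\
w &= \begin{bmatrix} 5 & 25 & 31 & 22 \\ 25 & 159 & 178 & 101 \\ 31 & 178 & 247 & 100 \\ 22 & 101 & 100 & 134 \end{bmatrix}^{-1}
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 3 & 6 & 7 & 8 & 1 \\ 9 & 9 & 7 & 6 & 0 \\ 2 & 1 & 7 & 4 & 8 \end{bmatrix} t \\
w &= \begin{bmatrix} 6.611 & 0.080 & -0.607 & -0.693 \\ 0.080 & 0.047 & -0.035 & -0.023 \\ -0.607 & -0.035 & 0.078 & 0.068 \\ -0.693 & -0.023 & 0.068 & 0.088 \end{bmatrix}
\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \\ 3 & 6 & 7 & 8 & 1 \\ 9 & 9 & 7 & 6 & 0 \\ 2 & 1 & 7 & 4 & 8 \end{bmatrix} t \\
w &= \begin{bmatrix} 0.002 & 0.935 & -1.929 & 0.837 & 1.147 \\ -0.140 & 0.024 & 0.003 & 0.154 & -0.057 \\ 0.126 & -0.047 & 0.170 & -0.147 & -0.098 \\ 0.026 & -0.131 & 0.238 & -0.117 & -0.012 \end{bmatrix} t \\
w &= \begin{bmatrix} 4.279 \\ 0.309 \\ 1.878 \\ 0.866 \end{bmatrix}
\end{align}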
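The normal-equation solution w = (X^T X)^{-1} X^T t can be checked with a small exact-arithmetic sketch. The matrix X below is the 5-by-4 matrix that appears in the equation; the target vector T_EXAMPLE is hypothetical, chosen so that the exact answer is known in advance, and the helper names (transpose, matmul, solve, least_squares) are assumptions for illustration. Solving the linear system directly avoids forming the inverse explicitly, which is a substitution for the explicit inverse shown in the equation.

```python
from fractions import Fraction
from typing import List, Sequence

# Design matrix X as it appears in the equation (5 samples, 4 columns).
X = [
    [1, 3, 9, 2],
    [1, 6, 9, 1],
    [1, 7, 7, 7],
    [1, 8, 6, 4],
    [1, 1, 0, 8],
]


def transpose(a: Sequence[Sequence]) -> List[list]:
    """Return the transpose of a rectangular matrix."""
    return [list(col) for col in zip(*a)]


def matmul(a: Sequence[Sequence], b: Sequence[Sequence]) -> List[list]:
    """Multiply two matrices given as lists of rows."""
    b_cols = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in b_cols] for row in a]


def solve(a: Sequence[Sequence], b: Sequence) -> List[Fraction]:
    """Solve a w = b exactly by Gauss-Jordan elimination (a must be invertible)."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def least_squares(x: Sequence[Sequence], t: Sequence) -> List[Fraction]:
    """Solve the normal equations (X^T X) w = X^T t for w."""
    xt = transpose(x)
    xtx = matmul(xt, x)
    xtt = [sum(a * b for a, b in zip(row, t)) for row in xt]
    return solve(xtx, xtt)


XTX = matmul(transpose(X), X)

# Hypothetical targets: T_EXAMPLE = X @ [1, 2, 0, -1], so the fit is exact.
T_EXAMPLE = [5, 12, 8, 13, -5]
W_EXAMPLE = least_squares(X, T_EXAMPLE)
print(XTX)
print([str(v) for v in W_EXAMPLE])
```

The product XTX reproduces the 4-by-4 matrix in the third line of the equation, and the hypothetical targets recover the weights they were built from.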

References

[1] A. Smith, C. Lewis, K. Hullett, G. Smith, and A. Sullivan, "An inclusive taxonomy of player modeling," University of California, Santa Cruz, Tech. Rep. UCSC-SOE-11-13, 2011. [Online]. Available: http://sokath.com/main/files/amsmith-ucsc-soe-11-13.pdf.

[2] T. Könik, P. O'Rorke, D. Shapiro, D. Choi, N. Nejati, and P. Langley, "Skill transfer through goal-driven representation mapping," Cognitive Systems Research, vol. 10, no. 3, pp. 270–285, 2009. [Online]. Available: http://www.sciencedirect.com/science/article/pii/s1389041708000715.

[3] A. Drachen, A. Canossa, and G. Yannakakis, "Player modeling using self-organization in Tomb Raider: Underworld," pp. 1–8, 2009. [Online]. Available: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=5286500.

[4] D. Thue, V. Bulitko, M. Spetch, and E. Wasylishen, "Interactive storytelling: A player modelling approach," pp. 43–48, 2007. [Online]. Available: http://www.aaai.org/papers/aiide/2007/aiide07-008.pdf.

[5] K. Gold, "Trigram Timmies and Bayesian Johnnies: Probabilistic Models of Personality in Dominion," 2011. [Online]. Available: http://www.aaai.org/ocs/index.php/aiide/aiide11/paper/viewfile/4072/4426.

[6] P. Spronck and F. den Teuling, "Player Modeling in Civilization IV," pp. 180–185, 2010. [Online]. Available: http://www.aaai.org/ocs/index.php/aiide/AIIDE10/paper/viewPDFInterstitial/2124/2565.