Scrabble sucks! Toward higher-order word games

Scrabble sucks! Toward higher-order word games
!!Con, 18 May 2014

For the purposes of this talk, I'm going to assume you know what Scrabble is. If not, I give you permission to google it on your phone real quick or something. I'm going to assume you know what Google is.

I'm @aparrish

This is me. I'm a computer programmer, experimental poet and game designer.

Scrabble sucks!

Now, a lot of people have fun playing Scrabble and there's an amazing competitive culture surrounding it. So something must be good about it and I don't want to detract from that! So I want to revise the title of this talk a bit, to something more like...

I make word games that differ from Scrabble in several important particulars!!

...this one. What I'm going to talk about in this talk is what I think is wrong with Scrabble and some of the games I've designed that work differently and (I hope) better. Along the way there will be some computer programming!

An anecdote

So what evil did Scrabble visit on me to make me hate it so much? Once I was in Utah visiting my family over the holidays and we decided to play a board game. Someone suggested Scrabble and I was like, okay. This is essentially how the game went:

First my mom played "north." A perfectly good play, worth 24 points.

Then my little sister played "fireman" for 26 points. And then it's my turn to play and I look at my rack and my eyes light up and I don't WANT to be an insufferable smart-ass, but what can I do? So I play...

QOPH. And "qi" and "pe" and "hm." It's a pretty good move, perfectly legal Scrabble, and it's worth 66 points... but is it worth the contention and strife caused by playing not one but FOUR weird words in one turn? This move had a deleterious effect on our fun. Everyone thought I was engaging in ostentatious brainshowboating. AND I KIND OF WAS. It wasn't fun for anyone and eventually we decided to play something else. So here's why I don't like Scrabble:

Scrabble turns otherwise nice people into pedantic a**holes

A nicer way to phrase this would be: competitive Scrabble play requires a lot of arcane knowledge. You have to memorize a lot of words, both tiny and large. So when you're playing with people who haven't memorized all this hermetic vocabulary, it can lead to hurt feelings, and hurt feelings are no fun! It's worth mentioning that other games aren't like this: if I was better than someone in my family at soccer or Street Fighter, they probably wouldn't think I was being a smart-ass.

Why?

It's easy to chalk these problems up to the culture of Scrabble, or game balance issues, or simply the inability of certain individuals (ahem, me) to keep their smartassery in check for more than thirty seconds at a time. But I think there are actual structural problems with the game worth mentioning. Here's one:

Unigram frequency
- The "moving parts" of the game are individual letters
- Commonality and value of letters based on (some model of) letter frequency in English
- Letters drawn at random

Scrabble is based on what I'm going to call "unigram frequency," by which I mean the "moving parts" of the game are individual letters, whose value and commonality are determined by the frequency of those letters in the English language. Letters are put into play by drawing them randomly, one at a time.

There are advantages to this model! Letter frequency is super intuitive: everyone understands that E is a really common letter, but Z isn't. And this model is familiar and successful enough that many, many word games are based on it.
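To make the unigram model concrete, here is a minimal Python sketch that counts letter frequencies in a word list and turns them into tile values. The toy word list and the scoring rule (rarest letter gets 10, most common gets 1) are assumptions for illustration only; they are not Scrabble's actual tile distribution or point values.

```python
from collections import Counter

# Hypothetical toy word list; a real game would use a full dictionary.
WORDS = ["north", "fireman", "the", "tree", "zebra", "queen", "letter"]

def letter_frequencies(words):
    """Return each letter's share of all the letters in the word list."""
    counts = Counter(ch for word in words for ch in word.lower())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()}

def letter_values(freqs, max_value=10):
    """Illustrative scoring rule (an assumption, not Scrabble's):
    scale linearly so the rarest letter is worth max_value
    and the most common letter is worth 1."""
    lo, hi = min(freqs.values()), max(freqs.values())
    values = {}
    for ch, f in freqs.items():
        if hi == lo:
            values[ch] = 1
        else:
            values[ch] = 1 + round((max_value - 1) * (hi - f) / (hi - lo))
    return values

freqs = letter_frequencies(WORDS)
values = letter_values(freqs)
print(sorted(values.items(), key=lambda item: item[1]))
```

Even with a list this small, E comes out as the cheapest letter while Z and Q land at the top of the scale, which matches the intuition everyone already has.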

-ing, ck, re-

HOWEVER, our intuitive understanding of how words are put together doesn't stop at just knowing which letters are the most frequent. We also know things like... -ing occurs at the ends of words a lot, and the letter 'k' is often preceded by 'c', and re- is something you can put in front of verbs to make new verbs. Scrabble doesn't model or reward any of that knowledge at all. Like I said, it mostly rewards the memorization of specific vocabulary.

In unigram frequency games, the most valuable plays will be words densely packed with rare letters.

So, my hunch/hypothesis: if valuable letters are associated with higher scores, and there are fewer valuable letters, and those letters are drawn at random, the most valuable plays will be shorter words with rarer letters. And short words with rare letters tend to be those smart-ass vocabulary test words we were talking about earlier. They're the smart-ass vocabulary words that make it so your family won't play board games with you anymore.

But it doesn't have to be this way.

I don't think people should have to memorize long lists of words, or have extensive vocabularies, in order to enjoy and be competitive at word games. As a linguist and a poet I am sensitive to the fact that everyone is a fluent speaker of their own language. Everyone has intuition about how the words of their language are put together, and you know, everyone wants to play and have fun with that knowledge without being made to feel unintelligent. It breaks my heart that someone would come away from any word game thinking that they weren't a smart and creative person. That sucks!

My experiments

So over the past few years, I've been making word games that DON'T use unigram frequency as their model, in an attempt to escape the arcane hermetic vocabulary problem that plagues Scrabble. Here are some of those experiments.

Rewordable
- Card game for 2+ players
- Co-designed with Adam Simon and Tim Szetela
- Based on the most common unigrams, bigrams and trigrams in English
- Cards freely available at rewordable.com

The first experiment is... what happens if you just change the unit from unigrams to higher-order n-grams? Rewordable is a game I designed a few years ago with my friends Adam and Tim with this in mind. It's a card game played with a deck of 160 cards, each of which has a unigram, bigram or trigram on it. ("Bigram" and "trigram" just mean groupings of two or three letters, respectively.) The bigrams and trigrams were selected because they're the most frequent of their kind in English words: sequences like "ing" or "er."
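Picking "the most frequent of their kind" is a counting exercise. Here is a minimal Python sketch of it, assuming a tiny made-up word list; the function names `ngrams` and `most_common_ngrams` are hypothetical, and this is not the process actually used to build the Rewordable deck.

```python
from collections import Counter

def ngrams(word, n):
    """All runs of n consecutive letters in a word, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]

def most_common_ngrams(words, n, k):
    """The k most frequent n-grams across the word list, with counts."""
    counts = Counter(g for w in words for g in ngrams(w.lower(), n))
    return counts.most_common(k)

# Hypothetical toy word list; a real deck would be based on a full dictionary.
WORDS = ["singing", "ringer", "finger", "winter", "bring"]

print(most_common_ngrams(WORDS, 2, 3))  # top bigrams
print(most_common_ngrams(WORDS, 3, 3))  # top trigrams
```

On this list the top trigram is "ing" and the top bigram is "in", which is the kind of chunk you would want printed on a card.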

The idea is that players will be able to form longer, more satisfying words, because the sequences of letters on the cards themselves are longer. Here are some action shots. We're still working on getting the rules just right but I think by and large it accomplishes the goal of encouraging fluent word creation without the frustrations of Scrabble. LOOK FOR A KICKSTARTER SOON.

The second experiment I want to talk about is Characterror, a video game I made. You play as the little "ship" there on the right, which is trailing letters. You "fire" the letters into one of the "slots" there on the left. When you've made a word that you're satisfied with, you can "score" it, clearing the slot. The idea is to build the longest words you can. The "trick" of this game is that the letters trailing the ship aren't random, or selected merely by their unigram frequency: they're generated with a Markov chain. Essentially: using a statistical model of English words, the game determines which letters are most likely to help you form a word, based on the letters already in play on the board.

QUICK ASIDE: how Markov chains work

Wait, Markov what now?

condescendences
bigrams: co on nd de es sc ce en nc

I'm talking to a room full of math wizards who know this better than me, so pardon the inexact language, but this is how I think about it and how I explain it to students. Let's take a corpus consisting of a single string, the word "condescendences," and make a list of all unique bigrams in that word (all sequences of two letters). A Markov chain looks at each of these n-grams and then records which letters FOLLOW those n-grams. Take the bigram "de," for example: a Markov chain analysis would tell us that half the time (in our corpus of one word) it's followed by 's' and half the time it's followed by 'n'.
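The bookkeeping just described is small enough to sketch in a few lines of Python. This is an illustration, not the code from any of the games; `build_table` and the `EOL` end-of-word marker are names made up for the sketch.

```python
from collections import Counter, defaultdict

END = "EOL"  # assumed marker for "the word ends here"

def build_table(words, n=2):
    """Map each n-gram to the probability of each letter that follows it."""
    follows = defaultdict(Counter)
    for word in words:
        for i in range(len(word) - n + 1):
            gram = word[i:i + n]
            # The last n-gram in a word is followed by the end marker.
            nxt = word[i + n] if i + n < len(word) else END
            follows[gram][nxt] += 1
    table = {}
    for gram, counts in follows.items():
        total = sum(counts.values())
        table[gram] = {ch: c / total for ch, c in counts.items()}
    return table

table = build_table(["condescendences"])
print(table["de"])  # {'s': 0.5, 'n': 0.5}
```

Passing a whole dictionary instead of a single word gives the same kind of table, just with many more rows.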

condescendences

bigrams  next letter?
co       n (1.0)
on       d (1.0)
nd       e (1.0)
de       s (0.5), n (0.5)
es       c (0.5), EOL (0.5)
sc       e (1.0)
ce       n (0.5), s (0.5)
en       d (0.5), c (0.5)
nc       e (1.0)

Here's that process but applied to the entire string, giving us a table with probabilities. We can use this data to make predictions: given an n-gram, what letter is most likely to follow? Markov chains are famously used for generating amusing nonsense text: if we make predictions recursively, using the previous prediction as input for the next prediction, we can come up with new sequences of letters that statistically resemble but are not identical to the original source text, like the word...

condendescencesces
(CON den DES sen SES ses)

...this is a word "generated" from a bigram Markov chain of the word "condescendences." Now if we had a Markov chain probability table of not just a single word but ALL words in the English language, we could make word games that take into account which letters are already in play, and supply players with letters likely to lead to more common, longer words. That's what's happening in Characterror.
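Here is a minimal Python sketch of that generation loop: start from a bigram, pick the next letter at random in proportion to how often it followed that bigram, and repeat until the end marker comes up. It is a sketch under assumptions, not Characterror's code; `build_counts`, `generate`, `is_reachable`, the `EOL` marker and the `max_len` cutoff are all made up for illustration.

```python
import random
from collections import Counter, defaultdict

END = "EOL"  # assumed marker for "the word ends here"

def build_counts(words, n=2):
    """Map each n-gram to a count of the letters that follow it."""
    follows = defaultdict(Counter)
    for word in words:
        for i in range(len(word) - n + 1):
            nxt = word[i + n] if i + n < len(word) else END
            follows[word[i:i + n]][nxt] += 1
    return follows

def generate(follows, start, rng, max_len=40):
    """Extend `start` one letter at a time until END is drawn or max_len is hit."""
    n = len(start)
    out = start
    while len(out) < max_len:
        counts = follows.get(out[-n:])
        if not counts:
            break  # this n-gram never appeared in the corpus
        letters = list(counts)
        nxt = rng.choices(letters, weights=[counts[l] for l in letters])[0]
        if nxt == END:
            break
        out += nxt
    return out

def is_reachable(word, follows, n=2):
    """True if every step in `word`, including its ending, occurs in the table."""
    steps = [(word[i:i + n], word[i + n]) for i in range(len(word) - n)]
    steps.append((word[-n:], END))
    return all(nxt in follows.get(gram, {}) for gram, nxt in steps)

follows = build_counts(["condescendences"])
print(generate(follows, "co", random.Random()))
print(is_reachable("condendescencesces", follows))  # True
```

The output differs from run to run, but every word it produces is stitched together from transitions that appear somewhere in "condescendences."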

Lexcavator is another video game I made. It's a cross between Boggle and Mr. Driller: you find and select words to clear them from the board, allowing your at-symbol to progress further into the game. To ensure that the words you find are interesting and fun, the board is generated with Markov chains! In two dimensions!

R E R A A
C F T H U
C A T R O
? ? ? ? ?

Here's a simplified diagram of how Lexcavator board generation works. It starts with a few rows of random letters (weighted by English frequency), just to prime the process. Then for each cell in the next row, we randomly select a column of letters connected straight up or diagonally and populate that cell with a letter randomly selected from our Markov chain.
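One way to read that description, sketched in Python. This is an interpretation, not Lexcavator's actual generator: the path-picking rule, the fallback to plain letter frequency when a context was never seen, the toy word list, and every function name here are assumptions.

```python
import random
from collections import Counter, defaultdict

def build_model(words, n=2):
    """Count which letter follows each n-gram, plus plain letter counts."""
    follows = defaultdict(Counter)
    unigrams = Counter()
    for word in words:
        unigrams.update(word)
        for i in range(len(word) - n):
            follows[word[i:i + n]][word[i + n]] += 1
    return follows, unigrams

def weighted_pick(counts, rng):
    """Pick one key from a Counter, in proportion to its count."""
    letters = list(counts)
    return rng.choices(letters, weights=[counts[l] for l in letters])[0]

def context_above(board, col, n, rng):
    """Walk up n rows from the new cell, stepping straight up or diagonally,
    and return the letters along that path read top to bottom."""
    width = len(board[0])
    letters = []
    for row in reversed(board[-n:]):
        col = rng.choice([c for c in (col - 1, col, col + 1) if 0 <= c < width])
        letters.append(row[col])
    return "".join(reversed(letters))

def next_row(board, follows, unigrams, n, rng):
    """Fill one new row, one cell at a time, from the Markov model."""
    row = []
    for col in range(len(board[0])):
        counts = follows.get(context_above(board, col, n, rng))
        # Assumed fallback: unseen context, so use plain letter frequency.
        row.append(weighted_pick(counts or unigrams, rng))
    return row

def make_board(words, width=5, height=8, n=2, seed=None):
    rng = random.Random(seed)
    follows, unigrams = build_model(words, n)
    # Prime the board with n rows of frequency-weighted random letters.
    board = [[weighted_pick(unigrams, rng) for _ in range(width)] for _ in range(n)]
    while len(board) < height:
        board.append(next_row(board, follows, unigrams, n, rng))
    return board

# Hypothetical toy word list; a real board would use a full dictionary.
WORDS = ["north", "fireman", "letter", "winter", "singing", "reader", "condescend"]

for row in make_board(WORDS, seed=1):
    print(" ".join(row).upper())
```

Because each new letter is conditioned on letters that sit above it along a connected path, runs that read downward or diagonally tend to look word-like rather than like random noise.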

Does it work?

So then the question arises... do any of these techniques actually make the games better, according to the criteria I set out earlier? Well, to find this out...

Corpus analysis
- ~15000 games downloaded from crosstables.com, an archive of Scrabble game transcripts, containing a total corpus of ~670k words played
- ~620k words played in online sessions of Lexcavator, dumped from MongoDB

I collected a corpus of Scrabble games and a corpus of Lexcavator games and compared the two, graphing how often more "common" words were played in each game. I have no good reason to assume that the cross-tables data is representative of all Scrabble ever, but it seems like a reasonable place to start.

This is a Scrabble word commonality histogram overlaid with a Lexcavator word frequency histogram. The "word commonality" is judged by how frequent a particular word is in the English language (e.g., "the" is rank #1, "hippocampus" is rank #32766). The graph shows how the commonality of words in both games is distributed, with higher commonality on the left. You can immediately see that in both games, more common words occur more frequently. But you can see some spots where Lexcavator and Scrabble have a different "curve."

The pattern is more evident when I graph word commonality histograms individually by word length. The areas of solid blue are where Lexcavator's words are concentrated, and the areas of solid red are where Scrabble's words are concentrated. You can see that Lexcavator's words are more bunched up toward the left of the graph, and Scrabble's are more evenly distributed across the entire range. So Lexcavator does indeed encourage people to form more common words. Success!
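The comparison boils down to looking up each played word's rank in a frequency-ordered word list and binning the ranks. Here is a minimal Python sketch of that step, with made-up data: the ranked list, the two play lists and the bin size are all hypothetical, and this is not the analysis script behind the graphs.

```python
from collections import Counter

def rank_table(words_by_frequency):
    """Map each word to its rank, where rank 1 is the most common word."""
    return {w: i for i, w in enumerate(words_by_frequency, start=1)}

def rank_histogram(plays, ranks, bin_size):
    """Count plays per rank bin; bin 0 holds the most common words."""
    hist = Counter()
    for word in plays:
        rank = ranks.get(word.lower())
        if rank is None:
            continue  # word isn't in the ranked list, so it can't be placed
        hist[(rank - 1) // bin_size] += 1
    return dict(sorted(hist.items()))

def proportions(hist):
    """Normalize counts so corpora of different sizes can be compared."""
    total = sum(hist.values())
    return {b: c / total for b, c in hist.items()} if total else {}

# Hypothetical data, ordered from most to least common.
RANKED = ["the", "of", "and", "cat", "tree", "north", "fireman", "qoph"]
ranks = rank_table(RANKED)

game_a = rank_histogram(["the", "cat", "tree", "the"], ranks, bin_size=4)
game_b = rank_histogram(["qoph", "north", "fireman", "the"], ranks, bin_size=4)
print(proportions(game_a))
print(proportions(game_b))
```

A game whose plays pile up in the low-numbered bins is one where people are mostly making common words; splitting the plays by word length before binning gives the per-length view.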

Thanks!
http://twitter.com/aparrish
http://www.decontextualize.com/
http://www.lexcavator.com/
http://www.rewordable.com/

A big thank you to the organizers of the conference for everything. It's been a privilege and a pleasure to participate.