Creating a Difficulty Metric for A Sudoku Variation

Emily Alfs
Mathematics and Computer Science
Doane College
Crete, Nebraska 68333
emily.alfs@doane.edu

Abstract

Frame Sudoku is very similar to traditional Sudoku. The game is set up on a 9x9 grid divided into nine 3x3 sub-grids, and the goal is the same as in traditional Sudoku: place the values 1-9 exactly once in each row, column, and 3x3 sub-grid. Frame Sudoku differs from traditional Sudoku in how it starts: we are given only frame clues and no internal clues. As with Sudoku, there are many ways to judge the difficulty of a game. During the fall semester, we created a computer program that assesses the difficulty of any given Frame Sudoku game based on the number of times each solving technique is used in the solving process. We designed and implemented these techniques and then weighted them according to their individual difficulties. Once we had a system that could rate games, we created a machine-learning program to learn our rating system.

1 Introduction

Sudoku as we see it in newspapers and magazines is a very approachable game. You are given a nine-by-nine board that is separated into nine three-by-three sub-grids. The game starts with a certain number of cells filled in with values ranging from one through nine. The goal of the game is to place the values one through nine in each row, column, and three-by-three block exactly once.

When we see Sudoku in a newspaper or magazine, it is often accompanied by some sort of difficulty rating. These can range from easy to brainy, and ranking techniques typically vary from source to source. The goal of this research was to create a machine-learning program to learn a rating system we created for a variation of Sudoku called Frame Sudoku. Our rating system was set up as follows: we analyzed solving techniques for Frame Sudoku, assigned each technique a difficulty rating, and then assigned a difficulty level to the puzzle based on how many times each technique was used. Once we had a large enough data set, we created a machine-learning program to learn our system and tested it on other data points.

2 How to Play

Frame Sudoku is similar to traditional Sudoku in the form of the game and the goal. However, the starting givens are different. There are no cells filled in initially, and the only clues are on the outside of the board, the frame. Each clue tells the player the sum of the three closest cells. An example of this game can be seen in Figure 1.

Figure 1: An example of a typical Frame Sudoku game.
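To make the frame clues concrete, the following sketch (an assumed representation, not code from the paper) computes the clues a solved grid would produce: for every row and column, the sum of the three cells nearest each edge of the board.

    def frame_clues(grid):
        # grid: a 9x9 list of lists holding a completed Sudoku solution.
        # Each clue is the sum of the three cells closest to that edge.
        top    = [sum(grid[r][c] for r in range(3)) for c in range(9)]
        bottom = [sum(grid[r][c] for r in range(6, 9)) for c in range(9)]
        left   = [sum(grid[r][c] for c in range(3)) for r in range(9)]
        right  = [sum(grid[r][c] for c in range(6, 9)) for r in range(9)]
        return top, bottom, left, right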

As mentioned previously, each frame clue tells us the sum of the three closest cells. To help illustrate this, consider the top left block of the game in Figure 1. The 6 tells us that the values 1, 2, and 3 must go in the three-cell column below the 6. However, we do not know in what order to place them. We know that these values must be 1, 2, and 3 because Sudoku rules allow us to use the values 1 through 9 exactly once in each row, column, and block. This can be better visualized in Figure 2.

Figure 2: We know 1, 2, and 3 must go in the green cells, as those are the only values that add up to 6 under Sudoku constraints.

To figure out what order the 1, 2, and 3 must be placed in, we must use the intersecting row clues. As one might be able to tell, the strategy behind this variation relies heavily on partitions. "A partition is a way of writing an integer n as a sum of positive integers where the order of the summands is not significant, possibly subject to one or more additional constraints." [2] The partitions of the three row clues, 17, 12, and 16, will help us place the 1, 2, and 3 in column one. However, these clues are more difficult to solve for, as they admit many more partitions. Consider the frame clue 12: it can be filled in using {1, 2, 9}, {1, 3, 8}, {1, 4, 7}, {1, 5, 6}, {2, 3, 7}, {2, 4, 6}, or {3, 4, 5}. Thus, the game must be played more strategically.
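Since the solving techniques lean heavily on these partitions, a small sketch (not the authors' solver) shows how the admissible sets for a clue can be enumerated: every set of three distinct values from 1 through 9 with the given sum.

    from itertools import combinations

    def clue_partitions(clue):
        # All sets of three distinct values from 1-9 that sum to the clue.
        return [c for c in combinations(range(1, 10), 3) if sum(c) == clue]

    print(clue_partitions(6))   # [(1, 2, 3)]
    print(clue_partitions(12))  # the seven sets listed above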

3 Previous Research

During the summer, Susanna Lange and I analyzed this particular version of Sudoku to better understand it. This work was partially supported by National Science Foundation grant DMS-1262342, which funds a Research Experiences for Undergraduates program at Grand Valley State University. One of the many products of this research was a program that generated Frame Sudoku games with unique solutions, meaning they can only be filled in one way. I continued with this topic at Doane College for a senior research project.

The goal of this research was to create a difficulty metric for Frame Sudoku. We found Djape's book, which contains approximately 50 rated Frame Sudoku games [1]. Throughout the research, we attempted to reflect Djape's rating system by creating our own. Our system was developed by defining solving techniques and programming them so a computer could solve games as a person would using these techniques. Difficulty was rated by the number of times each technique was used. Based on their level of difficulty, these techniques were given different weights. These weights were added together to give us our difficulty rating, which ranges from 10 to 3,400. Unfortunately, we were not able to replicate Djape's rating system. However, we were able to create unique games and assign them difficulties to train and test our machine-learning system. From these previous research experiences, we were able to create and rate as many games as we would like to run through our neural network.

4 Neural Network

An artificial neural network (ANN) is "a computational model based on the structure and functions of biological neural networks. Information that flows through the network affects the structure of the ANN because a neural network changes - or learns, in a sense - based on that input and output." [5] The key components of a neural network are edges and nodes. Each edge, which is a connection between nodes, has a weight. The weight of an edge multiplies the value traveling down that edge. Nodes can have different types; in our case, we had two: multipliers and adders. A node accumulates the weighted values coming from its input edges, either adding all of the weighted input values or multiplying them, depending on the node type.

Neural networks consist of three layers: the input layer, one or more hidden layers, and the output layer. The input layer is just that, the initial input values; in our case, the input values were all of the frame clues for a single game. The hidden layer is where the weighting and accumulation described above happen. To see the specific setup, please refer to Figure 4 at the end of this paper, which shows half of the neural network. Our output layer is a single node that represents the difficulty. In order to find the weights and node types, we had to train our neural network. Our training process used a genetic algorithm.
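The following is a minimal sketch, not the code used in this project, of how a network built from weighted edges and adder/multiplier nodes can be evaluated; the node names, the tiny topology in the usage lines, and the dictionary representation are illustrative assumptions.

    from math import prod

    def evaluate(frame_clues, node_types, edges, weights):
        # frame_clues: list of input values (one per frame clue).
        # node_types: dict node -> "add" or "mul" for hidden/output nodes.
        # edges: dict node -> list of source nodes feeding it.
        # weights: dict (src, dst) -> edge weight.
        # Nodes named ("in", i) are inputs; "out" is the single output node.
        values = {("in", i): v for i, v in enumerate(frame_clues)}

        def value_of(node):
            if node in values:
                return values[node]
            # Each incoming value is multiplied by its edge weight, then the
            # node either sums or multiplies the weighted inputs.
            weighted = [weights[(src, node)] * value_of(src) for src in edges[node]]
            values[node] = sum(weighted) if node_types[node] == "add" else prod(weighted)
            return values[node]

        return value_of("out")

    # Toy usage: two frame clues feed one adder, whose output feeds "out".
    clues = [6.0, 17.0]
    types = {"h": "add", "out": "mul"}
    edges = {"h": [("in", 0), ("in", 1)], "out": ["h"]}
    w = {(("in", 0), "h"): 0.5, (("in", 1), "h"): 0.5, ("h", "out"): 1.0}
    print(evaluate(clues, types, edges, w))  # -> 11.5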

5 Genetic Algorithm

We modeled natural evolution in a genetic algorithm by using operators such as crossover, mutation, and selection to find the best individual. Crossover behaves like a mating process in that it takes two individuals, switches some of their DNA, and produces two new offspring. Mutation takes a single individual and randomly changes its DNA based on a probability of mutation. Selection takes two random individuals, evaluates their fitnesses, and selects the individual with the better fitness [3]. This process trained our population of neural networks to ultimately give us the fittest individual, meaning the most accurate neural network.

The genetic algorithm we created was based on the Doane Evolutionary Algorithm (DEA) [4]. In our case, the individual, a single neural network, is represented as an array of 107 doubles. This allowed us to use methods built into the DEA for crossover, selection, and mutation. Our neural networks had a static structure, so they always had the same number of nodes and edges, half of which can be seen in Figure 4 at the end of the paper.

5.1 Crossover

Crossover happens at a rate of 60%: we go through the entire population and generate a random number between 0 and 1, and if that number is below 0.6, crossover happens. When an individual is selected for crossover, another random individual is selected, and the two cross over a random value in their respective arrays.

5.2 Mutation

Our mutation operator is rather straightforward. Mutation happens if a random number between zero and one is less than our chosen mutation rate μ, which we set to 0.25. Our mutation is a single-point mutation, so only one value in the array changes when mutation happens.

5.3 Selection

Within our program, we use elitist tournament selection. This ensures that the best individual from the population survives into the next generation; without elitist tournament selection, this is not guaranteed. In regular tournament selection, two individuals from the current population are selected and have their fitnesses compared. Whichever individual has the higher fitness goes to the next generation. However, both individuals remain in the current population and are subject to selection again. By using elitist tournament selection, our next generation will always contain the best member of the previous population.

5.4 Fitness

The fitness of a single neural network is measured by putting 500 Frame Sudoku games through it. These games were produced and rated in previous research and already have an expected difficulty level assigned to them. We then add together the differences between the expected difficulty level and the value of the output layer for each game.
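A minimal sketch, not the DEA implementation, of one generation of this process is given below; the genome length (107 doubles), crossover rate (60%), single-point mutation rate (0.25), two-individual elitist tournament, and fitness definition follow Sections 5.1-5.4, while network_output and the mutation range are hypothetical stand-ins for decoding a genome into the network of Section 4.

    import random

    GENOME_LEN = 107
    CROSSOVER_RATE = 0.60
    MUTATION_RATE = 0.25

    def network_output(genome, clues):
        # Hypothetical stand-in: decode the genome into the network of
        # Section 4 and evaluate it on one game's frame clues.
        return sum(w * c for w, c in zip(genome, clues))

    def fitness(genome, rated_games):
        # Sum over the rated games of the difference between the expected
        # difficulty and the network's output (lower is better).
        return sum(abs(expected - network_output(genome, clues))
                   for clues, expected in rated_games)

    def crossover(a, b):
        # Swap a single randomly chosen value between two genomes (5.1).
        i = random.randrange(GENOME_LEN)
        a[i], b[i] = b[i], a[i]

    def mutate(genome):
        # Single-point mutation with probability MUTATION_RATE (5.2).
        if random.random() < MUTATION_RATE:
            genome[random.randrange(GENOME_LEN)] = random.uniform(-10.0, 10.0)

    def next_generation(population, rated_games):
        scores = [fitness(g, rated_games) for g in population]
        elite = population[scores.index(min(scores))]
        new_pop = [elite[:]]                  # elitism: best individual survives (5.3)
        while len(new_pop) < len(population):
            i, j = random.sample(range(len(population)), 2)
            winner = population[i] if scores[i] <= scores[j] else population[j]
            new_pop.append(winner[:])         # tournament of two
        for g in new_pop[1:]:
            if random.random() < CROSSOVER_RATE:
                crossover(g, random.choice(new_pop[1:]))
            mutate(g)
        return new_pop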

6 Results

Half of our fittest neural network can be seen in Figure 4 at the end of this paper. This individual was reached using 2,000 generations, each with a population of 100,000 neural networks. Figure 5 shows a partial table of results; these results are based on games that the neural network was not trained on. Overall, the average difference between the expected value of a game and the evaluated value was 453. The standard deviation was 578 and the variance was 335,124.

Some ways that we could improve the results would be to run the genetic algorithm with more generations and individuals, or to change the structure of the neural network. As mentioned previously, our neural network had a static structure, so it never changed. The structure of the neural network could potentially evolve through the genetic algorithm as this project advances.

Figure 3: This diagram shows how the frame clues map to the input layer of the neural network.

Figure 4: This shows half of the neural network structure and half of the assigned values. The other half is symmetric in shape, though the values are different. The two halves are joined by two edges to a single node, which is our output value.

Figure 5: Partial results showing the expected value (assigned by a program using our solving techniques), the value evaluated by the neural network, and the difference between them.

References

[1] Djape. Frame Sudoku: A Hybrid Between Killer Sudoku and Outside Sudoku. CreateSpace Publishing, 2014.
[2] Hardy, Wright. Encyclopedia of Mathematics, 2003.
[3] Hiu Man Wong. Genetic Algorithms. In SURPRISE 96 Journal, 1996.
[4] Mark M. Meysenburg. The DEA: A Framework for Exploring Evolutionary Computation. In MICS 2004: Proceedings of the Midwest Instruction and Computing Symposium, April 2004.
[5] Techopedia. Artificial Neural Network.