CHAPTER 3 SYNTACTIC PATTERN RECOGNITION TECHNIQUES FOR OBJECT IDENTIFICATION


3.1. Introduction

Pattern recognition problems may be logically divided into two major categories: (i) the study of the pattern recognition capabilities of human beings, and (ii) the development of theory and techniques for the design of devices that perform a pattern recognition task for a specific application. Pattern recognition may be formally defined as the categorization of input data into identifiable classes via the extraction of significant features or attributes of the data from a background of irrelevant detail. A pattern class is a category determined by some given common attributes; a pattern is the description of any member of such a category. When a set of patterns of different classes is available, these patterns must be categorized into their respective classes through the use of some automatic device.

3.2. Design Concepts of Automatic Pattern Recognition

The design concepts for automatic pattern recognition are motivated by the ways in which pattern classes are characterized and defined [27]. Three basic design concepts are discussed below.

3.2.1. Membership-roster Concept

Characterization of a pattern class by a roster of its members suggests automatic pattern recognition by template matching. The set of patterns belonging to each class is stored in the pattern recognition system. When an unknown pattern is presented to the system, it is compared with the stored patterns, and the system classifies the input as a member of a pattern class if it matches one of the stored patterns belonging to that class. The membership-roster approach works well for near-perfect, noise-free pattern samples.

3.2.2. Common-property Concept

Characterization of a pattern class by the common properties shared by all of its members suggests the common-property design concept. Features that reflect the similarities among the patterns of a class are stored in the pattern recognition system. When an unknown pattern is observed by

the system, its features are extracted, sometimes coded, and then compared with the stored features. The recognition scheme classifies the new pattern as belonging to a class if its features match the stored features of that class. The main objective in this approach is therefore to determine common properties from a finite set of sample patterns and to examine a new pattern for a suitable match.

3.2.3. Clustering Concept

When the patterns of a class are vectors whose components are real numbers, the class can be characterized by its clustering properties in the pattern space. A pattern recognition system based on this concept can be designed using the relative geometrical arrangement of the target vectors. Unknown patterns are easily classified when the target vectors are far apart in their geometrical arrangement; the simplest recognition scheme in this case is the minimum-distance classifier. When the clusters overlap, more sophisticated techniques are required for partitioning the pattern space.

3.3. Methodologies

The basic design of automatic pattern recognition systems described above makes use of three categories of methodologies: (i)

heuristic, (ii) mathematical, and (iii) linguistic or syntactic. Sometimes a combination of these methods is used in the design of a pattern recognition system.

3.3.1. Heuristic Methods

The heuristic approach is based on human intuition and experience, making use of the membership-roster and common-property concepts. A system designed with this approach usually consists of ad hoc procedures developed for specialized recognition tasks. The heuristic approach is an important branch of pattern recognition system design, but it lacks generality, since each problem requires the application of specifically tailored design rules.

3.3.2. Mathematical Methods

The mathematical approach is based on classification rules formulated and derived in a mathematical framework, making use of the common-property and clustering concepts. It may be subdivided into two categories: deterministic and statistical. The deterministic approach rests on a mathematical framework that does not explicitly employ the statistical properties of the pattern classes under consideration, while the statistical approach is based on classification rules formulated and derived in a statistical framework.
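As a concrete illustration of the clustering concept from Section 3.2.3, on which these mathematical methods build, the sketch below implements a minimum-distance classifier; the two-dimensional prototype vectors and sample patterns are invented purely for illustration.

```python
import math

# Hypothetical class prototypes (cluster centres) in a 2-D pattern space.
prototypes = {
    "W1": (1.0, 1.0),
    "W2": (6.0, 5.0),
}

def minimum_distance_classify(pattern):
    """Assign a pattern vector to the class whose prototype is nearest."""
    return min(prototypes,
               key=lambda c: math.dist(pattern, prototypes[c]))

print(minimum_distance_classify((1.2, 0.8)))  # near W1's prototype -> "W1"
print(minimum_distance_classify((5.5, 5.1)))  # near W2's prototype -> "W2"
```

This scheme works well exactly when, as the text notes, the clusters are far apart; overlapping clusters call for more sophisticated partitioning of the pattern space.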

3.3.3. Linguistic (Syntactic) Methods

Characterization of patterns by subpatterns and their relationships suggests automatic pattern recognition by the linguistic or syntactic approach, making use of the common-property concept. A pattern can be described by a hierarchical structure of subpatterns analogous to the syntactic structure of formal languages [67], which permits the use of formal languages for tackling pattern recognition problems. A pattern grammar consists of finite sets of elements called variables, primitives, and productions. The production rules determine the type of grammar to be used for pattern recognition; among the most studied grammars are regular grammars, context-free grammars, and context-sensitive grammars. The selection of pattern primitives, the assembling of the primitives and their relationships into pattern grammars, and the analysis and recognition of patterns all use the rules of these grammars. This approach is also useful in dealing with patterns that cannot be conveniently described by numerical measurements.

3.4. Syntactic Pattern Recognition (SPR)

Among the various techniques for object recognition, the syntactic pattern recognition technique is generally preferred when high-speed

recognition is a matter of concern. The idea behind syntactic pattern recognition is the specification of a set of pattern primitives, a set of rules that governs their interconnection, and a recognizer whose structure is determined by the set of rules in the grammar. The description of an object is called a pattern. When a person perceives a pattern, he makes an inductive inference and associates his perception with concepts or clues derived from his past experience. Thus the problem of pattern recognition may be regarded as a classification process that discriminates input data not between individual patterns but between pattern classes, via a search for certain invariant attributes among the members of the classes. The patterns used for identification and classification are either spatial or temporal. Spatial patterns are those which occupy space, such as characters, fingerprints, weather maps, physical objects, and pictures. Temporal patterns are time based, such as speech waveforms, electrocardiograms, target signatures, and time series. Syntactic pattern recognition is an approach to pattern recognition that utilizes the concepts of formal language theory. The term syntactic pattern recognition is synonymous with linguistic, grammatical, and structural pattern recognition. The difference between the mathematical approach and the

syntactic approach is that the latter explicitly utilizes the structure of the pattern in the recognition process, whereas the mathematical approach deals with patterns on a strictly quantitative basis.

3.5. Formal Language Theory

Syntactic pattern recognition follows the theory of formal languages. The origin of formal language theory may be traced to the mid-1950s, with Noam Chomsky's development of a mathematical model of grammar related to his work on natural languages. The concepts needed to comprehend formal language theory are defined below:

An alphabet is any finite set of symbols.

A word over an alphabet is any string of finite length composed of symbols from the alphabet. For example, valid words over the alphabet {0, 1} include 0, 1, 00, 01, 10, 11.

A word with no symbols is called the empty word and is denoted by Λ.

A language is any set of words over an alphabet.
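The definitions above can be exercised with a short sketch (not part of the original text) that enumerates all words over the alphabet {0, 1} up to a given length, with the empty word Λ as the length-0 case:

```python
from itertools import product

alphabet = ["0", "1"]

def words_up_to(length):
    """All words over the alphabet with at most `length` symbols.
    The empty string is the empty word, denoted Λ in the text."""
    result = []
    for n in range(length + 1):
        result.extend("".join(p) for p in product(alphabet, repeat=n))
    return result

print(words_up_to(2))
# ['', '0', '1', '00', '01', '10', '11'] — Λ plus the example words above
```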

Every formal language is associated with a grammar, which is basically a 4-tuple

G = (V_N, V_T, P, S)

where V_N is a set of non-terminals (variables), V_T is a set of terminals (constants), P is a set of productions or rewriting rules, and S is the start or root symbol. S belongs to V_N; V_N and V_T are disjoint sets, and V denotes their union. V* denotes the set (free monoid) of all words over V, including the empty word Λ, whereas V+ = V* − {Λ} is the set (free semigroup) of nonempty words. The language generated by G, denoted L(G), is the set of strings that satisfy two conditions: (i) each string is composed only of terminals (i.e., each string is a terminal sentence), and (ii) each string can be derived from S by suitable application of productions from the set P. The set P consists of expressions of the form α → β, where the arrow indicates replacement of the string α by the string β, with α a string in V+ and β a string in V*. The set of production formulas forms part of a normal algorithm, whose concept was introduced

by A.A. Markov. The normal algorithm recognizes angles from changes in direction during contour tracking; this is done by the Look-Ahead Tracing (LAT) technique. Grammars differ only in their productions. The main types of grammar are:

Unrestricted grammar: productions of the form α → β, where α and β are arbitrary strings.

Context-sensitive grammar: productions of the form α_1 A α_2 → α_1 β α_2, where α_1 and α_2 are in V*, β is in V+, and A is in V_N. This grammar allows replacement of the non-terminal A by the string β only when A appears in the context of α_1 and α_2.

Context-free grammar: productions of the form A → β, where A is in V_N and β is in V+. The name context-free arises from the fact that the variable A may be replaced by the string β regardless of the context in which A appears.

Regular (or finite-state) grammar: productions of the form A → aB or A → a, where A and B are variables in V_N and a is a terminal in V_T.

These grammars are called type 0, 1, 2, and 3 grammars, respectively. The basic concepts underlying syntactic pattern

recognition is illustrated by the development of mathematical models of computing machines, called automata. Given an input string, an automaton can recognize whether the pattern belongs to the language with which the automaton is associated. A finite automaton is defined as the 5-tuple

A_f = (Q, Σ, δ, q_0, F)

where Q is a finite, nonempty set of states, Σ is a finite input alphabet, δ is a mapping from Q × Σ into the collection of all subsets of Q, q_0 is the starting state, and F is the set of final or accepting states.

3.6. Formulation of the Syntactic Pattern Recognition Problem

Suppose that we have two pattern classes, W_1 and W_2, whose patterns are composed of features from some finite set. We call the features terminals, collected in the set V_T. Here we use the set of terminals V_T = {R, DR, D, DL, L, UL, U, UR}, and the root symbol changes depending on the object to be recognized. Note that R, D, L, and U are elementary symbols and DR, DL, UL, and UR are composite symbols of the alphabet V_T. Certain primitives may also be used instead of terminals in syntactic pattern recognition. If there exists a grammar

with the property that the language it generates consists of sentences or words (patterns) belonging exclusively to one of the pattern classes, say W_1, then this grammar G_1 can be used for pattern classification: a pattern belongs to W_1 if it is a word in L(G_1).

3.7. Syntactic Pattern Description

The object to be recognized in an image is, in this case, a two-dimensional pattern. String descriptions of such a pattern can be obtained by simple juxtaposition of strings to form new strings. Juxtaposition of two strings means placing the objects together without losing the identity of the objects. Concatenation can also be used, but it involves spatial rearrangement as well as a loss of identity on the part of the individual objects. Juxtaposition of structures takes place only at two points, the head and the tail of an arrow defined by those two points. Graph-like patterns can be treated as two-dimensional patterns which can then be reduced to an equivalent string representation. Another useful technique for describing two-dimensional relationships is based on tree structures. A tree is a finite set T of one or more nodes such that: There is a specially designated node called the root of the tree

The remaining nodes (excluding the root) are partitioned into m disjoint sets T_1, T_2, ..., T_m, m ≥ 0, where each of these sets is in turn a tree; these trees are called subtrees of the root. A node of degree zero is called a leaf, while a node of higher degree is called a branch node. The tree representation of a pattern is called a pattern tree. Figure 3.1 shows a sample tree diagram for a given pattern.

Figure 3.1: Tree representation of patterns

A syntax-directed grammar provides a mechanism for determining whether or not a pattern can be generated by a particular grammar. Once the grammars are known, the basic problem is the development of a procedure for determining whether or not a given pattern represents a valid formula, word, or sentence. As outlined earlier, the procedure used in formal language theory to accomplish this is called parsing. We consider two types of parsing techniques: (i) top-down and (ii) bottom-up. In top-down parsing, the top or root of the (inverted) tree is the start symbol S, and through repeated application of the productions of the grammar one attempts to arrive at the given terminal sentence. The

bottom-up approach, on the other hand, starts with the given sentence and attempts to arrive at the symbol S by applying the productions in reverse. In either case, if the parsing fails, the given pattern represents an incorrect sentence and is therefore rejected. The parsing process can be further improved by employing the rules of syntax of the grammar [11]. Syntax is defined in terms of the juxtaposition and concatenation of objects; a rule of syntax states some permissible (or prohibited) relation between objects. A syntax-directed parser employs the syntax of the grammar in the parsing process. The syntactic pattern description of various types of objects as a top-down process, and the regeneration of objects from the sentences obtained as a bottom-up approach, are briefly described in the next chapter.
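To tie Sections 3.5 and 3.6 together, the sketch below defines a finite automaton A_f = (Q, Σ, δ, q_0, F) that recognizes chain-code sentences over a subset of V_T. The particular language chosen (R⁺D⁺L⁺U⁺, a rectangle-like contour) and all state names are hypothetical, invented purely for illustration; for simplicity, δ is implemented as a deterministic partial function, a special case of the set-valued mapping defined in Section 3.5.

```python
# A finite automaton A_f = (Q, Sigma, delta, q0, F) for the illustrative
# language R+ D+ L+ U+, a rectangle-like contour traced with the
# elementary primitives of V_T. States and transitions are hypothetical.
Q = {"q0", "qR", "qD", "qL", "qU"}
SIGMA = {"R", "D", "L", "U"}
DELTA = {
    ("q0", "R"): "qR", ("qR", "R"): "qR",
    ("qR", "D"): "qD", ("qD", "D"): "qD",
    ("qD", "L"): "qL", ("qL", "L"): "qL",
    ("qL", "U"): "qU", ("qU", "U"): "qU",
}
F = {"qU"}

def accepts(sentence):
    """Return True if the automaton recognises the chain-code sentence."""
    state = "q0"
    for symbol in sentence:
        state = DELTA.get((state, symbol))
        if state is None:          # no transition defined: reject
            return False
    return state in F

print(accepts(["R", "R", "D", "L", "L", "U"]))  # True: fits R+ D+ L+ U+
print(accepts(["R", "U", "D"]))                 # False: no (qR, U) move
```

A pattern whose chain-code sentence is accepted would be classified into the class associated with this automaton's language, exactly as formulated in Section 3.6.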