A General Class of Noncontext Free Grammars Generating Context Free Languages


INFORMATION AND CONTROL 43, 187-194 (1979)

A General Class of Noncontext Free Grammars Generating Context Free Languages

SARWAN K. AGGARWAL
Boeing Wichita Company, Wichita, Kansas 67210

AND

JAMES A. HEINEN
Department of Electrical Engineering, Marquette University, Milwaukee, Wisconsin 53233

The concept of SVMT-bounded grammars is introduced. It is shown that SVMT-bounded grammars generate only context free languages, and that all context free grammars can be converted to equivalent SVMT-bounded grammars. It is also shown that the property of SVMT-boundedness can sometimes be used to conclude that a given language is context free while certain previous results cannot be used for this purpose.

1. INTRODUCTION

It is, of course, true that context free grammars generate context free languages. It is also well known that some noncontext free grammars generate languages that happen to be context free. These grammars are therefore no more "powerful" in their ability to generate languages than are context free grammars. Exactly which noncontext free grammars generate only context free languages is not known.

Several results have been obtained, however, that identify certain classes of grammars generating only context free languages. These classes are determined by restrictions on the form of rules allowed in the grammar. (Other results have also been obtained dealing with the manner in which the rules are applied, but these are not considered here.)

A new type of grammar (SVMT-bounded grammar) is introduced in this paper. It is shown that SVMT-bounded grammars can generate only context free languages. More importantly, it is demonstrated that certain grammars satisfying the property of SVMT-boundedness do not satisfy previously obtained conditions which guarantee the generation of context free languages. This, therefore, results in an enlargement of the known class of grammars generating only context free languages.
0019-9958/79/110187-08 $02.00/0. Copyright 1979 by Academic Press, Inc. All rights of reproduction in any form reserved.

2. NOTATIONAL PRELIMINARIES AND DEFINITIONS

This section summarizes the notation to be used in the remainder of this paper. For a more thorough description of the concepts involved, the reader is referred to Aho and Ullman (1972) and Salomaa (1973).

If $V$ is a "vocabulary" (finite nonempty set of symbols), then $V^*$ denotes the set of all finite strings over $V$, including $e$, the "empty string" (concatenation identity), and $V^+$ denotes the set of all nonempty finite strings over $V$, i.e., $V^+ = V^* - \{e\}$. If $\alpha \in V^*$, then $|\alpha|$ denotes the length of the string $\alpha$.

A "(phrase structure) grammar" is denoted by $G = (V, \Sigma, P, S)$, where $V$ is the vocabulary, $\Sigma$ (a nonempty subset of $V$) is the set of "terminals," $P$ (a nonempty finite set of rules for string generation of the form $x_0 A_1 x_1 A_2 \cdots A_n x_n \to x_0 \beta_1 x_1 \beta_2 \cdots \beta_n x_n$, where $n > 0$ and where $x_i \in \Sigma^*$, $0 \le i \le n$, and $A_i \in V - \Sigma$ and $\beta_i \in V^*$, $1 \le i \le n$) is the set of "productions," and $S \in V - \Sigma$ is the "start symbol." The elements of $V - \Sigma$ are called "nonterminals" or "variables."

A string $\gamma\alpha\delta$ "derives" the string $\gamma\beta\delta$ (denoted by $\gamma\alpha\delta \Rightarrow \gamma\beta\delta$) if $\gamma, \delta \in V^*$ and the rule $\alpha \to \beta$ is in $P$. The transitive reflexive closure of $\Rightarrow$ is denoted by $\Rightarrow^*$. The set of all "sentences" (strings of terminals) that can be derived from $S$ is called the "language" $L(G)$ generated by the grammar $G$, i.e., $L(G) = \{x \mid S \Rightarrow^* x \text{ and } x \in \Sigma^*\}$. Two grammars are "equivalent" if they generate the same language.

A grammar is "context free" if every production is of the form $A \to \beta$ where $A \in V - \Sigma$ and $\beta \in V^+$. A language $L$ is "context free" if there exists a context free grammar $G$ such that $L = L(G)$.

In some of the results that follow it will be necessary to consider strict partial orderings on a vocabulary. A relation $<$ on $V$ (a subset of $V \times V$) is a "strict partial ordering on $V$" if (i) $X < Y$ and $Y < Z$ imply $X < Z$ for all $X, Y, Z \in V$ and (ii) $X \not< X$ for all $X \in V$. Similarly, $<$ is a "strict partial ordering on $V \cup \Sigma^*$" if (i) $\alpha < \beta$ and $\beta < \gamma$ imply $\alpha < \gamma$ for all $\alpha, \beta, \gamma \in V \cup \Sigma^*$ and (ii) $\alpha \not< \alpha$ for all $\alpha \in V \cup \Sigma^*$.

The main result of the paper requires the concept of a generalized sequential machine. A generalized sequential machine (gsm) is a system $M = (Q, \Sigma, \Delta, \delta, \lambda, q_0)$, where $Q$ is the set of states (finite), $\Sigma$ is the input alphabet (a finite set), $\Delta$ is the output alphabet (a finite set), and $q_0 \in Q$ is the initial state. The function $\delta: Q \times \Sigma \to Q$ is the transition function and $\lambda: Q \times \Sigma \to \Delta^*$ is the output function. These may be extended to $Q \times \Sigma^*$ as follows:

(i) for all $q \in Q$, $\delta(q, e) = q$ and $\lambda(q, e) = e$,

(ii) for all $q \in Q$, $x \in \Sigma^*$, and $a \in \Sigma$, $\delta(q, xa) = \delta(\delta(q, x), a)$ and $\lambda(q, xa) = \lambda(q, x)\,\lambda(\delta(q, x), a)$.
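The extension of the transition and output functions to whole strings can be illustrated with a minimal Python sketch. The class name `GSM`, the method names, and the two-state machine `TOY` are hypothetical and chosen only for illustration; they do not come from the paper. Dictionaries stand in for $\delta$ and $\lambda$, and Python strings stand in for elements of $\Sigma^*$ and $\Delta^*$.

```python
class GSM:
    """Sketch of a generalized sequential machine (Q, Sigma, Delta, delta, lambda, q0)."""

    def __init__(self, delta, lam, q0):
        self.delta = delta  # dict: (state, input symbol) -> next state
        self.lam = lam      # dict: (state, input symbol) -> output string
        self.q0 = q0        # initial state

    def delta_ext(self, q, x):
        # delta(q, e) = q and delta(q, xa) = delta(delta(q, x), a)
        for a in x:
            q = self.delta[(q, a)]
        return q

    def lam_ext(self, q, x):
        # lambda(q, e) = e and lambda(q, xa) = lambda(q, x) lambda(delta(q, x), a)
        pieces = []
        for a in x:
            pieces.append(self.lam[(q, a)])
            q = self.delta[(q, a)]
        return "".join(pieces)

    def run(self, x):
        # Output produced when the machine reads x starting from q0.
        return self.lam_ext(self.q0, x)


# Hypothetical toy machine: erases every second 'a' and doubles every 'b'.
TOY = GSM(
    delta={("p", "a"): "q", ("q", "a"): "p", ("p", "b"): "p", ("q", "b"): "q"},
    lam={("p", "a"): "a", ("q", "a"): "", ("p", "b"): "bb", ("q", "b"): "bb"},
    q0="p",
)
```

With this toy machine, `TOY.run("aabab")` evaluates to `"abbabb"`. The loop in `lam_ext` is just the recursion in (ii) unrolled from left to right.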

The associated gsm mapping $M$ is defined as $M(x) = \lambda(q_0, x)$ for all $x \in \Sigma^*$, and, for a language $L$, $M(L) = \{M(x) \mid x \in L\}$. It can be shown that gsm mappings preserve the context free nature of a language.

Finally, a regular set is a subset of $\Sigma^*$ which is either $\emptyset$ (the empty set), $\{e\}$, $\{a\}$ for some $a \in \Sigma$, or which can be obtained from these sets by a finite number of unions, concatenations, and/or operations under $*$. It is well known that intersection with a regular set also maintains the context free nature of a language.

The following notation, similar to that used by Baker (1974), will be convenient in the proof of the main result. For a grammar $G = (V, \Sigma, P, S)$,

$L_G = \max\{|\alpha| \mid \alpha \in \Sigma^* \text{ and there exist } \gamma, \beta \in V^* \text{ such that } \alpha\gamma \to \beta \in P\}$,

$R_G = \max\{|\alpha| \mid \alpha \in \Sigma^* \text{ and there exist } \gamma, \beta \in V^* \text{ such that } \gamma\alpha \to \beta \in P\}$,

$M_G = \max\{\{0\} \cup \{|\alpha| \mid \alpha \in \Sigma^* \text{ and there exist } \gamma, \delta, \beta \in V^*,\ A, B \in V - \Sigma \text{ such that } \gamma A \alpha B \delta \to \beta \in P\}\}$,

$N(G) = \sum_{\alpha \to \beta \in P} (|\alpha| - 1)$.

It is noted that the grammar $G$ is context free if and only if $N(G) = 0$.

3. GENERATIVE POWER OF CERTAIN GRAMMARS

A problem that has been studied considerably in recent years is that of determining the circumstances under which the language $L(G)$ generated by a grammar $G$ will be context free even though the grammar $G$ may not itself be context free. Several results consisting of sufficient conditions on the form of the rules of $G$ to guarantee that $L(G)$ is context free have been developed by previous investigators (Hibbard, 1966; Ginsburg and Greibach, 1966; Book, 1972; Baker, 1974). That is to say, conditions have been given under which the "addition of context" does not increase the generative power of a grammar over that obtainable without it. Two of these results are presented below.

THEOREM 1 (Hibbard, 1966). If $G = (V, \Sigma, P, S)$ is a grammar, $<$ is a strict partial ordering on $V$, each rule in $P$ is of the form $A_1 A_2 \cdots A_n \to X_1 X_2 \cdots X_m$ where $A_i \in V - \Sigma$, $1 \le i \le n$, and $X_j \in V$, $1 \le j \le m$, and there exists a $k$, $1 \le k \le m$, such that $A_i < X_k$ for all $i$, $1 \le i \le n$, then $L(G)$ is context free.

The concept of terminal bounded grammars will be needed for the next theorem.

DEFINITION 1. The grammar $G = (V, \Sigma, P, S)$ is "terminal bounded" if each rule in $P$ is of the form $x_0 A_1 x_1 A_2 \cdots A_n x_n \to y_0 B_1 y_1 B_2 \cdots B_m y_m$, where $x_i, y_j \in \Sigma^*$, $0 \le i \le n$, $0 \le j \le m$, $A_k, B_r \in V - \Sigma$, $1 \le k \le n$, $1 \le r \le m$, and either $n = 1$ or there exists a $p$, $0 \le p \le m$, such that $|x_i| < |y_p|$ for all $i$, $1 \le i \le n - 1$.

THEOREM 2 (Baker, 1974). If $G = (V, \Sigma, P, S)$ is a terminal bounded grammar, then $L(G)$ is context free.

Baker (1974) has shown that the various results of Ginsburg and Greibach (1966) and Book (1972) are simply special (restricted) cases of Theorem 2, and, consequently, they are not repeated here. In addition, although it employs a somewhat different approach, Theorem 1 can also be obtained as a corollary from Theorem 2.

4. SVMT-BOUNDED GRAMMARS

A new type of grammar (SVMT-bounded) is introduced in this section. It is based on the concept of SVMT-strings.

DEFINITION 2. If $G = (V, \Sigma, P, S)$ is a grammar, the string $\alpha \in V^*$ is an "SVMT-string" (single variable or multiple terminal string) if $\alpha \in V \cup \Sigma^*$.

The following definition introduces a particular type of strict partial ordering defined on the elements of $V \cup \Sigma^*$ (all SVMT-strings):

DEFINITION 3. Suppose that $G = (V, \Sigma, P, S)$ is a grammar and $<$ is a strict partial ordering on $V \cup \Sigma^*$. Then $<$ is an "SVMT-ordering on $G$" if:

(i) $e < \alpha$ for all $\alpha \in V \cup \Sigma^+$,

(ii) $A < x$ for all $A \in V - \Sigma$ and $x \in \Sigma^+$,

(iii) $x < y$ for all $x, y \in \Sigma^*$ such that $|x| < |y|$, and

(iv) whenever $A < B$ and $B \in V - \Sigma$, then $A \in (V - \Sigma) \cup \{e\}$.

It is noted that there can be SVMT-strings $\alpha$ and $\beta$ which satisfy neither $\alpha < \beta$ nor $\beta < \alpha$.

It can be shown that if $G = (V, \Sigma, P, S)$ is a grammar and $<$ is a strict partial ordering on $V$ then the addition of all the relationships implied by conditions (i)-(iii) of Definition 3 produces a new strict partial ordering $<$ on $V \cup \Sigma^*$. This can be easily proved by a straightforward (but lengthy) verification of the conditions of the definition of a strict partial ordering. For the details of the proof the reader is referred to Aggarwal (1976).
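As a rough illustration of how an ordering on the variables is extended by conditions (i)-(iii), the following Python sketch tests the extended relation and brute-force checks the strict partial ordering axioms and condition (iv) on a small finite sample of SVMT-strings. It is only a sanity check under stated assumptions, not a substitute for the proof in Aggarwal (1976). The vocabulary and the relationships B, C < D are borrowed from Example 2 later in the paper. All function names are hypothetical, variables are represented as single uppercase characters, and terminal strings are represented as Python strings of lowercase characters.

```python
from itertools import product

VARIABLES = {"S", "B", "C", "D"}   # V - Sigma, as in Example 2
TERMINALS = {"a", "b", "c"}        # Sigma, as in Example 2
BASE = {("B", "D"), ("C", "D")}    # assumed base ordering among the variables


def is_variable(s):
    return s in VARIABLES


def is_terminal_string(s):
    # True for every element of Sigma*, including the empty string.
    return all(ch in TERMINALS for ch in s)


def svmt_less(u, v):
    """u < v under BASE extended by conditions (i)-(iii) of Definition 3."""
    if (u, v) in BASE:
        return True
    if u == "" and v != "":                                     # condition (i)
        return True
    if is_variable(u) and v != "" and is_terminal_string(v):    # condition (ii)
        return True
    return (is_terminal_string(u) and is_terminal_string(v)
            and len(u) < len(v))                                # condition (iii)


def sample_universe(max_len=2):
    # All variables plus all terminal strings of length <= max_len.
    strings = [""]
    for n in range(1, max_len + 1):
        strings += ["".join(p) for p in product(sorted(TERMINALS), repeat=n)]
    return sorted(VARIABLES) + strings


def is_strict_partial_order(items):
    irreflexive = not any(svmt_less(u, u) for u in items)
    transitive = all(
        svmt_less(u, w)
        for u, v, w in product(items, repeat=3)
        if svmt_less(u, v) and svmt_less(v, w)
    )
    return irreflexive and transitive


def satisfies_condition_iv(items):
    # Whatever lies below a variable must itself be a variable or e.
    return all(
        u == "" or is_variable(u)
        for u, v in product(items, repeat=2)
        if is_variable(v) and svmt_less(u, v)
    )
```

On this sample, `B` and `C` are incomparable, as are the terminals `a` and `b`, which matches the remark that some pairs of SVMT-strings are related in neither direction.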
It will be useful for the theorem that follows to introduce the concept of the SVMT-normal form of a string.

DEFINITION 4. If $G = (V, \Sigma, P, S)$ is a grammar and $\alpha \in V^+$, then the "SVMT-normal form" of $\alpha$ is $\alpha = \alpha_1 \alpha_2 \cdots \alpha_n$, where each $\alpha_i$, $1 \le i \le n$, is a nonempty SVMT-string, and if $n \ge 2$ then for all $i$, $1 \le i \le n - 1$, at least one of the two strings $\alpha_i$ and $\alpha_{i+1}$ is a single variable (i.e., an element of $V - \Sigma$). If $\alpha = e$, then its "SVMT-normal form" is $\alpha = \alpha_1 = e$.

Thus a string is put in SVMT-normal form by writing it in terms of its "maximal" SVMT-string components. The following example will illustrate this idea:

EXAMPLE 1. Let $V = \{A, B, S, a, b, c\}$ and $\Sigma = \{a, b, c\}$. If $\alpha = ABabAabcA$, then its SVMT-normal form is $\alpha = \alpha_1 \alpha_2 \alpha_3 \alpha_4 \alpha_5 \alpha_6$, where $\alpha_1 = A$, $\alpha_2 = B$, $\alpha_3 = ab$, $\alpha_4 = A$, $\alpha_5 = abc$, and $\alpha_6 = A$.

Clearly, the SVMT-normal form representation of any given string is unique. The concepts of SVMT-bounded rules and SVMT-bounded grammars will now be defined.

DEFINITION 5. If $G = (V, \Sigma, P, S)$ is a grammar and $<$ is an SVMT-ordering on $G$, then the rule $\alpha_1 \alpha_2 \cdots \alpha_n \to \beta_1 \beta_2 \cdots \beta_m$ (written in SVMT-normal form) is "SVMT-bounded with respect to $<$" if there exists a $k$, $1 \le k \le m$, such that $\alpha_i < \beta_k$ for all $i$, $1 \le i \le n$.

DEFINITION 6. If $G = (V, \Sigma, P, S)$ is a grammar and $<$ is an SVMT-ordering on $G$, then $G$ is "SVMT-bounded" if each rule in $P$ (with the possible exception of $S \to e$) is SVMT-bounded with respect to $<$.

In the next section the generative power of SVMT-bounded grammars will be determined.

5. GENERATIVE POWER OF SVMT-BOUNDED GRAMMARS

The following theorem is the main result of this paper:

THEOREM 3. If $G = (V, \Sigma, P, S)$ is an SVMT-bounded grammar, then $L(G)$ is context free.

Proof. The proof closely parallels Baker's (1974) proof of Theorem 2 stated above. Many of the details are therefore omitted. The first step is to construct a new grammar $G_1 = (V_1, \Sigma_1, P_1, S)$ from $G$. Two cases must be considered.

Case A. $L_G > M_G$ or $R_G > M_G$.

It will be assumed that $R_G \ge L_G$. (The development for $L_G > R_G$ is similar.)
$G_1$ is constructed by choosing any rule $\alpha x \to \beta x$ in $P$ such that $x \in \Sigma^*$ and $|x| = R_G$, and replacing this rule by the rule $\alpha \to \beta x\$$ in $P_1$. This results in a grammar $G_1 = (V \cup \{\$\}, \Sigma \cup \{\$\}, P_1, S)$ with

(i) $N(G_1) < N(G)$, and

(ii) $L(G) = M_1(L(G_1) \cap R_1)$, where $R_1 = (V \cup \{\$x\})^*$ is a regular set and $M_1$ is a gsm.

Case B. $M_G \ge L_G$ and $M_G \ge R_G$.

In this case the construction consists of choosing a rule in $P$ of the form $x_0 A_1 x_1 A_2 \cdots A_m x_m \to x_0 \alpha_1 x_1 \alpha_2 \cdots \alpha_m x_m$, where $x_i \in \Sigma^*$, $0 \le i \le m$, $A_j \in V - \Sigma$, $\alpha_j \in V^*$, $1 \le j \le m$, and such that $|x_k| = M_G$ for some $k$. This rule is then rewritten as $\alpha A_k x_k A_{k+1} \beta \to \gamma y \delta$, where $\alpha, \beta, \gamma, \delta \in V^*$, $y \in \Sigma^*$, and $|y| > |x_k|$. Finally, this rule in $P$ is replaced by the following two rules in $P_1$:

$\alpha A_k \to \gamma y \$$,

$x_k A_{k+1} \beta \to x_k ¢ \delta$.

In this case the result is a grammar $G_1 = (V \cup \{\$, ¢\}, \Sigma \cup \{\$, ¢\}, P_1, S)$ with

(i) $N(G_1) < N(G)$, and

(ii) $L(G) = M_1(L(G_1) \cap R_1)$, where $R_1 = (V \cup \{\$ x_k ¢\})^*$ is a regular set and $M_1$ is a gsm.

The next step is to repeat the constructions in Cases A and B as often as necessary to produce a series of grammars, gsm's, and regular sets satisfying

$L(G) = M_1(L(G_1) \cap R_1)$,

$L(G_1) = M_2(L(G_2) \cap R_2)$,

$\ldots$

$L(G_{n-1}) = M_n(L(G_n) \cap R_n)$,

for which $N(G_n) < N(G_{n-1}) < \cdots < N(G_1) < N(G)$. For the grammar $G_n$, $L_{G_n} = R_{G_n} = M_{G_n} = 0$, but there is as yet no guarantee that $N(G_n) = 0$. It will now be shown, however, that $L(G_n)$ is indeed a context free language.

Since $L_{G_n} = R_{G_n} = M_{G_n} = 0$, all rules in $P_n$ are of the type $A_1 A_2 \cdots A_m \to \alpha_1 \alpha_2 \cdots \alpha_k$, where $A_i \in V - \Sigma$, $1 \le i \le m$, and $\alpha_j$, $1 \le j \le k$, is an SVMT-string. The constructions described above do not disturb the SVMT-bounded nature of the rules in $P$ and hence $G_n$ is an SVMT-bounded grammar. This grammar $G_n$ is exactly in the form described in Theorem 1 above and, as such, $L(G_n)$ is context free.

Now, since $L(G)$ can be obtained from $L(G_n)$ by a finite number of operations by gsm's and intersections with regular sets, both of which preserve the context free nature of a language, $L(G)$ is also context free. Q.E.D.
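The SVMT-normal form of Definition 4, the boundedness test of Definition 5, and the measure $N(G)$ that decreases at each step of the proof can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: variables are single characters listed in a set, every other character is treated as a terminal, and the helper names (`svmt_normal_form`, `make_less`, `is_svmt_bounded`, `n_measure`) are hypothetical. The rules $S \to abc$ and $CB \to BD$ and the relationships B, C < D are borrowed from Example 2 below.

```python
def svmt_normal_form(s, variables):
    """Split s into its maximal SVMT-string components (Definition 4)."""
    if s == "":
        return [""]
    parts, run = [], ""
    for ch in s:
        if ch in variables:
            if run:                 # close the current maximal terminal run
                parts.append(run)
                run = ""
            parts.append(ch)        # a variable is a component by itself
        else:
            run += ch
    if run:
        parts.append(run)
    return parts


def make_less(variables, base):
    """Relation built from `base` plus conditions (i)-(iii) of Definition 3."""
    def is_terminal_string(s):
        return all(ch not in variables for ch in s)

    def less(u, v):
        if (u, v) in base:
            return True
        if u == "" and v != "":                                   # (i)
            return True
        if u in variables and v != "" and is_terminal_string(v):  # (ii)
            return True
        return (is_terminal_string(u) and is_terminal_string(v)
                and len(u) < len(v))                              # (iii)
    return less


def is_svmt_bounded(lhs, rhs, variables, less):
    """Definition 5: one right-side component lies above every left-side component."""
    alphas = svmt_normal_form(lhs, variables)
    betas = svmt_normal_form(rhs, variables)
    return any(all(less(a, b) for a in alphas) for b in betas)


def n_measure(rules):
    """N(G): the sum of (|alpha| - 1) over all rules alpha -> beta."""
    return sum(len(lhs) - 1 for lhs, _ in rules)


VARS = {"S", "B", "C", "D"}
LESS = make_less(VARS, base={("B", "D"), ("C", "D")})
RULES = [("S", "abc"), ("CB", "BD")]
```

Here `is_svmt_bounded("CB", "BD", VARS, LESS)` holds because both `C` and `B` lie below `D`, and `n_measure(RULES)` is 1, reflecting the single rule whose left side has more than one symbol.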

It is clear that one may view Theorem 3 as a generalization of the concepts in Theorem 1 in light of Theorem 2. On the other hand, Theorem 1 can be considered as a special case of Theorem 3 in which all rules have left-hand sides consisting entirely of variables. Theorem 2 can also be considered as a special case of Theorem 3 (using only conditions (ii) and (iii) of Definition 3), since terminal bounded grammars are also SVMT-bounded.

The following example illustrates the significance of Theorem 3 in that it demonstrates that there are situations in which Theorem 3 applies while none of the previous results do. Thus the class of SVMT-bounded grammars enlarges the known class of grammars generating only context free languages. (For greater details of the example the reader is referred to Aggarwal (1976).)

EXAMPLE 2. Consider the grammar $G = (V, \Sigma, P, S)$, where $V = \{S, B, C, D, a, b, c\}$, $\Sigma = \{a, b, c\}$, and

$P$ = [numbered rule list garbled in transcription; legible fragments, with letter case as transcribed: $S \to abc$, $CB \to BD$, $ab \to ab$, $bb \to bb$, $bd \to bc$, $cc \to cc$].

The language generated by $G$ is $L(G) = \{abc, aabbcc\}$, which is finite and hence obviously context free. Theorem 1 does not apply since Rule 4 (as is also the case for Rules 5-7) has a terminal on the left-hand side. Theorem 2 does not apply since Rule 3 is not terminal bounded. On the other hand, one can define the following strict partial ordering on the vocabulary $V$:

$S, B, C, D < a, b, c$,

$B, C < D$.

By adding to this all the additional relationships implied by conditions (i)-(iii) of Definition 3, the strict partial ordering on $V$ is "extended" to become a strict partial ordering on $V \cup \Sigma^*$. This can easily be shown to be an SVMT-ordering on $G$. Thus all rules in $P$ are SVMT-bounded and hence Theorem 3 can be applied to establish that $L(G)$ is context free.

The following theorem demonstrates a certain "fundamental result" regarding the relationship of SVMT-bounded grammars to context free grammars.

THEOREM 4. Any context free grammar $G = (V, \Sigma, P, S)$ can be converted to an equivalent SVMT-bounded grammar $G'$.

Proof. Any context free grammar $G = (V, \Sigma, P, S)$ can be converted to an equivalent grammar $G' = (V', \Sigma, P', S)$ in Greibach normal form (Hopcroft and Ullman, 1969), i.e., a grammar in which each rule is of the form $A \to a\alpha$, where $A \in V' - \Sigma$, $a \in \Sigma$, and $\alpha \in (V' - \Sigma)^*$. (Note that $V \subseteq V'$.) Each side of these rules is in SVMT-normal form. Now choose any strict partial ordering on the vocabulary $V'$ of $G'$. As indicated earlier this can be "extended" to become a strict partial ordering on $V' \cup \Sigma^*$, and hence, an SVMT-ordering on $G'$. Now, from Definition 3 condition (ii), for each rule in $P'$, $A < a$. Thus, by Definition 6, $G'$ is SVMT-bounded. Q.E.D.

RECEIVED: July 21, 1978; REVISED: June 29, 1979

REFERENCES

AGGARWAL, S. (1976), Classified grammars, Doctoral dissertation, Marquette University, Milwaukee, Wisconsin.

AHO, A., AND ULLMAN, J. (1972), "The Theory of Parsing, Translation, and Compiling," Vol. I: "Parsing," Prentice-Hall, Englewood Cliffs, N. J.

BAKER, B. (1974), Non-context-free grammars generating context-free languages, Inform. Contr. 24, 231-246.

BOOK, R. (1972), Terminal context in context-sensitive grammars, SIAM J. Comput. 1, 20-30.

GINSBURG, S., AND GREIBACH, S. (1966), Mappings which preserve context-sensitive languages, Inform. Contr. 9, 563-582.

HIBBARD, T. (1966), Scan limited automata and context limited grammars, Doctoral dissertation, University of California, Los Angeles, California.

HOPCROFT, J., AND ULLMAN, J. (1969), "Formal Languages and their Relation to Automata," Addison-Wesley, Reading, Mass.

SALOMAA, A. (1973), "Formal Languages," Academic Press, New York.