The Entropy of Recursive Markov Processes
BENNY BRODDA


Research Group for Quantitative Linguistics
KVAL PM 339
June 1967
Fack, Stockholm 40, SWEDEN

The Entropy of Recursive Markov Processes

By BENNY BRODDA

The work reported in this paper has been sponsored by Humanistiska forskningsrådet, Tekniska forskningsrådet and Riksbankens Jubileumsfond, Stockholm, Sweden.

THE ENTROPY OF RECURSIVE MARKOV PROCESSES

By BENNY BRODDA
KVAL, Fack, Stockholm 40, Sweden

Summary

The aim of this communication is to obtain an explicit formula for calculating the entropy of a source which behaves in accordance with the rules of an arbitrary Phrase Structure Grammar, in which relative probabilities are attached to the rules in the grammar. With this aim in mind we introduce an alternative definition of the concept of a PSG as a set of self-embedded (recursive) Finite State Grammars; when the probabilities are taken into account in such a grammar we call it a Recursive Markov Process.

1. In the first section we give a more detailed definition of what kind of Markov Processes we are going to generalize later on (in sec. 3), and we also outline the concept of entropy in an ordinary Markov source. More details of information theory may be found, e.g., in Khinchin's "Mathematical Foundations of Information Theory", N.Y., 1957, or "Information Theory" by R. Ash, N.Y., 1965.

A Markov Grammar is defined as a Markov Source with the following properties:

Assume that there are n + 1 states, say S_0, S_1, ..., S_n, in the source. S_0 is defined as the initial state, S_n is defined as the final state, and the other states are called intermediate states. We shall, of course, also have a transition matrix, M = (p_ij), containing the transition probabilities of the source.

a) A transition from state S_i to state S_k is always accompanied by the production of a (non-zero) letter a_ik from a given finite alphabet. Transitions to different states from one given state always produce different letters.

b) From the initial state, S_0, direct or indirect transitions should be possible to any other state in the source. From no state is a transition to S_0 allowed.

c) From any state, direct or indirect transitions to the final state S_n should be possible. From S_n no transition is allowed to any other state (S_n is an "absorbing state").

A (grammatical) sentence should now be defined as the (left-to-right) concatenation of the letters produced by the source when passing from the initial state to the final state. The length of a sentence is defined as the number of letters in the sentence. To simplify matters without dropping much of generality we also require that

d) The greatest common divisor of all the possible lengths of sentences is = 1 (i.e., the source becomes an aperiodic source if it is short-circuited by identifying the final and initial states).

With the properties a) - d) above, the source obtained by identifying the final and initial states is an indecomposable, ergodic Markov process (cf. Feller, "Probability Theory and Its Applications", ch. 15, N.Y., 1950). In the transition matrix M for a Markov grammar of our type all elements in the first column are zero, and in the last row all elements are zero except the last one, which is = 1.

For a given Markov grammar we define the uncertainty or entropy, H_i, for each state S_i as

    H_i = - \sum_{j=0}^{n} p_{ij} \log p_{ij} ,   i = 0, 1, ..., n.

We also define the entropy, H or H(M), for the grammar as

(1)   H = \sum_{i=0}^{n-1} x_i H_i

where x = (x_0, x_1, ..., x_{n-1}) is defined as the stationary distribution of the source obtained when S_0 and S_n are identified; thus x is defined as the (unique) solution to the set of simultaneous equations

(2)   x M_1 = x ,   x_0 + x_1 + ... + x_{n-1} = 1

where M_1 is formed by shifting the last and first columns and then omitting the last row and column. The mean sentence length, \bar{n}, of the set of grammatical sentences can now easily be calculated as

(3)   \bar{n} = 1 / x_0
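To make formulas (1)-(3) concrete, here is a small numerical sketch (ours, not part of the original paper) in Python with NumPy, assuming a hypothetical three-state Markov grammar; the matrix M_1 of formula (2) is obtained by folding the final-state column back onto the initial state.

    import numpy as np

    # Toy Markov grammar: S0 (initial), S1 (intermediate), S2 (final).
    # Row i holds the transition probabilities p_ij.
    M = np.array([
        [0.0, 1.0, 0.0],   # S0 -> S1
        [0.0, 0.3, 0.7],   # S1 -> S1 or S1 -> S2
        [0.0, 0.0, 1.0],   # S2 is absorbing
    ])

    # Per-state entropies H_i = -sum_j p_ij log p_ij  (with 0 log 0 = 0).
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(M > 0, M * np.log2(M), 0.0)
    H_i = -terms.sum(axis=1)

    # Short-circuit the source: route transitions into S_n back to S_0
    # and drop the final row and column (the matrix M_1 of formula (2)).
    n = M.shape[0] - 1
    M1 = M[:n, :n].copy()
    M1[:, 0] += M[:n, n]

    # Stationary distribution x: x M_1 = x together with sum(x) = 1.
    A = np.vstack([M1.T - np.eye(n), np.ones(n)])
    b = np.append(np.zeros(n), 1.0)
    x = np.linalg.lstsq(A, b, rcond=None)[0]

    H = x @ H_i[:n]        # entropy of the grammar, formula (1)
    mean_len = 1.0 / x[0]  # mean sentence length, formula (3)

For this toy grammar x = (0.7, 1)/1.7, so H comes out at about 0.52 bits per letter and the mean sentence length at about 2.43.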

2. Embedded Grammars (cf. Feller, op. cit.)

We now assume that we have two Markov grammars, M and M_1, with states S_0, S_1, ..., S_n and T_0, T_1, ..., T_m, respectively, where S_0 and S_n, T_0 and T_m are the corresponding initial and final states. Now consider two states S_i and S_k in the grammar M, and assume that the corresponding transition probability is = p_ik. We now transform the grammar M into a new one, M', by embedding the grammar M_1 in M between the states S_i and S_k, an operation which is performed by identifying the states T_0 and T_m with the states S_i and S_k, respectively. Or, to be more precise, assume that in the grammar M_1 the transitions to the states T_j, j >= 1, have the probabilities q_{0j}. Then, in the grammar M', transitions to a state T_j from the state S_i will take place with the probability = p_{ik} q_{0j}. A return to the state S_k in the "main" grammar from an intermediate state T_j in M_1 takes place with the probability q_{jm}.

With the conditions above fulfilled, we propose that the entropy for the composed grammar be calculated according to the formula

(4)   H(M') = \frac{ H(M) + x_i p_{ik} \bar{n}_1 H(M_1) }{ 1 + x_i p_{ik} (\bar{n}_1 - 1) }

where H(M) is the entropy of the grammar M when there is an ordinary connection (with probability p_ik) between the states S_i and S_k, and where x_i is the inherent probability of being in the state S_i under the same conditions. \bar{n}_1 is the mean sentence length of the sentences produced by the grammar M_1 alone. (It is quite natural that this number appears as a weight in the formula, since if one is producing a sentence according to the grammar M, arrives at the state S_i and from there "dives" into the grammar M_1, then \bar{n}_1 is the expected waiting time for emerging again in the main grammar M.) The factor x_i p_{ik} may be interpreted as the combined probability of ever arriving at S_i and there choosing the path over to M_1 (one may, of course, choose quite another path from S_i).
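Formula (4) is immediate to evaluate once its ingredients are known; the following minimal helper (our illustration, with names of our own choosing) makes the weighting explicit.

    def composed_entropy(H_M, H_M1, x_i, p_ik, nbar_1):
        """Formula (4): entropy of M' = M with the grammar M_1 embedded
        on the transition S_i -> S_k of M (transition probability p_ik).
        x_i is the stationary probability of S_i in M, and nbar_1 the
        mean sentence length of M_1 alone."""
        w = x_i * p_ik  # probability of arriving at S_i and diving into M_1
        return (H_M + w * nbar_1 * H_M1) / (1.0 + w * (nbar_1 - 1.0))

The weight w * nbar_1 in the numerator accounts for the \bar{n}_1 letters expected per "dive" into M_1, while the denominator renormalizes by the increased mean sentence length of the composed source.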

[...] also say that A_ik stands as an abbreviation for an arbitrary sentence of that grammar.)

We associate each grammar M'_j with a grammar M_j, j = 0, 1, ..., N, by just considering it as a non-recursive one, that is, we consider all the symbols A_ik as terminal symbols (even if they are not). The grammars thus obtained are ordinary Markov grammars according to our definition, and the entropies H_j = H(M_j) are easily computed according to formula (1), as are the stationary distributions (formula (2)). The following theorem shows how the entropies H'_j for the fully recursive grammars M'_j are connected with the numbers H_j.

Theorem

The entropies H'_j for a set of recursive Markov grammars M'_j, j = 0, 1, ..., N, can be calculated according to the formula

(6)   H'_j = \frac{ H_j + \sum_k y_{jk} \bar{n}_k H'_k }{ 1 + \sum_k y_{jk} (\bar{n}_k - 1) } ,   j = 0, 1, ..., N.

Here the factors y_jk depend only on the probability matrix of the grammar, and the numbers \bar{n}_k, defined as the mean sentence lengths of the sentences of the grammars M'_k, k = 0, 1, ..., N, are computable according to the lemma below; H_j is the entropy of the grammar M_j.

The theorem above is a direct application of formula (4), sec. 2, to the grammar. The coefficients y_jk in formula (6) can, more precisely, be calculated as a sum of terms of the type x_i p_{im}, where the indices (i, m) are those at which the grammar M'_k appears in the grammar M'_j; x_i and p_{im} are the components of the stationary distribution and the probability matrix for the grammar M_j.
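Once the mean lengths \bar{n}_k are available (via the lemma below), formula (6) is linear in the unknown entropies H'_j and can be solved in one step. A sketch under our reading of (6) above, where Y, H and nbar are hypothetical inputs holding the y_jk, the H_j of formula (1) and the \bar{n}_k:

    import numpy as np

    def recursive_entropies(H, Y, nbar):
        """Solve formula (6) for the entropies H'_j of the recursive
        grammars M'_j, rearranged as the linear system
        d_j H'_j - sum_k y_jk nbar_k H'_k = H_j."""
        d = 1.0 + Y @ (nbar - 1.0)          # denominators of (6), one per j
        A = np.diag(d) - Y * nbar[None, :]  # left-hand side as a matrix
        return np.linalg.solve(A, H)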

Assume now that we have a Markov grammar of our type, but for which each transition takes a certain amount of time. A very natural question is then: "What is the expected time to produce a sentence in that language?" The answer is given in the following lemma.

Lemma

Let M be a Markov grammar with states S_i, i = 0, 1, ..., n, where S_0 and S_n are the initial and final states, respectively. Assume that each transition S_i -> S_k takes t_{ik} time units. Denote the expected time for arrival at S_n, given that the grammar is in state S_i, by t_i, i = 0, 1, ..., n (thus t_0 is the expected time for producing a sentence). The times t_i will then fulfil the following set of simultaneous linear equations:

(7)   t_i = \sum_k p_{ik} ( t_{ik} + t_k )

Formula (7) is in itself a proof of the lemma. With more convenient notations we can write (7) as

    (E - P) t = P_t

where E is the unit matrix, P is the probability matrix (with p_{nn} = 0) and P_t is the vector with components

    (P_t)_i = \sum_m p_{im} t_{im} ,   i = 0, 1, ..., n.

The application of the lemma for computing the numbers \bar{n}_k in formula (6) is now the following. The transition times of the lemma are, of course, the expected times (or "lengths", as we have called them earlier) for passing via a sub-grammar of the grammar under consideration. Thus the number t_{ik} is in itself the unknown entity \bar{n}_k.
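In matrix form the lemma is a single linear solve. The sketch below (ours) assumes P is given with p_nn = 0 and that T[i, k] holds the transition time t_ik; with all t_ik = 1, t_0 reproduces the mean sentence length \bar{n} = 1/x_0 of formula (3) for the toy grammar of sec. 1.

    import numpy as np

    def expected_times(P, T):
        """Solve (E - P) t = P_t: t_i is the expected time to reach the
        final state S_n from S_i, given transition times T[i, k] = t_ik."""
        Pt = (P * T).sum(axis=1)  # components (P_t)_i = sum_m p_im t_im
        return np.linalg.solve(np.eye(len(P)) - P, Pt)

    P = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.3, 0.7],
                  [0.0, 0.0, 0.0]])  # toy grammar again, with p_nn = 0
    t = expected_times(P, np.ones_like(P))
    # t[0] = 2.4286..., matching 1/x_0 above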

For each of the sub-grammars M'_j, j = 0, 1, ..., N, we get a set of linear equations of type (7) for determining the vectors t of the lemma. The first component of this vector, i.e., the number t_0, is then equal to the expected length, \bar{n}, of the sentences of that grammar. (Unfortunately, we also have to compute the expected times for going from any state of the sub-grammars to the corresponding final state.)

The total number of unknowns involved when computing the entropy of our grammar (i.e., the entropy H'_0) is equal to (the total number of states in all our sub-grammars) plus (the number of sub-grammars). This is also the number of equations, for we have N + 1 equations from formula (6) and the (N + 1) sets of equations of the type (7). We assert that all these simultaneous equations are solvable if the grammar fulfils the conditions we stated earlier, i.e., that from each state in any sub-grammar there exists at least one path to the final state of that grammar.