Exam Speech and Language Processing 1 (216631)
24 January 2006

Introduction

This exam consists of 20 multiple choice questions. You may use the book Speech and Language Processing, the slides and your notes. You can earn 100 points for this exam: 5 points per question. The numbered grammar referred to in two of the multiple choice questions can be found in the final section of this document. When the time is up or when you are finished, hand in the answer form for the multiple choice questions.

Tip: first fill in your answers on this question form; check the answers when you have completed all the questions; then fill in your answers on the answer form.

Good luck!
Multiple choice questions

1. Which of the following regular languages is accepted by the automaton shown here? (q0 is the start state)

   (a) a(ba)* a(bba)*
   (b) {aba, abba}
   (c) a(bb?a)*
   (d) ab(b?a)?

2. Consider the following two statements about inflection and derivation.

   i) In English, adding the suffix -s to the end of an infinitive verb (for example, sing → sings) is a form of inflection.
   ii) In English, adding the suffix -ism to an adjective (for example, national → nationalism) is a form of derivation.

   Which of these statements are true?

   (a) Both are true
   (b) Neither is true
   (c) Only i) is true
   (d) Only ii) is true

3. Consider the following two statements about morphemes and syllables.

   i) A morpheme can consist of several syllables.
   ii) A syllable can consist of several morphemes.

   Which of these statements are true?
   (a) Both are true
   (b) Neither is true
   (c) Only i) is true
   (d) Only ii) is true

4. In Dutch, the past tense of a verb ends in -de if the verb stem ends in a voiced sound (for example, voedde 'fed' and oliede 'oiled') and in -te if the verb stem ends in an unvoiced sound (for example, zakte 'failed' and pestte 'bullied'). We assume that the basic past tense suffix is -de, and that in the step from intermediate to surface level de is changed to te after an unvoiced sound. Below you see the state-transition table for a transducer that can correctly generate the past tenses mentioned above. We use PC-Kimmo notation, where 0 is the empty symbol, + is the morpheme boundary symbol, and @ is the "other" symbol. The symbol CU stands for unvoiced consonants. Final states are indicated with a colon (:) and non-final states with a dot after the state number. State numbers start with 1; the fail state has number zero.

   RULE "DE/TE Replacement" 7 7
        CU  +  #  d  d  e  @
        CU  0  #  t  d  e  @
    1:   2  1  1  0  1  1  1
    2:   2  3  1  0  1  1  1
    3:   2  0  1  6  4  1  1
    4.   2  1  1  0  1  5  1
    5.   2  1  0  0  1  1  1
    6.   0  0  0  0  0  7  0
    7:   0  0  1  0  0  0  0

   Assume we make the following changes to the transducer:

   - We replace the d:d transition from state 3 to state 4 with a d:d transition from state 3 to state 0, and
   - We replace the CU:CU transition from state 2 to state 2 with a CU:CU transition from state 2 to state 1.

   What will happen now?
   (a) The transducer will now accept (and generate) the incorrect past tense form zakde
   (b) The transducer will now accept (and generate) the incorrect past tense form pestde
   (c) The transducer will now accept (and generate) both zakde and pestde
   (d) The transducer will still not accept (nor generate) zakde or pestde

5. A finite state automaton (FSA) accepts a regular language. A finite state transducer (FST) is an extension of a finite state automaton; it defines a translation from sequences of input symbols (a regular language) to sequences of output symbols. Finite state automata as well as finite state transducers can be non-deterministic. A finite state transducer is non-deterministic if the underlying finite state automaton (which we get by ignoring the output symbols on the transitions) is non-deterministic. Consider the following two statements.

   i) For every non-deterministic FSA there is a deterministic FSA that accepts the same regular language.
   ii) For every non-deterministic FST there is a deterministic FST that defines the same translation.

   (a) Only i) is true.
   (b) Only ii) is true.
   (c) Both i) and ii) are true.
   (d) Both i) and ii) are false.

6. A certain natural language stemming algorithm has the following two properties: 1) the words adhere and adhesion remain distinct after stemming; 2) the words experiment and experience are reduced to the same stem. Which of the following statements is true?
   (a) 1) is an example of overstemming, 2) is an example of understemming
   (b) 1) is an example of understemming, 2) is an example of overstemming
   (c) Both 1) and 2) are examples of overstemming
   (d) Both 1) and 2) are examples of understemming

7. The field of phonology is about:

   (a) How speech sounds are actually made, transmitted and received
   (b) Studying all the sounds that both human and artificial voices are capable of creating
   (c) Studying subsets of the sounds that constitute language and meaning
   (d) How sounds can be organized into one system for all languages

8. Which of the following sound classifications does not belong in this group?

   (a) Nasal
   (b) Dental
   (c) Velar
   (d) Glottal

9. Which English speech sounds does the following feature bundle refer to?

      [ + consonant ]
      [ - sonorant  ]
      [ +/- voice   ]
      [ + back      ]

   (a) /m/ (man) and /n/ (name)
   (b) /k/ (cat) and /g/ (goal)
   (c) /p/ (pack) and /b/ (ball)
   (d) /f/ (foot) and /v/ (verb)
10. Which of the following phonetic transcriptions of Dutch words can be regarded as representative of the way the word would normally be pronounced in Dutch?

   (a) dichtdoen: d I x t d u n
   (b) lenen: l e: n @
   (c) politie: p l i s i
   (d) herfststraal: h E r f s t s t r a: l

11. Dobby the house-elf, one of the characters in the Harry Potter books and films, has a rather typical way of speaking. For example, Dobby says things like this to Harry Potter:

      Dobby has to punish himself, sir
      Dobby has come to warn Harry Potter
      Harry Potter asks if he can help Dobby...

   These utterances differ from normal English at which linguistic level?

   (a) Phonology
   (b) Morphology
   (c) Syntax
   (d) None of the above

12. Consider the grammar below:

   Rules:                     Lexicon:
   S   → NP VP                Prep → with, in
   VP  → Verb NP (PP)         Noun → woods, bike
   NP  → (Det) Nom (PP)       Det  → the
   PP  → Prep NP              Verb → saw
   Nom → Noun                 ProperNoun → John, Peter
   Nom → ProperNoun

   How many parse trees does this grammar produce for the sentence "John saw Peter with the bike in the woods"?
   (a) 1
   (b) 2
   (c) 3
   (d) More than 3

13. Consider the sentence "She bought a potato and some carrots when she went to the corner store". Which of the following lists of word sequences contains only constituents of this sentence?

   (a) She bought; a potato and some carrots
   (b) She; to the corner store
   (c) the corner store; bought a potato
   (d) potato; she went to the corner store

14. Which of the following feature structures does Grammar 1 (given at the end of this document) assign to the sentence "A student works"?

   (a)  S [ subj  NP student [ pers 3, num sg ]
            head  VP works [ sub  NP student [ pers 3, num sg ] ] ]
   (b)  S [ subj [1] NP student [ pers 3, num sg ]
            head  VP works [ sub [1] [ pers 3, num sg ] ] ]

   (c)  S [ subj [1] NP student [ pers 3, num sg ]
            head  VP works [ sub [ subj [1] ] ] ]

   (d) None of the above
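As an illustrative aside (not part of the exam): question 14 turns on how Grammar 1 combines feature structures, and agreement checking of this kind can be sketched as unification of nested dictionaries. The unify() helper and the feature dictionaries below are my own simplified stand-ins (no reentrancy tags), not Grammar 1's exact formalism.

```python
def unify(f1, f2):
    """Unify two feature structures given as nested dicts.

    Returns the merged structure, or None on a feature clash.
    """
    if f1 == f2:
        return f1
    if not (isinstance(f1, dict) and isinstance(f2, dict)):
        return None  # clash between two different atomic values
    result = dict(f1)
    for key, value in f2.items():
        if key in result:
            sub = unify(result[key], value)
            if sub is None:
                return None
            result[key] = sub
        else:
            result[key] = value
    return result

# "A student works": the subject NP's features and the verb's
# requirement on its subject unify; "a student work" fails.
np_student = {"pers": 3, "num": "sg"}
works_requires = {"pers": 3, "num": "sg"}   # from the entry for "works"
work_requires = {"num": "pl"}               # from the entry for "work"

print(unify(np_student, works_requires))    # {'pers': 3, 'num': 'sg'}
print(unify(np_student, work_requires))     # None (agreement clash)
```

Unification is symmetric here: features present in only one structure are copied over, and only conflicting atomic values cause failure.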
15. We want to extend Grammar 1 so that we can parse the sentence "two students work" but not "two student work" or "two students works". To achieve this, which of the following lexical items should we add to the lexicon?

   (a)  Det two  [ sub [1] [ Noun [1] ] ]

   (b)  Det two  [ sub [1] [ Noun [1] [ num [2] ] ] ]

   (c)  Det two  [ sub [ Noun [ num pl ] ] ]

   (d)  Det two  [ sub [ num pl, Noun [ pers 3, num pl ] ] ]
16. A language has 100 words. Every word w has equal probability of occurring in a sentence. For every word w_i, every word w_j also has equal probability of occurring after w_i. What are the values of the probabilities P(w_i w_j), the probability that the bigram w_i w_j occurs, and P(w_j | w_i), the probability of word w_j if the preceding word is w_i?

   (a) P(w_i w_j) = 0.01 and P(w_j | w_i) = 0.01
   (b) P(w_i w_j) = 0.0001 and P(w_j | w_i) = 0.01
   (c) P(w_i w_j) = 0.01 and P(w_j | w_i) = 0.0001
   (d) P(w_i w_j) = 0.0001 and P(w_j | w_i) = 0.0001

17. Language XL is modelled as a random sequence of letters with the following probabilities of occurrence:

   letter:       a     b    c     d    e    f
   probability:  1/16  1/4  1/16  1/4  1/4  1/8

   What is the per-letter entropy of this language model?

   (a) 2.0
   (b) 2.375
   (c) 3.0
   (d) None of 2.0, 2.375 and 3.0

18. Good-Turing estimators use this equation to calculate the probability of seeing word X, having seen a corpus:

   P(X | corpus) = r* / N,   with   r* = (r + 1) E(N_{r+1}) / E(N_r)

   where r is the number of times you've seen word X, N_r is the number of different words that were seen exactly r times, and E() means you're trying to estimate what N_r would normally be, for an infinite corpus of an infinite language. N is the total number of counts, and r* is the adjusted number of observations: how many times you should have seen that word (which is often a fraction). A Very Simple Form of Good-Turing Estimation takes as function E() the identity function: E(n) = n.

   A corpus has 30000 words. The word unusualness occurs once. There are 10000 words that occur exactly once. There are 3000 words that occur exactly twice in the corpus. What is the estimated probability of the word unusualness if we use the Very Simple Good-Turing Estimation method?

   (a) 1/30000
   (b) 1/10000
   (c) 2/100000
   (d) other value

19. Consider the following two statements.

   i) The sum of the re-estimated Simple Good-Turing probabilities of all the words in the corpus is exactly one.
   ii) The Very Simple Good-Turing Estimation method (see the previous exercise) has a major drawback: it may assign probability zero to some words, namely if by chance for some value of r there are no word types that occur exactly r times.

   (a) Only i) is true.
   (b) Only ii) is true.
   (c) Both i) and ii) are true.
   (d) Both i) and ii) are false.

20. Consider the following context-free grammar, with Noun, Det and Verb as Part of Speech symbols. The words John and Mary have only Part of Speech PropNoun, the words walks and sleeps have only Part of Speech Verb, and the word and has only Part of Speech Conj.
   S  → S Conj S
   S  → NP VP
   NP → Det Nom
   NP → PropNoun
   VP → V NP
   VP → V

   We use Earley's Recognizer (see J&M Figure 10.16, page 381) to check whether the sentence "Mary walks and John sleeps" is correct according to this grammar. Constructing Chart[0], we start with the initial item [γ → • S, [0,0]] and we add as many different items to the chart as possible. Then we construct Chart[1], also adding as many different items as possible. And so on. What is the number of items Chart[1] will eventually have according to this algorithm?

   (a) 5
   (b) 6
   (c) 7
   (d) 8
Grammar 1

Rules:

   S → NP VP
      <S subj> = <NP>
      <S head> = <VP>
      <VP sub subj> = <NP>

   VP → Verb
      <VP > = <Verb >
      <VP > = <Verb >
      <VP sub> = <Verb sub>

   NP → Det Noun
      <NP > = <Noun >
      <NP > = <Det >
      <Det sub> = <Noun>

Lexicon:

   Noun student   [ pers 3, num sg ]

   Noun students  [ pers 3, num pl ]

   Verb work      [ sub [ subj [ NP [ num pl ] ] ] ]
   Verb works     [ sub [ subj [ NP [ pers 3, num sg ] ] ] ]

   Det a          [ sub [1] [ Noun [1] [ num sg ] ] ]

   Det the        [ sub [1] [ Noun [1] ] ]
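As a final illustrative aside (not part of the original exam), the arithmetic behind questions 17 and 18 can be sanity-checked in a few lines, using the formulas exactly as stated in those questions; the variable names are mine.

```python
from fractions import Fraction
from math import log2

# Question 17: per-letter entropy H = -sum(p * log2(p)) of the letter model.
probs = [Fraction(1, 16), Fraction(1, 4), Fraction(1, 16),
         Fraction(1, 4), Fraction(1, 4), Fraction(1, 8)]
assert sum(probs) == 1                      # sanity check: valid distribution
H = -sum(float(p) * log2(float(p)) for p in probs)
print(H)                                    # 2.375

# Question 18: Very Simple Good-Turing, i.e. E(n) = n, so
# r* = (r + 1) * N_{r+1} / N_r  and  P(X | corpus) = r* / N.
N, r, N1, N2 = 30000, 1, 10000, 3000        # corpus size and count-of-counts
r_star = Fraction(r + 1) * N2 / N1
P = r_star / N
print(r_star, P)                            # 3/5 1/50000
```

Using Fraction keeps the adjusted count r* exact, which matters since r* is "often a fraction", as question 18 notes.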