Ling 601/510 9/3/2013. Lecture 1: First thoughts on Optimality Theory. Part 1: Rumblings of discontent in the belly of rule-based phonology.

A. Insufficient restriction. SPE (Sound Pattern of English, Chomsky & Halle 1968, the Bible of early Generative phonology) claims that a language can have any collection of rules, so long as the rules are of the form A -> B / C __ D (where A-D are natural classes).
Example: C -> +cont / V __ V (intervocalic spirantization, an extremely common species of lenition pattern) is well attested, but C -> -cont / V __ V (intervocalic occlusivization) is unattested. We know that the latter rule is bad for phonetic reasons, but these reasons are not expressed in the theory.
The failure is that SPE does not formally capture a notion of phonetic naturalness (i.e. rules serve the phonetic functional goals of making sounds easier to pronounce (ease of articulation) and/or less confusable with other sounds (ease of perception)).
SPE, p. 400: "The entire discussion of phonology in this book suffers from a fundamental theoretical inadequacy.... The problem is that our approach to features, to rules, and to evaluation has been overly formal. Suppose, for example, that we were systematically to interchange features or to replace [αF] with [-αF] (where α = + or -, and F is a feature) throughout our description of English structure. There is nothing in our account of linguistic theory to indicate that the result would be the description of a system that violates certain principles governing human languages. To the extent that this is true, we have failed to formulate the principles of linguistic theory, of universal grammar, in a satisfactory manner."
(SPE's partial solution: context-free "marking conventions")
B. The need for morpheme structure constraints in addition to rules (Stanley 1967). E.g. in Turkic (e.g. Yakut) vowel harmony systems, vowels within roots agree in backness and rounding with the preceding vowel. The same condition is actively enforced by means of a rule when suffixes are added.
Within the root, there is no evidence of an active rule changing the underlying values. So two different devices are required to capture the same generalization.
C. The "conspiracy" phenomenon: a collection of formally unrelated rules may "conspire" to create or avoid certain outputs. Kisseberth (1970), Yawelmani:
   Ø -> V / C __ CC
   V -> Ø / VC __ CV
   Ø -> V / C __ C#
   V -> Ø / V+C __ #]Verb
   C -> Ø / CC + __
   C -> Ø / C + __ C
Nowhere is the generalization captured: *complex syllable margins, the principle that underlies this collection of rules. Subtract one of the rules and you have a formally simpler, but functionally incoherent, rule system.
D. Loanword phonology: how do speakers "learn" rules to deal with kinds of words they've never heard before? Ex., for speakers who say Rachmaninoff as Rockmaninoff, how could they have learned a rule x -> k, since they are hardly ever exposed to [x]? (Indeed, it's the speakers who are less exposed to [x] who are more likely to make this substitution.) Intuitively, what's going on is that [x] is forbidden in certain dialects of English, and speakers who do x -> k are substituting the closest legal sound in the language; but the rule-based framework doesn't give us a good way of expressing this.

Part 2: Introducing Classic OT
I. A Thumbnail Sketch of OT: Rule-based phonology works like a factory assembly line. OT is more like ordering a product to meet certain specifications.
A. The basic formal device is constraints (i.e. static well-formedness statements) rather than rules (i.e. operations).
B. The constraints are violable.
C. Cross-linguistic variation lies not in the constraint set, but in the ranking of the constraints.
D. Input-output mappings are determined by two functions, GEN and H-EVAL.
   1. GEN takes the input and maps it to an (infinite) set of candidate outputs.
   2. H-EVAL takes the candidate set and maps it to the output (the "winner"), i.e. the candidate with the best score, vis-à-vis the constraint hierarchy.
   3. Proof that a particular input-output mapping obtains under some ranking takes the form of a tableau:
      Input: /kæt/
      Constraints: ASPIRATE, DON'T DELETE, *CODA, DON'T CHANGE [spread glottis], DON'T INSERT
      Candidate     Violation marks ("!" = fatal)
      kæt           *!  *
      kʰæt          *  *
      kʰæ           *!
      kʰætnɪp       **!  **
      dɔg           *!**  *  ***
      O Canada, our home and native land... etc.
II. Introducing faithfulness and factorial typology
A. Faithfulness constraints militate against change between input and output.
   MAX: No deletion of segments.
   DEP: No insertion of segments.
B. CV syllable theory as an illustration.

   ONSET: Syllables have an onset.
   *CODA: Syllables have no coda.
   (Syllables have a nucleus, by definition.)
CV as the unmarked syllable type falls out from these two constraints; more generally, so does the markedness scale CV > {CVC, V} > VC.
   {*CODA, ONSET, MAX} >> DEP = epenthesis to achieve CV syllables
   {ONSET, DEP, *CODA} >> MAX = deletion to achieve CV syllables
   {*CODA, MAX} >> DEP >> ONSET = epenthesis to achieve CV, V syllables
   {ONSET, MAX} >> DEP >> *CODA = epenthesis to achieve CV, CVC syllables
   {*CODA, DEP} >> MAX >> ONSET = deletion to achieve CV, V syllables
   {ONSET, DEP} >> MAX >> *CODA = deletion to achieve CV, CVC syllables
   {DEP, MAX} >> {*CODA, ONSET} = no epenthesis or deletion rules; codas and onsetless syllables tolerated.
Key insight: faithfulness is a violable constraint, ranked vis-a-vis other constraints. Rule application corresponds to the ranking schema well-formedness constraint >> faithfulness; failure of rule application corresponds to the opposite ranking.

Part 3: Further OT notions
1. What is markedness?
   a. In phonological theory: a marked feature value or structure is one whose existence typologically implies the existence of some other (unmarked) value or structure, e.g. voiceless sonorants imply voiced, CVC syllables imply CV (typological definition).
   b. Unmarked values and structures are also higher frequency, typologically and language-internally (e.g. in a representative corpus of speech, we expect more voiceless obstruents than voiced, more ...).
   c. Where does markedness come from? Still somewhat controversial, but a hypothesis gaining ground is that it has to do with what's easy to pronounce and easy to perceive (functional definition). If so, we expect markedness statements to be context-sensitive: what's marked word-initially might be unmarked intervocalically, etc.
   d. In other linguistic domains, markedness has varying definitions, and different sources, though the core idea in all domains is that marked structures are rarer, unmarked ones more ordinary.
2. Markedness vs. faithfulness, the basic OT story
   a. In order for anything like a phonological pattern to happen, markedness >> faith.
   b. Context-sensitive markedness.
      i. Dutch coda devoicing: *Coda-voice >> Ident(voice)
      ii. English (coda devoicing does not apply): the other way around, Ident(voice) >> *Coda-voice. Result: both [+voice] and [-voice] segments are OK in coda.
      iii. Moral: same constraints, different ranking, different resulting pattern. But note, OT claims that *Coda-voice is still a constraint in English, even if it's not triggering coda devoicing across the board. Therefore, under particular circumstances where higher-ranked constraints are not relevant, we might (indeed, we shall) see the effects of this constraint.
   c. More complex interaction: context-sensitive markedness (CSM, e.g. *V[oral]N), context-free markedness (CFM, *V[nas]) and faith (Ident(nas)).
      Faith >> CFM, CSM: contrast in nasal/oral vowels, in all contexts
      CFM >> Faith, CSM: no contrast, oral in all contexts (absence of nasal vowels from inventory)
      CSM >> CFM >> Faith: allophonic variation
      CSM >> Faith >> CFM: contextual neutralization
3. Richness of the base.
   a. No need to limit UR to phonemes. No need for any restrictions on UR. It's the job of the constraint system to determine what sounds can occur on the surface, and in what contexts they can occur.
   b. Moral: when accounting for distributional patterns (e.g. allophony or phonotactics), you can't invoke systematic gaps in the input. E.g. if a language has no coda clusters, this has to be attributed to a constraint ruling out coda clusters; it's not sufficient to assume that the inputs happen not to include forms with coda clusters.
   c. Another moral: no phonemic level of representation. All allophonic detail is there in the input.
4. Richness of the base vs. lexicon optimization
   a. Does this mean that the UR for Ponapean [papa] is /pasdfsdfxztrtypanmppjlk/ or /pp/? No!
   b. Lexicon Optimization: assume that choice of input for a known output works just like computation of output from input.
      i. Output = papa
      ii. Markedness constraints are irrelevant to selection of input, because they are constraints on surface representation only; and the surface representation in all candidates is the same: [papa].
      iii. Therefore the only relevant constraints are faithfulness, and faithfulness always favours identity between input and output.
      iv. Therefore, the optimal input is identical to the output (provided no surface allomorphy). [papa] < /papa/.
   c. Why this is not circular
      i. The (phonotactic) pattern seen in /papa/ is not attributable to the input. The cause of the pattern lies in the constraint system.
      ii. Richness of the Base is a what-if scenario, allowing us to imagine how the constraint system would actively enforce patterns, even if the inputs were completely random.
      iii. But once we have a constraint system ensuring proper surface patterns, even under the what-if scenario of RotB, then Lexicon Optimization says actual inputs will conform to these patterns as well.
      iv. The constraint system is not superfluous just because it's not actively being enforced.

   d. Analogy: A herd of cattle is penned in by a high fence. The cattle never show any sign of trying to break out. Does this mean the fence is unnecessary? No, the cattle stay in because the fence keeps them there.
5. UR's in alternations
   a. RotB does not mean that assumptions about UR's never matter (at least in Classic OT).
   b. In the analysis of alternations, it's not sufficient to characterize surface distributional patterns; you have to account for varying realizations of particular morphemes. So, as in Generative analysis, assumptions about the UR of these morphemes may be crucial, particularly if the alternation is neutralizing.
   c. Example: Dutch [bɛt]~[bɛdən] (beds) vs. [bɛt]~[bɛtən] (dab-1pl.). This is an alternation because voicing changes between the sg. and pl. of 'bed'; it is neutralizing because the voicing distinction which shows up in the suffixed forms is lost in the isolation forms. If we carry over the Generative assumption that morphologically complex forms are computed on-line by the grammar, by concatenating morphemes (alternatives later in the course), the only sensible way to account for such a neutralizing alternation is to assume that 'bed' is underlyingly /bɛd/ whereas 'dab' is underlyingly /bɛt/; the unpredictable voicing in the suffixed forms must be attributed to the UR, and enforced on the surface by faithfulness constraints.
   d. In this case Lex. Opt. does not guarantee [bɛd] < /bɛd/, because there are two different outputs for this input morpheme: [bɛt] and [bɛd] (in bɛdən). Lex. Opt. doesn't tell us which one the UR should be based on.
6. The morpheme structure constraint problem revisited. What is the underlying rep. of [stæmp] in English, given that English does not allow word-internal heterorganic nasal-stop clusters (cf. /ɪn+pɑsəbl/ -> [ɪmpɑsəbl])? How is this question answered in the Generative framework? In OT?
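
The Dutch example above (*Coda-voice >> Ident(voice), with 'bed' underlyingly /bɛd/ and 'dab' underlyingly /bɛt/) can be checked mechanically. What follows is only a minimal sketch under loud assumptions, not a standard implementation. The names gen, h_eval, coda_voice, ident_voice and surface are hypothetical. The toy GEN only flips the voicing of stops, whereas real GEN yields an infinite candidate set. "Coda" is approximated as "word-final". Strict domination is modelled by comparing violation profiles left to right in ranking order.

```python
from itertools import product
from typing import Callable, List, Sequence

# A constraint maps (input, candidate) to a number of violation marks.
Constraint = Callable[[str, str], int]

# Assumption: the toy GEN may only swap these voiced/voiceless stop pairs.
VOICING_PAIRS = {"b": "p", "p": "b", "d": "t", "t": "d", "g": "k", "k": "g"}
VOICED_STOPS = {"b", "d", "g"}


def gen(inp: str) -> List[str]:
    """Toy GEN: every way of keeping or flipping the voicing of each stop."""
    options = [
        (seg, VOICING_PAIRS[seg]) if seg in VOICING_PAIRS else (seg,)
        for seg in inp
    ]
    return ["".join(segs) for segs in product(*options)]


def coda_voice(inp: str, cand: str) -> int:
    """*Coda-voice, simplified: one mark if the word-final segment is a voiced stop."""
    return 1 if cand[-1:] in VOICED_STOPS else 0


def ident_voice(inp: str, cand: str) -> int:
    """Ident(voice): one mark per segment that differs from the input.
    Since the toy GEN changes only voicing, every difference is a voicing change."""
    return sum(1 for i, o in zip(inp, cand) if i != o)


def h_eval(inp: str, candidates: Sequence[str], ranking: Sequence[Constraint]) -> str:
    """Pick the candidate whose violation profile is best under strict domination:
    tuples of marks are compared left to right, highest-ranked constraint first."""
    return min(candidates, key=lambda c: tuple(con(inp, c) for con in ranking))


# Rankings are lists ordered from highest-ranked to lowest-ranked constraint.
DUTCH = [coda_voice, ident_voice]    # *Coda-voice >> Ident(voice)
ENGLISH = [ident_voice, coda_voice]  # Ident(voice) >> *Coda-voice


def surface(inp: str, ranking: Sequence[Constraint]) -> str:
    """Compose GEN and H-EVAL to map an input to its winning output."""
    return h_eval(inp, gen(inp), ranking)


if __name__ == "__main__":
    for form in ["bɛd", "bɛdən", "bɛt", "bɛtən"]:
        print(form, "->", surface(form, DUTCH), "(Dutch ranking),",
              surface(form, ENGLISH), "(reverse ranking)")
```

Under the Dutch ranking, /bɛd/ and /bɛt/ both surface as [bɛt], while /bɛdən/ and /bɛtən/ stay distinct, which is the neutralizing alternation described in 5c. Reversing the two constraints leaves /bɛd/ unchanged, in line with the moral "same constraints, different ranking, different resulting pattern".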