Novel Word Embedding and Translation-based Language Modeling for Extractive Speech Summarization


Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen+, Hsin-Min Wang, Hsin-Hsi Chen#
Academia Sinica, Taiwan; #National Taiwan University, Taiwan; +National Taiwan Normal University, Taiwan
{kychen, journey, whm}@iis.sinica.edu.tw, +berlin@csie.ntnu.edu.tw, #hhchen@csie.ntu.edu.tw

ABSTRACT
Word embedding methods revolve around learning continuous distributed vector representations of words with neural networks, which can capture semantic and/or syntactic cues, and in turn be used to induce similarity measures among words, sentences and documents in context. Celebrated methods can be categorized as prediction-based and count-based methods according to their training objectives and model architectures. Their pros and cons have been extensively analyzed and evaluated in recent studies, but there is relatively less work continuing the line of research to develop an enhanced learning method that brings together the advantages of the two model families. In addition, the interpretation of the learned word representations still remains somewhat opaque. Motivated by these observations and considering the pressing need, this paper presents a novel method for learning the word representations, which not only inherits the advantages of classic word embedding methods but also offers a clearer and more rigorous interpretation of the learned word representations. Built upon the proposed word embedding method, we further formulate a translation-based language modeling framework for the extractive speech summarization task. A series of empirical evaluations demonstrate the effectiveness of the proposed word representation learning and language modeling techniques in extractive speech summarization.

Keywords
Word embedding, representation, interpretation, language model, speech summarization

1. INTRODUCTION
Owing to the popularity of various Internet applications, rapidly growing multimedia content, such as music videos, broadcast news programs and lecture recordings, has been continuously filling our everyday life [1, 2].
Obviously, speech is one of the most important sources of information about multimedia. By virtue of speech summarization, one can efficiently browse multimedia content by digesting the summarized audio/video snippets and associated transcripts. Extractive speech summarization manages to select a set of salient sentences from a spoken document according to a target summarization ratio and subsequently concatenate them together to form a summary [3]. The wide spectrum of summarization methods developed so far may be roughly divided into three categories: 1) methods simply based on the sentence position or structure information [4], 2) methods based on unsupervised sentence ranking [5], and 3) methods based on supervised sentence classification [6]. Interested readers may refer to [5, 7, 8] for comprehensive reviews and new insights into the major methods that have been developed and applied with good success to a wide variety of text and speech summarization tasks. Orthogonal to the existing commonly-used methods, we explore in this paper the use of various word embedding methods [9-11] in extractive speech summarization, which have recently demonstrated excellent performance in many natural language processing (NLP) related tasks, such as machine translation [12], sentiment analysis [13] and sentence completion [14]. The central idea of these methods is to learn continuous, distributed vector representations of words using neural networks, which can probe latent semantic and/or syntactic cues, and in turn be employed to induce similarity measures among words, sentences and documents. According to the variety of training objectives and model architectures, the classic methods can be roughly classified into prediction-based and count-based methods [15]. Recent studies in the literature have evaluated these methods in several NLP-related tasks and analyzed their strengths and deficiencies [11, 16]. However, there are only a few studies in the literature that continue this line of research to crystallize an enhanced word embedding method that brings together the merits of these two major families.
In addition, the interpretation of the learned value of each dimension in a learned word representation is a bit opaque. To satisfy the pressing need and complement this defect, we propose a novel modeling method, which not only inherits the advantages of the classic word embedding methods but also offers a clearer and more rigorous interpretation. Beyond the efforts to improve the representation of words, we also present a novel and efficient translation-based language modeling framework on top of the proposed word embedding method for extractive speech summarization. Unlike the common thread of leveraging word embedding methods in speech/text summarization tasks, which is to represent a document/sentence by averaging the corresponding word embeddings over all words in the document/sentence and estimate the cosine similarity measure of any given document-sentence pair, the proposed framework can authentically capture the finer-grained (i.e., word-to-word) semantic relationship to be effectively used in extractive speech summarization. In a nutshell, the major contributions of the paper are twofold:
- A novel word representation learning technique, which not only inherits the advantages of the classic word embedding methods but also offers a clearer and more rigorous interpretation of word representations, is proposed.
- A translation-based language modeling framework on top of the proposed word embedding method, which can also be integrated with classic word embedding methods, is introduced to the extractive speech summarization task.

2. CLASSIC WORD EMBEDDING METHODS
Perhaps one of the most well-known seminal studies on developing word embedding methods was presented by Bengio et al. [9]. It estimated a statistical n-gram language model, formalized as a feed-forward neural network, for predicting future words in context while inducing word embeddings as a by-product. Such an attempt has already motivated many follow-up extensions to develop effective methods for probing latent semantic and syntactic regularities manifested in the representations of words. Representative methods can be categorized as prediction-based and count-based methods. The skip-gram model (SG) [10] and the global vector model (GloVe) [11] are well-studied examples of the two categories, respectively. Rather than seeking to learn a statistical language model, the SG model is intended to obtain a dense vector representation of each word directly. The structure of SG is similar to a feed-forward neural network, with the exception that the non-linear hidden layer of the former is removed. The model thus can be trained on a large corpus efficiently, getting around the heavy computational burden incurred by the non-linear hidden layer, while still retaining good performance. Formally, given a word sequence w_1, w_2, ..., w_T, the objective function of SG is to maximize the log-probability

    $\sum_{t=1}^{T} \sum_{-c \le k \le c,\, k \ne 0} \log P(w_{t+k} \mid w_t)$,    (1)

where c is the window size of the contextual words for the central word w_t, and the conditional probability is computed by

    $P(w_{t+k} \mid w_t) = \frac{\exp(v_{w_{t+k}}^{\top} v_{w_t})}{\sum_{i=1}^{V} \exp(v_{w_i}^{\top} v_{w_t})}$,    (2)

where $v_{w_{t+k}}$ and $v_{w_t}$ denote the representations of the words at positions t+k and t, respectively; $w_i$ denotes the i-th word in the vocabulary; and V is the vocabulary size. The GloVe model suggests that an appropriate starting point for word representation learning should be associated with the ratios of co-occurrence probabilities rather than the prediction probabilities.
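To make Eqs. (1)-(2) concrete, here is a minimal sketch of the SG conditional probability, using a toy vocabulary and randomly initialized vectors (sizes and names are illustrative only, and input/output vector roles are merged for brevity; this is not the paper's implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
V, H = 5, 4                    # toy vocabulary size and embedding dimension
v = rng.normal(size=(V, H))    # one vector per word

def sg_prob(context_id, center_id):
    """P(w_{t+k} | w_t) of Eq. (2): a softmax over inner products."""
    scores = v @ v[center_id]  # v_{w_i}^T v_{w_t} for every word w_i
    scores -= scores.max()     # stabilize the exponentials
    probs = np.exp(scores) / np.exp(scores).sum()
    return probs[context_id]

# the conditional probabilities over the whole vocabulary form a distribution
dist = np.array([sg_prob(i, 2) for i in range(V)])
```

The objective of Eq. (1) would then sum `log sg_prob(...)` over every (center, context) pair inside the sliding window.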
More precisely, GloVe makes use of a weighted least squares regression, with the aim of learning word representations that can characterize the co-occurrence statistics between each pair of words:

    $\sum_{i=1}^{V} \sum_{j=1}^{V} f(X_{ij})\,(v_{w_i}^{\top} v_{w_j} + b_i + b_j - \log X_{ij})^2$,    (3)

where $w_i$ and $w_j$ are any two distinct words in the vocabulary; $X_{ij}$ denotes the number of times words $w_i$ and $w_j$ co-occur in a predefined sliding context window; f(·) is a monotonic smoothing function used to modulate the impact of each pair of words involved in model training; and $v_{w_i}$ and $b_i$ denote the word representation and the bias term of word $w_i$, respectively. Interested readers may refer to [15, 16] for a more thorough and entertaining discussion.

3. METHODOLOGY
3.1 The Proposed Word Embedding Method
Although the prediction-based methods have shown remarkable performance in several NLP-related tasks, they do not sufficiently utilize the statistics of the entire corpus, since the models are usually trained on local context windows in a separate manner [11]. By contrast, the count-based methods leverage the holistic statistics of the corpus efficiently. However, a few studies have indicated their relatively poor performance in some tasks [16]. Among all the existing methods (both the prediction-based and count-based methods), the interpretation of the learned value of each dimension in the representation is not intuitively clear. Motivated by these observations, a novel modeling approach, which naturally brings together the advantages of the two major model families and results in interpretable word representations, is proposed. We begin with the definition of terminologies and notations. As in most classic embedding methods, we introduce two sets of word representations: one is the set of desired word representations, denoted by M; the other is the set of separate context word representations, denoted by W. W and M are H × V matrices, where the j-th columns of matrices W and M, denoted by $W_{w_j} \in R^H$ and $M_{w_j} \in R^H$, correspond to the j-th word in the vocabulary. H is a pre-defined dimension of the word embedding. To make the learned representation interpretable, we assume that each word embedding is a multinomial representation.
Furthermore, to make the computation more efficient, we assume that each row vector of matrix W follows a multinomial distribution as well. To inherit the advantages of the prediction-based methods, the training objective is to obtain an appropriate word representation by considering the predictive ability of a given word $w_t$, occurring at an arbitrary position t of the training corpus, to predict its surrounding context words:

    $\prod_{-c \le k \le c,\, k \ne 0} P(w_{t+k} \mid w_t) = \prod_{-c \le k \le c,\, k \ne 0} \frac{W_{w_{t+k}}^{\top} M_{w_t}}{\sum_{j=1}^{V} W_{w_j}^{\top} M_{w_t}}$.    (4)

The denominator can be omitted because it always equals 1. In order to characterize the whole corpus statistics well, we train the model parameters in a batch mode instead of using a sequential learning strategy. Therefore, the objective function becomes

    $\sum_{i=1}^{V} \sum_{j=1}^{V} n(w_i, w_j)\,\log\big(W_{w_j}^{\top} M_{w_i}\big)$,    (5)

where $n(w_i, w_j)$ denotes the number of times words $w_i$ and $w_j$ co-occur in a pre-defined sliding context window. Obviously, such a model not only bears a close resemblance to the prediction-based methods (e.g., SG) but also capitalizes on the statistics gathered from the entire corpus like the count-based methods (e.g., GloVe), in a probabilistic framework. The component distributions (i.e., W and M) can be estimated using the expectation-maximization (EM) algorithm. Advanced algorithms, such as the triple jump EM algorithm [17], can be leveraged to accelerate the training process. Since the training objective of the proposed method is similar to that of the SG model, and it results in a set of distributional word representations, we thus term the proposed model the distributional skip-gram model (DSG). The interpretive ability of DSG will be discussed in detail later in Section 4.3.

3.2 Translation-based Language Modeling for Summarization
Language modeling (LM) has proven its broad utility in many NLP-related tasks. In the context of using LM for extractive speech summarization, each sentence S of a spoken document D to be summarized can be formulated as a probabilistic generative model for generating the document, and sentences are selected based on the corresponding generative probability P(D|S): the higher the probability, the more representative S is likely to be of D [18].
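Returning to the DSG model of Section 3.1, its probabilistic structure (Eqs. (4)-(5)) can be sketched with toy, randomly initialized parameters; the normalization constraints on W and M are what make the denominator of Eq. (4) equal 1. A real system would fit W and M with the EM algorithm, which is omitted here; all sizes are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)
V, H = 8, 3  # toy vocabulary size and number of embedding dimensions

# W, M are H x V; each row of W is a multinomial over words,
# each column of M (a word embedding) is a multinomial over dimensions.
W = rng.random((H, V)); W /= W.sum(axis=1, keepdims=True)
M = rng.random((H, V)); M /= M.sum(axis=0, keepdims=True)

def p_predict(j, i):
    """P(w_j | w_i) = W_{w_j}^T M_{w_i} (cf. Eqs. (4) and (8))."""
    return W[:, j] @ M[:, i]

# the constraints above make sum_j P(w_j | w_i) = 1 for every w_i,
# which is why the denominator of Eq. (4) can be dropped
row_sums = np.array([sum(p_predict(j, i) for j in range(V)) for i in range(V)])

# batch-mode objective of Eq. (5) over toy co-occurrence counts n(w_i, w_j)
n = rng.integers(0, 5, size=(V, V)).astype(float)
objective = sum(n[i, j] * np.log(p_predict(j, i))
                for i in range(V) for j in range(V) if n[i, j] > 0)
```

EM training would alternate between attributing each co-occurrence count to the H latent dimensions and re-normalizing W and M accordingly.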
The major challenge facing the LM-based framework is how to accurately estimate the model parameters for each sentence. The simplest way is to estimate a unigram language model (ULM) on the basis of the frequency of each distinct word w occurring in the sentence S, with the maximum likelihood criterion:

    $P(w \mid S) = \frac{n(w, S)}{|S|}$,    (6)

where n(w, S) is the number of times that word w occurs in sentence S, and |S| is the length of the sentence. There is a general consensus that merely matching terms in a candidate sentence and the document to be summarized may not always select summary sentences that can capture the important semantic intent of the document. Thus, in order to more precisely assess the representativeness of a sentence to the document, we suggest inferring the probability that the document would be generated as a translation of the sentence. That is, the generating probability is calculated based on a translation model of the form $P(w_j \mid w_i)$, which is the probability that a sentence word $w_i$ is semantically translated to a document word $w_j$:

    $P(D \mid S) = \prod_{w_j \in D} \Big(\sum_{w_i \in S} P(w_j \mid w_i)\, P(w_i \mid S)\Big)^{n(w_j, D)}$.    (7)

Accordingly, the translation-based language modeling approach allows us to score a sentence by computing the degree of match between a sentence word and the semantically related words in the document. If $P(w_j \mid w_i)$ only allows a word to be translated into itself, Eq. (7) will be reduced to the ULM approach (cf. Eq. (6)). However, $P(w_j \mid w_i)$ would in general allow $w_i$ to be translated into semantically related words with non-zero probabilities, thereby achieving semantic matching between the document and its component sentences. Based on the proposed DSG method, the translation probability can be naturally computed by

    $P(w_j \mid w_i) = W_{w_j}^{\top} M_{w_i}$.    (8)

Consequently, the sentences offering the highest generative probability (cf. Eqs. (7) and (8)) and dissimilar to those already selected (for an already selected sentence S', computing P(S|S') using Eq. (7)) will be selected and sequenced to form the final summary according to a desired summarization ratio. The proposed translation-based language modeling method is denoted by TBLM hereafter.

4. EXPERIMENTS
4.1 Dataset & Setup
We conduct a series of experiments on the Mandarin benchmark broadcast news corpus (MATBN) [19]. The MATBN dataset is publicly available and has been widely used to evaluate several NLP-related tasks, including speech recognition [20], information retrieval [21] and summarization [18].
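A minimal sketch of the TBLM scoring of Eqs. (6)-(8), using toy word-id sequences and randomly initialized, properly normalized W and M (illustrative only; in the paper these parameters are learned by DSG):

```python
import numpy as np

rng = np.random.default_rng(2)
V, H = 10, 4
W = rng.random((H, V)); W /= W.sum(axis=1, keepdims=True)  # rows: multinomials over words
M = rng.random((H, V)); M /= M.sum(axis=0, keepdims=True)  # cols: multinomials over dims

def p_translate(j, i):
    """Eq. (8): P(w_j | w_i) = W_{w_j}^T M_{w_i}."""
    return W[:, j] @ M[:, i]

def tblm_log_score(document, sentence):
    """log P(D|S) of Eq. (7), with P(w_i|S) the unigram model of Eq. (6)."""
    ulm = {w: sentence.count(w) / len(sentence) for w in set(sentence)}
    log_p = 0.0
    for w_doc in set(document):
        gen = sum(p_translate(w_doc, w_sen) * p_sen for w_sen, p_sen in ulm.items())
        log_p += document.count(w_doc) * np.log(gen)
    return log_p

doc = [0, 1, 2, 2, 3, 5]      # word-id sequences, purely illustrative
s1, s2 = [2, 3, 1], [7, 8, 9]
# candidate sentences are ranked by log P(D|S): the higher, the more representative
ranking = sorted([s1, s2], key=lambda s: tblm_log_score(doc, s), reverse=True)
```

Redundancy can then be handled as the text describes, by also computing P(S|S') against already selected sentences with the same scoring function.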
As such, we follow the experimental settings used by some previous studies for speech summarization. The average word error rate of the automatic transcripts of these broadcast news documents is about 38%. The reference summaries were generated by ranking the sentences in the manual transcript of a broadcast news document by importance, without assigning a score to each sentence. Each document has three reference summaries annotated by three subjects. For the assessment of summarization performance, we adopted the widely-used ROUGE metrics (in F-scores) [22]. The summarization ratio was set to 10%. In addition, a corpus of 14,000 text news documents, compiled during the same period as the broadcast news documents, was used to estimate the parameters of the models compared in this paper.

[Table 1. Summarization results achieved by various word embedding methods (SG, GloVe, DSG) and vector-space baselines (VSM, LSA, MMR) in conjunction with the cosine similarity measure; scores not recoverable from the source.]

[Table 2. Summarization results achieved by various word embedding methods (SG, GloVe, DSG) in conjunction with the translation-based language modeling method, compared against ULM; scores not recoverable from the source.]

[Table 3. Summarization results achieved by a few well-studied or/and state-of-the-art unsupervised methods (MRW, LexRank, SM, ILP, DSG(TBLM)); scores not recoverable from the source.]

4.2 Experimental Results
A common thread of leveraging word embedding methods in a summarization task is to represent a document/sentence by averaging the corresponding word embeddings over all words in the document/sentence. After that, the cosine similarity measure, as a straightforward choice, can be readily applied to determine the degree of relevance between any pair of representations [29, 30]. In the first place, we try to investigate the effectiveness of two state-of-the-art word embedding methods (i.e., SG and GloVe) and the proposed method (i.e., DSG), in conjunction with the cosine similarity measure for speech summarization. The experimental results are shown in Table 1, where TD denotes the results obtained based on the manual transcripts of spoken documents, and SD denotes the results using the speech recognition transcripts of spoken documents, which may contain speech recognition errors.
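The averaging-plus-cosine scheme described above can be sketched as follows (the embedding table is a random stand-in for vectors learned by SG, GloVe or DSG; word ids are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
V, H = 10, 4
emb = rng.normal(size=(V, H))  # any learned word embedding table

def represent(word_ids):
    # average the word embeddings over all words in the sentence/document
    return emb[word_ids].mean(axis=0)

def cosine(a, b):
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))

doc = represent([0, 1, 2, 2, 3])
sent = represent([1, 2, 3])
relevance = cosine(doc, sent)  # degree of relevance of the document-sentence pair
```

Note how this collapses each sentence and document to a single vector first, which is exactly the coarseness that TBLM's word-to-word translation probabilities avoid.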
Several observations can be made from these results. First, the two classic word embedding methods, though based on disparate model structures and learning strategies, achieve results competitive with each other in both the TD and SD cases. Second, the proposed DSG method, which naturally brings together the advantages of the two major model families (i.e., prediction-based and count-based) in the literature, outperforms SG and GloVe (representatives of the two model families, respectively) by a significant margin in both the TD and SD cases. Third, since the relevance degree of a document-sentence pair is computed by the cosine similarity measure, vector space-based methods, such as VSM [23], LSA [23] and MMR [24], can be treated as the principled baseline systems. Albeit the classic word embedding methods (i.e., GloVe and SG) outperform the conventional VSM model, they achieve almost the same level of performance as LSA and MMR, which are considered two enhanced versions of VSM. It should be noted that

[Figure 1. A running example for interpreting the word embeddings learned by DSG: word clouds over context words for two dimensions of the embedding M_apple, labeled "Daily Life" and "Tech. Company".]

the proposed DSG method not only outperforms VSM, but is also superior to LSA and MMR in both the TD and SD cases. Next, we evaluate the proposed TBLM method. It is worth mentioning that TBLM is well suited to pairing with the DSG method, since the translation probability can be obtained by lightweight calculation (cf. Eq. (8)). It can also be integrated with the classic word embedding methods (e.g., SG and GloVe), but with a heavier computational burden (cf. Eq. (2) for example). The experimental results are summarized in Table 2. Three observations can be made from the results. First, since the proposed DSG method inherits the advantages of the prediction- and count-based methods, it outperforms both SG and GloVe when all three models are paired with the proposed TBLM method. Second, when integrated with TBLM, all three word embedding methods outperform the baseline ULM method [18, 31] (cf. Section 3.2) by a remarkable margin in both the TD and SD cases. Third, comparing the results in Tables 1 and 2, it is evident that TBLM is a preferable vehicle for making use of powerful word embedding methods in speech summarization. In the last set of experiments, we assess the performance levels of several well-practiced unsupervised summarization methods, including the graph-based methods (i.e., MRW [25] and LexRank [26]), the submodularity (SM) method [27], and the integer linear programming (ILP) method [28]. The corresponding results are illustrated in Table 3. The performance trends of these state-of-the-art methods in our study are quite in line with the observations made by previous studies in different extractive summarization tasks. A noticeable observation is that speech recognition errors may lead to inaccurate similarity measures between a pair of sentences or a document-sentence pair. Probably for this reason, the two graph-based methods (i.e., MRW and LexRank) cannot perform on par with the vector-space methods (i.e., VSM, LSA, and MMR) (cf. Table 1) in the SD case, but the situation is reversed in the TD case.
Moreover, SM and ILP achieve the best results in the TD case, but only offer mediocre performance among all methods in the SD case. To sum up, the proposed DSG method, which inherits the advantages of both the prediction- and count-based methods, indeed outperforms the classic word embedding methods when paired with different summarization strategies (i.e., the cosine similarity measure and TBLM). The proposed TBLM method further enhances the DSG method, since it can authentically capture a finer-grained (i.e., word-to-word) semantic relationship to be effectively used in extractive speech summarization.

4.3 Further Analysis
SG, GloVe and the proposed DSG method can be analyzed from several critical perspectives. First, SG and DSG aim at maximizing the collection likelihood in training, while GloVe concentrates on discovering useful information from the co-occurrence statistics between each pair of words. It is worthy to note that GloVe has a close relation with the classic weighted matrix factorization approach; the major difference is that the former concentrates on rendering the word-by-word co-occurrence matrix while the latter decomposes the word-by-document matrix [23, 32, 33]. Second, since the parameters (i.e., word representations) of SG are trained sequentially (i.e., the so-called on-line learning strategy), the sequential order of the training corpus may make the resulting models unstable. On the contrary, GloVe and DSG accumulate the statistics over the entire training corpus in the first place; the model parameters are then updated based on such statistics at once (i.e., the so-called batch-mode learning strategy). Finally, the word vectors learned by DSG are distributional representations, while SG and GloVe express each word by a distributed representation. The classic embedding methods usually output two sets of word representations, but there is no particular use for the context word representations.
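Under DSG, by contrast, the context word representations are directly useful for interpretation: each row of W is a distribution over words, and each column of M is a distribution over dimensions. A toy sketch of the resulting readout (hypothetical six-word vocabulary, random parameters in place of trained ones):

```python
import numpy as np

rng = np.random.default_rng(4)
vocab = ["apple", "pie", "fruit", "iphone", "company", "stock"]  # hypothetical
V, H = len(vocab), 2
W = rng.random((H, V)); W /= W.sum(axis=1, keepdims=True)  # rows: P(word | dim)
M = rng.random((H, V)); M /= M.sum(axis=0, keepdims=True)  # cols: P(dim | word)

def interpret(word, top_dims=1, top_words=3):
    """Pick the highest-probability dimensions of a word's embedding (a column
    of M), then read off the highest-probability context words in the
    corresponding rows of W."""
    i = vocab.index(word)
    dims = np.argsort(M[:, i])[::-1][:top_dims]
    return {int(h): [vocab[j] for j in np.argsort(W[h])[::-1][:top_words]]
            for h in dims}

readout = interpret("apple")
```

With trained parameters, such a readout is what produces the per-dimension word clouds of Figure 1.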
Since DSG assumes that each row of the context word matrix W follows a multinomial distribution (i.e., a multinomial distribution over words), we can naturally interpret the semantic meaning of each dimension of a word embedding by referring to the words with higher probabilities in the corresponding row vector of W. Since each word embedding in the desired word representations M is also a multinomial distribution, we can interpret a learned word embedding by first identifying the dimensions with higher probabilities and then identifying the context words with higher probabilities in the corresponding row vectors of W. Figure 1 shows a running example for the word "apple" learned by DSG on the LDC Gigaword corpus. The word cloud can be plotted with respect to the probabilities of individual context words for a selected dimension. It is obvious that "apple" is not only an important item in our daily life, but also a famous technology company. The example shows that the word embeddings learned by DSG can be interpreted in a reasonable and systematic manner.

5. CONCLUSIONS
A novel distributional word embedding method and a translation-based language modeling method have been proposed and introduced for extractive speech summarization in this paper. Empirical results have demonstrated their respective and joint effectiveness and efficiency over several state-of-the-art summarization methods. In the future, we plan to further extend and apply the proposed framework to a wider range of summarization and NLP-related tasks. We will also concentrate on integrating a variety of prior knowledge for learning the word representations.

6. REFERENCES
[1] Mari Ostendorf. Speech technology and information access. IEEE Signal Processing Magazine, 25(3).
[2] Sadaoki Furui, Li Deng, Mark Gales, Hermann Ney, and Keiichi Tokuda. Fundamental technologies in modern speech recognition. IEEE Signal Processing Magazine, 29(6).
[3] Inderjeet Mani and Mark T. Maybury (Eds.). Advances in automatic text summarization. Cambridge, MA: MIT Press.
[4] P. B. Baxendale. Machine-made index for technical literature - an experiment. IBM Journal, 2(4).
[5] Yang Liu and Dilek Hakkani-Tur. Speech summarization. Chapter 13 in Spoken Language Understanding: Systems for Extracting Semantic Information from Speech, G. Tur and R. D. Mori (Eds.), New York: Wiley.
[6] Ziqiang Cao, Furu Wei, Sujian Li, Wenjie Li, Ming Zhou, and Houfeng Wang. Learning summary prior representation for extractive summarization. In Proc. ACL.
[7] Ani Nenkova and Kathleen McKeown. Automatic summarization. Foundations and Trends in Information Retrieval, 5(2-3).
[8] Gerald Penn and Xiaodan Zhu. A critical reassessment of evaluation baselines for speech summarization. In Proc. ACL.
[9] Yoshua Bengio, Rejean Ducharme, Pascal Vincent, and Christian Jauvin. A neural probabilistic language model. Journal of Machine Learning Research, 3.
[10] Tomas Mikolov, Kai Chen, Greg Corrado, and Jeffrey Dean. Efficient estimation of word representations in vector space. In Proc. ICLR.
[11] Jeffrey Pennington, Richard Socher, and Christopher D. Manning. GloVe: Global vectors for word representation. In Proc. EMNLP.
[12] Will Y. Zou, Richard Socher, Daniel Cer, and Christopher D. Manning. Bilingual word embeddings for phrase-based machine translation. In Proc. ACL.
[13] Duyu Tang, Furu Wei, Nan Yang, Ming Zhou, Ting Liu, and Bing Qin. Learning sentiment-specific word embedding for Twitter sentiment classification. In Proc. ACL.
[14] Ronan Collobert and Jason Weston. A unified architecture for natural language processing: deep neural networks with multitask learning. In Proc. ICML.
[15] Marco Baroni, Georgiana Dinu, and German Kruszewski. Don't count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors. In Proc. ACL.
[16] Omer Levy, Yoav Goldberg, and Ido Dagan. Improving distributional similarity with lessons learned from word embeddings. Transactions of the Association for Computational Linguistics, 3.
[17] Han-Shen Huang, Bou-Ho Yang, and Chun-Nan Hsu. Triple jump acceleration for the EM algorithm. In Proc. ICDM, pages 1-4.
[18] Shih-Hung Liu, Kuan-Yu Chen, Berlin Chen, Hsin-Min Wang, Hsu-Chun Yen, and Wen-Lian Hsu. Combining relevance language modeling and clarity measure for extractive speech summarization. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(6).
[19] Hsin-Min Wang, Berlin Chen, Jen-Wei Kuo, and Shih-Sian Cheng. MATBN: A Mandarin Chinese broadcast news corpus. International Journal of Computational Linguistics and Chinese Language Processing, 10(2).
[20] Jen-Tzung Chien. Hierarchical Pitman-Yor-Dirichlet language model. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(8).
[21] Chien-Lin Huang and Chung-Hsien Wu. Spoken document retrieval using multi-level knowledge and semantic verification. IEEE Transactions on Audio, Speech, and Language Processing, 15(8).
[22] Chin-Yew Lin. ROUGE: Recall-oriented understudy for gisting evaluation. http://haydn.isi.edu/ROUGE/.
[23] Yihong Gong and Xin Liu. Generic text summarization using relevance measure and latent semantic analysis. In Proc. SIGIR.
[24] Jaime Carbonell and Jade Goldstein. The use of MMR, diversity-based reranking for reordering documents and producing summaries. In Proc. SIGIR.
[25] Xiaojun Wan and Jianwu Yang. Multi-document summarization using cluster-based link analysis. In Proc. SIGIR.
[26] Gunes Erkan and Dragomir R. Radev. LexRank: Graph-based lexical centrality as salience in text summarization. Journal of Artificial Intelligence Research, 22(1).
[27] Hui Lin and Jeff Bilmes. Multi-document summarization via budgeted maximization of submodular functions. In Proc. NAACL HLT.
[28] Korbinian Riedhammer, Benoit Favre, and Dilek Hakkani-Tur. Long story short - Global unsupervised models for keyphrase based meeting summarization. Speech Communication, 52(10).
[29] Mikael Kageback, Olof Mogren, Nina Tahmasebi, and Devdatt Dubhashi. Extractive summarization using continuous vector space models. In Proc. CVSC.
[30] Kuan-Yu Chen, Shih-Hung Liu, Hsin-Min Wang, Berlin Chen, and Hsin-Hsi Chen. Leveraging word embeddings for spoken document summarization. In Proc. INTERSPEECH.
[31] Kuan-Yu Chen, Shih-Hung Liu, Berlin Chen, Hsin-Min Wang, Ea-Ee Jan, Wen-Lian Hsu, and Hsin-Hsi Chen. Extractive broadcast news summarization leveraging recurrent neural network language modeling techniques. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(8).
[32] Kuan-Yu Chen, Hsin-Min Wang, Berlin Chen, and Hsin-Hsi Chen. Weighted matrix factorization for spoken document retrieval. In Proc. ICASSP.
[33] Omer Levy and Yoav Goldberg. Neural word embedding as implicit matrix factorization. In Proc. NIPS, pages 1-9.


More information

Lecture 1: Machine Learning Basics

Lecture 1: Machine Learning Basics 1/69 Lecture 1: Machine Learning Basics Ali Harakeh University of Waterloo WAVE Lab ali.harakeh@uwaterloo.ca May 1, 2017 2/69 Overview 1 Learning Algorithms 2 Capacity, Overfitting, and Underfitting 3

More information

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks

POS tagging of Chinese Buddhist texts using Recurrent Neural Networks POS tagging of Chinese Buddhist texts using Recurrent Neural Networks Longlu Qin Department of East Asian Languages and Cultures longlu@stanford.edu Abstract Chinese POS tagging, as one of the most important

More information

A deep architecture for non-projective dependency parsing

A deep architecture for non-projective dependency parsing Universidade de São Paulo Biblioteca Digital da Produção Intelectual - BDPI Departamento de Ciências de Computação - ICMC/SCC Comunicações em Eventos - ICMC/SCC 2015-06 A deep architecture for non-projective

More information

Summarizing Answers in Non-Factoid Community Question-Answering

Summarizing Answers in Non-Factoid Community Question-Answering Summarizing Answers in Non-Factoid Community Question-Answering Hongya Song Zhaochun Ren Shangsong Liang hongya.song.sdu@gmail.com zhaochun.ren@ucl.ac.uk shangsong.liang@ucl.ac.uk Piji Li Jun Ma Maarten

More information

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks

System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks System Implementation for SemEval-2017 Task 4 Subtask A Based on Interpolated Deep Neural Networks 1 Tzu-Hsuan Yang, 2 Tzu-Hsuan Tseng, and 3 Chia-Ping Chen Department of Computer Science and Engineering

More information

Semantic and Context-aware Linguistic Model for Bias Detection

Semantic and Context-aware Linguistic Model for Bias Detection Semantic and Context-aware Linguistic Model for Bias Detection Sicong Kuang Brian D. Davison Lehigh University, Bethlehem PA sik211@lehigh.edu, davison@cse.lehigh.edu Abstract Prior work on bias detection

More information

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models

Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Learning Structural Correspondences Across Different Linguistic Domains with Synchronous Neural Language Models Stephan Gouws and GJ van Rooyen MIH Medialab, Stellenbosch University SOUTH AFRICA {stephan,gvrooyen}@ml.sun.ac.za

More information

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17.

Semi-supervised methods of text processing, and an application to medical concept extraction. Yacine Jernite Text-as-Data series September 17. Semi-supervised methods of text processing, and an application to medical concept extraction Yacine Jernite Text-as-Data series September 17. 2015 What do we want from text? 1. Extract information 2. Link

More information

Curriculum Vitae of Chiang-Ju Chien

Curriculum Vitae of Chiang-Ju Chien Contact Information Curriculum Vitae of Chiang-Ju Chien Affiliation : Department of Electronic Engineering, Huafan University, Taiwan Address : Department of Electronic Engineering, Huafan University,

More information

The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation

The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation The MSR-NRC-SRI MT System for NIST Open Machine Translation 2008 Evaluation AUTHORS AND AFFILIATIONS MSR: Xiaodong He, Jianfeng Gao, Chris Quirk, Patrick Nguyen, Arul Menezes, Robert Moore, Kristina Toutanova,

More information

EDUCATION AND THE PUBLIC DIMENSION OF MUSEUMS

EDUCATION AND THE PUBLIC DIMENSION OF MUSEUMS xce e ce an equity EDUCATION AND THE PUBLIC DIMENSION OF MUSEUMS A Report from the American Association of Museums, 1992 Foreord this report from the American Association of Museums points the ay for museums

More information

Identification of Opinion Leaders Using Text Mining Technique in Virtual Community

Identification of Opinion Leaders Using Text Mining Technique in Virtual Community Identification of Opinion Leaders Using Text Mining Technique in Virtual Community Chihli Hung Department of Information Management Chung Yuan Christian University Taiwan 32023, R.O.C. chihli@cycu.edu.tw

More information

Investigation on Mandarin Broadcast News Speech Recognition

Investigation on Mandarin Broadcast News Speech Recognition Investigation on Mandarin Broadcast News Speech Recognition Mei-Yuh Hwang 1, Xin Lei 1, Wen Wang 2, Takahiro Shinozaki 1 1 Univ. of Washington, Dept. of Electrical Engineering, Seattle, WA 98195 USA 2

More information

HLTCOE at TREC 2013: Temporal Summarization

HLTCOE at TREC 2013: Temporal Summarization HLTCOE at TREC 2013: Temporal Summarization Tan Xu University of Maryland College Park Paul McNamee Johns Hopkins University HLTCOE Douglas W. Oard University of Maryland College Park Abstract Our team

More information

Learning Methods in Multilingual Speech Recognition

Learning Methods in Multilingual Speech Recognition Learning Methods in Multilingual Speech Recognition Hui Lin Department of Electrical Engineering University of Washington Seattle, WA 98125 linhui@u.washington.edu Li Deng, Jasha Droppo, Dong Yu, and Alex

More information

arxiv: v4 [cs.cl] 28 Mar 2016

arxiv: v4 [cs.cl] 28 Mar 2016 LSTM-BASED DEEP LEARNING MODELS FOR NON- FACTOID ANSWER SELECTION Ming Tan, Cicero dos Santos, Bing Xiang & Bowen Zhou IBM Watson Core Technologies Yorktown Heights, NY, USA {mingtan,cicerons,bingxia,zhou}@us.ibm.com

More information

arxiv: v1 [cs.cl] 20 Jul 2015

arxiv: v1 [cs.cl] 20 Jul 2015 How to Generate a Good Word Embedding? Siwei Lai, Kang Liu, Liheng Xu, Jun Zhao National Laboratory of Pattern Recognition (NLPR) Institute of Automation, Chinese Academy of Sciences, China {swlai, kliu,

More information

BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING

BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING BUILDING CONTEXT-DEPENDENT DNN ACOUSTIC MODELS USING KULLBACK-LEIBLER DIVERGENCE-BASED STATE TYING Gábor Gosztolya 1, Tamás Grósz 1, László Tóth 1, David Imseng 2 1 MTA-SZTE Research Group on Artificial

More information

Calibration of Confidence Measures in Speech Recognition

Calibration of Confidence Measures in Speech Recognition Submitted to IEEE Trans on Audio, Speech, and Language, July 2010 1 Calibration of Confidence Measures in Speech Recognition Dong Yu, Senior Member, IEEE, Jinyu Li, Member, IEEE, Li Deng, Fellow, IEEE

More information

The Karlsruhe Institute of Technology Translation Systems for the WMT 2011

The Karlsruhe Institute of Technology Translation Systems for the WMT 2011 The Karlsruhe Institute of Technology Translation Systems for the WMT 2011 Teresa Herrmann, Mohammed Mediani, Jan Niehues and Alex Waibel Karlsruhe Institute of Technology Karlsruhe, Germany firstname.lastname@kit.edu

More information

Georgetown University at TREC 2017 Dynamic Domain Track

Georgetown University at TREC 2017 Dynamic Domain Track Georgetown University at TREC 2017 Dynamic Domain Track Zhiwen Tang Georgetown University zt79@georgetown.edu Grace Hui Yang Georgetown University huiyang@cs.georgetown.edu Abstract TREC Dynamic Domain

More information

Efficient Online Summarization of Microblogging Streams

Efficient Online Summarization of Microblogging Streams Efficient Online Summarization of Microblogging Streams Andrei Olariu Faculty of Mathematics and Computer Science University of Bucharest andrei@olariu.org Abstract The large amounts of data generated

More information

Training a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski

Training a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski Training a Neural Network to Answer 8th Grade Science Questions Steven Hewitt, An Ju, Katherine Stasaski Problem Statement and Background Given a collection of 8th grade science questions, possible answer

More information

Constructing Parallel Corpus from Movie Subtitles

Constructing Parallel Corpus from Movie Subtitles Constructing Parallel Corpus from Movie Subtitles Han Xiao 1 and Xiaojie Wang 2 1 School of Information Engineering, Beijing University of Post and Telecommunications artex.xh@gmail.com 2 CISTR, Beijing

More information

LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting

LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting LIM-LIG at SemEval-2017 Task1: Enhancing the Semantic Similarity for Arabic Sentences with Vectors Weighting El Moatez Billah Nagoudi Laboratoire d Informatique et de Mathématiques LIM Université Amar

More information

arxiv: v1 [cs.cl] 2 Apr 2017

arxiv: v1 [cs.cl] 2 Apr 2017 Word-Alignment-Based Segment-Level Machine Translation Evaluation using Word Embeddings Junki Matsuo and Mamoru Komachi Graduate School of System Design, Tokyo Metropolitan University, Japan matsuo-junki@ed.tmu.ac.jp,

More information

arxiv: v1 [cs.lg] 3 May 2013

arxiv: v1 [cs.lg] 3 May 2013 Feature Selection Based on Term Frequency and T-Test for Text Categorization Deqing Wang dqwang@nlsde.buaa.edu.cn Hui Zhang hzhang@nlsde.buaa.edu.cn Rui Liu, Weifeng Lv {liurui,lwf}@nlsde.buaa.edu.cn arxiv:1305.0638v1

More information

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur

Module 12. Machine Learning. Version 2 CSE IIT, Kharagpur Module 12 Machine Learning 12.1 Instructional Objective The students should understand the concept of learning systems Students should learn about different aspects of a learning system Students should

More information

Language Model and Grammar Extraction Variation in Machine Translation

Language Model and Grammar Extraction Variation in Machine Translation Language Model and Grammar Extraction Variation in Machine Translation Vladimir Eidelman, Chris Dyer, and Philip Resnik UMIACS Laboratory for Computational Linguistics and Information Processing Department

More information

A study of speaker adaptation for DNN-based speech synthesis

A study of speaker adaptation for DNN-based speech synthesis A study of speaker adaptation for DNN-based speech synthesis Zhizheng Wu, Pawel Swietojanski, Christophe Veaux, Steve Renals, Simon King The Centre for Speech Technology Research (CSTR) University of Edinburgh,

More information

SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING

SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING SEMI-SUPERVISED ENSEMBLE DNN ACOUSTIC MODEL TRAINING Sheng Li 1, Xugang Lu 2, Shinsuke Sakai 1, Masato Mimura 1 and Tatsuya Kawahara 1 1 School of Informatics, Kyoto University, Sakyo-ku, Kyoto 606-8501,

More information

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF)

SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) SINGLE DOCUMENT AUTOMATIC TEXT SUMMARIZATION USING TERM FREQUENCY-INVERSE DOCUMENT FREQUENCY (TF-IDF) Hans Christian 1 ; Mikhael Pramodana Agus 2 ; Derwin Suhartono 3 1,2,3 Computer Science Department,

More information

Assignment 1: Predicting Amazon Review Ratings

Assignment 1: Predicting Amazon Review Ratings Assignment 1: Predicting Amazon Review Ratings 1 Dataset Analysis Richard Park r2park@acsmail.ucsd.edu February 23, 2015 The dataset selected for this assignment comes from the set of Amazon reviews for

More information

Re-evaluating the Role of Bleu in Machine Translation Research

Re-evaluating the Role of Bleu in Machine Translation Research Re-evaluating the Role of Bleu in Machine Translation Research Chris Callison-Burch Miles Osborne Philipp Koehn School on Informatics University of Edinburgh 2 Buccleuch Place Edinburgh, EH8 9LW callison-burch@ed.ac.uk

More information

Ensemble Technique Utilization for Indonesian Dependency Parser

Ensemble Technique Utilization for Indonesian Dependency Parser Ensemble Technique Utilization for Indonesian Dependency Parser Arief Rahman Institut Teknologi Bandung Indonesia 23516008@std.stei.itb.ac.id Ayu Purwarianti Institut Teknologi Bandung Indonesia ayu@stei.itb.ac.id

More information

Vocabulary Agreement Among Model Summaries And Source Documents 1

Vocabulary Agreement Among Model Summaries And Source Documents 1 Vocabulary Agreement Among Model Summaries And Source Documents 1 Terry COPECK, Stan SZPAKOWICZ School of Information Technology and Engineering University of Ottawa 800 King Edward Avenue, P.O. Box 450

More information

Australian Journal of Basic and Applied Sciences

Australian Journal of Basic and Applied Sciences AENSI Journals Australian Journal of Basic and Applied Sciences ISSN:1991-8178 Journal home page: www.ajbasweb.com Feature Selection Technique Using Principal Component Analysis For Improving Fuzzy C-Mean

More information

Noisy SMS Machine Translation in Low-Density Languages

Noisy SMS Machine Translation in Low-Density Languages Noisy SMS Machine Translation in Low-Density Languages Vladimir Eidelman, Kristy Hollingshead, and Philip Resnik UMIACS Laboratory for Computational Linguistics and Information Processing Department of

More information

Online Updating of Word Representations for Part-of-Speech Tagging

Online Updating of Word Representations for Part-of-Speech Tagging Online Updating of Word Representations for Part-of-Speech Tagging Wenpeng Yin LMU Munich wenpeng@cis.lmu.de Tobias Schnabel Cornell University tbs49@cornell.edu Hinrich Schütze LMU Munich inquiries@cislmu.org

More information

Missouri Mathematics Grade-Level Expectations

Missouri Mathematics Grade-Level Expectations A Correlation of to the Grades K - 6 G/M-223 Introduction This document demonstrates the high degree of success students will achieve when using Scott Foresman Addison Wesley Mathematics in meeting the

More information

arxiv: v1 [cs.cl] 27 Apr 2016

arxiv: v1 [cs.cl] 27 Apr 2016 The IBM 2016 English Conversational Telephone Speech Recognition System George Saon, Tom Sercu, Steven Rennie and Hong-Kwang J. Kuo IBM T. J. Watson Research Center, Yorktown Heights, NY, 10598 gsaon@us.ibm.com

More information

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition

Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Segmental Conditional Random Fields with Deep Neural Networks as Acoustic Models for First-Pass Word Recognition Yanzhang He, Eric Fosler-Lussier Department of Computer Science and Engineering The hio

More information

The role of word-word co-occurrence in word learning

The role of word-word co-occurrence in word learning The role of word-word co-occurrence in word learning Abdellah Fourtassi (a.fourtassi@ueuromed.org) The Euro-Mediterranean University of Fes FesShore Park, Fes, Morocco Emmanuel Dupoux (emmanuel.dupoux@gmail.com)

More information

The Role of the Head in the Interpretation of English Deverbal Compounds

The Role of the Head in the Interpretation of English Deverbal Compounds The Role of the Head in the Interpretation of English Deverbal Compounds Gianina Iordăchioaia i, Lonneke van der Plas ii, Glorianna Jagfeld i (Universität Stuttgart i, University of Malta ii ) Wen wurmt

More information

A heuristic framework for pivot-based bilingual dictionary induction

A heuristic framework for pivot-based bilingual dictionary induction 2013 International Conference on Culture and Computing A heuristic framework for pivot-based bilingual dictionary induction Mairidan Wushouer, Toru Ishida, Donghui Lin Department of Social Informatics,

More information

BLACKBOARD TRAINING PHASE 2 CREATE ASSESSMENT. Essential Tool Part 1 Rubrics, page 3-4. Assignment Tool Part 2 Assignments, page 5-10

BLACKBOARD TRAINING PHASE 2 CREATE ASSESSMENT. Essential Tool Part 1 Rubrics, page 3-4. Assignment Tool Part 2 Assignments, page 5-10 BLACKBOARD TRAINING PHASE 2 CREATE ASSESSMENT Essential Tool Part 1 Rubrics, page 3-4 Assignment Tool Part 2 Assignments, page 5-10 Review Tool Part 3 SafeAssign, page 11-13 Assessment Tool Part 4 Test,

More information

Human-like Natural Language Generation Using Monte Carlo Tree Search

Human-like Natural Language Generation Using Monte Carlo Tree Search Human-like Natural Language Generation Using Monte Carlo Tree Search Kaori Kumagai Ichiro Kobayashi Daichi Mochihashi Ochanomizu University The Institute of Statistical Mathematics {kaori.kumagai,koba}@is.ocha.ac.jp

More information

Unsupervised Cross-Lingual Scaling of Political Texts

Unsupervised Cross-Lingual Scaling of Political Texts Unsupervised Cross-Lingual Scaling of Political Texts Goran Glavaš and Federico Nanni and Simone Paolo Ponzetto Data and Web Science Group University of Mannheim B6, 26, DE-68159 Mannheim, Germany {goran,

More information

Improvements to the Pruning Behavior of DNN Acoustic Models

Improvements to the Pruning Behavior of DNN Acoustic Models Improvements to the Pruning Behavior of DNN Acoustic Models Matthias Paulik Apple Inc., Infinite Loop, Cupertino, CA 954 mpaulik@apple.com Abstract This paper examines two strategies that positively influence

More information

A Comparison of Two Text Representations for Sentiment Analysis

A Comparison of Two Text Representations for Sentiment Analysis 010 International Conference on Computer Application and System Modeling (ICCASM 010) A Comparison of Two Text Representations for Sentiment Analysis Jianxiong Wang School of Computer Science & Educational

More information

Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty

Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty Atypical Prosodic Structure as an Indicator of Reading Level and Text Difficulty Julie Medero and Mari Ostendorf Electrical Engineering Department University of Washington Seattle, WA 98195 USA {jmedero,ostendor}@uw.edu

More information

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar

EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar EdIt: A Broad-Coverage Grammar Checker Using Pattern Grammar Chung-Chi Huang Mei-Hua Chen Shih-Ting Huang Jason S. Chang Institute of Information Systems and Applications, National Tsing Hua University,

More information

Longest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays

Longest Common Subsequence: A Method for Automatic Evaluation of Handwritten Essays IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 6, Ver. IV (Nov Dec. 2015), PP 01-07 www.iosrjournals.org Longest Common Subsequence: A Method for

More information

Extracting and Ranking Product Features in Opinion Documents

Extracting and Ranking Product Features in Opinion Documents Extracting and Ranking Product Features in Opinion Documents Lei Zhang Department of Computer Science University of Illinois at Chicago 851 S. Morgan Street Chicago, IL 60607 lzhang3@cs.uic.edu Bing Liu

More information

Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling

Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling Domain Adaptation in Statistical Machine Translation of User-Forum Data using Component-Level Mixture Modelling Pratyush Banerjee, Sudip Kumar Naskar, Johann Roturier 1, Andy Way 2, Josef van Genabith

More information

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition

Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Unvoiced Landmark Detection for Segment-based Mandarin Continuous Speech Recognition Hua Zhang, Yun Tang, Wenju Liu and Bo Xu National Laboratory of Pattern Recognition Institute of Automation, Chinese

More information

Cross Language Information Retrieval

Cross Language Information Retrieval Cross Language Information Retrieval RAFFAELLA BERNARDI UNIVERSITÀ DEGLI STUDI DI TRENTO P.ZZA VENEZIA, ROOM: 2.05, E-MAIL: BERNARDI@DISI.UNITN.IT Contents 1 Acknowledgment.............................................

More information

Distributed Learning of Multilingual DNN Feature Extractors using GPUs

Distributed Learning of Multilingual DNN Feature Extractors using GPUs Distributed Learning of Multilingual DNN Feature Extractors using GPUs Yajie Miao, Hao Zhang, Florian Metze Language Technologies Institute, School of Computer Science, Carnegie Mellon University Pittsburgh,

More information

Dublin City Schools Mathematics Graded Course of Study GRADE 4

Dublin City Schools Mathematics Graded Course of Study GRADE 4 I. Content Standard: Number, Number Sense and Operations Standard Students demonstrate number sense, including an understanding of number systems and reasonable estimates using paper and pencil, technology-supported

More information

Residual Stacking of RNNs for Neural Machine Translation

Residual Stacking of RNNs for Neural Machine Translation Residual Stacking of RNNs for Neural Machine Translation Raphael Shu The University of Tokyo shu@nlab.ci.i.u-tokyo.ac.jp Akiva Miura Nara Institute of Science and Technology miura.akiba.lr9@is.naist.jp

More information

Mandarin Lexical Tone Recognition: The Gating Paradigm

Mandarin Lexical Tone Recognition: The Gating Paradigm Kansas Working Papers in Linguistics, Vol. 0 (008), p. 8 Abstract Mandarin Lexical Tone Recognition: The Gating Paradigm Yuwen Lai and Jie Zhang University of Kansas Research on spoken word recognition

More information

The Internet as a Normative Corpus: Grammar Checking with a Search Engine

The Internet as a Normative Corpus: Grammar Checking with a Search Engine The Internet as a Normative Corpus: Grammar Checking with a Search Engine Jonas Sjöbergh KTH Nada SE-100 44 Stockholm, Sweden jsh@nada.kth.se Abstract In this paper some methods using the Internet as a

More information

Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews

Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews Syntactic Patterns versus Word Alignment: Extracting Opinion Targets from Online Reviews Kang Liu, Liheng Xu and Jun Zhao National Laboratory of Pattern Recognition Institute of Automation, Chinese Academy

More information

Detecting English-French Cognates Using Orthographic Edit Distance

Detecting English-French Cognates Using Orthographic Edit Distance Detecting English-French Cognates Using Orthographic Edit Distance Qiongkai Xu 1,2, Albert Chen 1, Chang i 1 1 The Australian National University, College of Engineering and Computer Science 2 National

More information

Distant Supervised Relation Extraction with Wikipedia and Freebase

Distant Supervised Relation Extraction with Wikipedia and Freebase Distant Supervised Relation Extraction with Wikipedia and Freebase Marcel Ackermann TU Darmstadt ackermann@tk.informatik.tu-darmstadt.de Abstract In this paper we discuss a new approach to extract relational

More information

A Case Study: News Classification Based on Term Frequency

A Case Study: News Classification Based on Term Frequency A Case Study: News Classification Based on Term Frequency Petr Kroha Faculty of Computer Science University of Technology 09107 Chemnitz Germany kroha@informatik.tu-chemnitz.de Ricardo Baeza-Yates Center

More information

Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models

Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Extracting Opinion Expressions and Their Polarities Exploration of Pipelines and Joint Models Richard Johansson and Alessandro Moschitti DISI, University of Trento Via Sommarive 14, 38123 Trento (TN),

More information

On-the-Fly Customization of Automated Essay Scoring

On-the-Fly Customization of Automated Essay Scoring Research Report On-the-Fly Customization of Automated Essay Scoring Yigal Attali Research & Development December 2007 RR-07-42 On-the-Fly Customization of Automated Essay Scoring Yigal Attali ETS, Princeton,

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

Python Machine Learning

Python Machine Learning Python Machine Learning Unlock deeper insights into machine learning with this vital guide to cuttingedge predictive analytics Sebastian Raschka [ PUBLISHING 1 open source I community experience distilled

More information

Twitter Sentiment Classification on Sanders Data using Hybrid Approach

Twitter Sentiment Classification on Sanders Data using Hybrid Approach IOSR Journal of Computer Engineering (IOSR-JCE) e-issn: 2278-0661,p-ISSN: 2278-8727, Volume 17, Issue 4, Ver. I (July Aug. 2015), PP 118-123 www.iosrjournals.org Twitter Sentiment Classification on Sanders

More information

Matching Similarity for Keyword-Based Clustering

Matching Similarity for Keyword-Based Clustering Matching Similarity for Keyword-Based Clustering Mohammad Rezaei and Pasi Fränti University of Eastern Finland {rezaei,franti}@cs.uef.fi Abstract. Semantic clustering of objects such as documents, web

More information

Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features

Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features Measuring the relative compositionality of verb-noun (V-N) collocations by integrating features Sriram Venkatapathy Language Technologies Research Centre, International Institute of Information Technology

More information

Modeling function word errors in DNN-HMM based LVCSR systems

Modeling function word errors in DNN-HMM based LVCSR systems Modeling function word errors in DNN-HMM based LVCSR systems Melvin Jose Johnson Premkumar, Ankur Bapna and Sree Avinash Parchuri Department of Computer Science Department of Electrical Engineering Stanford

More information

A Review: Speech Recognition with Deep Learning Methods

A Review: Speech Recognition with Deep Learning Methods Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 4, Issue. 5, May 2015, pg.1017

More information

Greedy Decoding for Statistical Machine Translation in Almost Linear Time

Greedy Decoding for Statistical Machine Translation in Almost Linear Time in: Proceedings of HLT-NAACL 23. Edmonton, Canada, May 27 June 1, 23. This version was produced on April 2, 23. Greedy Decoding for Statistical Machine Translation in Almost Linear Time Ulrich Germann

More information

Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments

Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments Product Feature-based Ratings foropinionsummarization of E-Commerce Feedback Comments Vijayshri Ramkrishna Ingale PG Student, Department of Computer Engineering JSPM s Imperial College of Engineering &

More information

Writing quality predicts Chinese learning

Writing quality predicts Chinese learning Read Writ (2015) 28:763 795 DOI 10.1007/s11145-015-9549-0 Writing quality predicts Chinese learning Connie Qun Guan Charles A. Perfetti Wanjin Meng Published online: 20 February 2015 Springer Science+Business

More information

Meta Comments for Summarizing Meeting Speech

Meta Comments for Summarizing Meeting Speech Meta Comments for Summarizing Meeting Speech Gabriel Murray 1 and Steve Renals 2 1 University of British Columbia, Vancouver, Canada gabrielm@cs.ubc.ca 2 University of Edinburgh, Edinburgh, Scotland s.renals@ed.ac.uk

More information