Analysis of Error Count Distributions for Improving the Postprocessing Performance of OCCR


Yue-Shi Lee and Hsin-Hsi Chen
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, Taiwan, R.O.C.
{leeys,

Submitted on 23 July, 1996, Revised on 17 November, 1996 and Accepted on 29 November, 1996

Abstract

Contextual language processing plays an important role in the postprocessing of OCR, and its effects have been demonstrated by many proposed systems. In general, it performs well. However, its performance is not as good as expected when the test data contain more unseen context, e.g., proper nouns such as personal names and organizational names. This paper addresses the importance of analyzing the error count distributions before applying the language models. According to the analysis, more than 50% of errors can be eliminated and more than 90% of processing time can be saved on average, based on the Markov character bigram model.

Keywords: Contextual Language Processing, Error Count Distributions, Image Processing, Markov Model, OCCR, Unseen Context

1 Introduction

To improve the interface with computers, input devices such as optical character recognition (OCR) devices and speech recognition (SR) devices are being developed. An OCR device is a good choice when printed documents are available. However, optical Chinese character recognition (OCCR) is an extremely challenging task because of the multiple fonts, the complex character shapes and the very large vocabulary (see Footnote 1). Because misrecognitions in image processing are hard to avoid, contextual postprocessing (language processing) of the recognition results is indispensable, both for reducing the recognition errors caused by the preprocessing (image processing) and for saving time in human proofreading.

Contextual language processing for the postprocessing of OCR is not new. Shyu, et al. [1] apply a word-lattice-based Markov character bigram model suggested by [2] to the OCCR system. Chou and Chang [3] use a Markov word unigram model and a confusion matrix to decide the most plausible characters. Chang and Chen [4] combine a noisy channel model and a language model to implement the postprocessing of OCCR. Araki, et al. [5, 6] propose a selective error-correction method to detect and correct erroneous characters in Japanese text input through an OCR. Shinghal [7] and Sinha and Prasada [8] also propose approaches for English.

The purpose of contextual language processing is to find the most plausible candidate for each image character, i.e., the candidate with the maximum likelihood probability. The above approaches claim that it plays an important role and has a large effect in the postprocessing of OCR systems. In general, it performs well. However, its performance is not as good as expected when the test data contain more unseen context, e.g., proper nouns such as personal names and organizational names. Besides, some frequently used characters are always selected by the language models, but they may be wrong in some cases. Therefore, if we can predict which image characters have been recognized correctly by the image processing module, these problems can be alleviated. That is, it is important to analyze the recognition results before applying the language models.

This paper is organized as follows. Section 2 presents our OCCR system. Section 3 introduces the language models used in this paper. Section 4 describes the analysis methods and demonstrates the proposed methods. Section 5 gives the concluding remarks.

2 System Description

The proposed system shown in Fig. 1 consists of two major modules: (1) the Image Processing Module (Preprocessing) and (2) the Language Processing Module (Postprocessing). The image processing module contains three submodules: (1) Image Segmentation, (2) Feature Extraction, and (3) Feature Matching.

An optical scanner scans the printed document and converts it into an image document. The image segmentation submodule segments the entire image into blocks and then classifies each block as a text, graphic or picture block. The text blocks are then further segmented into individual character blocks. Each character block stands for an image character. Fig. 2(a) shows a simplified example of an image document. After the image segmentation submodule is applied, the segmented image document is shown in Fig. 2(b).

Once the image characters have been segmented properly, the feature extraction submodule extracts the features from each image character. In the feature matching stage, the extracted features of each individual image character are matched against a feature database to recognize the character. The top ten candidates for each image character, which form a candidate set, are generated for the subsequent language processing. Fig. 3 shows the candidates for each image character of Fig. 2(b).

Footnote 1: There are about 13,000 characters in Chinese.

Fig. 1. Block Diagram of the Proposed System. [Figure labels: Image Document; Image Processing Module (Preprocessing) with Image Segmentation, Feature Extraction and Feature Matching; Image base; Feature Extraction Dictionary; Feature base; Language Processing Module (Postprocessing) with Analysis of Error Count Distributions and Markov Character Bigram Model; Character Bigram Table; Character Unigram Table; Text Document.]

Fig. 2. An Example of Image Segmentation. [Panels (a) and (b); panel label: Character Blocks.]

Fig. 3. The Top Ten Candidates for Each Image Character of Fig. 2(b). [Figure content not preserved in this transcription.]

In Fig. 3, a number follows each candidate. The number indicates the error count between the current image character and the image character stored in the image database, according to the features. The lower the error count, the more similar the two image characters are. Thus, the first candidate of each image character is the most plausible candidate according to the image processing module. The error count can be used to calculate the probability of each candidate given the image character. Given the k-th image character $i_k$, the matching score of the j-th candidate $c_j$ ($SCORE_j$) is defined as follows [9, 10].

[The defining equation for $SCORE_j$, which is expressed in terms of the error counts of the candidates, is not legible in this transcription.]

Based on the definition of $SCORE_j$, the probability of the j-th candidate $c_j$ given the image character $i_k$ is calculated as follows.

$$P(c_j \mid i_k) = \frac{SCORE_j}{\sum_{m=1}^{10} SCORE_m}$$

During the language processing stage, the analysis of error count distributions and the Markov character bigram model are used together to deal with the recognition errors caused by the image processing module and to yield the final text document. This paper focuses on the postprocessing of the recognition, especially on the analysis of error count distributions.
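As an illustration of turning error counts into candidate probabilities, the following minimal sketch normalizes per-candidate scores over a candidate set. The SCORE_j definition is not reproduced here, so `inverse_score` is a hypothetical stand-in (any positive score that falls as the error count rises would do), and the function names and sample error counts are assumptions made for the example.

```python
from typing import Callable, List, Sequence


def inverse_score(error_count: float) -> float:
    """Hypothetical stand-in score that decreases as the error count grows.

    This is NOT the paper's SCORE_j definition; it only supplies positive
    scores so that the normalization step can be demonstrated.
    """
    return 1.0 / (1.0 + error_count)


def candidate_probabilities(
    error_counts: Sequence[float],
    score: Callable[[float], float] = inverse_score,
) -> List[float]:
    """Normalize the scores so they sum to 1 over the candidate set."""
    scores = [score(e) for e in error_counts]
    total = sum(scores)
    return [s / total for s in scores]


# Illustrative error counts for three candidates of one image character.
probs = candidate_probabilities([2400, 2950, 3100])
```

Whatever score function is plugged in, the candidate with the lowest error count receives the highest probability, which matches the statement that the first candidate is the most plausible one for the image processing module.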

3 Language Models in OCCR

The problem of OCCR can be defined as how to convert a sequence of image characters $I$ into the corresponding sequence of characters $\hat{C}$ correctly based on the language models. In this paper, a statistical Markov character bigram model is adopted to improve the recognition rate. Let $I = \langle i_1, i_2, i_3, \ldots, i_n \rangle$ be an image character string and $C = \langle c_1, c_2, c_3, \ldots, c_n \rangle$ be one of the possible character strings. Here, $c_i$ denotes one of the characters in the i-th candidate set. The conversion can be formulated as follows.

$$\hat{C} = \arg\max_{C} P_{IP}(C \mid I)\, P_{LP}(C) \qquad (1)$$

The former probability, $P_{IP}(C \mid I)$, is produced by the image processing module, and the latter probability, $P_{LP}(C)$, is calculated by the language processing module. If the contextual information, i.e., $P_{LP}(C)$, is ignored, the above formula becomes as follows.

$$\hat{C} = \arg\max_{C} P_{IP}(C \mid I) \qquad (2)$$

$P_{IP}(C \mid I)$ is defined as follows.

$$P_{IP}(C \mid I) = \prod_{j=1}^{n} P_{IP}(c_j \mid i_j)$$

The definition of $P_{IP}(c_j \mid i_j)$, i.e., the probability of candidate $c_j$ given the image character $i_j$, is described in Section 2. With Formula 2, the first candidate, which has the lowest error count, is always selected as the result. If more than one candidate has the same error count, the most frequently used character is selected as the result through dictionary lookup (see Fig. 1). Similarly, if $P_{IP}(C \mid I)$ is ignored in Formula 1, the formula becomes as follows.

$$\hat{C} = \arg\max_{C} P_{LP}(C) \qquad (3)$$

In this paper, $P_{LP}(C)$ is simplified as a Markov character bigram model, shown below.

$$P_{LP}(C) = \prod_{i=1}^{n+1} P(c_i \mid c_{i-1})$$

In this formula, $c_0$ and $c_{n+1}$ mark the beginning and the ending of the character string, respectively. According to the above formulas, the preliminary results are shown in Table 1.

Table 1. The Preliminary Results

Article   Correct Rate for Formula 1   Correct Rate for Formula 2   Correct Rate for Formula 3
1         (missing)                    98.48%                       88.75%
2         (missing)                    96.57%                       88.56%
3         (missing)                    97.00%                       90.19%
4         (missing)                    97.28%                       89.47%
5         (missing)                    96.51%                       87.89%
6         (missing)                    94.76%                       89.16%
Total     97.84%                       96.83%                       88.97%

In these experiments, an unsegmented Chinese newspaper corpus is adopted as the source of the training data for the Markov character bigram probabilities. It includes approximately 360,000 sentences (about 4,000,000 characters). The test data (6 articles) are scanned from the Liberty Times and include 237 sentences (2457 characters).

Table 1 makes clear that using only the contextual information (Formula 3) to select the most plausible candidate brings no advantage in these experiments. This is because the image processing module already performs very well, and the test data (news) contain many proper nouns, such as personal names and organizational names, which are difficult for the language models to resolve. Besides, some frequently used characters are always selected by Formula 3, but they may be wrong in some cases. Because Formula 1 combines $P_{IP}(C \mid I)$ and $P_{LP}(C)$, these effects are alleviated. However, they still have some influence, as the subsequent analysis will demonstrate.

The preliminary results for Formula 2 are discussed in detail, and the statistical information is shown in Table 2.

Table 2. The Statistical Information of the Preliminary Results for Formula 2
[Columns: Article; Correctly Recognized Image Characters; Wrongly Recognized Image Characters; Correct within the Top Ten Candidates. Last row: Total. The numeric entries are not preserved in this transcription.]

In the above table, Article 1 has 5 image characters wrongly recognized by Formula 2. That is, the first candidate is not the correct result for these five image characters. However, 4 of the 5 can be found within the top ten candidates. From Table 2, 84.62% (66/78) of the wrongly recognized image characters can be recovered by using the characters within the top ten candidates. This is encouraging, provided that the contextual information can be successfully applied to the wrongly recognized positions.
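The maximization in Formula 1 can be sketched with dynamic programming over the candidate sets. The paper does not state how the search is carried out; the Viterbi-style pass below is a standard choice for bigram models and is used here as an assumption. The names `decode`, `BOS`, `EOS` and the toy probability table are hypothetical and serve only the illustration.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Candidate = Tuple[str, float]  # (character, P_IP(c | i)); illustrative type alias

# Hypothetical boundary markers standing in for c_0 and c_{n+1}.
BOS, EOS = "<s>", "</s>"


def decode(
    candidate_sets: Sequence[Sequence[Candidate]],
    bigram: Callable[[str, str], float],
) -> List[str]:
    """Return the string maximizing prod P_IP(c_j | i_j) * prod P(c_i | c_{i-1}).

    All probabilities must be strictly positive (e.g. smoothed), because the
    search works in log space. Characters are assumed unique within a set.
    """
    # best[c] = (log-probability of the best path ending in c, that path)
    best: Dict[str, Tuple[float, List[str]]] = {BOS: (0.0, [])}
    for cands in candidate_sets:
        nxt: Dict[str, Tuple[float, List[str]]] = {}
        for ch, p_ip in cands:
            score, path = max(
                (
                    (s + math.log(bigram(prev, ch)) + math.log(p_ip), p)
                    for prev, (s, p) in best.items()
                ),
                key=lambda t: t[0],
            )
            nxt[ch] = (score, path + [ch])
        best = nxt
    # Close the string with the end marker before picking the winner.
    _, path = max(
        ((s + math.log(bigram(prev, EOS)), p) for prev, (s, p) in best.items()),
        key=lambda t: t[0],
    )
    return path


# Toy bigram table (made-up values); unseen pairs get a small floor value.
toy = {
    (BOS, "A"): 0.5, (BOS, "B"): 0.5,
    ("A", "X"): 0.9, ("A", "Y"): 0.1,
    ("B", "X"): 0.1, ("B", "Y"): 0.9,
    ("X", EOS): 0.5, ("Y", EOS): 0.5,
}


def toy_bigram(prev: str, cur: str) -> float:
    return toy.get((prev, cur), 1e-6)


sets = [[("A", 0.6), ("B", 0.4)], [("Y", 0.55), ("X", 0.45)]]
result = decode(sets, toy_bigram)
```

In this toy input, taking the first candidate at every position (Formula 2) would give A followed by Y, whereas the combined score prefers A followed by X because the bigram table strongly favors X after A.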
Tables 3 and 4 show the detailed statistical information of the preliminary results for Formulas 1 and 3, respectively.

Table 3. The Detailed Statistical Information of the Preliminary Results for Formula 1
[Columns: Article; Correct; Wrong; CC; CW; WC; WW; Net Gain. Last row: Total. The numeric entries are not preserved in this transcription.]

Table 4. The Detailed Statistical Information of the Preliminary Results for Formula 3
[Columns: Article; Correct; Wrong; CC; CW; WC; WW; Net Gain. Last row: Total. The numeric entries are not preserved in this transcription.]

In these two tables, Correct (Wrong) denotes the number of correctly (wrongly) recognized image characters (see Footnote 2). Columns 4, 5, 6 and 7 indicate the performance changes from the image processing module (preprocessing) to the language processing module (postprocessing). The changes are classified into four types: Correct-to-Correct (CC), Correct-to-Wrong (CW), Wrong-to-Correct (WC) and Wrong-to-Wrong (WW). In the CW type, an image character that is correctly recognized by the image processing module is changed to a wrong one by the language processing module. In the WC type, a wrongly recognized character is recovered to the correct one by the language processing module. In the CC type, no characters are changed. In the WW type, a wrongly recognized character is either left unchanged or changed to another wrong one. The performance of the language processing module can be evaluated by the net gain, defined as follows.

Net Gain = WC - CW

In Table 4, the Net Gains are all negative. This reveals that the language processing module cannot be applied effectively to OCCR when $P_{IP}(C \mid I)$ is ignored. Moreover, the Net Gains of Articles 1 and 2 in Table 3 are also negative, even though $P_{IP}(C \mid I)$ is combined with $P_{LP}(C)$. In Table 3 (4), 32 (247) image characters that are correctly recognized by the image processing module are changed to wrong ones by the language processing module. However, 57 (54) image characters that are wrongly recognized by the image processing module are recovered to the correct ones by the language processing module. Because Table 2 shows that at most 66 wrongly recognized characters can be recovered by the language processing module, the language processing module performs well at these wrongly recognized positions.
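The CC/CW/WC/WW bookkeeping and the net gain can be sketched as follows. The function names and the five-character toy strings are illustrative assumptions; the last two lines reuse the CW and WC totals quoted above for Tables 3 and 4.

```python
from collections import Counter
from typing import Mapping, Sequence


def transition_counts(
    truth: Sequence[str], pre: Sequence[str], post: Sequence[str]
) -> Counter:
    """Count CC, CW, WC and WW transitions position by position.

    truth: reference characters; pre: output of the image processing module;
    post: output after language processing. All three must be aligned.
    """
    counts = Counter({"CC": 0, "CW": 0, "WC": 0, "WW": 0})
    for t, a, b in zip(truth, pre, post):
        before = "C" if a == t else "W"
        after = "C" if b == t else "W"
        counts[before + after] += 1
    return counts


def net_gain(counts: Mapping[str, int]) -> int:
    """Net Gain = WC - CW."""
    return counts["WC"] - counts["CW"]


# Toy example: one recovery (WC), one new error (CW), one unrecovered error (WW).
toy_counts = transition_counts("ABCDE", "ABXDY", "ABCZY")

# Totals quoted in the text for Formula 1 (Table 3) and Formula 3 (Table 4).
gain_formula_1 = net_gain({"WC": 57, "CW": 32})
gain_formula_3 = net_gain({"WC": 54, "CW": 247})
```

With 78 errors before language processing, these net gains of 25 and -193 correspond to the 53 and 271 errors reported for Formulas 1 and 3.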
Therefore, if we can predict which positions have been correctly recognized by the image processing module, the first candidate can be selected directly as the result at those positions. The other candidates (from the second candidate to the tenth candidate) can be removed from the candidate set and need not be tried by the language processing module. In this way, the Net Gain can be turned into a positive value and the benefit of the language processing module becomes visible. In the next section, we describe how to predict whether a position is correctly or wrongly recognized by the image processing module.

Footnote 2: Correct = CC + WC; Wrong = CW + WW.

4 Analysis of Error Count Distributions

To decide which image characters have been recognized correctly by the image processing module, the only information that we can use is the error count of each candidate. In this paper, an image character is assumed to be correctly recognized by the image processing module based on the following two hypotheses.

(1) The error count of the first candidate in the candidate set must be less than a threshold value A.
(2) The difference of the error count between the first candidate and the second candidate in the candidate set must be greater than a threshold value B.

Table 5 shows the error count distribution for the first hypothesis.

Table 5. The Error Count Distribution for the First Hypothesis
[Columns: The Range of the Error Count for the First Candidate; Correctly Recognized Image Characters; Wrongly Recognized Image Characters. The ten ranges and the numeric entries are not preserved in this transcription.]

In Table 5, Column 1 indicates the range of the error count for the first candidate. Column 2 (3) indicates the number of image characters that are correctly (wrongly) recognized by the image processing module under the condition in Column 1. For example, there are 407 correctly recognized image characters and 6 wrongly recognized image characters when the error counts of their first candidates are between 2500 and [upper bound missing in this transcription]. From Table 5, we find that an image character is correctly recognized by the image processing module if the error count of its first candidate is less than 2500, i.e., threshold value A. That is, a total of 234 positions can be correctly detected. For these positions, the first candidate can be selected as the correct result and the other candidates are not tried by the language processing module. However, this hypothesis alone yields only a small improvement, because it covers just 9.52% of the 2457 positions (image characters).

Table 6 shows the error count distribution for the second hypothesis. In Table 6, Column 1 indicates the range of the difference of the error count between the first candidate and the second candidate in the candidate set. Column 2 (3) indicates the number of correctly (wrongly) recognized image characters under the condition in Column 1. For example, there are only 136 correctly recognized image characters and 63 wrongly recognized image characters when the difference of the error count between the first candidate and the second candidate is less than 200, i.e., threshold value B.
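The two hypotheses can be sketched as a pruning step applied to each candidate set before language processing. How the two tests are combined is not spelled out in the text, so the `require_both` flag is an assumption that exposes both readings (either test suffices, or both must hold); the function names and sample error counts are likewise hypothetical.

```python
from typing import List, Sequence


def is_trusted(
    error_counts: Sequence[int], a: int, b: int, require_both: bool = False
) -> bool:
    """Apply Hypotheses (1) and (2) to one candidate set.

    error_counts must be sorted in ascending order (first candidate first)
    and contain at least two entries.
    """
    first, second = error_counts[0], error_counts[1]
    h1 = first < a             # Hypothesis (1): first error count below A
    h2 = (second - first) > b  # Hypothesis (2): gap to second candidate above B
    return (h1 and h2) if require_both else (h1 or h2)


def prune(
    candidates: Sequence[str],
    error_counts: Sequence[int],
    a: int,
    b: int,
    require_both: bool = False,
) -> List[str]:
    """Keep only the first candidate at trusted positions, else keep them all."""
    if is_trusted(error_counts, a, b, require_both):
        return list(candidates[:1])
    return list(candidates)


# Thresholds taken from the examples in the text: A = 2500, B = 200.
kept_low_count = prune(["A", "B", "C"], [2400, 2450, 2900], a=2500, b=200)
kept_large_gap = prune(["A", "B", "C"], [2600, 3000, 3100], a=2500, b=200)
kept_uncertain = prune(["A", "B", "C"], [2600, 2700, 2900], a=2500, b=200)
```

Positions whose candidate set shrinks to a single character no longer contribute alternatives to the language model search, which is where the reduction in processing time comes from.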

Table 6. The Error Count Distribution for the Second Hypothesis
[Columns: The Range of the Difference of the Error Count; Correctly Recognized Image Characters; Wrongly Recognized Image Characters. The five ranges and the numeric entries are not preserved in this transcription.]

That is, if we assume that the first candidate is correct when the difference of the error count is greater than 200, a total of 2258 positions are identified. Of these, 2243 are correct and 15 (78-63) are wrong. That is, these 15 wrongly recognized characters are identified as correct under this hypothesis. However, only 199 (136+63) positions have to be considered further. This analysis is clearly useful, because most of the characters are identified correctly in advance. Tables 7 and 8 show the experimental results for Formulas 1 and 3 based on the two hypotheses.

Table 7. The Experimental Results for Formula 1 Based on Two Hypotheses
[Surviving headers: Correct; Wrong; Total. The numeric entries are not preserved in this transcription.]

Table 8. The Experimental Results for Formula 3 Based on Two Hypotheses
[Surviving headers: Correct; Wrong; Total. The numeric entries are not preserved in this transcription.]

In Tables 7 and 8, A and B denote the threshold values for Hypothesis I and Hypothesis II, respectively. The performance in these two experiments increases considerably. For example, the original Markov character bigram model based on Formula 1 (3) makes 53 (271) errors. After the analysis, the recognition errors fall to 30 (53) under the selected threshold values A and B [the specific values are missing in this transcription]. That is, 43.40% (80.44%) of the errors are eliminated by the analysis.

The threshold values A and B in Hypotheses I and II depend strongly on the quality of the printed documents. They do not depend on the type and domain of the context. Another 7 printed documents are also scanned for testing. The experimental results are shown in Tables 9 and 10.

Table 9. The Experimental Results before the Analysis
[Surviving headers: Correctly; Wrongly; Correctly; Wrongly; Total. The numeric entries are not preserved in this transcription.]

Table 10. The Experimental Results after the Analysis
[Surviving headers: Correctly; Wrongly; Correctly; Wrongly; Total. The numeric entries are not preserved in this transcription.]

The threshold values A and B are set to 2500 and 300, respectively. The experimental results are clearly similar to the previous ones.
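The percentages quoted for the first experiment follow from the error counts and the 2457 test characters. The short check below redoes that arithmetic; the function names are illustrative.

```python
def correct_rate(total: int, errors: int) -> float:
    """Percentage of correctly recognized characters."""
    return 100.0 * (total - errors) / total


def error_reduction(errors_before: int, errors_after: int) -> float:
    """Percentage of errors removed by the analysis."""
    return 100.0 * (errors_before - errors_after) / errors_before


TOTAL_CHARACTERS = 2457  # size of the first test set

rate_formula_1 = round(correct_rate(TOTAL_CHARACTERS, 53), 2)   # Formula 1, 53 errors
rate_formula_2 = round(correct_rate(TOTAL_CHARACTERS, 78), 2)   # Formula 2, 78 errors
rate_formula_3 = round(correct_rate(TOTAL_CHARACTERS, 271), 2)  # Formula 3, 271 errors

reduction_formula_1 = round(error_reduction(53, 30), 2)   # 53 errors down to 30
reduction_formula_3 = round(error_reduction(271, 53), 2)  # 271 errors down to 53
```

The rounded rates agree with the Total row of Table 1 (97.84%, 96.83% and 88.97%), and the two reductions agree with the 43.40% and 80.44% stated in the text.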
Without the analysis, the correct rates for Formulas 1 and 3 are 96.81% and 86.49%, respectively. With the analysis, the correct rates for Formulas 1 and 3 are 98.40% and 97.54%, respectively. That is, 50.00% and 81.82% of the errors are eliminated by the analysis for Formulas 1 and 3, respectively. Besides, the processing time is also reduced after applying the analysis. Without the analysis, the processing speed is 1.67 characters per second. With the analysis, the processing speed becomes [value missing in this transcription] characters per second on a PC-486/DX. That is, the analysis saves 92.26% of the time on average.

5 Concluding Remarks

A standard approach to reducing the recognition errors caused by the preprocessing, i.e., image processing, is to use corpus-based language models in the postprocessing, i.e., language processing. This paper proposes the analysis of error count distributions to alleviate the problems caused by contextual language processing. The experimental results show that the analysis can eliminate more than 50% of the errors and save more than 90% of the time on average, based on the Markov character bigram model. Besides, this simple but effective analysis can also be applied to other natural language applications such as speech recognition [2] and handwriting recognition [9, 10, 11].

References

[1] K.H. Shyu, et al., "An OCR Based Translation System between Simplified and Complex Chinese," Computer Processing of Chinese and Oriental Languages, Vol. 9, No. 1.
[2] L.S. Lee, et al., "Golden Mandarin (II) - An Improved Single-Chip Real-Time Mandarin Dictation Machine for Chinese Language with Very Large Vocabulary," Proceedings of ICASSP.
[3] B.H. Chou and J.S. Chang, "The Language Models in Optical Chinese Character Recognition," Proceedings of ROCLING V.
[4] J.S. Chang and S.D. Chen, "The Postprocessing of Optical Character Recognition Based on Statistical Noisy Channel and Language Model," Proceedings of PACLIC.
[5] T. Araki, S. Ikehara, et al., "An Evaluation of a Method to Detect and Correct Erroneous in Japanese Input through an OCR Using Markov Models," Proceedings of Applied Natural Language Processing.
[6] T. Araki, S. Ikehara, et al., "An Evaluation to Detect and Correct Erroneous Wrongly Substituted, Deleted and Inserted in Japanese and English Sentences Using Markov Models," Proceedings of COLING.
[7] R. Shinghal, "A Hybrid Algorithm for Contextual Text Recognition," Pattern Recognition, Vol. 16, No. 2.
[8] R.M.K. Sinha and B. Prasada, "Visual Text Recognition Through Contextual Processing," Pattern Recognition, Vol. 21, No. 5.
[9] H.J. Lee, C.H. Tung and C.H. Chang Chien, "A Markov Model in Handwritten Chinese Text Recognition," Proceedings of ICDAR.
[10] C.H. Tung and H.J. Lee, "Increasing Character Recognition Accuracy by Detection and Correction of Erroneously Identified," Pattern Recognition, Vol. 27, No. 9.
[11] C.H. Chang, "Word Class Discovery for Postprocessing Chinese Handwriting Recognition," Proceedings of COLING.


More information
