An Integrated System for Polytonic Greek OCR

1 An Integrated System for Polytonic Greek OCR I. Generating the Data Bruce Robertson, Dept. of Classics, Mount Allison University, New Brunswick, Canada Digital Classicist Seminar, Institute of Classical Studies, London, UK, July 19, 2013

2 I. Generating the Data A. Reason

6 Why Ancient Greek OCR?
1. Rapid digitization of Greek texts not yet in digital libraries
2. Study of textual variants and the apparatus criticus
3. Text reuse analysis
4. General-purpose OCR search, like Google Books

7 Use Manual Editing or Automatic Spellchecking?
Use cases compared: Digitization, Textual Variants, Text Reuse, OCR Search

9 I. Generating the Data B. Challenge

10 Example Text: John 1:1 Ἐν ἀρχῇ ἦν ὁ Λόγος, καὶ ὁ Λόγος ἦν πρὸς τὸν Θεόν, καὶ Θεὸς ἦν ὁ Λόγος.

11 Acute, Grave and Circumflex Accents Ἐν ἀρχῇ ἦν ὁ Λόγος, καὶ ὁ Λόγος ἦν πρὸς τὸν Θεόν, καὶ Θεὸς ἦν ὁ Λόγος.

12 Smooth and Rough Breathing Marks Ἐν ἀρχῇ ἦν ὁ Λόγος, καὶ ὁ Λόγος ἦν πρὸς τὸν Θεόν, καὶ Θεὸς ἦν ὁ Λόγος.

13 Iota Subscript Ἐν ἀρχῇ ἦν ὁ Λόγος, καὶ ὁ Λόγος ἦν πρὸς τὸν Θεόν, καὶ Θεὸς ἦν ὁ Λόγος.

14 Diversity of Greek Fonts in 19th C. archive.org Texts

16 Recognizing Lines

18 Accurate Binarization

19 Binarization Important to Results

20 I. Generating the Data C. Resources

21 Contextless 'Greekness' Index
- Devised by Dr. Boschetti
- Based on dictionary and likely sequences of letters, etc.
- Named 'B-score' in these slides

22 Archive.org Provides:
- Thousands of volumes rendered in high-resolution (400 ppi +) colour images
- OCR results from ABBYY FineReader
  - Excellent Latin-script recognition
  - Poor Greek results
  - Top-quality line-segmentation

23 Open-source OCR Engines
- Gamera: current focus of my team
- Tesseract: Nick White has worked extensively on this to generate good results
- OCRopus: Dr. Boschetti has recently been able to use Tesseract training sets for this engine

24 Interchange Format: HOCR
    <p>
      <span class="ocr_line" title="bbox ">
        <span class="ocr_word" title="bbox " xml:lang="grc">νυ ν</span>
        <span class="ocr_word" title="bbox " xml:lang="grc">δ'</span>
        <span class="ocr_word" title="bbox " xml:lang="grc">ω ρθωσας</span>
      </span>
    </p>
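To make the interchange format concrete, here is a minimal sketch of reading word text, language, and bounding box from an hOCR fragment with the Python standard library. The sample fragment, its bounding-box numbers, and the function names `parse_bbox` and `hocr_words` are invented for illustration; they are not taken from Rigaudon.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the predefined XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment; the bbox coordinates are made up.
SAMPLE = """<p>
 <span class="ocr_line" title="bbox 10 20 300 48">
  <span class="ocr_word" title="bbox 10 20 60 48" xml:lang="grc">λόγος</span>
  <span class="ocr_word" title="bbox 70 20 95 48" xml:lang="grc">δ'</span>
  <span class="ocr_word" title="bbox 105 20 300 48" xml:lang="la">et</span>
 </span>
</p>"""


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from a title such as 'bbox 10 20 60 48'."""
    for prop in title.split(";"):
        parts = prop.split()
        if parts and parts[0] == "bbox" and len(parts) >= 5:
            return tuple(int(v) for v in parts[1:5])
    return None


def hocr_words(fragment):
    """Collect text, language and bounding box of every ocr_word span."""
    root = ET.fromstring(fragment)
    words = []
    for span in root.iter("span"):
        if span.get("class") == "ocr_word":
            words.append({
                "text": span.text or "",
                "lang": span.get(XML_LANG),
                "bbox": parse_bbox(span.get("title", "")),
            })
    return words
```

Keeping the bounding box with each word is what lets later steps compare words found "in the same physical location" on different result pages.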

25 I. Generating the Data D. Method

26 Rigaudon Greek OCR Process
Inputs
- Images from Archive.org (JP2 input)
- ABBYY OCR information file, converted from ABBYY to HOCR; page segmentation is taken from this HOCR input
Steps
1. Gamera image recognition, using the Greek OCR for Gamera library with classifiers for Teubner Sans, Teubner Serif, and Oxford (Loeb, etc.); parallel process x35 cores
2. HOCR results at a range of binarization thresholds
3. Boschetti scoring, giving a score table for binarization thresholds
4. HOCR "blending": select the highest-ranking binarization page; replace non-dictionary words with dictionary words from other binarization pages
5. Automatic OCR spellcheck: reduction to unique Greek strings; weighted Levenshtein automatic OCR spellchecker (x14 cores), using a 410,000-word dictionary from open Perseus Greek texts and a weighted edit table for the classifier; per-volume spellcheck table; replace spellchecked words
6. HOCR Latin / Greek combining: does the ABBYY OCR file contain Latin-script output? If so, replace Latin-script output words with ones in the same position from Archive's ABBYY output
7. HOCR output

27 Raw HOCR Production Using Gamera
- Plugin for Gamera OCR allows it to import high-quality line-segmentation information, compensating for Gamera's poor results in this critical function
- Plugin to output HOCR
- Wrapper function generates a range of output pages based on binarization threshold (typically per page)
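The idea of producing one result page per binarization threshold can be sketched as below. This is only an illustration with a plain global threshold on a grayscale image held as nested lists. The `recognize` callable stands in for the OCR engine and is hypothetical, as are the function names; the slides do not say which binarization method the wrapper uses.

```python
def binarize(gray, threshold):
    """Global threshold: pixels darker than the threshold become ink (1)."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def threshold_sweep(gray, thresholds, recognize):
    """Run the (hypothetical) recognizer once per threshold.

    Returns a dict mapping each threshold to that recognizer's output,
    i.e. one candidate result page per threshold.
    """
    return {t: recognize(binarize(gray, t)) for t in thresholds}
```

For example, `threshold_sweep(image, range(96, 192, 16), recognize)` would yield six candidate pages for later scoring.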

28 HOCR 'Blending'
- This step aims to gather, word by word, the 'best' results from the range of result pages for each image
- Selects the highest-scoring result page overall
- Where a Greek word in this page is not in the dictionary and another page has a dictionary word in the exact same physical location, it replaces the word with the dictionary word
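The blending rule described above can be sketched as follows. The data layout (a dict from threshold to a list of `(bbox, word)` pairs, plus a dict of page scores) and the name `blend` are assumptions made for this example. As a simplification, the sketch treats every word alike rather than testing whether it is Greek, and it takes the first dictionary word it finds at a location.

```python
def blend(pages, scores, dictionary):
    """Blend result pages produced at different binarization thresholds.

    pages:      {threshold: [(bbox, word), ...]}
    scores:     {threshold: page score}, higher is better
    dictionary: set of accepted words
    """
    best = max(scores, key=scores.get)

    # Dictionary words found on the other pages, keyed by exact location.
    alternatives = {}
    for threshold, words in pages.items():
        if threshold == best:
            continue
        for bbox, word in words:
            if word in dictionary:
                alternatives.setdefault(bbox, word)

    blended = []
    for bbox, word in pages[best]:
        if word not in dictionary and bbox in alternatives:
            word = alternatives[bbox]
        blended.append((bbox, word))
    return blended
```

Matching on the exact bounding box is only possible because every result page shares the same imported line and word segmentation.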

29 Automatic Spellcheck
- All pages in a volume are reduced to a set of unique, decomposed Greek strings
- These are compared to the dictionary using Levenshtein distances
- A 'weighting table', suitable for a given font, indicates which edits are preferable or allowed
- The result is 'light' correction, especially of diacritics
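"Decomposed" here presumably refers to Unicode canonical decomposition, in which each accent or breathing mark becomes a separate combining character, so that a wrong diacritic costs a single edit. A small sketch under that assumption, with illustrative function names:

```python
import unicodedata


def unique_decomposed(words):
    """Reduce a word list to its unique NFD (decomposed) forms."""
    return sorted({unicodedata.normalize("NFD", w) for w in words})


def strip_diacritics(word):
    """Drop all combining marks, leaving only the base letters."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Spell-checking the unique strings once per volume, rather than every occurrence on every page, keeps the number of dictionary comparisons small.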

30 Automatic Spellcheck Weighting Table
    ['replace', ur'ϲϲ', ur'σςσ', 1],#for lunate fonts
    ['replace', ur'c', ur'σς', 1],#for lunate fonts
    ['replace', ur't', ur'ττ', 1],
    ['replace', ur'τr', ur'τ', 1],
    ['replace', ur'uu', ur'υ', 1],
    ['replace', ur'y', ur'υ', 1],
    ['replace', ur'e', ur'ε', 1],
    ['replace', ur'e', ur'ε', 2],
    ['replace', ur'z', ur'ζ', 1],
    ['replace', ur'k', ur'κκ', 1],
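One way such a table could drive a weighted Levenshtein comparison is sketched below. It assumes each entry means (operation, OCR string, intended string, weight), which the slide does not state, and it is not Rigaudon's implementation. The `default_cost` for edits not listed in the table and the simplified `(old, new, cost)` rule tuples are likewise assumptions.

```python
def weighted_distance(source, target, rules, default_cost=3):
    """Edit distance where listed substring replacements are cheap.

    rules: iterable of (old, new, cost); `old` in the OCR string may be
    rewritten as `new` in the dictionary word for `cost`. Any other
    insertion, deletion or substitution costs `default_cost`.
    """
    n, m = len(source), len(target)
    inf = float("inf")
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            cur = d[i][j]
            if cur == inf:
                continue
            if i < n and j < m:  # match or plain substitution
                step = 0 if source[i] == target[j] else default_cost
                d[i + 1][j + 1] = min(d[i + 1][j + 1], cur + step)
            if i < n:  # deletion from the OCR string
                d[i + 1][j] = min(d[i + 1][j], cur + default_cost)
            if j < m:  # insertion into the OCR string
                d[i][j + 1] = min(d[i][j + 1], cur + default_cost)
            for old, new, cost in rules:  # weighted table entries
                if old and new and source.startswith(old, i) and target.startswith(new, j):
                    ti, tj = i + len(old), j + len(new)
                    d[ti][tj] = min(d[ti][tj], cur + cost)
    return d[n][m]


RULES = [("uu", "υ", 1), ("y", "υ", 1), ("e", "ε", 1), ("z", "ζ", 1)]
```

With these rules, a Latin `e` read in place of `ε` costs 1, while an arbitrary wrong letter costs 3, so only the confusions typical of a font lead to automatic corrections.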

31 Optionally injecting Greek into Original Latin HOCR
- Don't want to try to get excellent Greek and Latin results, especially when ABBYY and others do a better job with Latin
- In the case that archive.org provides Latin OCR: if Rigaudon's output word is Greek, replace archive.org's ABBYY output word with Rigaudon's
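A sketch of this combining step, assuming both outputs have been reduced to a mapping from word position to text. The mapping layout and the names `is_greek` and `inject_greek` are illustrative. Whether a word counts as Greek is decided here from Unicode character names, which is one possible test and not necessarily the one Rigaudon applies.

```python
import unicodedata


def is_greek(word):
    """True if the word has letters and all of them are Greek."""
    letters = [ch for ch in word if ch.isalpha()]
    return bool(letters) and all(
        "GREEK" in unicodedata.name(ch, "") for ch in letters
    )


def inject_greek(abbyy_words, rigaudon_words):
    """Keep ABBYY's words, except where Rigaudon read a Greek word.

    Both arguments map a word position (e.g. a bounding box) to text.
    """
    merged = dict(abbyy_words)
    for position, word in rigaudon_words.items():
        if position in merged and is_greek(word):
            merged[position] = word
    return merged
```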

32 Reporting

33 I. Generating the Data F. Results

34 Results Περι δε δικαιοσυ νης και α δικι ας... τε τυγχα νουσιν ου σαι πρα ξεις, και ποι α... κο τος α λλ' ου τη ς κομιζου σης φυ σεως κτη μα... τπ ου ν, ω ς ε ν ε ργον η μι ν ε πιβα λλει μο νον...

35 Results πιπλαττομε νω[ν η μη κ[αλ]ω ς α λλως προσ[τιθεμε - κρα τος τ' ι σο ψυχον ε κ γυναικω ν καρδιο δηκτον ε μοι κρατυ νεις.

36 I. Generating the Data G. Future

37 Multiple OCR Engines
- OCR with several engines (OCRopus, Gamera, Tesseract), each giving HOCR results at a range of binarization thresholds, followed by the same chain: Boschetti scoring, score table for binarization thresholds, HOCR "blending" (select highest-ranking binarization page; replace non-dictionary words with dictionary words from other binarization pages)
- Take ABBYY data out of the process: with 'cleaning', Tesseract's line-segmentation is often as good
- Use Nick White's general-purpose polytonic classifier and ones specifically designed for a font

38 Resources
Output:
Code:
Further Topics
- HPC Computing with Grid Engine
- Python Flask Web Microframework
- Making Book Images

39 An Integrated System for Generating and Correcting Polytonic Greek OCR Bruce Robertson and Federico Boschetti Part II The Proof-reading Process Federico Boschetti ILC-CNR of Pisa Digital Classicist Seminars London, 19 July 2013 Federico Boschetti Generating and Correcting Polytonic Greek OCR 1/ 20

40 Introduction
Manual corrections on OCR output may be performed by:
- Experts: classicists devoted to proof-reading for a long-term project
- Data entry firms: professional proof-readers not skilled in the target language(s)
- Crowd sourcing: students who are learning the target language(s)
- Random volunteers: people with heterogeneous education and skills

41 Introduction
For this reason, proof-reading tools focused on ancient languages should be:
- centralized
- easy to use
- based on line-by-line image / text comparison
- optimized to draw attention to possible errors, distinguished by category
- efficient in providing the most probable correction

42 Overview
1. Information Aggregation
   - Enriched hOCR files
   - Alignment with other editions
   - False negatives
2. Proof-reader Web Application
3. False positives

43 Enriched hOCR files
OCR output formatted in the hOCR microformat
- The hOCR output produced by Rigaudon is post-processed in order to add information managed by the Proof-reading Web Application
Multiple sources
- Dictionaries with and without diacritics
- Multiple editions of the same work (if available)
- Syllabic repertory

44 Dictionaries
- In order to identify possible errors and provide good suggestions to correct them, the OCR output is spell-checked and the potential errors are processed step by step
- The spell-checker is based on dictionaries generated from the Perseus text collection
- An upper-case dictionary is used to determine whether a character sequence is a word with a wrong accent or breathing mark
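The upper-case dictionary check can be illustrated as follows, on the assumption that "upper-case" means the word with its diacritics removed and its letters capitalized. The function names and the three result labels are invented for this sketch.

```python
import unicodedata


def bare_upper(word):
    """Upper-case form of a word with all combining marks removed."""
    decomposed = unicodedata.normalize("NFD", word)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare.upper()


def classify(word, dictionary):
    """Return (label, suggestions) for one OCR character sequence."""
    nfc = lambda w: unicodedata.normalize("NFC", w)
    known = {nfc(w) for w in dictionary}
    if nfc(word) in known:
        return "accepted", []
    # Same letters as a dictionary word, different accent or breathing.
    key = bare_upper(word)
    suggestions = sorted(w for w in known if bare_upper(w) == key)
    if suggestions:
        return "diacritic_error", suggestions
    return "unknown", []
```

For instance, with a dictionary containing only χορός, the sequence χόρος is reported as a diacritic error with χορός as the suggestion.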

45 Alignment with other editions
When another edition of the same work is available, the two editions are aligned word by word applying the Needleman-Wunsch algorithm:
    ὁ Γαδαρεὺς ἐν ταῖς Χάρισιν ἐπιγραφομέναις ἔφη τὸν Ομηρον Σύρον ὄντα τὸ
    ὁ Γαδαρεὺς ἐν τ αχς Σάρισιν ἐπιγραφομέναις ἔφη τὸν Ομeρον Σύρον ὄντα τὸ
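A minimal word-level Needleman-Wunsch alignment is sketched below. The scoring values (match 1, mismatch -1, gap -1) are assumptions chosen for illustration, since the slides do not give the parameters actually used.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two word lists; None marks a gap in either one."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        return match if a[i - 1] == b[j - 1] else mismatch

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs
```

Applied to the two lines above split on whitespace, the identical words anchor the alignment, and pairs such as Χάρισιν / Σάρισιν come out as candidate corrections.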

46 False negatives and the risk of digital contaminatio
An example
- Rigaudon, on the Anecdota Graeca edited by Cramer, recognizes the word χόρος, which is rejected by the current spell-checker
- The spell-checker suggests χορός as a correction
- The alignment with Koster's edition of the Prolegomena de comoedia also suggests χορός
- But the page image contains χόρος, a late form attested from Athenaeus to the Byzantine period

47 Syllabication
- In order to prevent false negatives due to (rare) variants ignored by the dictionaries, the system distinguishes between random character sequences and well-formed syllabic sequences
- Each potential error is divided into syllables and each syllable is evaluated according to its position
- For example, χό-ρος is a well-formed syllabic sequence: χό- is a valid initial Greek syllable and -ρος is a valid final Greek syllable
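The position-sensitive check can be sketched as below. The sketch assumes the word has already been divided into syllables and that the syllabic repertory is a mapping from position ("initial", "medial", "final") to the set of syllables seen there. Both the layout and the tiny repertory are invented for illustration.

```python
def positions(index, count):
    """Positions a syllable occupies; a lone syllable is both ends."""
    found = []
    if index == 0:
        found.append("initial")
    if index == count - 1:
        found.append("final")
    return found or ["medial"]


def is_well_formed(syllables, repertory):
    """True if every syllable is attested in its position."""
    if not syllables:
        return False
    count = len(syllables)
    return all(
        syllable in repertory[position]
        for index, syllable in enumerate(syllables)
        for position in positions(index, count)
    )


# Toy repertory; a real one would be derived from the dictionaries.
REPERTORY = {
    "initial": {"χό", "λό", "χο"},
    "medial": {"γο"},
    "final": {"ρος", "γος", "ρός"},
}
```

Here `is_well_formed("χό-ρος".split("-"), REPERTORY)` holds, so a sequence like χόρος can be kept as a possible rare variant instead of being treated as noise.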

48 Overview
1. Information Aggregation
2. Proof-reader Web Application
   - The web interface
   - Cues
   - Self-corrections
3. False positives

49 Centralization
- The proof-reader is a web application inspired by the interface of the Mozilla hOCR Editor, but it follows the collaborative philosophy of WikiSource
- Texts are stored in a central native XML database

50 The Control Panel

51 Image / Text Pairs

52 Cues
- Wrong accents and breathing marks: attention is focused on diacritics
- Self-corrections: special care is necessary to avoid the risk of contaminatio
- Errors: random errors

53 Example

54 Self-corrections and suggestions generated by alignment
In a self-correction, the reading has been replaced by the aligned word of another edition. A self-correction requires three conditions:
- the character sequence is rejected by the spell-checker
- the edit distance between the character sequence and the aligned word of the other edition is very small
- the character sequence is not a well-formed syllabic sequence
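The three conditions can be combined as in the sketch below. The limit `max_distance=2` is an assumed reading of "very small", and `well_formed` is a hypothetical callable standing in for the syllabic check. Neither is specified in the slides.

```python
def levenshtein(a, b):
    """Plain edit distance with unit costs."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # deletion
                cur[j - 1] + 1,            # insertion
                prev[j - 1] + (ca != cb),  # substitution or match
            ))
        prev = cur
    return prev[-1]


def should_self_correct(ocr_word, aligned_word, dictionary, well_formed,
                        max_distance=2):
    """Apply the three conditions for replacing an OCR reading."""
    return (
        ocr_word not in dictionary
        and levenshtein(ocr_word, aligned_word) <= max_distance
        and not well_formed(ocr_word)
    )
```

Under these assumptions Ομeρον would be replaced by the aligned Ομηρον, while a well-formed sequence such as χόρος would be left for the proof-reader, which guards against contaminatio.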

55 Example

56 Dynamic Dictionaries
- Dictionaries used by the spell-checker are dynamically rebuilt when a milestone in proof-reading is reached
- As the dictionaries grow, rare variants are acquired and used to spell-check subsequent works

57 Overview
1. Information Aggregation
2. Proof-reader Web Application
3. False positives

58 False positives are deceitful
- By definition, false positives pass the spell-checking
- Especially when they are graphically similar to the correct word, such as δ and ὁ in Greek or m and ni in Latin, they are difficult to spot, in particular for proof-readers not skilled in the target language(s)

59 Example

61 Semantic Distance
- Semantic distance is calculated along the nodes of WordNet's hierarchy, i.e. along the chain of hyponyms / hypernyms, in order to reach co-hyponyms
- Different translations of the same concept (e.g. vis in Latin and efficacia in Italian or efficacy in English) have a semantic distance equal to zero
- Semantically unrelated words (e.g. vinum in Latin and efficacia in Italian) have a large semantic distance
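A path-based distance of this kind can be sketched with a toy hierarchy. The synset names and hypernym links below are invented and much smaller than any real WordNet. The sketch also assumes each word maps to one synset and each synset has at most one hypernym.

```python
def ancestors(synset, hypernym):
    """The synset followed by its chain of hypernyms up to the root."""
    chain = [synset]
    while synset in hypernym:
        synset = hypernym[synset]
        chain.append(synset)
    return chain


def semantic_distance(word1, word2, synset_of, hypernym):
    """Number of hypernym links on the path between two words' synsets.

    Returns None if the two synsets share no ancestor.
    """
    chain1 = ancestors(synset_of[word1], hypernym)
    chain2 = ancestors(synset_of[word2], hypernym)
    depth2 = {synset: depth for depth, synset in enumerate(chain2)}
    for depth1, synset in enumerate(chain1):
        if synset in depth2:
            return depth1 + depth2[synset]
    return None


# Toy data: translations of one concept share a synset.
SYNSET_OF = {"vis": "power", "efficacia": "power", "efficacy": "power",
             "vinum": "wine"}
HYPERNYM = {"power": "quality", "quality": "abstraction",
            "abstraction": "entity", "wine": "alcohol",
            "alcohol": "beverage", "beverage": "substance",
            "substance": "entity"}
```

In this toy hierarchy vis and efficacia are at distance 0, whereas vinum and efficacia only meet at the root, giving a distance of 7.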

62 AncientWordNet
- Synsets of AncientGreekWordNet and LatinWordNet have been extracted from bilingual dictionaries
- They are aligned to modern languages such as English, Italian, etc.

63 Conclusion
- The proof-reading Web Application brings together the main features of the individual and collaborative proof-reading tools currently available
- The entire work-flow is circular: Training OCR - Performing OCR - Spell-checking OCR - Correcting OCR - Enlarging dictionaries - Retraining OCR

64 Thank you for your attention


More information
