Major objectives of the ELRC

The European Language Resource Coordination Consortium (ELRC)
A service contract for the EC
Khalid CHOUKRI, ELRA/ELDA, choukri@elda.org
On behalf of the CEF ELRC team: ELDA, ILSP, TILDE, DFKI, TAUS

Key messages on the specifics of the ELRC work
Now that Josef has set the scene:
- What, specifically, are our major goals?
- Instruments and approaches to implement them
- Timeline
  - Preparing the ground (<M6)
  - Data collections (<M24)
- Why we cannot achieve them without you
- If we do so, how can we serve the EU and our community (R&D, industry, MS)?
- How can we sustain the process and the efforts?

Our major goals
Golden goal: LR identification and collection
- Secure 200+ language resources suitable for use within assisted translation (AT/MT), and sustain the process
Instruments: awareness and info/data sharing
- Establish and staff a helpdesk
- Connect with national bodies and prepare the ground
- Organize one seminar per country (30+) to improve readiness and ability to contribute data
- Establish a pipeline to collect data
- Work out jointly a win-win deal for all parties

Preparing the ground: helpdesk and support
Set up and run a technical-legal helpdesk that will help with all queries regarding language resource identification, preparation, processing and sharing:
- Technical aspects, such as formatting, encoding, metadata usage, metadata conversion to LOD, packaging, uploading and maintenance
- Basic data processing, such as data cleaning, alignment, annotation schemas, data validation and processing evaluation
- Legal aspects of data sharing, ranging from licensing models and IPR clearing to data anonymization and confidentiality
- Administrative issues

Preparing the ground: the ELRC website
- Key information about the CEF and ELRC in all CEF languages
- Overview, materials, findings, registration and calendar of the ELRC events (workshops and conferences)
- Latest news and activities related to the ELRC
- Support for social media: Facebook, Twitter and LinkedIn
- Feedback gathering facility
- Discussion forum
- Connection to the helpdesk and FAQ for ELRC-related issues
- Language resource repository facility: process, upload, access
- Document repository

Preparing the ground: country-specific training workshops
Objective: identify decision makers and evangelize them about
- Multilinguality
- The useful role and support of MT
- The data requirements of current MT technologies
How:
- Organize circa 30 training workshops in EU Member States and CEF-affiliated states
- Organize them in cooperation with local partners to seek a multiplier effect (e.g. national anchor points, but also DGT local branches)
- Provide a high level of localisation and adaptation of material, with local speakers
Targeted audience:
- Mostly decision makers in national "publications" offices and DSI-like managers
- Producers and right owners of public sector and similar resources that are valuable for MT

Preparing the ground: country-specific training workshops
Planned/expected outcomes:
- Identifying and reaching local stakeholders
- Opening the doors
- Boosting data awareness, including the PSI directive and logistic/legal aspects
- Emphasizing the benefit for data providers (better AT/MT for "my" language)
- Identification of (usable) data collections and their (right) holders
- Local support for a concerted pan-European action on MT/AT orchestrated by ELRC; AT.DSI as a web service/API
- Identification of digital services that can adopt MT/AT technologies on the local scene (early adopters)

(Tentative) organization of the schedule of workshops
The workshops will proceed in parallel:
- Initial schedule of the workshops in a rehearsal/initial phase (3-4 workshops); these workshops will serve as pilots for the second and broader phase
- Analysis of the outcome and re-tuning of the workshop material and approach
- Running the second round of the workshops (26+)
Geographic areas and respective responsibilities:
- Area 1, under the responsibility of Tilde: Latvia, Lithuania, Estonia, Finland, Sweden, Norway, Denmark, Iceland
- Area 2, under the responsibility of ELDA: France, Spain, Portugal, Italy, Malta, Belgium, the Netherlands, Luxembourg, Germany, United Kingdom, Ireland
- Area 3, under the responsibility of ILSP: Greece, Cyprus, Bulgaria, Romania, Croatia, Slovenia, Austria, Czech Republic, Slovakia, Hungary, Poland
With the logistic support of DFKI and TAUS.

The success of workshops
- Run all workshops within the next 6 months!
- The attractiveness of the workshops (how many attendees, how many key players, how many LR right holders)
- Good feedback from the surveys
- How efficient the outcomes are in terms of LR sets identified and secured
- Agreement in principle on LR donation
- Adoption/deployment of AT/MT within the national offices
- Report at the second ELRC Conference (likely at LREC 2016)

Collect LR data sets
- Identify and collect data sets, based on the preparatory action and on the stakeholder and data leads from the previous task
- Target 200+ new data sets
- Align with the EC on required data properties: domain, type and quality
- Set up an LR collection team (from both the consortium and identified contacts)
- Collect data:
  » Identify and prioritize the sources of the data (online versus offline)
  » Use traditional approaches to obtain data (partners' expertise)
  » Use new approaches, i.e. online sources will be crawled (tools from Panacea, QTLaunchPad, TaaS, etc.)
  » Use the best crawlers that can identify parallel and comparable data in specific domains of knowledge
- Perform basic data cleaning, pre-processing, formatting, conversion and alignment where required; (automatic) quality review and maintenance
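The "basic data cleaning" step for aligned segments could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name clean_pairs, the length-ratio threshold and the sample sentences are hypothetical and are not taken from the ELRC tooling.

```python
import re

def clean_pairs(pairs, max_ratio=3.0):
    """Normalise, filter and de-duplicate (source, target) segment pairs.

    Hypothetical helper: max_ratio is an assumed threshold, not an ELRC value.
    """
    seen = set()
    cleaned = []
    for src, tgt in pairs:
        # Collapse runs of whitespace and trim both sides.
        src = re.sub(r"\s+", " ", src).strip()
        tgt = re.sub(r"\s+", " ", tgt).strip()
        # Drop pairs where one side is empty.
        if not src or not tgt:
            continue
        # Very unbalanced lengths often indicate a misalignment.
        ratio = max(len(src), len(tgt)) / min(len(src), len(tgt))
        if ratio > max_ratio:
            continue
        # Keep only the first occurrence of each pair.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        cleaned.append((src, tgt))
    return cleaned

# Illustrative input only.
sample = [
    ("Public  sector information", "Informations du secteur public"),
    ("Public sector information", "Informations du secteur public"),
    ("Yes", ""),
    ("OK", "Ceci est une phrase beaucoup trop longue pour ce segment"),
]
print(clean_pairs(sample))
```

In practice the filters and thresholds would have to be agreed per language pair and per data type; the point here is only that each rule is simple and automatable.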

Collect LR data sets
- Add some (basic) documentation and capture the necessary metadata
- Ensure legal issues are cleared (IPR, licensing, etc., where necessary)
- Set up and run a rigorous data quality control system using automated tools (based on ELRA quality control methodologies) and manual sampling spot checks
- Set up shared reporting tools
- Record and track progress against target
- Set up and operate storage and distribution mechanisms (META-SHARE)
- Strong partnership with the EU Open Data Portal (a sharing channel)
- Regular and final delivery of data to the EC
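Manual sampling spot checks and progress tracking against the 200+ target can be sketched as follows. This is a minimal illustration, not the ELRA quality control methodology: the function names, the sample size and the fixed seed are assumptions made for the example.

```python
import random

def spot_check_sample(segments, sample_size, seed=0):
    """Draw a reproducible random sample of segments for manual review.

    Hypothetical helper: a fixed seed lets two reviewers obtain the same sample.
    """
    rng = random.Random(seed)
    k = min(sample_size, len(segments))
    return rng.sample(segments, k)

def progress(collected, target=200):
    """Return the fraction of the target reached, capped at 1.0."""
    return min(collected / target, 1.0)

# Illustrative data only.
segments = [f"segment {i}" for i in range(1000)]
to_review = spot_check_sample(segments, 50)
print(len(to_review), "segments selected for manual review")
print(f"{progress(50):.0%} of the data set target reached")
```

Sampling without replacement keeps every reviewed segment distinct, and capping the progress value avoids reporting more than 100% once the target is exceeded.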

Data quality and validation
- Data quality vs. metadata quality and documentation
- Identify sources of data (quality tag!): some are more reliable than others (official administration vs. blogs from public officers)
- Automatically assess the confidence in genre, domain and language register
- (Automatically) identify parallel/comparable data levels
- Define and assess the quality of data and metadata
- Assess the openness of the associated licenses
- Etc.

Data: special issues
- Data is not in digital format, or not in a known format! ( ) shapes and forms
- We have translated texts but the sources are lost
- No one knows what the rights are or who the right holders are
- Data is text, lists, monolingual, bilingual... but what can we donate?
- We only have PDF, OCRed data, web pages in HTML, Word documents, PDF documents, Excel sheets
- Maybe even better: translated texts, translation memories, etc. Have you ever had text translated by an external company? Who are they? Who owns what?
- What about personal information in the data? ELRC will remove this and many other things
- Unexpected issues
- PSI is a directive: enforcement, but not our way of operating
- Other legal issues, ethics, etc.

Success of the data collection task
- How many new resources have been identified per quarter?
- How many new resources have been secured for use within AT.DSI?
- How many are made widely available?
- The MT@EC deployment at local administrations, API/web services for us
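The per-quarter indicators above can be derived from a simple log of resources. The sketch below assumes a hypothetical record layout (name, date, status) and invented example entries; none of these come from the ELRC reporting tools.

```python
from collections import Counter
from datetime import date

def quarter_label(d):
    """Map a date to a label such as '2015-Q2'."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"

def count_by_quarter(records, status):
    """Count the records with the given status, grouped by quarter."""
    return Counter(
        quarter_label(r["date"]) for r in records if r["status"] == status
    )

# Hypothetical log entries, for illustration only.
records = [
    {"name": "resource A", "date": date(2015, 5, 12), "status": "identified"},
    {"name": "resource B", "date": date(2015, 6, 3), "status": "identified"},
    {"name": "resource C", "date": date(2015, 8, 20), "status": "identified"},
    {"name": "resource A", "date": date(2015, 9, 1), "status": "secured"},
]

print(count_by_quarter(records, "identified"))
print(count_by_quarter(records, "secured"))
```

Logging one entry per status change keeps "identified" and "secured" counts separate, which matches the two questions asked on the slide.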

Concluding message: failure is not an option
A joint effort to support the EU and MS, but also our languages and hence our innovative players.

Timeline of the action
Two major phases:
- Setting the ground (<M6)
  - Identify contacts and connections
  - Helpdesk
  - Workshops
- LR collection phase (up to M24)
Information dissemination: web, conferences, social networks
Urgent actions:
- Tune the messages for our contacts and stakeholders
- Seminars before summer
- Rehearsal
With your input and involvement.

The official task list
- Task 1: Secretariat of the Language Resource Coordination (DFKI)
- Task 2: Technical Helpdesk for Language Resource provision (ELDA)
- Task 3: Language Resource Board (DFKI)
- Task 4: Website (Tilde)
- Task 5: Conferences (DFKI)
- Task 6: Targeted country-specific training workshops (ELDA)
- Task 7: Language Resource data sets (ELDA)
- Task 8: Advisory and consultancy services (DFKI)

PSI: the amended Directive
The main changes in the amended Directive are to:
- require public sector bodies (PSBs) to allow the re-use of existing and generally accessible information they create, collect or hold. The effect of this was to make re-use mandatory in most cases.
- extend its scope to cover PSI held by public sector museums, libraries (including university libraries) and archives in making their information available for re-use.
- introduce the general principle that charges for re-use should normally be set at marginal cost, with exceptions in certain circumstances.
- introduce a redress mechanism for complaints by re-users, operated by an impartial review body with the power to make binding decisions.