Supervised Methods for Automatic Acronym Expansion in Medical Text. Mahesh Joshi


Supervised Methods for Automatic Acronym Expansion in Medical Text
Mahesh Joshi
Department of Computer Science, University of Minnesota Duluth
Summer 2005 Intern, Division of Biomedical Informatics, Mayo Clinic
25th August 2005

Overview
Background: The Problem; Supervised Learning Methods; Related Work
Methods: Training Data Generation; Feature Engineering
Results
Summary

Terminology
Abbreviation: a shortened form of a written word or phrase used in place of the whole, e.g. AcG for accelerator globulin.
Acronym: a word formed from the initial letter or letters of each of the successive parts or major parts of a compound term, e.g. CC for common cold.
Every acronym is an abbreviation, but not vice versa.
(Definitions from the Merriam-Webster Online Dictionary, http://www.m-w.com/)

The Problem
Acronyms and abbreviations are widely used in clinical notes, and their widespread use for various terms gives rise to ambiguity among them. For example, AC can mean:
Antitussive with Codeine: a cough medicine and/or a pain reliever
Acromioclavicular: relating to the joint formed between the acromion and the clavicle
Acid Controller: a drug used to treat peptic ulcers, gastritis and esophageal reflux
or any of the 3 different senses we have encountered.

Information Retrieval
Ambiguity among acronyms can be a significant problem in medical information retrieval (IR). In IR, augmenting a search query with acronyms of the search terms can enhance performance. Consider the following numbers, obtained from 7,56,336 notes representing 993,72 patients, for the example of ACA:
ACA only: 5,483 notes (2,543 patients)
"adeno carcinoma" or "adenocarcinoma" only: 299,74 notes (66,57 patients)
ACA and ("adeno carcinoma" or "adenocarcinoma"): ,29 notes (88 patients)

Information Retrieval (continued)
For the example of DJD:
DJD only: 75,956 notes (6,43 patients)
"degenerative joint disease" only: 225,859 notes (78,428 patients)
DJD and "degenerative joint disease": 9,349 notes (2,856 patients)
Augmenting the search with acronyms adds ~2% (5483 / 29974) and ~77% (75956 / 225859) more documents to the original search results for ACA and DJD respectively, increasing the sensitivity or recall of the search.

The Problem (continued)
Ambiguity of acronyms can degrade this performance by bringing down the specificity or precision of the search. ACA, for example, has 7 possible senses, and the extra 5,483 notes could contain the term ACA with any of those senses. Methods for automatic acronym expansion can therefore be employed for intelligent indexing of documents containing acronyms.
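The augmentation step described above amounts to OR-ing each search term with its known acronym or variant forms. A minimal sketch in Python, assuming a hand-built term-to-variant map (the mapping below is illustrative, not the Mayo sense inventory):

```python
# Illustrative acronym / variant map (not the actual Mayo sense inventory).
EXPANSIONS = {
    "adenocarcinoma": ["ACA", "adeno carcinoma"],
    "degenerative joint disease": ["DJD"],
}

def augment_query(term):
    """Build a boolean OR query that also matches known acronyms/variants of the term."""
    variants = [term] + EXPANSIONS.get(term.lower(), [])
    return " OR ".join(f'"{v}"' for v in variants)

print(augment_query("degenerative joint disease"))
# "degenerative joint disease" OR "DJD"
```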

A Solution
Treat automatic acronym expansion like word sense disambiguation (WSD): use the surrounding context of the acronym to decide the correct sense, just like a human would.
"The Robitussin AC doesn't affect his cough much" - antitussive with codeine
"History of left supraspinatus tear and DJD of the left AC joint" - acromioclavicular
"Pepcid AC two every day" - acid controller
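Since the decision is driven by the words around the acronym, the first step is simply to pull out that context. A minimal Python sketch, with an illustrative tokenizer and window size (not the ones used in this work):

```python
import re

def context_window(text, acronym, size=5):
    """Return up to `size` tokens on each side of the first occurrence of `acronym`."""
    tokens = re.findall(r"[A-Za-z']+", text)   # naive word tokenizer for illustration
    positions = [i for i, t in enumerate(tokens) if t == acronym]
    if not positions:
        return []
    i = positions[0]
    return tokens[max(0, i - size):i] + tokens[i + 1:i + 1 + size]

# Example from the slide: the context strongly suggests the "antitussive with codeine" sense.
print(context_window("The Robitussin AC doesn't affect his cough much", "AC"))
# ['The', 'Robitussin', "doesn't", 'affect', 'his', 'cough', 'much']
```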

Supervised Learning Methods
Supervised learning is the state of the art and a very popular approach to WSD, yielding high accuracy on this task. These methods initially require a set of manually classified, or sense-tagged, examples known as the training data. Using some learning algorithm and features derived from the training data, they generate a classifier, which can then be used to classify future instances of test data.

What do the algorithms learn?
Each training instance (AC 1 through AC 5 in the example) is represented by indicator features for context words such as Robitussin, cough, supraspinatus, joint and Pepcid, together with its sense label (A, B, C, A, B). The algorithms learn which context words are predictive of which sense.
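To make the "learning" concrete, here is a small sketch using scikit-learn (not the WEKA setup used in the study): each training instance is the bag of words around AC, labelled with its sense, and the fitted model predicts the sense of a new context from the words they share. The five toy instances and the sense labels A/B/C loosely mirror the AC example above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data: context words around AC -> sense label
# (A = antitussive with codeine, B = acromioclavicular, C = acid controller).
contexts = [
    "Robitussin cough",          # sense A
    "supraspinatus joint",       # sense B
    "Pepcid two every day",      # sense C
    "cough syrup with codeine",  # sense A
    "left joint tenderness",     # sense B
]
senses = ["A", "B", "C", "A", "B"]

clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(contexts, senses)

# A new, unseen context is classified from the words it shares with the training data.
print(clf.predict(["persistent cough at night"]))   # -> ['A']
```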


Choice of algorithms
Support Vector Machines: introduced by Vapnik (1995); a discriminative method based on Perceptron learning.
The naïve Bayes classifier: based on the Bayes rule for conditional probabilities, with the simplifying assumption of conditionally independent features.
Decision trees: a divide-and-conquer strategy that forms a tree of yes/no questions based on the available features, with the crucial features near the root, selected using information gain measures.
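A hedged sketch of how the three families could be compared on the same bag-of-words features, using scikit-learn stand-ins (LinearSVC, MultinomialNB, an entropy-based DecisionTreeClassifier) rather than the WEKA implementations used in the study; `contexts` and `senses` are labelled instances like those in the previous sketch, assuming enough examples per sense for cross-validation.

```python
from sklearn.svm import LinearSVC
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

def compare_classifiers(contexts, senses, folds=10):
    """Cross-validate the three classifier families on bag-of-words context features."""
    candidates = {
        "SVM (linear)":  LinearSVC(),
        "naive Bayes":   MultinomialNB(),
        "decision tree": DecisionTreeClassifier(criterion="entropy"),  # information-gain style splits
    }
    for name, model in candidates.items():
        pipe = make_pipeline(CountVectorizer(), model)
        scores = cross_val_score(pipe, contexts, senses, cv=folds)
        print(f"{name:15s} mean accuracy = {scores.mean():.3f}")
```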

Related Work
Liu et al. (JAMIA 2004): fully supervised approaches using a naïve Bayes classifier, decision lists, and their own adaptation of the decision list classifier.
Pakhomov (ACL 2002), Pakhomov et al. (AMIA 2005): unsupervised training data generation from Mayo clinical notes, the MEDLINE collection and the WWW, plus supervised disambiguation of abbreviations.

Mohammad and Pedersen (CoNLL 2004): employ unigram, bigram and syntactic features.
Pedersen (NAACL 2000): uses ensembles of multiple naïve Bayes classifiers trained on unigrams in various window sizes.

Training Data Generation
The biggest hurdle for supervised approaches is the lack of sufficient hand-labeled training data. In our case, the focus was on analyzing machine learning algorithms with respect to several types of features. Still, selecting the right kind of data for the annotation process, which is done by the medical data retrieval experts, is crucial.

Important Considerations
Choosing acronyms: practical importance, frequency, sense distribution.
Sense inventory: a list of possible expansions for the selected acronyms, drawn from UMLS-listed expansions in the LRABR table, Mayo Clinic approved expansions, and diagnosis codes from master-sheet data. Master-sheet entries are diagnostic statements about patients, and each entry is manually assigned an 8-digit diagnosis code from the Hospital Adaptation of the ICDA code (HICDA).
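One illustrative way to represent such a sense inventory in code; the record fields and source labels are assumptions (the actual inventory combines UMLS LRABR entries, Mayo-approved expansions and HICDA codes), and the AC expansions are the ones from earlier in the talk.

```python
# Illustrative shape of a sense-inventory entry; field names and "source" values are assumptions.
sense_inventory = {
    "AC": [
        {"expansion": "antitussive with codeine", "source": "UMLS LRABR"},
        {"expansion": "acromioclavicular",        "source": "Mayo approved list"},
        {"expansion": "acid controller",          "source": "Mayo approved list"},
    ],
}
```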

Acronym Finding
Initially, a set of 25 acronyms was identified using the UMLS sense inventory as reference; these had a highly skewed distribution in the Mayo data. We therefore used the Mayo master-sheet data (22,75,83 diagnosis statements) with the following criteria to select an acronym (a sketch of these criteria follows below):
It has two or more diagnosis codes associated with it in the master-sheet, where a diagnosis code is considered unique only if it differs from the others in the first five of its eight digits.
It has a relatively balanced distribution over the different diagnosis codes associated with it.
It is considered practically important by the medical data retrieval experts.
This identified 7 acronyms, which are being annotated.
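A sketch of the code-based part of these criteria, assuming the master-sheet can be read as (acronym, 8-digit HICDA code) pairs; the `min_codes` and `min_share` thresholds are invented stand-ins for "two or more codes" and "relatively balanced", and the expert-importance criterion is not automated here.

```python
from collections import Counter

def acronym_code_counts(mastersheet_rows):
    """mastersheet_rows: iterable of (acronym, eight_digit_hicda_code) pairs.
    Per acronym, count diagnosis codes keyed by their first five digits, since
    codes are treated as distinct only when those five digits differ."""
    per_acronym = {}
    for acronym, code in mastersheet_rows:
        per_acronym.setdefault(acronym, Counter())[code[:5]] += 1
    return per_acronym

def is_candidate(code_counts, min_codes=2, min_share=0.1):
    """At least min_codes distinct code prefixes, and no prefix so rare that the
    distribution is badly skewed (min_share is an illustrative balance threshold)."""
    total = sum(code_counts.values())
    return len(code_counts) >= min_codes and min(code_counts.values()) >= min_share * total
```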

Feature Engineering
Different types of features are used for WSD: bag of words in context; parts of speech of words in context; syntactic relationships (noun phrase, verb phrase, subject-object); collocations in context; symbolic knowledge from an ontology such as UMLS or WordNet; and discourse-level features such as section identifiers in clinical notes, e.g. CC (Chief Complaint), HPI (History of Present Illness).

Our features
Unigrams in a flexible window of 1 to 10 around the acronym.
Two-word collocations, i.e. bigrams, in a flexible window of 1 to 10.
Parts of speech of the two words to the left and right of the acronym.
Clinical note features: Service Code, representing the department where the patient was treated (Cardiology, Rheumatology, etc.); Gender Code; and Section Id. (A sketch of assembling these features follows below.)
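A sketch of building one instance's features. This version uses a plain fixed window for the unigrams and bigrams (the flexible windows used in the study count only significant words, as discussed later), and the clinical-note field names are illustrative.

```python
def instance_features(tokens, pos_tags, i, note_meta, window=2):
    """Feature dictionary for the acronym at position i in a tokenized clinical note.
    pos_tags are the note's POS tags; note_meta holds note-level attributes
    (the field names used here are illustrative)."""
    feats = {}
    left  = [t.lower() for t in tokens[max(0, i - window):i]]
    right = [t.lower() for t in tokens[i + 1:i + 1 + window]]
    for w in left + right:                      # unigrams in the window
        feats[f"uni={w}"] = 1
    for seq in (left, right):                   # two-word collocations (bigrams) on each side
        for a, b in zip(seq, seq[1:]):
            feats[f"bi={a}_{b}"] = 1
    for offset in (-2, -1, 1, 2):               # POS of two words to the left and right
        j = i + offset
        if 0 <= j < len(pos_tags):
            feats[f"pos[{offset}]={pos_tags[j]}"] = 1
    feats[f"service={note_meta.get('service_code')}"] = 1   # e.g. Cardiology
    feats[f"gender={note_meta.get('gender_code')}"] = 1
    feats[f"section={note_meta.get('section_id')}"] = 1     # e.g. CC, HPI
    return feats
```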

Why medical features might help
APC: Atrial Premature Contraction (Cardiology) vs. Argon Plasma Coagulation (Gastroenterology) - the Service Code might help.
AP: Angina Pectoris is more commonly diagnosed among the male population - the Gender Code might help.

Feature Identification Tools
Annotated XML file generation from clinical notes: UIMA (Unstructured Information Management Architecture), http://www.research.ibm.com/uima/
Tokenization and part-of-speech tagging: the ANNIE system (A Nearly-New Information Extraction system) in GATE (General Architecture for Text Engineering), http://gate.ac.uk
Unigram and bigram feature identification using a frequency cutoff and the log-likelihood measure: NSP (Ngram Statistics Package), http://ngram.sourceforge.net/
Machine learning algorithm implementations: WEKA (Waikato Environment for Knowledge Analysis), http://www.cs.waikato.ac.nz/ml/weka/
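NSP is a Perl package; the following Python sketch approximates the same idea, scoring a bigram with the G^2 log-likelihood statistic computed from its 2x2 contingency table and keeping it only if it also clears a frequency cutoff. The thresholds shown are illustrative, not those used in the study.

```python
from math import log

def log_likelihood(n11, n1p, np1, npp):
    """G^2 (log-likelihood ratio) score for a bigram from its counts:
    n11 = count(w1 w2), n1p = count(w1 as first word), np1 = count(w2 as second word),
    npp = total number of bigrams in the corpus."""
    n12 = n1p - n11
    n21 = np1 - n11
    n22 = npp - n1p - np1 + n11
    cells = ((n11, n1p, np1),
             (n12, n1p, npp - np1),
             (n21, npp - n1p, np1),
             (n22, npp - n1p, npp - np1))
    score = 0.0
    for observed, row_total, col_total in cells:
        expected = row_total * col_total / npp
        if observed > 0:
            score += observed * log(observed / expected)
    return 2.0 * score

# Keep a bigram only if it clears a frequency cutoff and a log-likelihood threshold
# (6.63 is roughly the chi-square critical value at p < 0.01 with 1 degree of freedom).
def keep_bigram(n11, n1p, np1, npp, min_freq=2, min_ll=6.63):
    return n11 >= min_freq and log_likelihood(n11, n1p, np1, npp) >= min_ll
```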

Results: Unigrams + Bigrams (accuracy %)

Acronym  Majority  C5.0  Maximum Entropy  Naïve Bayes  SVM    C4.5
AC       3.4       94.6  96.7             96.34        95.9   95.47
ACA      87.4      93.   97.              97.97        97.78  95.75
APC      42.3      9.7   95.9             92.82        93.9   89.89
CF       76.3      95.8  94.2             96.9         97.8   95.63
HA       92.3      94.7  95.8             97.45        96.27  94.7
LA       88.5      92.6  94.6             97.3         96.93  94.6
NSR      99.       98.8  99.              99.26        99.    99.
PE       48.3      9.8   93.3             9.56         9.9    9.94

Results: Unigrams + Bigrams + POS (accuracy %; for Naïve Bayes, SVM and C4.5 the first value adds POS, the second repeats the Unigrams + Bigrams result for comparison)

Acronym  Majority  C5.0  Maximum Entropy  Naïve Bayes    SVM            C4.5
AC       3.4       94.6  96.7             95.26 / 96.34  96.2 / 95.9    94.4 / 95.47
ACA      87.4      93.   97.              97.97 / 97.97  97.97 / 97.78  95. / 95.75
APC      42.3      9.7   95.9             93.9 / 92.82   93.9 / 93.9    9.43 / 89.89
CF       76.3      95.8  94.2             97.4 / 96.9    97.32 / 97.8   95.2 / 95.63
HA       92.3      94.7  95.8             97.84 / 97.45  96.7 / 96.27   94.7 / 94.7
LA       88.5      92.6  94.6             97.3 / 97.3    97.75 / 96.93  95.9 / 94.6
NSR      99.       98.8  99.              98.27 / 99.26  99. / 99.      99. / 99.
PE       48.3      9.8   93.3             92.29 / 9.56   92.87 / 9.9    9.52 / 9.94

Results: Unigrams + Bigrams + CF (accuracy %; same layout, with clinical features added instead of POS)

Acronym  Majority  C5.0  Maximum Entropy  Naïve Bayes    SVM            C4.5
AC       3.4       94.6  96.7             95.47 / 96.34  95.9 / 95.9    94.4 / 95.47
ACA      87.4      93.   97.              98.5 / 97.97   98.5 / 97.78   94.9 / 95.75
APC      42.3      9.7   95.9             93.9 / 92.82   93.35 / 93.9   9.43 / 89.89
CF       76.3      95.8  94.2             97.46 / 96.9   96.76 / 97.8   94.93 / 95.63
HA       92.3      94.7  95.8             97.84 / 97.45  97.45 / 96.27  94.89 / 94.7
LA       88.5      92.6  94.6             95.9 / 97.3    95.7 / 96.93   94.6 / 94.6
NSR      99.       98.8  99.              96.5 / 99.26   99. / 99.      99. / 99.
PE       48.3      9.8   93.3             92.29 / 9.56   93.45 / 9.9    9.52 / 9.94

Results: Unigrams + Bigrams + POS + CF (accuracy %; for Naïve Bayes, SVM and C4.5 the first value adds POS and clinical features, the second repeats the Unigrams + Bigrams result for comparison)

Acronym  Majority  C5.0  Maximum Entropy  Naïve Bayes    SVM            C4.5
AC       3.4       94.6  96.7             95.26 / 96.34  96.34 / 95.9   94.4 / 95.47
ACA      87.4      93.   97.              97.97 / 97.97  98.5 / 97.78   94.9 / 95.75
APC      42.3      9.7   95.9             93.35 / 92.82  93.62 / 93.9   9.43 / 89.89
CF       76.3      95.8  94.2             97.32 / 96.9   97.46 / 97.8   94.93 / 95.63
HA       92.3      94.7  95.8             97.84 / 97.45  97.64 / 96.27  94.89 / 94.7
LA       88.5      92.6  94.6             97.3 / 97.3    97.54 / 96.93  95.9 / 94.6
NSR      99.       98.8  99.              97.53 / 99.26  99.26 / 99.    99. / 99.
PE       48.3      9.8   93.3             93.6 / 9.56    93.45 / 9.9    9.52 / 9.94

[Figure: Fixed vs. flexible window, unigrams: average accuracy vs. window size for fixed and flexible unigram windows; average improvement .56 +- .6.]

[Figure: Fixed vs. flexible window, bigrams: average accuracy vs. window size for fixed and flexible bigram windows; average improvement 3.35 +- .46.]
[Figure: Fixed vs. flexible window, unigrams + bigrams: average accuracy vs. window size; average improvement 2.37 +- .83.]
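One plausible reading of "fixed vs. flexible", consistent with the later mention of "significant unigrams / bigrams": a fixed window takes the k adjacent positions, while a flexible window widens until it has collected k words that are actually in the selected feature set. The sketch below follows that interpretation and is an assumption, not a definition taken from the slides.

```python
def fixed_window(tokens, i, k):
    """The k token positions on each side of the acronym at index i, whatever they are."""
    return tokens[max(0, i - k):i] + tokens[i + 1:i + 1 + k]

def flexible_window(tokens, i, k, feature_words):
    """Collect up to k *significant* words (those in feature_words) on each side of
    index i, skipping tokens that are not features so they do not use up window slots."""
    def collect(indices):
        picked = []
        for j in indices:
            if tokens[j].lower() in feature_words:
                picked.append(tokens[j])
                if len(picked) == k:
                    break
        return picked
    left = collect(range(i - 1, -1, -1))[::-1]   # scan leftwards, then restore order
    right = collect(range(i + 1, len(tokens)))
    return left + right

# Example: with feature_words = {"cough", "codeine"}, a flexible window of size 1
# around "AC" picks up "cough" even though it is several positions away.
```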

[Figure: Learning curve, AC, unigrams: accuracy vs. window size for NB, SVM and C4.5.]
[Figure: Learning curve, AC, bigrams: accuracy vs. window size for NB, SVM and C4.5.]

[Figure: Learning curve, AC, unigrams + bigrams: accuracy vs. window size for NB, SVM and C4.5.]
[Figure: Learning curve, ACA, unigrams: accuracy vs. window size for NB, SVM and C4.5.]

[Figure: Learning curve, ACA, bigrams: accuracy vs. window size for NB, SVM and C4.5.]
[Figure: Learning curve, ACA, unigrams + bigrams: accuracy vs. window size for NB, SVM and C4.5.]

[Figure: Feature performance: average accuracy vs. window size for unigrams, bigrams and unigrams + bigrams.]
[Figure: Additional features: average accuracy for CF, POS and POS+CF added on top of windows of different sizes.]

Overall Classifier Performance

Classifier   Accuracy (%)  Training Time (s)  Testing Time (s)
Naïve Bayes  9.57 ± 5.97   .66 ± .4           7.62 ± 4.85
SVM          93.26 ± 4.85  .48 ± .94          .5 ± .
C4.5         9.33 ± 6.92   8.4 ± 6.2          .2 ± .

Findings
A window size beyond 3 significant unigrams / bigrams does not seem to improve performance substantially.
SVMs were able to make better use of complementary features.
Overall, two significant unigrams and bigrams on each side, together with POS and clinical features, performed well for all classifiers.

Outcomes
Development of an annotation infrastructure that we can pursue further for other acronyms / ambiguous terms.
A framework for experimentation with and testing of various supervised algorithms for WSD.
Uncovering the extent of the problem with acronym data generation from medical records.
The developed classifier models can be plugged into a UIMA-Weka interface.

Summary
Acronym disambiguation is an important aspect of automatic text analysis. Generating manually labeled training data for supervised methods is a complex task; semi-supervised methods are attractive from this perspective. Conventional WSD features perform quite well for acronym disambiguation, as expected, and domain-specific features like service code, gender code and section id improve results to some extent.

Acknowledgements
Dr. Serguei Pakhomov, for his continual support and advice, and for giving me the right level of independence in choosing the direction of the work.
Dr. Ted Pedersen and Dr. Richard Maclin from the University of Minnesota, Duluth, for their encouragement to pursue this internship and their invaluable guidance in research.
The medical data retrieval experts Barbara Abbot, Debra Albrecht and Pauline Funk, without whom this study would not have been possible at all!
Patrick Duffy, for his technical advice on various matters.
Dr. Guergana Savova and James Buntrock, for their feedback and questions that raised interesting issues.
Dr. Christopher G. Chute

References
Commission on Professional and Hospital Activities. Hospital Adaptation of ICDA. 2nd ed. Vol. 1. 1973, Ann Arbor, MI: Commission on Professional and Hospital Activities.
Liu H., Teller V. and Friedman C. A Multi-aspect Comparison Study of Supervised Word Sense Disambiguation. Journal of the American Medical Informatics Association (2004).
Mohammad S. and Pedersen T. Combining Lexical and Syntactic Features for Supervised Word Sense Disambiguation. In Proceedings of the Conference on Computational Natural Language Learning (CoNLL-2004).
Pakhomov S. Semi-Supervised Maximum Entropy Based Approach to Acronym and Abbreviation Normalization in Medical Texts. In Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics (ACL 2002).
Pakhomov S., Pedersen T. and Chute C. G. Abbreviation and Acronym Disambiguation in Clinical Discourse. To appear in the Proceedings of the 2005 Annual Symposium of the American Medical Informatics Association (AMIA 2005).
Pedersen T. A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation. In Proceedings of the First Meeting of the North American Chapter of the Association for Computational Linguistics (NAACL 2000).
Vapnik V. The Nature of Statistical Learning Theory. Springer (1995).

Software
General Architecture for Text Engineering (GATE): http://gate.ac.uk/. Cunningham H., Maynard D., Bontcheva K. and Tablan V. GATE: A Framework and Graphical Development Environment for Robust NLP Tools and Applications. Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics (ACL 2002).
Ngram Statistics Package (NSP): http://ngram.sourceforge.net/. Banerjee S. and Pedersen T. The Design, Implementation and Use of the Ngram Statistics Package. Proceedings of the Fourth International Conference on Intelligent Text Processing and Computational Linguistics (2003).
Unstructured Information Management Architecture (UIMA): http://www.research.ibm.com/uima/. Ferrucci D. and Lally A. UIMA: An Architectural Approach to Unstructured Information Processing in the Corporate Research Environment. Natural Language Engineering (2004).
Waikato Environment for Knowledge Analysis (WEKA): http://www.cs.waikato.ac.nz/ml/weka/. Witten I. and Frank E. Data Mining: Practical Machine Learning Tools with Java Implementations. Morgan Kaufmann (2000).