Proceedings of the 8th WSEAS International Conference on Applied Computer and Applied Computational Science


Boolean Conversion

Fengming M. Chang
Department of Information Science and Applications, Asia University
Wufeng, Taichung County, Taiwan
paperss@gmail.com

Abstract: - The Boolean Conversion (BC) is a novel method proposed in this paper to solve the problem of a large number of attributes in machine learning. A large number of data attributes easily causes a system to freeze or shut down, especially in neuro-fuzzy based learning. The purpose of BC is to reduce data dimensions by a binary, or Boolean, conversion process. All the attributes are preserved but combined into a few new attributes, instead of removing some attributes. Three data sets, nbuses, ACLP, and MONK, are offered in this study to test and compare the learning accuracies and learning times. The results indicate that the proposed BC can keep about the same level of accuracy but increase the learning efficiency.

Key-Words: - Boolean conversion, Large attribute, Machine learning, Neuro-fuzzy, Fuzzification

1 Introduction
Recent studies have shown wide applications of Artificial Intelligence (AI) to data classification and prediction [1-4], and many methods have been proposed. For most of the examples in previous studies, the number of input attributes is not large; they probably only provide a theoretic model for researchers. However, real data in some theoretic studies and some practical applications have plenty of input attributes, and this causes some problems. First, some systems will easily shut down because the calculations of the machine learning are too large. Second, some learning programs have their own limits. The learning methods that most need to reduce input attribute numbers are the Artificial Neural Network (ANN), the Fuzzy Neural Network (FNN, neuro-fuzzy), and mega-fuzzification; the latter is improved based on FNN. FNN and mega-fuzzification are more difficult to perform than ANN. FNN deals with the network learning using fuzzy membership functions.
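The scale problem can be made concrete. In a grid-partitioned fuzzy system of the kind FNN learning uses, every combination of membership functions across the inputs forms a candidate rule, so the rule base grows exponentially with the number of input attributes. A minimal sketch (the membership-function counts and the helper name `rule_count` are illustrative assumptions, not figures from this paper):

```python
def rule_count(n_inputs: int, n_mfs_per_input: int) -> int:
    """Size of a complete grid-partitioned fuzzy rule base:
    one rule for every combination of membership functions."""
    return n_mfs_per_input ** n_inputs

# Illustrative growth with 3 membership functions per attribute:
for n in (3, 9, 16):
    print(n, "inputs ->", rule_count(n, 3), "rules")
```

Even modest attribute counts quickly exhaust memory and training time, which is why reducing the number of inputs, rather than the number of records, is the target here.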
Because the defuzzification calculations in FNN are complex and difficult, most of the fuzzy membership functions are set up as triangular, generalized bell, trapezoidal, and so on, so that they are easy to calculate. Anomalously shaped fuzzy membership functions are not recommended because they are almost impossible to defuzzify using programs prepared beforehand, and even then the defuzzification calculation is still not efficient. When the input attribute amounts are larger than 6, the FNN program could not perform normally: most of the time, the computer went on hold without any response; it froze. In this article, the nbuses data set has 9 input attributes. These data fail in learning when using the FNN or mega-fuzzification methods because they have too many attributes. On the other hand, some machine learning programs have an upper limit on network nodes, so reduction of the input attribute amounts is also necessary.

2 Literature Review
Data attribute reduction is an important way to improve the efficiency of AI learning. Early related work was done by Shen and Chouchoulas, who proposed a Rough Set Attribute Reduction (RSAR) method to remove redundant input attributes for discrete values from complex systems; however, RSAR still lacks efficiency although it can reduce attributes [15]. In another study, Beynon introduced an approximate reducts concept and proposed a Variable Precision Rough Sets (VPRS) model to find the smallest set of attributes [16]. Later, Hsu et al. applied the VPRS model to a mobile phone test procedure [17]. Inbarani et al. also applied VPRS to feature selection in web usage mining [18]. In addition, Ang and Quek did not reduce data attributes but reduced fuzzy rules by combining rough sets and neuro-fuzzy learning [19].

3 The Proposed Method
First, a data set of nbuses [10] that consists of 9 input and one output attributes is used to explain the proposed BC method. The nbuses data can not be learned well by the FNN and mega-fuzzification methods. Values of its attributes are integers.
The values of the first attribute are {1, 2, 3, 4}, the values of the second, the 8th, and the output attributes are {1, 2, 3}, and the values of the other attributes are {1, 2}.

ISSN: 1790-5117    ISBN: 978-960-474-075-8

The process of the BC method is simple. Each

decimal number can be transferred into a Boolean number by a one-to-one mapping. For example, in the first instance of our nbuses data, the 9 attributes are combined into 3 new attributes. We combine the first to the third attributes to be the first new attribute. On the left of Fig. 1, the decimal numbers 4, 3, and 1 are transferred into Boolean numbers accordingly. Considering the maximum value of each attribute, the Boolean number for the decimal number 1 should be 01, so that the number of bits in the Boolean number for each attribute is fixed. Next, the Boolean numbers are physically combined into a unique Boolean number, as shown in the middle part of Fig. 1. Each original Boolean number occupies its own digital position in the combined Boolean number format without mixing with the other numbers. After that, for the convenience of calculation in the real world, this combined Boolean number is transferred into a decimal number. In the above process, the three input values are combined into a unique decimal number, 77, and the input attributes are reduced. The reason for not combining the decimal numbers directly, such as combining 4, 3, 1 to be 431, is that 431 is bigger than 77, the result of BC; a smaller number is easier for calculation.

Fig. 1. The process of the Boolean Conversion: the decimal numbers {4, 3, 1} are transferred to Boolean numbers, combined into one Boolean number, and transferred to a decimal number, 77.

With Fig. 1 as an example, three inputs are converted into a single new input. First, the original inputs {4, 3, 1} are converted into the Boolean numbers {100, 11, 01}. Second, these Boolean numbers are physically combined into one Boolean number: 1001101. The corresponding decimal number is 77. It can be expressed in the binary system as:

  {100, 11, 01} -> 1001101 = 100 * 2^4 + 11 * 2^2 + 01 * 2^0.

It could be a Boolean weight expressed in the Boolean system as:

  B = [10000 100 1],

or expressed in the decimal system:

  {4, 3, 1} -> 77 = 4 * 2^4 + 3 * 2^2 + 1 * 2^0,

and the binary weight vector is B = [2^4 2^2 2^0]. The power of 2 for each attribute is determined by the data maximum domain.
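The conversion above can be sketched in a few lines of Python. This is an illustrative reconstruction of the BC bit-packing, not code from the paper; the helper names `bits_needed`, `boolean_convert`, and `boolean_restore` are my own, with the bit width of each attribute derived from its maximum value as described:

```python
def bits_needed(max_value: int) -> int:
    """Number of bits required to encode values up to max_value."""
    return max_value.bit_length()

def boolean_convert(values, max_values):
    """Pack several small integer attributes into one decimal number.
    Each attribute keeps its own bit field, so no information is lost."""
    packed = 0
    for v, m in zip(values, max_values):
        packed = (packed << bits_needed(m)) | v
    return packed

def boolean_restore(packed, max_values):
    """Invert the conversion, recovering the original attribute values."""
    values = []
    for m in reversed(max_values):
        w = bits_needed(m)
        values.append(packed & ((1 << w) - 1))
        packed >>= w
    return list(reversed(values))

# The Fig. 1 example: attributes {4, 3, 1} with maxima {4, 3, 2}
# occupy 3, 2, and 2 bits -> 100|11|01 -> 1001101 -> 77.
print(boolean_convert([4, 3, 1], [4, 3, 2]))   # 77
print(boolean_restore(77, [4, 3, 2]))          # [4, 3, 1]
```

Because each attribute keeps its own bit field, the mapping is one-to-one and the original values can always be recovered; a full nbuses record would be converted group by group, three attributes at a time.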
For example, if the range of the second attribute is from 1 to 6, the maximum domain is 6; then 6 (decimal) = 110 (binary), which means the number of bits the second attribute needs is 3, bits(2) = 3. Each attribute may not need the same number of bits in other cases. The power of the second attribute is the bit number of the third attribute plus the power of the third attribute, or the total bit number from the third to the fifth inputs. It is presented as:

  power(i) = bits(i+1) + power(i+1) = sum_{m=i+1..I} bits(m),  for 1 <= i < I,
  power(I) = 0,

where I is the total number of the inputs i.

4 Results and Comparisons
In this study, three data sets are offered to check the learning accuracies and times of the results of both non-applying and applying BC. These data sets are nbuses, ACLP, and MONK.

4.1 nbuses data
The nbuses data that have been mentioned in section 3 are used by the proposed method in this subsection. Table 1 shows one record of the data. There are 9 input and one output attributes in the data. The 1st to the 3rd attributes are converted into a new input attribute, the 4th to the 6th attributes are combined into the 2nd new input by BC, and the 7th to the 9th are combined into the 3rd new one. Therefore, there are only three new input attributes. As shown in Table 1, the new input record is {77, 8, 9}. After all attributes are converted using BC, the data are tested and compared using the BN, C4.5, SVM, ANN, FNN, and mega-fuzzification methods with 10-fold cross-validation testing. Each fold is used as testing data in turn and the remaining total of 9 folds are used as training data. The results are presented in

Table 2. Without using BC, FNN and mega-fuzzification fail to perform. After applying BC, machine learning can easily be performed using the FNN and mega-fuzzification methods. Most of the prediction accuracies after using BC are even a little higher than without using it in this case, and the learning time decreases. Fig. 2 compares the prediction accuracies under different learning methods. Fig. 3 illustrates the learning times. For the nbuses data, even after using BC, the time for FNN and mega-fuzzification is still large. For the other learning methods, the learning time is reduced.

Table 1. An explanation of converting 9 attributes into 3 new decimal attributes. The rows of the table list, for attributes #1 to #9, the original decimal values, their conversion to Boolean numbers, the combination into three Boolean numbers, and the conversion into three decimal values: {77, 8, 9}.

Table 2. The comparison of nbuses data (methods in order: BN, C4.5, SVM, ANN, FNN, mega-fuzzification).

  Non-BC Accuracy:  9.4%, 9.4%, 86.84%, 85.5%, Fail to perform, Fail to perform
  Non-BC Time(sec): .5, .6, .56, .56
  BC Accuracy:      94.74%, 9.4%, 84.%, 9.%, 95%, 95%
  BC Time(sec):     .4, ., .5, .5, .5

Fig. 3. The learning time comparison before and after using the BC method by six methods for the nbuses data.

4.2 ACLP data
There are in total 4 instances in the ACLP data, with 6 input and output attributes. The 1st, the 5th, and the 6th attributes have values of {1, 2, 3}, and the other attributes have values of {1, 2}. For neuro-fuzzy learning, 6 input attributes make the learning process very slow. Fortunately, the values of each attribute are not large. We can compare the results of the FNN and mega-fuzzification methods. The results are presented in Table 3, Fig. 4, and Fig. 5. Accuracies after using BC are a little lower than before, but time is saved. Before applying BC in FNN and mega-fuzzification, the time for learning is very large. After using BC, time is largely saved.

Table 3. The comparison of ACLP data (methods in order: BN, C4.5, SVM, ANN, FNN, mega-fuzzification).

  Non-BC Accuracy:  86.4%, 87.86%, 9.7%, 84.9%, 84.7%, 84.7%
  Non-BC Time(sec): .7, .4, 65, 65
  BC Accuracy:      8.7%, 89.9%, 8.4%, 8.4%, 8.8%, 8.%
  BC Time(sec):     ., ., 5, 5

Fig. 2. The comparison before and after using the BC method by six methods for the nbuses data.

Fig. 4.
The comparison before and after using the BC method by six methods for the ACLP data.

Fig. 5. The learning time comparison before and after using the BC method by six methods for the ACLP data.

Fig. 7. The learning time comparison before and after using the BC method by six methods for the Monk data.

4.3 Monk data
The Monk data were created by Sebastian Thrun (see the UCI Machine Learning Repository [21]); the set has 432 instances, with 6 input and one output attributes. Because the number of attributes is not large in this case, we can compare the learning accuracies of FNN and mega-fuzzification with and without using BC again. Table 4 shows the results. In this case, FNN and mega-fuzzification can be performed before using BC, but they waste a large amount of time. All the accuracies after using BC are a little lower than before. The learning accuracies are also compared in Fig. 6, and the learning times are compared in Fig. 7. Still, the learning time before using BC for FNN and mega-fuzzification is very large, but it becomes very small after applying BC.

Table 4. The comparison of Monk data (methods in order: BN, C4.5, SVM, ANN, FNN, mega-fuzzification).

  Non-BC Accuracy:  9.6%, %, 8.56%, %, %, %
  Non-BC Time(sec): ., ., .9, 7, 7
  BC Accuracy:      89.%, 96.6%, 76.9%, 98.87%, 97%, 98%
  BC Time(sec):     ., .6

Fig. 6. The comparison before and after using the BC method by six methods for the Monk data.

5 Conclusions
In this study, a novel BC method is proposed to deal with the problem that data with a large number of attributes may cause a system to freeze or shut down. BC reduces the attribute number by combining some of the attributes into a smaller number of new attributes, instead of removing some attributes from the data. After attributes are combined and reduced, learning accuracies and learning times are compared using the BN, C4.5, SVM, ANN, neuro-fuzzy, and mega-fuzzification learning methods. In this study, three data sets, nbuses, ACLP, and MONK, are offered to test and compare the learning results. Some of their learning accuracies after using BC are a little lower than before; some have a little higher accuracies.
In general, the learning accuracy after applying BC is not worse. In addition, the learning time is shortened after BC is used. Facing the problem of failing to perform in neuro-fuzzy learning, the proposed BC method indeed solves, in brief, the problem of learning from data that have large numbers of attributes.

Acknowledgement
Thanks are due to the support in part by the National Science Council of Taiwan under Grant No. NSC 96-46-H-468-6-MY.

References:
[1] Y.Y. Yao, Granular computing: basic issues and possible solutions, Proceedings of the 5th Joint Conference on Information Sciences, 1999, pp. 186-189.
[2] L. Polkowski and A. Skowron, Towards adaptive calculus of granules, Proceedings of 1998 IEEE International Conference on Fuzzy Systems, pp. 111-116.
[3] T.Y. Lin, Granular computing on binary relations I: data mining and neighborhood systems, II: Rough set representations and belief functions, in

L. Polkowski and A. Skowron, eds., Rough Sets in Knowledge Discovery, Heidelberg, Physica-Verlag, 1998, pp. 107-140.
[4] Y.Y. Yao, Granular computing using neighborhood systems, in R. Roy, T. Furuhashi, and P.K. Chawdhry, eds., Advances in Soft Computing: Engineering Design and Manufacturing, Springer-Verlag, London, 1999, pp. 539-553.
[5] T.Y. Lin, Data mining: granular computing approach, Proceedings of the Third Pacific-Asia Conference on Methodologies for Knowledge Discovery and Data Mining, 1999, pp. 4.
[6] A. Skowron and J. Stepaniuk, Information granules: towards foundations of granular computing, International Journal of Intelligent Systems, Vol. 16, pp. 57-85, 2001.
[7] Y.Y. Yao, Information granulation and rough set approximation, International Journal of Intelligent Systems, Vol. 16, pp. 87-104, 2001.
[8] J.-S. R. Jang, ANFIS: Adaptive-Network-based Fuzzy Inference Systems, IEEE Transactions on Systems, Man, and Cybernetics, vol. 23, no. 3, pp. 665-685, 1993.
[9] D. C. Li, C. Wu, and F. M. Chang, Using data-fuzzifying technology in small data set learning to improve FMS scheduling, International Journal of Advanced Manufacturing Technology, vol. 27, no. 3-4, pp. -8, 2005.
[10] F. M. Chang and C. C. Chan, Improve Neuro-Fuzzy Learning by Attribute Reduction, The 27th Annual Meeting of the North American Fuzzy Information Processing Society, The Rockefeller University, NY, USA, May 2008.
[11] B. Predki, R. Slowinski, J. Stefanowski, R. Susmaga, and Sz. Wilk, ROSE - Software Implementation of the Rough Set Theory, in: L. Polkowski, A. Skowron, eds., Rough Sets and Current Trends in Computing, Lecture Notes in Artificial Intelligence, vol. 1424, pp. 605-608, 1998.
[12] B. Predki and Sz. Wilk, Rough Set Based Data Exploration Using ROSE System, in: Z. W. Ras, A. Skowron, eds., Foundations of Intelligent Systems, Lecture Notes in Artificial Intelligence, vol. 1609, pp. 172-180, 1999.
Third International Joint Conference on Information Sciences, Vol. 3, pp. 403-407, Durham, NC, March 1997.
[14] Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer, 1991.
[15] Q. Shen and A. Chouchoulas, A modular approach to generating fuzzy rules with reduced attributes for the monitoring of complex systems, Engineering Applications of Artificial Intelligence, vol. 13, no. 3, pp. 263-278, 2000.
[16] M. Beynon, Reducts within the variable precision rough sets model: A further investigation, European Journal of Operational Research, vol. 134, pp. 592-605, 2001.
[17] J. H. Hsu, T. L. Chiang, and H. C. Wang, VPRS model for mobile phone test procedure, Journal of the Chinese Institute of Industrial Engineers, vol. 23, no. 4, pp. 45-55, 2006.
[18] H. H. Inbarani, K. Thangavel, and A. Pethalakshmi, Rough set based feature selection for web usage mining, International Conference on Computational Intelligence and Multimedia Applications, pp. -8, 2007.
[19] K. K. Ang and C. Quek, Stock Trading Using RSPOP: A Novel Rough Set-Based Neuro-Fuzzy Approach, IEEE Transactions on Neural Networks, vol. 17, no. 5, pp. 1301-1315, 2006.
[20] Laboratory of Intelligent Decision Support Systems, Poznan University of Technology, http://www-idss.cs.put.poznan.pl/site/rose.html
[21] UCI Machine Learning Repository, http://mlearn.ics.uci.edu/MLRepository.html
Third International Joint Conference on Information Sciences, Vol., pp. 4-47, Durham, NC, March 997. [4]Z. Pawlak, Rough Sets: Theoretical Aspects of Reasoning about Data, Kluwer, 99. [5]S. Qiang, and C. Alexios, A modular approach generating fuzzy rules with reduced attributes for the moniring of complex systems, Engineering Applications of Artificial Intelligence, vol., No., pp.6-78,. [6]M. Beynon, Reducts within the variable precision rough set model: A further investigation, European Journal of Operational Research, vol. 4, pp.59-65,. [7]J. H. Hsu, T. L. Chiang, and H. C. Wang, VPRS model for mobile phone test procedure, Journal of the Chinese Institute of Industrial Engineers, vol., no. 4, pp.45-55, 6. [8]H. H. Inbarani, K. Thangavel, and A. Pethalakshmi, Rough set based Feature Selection for Web Usage Mining, International Conference on Computational Intelligence and Muldia Applications, pp.-8, 7. [9]K. K. Ang, and C. Quek, Sck Trading Using RSPOP: A Novel Rough Set-Based Neuro-Fuzzy Approach, IEEE Transactions on Neural Network, vol. 7, no. 5, pp.-5, 6. []Laborary of Intelligent Decision Support Systems, Poznan University of Technology, http://www-idss.cs.put.poznan.pl/site/rose.html []UCI Machine Learning Reposiry, http://mlearn.ics.uci.edu/mlreposiry.html ISSN: 79-57 74 ISBN: 978-96-474-75-8