Available online at www.sciencedirect.com

ScienceDirect
Agriculture and Agricultural Science Procedia 3 (2015) 14-19

The 2014 International Conference on Agro-industry (ICoA): Competitive and Sustainable Agroindustry for Human Welfare

Prediction of Hot Glue Content for Sealing Toothpaste Carton

Ravipim Chaveesuk* and Teeranut Ngoenvivatkul
Department of Agro-Industrial Technology, Faculty of Agro-Industry, Kasetsart University, Ngamwongwan Road, Bangkok 10900, Thailand

* Corresponding author. Tel.: +66-2562-5000 ext. 5363; Fax: +66-2562-5092. E-mail address: ravipimc@gmail.com

Abstract

This research compared two types of model, regression and artificial neural network, for predicting the glue content used to seal toothpaste cartons from four sealing-process factors: the production line, the diameter of the toothpaste tube, the pressure in the glue nozzle while applying glue onto the carton, and the glue temperature in the glue tank. The models under study included three regression models (multiple regression, polynomial regression and stepwise regression) and a backpropagation neural network (BPN). The results indicated that the BPN model possessed higher prediction accuracy and generalization capability and lower bias. The best BPN model had a 4-10-1 structure with a mean absolute error (MAE) of 0.04 gram on the validating data set. In addition, the BPN model identified the pressure in the glue nozzle and the glue temperature in the glue tank as the most influential sealing-process factors for predicting the glue content. The packing department should concentrate on monitoring both factors to control the consistency of glue usage.

Keywords: regression; backpropagation neural network; toothpaste carton; glue content prediction

© 2015 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). Peer-review under responsibility of Jurusan Teknologi Industri Pertanian, Fakultas Teknologi Pertanian, Universitas Gadjah Mada. doi:10.1016/j.aaspro.2015.01.005

1. Introduction

Toothpaste manufacturers are constantly concerned with increasing the efficiency of their operations along the supply chain because of a highly competitive market. Packaging and packages are known to be among the key factors that affect the efficiency of the chain.
Their functions are to contain, protect or preserve the product, communicate information, provide convenience in use, handling, transportation, storage and distribution, and promote the product. The toothpaste packaging system includes the laminated tube, carton, leaflet inside the carton, fifth panel for promotion, bundle shrink film, bundle barcode and shipping case. These packaging materials are assembled on an automatic machine. Since the various sizes of toothpaste require specific machine types and assembling speeds, the packaging materials must be designed to fit the capacity and limitations of the machine in each production line to keep the line flowing smoothly. A critical activity that contributes to this smooth flow, and that also acts as a tamper-evident feature, is sealing the toothpaste carton with hot-melt glue. Typically, the size of the carton, the machine in the production line, the hot-glue temperature and the pressure in the glue nozzle during application of the hot glue onto the carton lid are known to influence the glue content on the lid and the effectiveness of the sealing process. However, the toothpaste manufacturer under study determines the required glue content and develops its glue requirement plan based on the size of the toothpaste only. As a result, the manufacturer underestimates the glue content and incurs high costs for urgent orders, which amounted to approximately 0.4 tons and 13,000 USD monthly. This research examines the use of two predictive models, a regression model and a backpropagation neural network, to estimate the glue content from the sealing-process factors for this manufacturer in order to reduce the cost of urgent orders.

2. Predictive models

2.1. Regression model

Regression is widely used to model input-output relationships. A general regression model for m input factors, x = (x_1, x_2, ..., x_m), can be expressed as

Y_i = \beta_0 + \sum_{k=1}^{p-1} \beta_k Z_k(X_{ij}) + \varepsilon_i    (1)

where Y_i is the response in the i-th trial, Z_k(X_{ij}) is a power function of first, second or higher order or an interaction term, \beta_k is a regression coefficient, and \varepsilon_i is the error term of the i-th trial. Regression models are very straightforward to implement; however, they require restrictive assumptions on the error terms, such as normally distributed random errors, constant error variance and the absence of multicollinearity. In addition, their performance depends on the appropriateness of the chosen functional form (Madu, 1990).
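The study fitted these regression models in MINITAB; purely as an illustration of the functional form in Eq. (1), the following Python sketch builds second-order power and interaction terms and estimates the coefficients by least squares. The data here are random placeholders, not the study's measurements.

```python
# Illustrative sketch only: the study used MINITAB, not Python.
# X holds the four sealing-process factors, y the measured glue content (gram).
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
X = rng.uniform(size=(960, 4))           # placeholder for the 960 training points
y = rng.uniform(0.5, 1.5, size=960)      # placeholder glue content

# Z_k(X): second-order power and interaction terms, as in Eq. (1)
poly = PolynomialFeatures(degree=2, include_bias=False)
Z = poly.fit_transform(X)

model = LinearRegression().fit(Z, y)     # least-squares estimate of the beta coefficients
print(model.intercept_, model.coef_[:5])
```
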
2.2. Backpropagation neural network model

Backpropagation is one of the artificial neural network (ANN) paradigms. An ANN develops a mapping from the input variables to the output variables through an iterative learning process. The model consists of a large number of simple, interconnected adaptive processing elements called neurons. Associated with each connection is a weight that represents the information used to solve the problem. These weights are iteratively adjusted by a learning process towards values that produce the best fit of the predicted outputs over the entire learning data set. An ANN is generally organized into a sequence of layers: the input, hidden and output layers. The input and output layers contain neurons that correspond to the input and output variables, respectively. Data flow between layers across weighted connections. Each neuron in the hidden or output layer sums its input signals from the previous layer, weighted by the connection weights, and applies an activation function to determine its output signal. A multi-layer ANN with nonlinear transfer functions such as the sigmoid and hyperbolic tangent can theoretically model any relationship to arbitrary accuracy and is thus called a universal approximator (Hornik et al., 1989; Funahashi, 1989).

The backpropagation network (BPN) is a feedforward multi-layer neural network trained by a gradient descent method (Rumelhart et al., 1986). The training algorithm is based on minimizing the total squared error of the output computed by the network and involves three stages: the feedforward of the input training set, the calculation and backpropagation of the error, and the adjustment of the weights. The model requires no prior assumption about functional forms and is also robust to deviations from traditional statistical assumptions. Its limitations are the difficulty of selecting the architecture and training parameters, and a proneness to overparameterization, which produces a good fit on the model-construction data set but poor generalization to other data.
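
As an illustration of the three training stages described above, the following minimal numpy sketch trains a one-hidden-layer network with a hyperbolic tangent activation by gradient descent on the squared error. The layer size, learning rate and epoch count are arbitrary illustrative values, not the settings tuned in this study.

```python
import numpy as np

def train_bpn(X, y, n_hidden=10, lr=0.1, epochs=1000, seed=0):
    """One-hidden-layer backpropagation network minimizing squared error."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.1, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, 1))
    b2 = np.zeros(1)
    y = y.reshape(-1, 1)
    for _ in range(epochs):
        # Stage 1: feedforward of the training set
        h = np.tanh(X @ W1 + b1)            # hidden-layer activations
        out = h @ W2 + b2                   # linear output neuron
        # Stage 2: calculation and backpropagation of the error
        err = out - y                       # derivative of 0.5 * squared error
        grad_W2 = h.T @ err / len(X)
        grad_b2 = err.mean(axis=0)
        dh = (err @ W2.T) * (1.0 - h**2)    # tanh derivative
        grad_W1 = X.T @ dh / len(X)
        grad_b1 = dh.mean(axis=0)
        # Stage 3: gradient-descent adjustment of the weights
        W1 -= lr * grad_W1; b1 -= lr * grad_b1
        W2 -= lr * grad_W2; b2 -= lr * grad_b2
    return W1, b1, W2, b2

def predict_bpn(params, X):
    W1, b1, W2, b2 = params
    return (np.tanh(X @ W1 + b1) @ W2 + b2).ravel()
```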

3. Methodology

3.1. Data collection and preparation

Four factors affecting the sealing process were studied: the production line (lines 1-10), the diameter of the toothpaste tube (22, 25, 28, 35 and 38 mm), the hot-glue temperature (170, 173 and 175 °C) and the pressure in the glue nozzle (1.8, 2.0, 2.5, 3.0 and 3.2 bar) during application of the hot glue onto the carton lid. Based on the specific conditions of each production line, there were 32 conditions under study. Fifty cartons were collected from each condition, making up 1,600 cartons. Each empty carton was weighed and then went through the packing and lid-sealing process. The packed carton was reweighed to compute the glue content (gram) as the difference between the weights before and after packing and sealing. All data (1,600 points) were arranged into an input-output mapping with the sealing-process factors as input variables and the glue content as the output variable. Each condition (50 data points) was divided into three data sets: a training set of 30 data points, a testing set of 10 data points and a validating set of 10 data points. The training set was used to build the model, the testing set was used to identify the proper model structure and parameters, and the validating set was used to evaluate the generalization of the model.

3.2. Model building and validation

3.2.1. Regression model

Three types of regression model were constructed from the training data set (960 data points) using MINITAB version 16: multiple regression, polynomial regression and stepwise regression. The statistical assumptions underlying all regression models were tested: normal distribution of the errors, absence of outliers, constant error variance and no multicollinearity (Kutner et al., 2008). Each model was used to predict the glue content for the testing data set in order to select a proper functional form and parameters based on the mean absolute error (MAE), computed as

MAE = \frac{\sum_{i=1}^{n} \left| Y_i - \hat{Y}_i \right|}{n}    (2)

where Y_i denotes the actual response value of data point i, \hat{Y}_i denotes the predicted response value of data point i, and n denotes the number of data points over which the error is calculated. The constructed models were then validated based on the MAE of the validating set (320 data points).
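
A brief sketch of the per-condition 30/10/10 split and the MAE of Eq. (2) is given below, assuming the 1,600 observations are held in a table with a condition label; the column name and the random assignment within each condition are illustrative assumptions, since the paper does not state how points were allocated.

```python
import numpy as np
import pandas as pd

def split_by_condition(df, seed=0):
    """Divide each 50-point condition into 30 training, 10 testing and 10 validating points."""
    rng = np.random.default_rng(seed)
    parts = {"train": [], "test": [], "validate": []}
    for _, grp in df.groupby("condition"):      # 'condition' is a hypothetical label column
        idx = rng.permutation(grp.index)        # assumes exactly 50 rows per condition
        parts["train"].append(grp.loc[idx[:30]])
        parts["test"].append(grp.loc[idx[30:40]])
        parts["validate"].append(grp.loc[idx[40:50]])
    return {k: pd.concat(v) for k, v in parts.items()}

def mae(y_actual, y_pred):
    """Mean absolute error, Eq. (2)."""
    return np.mean(np.abs(np.asarray(y_actual) - np.asarray(y_pred)))
```
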
3.2.2. Backpropagation network (BPN) model

The BPN models were constructed from the training set using the sealing-process factors as input variables and the corresponding glue content as the output variable, through the NeuralWork Explorer software. All variables were normalized to be consistent with the range of the activation function, i.e. between -1 and +1 for the hyperbolic tangent function. The architecture and learning parameters are the key factors for ANN performance. One hidden layer, which has been shown to be sufficient for modelling continuous functions (Basheer, 2000; Hecht-Nielsen, 1990), was employed in this research. Several numbers of hidden neurons (5-30), learning rates (0.01-0.5), momentum values (0-0.9) and sets of initial random weights were explored. To avoid overtraining, the learning phase was paused every 1,000 iterations and the model was evaluated for its prediction accuracy on the testing set; learning was stopped when the MAE of the testing set continued to increase. The proper architecture and learning parameters were selected based on the MAE of this testing set. The constructed models were then validated based on the MAE of the validating set (320 data points).
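
The models were actually built in NeuralWork Explorer; the sketch below only illustrates the stopping protocol described above (train in 1,000-iteration chunks, monitor the testing-set MAE, stop once it rises), using scikit-learn's MLPRegressor with warm_start as a stand-in and omitting the [-1, +1] normalization for brevity.

```python
from sklearn.metrics import mean_absolute_error
from sklearn.neural_network import MLPRegressor

def train_with_early_stopping(X_train, y_train, X_test, y_test,
                              n_hidden=10, lr=0.1, momentum=0.9,
                              chunk=1000, max_chunks=100):
    """Train in 1,000-iteration chunks and stop when the testing-set MAE starts to rise."""
    net = MLPRegressor(hidden_layer_sizes=(n_hidden,), activation="tanh",
                       solver="sgd", learning_rate_init=lr, momentum=momentum,
                       max_iter=chunk, warm_start=True)  # warm_start resumes training on each fit()
    best_mae, best_iter = float("inf"), 0
    for i in range(1, max_chunks + 1):
        net.fit(X_train, y_train)                        # up to `chunk` more iterations
        test_mae = mean_absolute_error(y_test, net.predict(X_test))
        if test_mae < best_mae:
            best_mae, best_iter = test_mae, i * chunk
        else:
            break                                        # testing-set MAE increased: stop learning
    return net, best_mae, best_iter
```

In the study, this kind of search was repeated over hidden-layer sizes, learning rates, momentum values and initial weight sets, and the combination with the lowest testing-set MAE was retained.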

3.3. Model comparison

3.3.1. Prediction accuracy and generalization capability

The selected regression and BPN models were compared for their prediction accuracy based on the MAE. A superior model should possess good prediction accuracy on both the training and validating data sets; in other words, its generalization capability should be retained.

3.3.2. Model bias

Bias is an asymmetric distribution of the estimation error. A superior model should exhibit as little bias as possible. The model bias can be observed by computing the bias factor B_f (Ross, 1996) as follows:

B_f = 10^{\frac{1}{n}\sum_{i=1}^{n} \log_{10}\left(\hat{Y}_i / Y_i\right)}    (3)

If the bias factor is equal to 1, the model is unbiased. A bias factor greater than 1 indicates that the model overestimates the data, while a value less than 1 indicates that it underestimates the data.

3.4. Identification of important sealing factors

Once the model is built and validated, it can be used to predict the glue content as well as to identify the sealing-process factors affecting the glue content required to seal each carton. Chaveesuk and Smith (2003) have shown that polynomial regression and backpropagation networks can identify the significant factors affecting capital investment measures. In the case of a polynomial regression model, inference can be made from the magnitude of the standardized regression coefficients: a large coefficient indicates an important effect of that variable. For an ANN model, altering each input variable by a certain percentage and calculating how much the output changes provides the basis for observing the effect of that input variable. The larger the change in the output, the greater the effect of that input variable.
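
For reference, the bias factor of Eq. (3) can be computed directly, for example as in the following sketch.

```python
import numpy as np

def bias_factor(y_actual, y_pred):
    """Bias factor B_f of Eq. (3): 10 raised to the mean log10(predicted/actual) ratio."""
    ratio = np.log10(np.asarray(y_pred) / np.asarray(y_actual))
    return 10 ** ratio.mean()

# B_f > 1: the model overestimates on average; B_f < 1: it underestimates; B_f = 1: unbiased.
```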

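A possible implementation of the perturbation approach of Section 3.4 is sketched below; it assumes a trained model exposed as a prediction function, and the 5% perturbation size is an illustrative choice, not one stated in the paper.

```python
import numpy as np

def input_sensitivity(predict, X, delta=0.05):
    """Rank input factors by how much the mean prediction changes when each
    input is increased by `delta` (e.g. 5%) while the others are held fixed."""
    base = predict(X).mean()
    effects = {}
    for j in range(X.shape[1]):
        X_pert = X.copy()
        X_pert[:, j] *= (1.0 + delta)
        effects[j] = abs(predict(X_pert).mean() - base)
    return dict(sorted(effects.items(), key=lambda kv: kv[1], reverse=True))
```
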
4. Results and discussion

The first-order stepwise regression model with interaction terms possessed the highest prediction accuracy among all regression models investigated. The BPN model that exhibited the highest prediction accuracy had a 4-10-1 structure (4 input neurons, 10 hidden neurons, 1 output neuron) and was trained with a learning rate of 0.1 and a momentum of 0.9 for 39,000 iterations. Table 1 compares the regression and BPN models in terms of accuracy (MAE) and bias (bias factor). The best BPN model is superior to the best regression model in terms of prediction accuracy and generalization capability. In addition, the plots of the actual glue weight used against the predicted value on the validating data set confirm this observation, with r^2 of 0.78 for the BPN and 0.61 for the regression model (Fig. 1). This might be attributable to the universal approximator property of the BPN. Both models, however, slightly overestimate the glue content, since their bias factors are a little higher than 1.

Table 1. Prediction accuracy and bias of the models.

Model                              MAE (gram)                      Bias factor
                                   Training set   Validating set   Training set   Validating set
First-order stepwise regression    0.06           0.06             1.03           1.02
4-10-1 BPN                         0.03           0.04             1.03           1.05

Fig. 1. The actual glue weight used and the predicted value in the validating data set: (a) BPN; (b) regression.

When the more accurate BPN model is used to predict the required glue content and in glue requirement planning, the company can reduce the error in its glue orders from 0.4 tons/month to 0.016 tons/month and also reduce the monthly cost of urgent orders from 12,900 USD to 520 USD. Identification of the important input factors is a further insight gained from the accurate models. Since the BPN model outperforms the regression model in terms of prediction accuracy and generalization capability, it was used to identify the important sealing-process factors. The pressure in the glue nozzle and the hot-glue temperature are, respectively, the most and second most influential sealing factors identified by the BPN model. These factors must be monitored so that corrective action can be taken in a timely manner if there is a small change in either of them.

5. Conclusions

The best predictive model for the glue content required to seal the toothpaste carton lid is a 4-10-1 backpropagation neural network, with a mean absolute error of 0.04 gram on the validating data set. The model is slightly biased upwards. If the model is used in glue requirement planning, the firm under study can save 12,380 USD on urgent orders per month. The most important sealing factors pinpointed by this model are the pressure in the glue nozzle and the hot-glue temperature.

References

Basheer, I., 2000. Selection of Methodology for Modeling Hysteresis of Soil Using Neural Networks. J. Comput.-Aided Civil Infrastruct. Eng. 5(6), 445-463.
Chaveesuk, R., Smith, A.E., 2003. Economic Valuation of Capital Projects Using Neural Network Metamodels. The Engineering Economist 48(1), 1-30.
Funahashi, K., 1989. On the Approximate Realization of Continuous Mappings by Neural Networks. Neural Networks 2, 183-192.
Hecht-Nielsen, R., 1990. Neurocomputing. Addison-Wesley, MA.
Hornik, K., Stinchcombe, M., White, H., 1989. Multilayer Feedforward Networks are Universal Approximators. Neural Networks 2, 359-366.
Kutner, M.H., Nachtsheim, C.J., Neter, J., 2008. Applied Linear Statistical Models, 4th ed. McGraw-Hill, Singapore.
Madu, C.N., 1990. Simulation in Manufacturing: A Regression Metamodel Approach. Computers & Industrial Engineering 18, 381-389.
Ross, T., 1996. Indices for Performance Evaluation of Predictive Models in Food Microbiology. Journal of Applied Bacteriology 81, 501-508.
Rumelhart, D.E., Hinton, G.E., Williams, R.J., 1986. Learning Internal Representations by Error Propagation, in: Rumelhart, D.E., McClelland, J.L. (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1: Foundations. MIT Press, MA, pp. 318-362.