Sequential Decision Making Predictions under the Influence of Observational Learning

Shing H. DOONG
Department of Information Management, ShuTe University
YanChao District, KaoHsiung City, Taiwan 82445

ABSTRACT

Today's corporate managers face challenges in information technology (IT) adoption with great stakes. The waiting period for an IT investment to pay off, if it ever does, can be long; thus word-of-mouth information propagation may not help managers make wise decisions. Though the information used by early adopters to make their decisions may not be available to the public, late adopters can often observe the decisions made by the early adopters and infer hidden information to supplement their own private information. Observational learning theory applies when a person uses observed behavior from others to infer something about the usefulness of the observed behavior. Walden and Browne proposed a simulation procedure to model the influence of observational learning on sequential decision making. Previously, we proposed a dynamic Bayesian network (DBN) to model sequential decision making under the influence of observational learning. In the present study, we show how to infer a DBN model from simulated data. Hidden Markov models and artificial neural networks were used to infer the DBN model, and their performance is discussed.

Keywords: Sequential Decision Making, Observational Learning, Dynamic Bayesian Network, Hidden Markov Model, Artificial Neural Networks.

1. INTRODUCTION

Today's corporate managers face challenges in information technology (IT) adoption with great stakes. IT, together with telecommunications, has been considered the main driver of economic growth in many countries in the new economy era since the 2000s. To many companies, IT has become an indispensable part of their core competence, with several characteristics. First, IT is becoming so powerful and complex that a fair assessment of its merits is difficult. Second, capital investments in IT are substantial, yet returns on investments often take time to materialize. Brooks has argued that software and other technological components are among the most complex artifacts ever built by human beings [1]. In some cases, the impacts of new technologies may take years to be realized [2]. For these reasons, corporate managers need different kinds of tools and practices to help them make wise IT adoption decisions.

When people make decisions with limited or asymmetric information, they use different practices to correct this information deficiency. Observational learning occurs when one person observes the behavior of another person and infers something about the usefulness of that behavior based on the observation [3]. Research shows that, due to information asymmetry, people use what they observe from others to update their own private information or belief about a decision [4]. Observational learning often leads to an interesting phenomenon called informational cascades [5]. An informational cascade occurs if an individual's action does not depend on his private information signal [5]. Walden and Browne [3] developed a theoretical extension of the observational learning model in [5], where a binary private signal is generated for each decision maker who chooses to adopt or reject an action. In [3], a continuous private signal is issued to each individual, who likewise chooses to adopt or reject an action. Changing the private information signal from binary to continuous produces many interesting results. For example, unlike the easy informational cascading in the case of binary signals, Walden and Browne showed that there are always late decision reversals in a sequence of decisions. That is, informational cascades do not occur in the case of continuous signals.
A simulation procedure was used to investigate the extended observational learning theory [3]. In a later study, we showed that the Walden and Browne (WB) model can also be investigated from the perspective of a dynamic Bayesian network (DBN) [6]. In the present study, we consider the problem of inferring the DBN from simulated data. Hidden Markov models (HMM) and artificial neural networks (ANN) are used to infer the DBN model.

This paper is organized as follows. We briefly discuss the WB model and our DBN perspective first. Then, HMM and ANN are introduced to learn the DBN from simulated data. Experimental results are presented next, followed by discussions and conclusions.

2. MATERIALS AND METHODS

Observational learning with continuous signals

Walden and Browne used a continuous signal to denote the private information received by an individual [3]. A sequence of individuals each make a decision between technology A (e.g., adopt the cloud computing IT) and technology B (e.g., reject the cloud computing IT). Assume that technologies A and B emit signals from the normal distributions N(μ_A, σ²) and N(μ_B, σ²) respectively, with μ_A > μ_B. An individual chooses A if the following condition is satisfied:

    p(s | μ_A) / p(s | μ_B) ≥ β_t    (1)

Here s is the private signal received by the individual, and p(s | μ_A) and p(s | μ_B) are the probability density functions (pdfs) of N(μ_A, σ²) and N(μ_B, σ²), respectively. Plugging the pdfs into Eq. (1) and solving for s, we obtain the decision rule

    choose A if s ≥ β̂_t, choose B if s < β̂_t    (2)

with the signal breakpoint

    β̂_t = [σ² ln(β_t) + (μ_A² − μ_B²)/2] / (μ_A − μ_B)    (3)

Using signal detection theory [7], Walden and Browne set the decision threshold β_t as

    β_t = k · Pr_t(μ_B) / Pr_t(μ_A)    (4)

Here k is common to all individuals and captures the relative benefit of B to A. For the first individual, we can assume technologies A and B are equally good, so Pr_1(μ_A) = Pr_1(μ_B). For the remaining decision makers, these terms are posterior probabilities after observing previous decisions:

    Pr_{t+1}(μ_A) = Pr(μ_A | D_1, ..., D_t),  Pr_{t+1}(μ_B) = Pr(μ_B | D_1, ..., D_t)    (5)

Here D_t denotes the decision made by the t-th individual. Using the chain rule of conditional probability, Walden and Browne deduce the following rule for updating decision thresholds:

    β_{t+1} = β_t · Pr(D_t | μ_B, H_t) / Pr(D_t | μ_A, H_t)    (6)

where D_t = a_t or b_t when the t-th individual chooses technology A or B, and H_t consists of the decisions made by all t−1 previous individuals. The decisions in H_t determine the signal breakpoint β̂_t for the t-th individual via the threshold β_t and Eq. (3). Consequently, when A is the true technology emitting private signals, we have the identities

    Pr(a_t | μ_A, H_t) = ∫_{β̂_t}^{∞} p(s | μ_A) ds,  Pr(b_t | μ_A, H_t) = ∫_{−∞}^{β̂_t} p(s | μ_A) ds    (7)

Similar identities hold when B is the true technology emitting private signals. If the t-th individual chooses technology A (i.e., D_t = a_t), β_t is scaled down to form β_{t+1}, because Pr(a_t | μ_B, H_t) < Pr(a_t | μ_A, H_t). The breakpoint β̂_t thus moves leftwards, leaving more room for the next individual to choose technology A. If the individual instead chooses B, then β_t is scaled up, β̂_t moves rightwards, and there is less room for the next individual to choose technology A.

Dynamic Bayesian Network

To model the WB model as a DBN, we need two sequences of random variables to describe the dynamics of the sequential decision model [6]. The variable X_t represents the decision threshold β_t, so P(X_t = β_t) = 1, and the variable Y_t represents the outcome of a decision; assuming signals are drawn from technology A, P(Y_t | X_t) is given by Eq. (7). To capture the fact that β_t scales up or down depending on the decision of the t-th individual, a causal link from Y_t to X_{t+1} must be established.

[Figure 1. DBN for observational learning: X_t → Y_t, X_t → X_{t+1}, Y_t → X_{t+1}, X_{t+1} → Y_{t+1}, with X_t = β_t and X_{t+1} = β_{t+1}.]

The variable X_{t+1} depends on X_t and Y_t by the rules in Table 1: it has a discrete distribution whose value depends on the previous hidden state (X_t) and the previous decision (Y_t). The WB model is thus converted to the DBN of Figure 1, with the relevant pdfs given by Eq. (7) and Table 1.

Table 1. Conditional probability P(X_{t+1} | X_t, Y_t)

    (X_t, Y_t) = (β_t, a_t):  X_{t+1} = β_t · Pr(a_t | μ_B, H_t) / Pr(a_t | μ_A, H_t)  with probability 1, any other value with probability 0
    (X_t, Y_t) = (β_t, b_t):  X_{t+1} = β_t · Pr(b_t | μ_B, H_t) / Pr(b_t | μ_A, H_t)  with probability 1, any other value with probability 0

The DBN perspective has a few advantages over the original WB model. First, the signal receiving and decision making step is simplified to a binomial sampling step. Second, the dynamic updating of decision thresholds is replaced by the clear rules of Table 1. The DBN of Figure 1 is also easy to interpret: it lays out the causal relationships among all relevant variables and shows how the system evolves over time.
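Since the paper gives no code, the following minimal sketch (ours, not the authors') may help make Eqs. (3), (6), and (7) concrete. It is written in Python, assumes SciPy for the normal cdf, and hard-codes the parameter values μ_A = 1, μ_B = 0, σ = 1 used in the simulation study of Section 3; all function names are our own.

```python
# A minimal sketch of the Walden-Browne threshold dynamics, Eqs. (3), (6), (7).
import math

from scipy.stats import norm

MU_A, MU_B, SIGMA = 1.0, 0.0, 1.0   # parameter values from the simulation study

def breakpoint(beta_t):
    """Signal breakpoint of Eq. (3): the individual chooses A iff s >= breakpoint."""
    return (SIGMA**2 * math.log(beta_t) + (MU_A**2 - MU_B**2) / 2.0) / (MU_A - MU_B)

def choose_a_prob(beta_t, true_mu=MU_A):
    """Pr(a_t | true technology, H_t) of Eq. (7): signal mass above the breakpoint."""
    return 1.0 - norm.cdf(breakpoint(beta_t), loc=true_mu, scale=SIGMA)

def update_threshold(beta_t, chose_a):
    """Threshold update of Eq. (6): scale beta_t by Pr(D_t | mu_B, .) / Pr(D_t | mu_A, .)."""
    if chose_a:
        return beta_t * choose_a_prob(beta_t, MU_B) / choose_a_prob(beta_t, MU_A)
    return beta_t * (1.0 - choose_a_prob(beta_t, MU_B)) / (1.0 - choose_a_prob(beta_t, MU_A))
```

With k = 1 the first breakpoint is ln(1) + 0.5 = 0.5, midway between μ_B and μ_A, so the first individual simply picks the technology whose mean is closer to the received signal.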

Hidden Markov Models

A hidden Markov model (HMM) is represented by a 5-tuple (S, V, A, B, P), where S = {s_1, s_2, ..., s_N} consists of N states that are not directly observable, V = {v_1, v_2, ..., v_M} denotes M observable outcomes emitted by the states, A = (a_ij) with a_ij = P(s_j | s_i) holds the transition probabilities between states, B = (b_jk) with b_jk = P(v_k | s_j) holds the probabilities of outcomes being emitted by states, and P = (p_i) holds the initial probabilities of the states [8].
Given an HMM with all needed components, a sequence of outcomes can be generated by (1) choosing an initial state according to the initial probability vector P; (2) emitting an outcome from this state using the emission probability matrix B; (3) transiting to a next state following the transition matrix A; and (4) repeating steps (2) and (3). This is a data deduction procedure commonly used in simulation studies. On the other hand, given sequences of outcomes (data), an HMM may be learned from the data and used for future predictions. This is the pattern induction procedure used in most data mining algorithms.

Artificial Neural Networks

Artificial neural networks (ANN) have been successfully applied to many function approximation problems in engineering and the social sciences. An ANN mimics the neural system of a brain to learn patterns from examples and uses the learned knowledge to make predictions for new data [9]. The basic data processing unit in a neural net is called a neuron, which is connected to other neurons via synapses. The structure of an ANN refers to the number of neurons and the way they are distributed and connected. To simplify computation, neurons are arranged into layers, and information is transferred from layer to layer. The input layer represents the independent variables in a function approximation problem; the output layer corresponds to the dependent variable(s); layers between the input and output layers are called hidden layers. An ANN with hidden layers is also called a multilayer perceptron (MLP). Without a hidden layer, a simple perceptron has limited learning capability [10]. It has been shown that an MLP can approximate arbitrarily well any continuous decision region, provided there are enough layers and neurons [11]. Learning an ANN from data means finding optimal synaptic weights to fit training data with known input-output pairs. Attention must be paid to the network structure so that the model does not overfit the data. A trained ANN can then be used to predict output values for new inputs.

3. SIMULATION STUDY

To examine informational cascades in sequential decision making under the influence of observational learning, both [3] and [6] presented a simulation study. Assume that two alternative technologies A and B are to be selected by a sequence of individuals. Suppose A is the better technology; thus all private signals are emitted by its pdf, assumed to be the normal distribution N(μ_A, σ²). We assume μ_A = 1, μ_B = 0, σ = 1, and k = 1 in the notation of the observational learning model above. A simulation run consists of 100 sequential decisions. For the WB approach, this includes (1) drawing a signal from the pdf of technology A; (2) making a decision based on the signal, Eq. (2), and Eq. (3); (3) updating the threshold according to the decision made and Eq. (6); and (4) continuing the process until the 100th decision is made. For the DBN approach, this includes (1) drawing a sample from the uniform distribution on (0, 1) to decide between technology A and B according to Eq. (7); (2) updating the conditional probability P(X_{t+1} | X_t, Y_t) by Table 1; and (3) continuing the process until the 100th decision is made.

Because the simulation is based on probabilistic sampling, one run can differ from another substantially. Thus, a total of 1000 runs are conducted to smooth out fluctuations between runs. At the end, the average correct decision rate for each decision position (from 1 to 100) is reported. The average correct decision rate at a position is the number of correct decisions (i.e., choosing A) at that position divided by the 1000 runs. Figure 2 shows that both approaches yield very similar curves of average correct decision rate. Both curves start low, at around .70, and increase to around .95 at the later stage. The correlation between the two sequences of average correct decision rates is .994, and the mean absolute error (MAE) is .03.
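As an illustration, here is a sketch of the DBN simulation loop just described, reusing choose_a_prob() and update_threshold() from the earlier sketch; NumPy is assumed, and the code is again ours rather than the authors'.

```python
# A sketch of the DBN simulation: sample each decision from Eq. (7) with a
# uniform draw, then rescale the threshold per Table 1 / Eq. (6).
import numpy as np

def simulate_run(n=100, k=1.0, rng=None):
    rng = rng or np.random.default_rng()
    beta = k                                   # Eq. (4) with equal priors: beta_1 = k
    decisions = []
    for _ in range(n):
        p_a = choose_a_prob(beta)              # technology A is the true signal source
        chose_a = rng.uniform() < p_a
        decisions.append(1 if chose_a else 0)  # 1 = correct decision (choosing A)
        beta = update_threshold(beta, chose_a)
    return decisions

runs = np.array([simulate_run() for _ in range(1000)])  # 1000 runs x 100 positions
avg_correct = runs.mean(axis=0)   # average correct decision rate at each position
```

The MAE and correlation figures quoted throughout the paper correspond to np.mean(np.abs(c1 - c2)) and np.corrcoef(c1, c2)[0, 1] computed on two such rate curves.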
Other simulation types, including random updating of decision thresholds and cases of ternary decisions, can also be handled with the DBN approach.

[Figure 2. Comparison of the WB and DBN approaches. Axes: average correct rate vs. position number (1-100).]

4. LEARNING PATTERNS OF SEQUENTIAL DECISIONS

The last section presented a simulation study based on the WB and DBN models. We now consider the reverse process of discovering models from data. Since real sequential decisions are hard, if not impossible, to obtain in business, we use simulated data from the DBN approach to learn patterns of sequential decisions under the influence of observational learning.

Training samples

The DBN approach is used to generate training samples for learning patterns of sequential decisions. In total, 1000 observation sequences are output from the simulations. Each observation sequence consists of 100 sequential decisions coded as 1 (for choosing A) or 0 (for choosing B).

Using HMM as a learning tool

An HMM is a special DBN when it is unrolled over time steps. To use an HMM, we need to decide the number of hidden states and the number of observable outcomes. Since there are only two possible decisions (1 or 0), we choose 2 hidden states and 2 emitted outcomes. The 1000 observation sequences of training samples are fed into the Baum-Welch (also called forward-backward) learning algorithm to learn the parameters of an HMM [8]. These parameters include the initial probability of each state, the outcome emission probabilities, and the state transition probabilities. The trained HMM is then used to generate 1000 sequences of simulated outcomes; this simulation is straightforward given the full parameters of an HMM. Each sequence consists of 100 sequential decisions. The average correct decision rate is computed as before and compared with that from the DBN approach.
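The paper does not say which HMM implementation was used. As one plausible reconstruction, the sketch below assumes the third-party hmmlearn package, whose categorical-emission model (named MultinomialHMM in older releases) is trained with exactly this Baum-Welch procedure; `runs` is the 1000 x 100 decision array from the earlier sketch.

```python
# A sketch of Baum-Welch training on the simulated decision sequences,
# assuming the hmmlearn package is available.
import numpy as np
from hmmlearn import hmm

X = runs.reshape(-1, 1)                     # hmmlearn expects stacked sequences...
lengths = [runs.shape[1]] * runs.shape[0]   # ...plus the length of each sequence

model = hmm.CategoricalHMM(n_components=2, n_iter=100)  # 2 hidden states, 2 outcomes
model.fit(X, lengths)                       # Baum-Welch (forward-backward) learning

# Use the trained HMM generatively, as in the text: 1000 sequences of 100 decisions.
sims = np.vstack([model.sample(100)[0].ravel() for _ in range(1000)])
hmm_avg_correct = sims.mean(axis=0)
```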

Using ANN as a learning tool

To use an ANN as a learning tool, we need to set up an input-output correspondence, i.e., input variables and output variables. Using the DBN perspective (Figure 1) as a guideline, we can set up the correspondence β_{t+1} = f(β_t, D_t). Since β_t determines the distribution of D_t according to Eq. (7), we use the probability of choosing A as a surrogate variable. Let p_t denote the probability of choosing A at the t-th position; p_t can be determined from Eq. (7). We then approximate the following function with an ANN:

    p_{t+1} = f(p_t, D_t)    (8)

To prepare training samples for the ANN, we use the average correct decision rate from the DBN simulations to stand in for p_t, i.e., p_t = (number of a decisions at position t) / 1000. The D_t variable is extracted from the 1000 observation sequences of the DBN simulations.

Instead of using the full set of 1000 observation sequences to train a single ANN model, we train 10 ANN models on smaller data sets and average the outputs of these 10 models to make predictions. More specifically, we randomly choose 100 observation sequences from the DBN simulation to train an ANN model. This procedure is repeated 10 times to get 10 ANN models, which are bagged to form the final predictor, in the spirit of a bagging predictor [12].

Since our model in Eq. (8) has only two inputs and one output, we do not need a complicated network structure; one or two hidden layers suffice for our data set. Though the data set is potentially large (100 observation sequences with 100 sequential decisions each produce 9900 input-output pairs), many of the pairs are simply duplicates. After a trial-and-error approach with test data, we decided on a two-hidden-layer structure: the first hidden layer has 4 neurons and the second has 2. Our final MLP thus has 2, 4, 2, and 1 neurons in the respective layers, with the sigmoid function as the activation function.

After the bagging aggregator is trained, it is used to predict the probability p_t in a simulation of sequential decisions. The first decision is simulated using the average correct rate at position 1 from the DBN simulation: a random sample is drawn from the uniform distribution on (0, 1) and compared with this rate to choose technology A or B. After the decision is made, it is plugged into Eq. (8) with the learned bagging ANN predictor to predict the next probability of choosing A. This process continues until the 100th decision is made. Again, 1000 runs of simulation are conducted to calculate the average correct decision rate from the learned ANN model.

5. EXPERIMENTAL RESULTS

In this section, we present the experimental results from different simulation scenarios.

The standard case

In the standard case, we assume k = 1; thus the relative benefit of choosing A or B is equal. The previous simulation study showed that the average correct decision rate increases from around .70 at position 1 to around .95 at position 100. Figure 3 shows the average correct decision rate curves from the DBN, HMM, and ANN models. The DBN simulation was used to generate training samples for the other two to learn from; both HMM and ANN learn their models from the training samples and use the learned model to simulate sequential decisions. Each average correct decision rate curve reports the simulation results using the corresponding trained model.

[Figure 3. Comparison of the DBN, HMM, and ANN models (k = 1). Axes: average correct rate vs. position number (1-100).]

The MAE between DBN and HMM is .03, while the same measure for DBN and ANN is .008. The correlation between the DBN and HMM sequences is .969, and the same measure for DBN and ANN is .98. Thus, ANN has learned the better prediction model for this standard case.
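As with the HMM step, no ANN toolkit is named in the paper. The sketch below is one plausible rendering of the bagged Eq. (8) predictor from Section 4, assuming scikit-learn's MLPRegressor for the 2-4-2-1 network with logistic (sigmoid) activations; runs and avg_correct are the DBN outputs from the earlier sketches.

```python
# A sketch of the bagged MLP predictor for Eq. (8): p_{t+1} = f(p_t, D_t).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
p = avg_correct                        # p_t: average correct rate at position t

models = []
for _ in range(10):                    # 10 models, each trained on 100 random sequences
    idx = rng.choice(len(runs), size=100, replace=False)
    X, y = [], []
    for seq in runs[idx]:
        for t in range(len(seq) - 1):  # each sequence yields 99 (p_t, D_t) -> p_{t+1} pairs
            X.append([p[t], seq[t]])
            y.append(p[t + 1])
    mlp = MLPRegressor(hidden_layer_sizes=(4, 2), activation="logistic", max_iter=2000)
    models.append(mlp.fit(np.array(X), np.array(y)))

def predict_next_p(p_t, d_t):
    """Bagging aggregator: average the outputs of the 10 trained MLPs."""
    x = np.array([[p_t, d_t]])
    return float(np.mean([m.predict(x)[0] for m in models]))
```

The simulated ANN curve is then produced as described in Section 4: start from avg_correct[0], draw a uniform sample to fix each decision, and feed (p_t, D_t) back through predict_next_p until the 100th position.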
Technology B has a higher relative benefit

In this case, we assume k = 10; thus technology B has a higher relative benefit than technology A. This gives individuals less incentive to choose technology A.

[Figure 4. Comparison of the DBN, HMM, and ANN models (k = 10). Axes: average correct rate vs. position number (1-100).]

Figure 4 shows that the rate of choosing technology A is substantially smaller than in the standard case. This is reasonable: because of the higher relative benefit of choosing B over A, an individual must receive a very strong signal in order to choose A. The probability of choosing technology A is therefore small at the beginning. As more individuals select technology A, later individuals increase their belief in technology A through observational learning. The simulations show that the average correct decision rate increases from less than .10 at position 1 to around .60 at position 100. The MAE between DBN and HMM and between DBN and ANN is .018 and .023, respectively. The correlation between the DBN and HMM sequences is .995, while the same measure for DBN and ANN is .99. Therefore, HMM is the better prediction model in this case.

Technology B has a higher relative benefit and only partial sequences are used

In this case, we assume that technology B has a higher relative benefit (k = 5), and only partial sequences from the DBN simulations are used to train the HMM and ANN models: only the first 50 decisions in each observation sequence are used to train the prediction models.

[Figure 5. Comparison of the models (k = 5, only the first 50 decisions used for training). Axes: average correct rate vs. position number (1-100).]

Since the relative benefit of B to A is not as large as in the previous case, we expect individuals to have more incentive to choose technology A. Figure 5 confirms this, with the average correct decision rate rising from around .15 initially to around .80 at position 100. Since only the first 50 decisions are used to train the HMM and ANN, their performance on the second half of the decision sequences is of particular interest. Figure 5 shows that the HMM model performs better than the ANN model on this part of the decision sequences. Overall, the HMM model also produces a smaller MAE (.016 vs. .030) and a higher correlation (.993 vs. .970) than the ANN model.

6. DISCUSSIONS

The experimental results show that HMM has a better capability than ANN for learning patterns of sequential decisions. When k is big, the average correct decision rate curve resulting from the ANN model is much more jagged than that from the HMM model. This is interesting given that the HMM model has no causal link between D_t and β_{t+1}. The DBN model in Figure 1 is the basis for constructing the ANN model in Eq. (8): the current probability of choosing A and the current decision outcome together determine the next probability of choosing A. An HMM, on the other hand, has causal links between hidden states only. Using the transition probabilities, the next state is sampled based on the current state alone; outcomes emitted by the current state have no effect on the sampling of the next state. This seems to contradict the causal model described by the DBN perspective of observational learning.

The jaggedness of the ANN average correct decision rate curve may come from an over-fitted neural network. Because we have a small network structure with a large bundle of data, much of it duplicated, we may have over-fitted the network and produced an overly sensitive predictor. The bagging procedure does not seem to overcome this difficulty. Other prediction algorithms, such as support vector regression with its known capability for overfitting control, may be considered in the future.

7. CONCLUSIONS

Today's corporate managers face challenges in IT adoption with great stakes. Corporate IT has become so powerful and complex that a fair assessment of its merits is difficult. Capital investments in IT are substantial, yet returns on investments often take time to materialize. Conventional word-of-mouth information propagation may work for consumer IT decisions, but not for corporate IT decisions. Though it is usually difficult to obtain the private information that other companies use to make their IT adoption decisions, it is possible to observe what those companies have decided in their IT adoption.
Observational learning theory applies when a person uses observed behavior from others to infer something about the usefulness of the observed behavior. Corporate managers may practice observational learning to help them make better IT adoption decisions. Observational learning is known to create informational cascades, a phenomenon in which an individual's action does not depend on his private information signal. When informational cascades occur, the belief inferred from observational learning overshadows the private information signal that an individual uses to make his decision. Walden and Browne [3] proposed a simulation model to show that informational cascades do not occur when the private information signal is continuous. We presented a DBN perspective of the WB model in [6]; the DBN approach produced simulation results similar to those of the WB approach.

This study focused on learning the DBN model underlying sequential decisions impacted by observational learning. Two machine learning tools were used to learn the DBN. The first, the hidden Markov model, is itself a special case of a DBN. The second, the artificial neural network, is a popular learning algorithm in artificial intelligence. The HMM learning approach does not consider the impact of the current decision (D_t) on the sampling of the next state, and it uses a limited number of hidden states to represent continuous information signals. The ANN learning approach, on the other hand, uses the DBN perspective to define a functional form for approximation; its continuous output variable matches the type of private information signal in Walden and Browne [3]. The experimental results show that HMM has a better learning capability than ANN in our study. In the future, we plan to run more tests with different learning algorithms and diverse training samples. Learning patterns of sequential decisions constitutes the reverse process of the simulation studies presented in [3, 6]. Together, simulation studies and pattern learning can help us better understand how observational learning impacts sequential decisions.

Acknowledgements: This research has been supported in part by a grant from the National Science Council of Taiwan under contract number NSC99-2410-H-366-006-MY2.

8. REFERENCES

[1] F.P. Brooks, The Mythical Man-Month: Essays on Software Engineering, Reading, MA: Addison-Wesley, 1975.
[2] E. Brynjolfsson and L. Hitt, "Paradox Lost? Firm-level Evidence on the Returns to Information Systems Spending," Management Science, Vol. 42, No. 4, 1996, pp. 541-558.
[3] E.A. Walden and G.J. Browne, "Sequential Adoption Theory: A Theory for Understanding Herding Behavior in Early Adoption of Novel Technologies," Journal of the Association for Information Systems, Vol. 10, No. 1, 2009, pp. 31-62.
[4] A. Bandura, Social Learning Theory, New York: General Learning Press, 1977.
[5] S. Bikhchandani, D. Hirshleifer, and I. Welch, "A Theory of Fads, Fashion, Custom, and Cultural Change as Informational Cascades," The Journal of Political Economy, Vol. 100, No. 5, 1992, pp. 992-1026.
[6] S. Doong and S. Ho, "Construct a Sequential Decision Model: A Dynamic Bayesian Network Perspective," Proceedings of the 44th Annual Hawaii International Conference on System Sciences (HICSS), 2011.
[7] D. Green and J. Swets, Signal Detection Theory and Psychophysics, New York: Wiley, 1966.
[8] L.R. Rabiner, "A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition," Proceedings of the IEEE, Vol. 77, No. 2, 1989, pp. 257-286.
[9] I.H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, San Francisco: Morgan Kaufmann Publishers, 2005.
[10] M.L. Minsky and S.A. Papert, Perceptrons, Cambridge, MA: MIT Press, 1969.
[11] A.R. Gallant and H. White, "On Learning the Derivatives of an Unknown Mapping with Multilayer Feedforward Networks," Neural Networks, Vol. 5, No. 1, 1992, pp. 129-138.
[12] L. Breiman, "Bagging Predictors," Machine Learning, Vol. 24, No. 2, 1996, pp. 123-140.