Effectiveness of Implicit Rating Data on Characterizing Users in Complex Information Systems

Seonho Kim, Uma Murthy, Kapil Ahuja, Sandi Vasile, and Edward A. Fox
Department of Computer Science
Virginia Tech
Blacksburg, Virginia 24061 USA
{shk, umurthy, kahuja, sandi, fox}@vt.edu

Abstract. Most user focused data mining techniques involve purchase pattern analysis, targeted at strictly-formatted database-like transaction records. Most personalization systems employ explicitly provided user preferences rather than implicit rating data obtained automatically by collecting users' interactions. In this paper, we show that in complex information systems such as digital libraries, implicit rating data can help to characterize users' research and learning interests, and can be used to cluster users into meaningful groups. Thus, in our personalized recommender system based on collaborative filtering, we employ a user tracking system and a user modeling technique to capture and store users' implicit ratings. Also, we describe the effects (on community finding) of using four different types of implicit rating data.

1 Introduction

As two-way World Wide Web services such as blogs, wikis, online journals, and online forums became popular, more people were able to express themselves and play more active roles in online societies [1, 2]. This trend changed WWW users from passive anonymous observers to visible individuals with personalities. Such users, in increasing numbers, are patrons of digital libraries (DLs), e.g., researchers and distance learners. Studying users of DLs is providing opportunities for research on collaborative filtering, personalization, user modeling, and recommender systems. Most such studies consider users' ratings on the information they select, as well as users' preferences (e.g., on research areas, majors, learning topics, or publications), which are entered explicitly [3, 4, 5]. However, obtaining explicit rating data is difficult.
Further, terminology associated with the broad topical coverage of most DLs poses serious challenges regarding the identification of users' research and learning interests. Even people with the same research interests express those interests with different terms, while the same terms sometimes represent different research fields. For these reasons, we need other evidence to help distinguish users' research interests, without depending on their written comments. Thus, Nichols [6] and the GroupLens team [7] showed the great potential of implicit rating data when it is combined with existing systems to form a hybrid system. Further, we utilized users' implicit rating data for collaborative filtering in DLs [8]. However, the effectiveness of implicit rating methods still remains unproven. Consequently, we explore user tracking and implicit ratings in Section 2 and then propose hypotheses about the use of such data in Section 3. Section 4 describes our initial experiments and their results, while Section 5 concludes the discussion and outlines future work.

2 User Tracking and Implicit Ratings

Gonçalves et al. proposed an XML based log standard for DLs [9] which helped pave the way for this study. However, originally it emphasized the interoperability and reusability of logging, based on a minimal DL metamodel. A more detailed characterization and analysis of user societies and scenarios is needed. Accordingly, we developed a user tracking system [8, 10] to collect histories of users' interactions. The user tracking system, which collects and sends all phrases and sentences clicked and typed by a user to a server, is embedded in the interface, so that the user is not aware of its existence. Instead of using HTTP web logs like most DLs, we employed a user modeling technique to record interactions, in XML. Each user model also contains demographic information, personal interests, and similarities with research interest groups in DLs. The most distinctive feature of the model is that it contains a user's implicit ratings along with explicit ratings and statistics, as needed for collaborative filtering and recommendation. This not only increases the completeness, interoperability, and reusability of user model data, but also decreases the complexity of the process. User interests are entered implicitly by using a document clustering algorithm, LINGO [11, 12], and a user tracking interface. Also, sending a query, selecting or skipping an anchor, and expanding a node all are considered implicit ratings.

Figure 1: A DGG for "User Activity" attribute
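To make the idea of implicit ratings concrete, the following minimal sketch tallies tracked interaction events per phrase. It is only an illustration: the event names, the `ImplicitRatingLog` class, and its methods are hypothetical and are not taken from the tracking system described here, which records interactions in XML user models.

```python
from collections import Counter

# Hypothetical event names, chosen to mirror the activities listed in the text.
IMPLICIT_EVENTS = ("query", "select", "skip", "expand")


class ImplicitRatingLog:
    """Tallies how often each phrase is involved in each tracked activity."""

    def __init__(self):
        # One frequency counter per activity type.
        self.counts = {event: Counter() for event in IMPLICIT_EVENTS}

    def record(self, event, phrase):
        """Record one tracked interaction; unknown event types are rejected."""
        if event not in self.counts:
            raise ValueError(f"unknown event type: {event!r}")
        self.counts[event][phrase] += 1

    def frequency(self, event, phrase):
        """Return how many times `phrase` was seen with `event` (0 if never)."""
        return self.counts[event][phrase]


# Made-up interactions, for demonstration only.
log = ImplicitRatingLog()
log.record("query", "user modeling")
log.record("expand", "distance learning")
log.record("expand", "distance learning")
log.record("skip", "computer security")
print(log.frequency("expand", "distance learning"))
```

Keeping one counter per activity type is a design choice of this sketch; it makes it easy to treat skipped and expanded phrases as separate kinds of evidence later on.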

An implicit rating is captured from a user activity. Figure 1 is our Domain Generalization Graph (DGG) for the user activity attribute in our model; DGGs are more commonly used in connection with data mining targeted on sales or transaction data [13] to represent the comprehensive relations between attributes. Each node in the graph represents a partition of the values that can be used to describe the attributes. The discrete attribute "frequency" is independent of other attributes of user activity. Edges between adjacent nodes describe the generalization relations between the nodes. Each user activity has a direction, where "rating" means the user gives some feedback to the system and "perceiving" means the user doesn't give feedback to the system. Regarding intention, "implicit" means the user gives feedback implicitly while "explicit" means feedback is given explicitly. Thus, sending a query and reading a title are not rating, since we don't give any feedback. However, expanding and skipping a cluster are rating, by which we indicate whether the cluster is interesting or not. For an example of intention, note that entering a query is implicit, because our purpose is not to characterize ourselves. However, entering user information or preferences is explicit.

3 Hypotheses

To assess the effectiveness of implicit rating data for characterizing DL patrons according to their research interests, we developed a special interface for the CITIDEL system [14], part of the NSF-funded National Science Digital Library. Our interface was based on Carrot2 [11], coupled with our user tracking system, which together support and record selection of clusters (i.e., the output of the system) [8]. Hence, we test three hypotheses about proper human-computer interaction and document clustering.

1. H1: For any serious user with their own research interests and topics, show repeated (consistent) output for the document collections referred to by the user.
2. H2: For serious users who share common research interests and topics, show overlapped output for the document collections referred to by them.
3. H3: For serious users who don't share any research interests and topics, show different output for the document collections referred to by them.

4 Experiments and Data Analysis

We collected implicit ratings from students at both the Ph.D. and Master's level in Computer Science at Virginia Tech; CITIDEL [14] contains documents in the computing field. 8 of the students successfully completed the experiment and so were selected to be analyzed for this study. Each subject was asked to perform searches with CITIDEL using 0 queries in his/her (hereafter, "her") research field and allowed to browse the search results to find interesting documents. All subjects were required to register into the system and so provided explicit preferences. By session end, each subject had an XML formatted user model in our system. The recommender, which is a software module in charge of collaborative filtering, manages and updates models whenever users log out.

Figure 2 is a simplified sample of a user model. The model consists of four highest level elements (in addition to a log of queries submitted): 1) userinfo and userinterests (not expanded) are for explicit answers to a questionnaire, 2) community is for the communities of the user found by the recommender, 3) proposed is for document topics which are shown to the user and skipped, and 4) selected is for document topics which are selected or expanded by the user. Therefore, (1) is explicit rating data, (2) reflects computer inference, and (3) and (4) are implicit rating data. Each entry has accompanying statistics (e.g., frequencies, probabilities).

    <?xml version="1.0"?>
    - <user>
        <userid>seonho</userid>
      + <userinfo>                (1)
      + <userinterests>
      - <community>               (2)
          <member score="0.743">sig00</member>
          <member score="0.50">sig004</member>
          <member score="0.83">sig003</member>
        </community>
      - <query>
          <item freq="3">educational Library</item>
          <item freq="">user modeling</item>
          <item freq="">log System</item>
        </query>
      - <proposed>                (3)
          <item freq="3">curriculum in Computer</item>
          <item freq="3">distance learning</item>
          <item freq="">computer Communication</item>
          <item freq="">computer and Computer Education</item>
          <item freq="">computer Security</item>
          <item freq="">computer Integrated Manufacturing</item>
          <item freq="">computer and Public</item>
          <item freq="">computer Anxiety</item>
          <item freq="">data Parallel</item>
          <item freq="">ieee Computer Society</item>
        </proposed>
      - <selected>                (4)
          <item freq="3">curriculum in Computer</item>
          <item freq="">distance learning</item>
          <item freq="">computer and Computer Education</item>
          <item freq="">computer and Public</item>
          <item freq="">ieee Computer Society</item>
        </selected>
      </user>

Figure 2: An example of user model generated by the recommender

We employed hypothesis testing [15] as follows. Because the data collected from the user tracking system is independent and identically distributed (i.i.d.), we use inference processes to verify hypotheses and estimate properties, starting with HT1.

HT1: Hypothesis testing and confidence intervals for H1.

1. $H_0$ (null hypothesis of H1): The mean ($\mu$) of the frequency of document topics proposed by the document clustering algorithm is NOT consistent ($\mu_0 = $ ) for a user. Hypothesis testing about $H_0$: $\mu = \mu_0$ vs. $H_1$: $\mu > \mu_0$.
2. Conditions: 95% confidence (test size $\alpha = 0.05$), sample size < 5, unknown standard deviation $\sigma$, i.i.d. random sample from a normal distribution, so a t-test (estimated z-score) is used.

3. Test statistics: sample mean $\bar{y} = .49$ and sample standard deviation $s = 0.77$ are observed from the experiment.
4. Rejection rule: reject $H_0$ if $\bar{y} > \mu_0 + z_{\alpha/2}\,\sigma/\sqrt{n}$.
5. From the experiment, $\bar{y} = .49 > \mu_0 + z_{\alpha/2}\,\sigma/\sqrt{n} = .0934$.
6. Therefore the decision is to reject $H_0$ and accept H1. The 95% confidence interval for $\mu$ is $.097 \le \mu \le .56$, and the P-value = 0.0039.

Although we separated H2 and H3 as different hypotheses to emphasize the ideas, they can be understood as identical and can be proven and estimated together by one hypothesis test, with confidence intervals as described below. So, we consider HT2.

HT2: Hypothesis testing and confidence intervals for H2.

1. $H_0$ (null hypothesis of H2): A user's average ratio of overlapped topics with other persons in her groups over her total topics which have been referred, $\mu_1$, is the same as the average ratio of overlapped topics with other persons out of her groups over her total topics which have been referred, $\mu_2$. Hypothesis testing about $H_0$: $\mu_1 = \mu_2$ vs. $H_2$: $\mu_1 > \mu_2$. Because a user can belong to multiple groups, population means $\mu_1$ and $\mu_2$ are calculated as in the formulas below, respectively,

$$\mu_1 = \frac{\sum_{k=1}^{G} \sum_{i=1}^{n_k} \sum_{j=1,\, j \neq i}^{n_k} O_{i,j}}{\sum_{k=1}^{G} n_k (n_k - 1)}, \qquad \mu_2 = \frac{\sum_{k=1}^{G} \sum_{i=1}^{n_k} \sum_{j=1,\, j \notin G_k}^{N} O_{i,j}}{\sum_{k=1}^{G} n_k (N - n_k)}$$

where $O_{i,j}$ is user $i$'s topic ratio overlapped with user $j$'s topics over $i$'s total topics, $G$ is the total number of user groups in the system, $n_k$ is the total number of users in group $k$ (with $j \notin G_k$ ranging over users outside group $k$), and $N$ is the total number of users in the system. One instance of random variables in this testing, one user's overlapped topic ratio with other persons in her group and overlapped topic ratio with other persons out of her group, is illustrated in Figure 3.
2. Conditions: 95% confidence (test size $\alpha = 0.05$), two i.i.d. random samples from a normal distribution, for two sample sizes $n_1$ and $n_2$, $n_1 = n_2 < 5$, standard deviations of each sample $\sigma_1$ and $\sigma_2$ are unknown, so a two-sample Welch t-test is used.
3. Test statistics: Welch score

$$w_0 = \frac{\bar{y}_1 - \bar{y}_2}{\sqrt{s_1^2/n_1 + s_2^2/n_2}}$$

where $\bar{y}_1$, $\bar{y}_2$ are sample means of each sample and $s_1$, $s_2$ are sample standard deviations of each sample.
4. Rejection rule: reject $H_0$ if $w_0 > t_{df_S,\alpha}$, where $t_{df_S,\alpha}$ refers to the t-cutoff of the t-distribution table, and $df_S$ is Satterthwaite's degrees of freedom approximation [15], which is calculated by

$$df_S = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$

5. From the experiment, $\bar{y}_1 = 0.03$, $\bar{y}_2 = 0.05$, $df_S = 6.$ and $w_0 = 4.64 > t_{6.,\,0.05} = .745$.
6. Therefore the decision is to reject $H_0$ and accept H2. The 95% confidence intervals for $\mu_1$, $\mu_2$, and $\mu_1 - \mu_2$ are $0.0659 \le \mu_1 \le 0.40$, $0.083 \le \mu_2 \le 0.047$, and $0.0468 \le \mu_1 - \mu_2 \le 0.63$, respectively, and the P-value = 0.0003.
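The quantities used in HT2 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the experiment: the function names are hypothetical, each user's topics are modeled as a plain set of strings, and all sample values are made up. The sketch computes the directed overlap ratio between two users, then the Welch score and the Satterthwaite degrees of freedom using the standard textbook formulas.

```python
from math import sqrt
from statistics import mean, variance


def overlap_ratio(topics_i, topics_j):
    """Directed ratio O_ij: share of user i's topics that user j also has."""
    if not topics_i:
        return 0.0
    return len(topics_i & topics_j) / len(topics_i)


def welch_test(sample1, sample2):
    """Return (w0, df_s): Welch score and Satterthwaite degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    # Squared standard errors s^2 / n of each sample mean.
    v1 = variance(sample1) / n1
    v2 = variance(sample2) / n2
    w0 = (mean(sample1) - mean(sample2)) / sqrt(v1 + v2)
    df_s = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return w0, df_s


# Made-up topic sets: users a and b share a group, user c does not.
a = {"distance learning", "user modeling", "log system", "curriculum"}
b = {"user modeling", "curriculum", "digital library"}
c = {"computer security"}
print(overlap_ratio(a, b), overlap_ratio(b, a), overlap_ratio(a, c))

# Made-up in-group and out-group overlap ratios, one value per user.
in_group = [0.30, 0.25, 0.40, 0.35]
out_group = [0.05, 0.10, 0.00, 0.05]
w0, df_s = welch_test(in_group, out_group)
print(round(w0, 3), round(df_s, 2))
```

To complete the test, the score `w0` would be compared against a t cutoff for `df_s` degrees of freedom, taken from a t-distribution table or a statistics library. Note that `overlap_ratio(a, b)` and `overlap_ratio(b, a)` generally differ, because each ratio is normalized by the first user's own topic count.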

Although the confidence intervals found in HT1 and HT2 are broad because of a relatively small set of participants, all values in the intervals still support our hypotheses. Therefore, supported by HT1 and HT2, we argue statistically that our hypotheses are correct. We conclude that each DL user will engage in consistent activities in response to consistent output of the DLs according to their research interests and learning topics. Also, DL users who share common research interests and learning topics will share the same output from the DLs as well. Therefore, we conclude that using implicit rating data is highly effective in characterizing users, according to our experiment.

Figure 3: An instance of the random variables for user a's in-group overlappings and out-group overlappings, in the hypothesis testing HT2. All overlapping ratios are directed. "ab" means the overlapping ratio from user a to user b. Because the ratio is the number of topics overlapped over the total number of topics in her user model, ab ≠ ba. In this case, the in-group overlapping ratio of user a is the average of ab, ac, and ad, and the out-group overlapping ratio is the average of ae and af.

Studies on the effect of different types of data on the performance of user cluster mining have highlighted a basic problem caused by the variety of academic terms, as we mentioned in the introduction section. However, we can explore user cluster mining more objectively, because we can obtain user groups without depending on users' subjective answers to questionnaires about their research interests or preferences [8]. We conducted an ANOVA test to compare the effectiveness of four different user rating data types on the performance of user cluster mining by using implicit rating data and user groups collected from experiments in [8]. Figure 4 shows the result; ANOVA statistics F(3, 64) = 4.86, p-value = 0.004 and the least significant difference (LSD) = .753. Topics mean noun phrases generated by LINGO. Terms indicate single nouns contained in the original documents, queries, and topics.
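For readers unfamiliar with the ANOVA comparison, the sketch below computes a one-way ANOVA F statistic from scratch. It is an illustration only: the function name and the four small samples are hypothetical, and it assumes a plain one-way design with one group of observations per rating data type. Under that assumption, the reported F(3, 64) corresponds to four groups and 68 observations in total.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean([x for g in groups for x in g])
    group_means = [mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up distinguishability scores for the four rating data types.
samples = {
    "selected topics": [3.1, 2.8, 3.5, 3.0],
    "proposed topics": [3.4, 3.2, 3.6, 3.3],
    "selected terms": [2.9, 3.0, 2.7, 3.1],
    "proposed terms": [1.2, 1.5, 1.1, 1.4],
}
f_stat, df_b, df_w = one_way_anova(list(samples.values()))
print(round(f_stat, 2), df_b, df_w)
```

A large F value indicates that at least one group mean differs from the others; identifying which one requires a follow-up comparison such as the LSD used in the text.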
Although we gained a relatively large LSD because of the small number of participants, we still found statistical significance in this test. Figure 4 shows that the test using proposed terms performs significantly worse. Except for the test using the proposed terms, the other three tests, which use selected topics, proposed topics, and selected terms, don't show statistically significant differences from each other, even though the test using proposed document topics shows slightly higher performance. We believe that this is because using proposed terms causes overly sensitive overlapping in both the in-group testing and the out-group testing (to distinguish proper relations between users). This leads us to conclude that term-frequency based approaches to user cluster mining are not as efficient as document-topic based approaches using user rating and document clustering.

[Figure 4 plot. X-axis: Implicit Rating Data Used (selected topics, proposed topics, selected terms, proposed terms). Left y-axis: Overlapping Ratio. Right y-axis: Average of BIU / BOU. Series: Between In-Group Users (BIU), Between Out-Group Users (BOU), and Distinguishability: Average of BIU/BOU.]

Figure 4: Effects of implicit data type used, on the average topic overlapping ratios, between in-group users and between out-group users.

5 Conclusions and Future Work

We designed and implemented a user modeling and user tracking system for a DL to capture and maintain a user's ratings and preferences. We then showed, through two statistical hypothesis tests, that implicit rating data in a complex information system is closely related to the user's research interests, learning topics, and preferences. The test results support the claim that implicit ratings are good information for studies on user analysis, personalization, collaborative filtering, and recommending. Finally, we tested the effect of different types of rating data on the performance of user cluster mining and found that using proposed terms performs worst, because of the sensitive overlapping ratio of appearance. From this test we conclude that a user's activities of selecting something on the interface, and extracting document topics of returned documents from searches with a document clustering algorithm, represent the user's characteristics. These results are more meaningful in complex information systems like digital libraries because such systems have dynamic contents and sparse rating data, and thus implicit rating data is more feasible to collect than explicit rating data. Future work will include more advanced data mining techniques using implicit rating data and a wider deployment of a visualization front-end for CITIDEL [16].
6 Acknowledgements

We thank the people and organizations working on CITIDEL, student participants in our experiments, developers of the LINGO algorithm, and developers of the CitiViz [16] visualization tool. Thanks go to the National Science Foundation for support of grants NSF DUE-0679, DUE-074, IIS-0307867, and NSF IIS-035579.

References

1. Ravi Kumar, Jasmine Novak, Prabhakar Raghavan and Andrew Tomkins: Structure and Evolution of Blogspace. In Communications of the ACM, Vol. 47, No. 12, December 2004, 35-39
2. Cass R. Sunstein: Democracy and Filtering. In Communications of the ACM, Vol. 47, No. 12, December 2004, 57-59
3. Thomas W. Malone, Kenneth R. Grant, Franklyn A. Turbak, Stephen A. Brobst, and Michael D. Cohen: Intelligent information sharing systems. In Communications of the ACM, Vol. 30, No. 5, 1987, 390-402
4. David M. Nichols, Duncan Pemberton, Salah Dalhoumi, Omar Larouk, Clair Belisle and Michael B. Twidale: DEBORA: Developing an Interface to Support Collaboration in a Digital Library. In Proceedings of the Fourth European Conference on Research and Advanced Technology for Digital Libraries (ECDL 00), Lisbon Portugal, September 2000, 39-48
5. Kai Yu, Anton Schwaighofer, Volker Tresp, Xiaowei Xu and Hans-Peter Kriegel: Probabilistic Memory-based Collaborative Filtering. In IEEE Transactions on Knowledge and Data Engineering, Vol. 16, No. 1, 2004, 56-69
6. David M. Nichols: Implicit Rating and Filtering. In Proceedings of 5th DELOS Workshop on Filtering and Collaborative Filtering, Budapest Hungary, November 1997, 3-36
7. Joseph A. Konstan, Bradley N. Miller, David Maltz, Jonathan L. Herlocker, Lee R. Gordon and John Riedl: GroupLens: Applying Collaborative Filtering to Usenet News. In Communications of the ACM, Vol. 40, No. 3, 1997, 77-87
8. Seonho Kim and Edward A. Fox: Interest-based User Grouping Model for Collaborative Filtering in Digital Libraries. In Proceedings of the 7th International Conference on Asian Digital Libraries (ICADL 04), Shanghai, China, December 2004. In Springer Lecture Notes in Computer Science 3334, 533-54
9. Marcos André Gonçalves, Ming Luo, Rao Shen, Mir Farooq and Edward A. Fox: An XML Log Standard and Tools for Digital Library Logging Analysis. In Proceedings of Sixth European Conference on Research and Advanced Technology for Digital Libraries, Rome, Italy, September 2002, 6-8
10. Kapil Ahuja, Uma Murthy and Sandi Vasile: In Virginia Tech report, available at http://collab.dlib.vt.edu/ruwiki/wiki.pl?mmprojectusermod, 2004
11. Carrot2 Project, A Research Framework for experimenting with automated querying of various data sources, processing search results and visualization, available at http://www.cs.put.poznan.pl/dweiss/carrot2/, 2005
12. Stanisław Osiński and Dawid Weiss: Conceptual Clustering Using Lingo Algorithm: Evaluation on Open Directory Project Data. In Advances in Soft Computing, Intelligent Information Processing and Web Mining, Proceedings of the International IIS: IIPWM 04 Conference, Zakopane Poland, 2004, 369-378, available at http://www.cs.put.poznan.pl/dweiss/site/publications/download/iipwm-osinski-weiss-2004-lingoeval.pdf
13. Aaron Ceglar, John Roddick and Paul Calder: Guiding Knowledge Discovery Through Interactive Data Mining. In Managing Data Mining Technologies in Organizations: Techniques and Applications, Idea Group Publishing, 2003, 45-87
14. CITIDEL: Available at http://www.citidel.org/, 2005
15. R. Lyman Ott and Michael Longnecker: An Introduction to Statistical Methods and Data Analysis, Fifth Edition, Wadsworth Group, 2001
16. Saverio Perugini, Kathleen McDevitt, Ryan Richardson, Manuel Perez-Quiñones, Rao Shen, Naren Ramakrishnan, Chris Williams and Edward A. Fox: Enhancing Usability in CITIDEL: Multimodal, Multilingual, and Interactive Visualization Interfaces. In Proceedings of the Fourth ACM/IEEE Joint Conference on Digital Libraries (JCDL 04), Tucson Arizona, June 2004, 35-34