Applying author co-citation analysis to user interaction analysis: a case study on instant messaging groups

DOI 10.1007/s11192-014-1314-7

Rongying Zhao · Bikun Chen

Received: 20 October 2013
© Akadémiai Kiadó, Budapest, Hungary 2014

Abstract Author co-citation analysis (ACA) is an important method for discovering the intellectual structure of a given scientific field. There is sufficient experience to suggest that ACA will work with almost any user data that lends itself to co-occurrence, yet most current research still relies on data from the scientific literature. In this study, in order to provide useful information for better enterprise management, the idea and method of ACA were applied to analyze the information interaction intensity and contents of enterprise web users. First, the development of ACA is briefly introduced. Then the sample data and method used in this study are given. The instant messages of three QQ groups of a Chinese company were selected as the raw data, and the concepts and model of user interaction intensity (UII) were proposed with reference to ACA theory. Social network analysis, combined with in-depth interviews, was used to analyze the information interaction intensity and contents of enterprise users. Operationally, Excel, Ucinet, Pajek, Netdraw and VOSviewer were combined to analyze them quantitatively and visually. Finally, it is concluded that the UII model is relatively reasonable and can measure the information interaction intensity and contents of enterprise web users well.

Keywords Webometrics · Usage metrics · Author co-citation analysis · Instant messaging · Social network analysis · Information visualization · Enterprise management

JEL Classification C92
Mathematics Subject Classification 91C20

R. Zhao · B. Chen (✉)
School of Information Management, Research Center for China Science Evaluation, The Center for the Studies of Information Resources, Wuhan University, Wuhan 430072, China
e-mail: chenbikun2011@whu.edu.cn

Introduction

Author co-citation analysis (ACA) was first introduced by White and Griffith (1981). Different researchers have applied it to detect the intellectual structure of a given scientific field. For example, White and McCain (1998) used traditional techniques (multidimensional scaling, hierarchical clustering and factor analysis) to display the specialty groupings of 120 highly cited information scientists. White (2003) used another technique, pathfinder networks (PFNETs), to remap the paradigmatic information scientists with White and McCain's raw data from 1998. Jevremov et al. (2007) mapped personality psychology as a research field. Osareh and McCain (2008) studied the structure of Iranian chemistry research.

Some researchers then extended ACA from the traditional citation databases to the Web environment. Leydesdorff and Vaughan (2006) started exploratory research by selecting 24 information science authors in the web environment with Google Scholar. Qiu and Ma (2009) and Ma et al. (2009) conducted studies of information science scholars in China with the Chinese Google Scholar. The data sources of the studies above are scientific literature databases, such as ISI Web of Knowledge, Google Scholar, China National Knowledge Infrastructure (CNKI) and Chinese Social Sciences Citation Index (CSSCI).

ACA has also been applied in Webometrics in recent years. Zuccala (2006) compared ACA and web colink analysis (WCA), taking mathematics as the subject. She stated that although the practice of ACA might be used to inform a WCA, the two techniques did not share many elements in common. The most important departure between them existed at the interpretive stage, when ACA maps became meaningful in light of citation theory and WCA maps required interpretation based on hyperlink theory. Vaughan and You (2010) proposed a new Webometrics concept, web co-word analysis, to measure the relatedness of organizations by using data from Google and Google Blogs.
Wang et al. (2011) studied the song/singer co-collection relationships of online music web users with reference to co-citation analysis theory.

Previous research on the analysis and practice of ACA has been meaningful and has covered traditional citation databases, Google Scholar, the Google search engine, Google Blogs, online music websites and so on. There is sufficient experience with co-occurrence techniques to suggest that they will work with almost any data that lends itself to co-occurrence. As mentioned above, however, most current research still relies on data from the scientific literature. This study aims to apply the idea and method of ACA to analyze the information interaction intensity and contents of enterprise web users and to provide useful information for better enterprise management. A new kind of web user data, the instant messages of a Chinese company, was therefore selected as the raw data.

Why were these instant messages chosen to be mapped? Instant messaging products are very popular in daily life, facilitate communication and are perceived as an essential part of it, so it is meaningful and interesting to study this kind of social media. Social network analysis (SNA) is the classical method for studying social media, but its key step is to set a standard for measuring the relation between one member and the other members. ACA by nature characterizes the relationships between one member and other members, which satisfies the need to measure the relationships among the users of instant messaging. That is why ACA, which is often used for mapping, was chosen as the technique.

In China, Tencent QQ is the most popular instant messaging product (detailed information about Tencent Inc. can be found at this portal: http://www.tencent.com/en-us/index.shtml). The QQ group is one of the typical applications launched by Tencent QQ. A QQ group allows a group of people with the same interests, job, company or department to chat instantaneously about certain topics.
It also provides the users with other

services: group BBS, group albums, shared files, group homepages and so on.

Based on the raw data, the user interaction intensity (UII) model was proposed to measure user information interaction intensity with reference to ACA theory. The SNA method and mapping and clustering techniques were applied to detect the user information interaction intensity and to analyze the user information interaction contents. Additional in-depth interviews were conducted to reinforce the results.

Data and method

Data

The sample data were derived from Tencent QQ groups in an enterprise in China. The enterprise focuses on the development and maintenance of computer hardware and software, broadband networks, web sites, telephone networks and television networks (detailed information can be obtained at this portal: http://www.pmcc.com.cn/). It has about 300 employees and four departments: the software department, the system integration department, the system security department and the marketing department. The software department has 20 employees: one manager, one deputy manager, three technical directors and fifteen ordinary staff.

In the enterprise there are a variety of network relationships, which can be classified into formal and informal networks. A formal network is driven and formed by enterprise tasks and can be managed by the enterprise; it is the specific reflection of the organizational structure. An informal network is formed spontaneously by the employees and is not constrained by enterprise tasks; it is loose, unorganized, varied and difficult to maintain (Xu 2011). In addition, there are some semi-formal networks in the enterprise, which lie between the formal and the informal ones. The in-depth interview survey conducted for this study found three main kinds of QQ groups in the company: department groups, project team groups within a department, and yearly groups of new employees.
Therefore, a simple stratified sampling method was applied to select sample data from the three kinds above: the software department group (group A), the group of a project team in the software department (group B) and the group of 2011 new employees (group C). According to the theories above, group A is formed by formal organization, group B by semi-formal organization and group C by informal organization. The instant messages of groups A, B and C from October 1st, 2011 to February 29th, 2012 were selected; they were provided by several instant messaging group users in the company. The sample data were then cleaned by deleting invalid and redundant messages. The final sample data are summarized in Table 1 (to protect the privacy of the enterprise members, each member is identified by a number).

Table 1 shows certain relationships among the three instant messaging groups. Groups A and B have five common members: 14, 15, 16, 17 and 18. Groups B and C have three common members: 16, 17 and 18. Groups A and C have four common members: 13, 16, 17 and 18.

In terms of user interaction contents, QQ group instant messages are composed of different topics, just like conversations in daily life. How can the topics in QQ group instant messages be recognized? Every topic has a time span, so it is reasonable to cut instant messages into topics according to the date and time of the messages. In this study, if the time interval between one message and the next is 30 min or more, the messages are cut apart at that point. This segmentation method stems from the following hypothesis: within half an hour or more, if no member in the instant messaging

group speaks a word, a topic is over. Some instant messages are probably misclassified into the next topic by this segmentation method, mainly because someone may say something that belongs to the previous topic more than half an hour later. Every segmented topic was therefore checked against its context. In the end, the instant messages of groups A, B and C were cut into 41, 131 and 41 topics respectively (shown in Table 1). In addition, member 6 has only two messages and no contact with any other member in the instant messaging group. For the purposes of the analysis below, member 6 is treated as having joined only one topic.

Table 1 Basic statistics of three groups
QQ group   Message number   Topic number   Member number   Member ID
A          258              41             18              1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18
B          2184             131            6               14, 15, 16, 17, 18, 19
C          452              41             21              13, 16, 17, 18, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36

Concepts of user interaction intensity

White and Griffith (1981) summarized that the mapping of a particular area of science can be done using authors as units of analysis and the co-citations of pairs of authors as the variable that indicates their distances from each other. The analysis assumes that the more two authors are cited together, the closer the relationship between them. Co-citation of authors results when someone cites any work by any author along with any work by any other author in a new document of his own. Based on the descriptions above, the concepts of UII are proposed. Specifically, the concepts of UII rely on the following hypothesis: different members in an instant messaging group participate in a certain topic because they are interested in the topic or because they are familiar with each other and willing to exchange information.
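The 30-minute segmentation rule described under Data can be sketched as follows. This is only an illustrative sketch, not the authors' procedure (the study used manual checking after segmentation); the function name `segment_topics`, the `(timestamp, user)` tuple layout and the sample log, which is modelled on the timestamps shown in Fig. 1, are assumptions.

```python
from datetime import datetime, timedelta


def segment_topics(messages, gap=timedelta(minutes=30)):
    """Split chronologically sorted (timestamp, user) pairs into topics.

    A new topic starts whenever the silence since the previous message
    is `gap` or longer (30 minutes in the study).
    """
    topics = []
    for stamp, user in messages:
        if topics and stamp - topics[-1][-1][0] < gap:
            topics[-1].append((stamp, user))   # same topic continues
        else:
            topics.append([(stamp, user)])     # first message, or silence >= gap
    return topics


# Hypothetical log, modelled on the sample shown in Fig. 1.
raw = [
    ("2011/11/15, 15:18:57", "User One"),
    ("2011/11/15, 15:22:28", "User Two"),
    ("2011/11/15, 15:24:22", "User One"),
    ("2011/11/15, 15:35:01", "User Three"),
    ("2011/11/15, 15:35:39", "User Three"),
    ("2011/11/16, 17:01:52", "User Three"),
]
messages = [(datetime.strptime(t, "%Y/%m/%d, %H:%M:%S"), u) for t, u in raw]
topics = segment_topics(messages)
print([len(t) for t in topics])  # prints [5, 1]
```

The sketch only automates the time-based cut; the context check that the study applies to each segmented topic would still have to be done by hand.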
So, in this study, a topic cut from QQ group instant messages can be seen as a journal article, and any users included in the topic can be perceived as the authors of cited references (shown in Fig. 1). Since ACA uses the author co-citation count as a measure of the relatedness of authors' research, the concepts of UII proposed in this study can be viewed as another application of the concepts of ACA. However, the most important difference between them exists at the interpretive stage: ACA becomes meaningful in terms of citation theory, whereas UII requires interpretation based on user information behavior theory and social network theory.

Standard formula of UII: UII is defined as the relation between one member and any other member and is denoted by $\phi$; $W$ denotes a certain topic, and $i$ and $j$ denote any two members in the instant messaging group. The intensity $\phi_{ij}$ between members $i$ and $j$ is the sum, over every topic $W$, of the minimum of the frequencies $W_i$ and $W_j$ with which members $i$ and $j$ occur in that topic (Wang 2011):

$$\phi_{ij} = \sum_{W} \min(W_i, W_j)$$

McCain (1990) summarized the steps in ACA: selection of the author set, retrieval of co-cited author counts, compilation of the raw co-citation matrix, conversion to a correlation matrix, multivariate analysis of the correlation matrix, and interpretation and validation. This is called the traditional model. According to the summary by McCain (1990), the basic steps of

the model comprise compilation of the raw co-citation matrix, conversion to a correlation matrix and multivariate analysis of the correlation matrix. The SNA method has also been verified to be applicable to co-citation research (e.g., Xu and Zhu 2008; Groh and Fuchs 2011). Therefore, in this study, the UII analysis steps were: compilation of the raw co-citation matrix, conversion to a correlation matrix, SNA and visualization.

Fig. 1 Author co-citation and user information interaction (panels: the cited references of a journal article from Web of Science, illustrating author co-citation; QQ group instant messages split into Topic N and Topic N+1, illustrating user information interaction)

Processing of user interaction contents

In terms of enterprise features, enterprise knowledge can be divided into three levels: individual knowledge, team knowledge and organizational knowledge. Individual knowledge is the knowledge formed by daily work experience and personal learning that exists in the minds of individuals; it can be called personal skills. Team knowledge mainly refers to the operating procedures and standards of certain enterprise tasks. Organizational knowledge mainly refers to enterprise culture and regulations.
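The standard formula of UII given earlier, $\phi_{ij} = \sum_{W} \min(W_i, W_j)$, can be illustrated with a minimal sketch. It assumes that $W_i$ is the number of messages member $i$ posted in topic $W$, which is one reading of the "co-occurring frequency" in the text; the study itself built the matrix with Excel VBA, and the function name `uii` and the toy topics below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def uii(topics):
    """Return {(i, j): phi_ij} for every pair of members that co-occur.

    `topics` is a list of topics; each topic is a list of member IDs,
    one entry per message (assumed meaning of the frequency W_i).
    """
    phi = Counter()
    for topic in topics:
        counts = Counter(topic)                    # W_i for this topic
        for i, j in combinations(sorted(counts), 2):
            phi[(i, j)] += min(counts[i], counts[j])
    return dict(phi)


# Toy data: three topics, members identified by number as in the study.
toy_topics = [
    [1, 2, 1, 3, 3],   # member 1 twice, member 2 once, member 3 twice
    [3],               # a single speaker contributes no pair
    [1, 2, 2],
]
phi = uii(toy_topics)
print(phi)  # prints {(1, 2): 2, (1, 3): 2, (2, 3): 1}
```

The resulting pair counts play the role of the raw co-citation matrix in the ACA steps: a symmetric member-by-member matrix that can then be converted and analyzed with SNA tools.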
In this study, the QQ group instant messages of the enterprise can be seen as a kind of enterprise knowledge, and everyone who participated in a topic can be seen as contributing or sharing enterprise knowledge once. With reference to the theories of enterprise knowledge sharing above, the contents of the topics were manually analyzed one by one and summarized into six kinds: personal skill, operating procedure, enterprise regulation, activity, leisure and problem-solving (shown in Table 2). SNA was then applied to analyze the user information interaction contents in three steps: first, set a standard to measure the relation between a member and the user interaction contents; then construct the matrix based on that standard; finally, input the matrix into Ucinet, Netdraw or Pajek for analysis. The standard for user information interaction contents is as follows: every topic is regarded as interaction content, and everyone who participated in a topic is regarded as contributing or sharing interaction contents once. A two-mode matrix was then constructed according to this standard. Finally, the matrix was input into Ucinet to analyze its characteristics

and draw the user interaction contents chart in Netdraw (shown in Fig. 4).

Tools

In bibliometric and scientometric research, a lot of attention is paid to the analysis of networks of, for example, documents, keywords, authors or journals. Mapping and clustering techniques are frequently used to study such networks. Waltman et al. (2010) first presented their proposal for a unified approach to mapping and clustering. In the bibliometric and scientometric literature, the most commonly used combination of a mapping and a clustering technique is multidimensional scaling with hierarchical clustering in SPSS (for early examples, see White and Griffith 1981; Small et al. 1985; McCain 1990; Peters and Van Raan 1993). However, various alternatives to multidimensional scaling and hierarchical clustering have been introduced in the literature, especially in more recent work, and these alternatives are also often used in combination. A popular alternative to multidimensional scaling is the mapping algorithm of Kamada and Kawai (1989) as implemented in Pajek (e.g. Leydesdorff and Rafols 2009; Noyons and Calero-Medina 2009; Leydesdorff et al. 2014), which is sometimes used together with the pathfinder network technique (e.g. Schvaneveldt et al. 1988; Chen 1999; White 2003; Moya-Anegon et al. 2007). Two other alternatives to multidimensional scaling are the VxOrd mapping technique (e.g., Boyack et al. 2005; Klavans and Boyack 2006) and the VOS mapping technique of VOSviewer (e.g., Van Eck and Waltman 2010). Factor analysis, which has been used in a large number of studies (e.g., Moya-Anegon et al. 2007; Zhao and Strotmann 2008; Leydesdorff and Rafols 2009), may be seen as a kind of clustering technique and, consequently, as an alternative to hierarchical clustering. Another alternative to hierarchical clustering is clustering based on the modularity function of Newman and Girvan (2004) (e.g. Wallace et al.
2009; Zhang et al. 2010).

As to mapping and clustering software, Leydesdorff et al. (2011) argued that Gephi and VOSviewer offer superior visualization techniques, while Gephi and Pajek/Ucinet offer network statistics. They also found that, with little effort, outputs could be made compatible with Pajek, and via Pajek with Gephi (which reads Pajek files). This offers additional flexibility, such as algorithms for community detection among a host of other network statistics that are available in Pajek and Gephi but not in VOSviewer (Leydesdorff et al. 2014).

According to the theories and practices above, Excel, Ucinet, Pajek, Netdraw and VOSviewer were combined in this study to analyze UII and contents quantitatively and visually. Excel VBA programming was used to construct the user interaction matrix according to the standard formula of UII above. Ucinet was applied to read the matrix and generate the .net file and the .##h file. Pajek and Netdraw were used to load the .net file and the .##h file and draw the user interaction figure. VOSviewer was further applied to visualize the UII figure with its own clustering algorithm based on modularity optimization. Ucinet and Pajek were combined to provide network statistics.

In-depth interviews

A limitation of the SNA method is that it depicts the current networks between members but does not reveal the causal factors, context or history of the team that contribute to the current influence patterns or to perceptions of prestige or knowledge flows within the team. The team history and context as well as the causes of the current relationship patterns can be

investigated by additional in-depth interviews (Behrend and Erwee 2009). Therefore, in this study, in-depth interviews with team leaders, key managers and some active members in the enterprise were conducted to reinforce the results.

Table 2 User interaction contents of QQ group (the participating number is shown in parentheses)
QQ group   User interaction contents
A          Personal skill (10), enterprise regulation (19), operating procedure (10), problem-solving (2)
B          Personal skill (17), enterprise regulation (7), operating procedure (39), problem-solving (18), activity (2), leisure (48)
C          Enterprise regulation (9), activity (9), problem-solving (5), leisure (18)

User interaction intensity analysis

In Pajek, the UII network was visualized with the spring-based algorithm of Kamada and Kawai (1989). This algorithm reduces the stress in the representation by seeking to minimize the energy content of the spring system. In the UII figure, every node signifies a member and the size of a node indicates its degree centrality in the network. The position of a node (in the center or on the edge) signifies its importance, the thickness of the line between two nodes signifies their interaction intensity, and the distance between two nodes signifies their closeness. In addition, different colors signify different groups (obtained by k-core analysis). A subset of vertices is called a k-core if every vertex in the subset is connected to at least k vertices from the same subset. Cores in Pajek can be computed using Network/Create Partition/k-Core/All. The result is a partition in which the core number of every vertex is given (shown in Table 3) (Nooy et al. 2011).

In Fig. 2, members 16 and 18 have the highest degree centrality, mainly because they are common members of groups A, B and C and have frequent and broad contact with other members. Member 19 is a special case, located on the edge of groups A and B.
In reality, member 19 is not on the staff of the enterprise but is a technical guide from a professional software company in China. In addition, there are nine groups (clusters) in the network, distinguished by color, and members 1, 6, 29 and 35 each form a single-member group (clusters 1, 3, 7 and 9, respectively) (shown in Table 3).

In the enterprise, member 1 is the software department manager, yet he forms a cluster of his own. Tracing the instant message contents of member 1 shows that his messages are almost all office notices and task plans, which might be the reason he is isolated from the others. It is advisable for him to discuss other topics with members of the department to promote user interaction.

Member 6 has the lowest degree centrality and no contact with other members. In reality, member 6 is one of only two females in group A, and the gender imbalance may be the root cause. When interviewed, the manager and the deputy manager said that the reason was that the company was male-dominated and technology-oriented and that member 6 was the only female technical staff member in the software department, assigned to the peripheral task of system maintenance (only two members undertake this task). As to members 29 and 35, according to the interview with member 18, the most active person in the network, they are introverted and speak less than others.
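The k-core definition used above to colour the groups (every vertex in a k-core is connected to at least k vertices of the same subset) can be made concrete with a short sketch. This is a generic peeling procedure on an unweighted, undirected graph, not Pajek's implementation; the function name `core_numbers` and the toy graph are assumptions for illustration.

```python
def core_numbers(adj):
    """Core number of every vertex of an undirected graph.

    `adj` maps each vertex to the set of its neighbours. Vertices are
    peeled off in order of smallest remaining degree; the core number of
    a vertex is the largest minimum degree seen up to its removal.
    """
    degree = {v: len(neighbours) for v, neighbours in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)   # weakest remaining vertex
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1               # v no longer supports u
    return core


# Toy graph: a triangle (a, b, c), a pendant vertex d and an isolate e.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}
print(core_numbers(toy))
```

In this toy graph the isolate gets core number 0, the pendant vertex 1 and the triangle members 2, which mirrors how an isolated member such as member 6 ends up in a partition class of its own.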

Table 3 Clusters of user interaction intensity network
Cluster   Member                                                        Freq %
1         1                                                             2.7778
2         2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18       44.4444
3         6                                                             2.7778
4         19, 24, 27                                                    8.3333
5         20, 21, 22, 23, 25, 26, 30, 31, 32, 34                        25.0000
6         28, 33                                                        5.5556
7         29                                                            2.7778
8         30, 36                                                        5.5556
9         35                                                            2.7778

Also, cluster 2 is the biggest cluster and all of its members belong to group A. When interviewed, the manager and the deputy manager said that their department is the busiest one in the enterprise because it maintains the enterprise's normal operation, so its members communicate with each other frequently on workdays, both online and offline.

The members of clusters 5, 6 and 8 belong to group C but fall into different clusters. In his interview, member 18 said that the members of group C were in frequent contact when they had just entered the enterprise, but half a year later they were in contact less than before. In his view, the reason was that new staff were not familiar with their jobs at first, so they sought a sense of belonging in group C, the group for new staff. After they had worked for several months, they gradually integrated into a new life and a new circle of friends, so the old one was neglected.

Besides, members 19, 24 and 27 belong to cluster 4 because they have the same network features. When interviewed, members 18 and 31 said that the three men were relatively introverted, liked to chat with a particular five or six colleagues and did not care about others.

Network density and user interaction intensity

In the density view, items are indicated by a label. Each point in a map has a color that depends on the density of items at that point. That is, the color of a point in a map depends on the number of items in the neighborhood of the point and on the importance of the neighboring items. By default, VOSviewer uses a red-green-blue color scheme (see Fig. 3).
In this color scheme, red corresponds to the highest item density and blue to the lowest. The density view is particularly useful for getting an overview of the general structure of a map and for drawing attention to its most important areas (Van Eck and Waltman 2010). In Fig. 3, on the whole, there is a clear separation between the areas of group A and group C. The areas of groups A and B are denser and larger than that of group C, which indicates that groups A and B have a higher information interaction intensity than group C. Also, the areas of members 1 to 18 (member 6 excluded) and of members 22 and 31 turn out to be important. These areas are very dense, which indicates that the information interaction intensity among these members is the highest overall.

Centrality and user interaction intensity

In Ucinet, centrality is measured through the Network > Centrality menu by closeness, degree, betweenness, eigenvector and other measures; in this study, only closeness, degree and betweenness are discussed (Ucinet 2004).

Fig. 2 User interaction intensity network by Pajek

Betweenness centrality of an actor is the extent to which the actor serves as a potential go-between for other pairs of actors in the network by occupying an intermediary position on the shortest paths connecting them. Closeness centrality of an actor is the extent to which the most direct paths connecting the actor to each of the other actors in the network are short rather than long. Degree centrality is the number of connections that an actor has in a network (Kilduff and Tsai 2003). In a word, these measures show the importance of a member in the user interaction network. In Table 4, members 18, 16, 31, 32 and 17 appear in the Top 10 of all three centrality measures for the combined network, which shows that they occupy a central position in it. As the important members of the network, they control most of the information flow and are well placed to contact other members. It is therefore advisable for them to contact other members more often, improving the atmosphere and environment for information interaction.

Structural holes and user interaction intensity

In Ucinet there are two ways to detect structural holes: Network > Centrality > Freeman Betweenness > Node Betweenness and Network > Ego Networks > Structural Holes (Ucinet 2004). As Table 5 shows, the results of the two detection methods are consistent: their common members are 18, 16, 13 and 17. A structural hole describes the position of an actor as a middleman who controls the flow of enterprise information and therefore plays a vital role in user interaction. So these four members act as the middlemen of the overall user interaction network. In reality, the four members are new staff who joined in 2011, and they belong to both group A and group B. In short, the reality and the detection results are consistent. Besides, members 14 and 31 are the structural holes of group A and group C, respectively. In a word, in the UII network (shown in Fig. 2), it is easy to distinguish group A, group B and group C.
The three groups are linked together by members 18, 16, 13 and 17, who are the information interaction middlemen of the overall network. Also, members 14 and 31 are the information interaction middlemen of group A and group C, respectively. If these six members keep communicating frequently with others, the atmosphere of information interaction will improve. Besides, member 6 is isolated in the overall network (shown in Fig. 2) and has little contact with others. So, business managers should pay more attention to the core nodes, isolated nodes and structural holes in the overall network and take measures to tackle the problems and promote user information interaction.
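The three centrality measures discussed in this section can be illustrated with a minimal sketch in plain Python. It is not the Ucinet implementation used in the study: the toy graph, the node names and the function names are hypothetical, the graph is assumed to be undirected, unweighted and connected, and the normalizations (division by n - 1 for degree and closeness, by (n - 1)(n - 2)/2 for betweenness) are common conventions that may differ from Ucinet's output scaling. Betweenness is computed with Brandes' algorithm.

```python
from collections import deque


def bfs_paths(graph, source):
    """Breadth-first search from source.

    Returns distances, shortest-path counts, predecessor lists and visit order.
    """
    dist = {source: 0}
    sigma = {v: 0 for v in graph}   # number of shortest paths from source
    sigma[source] = 1
    preds = {v: [] for v in graph}  # predecessors on shortest paths
    order = []
    queue = deque([source])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
            if dist[w] == dist[v] + 1:
                sigma[w] += sigma[v]
                preds[w].append(v)
    return dist, sigma, preds, order


def degree_centrality(graph):
    """Share of the other n - 1 actors each actor is directly connected to."""
    n = len(graph)
    return {v: len(graph[v]) / (n - 1) for v in graph}


def closeness_centrality(graph):
    """(n - 1) divided by the sum of shortest-path distances (connected graph)."""
    n = len(graph)
    result = {}
    for v in graph:
        dist, _, _, _ = bfs_paths(graph, v)
        total = sum(dist.values())
        result[v] = (n - 1) / total if len(dist) == n and total > 0 else 0.0
    return result


def betweenness_centrality(graph):
    """Brandes' algorithm for an undirected, unweighted graph, normalized to [0, 1]."""
    n = len(graph)
    score = {v: 0.0 for v in graph}
    for s in graph:
        _, sigma, preds, order = bfs_paths(graph, s)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every unordered pair is counted twice, so dividing by (n - 1)(n - 2)
    # equals dividing the pair count by (n - 1)(n - 2) / 2.
    scale = 1.0 / ((n - 1) * (n - 2)) if n > 2 else 1.0
    return {v: score[v] * scale for v in graph}


# Hypothetical toy network: two triangles joined through a single broker "d".
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d", "f", "g"],
    "f": ["e", "g"],
    "g": ["e", "f"],
}

degree = degree_centrality(graph)
closeness = closeness_centrality(graph)
betweenness = betweenness_centrality(graph)

for node in sorted(graph):
    print(node, round(degree[node], 3), round(closeness[node], 3),
          round(betweenness[node], 3))
```

In this toy network the broker "d" has fewer direct ties than "c" or "e" but lies on every shortest path between the two triangles, so it ranks highest on betweenness. This is the kind of middleman position that the node betweenness route to structural holes is meant to surface.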

Fig. 3 Density view by VOSviewer

Table 4 Centrality of user interaction network (Top 10)

Member ID   nBetweenness   Member ID   nCloseness   Member ID   NrmDegree
18          22.697         18          43.21        14          6.133
16          13.42          16          41.667       19          5.248
31          12.998         31          39.326       16          2.845
22          8.7            17          38.889       17          2.346
20          7.791          13          38.889       15          2.24
23          4.545          22          38.043       18          0.958
32          4.323          32          37.634       31          0.371
13          4.03           20          37.634       23          0.281
17          3.606          21          37.234       32          0.273
21          2.944          14          35.714       7           0.186

Table 5 Structural holes of user interaction network

Method             Structural holes
Node betweenness   18, 16, 31, 17, 13
Structural holes   18, 16, 13, 14, 17

User interaction contents analysis

In Netdraw, the user interaction contents network was visualized with the spring embedding algorithm (Netdraw > Layout > Graph Theoretic Layout > Spring embedding). In the chart below, every circle signifies a type of user interaction content, every square signifies a member, the size of every node signifies its degree centrality in the network, and the thickness of the line between a square and a circle indicates the amount of user interaction content that the member contributed or shared (Borgatti 2004).

Fig. 4 User interaction contents of combined network

User interaction contents of combined network

In Fig. 4, on the whole, enterprise regulation has the highest degree centrality in the combined network, followed by leisure and activity and operating procedure, with personal skill and problem-solving the lowest, which indicates that almost all the members pay close attention to enterprise regulation. Member 1 is the closest to enterprise regulation. In reality, member 1 is the software department manager and his instant messages are almost entirely office notices and task programs. For leisure and activity, members 14, 15, 16, 17, 18 and 19 contribute or share the most, mainly because they are in the smallest QQ group (Group B); generally, the smaller a group, the closer its members are. When interviewed, member 18 said that members of Group B often went out for gatherings and travel, enjoying leisure time together outside their daily jobs. So they trust each other and are willing to communicate with others. For operating procedure, members 14 and 19 contribute or share the most, mainly because member 19 is the technical guide from a professional software company in China. Also, when interviewed, the manager said that member 14 is the most experienced technical staff member in Group A and is willing to help other members handle technical problems.

Conclusions and discussion

In this study, the idea and method of ACA from bibliometric and scientometric research are applied to web user research, and the results are consistent with the reality of the company, which indicates that the UII model is relatively reasonable and applicable to web user research. Also, the SNA method, combined with additional in-depth interviews with team leaders or key managers in the organization, can quantitatively, visually and comprehensively diagnose the status of user information interaction. Managers can put the combined method into practice to assess the status of information interaction in their enterprise, identify problems, and promote information interaction and enterprise development. Although the study focused on enterprise entities, the UII model could potentially be applied to other types of organizations such as universities or governments. The research and its findings also have some limitations. In this study, the sample data are mainly limited to one section of the enterprise. If the Tencent QQ Group instant messages of four sections were collected, the findings would be more comprehensive. In future work, more kinds of web user data will be incorporated.

Acknowledgments

This paper is supported by the Major Program of the National Social Science Foundation in China (Grant No. 11&ZD152), the Program of Social Science Foundation by the Ministry of Education in China (Grant No. 13YJA870023) and the High-level International Journal Program of Wuhan University (Grant No. 2012GSP062). This is an extended version of a paper presented at the 14th International Society of Scientometrics and Informetrics Conference, Vienna (Austria), 15–19 July 2013.

References

Behrend, F. D., & Erwee, R. (2009). Mapping knowledge flows in virtual teams with SNA. Journal of Knowledge Management, 13(4), 99–114.
Borgatti, S. P. (2004). NetDraw: Graph Visualization Software. Harvard, MA: Analytic Technologies.
Boyack, K. W., Klavans, R., & Borner, K. (2005). Mapping the backbone of science. Scientometrics, 64(3), 351–374.
Chen, C. (1999). Visualising semantic spaces and author co-citation networks in digital libraries. Information Processing and Management, 35(3), 401–420.
Groh, G., & Fuchs, C. (2011). Multi-modal social networks for modeling scientific fields. Scientometrics, 89(2), 569–590.
Jevremov, T., Pajic, D., & Sipka, P. (2007). Structure of personality psychology based on cocitation analysis of prominent authors. Psihologija, 40(2), 329–343.
Kamada, T., & Kawai, S. (1989). An algorithm for drawing general undirected graphs. Information Processing Letters, 31(1), 7–15.
Kilduff, M., & Tsai, W. (2003). Social networks and organizations. London: Sage.
Klavans, R., & Boyack, K. W. (2006). Quantitative evaluation of large maps of science. Scientometrics, 68(3), 475–499.
Leydesdorff, L., & Rafols, I. (2009). A global map of science based on the ISI subject categories. Journal of the American Society for Information Science and Technology, 60(2), 348–362.
Leydesdorff, L., & Vaughan, L. (2006). Co-occurrence matrices and their applications in information science: Extending ACA to the Web environment. Journal of the American Society for Information Science and Technology, 57(12), 1616–1628.
Leydesdorff, L., Hammarfelt, B., & Salah, A. A. A. (2011). The structure of the Arts & Humanities Citation Index: A mapping on the basis of aggregated citations among 1,157 journals. Journal of the American Society for Information Science and Technology, 62(12), 2414–2426.
Leydesdorff, L., Kushnir, D., & Rafols, I. (2014). Interactive overlay maps for US patent (USPTO) data based on International Patent Classification (IPC). Scientometrics, 98(3), 1583–1599.
Ma, R. M., Dai, Q. B., Ni, C. Q., & Li, X. L. (2009). An author co-citation analysis of information science in China with Chinese Google Scholar search engine, 2004–2006. Scientometrics, 81(1), 33–46.
McCain, K. W. (1990). Mapping authors in intellectual space: A technical overview. Journal of the American Society for Information Science, 41(6), 433–443.
Moya-Anegon, F., Vargas-Quesada, B., Chinchilla-Rodriguez, Z., Corera-Alvarez, E., Munoz-Fernandez, F. J., & Herrero-Solana, V. (2007). Visualizing the marrow of science. Journal of the American Society for Information Science and Technology, 58(14), 2167–2179.
Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
Nooy, W., Mrvar, A., & Batagelj, V. (2011). Exploratory Social Network Analysis with Pajek: Revised and Expanded (2nd ed.). New York: Cambridge University Press.
Noyons, E. C. M., & Calero-Medina, C. (2009). Applying bibliometric mapping in a high level science policy context. Scientometrics, 79(2), 261–275.
Osareh, F., & McCain, K. W. (2008). The structure of Iranian chemistry research, 1990–2006: An author cocitation analysis. Journal of the American Society for Information Science and Technology, 59(13), 2146–2155.
Peters, H. P. F., & Van Raan, A. F. J. (1993). Co-word-based science maps of chemical engineering. Part II: Representations by combined clustering and multidimensional scaling. Research Policy, 22(1), 47–71.
Qiu, J. P., & Ma, R. M. (2009). The application of ACA method in web environment. Library and Information Service, 52(2), 85–87. (in Chinese).
Schvaneveldt, R. W., Dearholt, D. W., & Durso, F. T. (1988). Graph theoretic foundations of pathfinder networks. Computers and Mathematics with Applications, 15(4), 337–345.
Small, H., Sweeney, E., & Greenlee, E. (1985). Clustering the Science Citation Index using co-citations. II. Mapping science. Scientometrics, 8(5–6), 321–340.
Ucinet. (2004). Ucinet for Windows: Software for social network analysis. Harvard, MA: Analytic Technologies.
Van Eck, N. J., & Waltman, L. (2010). Software survey: VOSviewer, a computer program for bibliometric mapping. Scientometrics, 84(2), 523–538.
Vaughan, L., & You, J. (2010). Word co-occurrences on Webpages as a measure of the relatedness of organizations: A new Webometrics concept. Journal of Informetrics, 4(4), 483–491.
Wallace, M. L., Gingras, Y., & Duhon, R. (2009). A new approach for detecting scientific specialties from raw cocitation networks. Journal of the American Society for Information Science and Technology, 60(2), 240–246.
Waltman, L., Van Eck, N. J., & Noyons, E. C. M. (2010). A unified approach to mapping and clustering of bibliometric networks. Journal of Informetrics, 4(4), 629–635.
Wang, Z. F. (2011). Use and management of internet communication tools in community constructions: Take L community in Hangzhou for example. Unpublished master's thesis, Zhejiang Gongshang University, Hangzhou, China. (in Chinese).
Wang, X. W., Hu, Z. G., Ding, K., & Liu, Z. Y. (2011). Research on classification of singers in online music websites based on co-citation theory. Journal of the China Society for Scientific and Technical Information, 30(5), 471–478. (in Chinese).
White, H. D. (2003). Pathfinder networks and author cocitation analysis: A remapping of paradigmatic information scientists. Journal of the American Society for Information Science and Technology, 54(5), 423–434.
White, H. D., & Griffith, B. (1981). Author cocitation: A literature measure of intellectual structures. Journal of the American Society for Information Science, 32(3), 163–171.
White, H. D., & McCain, K. (1998). Visualizing a discipline: An author cocitation analysis of information science, 1972–1995. Journal of the American Society for Information Science, 49(4), 327–355.
Xu, L. M. (2011). Research on mechanism of enterprise internal knowledge sharing based on social network. Unpublished master's thesis, Wuhan University, Wuhan, China. (in Chinese).
Xu, Y. Y., & Zhu, Q. H. (2008). Demonstration study of social network analysis method in citation analysis. Information Studies: Theory & Application, 31(2), 184–188. (in Chinese).
Zhang, L., Liu, X., Janssens, F., Liang, L., & Glanzel, W. (2010). Subject clustering analysis based on ISI category classification. Journal of Informetrics, 4(2), 185–193.
Zhao, D., & Strotmann, A. (2008). Information science during the first decade of the Web: An enriched author cocitation analysis. Journal of the American Society for Information Science and Technology, 59(6), 916–937.
Zuccala, A. (2006). Author cocitation analysis is to intellectual structure as web colink analysis is to? Journal of the American Society for Information Science and Technology, 57(11), 1487–1502.