RESEARCH METHODS OF COMPUTER SCIENCE

By
EHTIRAM RAZA KHAN, Sr. Lecturer, Dept. of Computer Science, Jamia Hamdard University, New Delhi
HUMA ANWAR, Assistant Director, IPM Ghaziabad, Uttar Pradesh

UNIVERSITY SCIENCE PRESS (An Imprint of Laxmi Publications Pvt. Ltd.)
An ISO 9001:2008 Company
BENGALURU, CHENNAI, COCHIN, GUWAHATI, HYDERABAD, JALANDHAR, KOLKATA, LUCKNOW, MUMBAI, RANCHI, NEW DELHI, BOSTON (USA), ACCRA (GHANA), NAIROBI (KENYA)


CONTENTS

Chapters
1. Objective and Dimensions of Research
2. Research Problems
3. Research Methodology
4. Research Proposal
Annexure
Index

PREFACE

The market is flooded with books on computer science, but there has been a shortage of books on research methods in computer science. Everybody wants to build a competitive professional career, and study material plays a significant role in building one. Selecting that material is difficult, so a judicious choice of book matters for your prospective career.

To fill this gap and to provide you with new study material, we have written the book "Research Methods of Computer Science". Written in simple and lucid language, it deals with all aspects of research methods in computer science, viz., objective and dimensions of research, research problems, research methodology and research proposal. No other book combines these theories with adequate examples. The basic concepts of these theories are illustrated in detail in this book.

The key feature that sets this book apart is the provision of detailed theory and self-evaluation exercises at the end of each chapter. These give students an opportunity to test whether they have fully grasped the fundamental concepts.

The book fulfils the curriculum needs of undergraduate, postgraduate and research students of computer science in engineering and MCA courses. The judicious choice of topics also makes it a novel guide for computer professionals who are engaged in computer research methodology.

Special thanks to all those who have helped in bringing out this book in its present form. Finally, suggestions, comments and reports of errors that have escaped our notice are cordially welcome, for the improvement of this book.

Author

Chapter 1
OBJECTIVES AND DIMENSIONS OF RESEARCH

Learning Objectives: After going through this chapter, you should appreciate the following:
• The Objectives of Research
• The Dimensions of Research
• Tools of Research

Computer Science is one of the most active fields of science. Wherever we go, we find computers and their applications. Computer science can be defined as:
1. Computer Science is the study of phenomena related to computers.
2. Computer Science is the study of information structures.
3. Computer Science is the study and management of complexity.
4. Computer Science is the mechanization of abstraction.
5. Computer Science is a field of study that is concerned with theoretical and applied disciplines in the development and use of computers for information storage and processing, mathematics, logic, science, and many other areas.

SCIENTIFIC METHODS OF COMPUTER SCIENCE

Basically, we find the characteristic features of classical scientific methods in CS as well. What is specific to CS is that its objects of investigation are artifacts (computer-related phenomena) that change concurrently with the development of the theories describing them and simultaneously with the growing practical experience in their usage.

A computer from the 1940s is not the same as a computer from the 1970s, which in its turn is different from a computer in 2002. Even the task of defining what a computer is in the year 2002 is far from trivial.

Computer science can be divided into Theoretical, Experimental and Simulation CS, which are three methodologically distinct areas. One method is, however, common to all three of them, and that is modeling.

MODELING

Modeling is a process that always occurs in science, in the sense that the phenomenon of interest must be simplified in order to be studied. That is the first step of abstraction. A model has to take into account the relevant features of a phenomenon, which means that we are supposed to know which features are relevant. That is possible because there is always some theoretical ground that we start from when doing science.

A simplified model of a phenomenon means that we have a description in some symbolic language, which enables us to predict the observable and measurable consequences of given changes in a system. Theory, experiment and simulation are all about (more or less detailed) models of phenomena.

THEORETICAL COMPUTER SCIENCE

Theoretical Computer Science adheres to the traditions of logic and mathematics. It follows the classical methodology of building theories as logical systems, with stringent definitions of objects (axioms) and operations (rules) for deriving and proving theorems.

The key recurring ideas fundamental to computing are:
• Conceptual and formal models (including data models, algorithms and complexity)
• Different levels of abstraction
• Efficiency

Data models are used to formulate different mathematical concepts. In CS a data model has two aspects:
• The values that data objects can assume, and
• The operations on the data.
Here are some typical data models:
• The tree data model (the abstraction that models hierarchical data structure)
• The list data model (can be viewed as a special case of the tree, but with some additional operations like push and pop. Character strings are an important kind of list)

• The set data model (the most fundamental data model of mathematics. Every concept in mathematics, from trees to real numbers, can be expressed as a special kind of set)
• The relational data model (the organization of data into collections of two-dimensional tables)
• The graph data model (a generalization of the tree data model: directed, undirected, and labelled)
• Patterns, automata and regular expressions. A pattern is a set of objects with some recognizable property. The automaton is a graph-based way of specifying patterns. Regular expressions are an algebra for describing the same kinds of patterns that can be described by automata.

Theory creates methodologies, logics and various semantic models to help design programs, to reason about programs, to prove their correctness, and to guide the design of new programming languages. However, CS theories do not compete with each other as to which better explains the fundamental nature of information. Nor are new theories developed to reconcile theory with experimental results that reveal unexplained anomalies or new, unexpected phenomena, as in physics. In computer science there is no history of critical experiments that decide between the validity of various theories, as there is in the physical sciences. The basic, underlying mathematical model of digital computing is not seriously challenged by theory or experiments.

In computer science, results of theory are judged by the insights they reveal about the mathematical nature of various models of computing and/or by their utility to the practice of computing and their ease of application. Do the models conceptualize and capture the aspects computer scientists are interested in? Do they yield insights into design problems? Do they aid reasoning and communication about relevant problems?

The design and analysis of algorithms is a central topic in theoretical computer science.
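The claim made earlier in this section, that automata and regular expressions describe the same kinds of patterns, can be illustrated with a small sketch. The pattern chosen here (strings over the letters a and b that end in "ab"), the transition table, and the function names are all assumptions made for illustration; they do not come from the text. The sketch encodes the pattern once as a three-state automaton and once as a regular expression, then checks that the two agree on every short string.

```python
import re
from itertools import product

# Hypothetical example pattern: strings over {a, b} that end in "ab".
PATTERN = re.compile(r"(a|b)*ab")

# The same pattern as a deterministic automaton (a labelled graph).
# State 0: no progress, state 1: last symbol was 'a', state 2: just read "ab".
TRANSITIONS = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
ACCEPTING = {2}


def dfa_accepts(text: str) -> bool:
    """Follow one edge per symbol and report whether we end in an accepting state."""
    state = 0
    for symbol in text:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


def regex_accepts(text: str) -> bool:
    """Report whether the whole string matches the regular expression."""
    return PATTERN.fullmatch(text) is not None


def agree_up_to(max_len: int) -> bool:
    """Compare both descriptions on every string over {a, b} up to max_len symbols."""
    for length in range(max_len + 1):
        for symbols in product("ab", repeat=length):
            text = "".join(symbols)
            if dfa_accepts(text) != regex_accepts(text):
                return False
    return True


if __name__ == "__main__":
    print(agree_up_to(8))
```

Agreement on short strings is evidence, not proof, that the two descriptions are equivalent; the general equivalence is a theorem of automata theory rather than something a finite check can establish.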
Methods are developed for algorithm design, measures are defined for various computational resources, tradeoffs between different resources are explored, and upper and lower resource bounds are proved for the solutions of various problems. In the design and analysis of algorithms, measures of performance are well-defined, and results can be compared quite easily in some of these measures (which may or may not fully reflect their performance on typical problems). Experiments with algorithms are used to test implementations and compare their practical performance on the subsets of problems considered important.

EXPERIMENTAL COMPUTER SCIENCE

The subject of inquiry in the field of computer science is information rather than energy or matter. However, this makes no difference to the applicability of the traditional scientific method. To understand the nature of information processes, computer scientists must observe phenomena, formulate explanations and theories, and test them.

Experiments are used both for theory testing and for exploration. Experiments test theoretical predictions against reality. A scientific community gradually accepts a theory if all

known facts within its domain can be deduced from the theory, if it has withstood experimental tests, and if it correctly predicts new phenomena. Repeatability ensures that results can be checked independently and thus raises confidence in the results. Nevertheless, there is always an element of uncertainty in experiments and tests as well: to paraphrase Edsger Dijkstra, an experiment can only show the presence of bugs (flaws) in a theory, not their absence. Scientists are keenly aware of this uncertainty and are therefore ready to disqualify a theory if contradicting evidence shows up.

A good example of theory falsification in computer science is the famous Knight and Leveson experiment, which analyzed the failure probabilities of multiversion programs. Conventional theory predicted that the failure probability of a multiversion program was the product of the failure probabilities of the individual versions. However, John Knight and Nancy Leveson observed that real multiversion programs had significantly higher failure probabilities. In fact, the experiment falsified the basic assumption of the conventional theory, namely that faults in different program versions are statistically independent.

Experiments are also used in areas that theory and deductive analysis do not reach. Experiments probe the influence of assumptions, eliminate alternative explanations of phenomena, and unearth new phenomena in need of explanation. In this mode, experiments help with induction: deriving theories from observation.

Artificial Neural Networks (ANNs) are a good example of the explorative mode of experimentation. After ANNs had been discarded on theoretical grounds, experiments demonstrated properties better than those theoretically predicted. Researchers are now developing better theories of ANNs in order to account for these observed properties.
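To make the independence assumption behind the multiversion example above concrete, the following sketch compares two calculations. The first is the conventional prediction, the product of the individual failure probabilities. The second uses a hypothetical common-cause model in which, with probability c, an input is hard enough that every version fails, and otherwise each version fails independently with probability q. The model, its parameter values, and the function names are illustrative assumptions; they are not Knight and Leveson's data or method.

```python
def marginal_failure(c: float, q: float) -> float:
    """Failure probability of a single version under the hypothetical model."""
    return c + (1 - c) * q


def predicted_joint_failure(p: float, n: int) -> float:
    """Conventional prediction: n versions fail together with probability p**n,
    which is only valid if their failures are statistically independent."""
    return p ** n


def common_cause_joint_failure(c: float, q: float, n: int) -> float:
    """Joint failure probability when a shared cause (probability c) makes all
    versions fail, and versions otherwise fail independently with probability q."""
    return c + (1 - c) * q ** n


if __name__ == "__main__":
    c, q, n = 0.01, 0.04, 3  # assumed values, chosen only for illustration
    p = marginal_failure(c, q)
    print("single-version failure probability:", p)
    print("independence prediction:", predicted_joint_failure(p, n))
    print("with a shared cause:", common_cause_joint_failure(c, q, n))
```

With any positive c the joint failure probability exceeds the product prediction, which is the qualitative pattern described in the text: correlated faults make a multiversion program fail more often than the independence assumption suggests.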
Experiments are made in many different fields of CS such as search, automatic theorem proving, planning, NP-complete problems, natural language, vision, games, neural nets/connectionism, and machine learning. Furthermore, analyzing performance behavior in networked environments in the presence of resource contention from many users is a new and complex field of experimental computer science. In this context it is important to mention the Internet.

Yet there are plenty of computer science theories that haven't been tested. For instance, functional programming, object-oriented programming, and formal methods are all thought to improve programmer productivity, program quality, or both. Yet none of these obviously important claims has ever been tested systematically, even though they are all 30 years old and a lot of effort has gone into developing programming languages and formal techniques.

Some fields of computing, such as Human Computer Interaction and parts of Software Engineering, have to include humans (users, programmers) in their models of the investigated phenomena. This results in a "soft" empirical approach more characteristic of the Humanities and Social Sciences, with methodological tools such as interviews and case studies.

COMPUTER SIMULATION

In recent years computation, which comprises computer-based modeling and simulation, has become the third research methodology within CS, complementing theory and experiment.

Computational Science has emerged at the intersection of Computer Science, applied mathematics, and the science disciplines, in both theoretical investigation and experimentation.

Computer simulation makes it possible to investigate regimes that are beyond current experimental capabilities and to study phenomena that cannot be replicated in laboratories, such as the evolution of the universe. In the realm of science, computer simulations are guided by theory as well as experimental results, while the computational results often suggest new experiments and theoretical models. In engineering, many more design options can be explored through computer models than by building physical ones, usually at a small fraction of the cost and elapsed time.

WHY DO RESEARCH IN COMPUTER SCIENCE?

Research in any discipline is hard, but in computers and IT it becomes even more daunting. There are still very few researchers in computer science, which is why a PhD in computer science is so important for a successful career.

Research lets you learn a set of work skills that you can't get from classes. These include:
• Significant writing tasks
• Independent/unstructured work tasks
• Doing something real (experimental support to prove the concept)

Research helps you to become a true expert in your chosen computer field. Research in computer science helps not only academia but industry as well.

WHAT IS RESEARCH IN COMPUTING SCIENCE?

The expanding scope of computing science makes it difficult to sustain traditional scientific and engineering models of research. In particular, recent work in formal methods has abandoned the traditional empirical methods. Similarly, research in requirements engineering and human computer interaction has challenged the proponents of formal methods. These tensions stem from the fact that "Computing Science" is a misnomer.
Topics that are currently considered part of the discipline of computing science are technology rather than theory driven. This creates problems if academic departments are to impose scientific criteria during the assessment of PhDs. It is, therefore, important that people ask themselves "What is Research in Computing Science?" before starting on a higher degree.

Good research practice suggests that we should begin by defining our terms. The Oxford Concise Dictionary defines research as:

Research. 1. (a) the systematic investigation into and study of materials, sources, etc., in order to establish facts and reach new conclusions. (b) an endeavor to discover new or collate old facts etc., by the scientific study of a subject or by a course of critical investigation.

This definition is useful because it immediately focuses upon the systematic nature of research. In other words, the very meaning of the term implies a research method. These methods or systems essentially provide a model or structure for logical argument.

THE DIALECTIC OF RESEARCH

The highest level of logical argument can be seen in the structure of debate within a particular field. Each contribution to that debate falls into one of three categories:

Thesis
This presents the original statement of an idea. However, very few research contributions can claim total originality. Most borrow ideas from previous work, even if that research has been conducted in another discipline.

Antithesis
This presents an argument to challenge a previous thesis. Typically, this argument may draw upon new sources of evidence and is typical of progress within a field.

Synthesis
This seeks to form a new argument from existing sources. Typically, a synthesis might resolve the apparent contradiction between a thesis and an antithesis.

A good example of this form of dialectic is provided by the debate over prototyping. Some authors have argued that prototypes provide a useful means of generating and evaluating new designs early in the development process (thesis) (Fuchs, 1992). Others have presented evidence against this hypothesis by suggesting that clients often choose features of the prototyping environment without considering possible alternatives (antithesis) (Hayes and Jones, 1989). A third group of researchers have, therefore, developed techniques that are intended to reduce bias towards features of prototyping environments (synthesis) (Gravell and Henderson, 1996). Research in a field progresses through the application of methods to prove, refute and reassess arguments in this manner.
MODELS OF ARGUMENT

A more detailed level of logical argument can be seen in the structures of discourse that are used to support individual works of thesis, antithesis or synthesis.

PROOF BY DEMONSTRATION

Perhaps the most intuitively persuasive model for research is to build something and then let that artifact stand as an example for a more general class of solutions. There are numerous examples of this approach being taken within the field of computer science. It is possible to

argue that the problems of implementing multi-user operating systems were solved more through the implementation and growth of UNIX than through a more measured process of scientific enquiry.

However, there are many reasons why this approach is an unsatisfactory model for research. The main objection is that it carries high risks. For example, the artifact may fail long before we learn anything about the conclusion that we are seeking to support. Indeed, it is often the case that this approach ignores the formation of any clear hypothesis or conclusion until after the artifact is built. This may lead the artifact to become more important to the researcher than the ideas that it is intended to establish.

The lack of a clear hypothesis need not be the barrier that it might seem. The proof by demonstration approach has much in common with current engineering practice. Iterative refinement can be used to move an implementation gradually towards some desired solution. The evidence elicited during previous failed attempts can be used to better define the goal of the research as the work progresses. The key problem here is that the iterative development of an artifact, in turn, requires a method or structure. Engineers need to carefully plan ways in which the faults found in one iteration can be fed back into subsequent development. This is, typically, done through testing techniques that are based upon other models of scientific argument. This close relationship between engineering and scientific method should not be surprising:

engineering n. an application of science to the design, building and use of machines, construction etc. (The Oxford Concise Dictionary).

EMPIRICISM

The Western empirical tradition can be seen as an attempt to avoid the undirected interpretation of artifacts. It has produced the most dominant research model since the seventeenth century.
It can be summarized by the following stages:

Hypothesis Generation
This explicitly identifies the ideas that are to be tested by the research.

Method Identification
This explicitly identifies the techniques that will be used in order to establish the hypothesis. This is critical because it must be possible for one's peers to review and criticize the appropriateness of the methods that you have chosen. The ability to repeat an experiment is a key feature of strong empirical research.

Result Compilation
This presents and compiles the results that have been gathered from following the method. An important concept here is that of statistical significance: whether or not the observed results could be due to chance rather than an observable effect.

Conclusion
Finally, the conclusions are stated either as supporting the hypothesis or rejecting it. In the case that results do not support a hypothesis, it is important always to remember that this may be due to a weakness in the method. Conversely, successful results might be based upon incorrect assumptions. Hence, it is vital that all details of a method are made available to peer review.

This approach has been used to support many different aspects of research within Computing Science. For example, Boehm, Gray and Seewaldt (1984) used it to compare the effectiveness of specification and prototyping techniques for software engineering. Others have used it to compare the efficiency of searching and sorting algorithms. Researchers in Information Retrieval have even developed standard methods which include well known test sets to establish performance gains from new search engines.

There are many problems with the standard approach to scientific empiricism when applied to computing science. The principal objection is that many aspects of computing defy the use of probabilistic measures when analyzing the results of empirical tests. For example, many statistical measures rely upon independence between each test of a hypothesis. Such techniques clearly cannot be used when attempting to measure the performance of any system that attempts to optimise its performance over time; this rules out load balancing algorithms etc. Secondly, it can be difficult to impose standard experimental conditions upon the products of computer science. For example, if a program behaves in one way under one set of operating conditions then there is no guarantee that it will behave in the same way under another set of conditions. These conditions might go down to the level of alpha particles hitting memory chips. Thirdly, it can be difficult to generalise the results of tightly controlled empirical experiments.
For example, just because a user finds a system easy to use in a lab-based evaluation, there is no guarantee that another user will be able to use that product amidst the distractions of their everyday working environment. Finally, it is difficult to determine when a sufficient number of trials have been conducted to support many hypotheses. For example, any attempt to prove that a program always satisfies some property will almost certainly be doomed to failure using standard experimental techniques. The number of potential execution paths through even simple code makes it impossible to test properties against every possible execution path.

MATHEMATICAL PROOF

The dissatisfaction with empirical testing techniques has led many in the computing science research community to investigate other means of structuring arguments in support of particular conclusions. In the United Kingdom, much of this work has focused upon argumentation techniques that were originally developed to model human discourse and thought within the field of philosophy. For example, Burrows, Abadi and Needham (1990) adopted this approach to reason about the correctness of network authentication protocols.
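The earlier remark about execution paths, which is part of the motivation for turning to proof, can be made concrete with a rough sketch. It assumes a hypothetical program made of a sequence of independent two-way branches, so that each path corresponds to one choice of true or false per branch. The function names and the assumed testing rate of one million paths per second are illustrative only and do not come from the text.

```python
from itertools import product


def enumerate_paths(num_branches: int):
    """List every path through a hypothetical program with num_branches
    independent if/else decisions, one True/False choice per decision."""
    return list(product((False, True), repeat=num_branches))


def count_paths(num_branches: int) -> int:
    """Count the paths by enumeration (only feasible for small inputs)."""
    return len(enumerate_paths(num_branches))


def years_to_test_all(num_branches: int, paths_per_second: float = 1_000_000) -> float:
    """Estimate the time needed to run every path once at an assumed testing rate."""
    seconds = 2 ** num_branches / paths_per_second
    return seconds / (365 * 24 * 3600)


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, "branches:", count_paths(n), "paths")
    print("64 branches:", years_to_test_all(64), "years at the assumed rate")
```

The path count doubles with every added branch, so under these assumptions 64 branches already require more than half a million years of testing. Real programs with loops and input-dependent behaviour generally have far more paths than this simplified model suggests.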
