Research Methodology Day 2 Bijay Lal Pradhan, Ph.D.
Research Problem
A problem is an intellectual stimulus calling for an answer in the form of scientific inquiry. For example, "What incentives lead to energy conservation?", "How can inflation be controlled?" and "Does social class influence voting behavior?" are research problems. Not all intellectual stimuli can be studied empirically. For example, "Is blue nicer than green?" and "Is impressionism the most advanced form of art?" cannot, because they concern subjective preferences, beliefs, values or tastes.
The components of research problems
- There must be an individual or a group that has some difficulty or problem.
- There must be some objectives to be attained. If one wants nothing, one cannot have a problem.
- There must be alternative means (or courses of action) for attaining the objectives. That is, at least two means must be available to the researcher; with no choice of means there is no problem.
- There must remain some doubt in the mind of the researcher about which alternative to select. That is, the research must answer the question of the relative efficiency of the possible alternatives.
- There must be some environment to which the difficulty pertains.
Research problem
Thus, a research problem is one which requires a researcher to find the best solution to the given problem. A scientific enquiry is undertaken that leads to the solution of the problem. The first step in formulating the research is to make the problem concrete and explicit. A research worker should identify some aspect of the topic that can be formulated into a specific research question which is feasible to investigate with the resources available. Several factors may complicate the problem.
Criteria for selecting a problem
The criteria for selecting a problem may be: the interest of the researcher, the amenability of the problem to research, and the feasibility of the problem in terms of resources. Goode and Hatt give the following criteria:
- The researcher's interest, intellectual curiosity and drive
- Practicability
- The urgency of the problem
- Anticipated or expected outcomes, and their importance for the field represented and for implementation
- Resources: training and personal qualifications of the personnel; availability of special equipment, data, methods, time and sponsorship; and the administrator's cooperation
Characteristics of a good problem
- The problem should be allied with the chain of thinking.
- It should express a relation between two or more variables.
- It should be stated clearly and unambiguously.
- It should imply possibilities of empirical testing through the collection of data pertaining to the problem.
- The general problem is converted into several research questions.
- It does not represent a moral or ethical position.
Importance of formulating research problem
The formulation of a research problem is the first important step of the research process. It is like identifying a destination before undertaking a journey, or laying the foundation of a building. The research problem serves as the foundation of a research study: if it is well formulated, you can expect a good study to follow. You must have a clear idea of what you want to find out about, not what you think you must find.
Importance of formulating research problem
There is a wise saying: "A question well stated is a question half answered." Research questions keep you from getting lost or off track when looking for information. Once the research problem is stated, it determines almost every step that follows: study design, sampling strategy, instrument, analysis, etc.
Steps in Problem formulation
1. Stating the problem in a broad, general way, keeping in view either a practical concern or an intellectual interest
2. Understanding the nature of the problem
3. Surveying the available literature
4. Developing the ideas through discussion
5. Rephrasing the research problem
Research Question
Research questions constitute the most important element of any research. They describe the ideas contained in the research objectives and point out the data that need to be collected in a study. Formulating research questions is the real starting point in preparing a research study. The questions relate to three aspects: what, why and how. "What" questions seek descriptions, "why" questions seek explanation and understanding, and "how" questions seek interventions to bring about change.
Research Question
In practice, the researcher does not have clearly formulated research questions at the time of taking up the research, though there may be some loosely connected ideas about what should be researched.
Theoretical Framework
In this step we integrate the information logically so that the reason for the problem can be conceptualized. The critical variables are examined and the associations among them are identified. A theoretical framework is then developed that brings together all the variables and their associations.
Theoretical Framework
1. Background concept (broad concept)
2. Systematized concept (explicit definition)
3. Indicators (measures; variable identification)
4. Scores for cases (identification of unit of measurement)
Constructs
3. Conceptual: describes a concept in terms of other concepts.
4. Operational: bridges the conceptual and empirical levels. It makes it possible to confirm the existence of concepts that have no directly observable characteristics.
Hypothesis
Two words: hypo (under) + thesis (reasoned theory)
- A theory which is not fully reasoned
- A tentative answer to the research question
- An imagined idea or guess, based on previously accumulated knowledge, which can be put to test to determine its validity
- Generally specifies a relationship between variables, or a specific value
Hypothesis
As a researcher you do not know about the phenomenon or situation, but you do have a hunch that forms the basis of certain assumptions or guesses. You test these by collecting information that will enable you to conclude whether your hunch was right.
- It is a tentative proposition.
- Its validity is unknown.
- In most cases, it specifies a relationship between two or more variables.
Source of Hypothesis
- Culture of the society: culture has great influence upon the thinking process of people. Example: caste is related to voting behavior among Nepalese.
- Scientific study (past research)
- Personal experience: very often researchers see evidence of some behaviour pattern in their daily lives.
Characteristics of hypothesis
- Specific
- Conceptually clear
- Related to available techniques
- Related to a body of theory
- Capable of empirical test
Research Design
Research design is the plan and structure of investigation, so conceived as to obtain answers to research questions. The plan is the overall scheme or program of the research. It includes an outline of what the researcher will do, from writing the hypotheses and their operational implications to the final analysis of data.
Research design and the methods you use to collect your data are not the same thing. Data collection and its analysis are parts of research design.
Sampling
The basic idea of sampling is that by selecting some of the elements in a population, we may draw conclusions about the entire population. A population is the total collection of elements about which we wish to make some inferences.
Random sampling: Every element in the sampling frame has an equal chance of selection.
Stratified sampling: The population is divided into segments (strata) and the sample is constrained to include elements from each segment. For example, university students can be divided by class level, school or major, gender and so forth.
Cluster sampling: The population is divided into groups of elements, and some of the groups are randomly selected for study.
Judgmental sampling occurs when a researcher selects sample members to conform to some criterion. For example, in a study of students' problems you might put questions only to student union representatives.
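The three probability sampling schemes above can be sketched in a few lines of Python. This is only an illustration: the student list, the function names and the equal allocation per stratum are assumptions made for the example, not part of the lecture.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Every element of the sampling frame has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(frame, n)

def stratified_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Draw a random sample from every stratum (equal allocation, an assumption)."""
    rng = random.Random(seed)
    strata = {}
    for element in frame:
        strata.setdefault(stratum_of(element), []).append(element)
    sample = []
    for key in sorted(strata):
        members = strata[key]
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(frame, cluster_of, n_clusters, seed=None):
    """Randomly pick whole clusters and keep every element inside them."""
    rng = random.Random(seed)
    clusters = {}
    for element in frame:
        clusters.setdefault(cluster_of(element), []).append(element)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [element for key in chosen for element in clusters[key]]

# Hypothetical sampling frame: (student id, class level, college)
students = [(i, f"year{i % 4 + 1}", f"college{i % 5}") for i in range(40)]

srs = simple_random_sample(students, 8, seed=1)
strat = stratified_sample(students, lambda s: s[1], 2, seed=1)
clus = cluster_sample(students, lambda s: s[2], 2, seed=1)
print(len(srs), len(strat), len(clus))
```

Judgmental sampling has no random step, so it would simply be a filter on the frame (for example, keeping only union representatives).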
Data Collection
- Direct personal interview
- Indirect oral interview
- Questionnaire through enumerators
- Questionnaire through mail
- Data from local agents (correspondence)
- Telephone interview
- Online survey
- Observation
Data Collection Techniques
Method of data collection
  Primary sources
    Observation: participant, non-participant
    Interviewing: structured, unstructured
    Questionnaire: mailed questionnaire, collective questionnaire
  Secondary sources
    Documents: govt. publications, earlier research, census, personal records, client histories, service records
Stages of Data Processing & Analysis
1. Editing
2. Coding
3. Data entry
4. Error checking and verification
5. Data analysis
Data Processing
- Scrutinizing
- Data entry
- Checking and updating
Coding
Coding is the process of assigning symbols (alphabetical, numerical or both) to answers so that responses can be recorded in a limited number of classes or categories. It is necessary for the efficient analysis of data. Consider the following examples.
What is your sex? Male / Female
Here the response Male could be coded as 0 and Female as 1.
What is your age?
- <10 yrs: coded as A
- 10-25 yrs: coded as B
- 25-50 yrs: coded as C
- 50 or above: coded as D
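The coding scheme above might be applied programmatically as in the following minimal sketch. The names `SEX_CODES` and `code_age` are hypothetical, and because the age classes as listed share their end points (25 and 50), the sketch assumes each boundary value belongs to the higher class.

```python
# Codes for the question "What is your sex?"
SEX_CODES = {"Male": 0, "Female": 1}

def code_age(age):
    """Return the letter code for an age in years.

    Assumption: boundary ages go to the higher class,
    i.e. 10 -> B, 25 -> C and 50 -> D.
    """
    if age < 10:
        return "A"
    if age < 25:
        return "B"
    if age < 50:
        return "C"
    return "D"

# Hypothetical respondents: (sex, age)
respondents = [("Male", 8), ("Female", 25), ("Female", 61)]
coded = [(SEX_CODES[sex], code_age(age)) for sex, age in respondents]
print(coded)
```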
Organize data into:
- A field: a collection of characters (a single number, letter or symbol) representing a single type of data.
- A record: a collection of related fields.
- A file: a collection of related records.
Figure 1: Data Transferring from Questionnaire (Entity 1, Entity 2, ..., Entity n fill in Questionnaire 1, Questionnaire 2, ..., Questionnaire n, which are transferred to the computer). Each questionnaire contains responses to m questions Q1, Q2, ..., Qm.
Single-answered question: a question whose expected number of answers is one.
Multiple-answered question: a question whose expected number of answers is more than one.
Example of a Questionnaire
Q1. How many employees are in your office?
Q2. Please mention the number of employees by their level of education: School education / College education / University education
Q3. Have you ever heard about XYZ scheme? Yes / No (if No, go to Q5)
Q4. How did you hear about XYZ scheme? Mention source: Radio / Television / Papers
Q5. How is your business running in the current situation? Very well / Well / Satisfactory / Bad / Very bad
Note that Q1, Q3 and Q5 are single-answered questions and Q2 and Q4 are multiple-answered questions. Q1 and Q2 are quantitative variables and Q3, Q4 and Q5 are qualitative variables.
Defining Variables: Some Suggestions
Variables are symbols used to designate characteristics of the entities under consideration. They are defined by users so that the responses of each entity can be transferred into appropriate columns. An efficient way of defining variables is as follows.
1. Treat each single-answered question as a variable. It is a good idea to name such variables by their question numbers.
2. For a multiple-answered question, first determine the possible number of answers or categories. Treat each category as a separate variable. It is a good idea to name such variables by the question number followed by additional letters or numbers, separated if necessary by the underscore symbol ( _ ).
List of Variables of the Example Questionnaire
Consider the previous questionnaire, where there are 9 possible variables. Their names and labels are as follows.

Variable name   Variable label
Q1              # of employees
Q2a             # of employees with school level education
Q2b             # of employees with college level education
Q2c             # of employees with university level education
Q3              Awareness status of XYZ scheme
Q4a             Source of awareness is radio
Q4b             Source of awareness is TV
Q4c             Source of awareness is paper
Q5              Business condition
Assigning Codes for Textual Data
Codes are assigned to textual responses. The variables Q3, Q4a, Q4b, Q4c and Q5 introduced in the example questionnaire are qualitative in nature. A coding system for these variables is:

Variable   Code values
Q3         1 = Yes and 0 = No
Q4a        1 if heard from radio & 0 otherwise
Q4b        1 if heard from TV & 0 otherwise
Q4c        1 if heard from paper & 0 otherwise
Q5         1 = Very bad, 2 = Bad, 3 = Satisfactory, 4 = Well, 5 = Very well
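As an illustration, one filled-in questionnaire could be turned into a record of the nine variables as sketched below. The function name `code_response`, its argument layout and the sample answers are assumptions made for this example; only the variable names and code values come from the slides.

```python
Q5_CODES = {"Very bad": 1, "Bad": 2, "Satisfactory": 3, "Well": 4, "Very well": 5}
Q4_SOURCES = {"Radio": "Q4a", "Television": "Q4b", "Papers": "Q4c"}

def code_response(q1, q2, q3, q4, q5):
    """Convert raw answers into one record (a dict of variable -> coded value).

    q2 is a dict of employee counts by education level;
    q4 is the collection of sources ticked (empty when Q3 is "No").
    """
    record = {
        "Q1": q1,
        "Q2a": q2["school"],
        "Q2b": q2["college"],
        "Q2c": q2["university"],
        "Q3": 1 if q3 == "Yes" else 0,
    }
    # One 0/1 variable per possible answer of the multiple-answered Q4
    for source, variable in Q4_SOURCES.items():
        record[variable] = 1 if source in q4 else 0
    record["Q5"] = Q5_CODES[q5]
    return record

# Hypothetical respondent
record = code_response(
    q1=12,
    q2={"school": 5, "college": 4, "university": 3},
    q3="Yes",
    q4={"Radio", "Papers"},
    q5="Well",
)
print(record)
```

A list of such records, one per questionnaire, corresponds to the "file" of records and fields described earlier.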
What is data analysis?
Data analysis deals with the problem of deriving the relevant information contained in a given data file. It is one of the major topics of statistics. It is carried out in order to fulfill the research objectives with empirical evidence. The derived information generally appears in one or more of the following forms (the list is not exhaustive):
1. small tables (frequency tables)
2. graphs or diagrams (histogram, bar graph, pie chart etc.)
3. summary statistics (percentage, mean, standard deviation etc.)
4. models (linear models, factor models etc.)
5. indicators or indices (per capita income/consumption etc.)
Descriptive & Inferential Statistics
Statistics
  Descriptive: tabular, graphical
  Inferential
    Estimation: point, interval
    Hypothesis testing: parametric, non-parametric
The methods of inferential statistics are applicable when results are obtained from a random sample. Uncertainty always remains when generalizing results from a sample to a population. In inferential statistics the degree of uncertainty is measured in terms of probability.
Univariate Data Analysis
Analysis of the data of a single variable at a time is univariate analysis. Suitable univariate methods, by scale of the variable, are listed below.

Nominal or ordinal data:
1. Prepare frequency table
2. Compute mode
3. Compute median (ordinal)
4. Draw graphs: bar diagram, pie chart
5. Chi-square test

Ratio or interval data:
1. Prepare frequency table (discrete)
2. Compute mean, median and mode
3. Compute positional statistics
4. Compute SD, range etc.
5. Draw graphs: histogram (continuous), bar diagram (discrete)
6. Z, t, F and χ² tests
7. Transform into categorical
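For an ordinal variable, the first three steps (frequency table, mode, median) can be sketched with the Python standard library. The ten coded responses below are invented for illustration, using the 1 to 5 coding of Q5 from the example questionnaire; they are not survey results.

```python
from collections import Counter
from statistics import median_low, mode

# Hypothetical coded answers to Q5 (1 = Very bad ... 5 = Very well)
responses = [3, 4, 4, 2, 5, 4, 3, 1, 4, 3]

# Frequency table with relative frequencies in percent
freq = Counter(responses)
n = len(responses)
for code in sorted(freq):
    print(code, freq[code], round(100 * freq[code] / n, 1))

most_common_code = mode(responses)
# median_low always returns an observed category, which suits ordinal codes
middle_code = median_low(responses)
print(most_common_code, middle_code)
```

The mean and standard deviation are deliberately left out here, since they are listed above only for ratio or interval data.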
Bivariate Data Analysis
Analysis of the data of two variables at a time. The kinds of analysis, by scale of the variables, are listed below.

Nominal or ordinal scale:
1. Prepare two-way frequency tables
2. Compute row or column percentages
3. Draw charts and diagrams
4. Test hypotheses (chi-square test of independence)

Ratio or interval scale:
1. Prepare two-way frequency tables
2. Draw scatter diagram
3. Test hypotheses (chi-square, z, t, F tests)
4. Carry out correlation & regression analysis
t-test for Difference between Means
The Nepal Rural Credit Survey 1992 is a national-level survey of more than seven thousand households. A poverty profile based on the data is presented below. The absolute difference ($\delta$) between the poor and non-poor groups is found to be highly significant, which is seen by testing the null hypothesis $H_0: \delta = 0$ against $H_1: \delta \neq 0$. A very small p-value is considered evidence against $H_0$.

Indicators                      Poor     Non-poor
Average farm size in ha         0.771    1.249
Average household size          6.72     5.19
Average children/household      2.19     1.47
Average literates/household     1.77     2.70
Chi-square Test of Independence
A total of 7264 households were categorized by farm category (FC) and poverty status (PS). The data are arranged below. We wish to test the null hypothesis H0: there is no association between the two variables FC and PS, against the alternative hypothesis H1: there is an association.

                   Farm Category
Poverty Status     L/M     Small    Large    Row total
Poor               1279    1225     180      2684
Non-poor           1516    2243     821      4580
Column total       2795    3468     1001     7264
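The test statistic for this table can be computed by hand or with a short script. The sketch below uses the usual Pearson formula, the sum of (observed - expected)^2 / expected with expected = row total x column total / grand total; the function name is hypothetical, and 5.991 is the standard 5% critical value of the chi-square distribution with 2 degrees of freedom.

```python
# Observed counts from the table: columns are L/M, Small, Large
observed = {
    "Poor": [1279, 1225, 180],
    "Non-poor": [1516, 2243, 821],
}

def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for a two-way table."""
    rows = list(table.values())
    row_totals = [sum(row) for row in rows]
    col_totals = [sum(col) for col in zip(*rows)]
    n = sum(row_totals)
    # Expected count under independence: row total * column total / grand total
    expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(rows, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df, expected

chi2, df, expected = chi_square_independence(observed)
CRITICAL_5PCT_DF2 = 5.991  # chi-square critical value, alpha = 0.05, df = 2
print(round(chi2, 1), df, chi2 > CRITICAL_5PCT_DF2)
```

The statistic comes out far above the critical value, so H0 (no association between FC and PS) is rejected at the 5% level.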
Contingency Table showing Association
The association between FC and PS is seen more vividly in the following conditional percentage distribution table, derived from the previous one. Note that the percentages are of column totals.

                   Farm Category
Poverty Status     L/M      Small    Large
Poor               45.8     35.3     18.0
Non-poor           54.2     64.7     82.0
Total              100.0    100.0    100.0

Clearly, the percentage of poor households decreases as farm size increases.
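The column percentages can be reproduced from the household counts with a few lines of Python. This is a minimal sketch; the variable names are chosen for the example.

```python
farm_categories = ["L/M", "Small", "Large"]
counts = {
    "Poor": [1279, 1225, 180],
    "Non-poor": [1516, 2243, 821],
}

# Column totals: number of households in each farm category
col_totals = [sum(col) for col in zip(*counts.values())]

# Each cell as a percentage of its column total, rounded to one decimal
col_pct = {
    status: [round(100 * c / t, 1) for c, t in zip(row, col_totals)]
    for status, row in counts.items()
}

for status, row in col_pct.items():
    print(status, dict(zip(farm_categories, row)))
```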
Time Series and Cross-section Data Analysis
Time series data analysis: mainly concerned with estimating growth rates, change over time, and forecasts of the future value of a variable.
Cross-sectional data analysis: mainly concerned with the comparative analysis of socio-economic variables across different population or household groups, using correlation and regression analysis.
Example 1: Univariate Analysis of SLC Exam 2004 Data

Frequency distribution of schools and students by type of school
Type of School   Schools (absolute)   Schools (relative %)   # of appeared students   # of passed students
Public           3352                 72.4                   133957                   49451
Private          1281                 27.6                   35596                    29898
Total            4633                 100.0                  169553                   79349

Type of School   Average number of students per school   % of passed students
Public           40.0                                    36.9
Private          27.8                                    84.0
Total            36.6                                    46.8
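The two derived columns (average number of students per school and percentage passed) follow directly from the counts in the first table. A minimal sketch, with hypothetical names, is:

```python
slc = {
    "Public": {"schools": 3352, "appeared": 133957, "passed": 49451},
    "Private": {"schools": 1281, "appeared": 35596, "passed": 29898},
}

def summarise(row):
    """Return (average students per school, % of appeared students who passed)."""
    avg_students = round(row["appeared"] / row["schools"], 1)
    pass_rate = round(100 * row["passed"] / row["appeared"], 1)
    return avg_students, pass_rate

# The Total row is the sum of the public and private rows
total = {key: sum(row[key] for row in slc.values()) for key in ("schools", "appeared", "passed")}

for name, row in {**slc, "Total": total}.items():
    print(name, summarise(row))
```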
Example 2: Univariate Analysis of SLC Exam 2004 Data

Frequency distribution of the # of schools appearing in the exam, by development region (DR)
DR       Frequency   %
EDR      1000        21.6
CDR      1714        37.0
WDR      1100        23.7
MWDR     439         9.5
FWDR     380         8.2
Total    4633        100.0

Frequency distribution of the # of schools appearing in the exam, by ecological region (ER)
ER         Frequency   %
Terai      1482        32.0
Hill       2783        60.1
Mountain   368         7.9
Total      4633        100.0
Example 3: Bivariate Analysis of SLC Exam 2004 Data

Number of schools participating in the exam, by DR and ER
DR       Terai   Hill    Mountain   Total
EDR      507     390     103        1000
CDR      440     1166    108        1714
WDR      244     846     10         1100
MWDR     164     220     55         439
FWDR     127     161     92         380
Total    1482    2783    368        4633
Example 4: Bivariate Analysis of SLC Exam 2004 Data

Number of students appearing in the exam, by DR and ER
DR       Terai    Hill     Mountain   Total
EDR      27167    15047    2556       44770
CDR      20352    36678    3727       60757
WDR      10426    29507    98         40031
MWDR     5690     6645     937        13272
FWDR     5131     3776     1816       10723
Total    68766    91653    9134       169553
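Two-way tables like those in Examples 3 and 4 can be combined cell by cell. As one possible illustration (this derived table is not part of the original slides), the sketch below divides students by schools to get the average number of students per school in each DR and ER cell.

```python
ecological = ["Terai", "Hill", "Mountain"]

# Schools participating (Example 3) and students appearing (Example 4)
schools = {
    "EDR": [507, 390, 103],
    "CDR": [440, 1166, 108],
    "WDR": [244, 846, 10],
    "MWDR": [164, 220, 55],
    "FWDR": [127, 161, 92],
}
students = {
    "EDR": [27167, 15047, 2556],
    "CDR": [20352, 36678, 3727],
    "WDR": [10426, 29507, 98],
    "MWDR": [5690, 6645, 937],
    "FWDR": [5131, 3776, 1816],
}

# Average students per school in every cell of the DR x ER table
avg_per_school = {
    dr: {er: round(students[dr][i] / schools[dr][i], 1) for i, er in enumerate(ecological)}
    for dr in schools
}

for dr, row in avg_per_school.items():
    print(dr, row)
```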
Generalization
If a hypothesis is tested, it may be possible for the researcher to arrive at a generalization, i.e., to build a theory. As a matter of fact, the real value of research lies in its ability to arrive at certain generalizations. If the researcher had no hypothesis to start with, he might seek to explain his findings on the basis of some theory. This is known as interpretation. The process of interpretation may quite often trigger new questions, which in turn may lead to further research.