Syllabus for Statistics (B.Sc. CSIT) program


Tribhuvan University
Institute of Science & Technology (IOST)

Level: B.Sc.
Course Title: Statistics II
Course Code: STA 210
Credit Hrs: 3
Nature of the Course: Theory and Practical
Full Marks: 60 + 20 + 20
Pass Marks: 24 + 8 + 8

Course objectives: To impart theoretical as well as practical knowledge of estimation, testing of hypothesis, application of parametric and non-parametric statistical tests, design of experiments, multiple regression analysis, and the basic concept of stochastic processes, with special focus on data and problems related to computer science and information technology.

1. Sampling Distribution and Estimation [6]
Sampling distribution of mean and proportion; Concept of Central Limit Theorem; Concept of inferential statistics; Point estimation; Properties of a good estimator: unbiasedness, consistency, efficiency and sufficiency; Methods of estimation: maximum likelihood estimation, method of moments; Interval estimation: confidence interval and confidence coefficient, confidence limits, confidence interval of mean for normal population; Confidence interval for proportion; Determination of sample size, relationship of sample size with desired level of error.

2. Testing of hypothesis [8]
Types of statistical hypotheses: null and alternative hypothesis, type I and type II errors, level of significance, critical value and critical region, power of the test, concept of p-value and use of p-value in decision making, steps used in testing of hypothesis; One-sample tests for mean of normal population (for known and unknown variance), test for single proportion, test for difference between two means and two proportions, paired sample t-test; Linkage between confidence interval and testing of hypothesis; Assumptions for applying independent t-test and paired t-test; Test of equality of two variances.

3. Non parametric test [8]
Parametric vs. non-parametric test; Needs of applying non-parametric tests; One-sample test: Run test, Binomial test, Kolmogorov-Smirnov test; Two independent sample test: Median test, Kolmogorov-Smirnov test, Wilcoxon-Mann-Whitney test, Chi-square test; Paired-sample test: Wilcoxon signed rank test; Cochran's Q test; Friedman two-way analysis of variance test; Kruskal-Wallis test.

4. Multiple correlation and regression [6]
Multiple and partial correlation; Introduction of multiple linear regression; Least square estimation of parameters; Properties of least square estimators; Matrix approach to multiple linear regression; Hypothesis testing of multiple regression (up to two independent variables): test of significance of regression, test of individual regression coefficient; Model adequacy test: residual analysis, influential observation, multicollinearity; Coefficient of determination, adjusted R², and their interpretations.

5. Design of experiment [10]
Basic terminologies of experimental design; Basic principles of experimental designs; Completely Randomized Design (CRD): statistical analysis of CRD, ANOVA table, advantages and disadvantages, concept of multiple comparisons; Randomized Block Design (RBD): statistical analysis of RBD for one observation per experimental unit, ANOVA table, efficiency of RBD relative to CRD, estimation of missing value (one observation only), advantages and disadvantages; Latin Square Design (LSD): statistical analysis of m x m LSD for one observation per experimental unit, ANOVA table, estimation of missing value in LSD (one observation only), efficiency of LSD relative to RBD, advantages and disadvantages.

6. Stochastic Process [7]
Definition and classification; Markov process: Markov chain, matrix approach, steady-state distribution; Counting process: Binomial process, Poisson process; Simulation of stochastic process; Queuing system: main components of queuing system, Little's law; Bernoulli single-server queuing process: system with limited capacity; M/M/1 system: evaluating the system performance.
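As a rough illustration of the queuing topics in the Stochastic Process unit (Little's law and evaluating the performance of a single-server system with Poisson arrivals and exponential service), the sketch below computes the standard steady-state measures. The function name `mm1_metrics` and the rates used in the example are hypothetical choices made for this illustration and are not part of the syllabus.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue (illustrative helper).

    arrival_rate: mean arrivals per unit time (lambda)
    service_rate: mean services per unit time (mu)
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        # Without lambda < mu the queue grows without bound.
        raise ValueError("no steady state: arrival rate must be below service rate")

    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "mean_in_system": rho / (1 - rho),                        # L
        "mean_in_queue": rho ** 2 / (1 - rho),                    # Lq
        "mean_response_time": 1 / (service_rate - arrival_rate),  # W
        "mean_waiting_time": rho / (service_rate - arrival_rate), # Wq
    }


# Hypothetical rates: 2 jobs arrive per minute, the server completes 5 per minute.
metrics = mm1_metrics(arrival_rate=2.0, service_rate=5.0)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Little's law can be checked on the result: the mean number of jobs in the system should equal the arrival rate times the mean response time, and the same relation holds between the queue length and the waiting time.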

Practical (Computational Statistics): [15]
Practical problems to be covered in the Computerized Statistics laboratory (using any statistical software such as SPSS, STATA etc., whichever is convenient):

S. No.  Title of the practical problems                                                   No. of practical problems
1       Sampling distribution, random number generation, and computation of sample size   1
2       Methods of estimation (including interval estimation)                             1
3       Parametric tests (covering most of the tests)                                     3
4       Non-parametric test (covering most of the tests)                                  3
5       Partial correlation                                                               1
6       Multiple regression                                                               1
7       Design of Experiments                                                             3
8       Stochastic process                                                                2
        Total number of practical problems                                                15

Text Books:
1. Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, & Keying Ye (2012). Probability & Statistics for Engineers & Scientists. 9th Ed., Prentice Hall.
2. Michael Baron (2013). Probability and Statistics for Computer Scientists. 2nd Ed., CRC Press, Taylor & Francis Group, A Chapman & Hall Book.

Reference Books:
1. Douglas C. Montgomery & George C. Runger (2003). Applied Statistics and Probability for Engineers. 3rd Ed., John Wiley and Sons, Inc.
2. Sidney Siegel & N. John Castellan, Jr. Nonparametric Statistics for the Behavioral Sciences. 2nd Ed., McGraw Hill International Editions.

Tribhuvan University
Institute of Science and Technology
Model Question

Bachelor Level/Second Year/Third Semester/Science
Computer Science and Information Technology STA 210 (Statistics II)
Full Marks: 60
Pass Marks: 24
Time: 3 Hours

Candidates are required to give their answers in their own words as far as practicable. All notations have the usual meanings.

Group A
Attempt any Two questions (2 × 10 = 20)

1. Suppose a population of 4 computers has lifetimes of 3, 5, 7 and 9 years. Comment on the population distribution. Assuming that you sample with replacement, select all possible samples of size n = 2, construct the sampling distribution of the mean, and compare the population distribution with the sampling distribution of the mean. Compare the population mean with the mean of all sample means, and the population variance with the variance of the sample means, and comment on them with the support of theoretical considerations, if any.

2. A computer manager is keenly interested to know how the efficiency of her new computer program depends on the size of incoming data and on the data structure. Efficiency will be measured by the number of processed requests per hour. Data structure may be measured by how many tables were used to arrange each data set. All the information was put together as follows.

Data size (gigabytes)    6   7   7   8  10  10  15
Number of tables         4  20  20  10  10   2   1
Processed requests      40  55  50  41  17  26  16

Identify the dependent variable. Fit the appropriate multiple regression model and provide problem-specific interpretations of the fitted regression coefficients.

3. State and explain the mathematical model for the randomized complete block design. Explain all the steps to be adopted to carry out the analysis, and finally prepare the ANOVA table.

Group B
Attempt any Eight questions (8 × 5 = 40)

4. In order to ensure efficient usage of a server, it is necessary to estimate the mean number of concurrent users. According to records, the average number of concurrent users at 100 randomly selected times is 37.7, with a sample standard deviation of 9.2. At the 1% level of significance, do these data provide considerable evidence that the mean number of concurrent users is greater than 35? Draw your conclusion based on your result.

5. A sample of 250 items from lot A contains 10 defective items, and a sample of 300 items from lot B is found to contain 18 defective items. At a significance level α = 0.05, is there a significant difference between the quality of the two lots?

6. Modern email servers and anti-spam filters attempt to identify spam emails and direct them to a junk folder. There are various ways to detect spam, and research still continues. In this regard, an information security officer tries to confirm that the chance for an email to be spam depends on whether it contains images or not. The following data were collected on n = 1000 random email messages.

Spam status   With images   No images   Total
Spam          160           240         400
No spam       140           460         600
Total         300           700         1000

Assess whether being spam and containing images are independent factors at the 1% level of significance.

7. Two computer makers, A and B, compete for a certain market. Their users rank the quality of computers on a 4-point scale as "Not satisfied", "Satisfied", "Good quality", and "Excellent quality, will recommend to others". The following counts were observed:

Computer maker   Not satisfied   Satisfied   Good quality   Excellent quality
A                20              40          70             20
B                0               30          40             20

Is there a significant difference in customer satisfaction with the computers produced by A and by B? Use the Mann-Whitney U test at the 5% level of significance.

8. Define queuing systems with suitable examples. Also explain the main components of queuing systems in brief.

9. In some town, each day is either sunny or rainy. A sunny day is followed by another sunny day with probability 0.7, whereas a rainy day is followed by a sunny day with probability 0.4. Weather conditions in this problem represent a homogeneous Markov chain with 2 states: state 1 = sunny and state 2 = rainy. The transition probability matrix of sunny and rainy days is given below.

P = | 0.7  0.3 |
    | 0.4  0.6 |

Compute the probability of sunny days and rainy days using the steady-state equation for this Markov chain.

10. Consider a completely randomized design with 4 treatments with 7 observations in each. For the ANOVA summary table below, fill in all the missing results. Also indicate your statistical decision.

Source       Degrees of freedom   Sum of Squares   Mean Sum of Squares   F-ratio
Treatments   ?                    SSA = ?          70                    F = ?
Error        ?                    SSE = 590        ?
Total        ?                    SST = ?

11. Following are the scores obtained by 10 university staff on computer proficiency skills before training and after training. It was assumed that proficiency in computer skills is expected to increase after training.

Staff   Before training   After training
1
2       50                55
3       30                40
4       5                 30
5       22                30
6       34                36
7       45                45
8       40                4
9       0                 30
10      26                40

Test at the 5% level of significance whether the training is effective in improving computer proficiency skills, applying an appropriate statistical test. Assume that the scores follow a normal distribution.

12. Write short notes on the following.
a) Concept of Latin Square Design
b) Multiple correlation
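The enumeration asked for in the first Group A question (all samples of size 2 drawn with replacement from four lifetimes) can be sketched in a few lines. This is only an illustration, not a model answer: it takes the lifetimes 3, 5, 7 and 9 exactly as printed in the question, and the helper names `mean` and `variance` are chosen here for convenience. Exact fractions are used so that the comparison with the theoretical results (mean of sample means equal to the population mean, variance of sample means equal to the population variance divided by n) is not blurred by rounding.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def mean(values):
    """Exact arithmetic mean of a sequence of numbers."""
    values = [Fraction(v) for v in values]
    return sum(values) / len(values)


def variance(values):
    """Exact population variance (divisor N, not N - 1)."""
    values = [Fraction(v) for v in values]
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)


lifetimes = [3, 5, 7, 9]  # population values as printed in the question
n = 2                     # sample size

# All ordered samples of size n drawn with replacement (4**2 = 16 samples).
samples = list(product(lifetimes, repeat=n))
sample_means = [mean(s) for s in samples]

# Sampling distribution of the mean: every sample is equally likely.
counts = Counter(sample_means)
distribution = {m: Fraction(c, len(samples)) for m, c in sorted(counts.items())}

population_mean = mean(lifetimes)
population_variance = variance(lifetimes)
mean_of_sample_means = mean(sample_means)
variance_of_sample_means = variance(sample_means)

for m, p in distribution.items():
    print(f"sample mean = {m}, probability = {p}")
print("population mean:", population_mean)
print("mean of sample means:", mean_of_sample_means)
print("population variance:", population_variance)
print("variance of sample means:", variance_of_sample_means)
```

With these values the two means agree, and the variance of the sample means is half the population variance, which matches the theoretical relation for n = 2.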