Statistics for Risk Modeling Exam September 2018


IMPORTANT NOTICE

This version of the syllabus is final, though minor changes may occur. This March 2018 version includes updates to this page and to the learning outcomes. The list of readings has not changed.

The Statistics for Risk Modeling exam is a three and one-half hour exam that consists of 35 multiple-choice questions and is administered as a computer-based test (CBT). For additional details, please refer to Exam Rules.

The goal of the syllabus for this examination is to provide an understanding of the basics of several important analytic methods. This exam is a prerequisite for the Predictive Analytics exam, which will go deeper into each of the covered techniques. The Statistics for Risk Modeling Exam assumes knowledge of probability and mathematical statistics as covered in the Exam P and VEE Mathematical Statistics subjects.

The following learning objectives are presented with the understanding that candidates are allowed to use specified calculators on the exam. A variety of tables is available below for the candidate and will be provided to the candidate at the examination. These include values for the standard normal distribution, t distribution and chi-square distribution. Candidates will not be allowed to bring copies of the tables into the examination room.

Please check the Updates section on this exam's home page for any changes to the exam or syllabus.

Each multiple-choice problem includes five answer choices identified by the letters A, B, C, D, and E, only one of which is correct. Candidates must indicate responses to each question on the computer. Candidates will be given three and one-half hours to complete the exam.

As part of the computer-based testing process, a few pilot questions will be randomly placed in the exam (both paper and pencil and computer-based forms). These pilot questions are included to judge their effectiveness for future exams, but they will NOT be used in the scoring of this exam. All other questions will be considered in the scoring. All unanswered questions are scored incorrect. Therefore, candidates should answer every question on the exam.

There is no set requirement for the distribution of correct answers for the multiple-choice preliminary examinations. It is possible that a particular answer choice could appear many times on an examination or not at all. Candidates are advised to answer each question to the best of their ability, independently from how they have answered other questions on the exam.

Since the CBT exam will be offered over a period of a few days, each candidate will receive a test form composed of questions selected from a pool of questions. Statistical scaling methods are used to ensure, within reasonable and practical limits, that all forms of the test given during the same testing period of a few days are comparable in content and passing criteria. The methodology that has been adopted is used by many credentialing programs that give multiple forms of an exam.

Because this is a new exam, results for the first several administrations will not be instantaneous. Results will be released on the SOA website about 8 weeks after each testing window ends.

The ranges of weights shown below are intended to apply to the large majority of exams administered. On occasion, the weights of topics on an individual exam may fall outside the published range. Candidates should also recognize that some questions may cover multiple learning objectives.

For this exam, ability to solve problems using the R programming language will not be assumed. However, questions may present R output for interpretation.

LEARNING OBJECTIVES

1. Topic: Basics of Statistical Learning (7.5-12.5%)

The Candidate will understand key concepts of statistical learning.

a) Explain the types of modeling problems and methods, including supervised versus unsupervised learning and regression versus classification.
b) Explain the common methods of assessing model accuracy.
c) Employ basic methods of exploratory data analysis, including data checking and validation.

2. Topic: Linear Models (40-50%)

The Candidate will understand key concepts concerning generalized linear models.

a) Describe and explain the components of, in particular, the exponential family of distributions and link functions.
b) Estimate parameters using least squares and maximum likelihood.
c) Interpret diagnostic tests of model fit and assumption checking, using both graphical and quantitative methods.
d) Select an appropriate model, considering:
   - Distributions and link functions
   - Variable transformations and interactions
   - Pearson chi-square statistic
   - t and F tests
   - AIC and BIC
   - Likelihood ratio test
e) Interpret model results with emphasis on using the model to answer the underlying business question.
f) Calculate and interpret predicted values, confidence, and prediction intervals.
g) Understand how approaches may differ compared to using an ordinary least squares model, including lasso, ridge regression, and KNN.

3. Topic: Time Series Models (12.5-17.5%)

The Candidate will understand key concepts concerning regression-based time series models.

a) Define and explain the concepts and components of stochastic time series processes, including random walks, stationarity, and autocorrelation.
b) Describe specific time series models, including exponential smoothing, autoregressive, and autoregressive conditionally heteroskedastic models.
c) Calculate and interpret predicted values and confidence intervals.

4. Topic: Principal Components Analysis (2.5-7.5%)

The Candidate will understand key concepts concerning principal components analysis.

a) Define principal components.
b) Interpret the results of a principal components analysis, considering loading factors and proportion of variance explained.
c) Explain uses of principal components.

5. Topic: Decision Trees (10-15%)

The Candidate will understand key concepts concerning decision tree models.

a) Explain the purpose and uses of decision trees.
b) Explain and interpret decision trees, considering regression trees and recursive binary splitting.
c) Explain and interpret bagging, boosting, and random forests.
d) Explain and interpret classification trees, their construction, Gini index, and entropy.
e) Compare decision trees to linear models.
f) Interpret the results of a decision tree analysis.

6. Topic: Cluster Analysis (10-15%)

The Candidate will understand key concepts concerning cluster analysis.

a) Explain the uses of clustering.
b) Explain K-means clustering.
c) Explain hierarchical clustering.
d) Explain methods for deciding the number of clusters.
e) Compare hierarchical with K-means clustering.

Textbooks

Regression Modeling with Actuarial and Financial Applications, Edward W. Frees, 2010, New York: Cambridge. ISBN: 978-0521135962.

   Chapter 1 Background only
   Chapter 2 Sections 1-8
   Chapter 3 Sections 1-5
   Chapter 5 Sections 1-7
   Chapter 6 Sections 1-3
   Chapter 7 Sections 1-6
   Chapter 8 Sections 1-4
   Chapter 9 Sections 1-5
   Chapter 11 Sections 1-6
   Chapter 12 Sections 1-4
   Chapter 13 Sections 1-6

An Introduction to Statistical Learning, with Applications in R, James, Witten, Hastie, Tibshirani, 2013, New York: Springer. A PDF version of the text can be downloaded at http://www-bcf.usc.edu/~gareth/isl/.

   Chapter 2 Sections 1-3
   Chapter 3 Sections 1-6
   Chapter 5 Sections 1 and 3 (excluding 5.3.4)
   Chapter 6 Sections 1-7
   Chapter 8 Sections 1-3
   Chapter 10 Sections 1-6

While exercises are not included in the required readings, candidates are encouraged to work them as part of the learning experience.

OTHER RESOURCES:

   Tables for Exam SRM
   Sample Questions and Solutions (to come)