An investigation of the relationship between online activity on and academic grades of newly arrived immigrant students



An application of educational data mining

Akash Menon
Nahida Islam

Degree project in Technology, first cycle, 15 credits
Stockholm, Sweden 2017
KTH School of Information and Communication Technology


Abstract

This study analyzes the impact of an online educational resource on the academic performance of newly arrived immigrant students in grades six to nine of the Swedish school system. The study focuses on a web-based educational resource made by Komplementskolan AB. The aim was to investigate the relationship between academic performance and use of the resource. Another purpose was to see what other factors can affect academic performance. The study used the data mining process Cross Industry Standard Process for Data Mining (CRISP-DM) to understand and prepare the data and then to create and evaluate a regression model. The regression model tries to predict the dependent variable, grade, from the independent variables activity, gender and years in Swedish schools. The data set includes the grades in mathematics, physics, chemistry, biology and religion of newly arrived students from six municipalities that have access to the resource, together with metrics of the students' activity on it.

The results show a negative correlation between grade and gender across all subjects. In this report, the negative correlation means that female students perform better than male students. Furthermore, there was a positive correlation between the number of years a student has been in the same school and their academic grade. The study could not establish a statistically significant relationship between activity on the resource and the students' academic grades. Additional explanatory independent variables are needed to build a predictive model, and regression models other than multiple linear regression should be investigated. In the sample, a majority of the students have little or no activity on the resource despite having free access to it through their municipality.

Keywords: Educational data mining (EDM), data mining (DM), Cross Industry Standard Process for Data Mining (CRISP-DM), statistical analysis, multiple linear regression, null-hypothesis, level of significance.


Summary (Sammanfattning)

This study analyzes the effect that digital learning materials have on school grades among newly arrived students in grades six to nine of the Swedish school system. The study focuses on a web-based educational resource made by Komplementskolan AB. The goal was to examine the relationship between school results and use of the resource. Another purpose was to examine which other factors can affect school results. The study uses the data mining process Cross Industry Standard Process for Data Mining (CRISP-DM) to understand, prepare and analyze the data in the form of a regression model, which is then evaluated. The data set includes, among other things, school grades in mathematics, physics, chemistry, biology and religion from six municipalities that have access to the resource. The activity on the website of students from these municipalities was also used in the study.

The results show a negative correlation between grade and gender across all subjects. In this report, the negative correlation means that girls on average receive better grades than boys in the sample of newly arrived students from the six municipalities. In addition, there was a positive correlation between the number of years a student has been in the school, or alternatively in the Swedish school system, and their grades. The study could not establish a statistically significant relationship between activity on the resource and the students' school results. More explanatory independent variables are needed to create a predictive model of school results, and regression models other than multiple linear regression should be investigated. In the study's sample of newly arrived students from the municipalities, the majority have not used the resource, or have barely used it, even though these municipalities have had access to it.

Keywords: Educational data mining, data mining, CRISP-DM, statistical analysis, multiple linear regression, null-hypothesis, level of significance.


Statement of Collaboration

This bachelor thesis project was carried out by Akash Menon and Nahida Islam, students in the Information and Communication Technology program at KTH. The report was written in close collaboration, with both authors writing multiple subsections in each chapter, with the exception of the Background, which was mostly written by Nahida. Both authors reviewed the report. Nahida did most of the preparation of the data received from the municipalities, while Akash handled most of the preparation of the data from the platform. Most of the R scripts for data preparation, processing, regression and plot generation were written by Akash and can be provided upon request. The analysis was done jointly by both authors.


Contents

1 Introduction
    About
    Research area
    Hypothesis
    Purpose
    Benefits and ethical aspects
    Delimitations
    Outline (Disposition)
2 Background
    Big Data
    Learning Analytics
    Data Mining
        CRISP-DM model
        Data mining process by Oracle
    Educational Data Mining
    R programming language
    Statistical analysis
        Multiple Linear regression
        Important terms for the regression
        Pearson correlation coefficient
    Relevance to this project
3 Methods
    Step one: Business understanding
    Step two: Data understanding
    Step three: Data preparation
    Step four and five: Modeling and evaluation
    Step six: Knowledge Deployment
4 Result
    Analysis of mathematics grades and performance
    Summary of the Results
5 Discussion and Conclusion
    Data reliability
    Discussion of Results
    Discussion of General hypothesis
    Conclusion
    Reflection
6 Future work and ideas
7 References

Appendices
    A Results on physics
    B Results on chemistry
    C Results on biology
    D Results on religion

Chapter 1

1 Introduction

The Internet is a platform that has grown tremendously in recent years. Twenty years ago it was not widely used by the general public, whereas Internet usage among individuals in Sweden has now reached over 90% according to the International Telecommunication Union (ITU) [1]. This has increased the accessibility of different ICT solutions. Among the new technologies gaining traction due to the spread of the Internet are digital platforms for education. Unlike traditional teaching, these platforms allow greater flexibility by using a mixture of audio and video that can be accessed at the pace and location that suits the individual. The platform studied in this thesis is a digital educational platform whose goal is to make education easier for students, particularly newly immigrated students in Sweden. According to Migrationsinfo, around 16% of Sweden's population in 2015 was born abroad [2]. Education of newly arrived students is very important, as they have the potential to constitute a significant part of the future development of Sweden.

Daily Internet usage produces vast quantities of data created by individual users. This data sometimes contains traces of a user, such as view history on YouTube. By analyzing such data one can infer what type of person the user is. In the same way, large amounts of user data are produced by online educational material and remain unused. This data can be analyzed to improve the education system and to predict whether such educational material has a positive or negative impact on students' school grades.

1.1 About

The platform is a digital educational material that exists in the form of a website. It is owned by Komplementskolan AB.
The company had already been producing online educational material, such as videos and exercises for pupils, when it decided in the fall of 2013 to introduce a project called The Pilot Project (Pilotprojektet). The aim of the project was to translate videos on chemistry, biology and religion into Arabic. This was appreciated by students and teachers from twelve schools [3]. Because of the positive feedback, another project called The Language Project (Språkprojektet) was introduced in 2015 with the help of Vinnova [4]. This project is managed in cooperation with researchers and municipalities across Sweden. As part of The Language Project, greater emphasis is put on multilingual aspects by developing a multilingual digital educational platform. This platform will help introduce newly immigrated pupils to the Swedish language as well as to other school subjects. The platform is associated with 29 participating municipalities. It offers animated videos in six languages: Swedish, English, Arabic, Somali, Tigrinya and Dari. The total numbers of translated films and quizzes are 512 and 504, respectively [5].

The platform is offered in trial form and with three types of subscription [6]: individual licenses, school licenses and an all-inclusive license offered to schools that take part in The Language Project, as can be seen in Table 1. The largest group of users consists of students and teachers from schools in municipalities that take part in The Language Project and have full access to all the features that the platform offers.

Table 1: Subscription types and associated supported features

| Feature | Trial | Individual | School | Språkprojektet |
|---|---|---|---|---|
| Number of movies | 10 | Unlimited | Unlimited | Unlimited |
| Number of quizzes | 10 | Unlimited | Unlimited | Unlimited |
| Teacher accounts | No | No | Unlimited | Unlimited |
| Homework functionality | No | No | Full access | Full access |
| Multilingual | All supported languages | Swedish & English | Swedish & English | All supported languages |
| Research support | No | No | No | Full support |

When signing up for the platform, users select their role as either student or teacher. They fill in their full name, a desired password and, if applicable, an associated school.

The platform contains video lectures and quizzes on different topics covering grade seven to grade nine. When the video lectures were first released, they showed a teacher standing at a whiteboard and explaining the topic. To engage users and keep their attention, the lecture videos were later animated. The videos are typically two to five minutes long, seldom longer, in order to retain the students' attention. Many videos are dubbed and subtitled in up to six languages. Users can choose the language of the audio and of the subtitles independently and on demand. Some topics have a supporting written explanation under the lecture video. These texts are in Swedish and there is no function for translating them into another language.
Every video has a quiz with questions on the topic. The quiz questions are divided into three levels. Gamification is used in the quizzes: the user has three hearts and loses one for each incorrect answer. Answering incorrectly three times ends the quiz with a message that encourages the user to watch the related video again. Each completed quiz level awards the student points, as does watching a video to completion.

The homework functionality presented in Table 1 is offered at the School and Språkprojektet subscription levels. Only a teacher can use this functionality, by distributing homework to the whole class or to a specific student. A homework assignment consists of videos and/or related quizzes.

1.2 Research area

While the educational material is supposed to be beneficial for all students, its multilingual features give it the potential to be an even more important resource for newly arrived students. Newly arrived students do not always have sufficient proficiency in Swedish to learn new concepts through it. The schools participating in The Language Project have provided grade data of newly arrived students to the platform, which is why this paper primarily focuses on newly immigrated students. According to the Swedish national school law (Skollagen) [7], a student should no longer be called newly arrived after more than four years in school in Sweden. This report contains one exception: a few of the students in the data received from the municipalities have been in Sweden for more than four years. They are also included in the study.

SIRIS, the National Agency for Education's online information system on results and quality [8], provides pass rates for different grades. Table 2 presents the national pass rates for grade 6 in mathematics, physics, chemistry, biology and religion [9].

Table 2: Pass rates of grade 6, as the percentage (%) of students with grade A-E in mathematics, physics, chemistry, biology and religion for the school years 2013/2014, 2014/2015 and 2015/2016.

Unfortunately the authors could not find any statistics on the pass rates from further back. Table 2 shows that the pass rates in 2013/2014 were better than those in 2015/2016.

Table 3 presents the national pass rates for grade 9 in mathematics, physics, chemistry, biology and religion [10].

Table 3: Pass rates of grade 9, as the percentage (%) of students with grade A-E in mathematics, physics, chemistry, biology and religion for the school years 2012/2013, 2013/2014, 2014/2015 and 2015/2016.

Table 3 shows that the pass rates in 2012/2013 were better than those in 2015/2016. This pattern indicates a decrease in pass rates across subjects among students in Sweden, a group that includes newly arrived immigrant students. The pattern might be reversed if external help, such as online educational material, were introduced in the school system. That is why it is important to find out how effective online educational materials are for students. A first step toward answering this question is to analyze the correlation between a student's activity on online educational material and their achieved academic grade.

1.3 Hypothesis

In an experiment it is hard to claim that a prediction is always true. That is why it is essential to state one or several hypotheses. The authors set some general and specific hypotheses before performing the study. The first general hypothesis is that the longer an immigrant student has been in a school, the better their results are compared to their first year. The longer one has been in a country, the more one generally learns about the country and its language. This means there could be a correlation between the time a student has been in Sweden and their academic grade.

The second general hypothesis is that using online educational material can help a student achieve higher academic grades at school. Online educational material is meant to be a quick help for students.
For instance, a student doing mathematics homework at home who needs help with fractions can easily search for online material instead of waiting until the next day to ask the teacher about the problem. Because it is a quicker way to get help, one might believe that online educational material makes learning easier for students, which in turn helps them achieve higher academic grades.

Studies show that female students perform better and receive higher academic grades at school than male students [11]. The third general hypothesis is therefore that female students on average have better school results than their male counterparts.

The authors adapted these general hypotheses into five specific hypotheses concerning the online educational material and the students' academic grades received from six municipalities. In statistics, the terms null-hypothesis and alternative hypothesis are often used. Null-hypothesis is short for nullifiable hypothesis [12]. It is usually the opposite of what one wants to prove, because it is the statement one wants to nullify or reject. The alternative hypothesis is the statement one wants to accept in place of the rejected null-hypothesis [13]. Failure to reject the null-hypothesis means that the alternative hypothesis cannot be accepted. The five null-hypotheses (H0) for this report are as follows:

H0a: There is no relationship between the number of videos watched on the platform and the academic grade received from the school.
H0b: There is no relationship between the number of quizzes completed on the platform and the academic grade received from the school.
H0c: There is no relationship between the gender of a student and the academic grade received from the school.
H0d: There is no relationship between the number of years in school and the academic grade received from the school.
H0e: There is no relationship between the set of predictors (Videos, Quizzes, Gender and Years in School) and the academic grade received from the school.

In mathematical terms: $H_0: \beta_{1,2,\ldots} = 0$, which means there is no relationship between the variables.
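To make the coefficients in these hypotheses concrete, the following is a sketch of the regression model they refer to, assuming the multiple linear regression named in the abstract with the four predictors listed in H0e. The numbering of the coefficients and the error term $\varepsilon$ are notation introduced here for illustration and are not taken from the thesis.

```latex
\begin{align}
\text{Grade} &= \beta_0 + \beta_1\,\text{Videos} + \beta_2\,\text{Quizzes}
              + \beta_3\,\text{Gender} + \beta_4\,\text{Years in school} + \varepsilon \\
H_{0a} &: \beta_1 = 0, \qquad H_{0b} : \beta_2 = 0, \qquad
H_{0c} : \beta_3 = 0, \qquad H_{0d} : \beta_4 = 0 \\
H_{0e} &: \beta_1 = \beta_2 = \beta_3 = \beta_4 = 0
\end{align}
```

Under this reading, H0a to H0d each concern a single coefficient, while H0e concerns all four predictors jointly.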
If the p-value, the probability of observing a result at least as extreme as the one obtained when H0 is true, is less than the level of statistical significance α, H0 can be rejected. α is set to 0.05 for this study. In short, if p < α, H0 can be rejected. The alternative hypotheses are as follows:

H1a: There is a relationship between the number of videos watched on the platform and the academic grade received from the school.
H1b: There is a relationship between the number of quizzes completed on the platform and the academic grade received from the school.
H1c: There is a relationship between the gender of a student and the academic grade received from the school.

H1d: There is a relationship between the number of years in school and the academic grade received from the school.
H1e: There is a relationship between the set of predictors (Videos, Quizzes, Gender and Years in School) and the academic grade received from the school.

In mathematical terms: $H_1: \beta_{1,2,\ldots} \neq 0$, which means there is a relationship between the variables. If p < α, H1 can be accepted.

Table 4 describes the expected relationship between each independent variable and the dependent variable according to the theory.

Table 4: Summary of the independent variables' expected relationships

| Independent variable | Expected relationship | Explanation |
|---|---|---|
| Videos | Positive | More watched videos on the platform are associated with a higher academic grade. |
| Quizzes | Positive | More completed quizzes on the platform are associated with a higher academic grade. |
| Gender (male = 1, female = 0) | Negative | Female students have better grades than male students. |
| Years in school | Positive | The longer a student has been in a school, the higher their grade. |

1.4 Purpose

The idea behind the project is to measure the effectiveness of the online educational material by testing the null-hypotheses discussed in section 1.3. This approach was chosen in order to make use of educational data mining. The goal is to present the correlations with the help of regression models.

1.5 Benefits and ethical aspects

The main beneficiaries of this project would be the schools in the municipalities with access to the platform, and the platform itself, which would get quantifiable results showing the effects of using the platform. When signing up on the website, users need to specify their address, name and school name. The platform does not store national identification numbers or addresses of its users, unlike some school IT systems or typical learning management systems.

There are integrity risks in the project when processing personally identifiable information, such as names and schools. In any published results, personally identifiable information is redacted or replaced. If, for instance, data on a group of fewer than ten individuals from a certain geographic location is presented, it is either redacted or aggregated into a larger group so that the individuals cannot be identified.

1.6 Delimitations

To execute the thesis project successfully some delimitations were set. This section describes them.

Online educational material: The project mainly focuses on a single platform, the one described in section 1.1.
Student grades: The student grade data comes from only six of the municipalities taking part in The Language Project and only includes individual grade data on newly arrived students.
Platform data: The data that was used ranges from August 2012 until February.
Grades: Grades are not a fully objective indicator of academic performance. They are set by the subject teachers according to guidelines in the Swedish curricula, and they sometimes vary from teacher to teacher, for instance depending on how each teacher interprets the curricula. National exam results would be a more objective indicator, but very few schools had reported them, so only grades could be used as an indicator of academic performance.

1.7 Outline (Disposition)

The first part of the report is introductory. It consists of a short description of the company, its website and the problem. The second part (section 2) describes theoretical background relevant to the project. The third part (section 3) is about the methods used throughout the project.
The fourth part (section 4) presents the results. The fifth part (section 5) discusses the results and draws conclusions. Lastly, the sixth part (section 6) describes ideas for future work on similar issues.


Chapter 2

2 Background

This section presents background and theory related to the thesis. It ends with a description of which parts are relevant for this study.

2.1 Big Data

Every day, using the Internet leaves digital traces in the form of large volumes of data on various websites. These data can be both structured and unstructured and are called Big Data [14]. Activities in digital education likewise create vast amounts of data (often terabytes or petabytes). Because of the size of Big Data, one might need a cluster of computers to process it [15]. The purpose of using Big Data in education is to improve student results and learning experience and to reduce the number of dropouts [16].

The platform also generates a large amount of data from its website. The volume is not on the scale of terabytes, but it is still large to work with. The generated data consist of the activities that every user has performed from August 2012 to February. There are three types of activity a user can perform on the platform: watching videos, doing quizzes or doing homework given by teachers. Every time a user clicks on a video or quiz, an entry is created in the database. Homework entries are created when teachers assign homework to students. These entries can be counted with a suitable programming language to obtain frequencies.

2.2 Learning Analytics

Learning Analytics (LA) is an area concerned with collecting and analyzing data about, and produced by, students in different educational contexts [15]. The stakeholders in most LA settings are students, teachers, institutions and researchers [17]. A policy brief from the United Nations Educational, Scientific and Cultural Organization (UNESCO) [18] describes the most common methods and concepts used in LA.
These are learning management system (LMS) Analytics Dashboards, Predictive Analytics, Adaptive LA, Social Network Analytics and Discourse Analytics. LMS Analytics Dashboards concern the visualization aspect of LA and aim to provide stakeholders with graphs, tables or other visualizations of students' learning data. Predictive Analytics involves analysis of usage patterns and data to identify students who are doing well or are at risk of failing certain learning outcomes. Adaptive LA focuses on creating a dynamic model of a learner's understanding of a topic, with the aim of providing educational material to the learner based on this model. Social Network Analytics looks at the structure of interactions and relationships between people and groups using a certain LMS in order to identify behavioral patterns that can be beneficial for learning. Discourse Analytics is a form of LA in which written text and conversation are evaluated. It looks at how language can shape learning.

Typically, an LMS offers a platform for communication and educational material. The platform studied here is not a full-fledged LMS, as it only provides videos and quizzes as educational content. Teachers do not have any authorship over the educational environment, and it offers limited communication between teachers and students. It can instead be viewed as educational material rather than a platform.

2.3 Data Mining

"Data mining is the practice of automatically searching large stores of data to discover patterns and trends that go beyond simple analysis. Data mining uses sophisticated mathematical algorithms to segment the data and evaluate the probability of future events", according to Oracle [19]. Data mining makes it easier to address questions that are hard to answer with simple queries because of the large size of the generated data. It is widely used to discover patterns automatically, predict likely outcomes, create actionable information and focus on large databases. Its applications include counter-terrorism, business trend analysis, health care and insurance, and the web and semantic web [20]. Data mining goals can be categorized [21] as:

Prediction: Uses some variables in the data set to predict future values of other variables of interest. This type of data mining produces a model of the system described by the given data.
Description: Focuses on finding human-readable patterns describing the data. This type of data mining produces new information based on the input data.

Kantardzic, the author of Data Mining: Concepts, Models, Methods, and Algorithms [21], states that the predictive and descriptive spectrum gives rise to different types of data mining methods, such as:

Classification: This technique classifies data into several predefined classes. This can be helpful in predictive data mining.
Regression: A method rooted in statistics. To achieve a predictive result, the data is mapped to a real-valued prediction variable.
Clustering: Used to identify a finite set of clusters that describe the data. This method belongs to descriptive data mining.
Summarization: A descriptive data mining technique that involves methods for finding a description of a data set.
Dependency Modeling: Describes the significant dependencies between different variables of the data set.

Change and Deviation Detection: Describes the significant changes in the data set.

The authors came across two data mining process models during the pilot study. One is the Cross Industry Standard Process for Data Mining (CRISP-DM) and the other is suggested by Oracle's documentation on data mining concepts. Both are generalized processes that are easy to use as the basis of a data mining project.

2.3.1 CRISP-DM model

The Cross Industry Standard Process for Data Mining is a data mining process model that is widely used within the data mining field [22]. The aim of the CRISP-DM model is to make data mining projects easier, more reliable, less costly, more manageable and faster. Figure 1 shows the six phases of the CRISP-DM model.

Figure 1: CRISP-DM process model overview (phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation and Deployment, arranged around the data)

Below, the steps of the CRISP-DM process are described in detail [22].

Step one: Business understanding
The initial step of a project is to understand the goal of the business and what needs to be accomplished. It is important to uncover the objectives and requirements of the project. The requirements include a summary of which data are needed, which attributes from the database are of interest, which variables are important for the project and other relevant factors. The risks and contingencies that may arise during the project should be discussed. Lastly, this knowledge needs to be translated into a data mining perspective: a goal should be set that the data mining process is meant to achieve.

Step two: Data understanding
The second phase summarizes the understanding of the data that were received. First one needs to collect the data that are of interest for the project. These data can be collected from a database or handed over in another format. Exploring the data and finding subsets and patterns give good insight into the data one will work with. After the data are understood, the data preparation phase begins.

Step three: Data preparation
Data gathering, cleaning and preparation is usually the biggest part of such a project. The data often need to be processed and formatted into a uniform template. Data generated from databases also need to be cleaned and formatted according to their relevance to the project. One might need to extract, add or delete some data as the research requires. The preparation phase can be repeated several times to achieve the desired data set. This step deserves full attention because the rest of the project depends on how well the data have been gathered. There are different tools on the market to ease the process. The most common tool is MS Excel, which can be problematic when it comes to Big Data. Other widespread tools for data processing and manipulation are IBM's SPSS and the open source programming language R.

Step four: Modeling
This phase consists of building a new model or selecting an existing one. Some study of different types of data mining models and algorithms is necessary.
After selecting a model or algorithm, the data might need to be transformed to fit it. In the first stage of model building one might want to work with a small set of data. It often turns out that the chosen model requires more data cleaning and preparation, which makes this an iterative process that requires returning to the previous step.

Step five: Evaluation
The constructed or chosen model is evaluated. The evaluation should answer how well the model satisfies the requirements described in step one. Depending on the answer, one either accepts the model or changes it to achieve the desired result.

Step six: Deployment
The last phase is to deploy the knowledge that was gained in a presentable way so that it is understandable for the customer. How this is done depends on what the customer requires: it can be done by generating a report or even by implementing the model in the customer's system.

2.3.2 Data mining process by Oracle

The process recommended by Oracle [19] consists of four stages: problem definition, data gathering and preparation, model building and evaluation, and knowledge deployment. Figure 2 shows the four stages of the data mining process.

Figure 2: Data mining process suggested by Oracle (stages: Problem Definition, Data Gathering and Preparation, Model Building and Evaluation, Knowledge Deployment)

The stages of the process suggested by Oracle follow the same general pattern as the CRISP-DM model. CRISP-DM splits the Problem Definition step into Business Understanding and Data Understanding, and splits Model Building and Evaluation into two separate steps, as can be seen by comparing Figure 1 with Figure 2. In general, the two processes for data mining are roughly similar.
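The data preparation step, together with the entry counting mentioned in section 2.1, can be sketched in a few lines. The following is a minimal illustration in Python rather than the R scripts the authors used. The log layout, the column names `user_id` and `activity`, and the sample rows are all assumptions made up for this example.

```python
import csv
import io
from collections import Counter

# Hypothetical activity log: one row per database entry (a click on a
# video or quiz, or an assigned homework). Column names are assumptions.
RAW_LOG = """user_id,activity
u1,video
u1,quiz
u2,video
u1,video
,quiz
u3,homework
"""


def count_activities(csv_text):
    """Return a Counter keyed by (user_id, activity) with entry frequencies."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = (row["user_id"] or "").strip()
        activity = (row["activity"] or "").strip().lower()
        # Data cleaning: skip entries that cannot be tied to a user.
        if not user or not activity:
            continue
        counts[(user, activity)] += 1
    return counts


counts = count_activities(RAW_LOG)
for (user, activity), n in sorted(counts.items()):
    print(user, activity, n)
```

The resulting frequencies per user are the kind of activity metric that could then be joined with grade data before modeling. Dropping rows without a user is only one possible cleaning rule and is chosen here for brevity.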

Educational Data Mining

Educational data mining is an emerging field within data mining. The field has gained popularity in recent years because of the growth of educational software and of internet usage for educational purposes. This produces large amounts of data, also known as Big Data. These data can be used to reflect students' learning [23].

Educational Data Mining (EDM) is a field that analyzes educational data in order to research educational issues. To resolve such issues, the field takes advantage of statistical, machine-learning, artificial intelligence, information technology, database management and data mining algorithms [24]. EDM inherits its methods and techniques from data mining, so these can be used in this field as well. The objectives of EDM can be divided into two groups, as mentioned by Jindal and Borah [25]:

1. Academic objectives: these can be oriented towards different fields, for instance person oriented, department/institution oriented or domain oriented.
2. Administrative objectives: these can be administrator oriented.

2.5 R programming language

According to a poll conducted in 2016 on the data science blog KDnuggets [26], the R programming language is among the most common tools for data preparation and processing in data science. R is an open source programming language with extensive libraries for statistical tools and graphical techniques. The R features relevant to this project are the regression models, the graphical tools for plot generation and the tools for data preparation. Data can be loaded into R from a variety of sources and manipulated there. The two most relevant sources in this project are comma-separated value (CSV) files and MS Excel files.

2.6 Statistical analysis

Statistics is a way of analyzing collected data to support statements about a population. A statistical analysis is often based on probability [27].
Statistics has been used widely in science for various types of analytic research. A statistical survey often consists of four parts: planning, data gathering, processing and presentation [28]. A common statistical method for modelling is regression analysis. Regression analysis attempts to explain possible relationships between variables through the use of constructed mathematical models [29].

Multiple Linear regression

The most common and widely used regression is linear regression, which models the assumed linear relationship between a dependent variable $y$ and one or more explanatory variables $X_i$, $i = 1, 2, 3$. A simple linear regression model equation would be

$$y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \varepsilon \tag{1}$$

where $\beta_0$ is called the intercept term, $\beta_i$, $i = 0, 1, 2, 3$ are unknown parameters and $\varepsilon$ is the error term that represents the difference between the observed value and the value predicted by the regression equation [30].

Multiple linear regression is a type of linear regression with one dependent variable $y$ and two or more explanatory variables $X$. The equation for multiple linear regression would be

$$y = \beta_0 + \sum_{k=1}^{n} \beta_k X_k + \varepsilon \tag{2}$$

A linear regression has five assumptions that must be fulfilled before a regression can be run on the data set. The assumptions [31] are listed below:

1. Linear relationship: The independent and dependent variables must have a linear relationship. This can be tested with scatter plots.
2. Multivariate normality: All variables must be multivariate normal. If the data are not normally distributed, a non-linear transformation can be useful to fix the problem.
3. No or little multicollinearity: The independent variables must not be strongly correlated with each other.
4. No auto-correlation: The residuals must be independent of each other.
5. Homoscedasticity: The variance of the error terms is equal along the regression line.

Important terms for the regression

There are some terms that one should be familiar with when running a regression in the R environment. Below is a list of frequently used terms with definitions:

Residuals: give the unexplained variation in the regression model. A residual is calculated as the observed y value minus the y value calculated from the regression model [32].
Coefficients: the estimates of the model's coefficients, reported together with their standard errors, t-values and the p-values for the null hypothesis [32].

Multiple R-squared: describes the proportion of the variation in the dependent variable that is explained by the regression model. This is also called the coefficient of determination [32].

Adjusted R-squared: adjusts the R-squared value for the number of terms in a model [32].

F-statistic: yields the F value and the p-value of the F test, which states whether the regression is statistically significant (p < 0.05, corresponding to a 95% confidence level).

p-value: the probability of obtaining a result at least as extreme as the observed one if the null hypothesis is true.

T-test: yields a t-value and a p-value that check the statistical significance of each beta coefficient in the regression equation.

Residuals vs Fitted: a scatter plot with residuals on the y axis and fitted values on the x axis. It shows whether the residuals have non-linear patterns. There are no non-linear relationships between the variables if the residuals are equally spread around a horizontal line and show no distinct pattern [33]. Figure 3 is generated from the example data set in R called women. It shows a parabola pattern, and the residuals are not equally spread around the horizontal line.

Figure 3: Residuals vs Fitted (residuals against fitted values for lm(weight ~ height))

Normal Q-Q: a plot that shows whether the residuals are normally distributed. If they are, the residuals follow the straight dashed line [33]. Figure 4 is generated from the same example data set. This figure shows that the residuals do not follow the straight dashed line very well.

Figure 4: Normal Q-Q plot (standardized residuals against theoretical quantiles for lm(weight ~ height))

Scale-location: also known as the Spread-Location plot; it uses the square root of the standardized residuals. This plot shows whether the residuals are spread equally along the range of the predictors. It can be used to check the assumption of homoscedasticity. A horizontal line with randomly spread points is what to expect: it is good to have a plot where the points are equally spread around the horizontal line [33]. Figure 5 is generated from the same example data set. The Scale-location plot shows that the points are randomly spread.

Figure 5: Scale-location (standardized residuals against fitted values for lm(weight ~ height))

Residuals vs Leverage: Leverage shows how much each data point impacts the regression. Points far from the centroid have greater leverage, and the fewer points there are nearby, the greater the leverage. Leverage thus reflects both the distance from the centroid and the isolation of a point. Cook's distance can also be seen in the plot. It describes how much the regression would change if a point were deleted. The expected result is that the smooth line (in red) lies close to the horizontal dashed line (in gray) and that no point has a Cook's distance larger than 0.5 [34]. Figure 6 is generated from the same example data set. In this figure the smooth line (in red) is not close to the dashed line (in gray). The Cook's distance is 1.
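The leverage and Cook's distance described above can be made concrete with a small sketch. It is written in plain Python purely for illustration (the plots in this report come from R), fits weight ~ height by ordinary least squares and computes each point's leverage and Cook's distance with the standard textbook formulas for simple linear regression. The data values are assumed to match R's built-in women data set, and the function name `cooks_distances` is hypothetical.

```python
# Illustrative sketch, not the authors' code: leverage and Cook's distance
# for the simple linear regression weight ~ height.
# Assumption: these values match R's built-in `women` data set.
heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]


def cooks_distances(x, y):
    """Return (residuals, leverages, Cook's distances) for y = b0 + b1*x."""
    n = len(x)
    p = 2  # number of estimated coefficients: intercept and slope
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x

    # Residual = observed value minus fitted value.
    residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    # Residual variance estimate with n - p degrees of freedom.
    s2 = sum(e ** 2 for e in residuals) / (n - p)
    # Leverage grows with the distance of x_i from the centroid mean_x.
    leverages = [1 / n + (xi - mean_x) ** 2 / sxx for xi in x]
    # Cook's distance combines residual size and leverage.
    cooks = [e ** 2 / (p * s2) * h / (1 - h) ** 2
             for e, h in zip(residuals, leverages)]
    return residuals, leverages, cooks


residuals, leverages, cooks = cooks_distances(heights, weights)
worst = max(range(len(cooks)), key=cooks.__getitem__)
print(f"observation {worst + 1}: leverage {leverages[worst]:.3f}, "
      f"Cook's distance {cooks[worst]:.3f}")
```

The observation reported by the last line is the one with the largest influence on the fit, which is the kind of point a Residuals vs Leverage plot singles out.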

Figure 6: Residuals vs Leverage (standardized residuals against leverage, with Cook's distance, for lm(weight ~ height))

Pearson correlation coefficient

The Pearson correlation coefficient, also known as the product moment correlation coefficient, measures the degree of linear association between two variables. The coefficient is denoted r. It ranges from -1 through 0 to 1 and has no units. A positive coefficient indicates a positive correlation, a coefficient of 0 means there is no correlation, and a negative coefficient indicates a negative correlation. The coefficient r can be calculated [35] as follows:

$$r_{xy} = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{\sqrt{\left[n \sum x_i^2 - \left(\sum x_i\right)^2\right]\left[n \sum y_i^2 - \left(\sum y_i\right)^2\right]}} \tag{3}$$

2.7 Relevance to this project

Everything discussed in the Background section (section 2) was indirectly relevant to the project. The most relevant aspects are Big Data, Data Mining and the Data Mining processes, Educational Data Mining and Statistical Analysis. The core of the data analysis depends on statistical tools and theory.

The CRISP-DM process model was adopted for this study, and most data preparation was done in the R programming language and MS Excel. Most of the data analysis and graphs were also produced in R, as it has extensive built-in libraries suited to data and statistical analysis.
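As an illustration of the statistical tools from section 2.6, the sketch below computes the Pearson correlation coefficient with the sums of equation (3), fits a least-squares line and derives the coefficient of determination. It uses plain Python rather than the R environment used in this study, the data values are assumed to match R's built-in women data set, and the function names `pearson_r`, `fit_line` and `r_squared` are hypothetical.

```python
import math

# Illustrative sketch, not the authors' code.
# Assumption: these values match R's built-in `women` data set.
heights = list(range(58, 73))
weights = [115, 117, 120, 123, 126, 129, 132, 135,
           139, 142, 146, 150, 154, 159, 164]


def pearson_r(x, y):
    """Pearson correlation coefficient using the sums of equation (3)."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2)
                            * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


def fit_line(x, y):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    b1 = sxy / sxx
    return mean_y - b1 * mean_x, b1


def r_squared(x, y, b0, b1):
    """Share of the variation in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (b0 + b1 * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


b0, b1 = fit_line(heights, weights)
r = pearson_r(heights, weights)
r2 = r_squared(heights, weights, b0, b1)
print(f"intercept {b0:.4f}, slope {b1:.4f}, r {r:.4f}, R-squared {r2:.4f}")
```

For a regression with a single explanatory variable, the squared Pearson coefficient equals the multiple R-squared, which is a convenient consistency check between the two measures.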

3 Methods

To execute this project the authors followed the Cross Industry Standard Process for Data Mining (CRISP-DM) [22], which was presented in the previous section (section 2). The CRISP-DM model was chosen because it is one of the most commonly used models in the data mining field [36]. Below is a description of how the authors used the CRISP-DM model to achieve the results.

3.1 Step one: Business understanding

At the beginning of the project a discussion about the problem took place. The formulation given to the authors by the supervisor from the company was as follows:

"Investigate to which extent we can use computer traces left by a student in a digital teaching component with preserved personal integrity. We particularly want to study to what extent these traces can be helpful to understand student's learning and understanding."

The next step was to break this formulation into its components, to better understand what the company wants, and to define each part. The study then began by defining the words of the formulation. After a discussion with the supervisor, the authors could establish the meaning of the two most important terms in the problem formulation.

Computer traces: The authors needed to know what computer traces are, which types of traces exist and what to do with them. Computer traces are event logs caused by a user's activity on any online material. These activities can be how many times a user clicked on a button, how long they watched a video or how many times they logged in to the website.

Preserved personal integrity: Why is it important to preserve personal integrity in this case? The resource does not require a national identity number when signing up on the website. One of the requirements stated by the client was that no school name or student name be mentioned in the report.
The stakeholders, the company and the municipalities, wanted to evaluate whether their learning resource had a meaningful positive impact on academic performance. After understanding the problem, the authors discussed it with the supervisor and a new problem formulation was suggested. The problem, formulated in a quantifiable and measurable form, is thus as follows:

"Investigate to which extent we can use computer traces to predict a correlation between high activity of a student on an online educational material and their academic grade."

The supervisor suggested that the result be presented in the form of a regression, with a mathematical equation and statistical measures to illustrate and denote the relationship.

During this period the authors discussed the different requirements, variables and risks that could be encountered, as this is part of the CRISP-DM model. A list of requirements and variables was created from the problem formulation. To go ahead with the project, the event logs of user activity (computer traces), academic grades (received from six municipalities) and knowledge of regression were required.

The authors decided to estimate the level of user activity with two variables: how many videos a student has watched and how many quizzes a student has completed. Another variable, the number of homework assignments done, could have been interesting in this study. A homework assignment consists of a video and/or quizzes, so it would be a subset of the videos a student has watched and the quizzes a student has done. Besides that, very few schools used the homework functionality of the resource. That is why the number of homework assignments done was not used as a variable in this study. Another way to estimate the level of user activity could be the number of times a student has logged in to the resource. This was considered ineffective, as a student can log in and do nothing. The two main learning activities on the site are watching videos and doing quizzes related to the contents of these videos; thus these are the two most relevant variables for measuring user activity.

The relevant variables accessible at this stage were different metrics of activity, grade in subjects and years in school. The variables chosen for the project were:

1. $y$ - Grade in subject of student
2. $X_1$ - Number of videos watched by student
3. $X_2$ - Number of quizzes completed by student
4. $X_3$ - Gender of the student
5. $X_4$ - Number of years the student has been in the school

Academic grades and the number of years in school were available in the data sent from the municipalities, and the number of videos watched and quizzes completed by the students were available from the resource. Even though the problem formulation focused on activity on the resource, the authors decided to add two extra explanatory variables: gender (estimated) and number of years in school. The reason is to be able to build a better regression model. Gender, for instance, should matter, as girls tend to perform better in school than boys [11]. This variable should display a similar pattern in the regression. The authors also wanted to investigate whether being in Sweden for a longer period affects a student's grade, which the number of years in the school gives an indication of.
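With the four explanatory variables listed above, the multiple linear regression of equation (2) takes the following form for n = 4. This is only a sketch of the model structure implied by the variable list, not a fitted result; how gender is coded numerically is not stated in this section and is left open here.

```latex
y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_4 + \varepsilon
```

Here $\beta_1$ and $\beta_2$ would capture the association between grade and activity (videos watched and quizzes completed), $\beta_3$ the association with gender and $\beta_4$ the association with the number of years in the school, each with the other variables held fixed.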

Several risks could be encountered during this project. The authors might not receive all the data in time, which would force them to work with a small data set. The data received might not lead to any significant results because of their quality. This step gave the authors a list of the variables to be involved in further study and an insight into the risks that could be encountered.

3.2 Step two: Data understanding

While step one was ongoing, the authors received a number of files containing user activities in the resource. Some examples of activities are watching videos, doing quizzes and doing homework. Table 5 describes the file names and their contents in detail.

Table 5: Different data files
Data file name; Size (MB); Number of Rows; Description
Student Activity: This file contains entry ID, lesson ID, subject name, entry date and time, school type and municipality name.
Quizzes: This file contains entry ID, user ID, name of the user, role, school, school ID, municipality name, entry date and time, subject name and difficulty level.
Videos 51, This file contains entry ID, user ID, name of the user, role, school, school ID, municipality name, entry date and time and subject name.

With the help of the IT department of the company, the authors figured out what every column in the raw activity data means. The columns in the files are defined below.

1. Entry ID: every entry in the log has a unique ID.
2. Lesson ID: every lesson in the resource has an ID that differs from other lesson IDs.
3. User ID: every user of the website has a unique ID which identifies that specific user.
4. Role: two types of roles can be found on the website, teacher and student. Both teachers and students can open an account in


More information

The Implementation of Interactive Multimedia Learning Materials in Teaching Listening Skills

The Implementation of Interactive Multimedia Learning Materials in Teaching Listening Skills English Language Teaching; Vol. 8, No. 12; 2015 ISSN 1916-4742 E-ISSN 1916-4750 Published by Canadian Center of Science and Education The Implementation of Interactive Multimedia Learning Materials in

More information

Quantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur)

Quantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur) Quantitative analysis with statistics (and ponies) (Some slides, pony-based examples from Blase Ur) 1 Interviews, diary studies Start stats Thursday: Ethics/IRB Tuesday: More stats New homework is available

More information

Math-U-See Correlation with the Common Core State Standards for Mathematical Content for Third Grade

Math-U-See Correlation with the Common Core State Standards for Mathematical Content for Third Grade Math-U-See Correlation with the Common Core State Standards for Mathematical Content for Third Grade The third grade standards primarily address multiplication and division, which are covered in Math-U-See

More information

Industrial Excellence

Industrial Excellence Industrial Excellence KPP319 Product and Process Development Mats Jackson, Magnus Wiktorsson, Course objectives The aim of the course is to give the student

More information
