Student and school performance across countries: A machine learning approach

Masci, Chiara; Agasisti, Tommaso
2018-01-01

Abstract

In this paper, we develop and apply novel machine learning and statistical methods to analyse the determinants of students’ PISA 2015 test scores in nine countries: Australia, Canada, France, Germany, Italy, Japan, Spain, the UK and the USA. The aim is to find out which student characteristics are associated with test scores and which school characteristics are associated with school value-added. A specific aim of our approach is to explore non-linearities in the associations between covariates and test scores, and to model how school-level factors interact in affecting results. To address these issues, we apply a two-stage methodology based on flexible tree-based methods. In the first stage, we fit multilevel regression trees to estimate school value-added. In the second stage, we relate the estimated school value-added to school-level variables by means of regression trees and boosting. Results show that while several student-level and school-level characteristics are significantly associated with students’ achievements, there are marked differences across countries. The proposed approach allows an improved description of the structurally different educational production functions across countries.
Boosting; Education; Multilevel model; Regression trees; School value-added; Modeling and Simulation; Management Science and Operations Research; Information Systems and Management
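
The two-stage methodology summarised in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, all names (`escs`, `school_vars`, `true_va`) are hypothetical, and stage 1 is a simplified stand-in for a multilevel regression tree, namely a depth-1 tree alternated with school intercepts computed as mean residuals, with no variance components or shrinkage. Stage 2 is plain least-squares boosting with depth-1 trees.

```python
import random

def avg(values):
    values = list(values)
    return sum(values) / len(values)

def fit_stump(X, y):
    """Depth-1 regression tree: pick the split with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X})[:-1]:  # keep right side non-empty
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            ml, mr = avg(left), avg(right)
            sse = sum((v - ml) ** 2 for v in left) + sum((v - mr) ** 2 for v in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    _, j, t, ml, mr = best
    return lambda row: ml if row[j] <= t else mr

def school_value_added(X, y, school, n_iter=5):
    """Stage 1 (simplified): alternate a tree fit with school intercepts."""
    effect = {s: 0.0 for s in set(school)}
    for _ in range(n_iter):
        adjusted = [yi - effect[s] for yi, s in zip(y, school)]
        tree = fit_stump(X, adjusted)
        resid = [yi - tree(row) for row, yi in zip(X, y)]
        for s in effect:
            effect[s] = avg(r for r, g in zip(resid, school) if g == s)
    return effect

def boost(Z, target, n_rounds=50, lr=0.1):
    """Stage 2: least-squares boosting of stumps on school-level variables."""
    base = avg(target)
    pred = [base] * len(target)
    stumps = []
    for _ in range(n_rounds):
        resid = [ti - p for ti, p in zip(target, pred)]
        stump = fit_stump(Z, resid)
        stumps.append(stump)
        pred = [p + lr * stump(row) for p, row in zip(pred, Z)]
    return lambda row: base + lr * sum(s(row) for s in stumps)

# Synthetic data (assumption): 20 schools, 15 students each.
rng = random.Random(0)
n_schools, n_students = 20, 15
school_vars = [[s / (n_schools - 1), rng.random()] for s in range(n_schools)]
# Hypothetical non-linear school effect: a step in the first school variable.
true_va = [10.0 if z[0] > 0.5 else -10.0 for z in school_vars]

X, y, school = [], [], []
for s in range(n_schools):
    for _ in range(n_students):
        escs = rng.gauss(0, 1)  # hypothetical student socio-economic index
        X.append([escs])
        y.append(500 + (30 if escs > 0 else 0) + true_va[s] + rng.gauss(0, 5))
        school.append(s)

va = school_value_added(X, y, school)                             # stage 1
model = boost(school_vars, [va[s] for s in range(n_schools)])     # stage 2
print(model([0.9, 0.5]), model([0.1, 0.5]))
```

In this toy setting the boosted model recovers the step in the first school variable without the step being specified in advance, which is the kind of non-linearity the tree-based approach is meant to detect.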
Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1063208