Semiparametric Multinomial Mixed-Effects Models: a University Student Profiling Tool

C. Masci; F. Ieva; A. M. Paganoni
2022-01-01

Abstract

Many applied studies deal with multinomial responses and hierarchical data. In multilevel multinomial regression, clustering at the highest level of grouping is also often of interest. In this study, we analyse Politecnico di Milano data with the aim of profiling students, modelling their probabilities of belonging to different categories while accounting for their nested structure within engineering degree programmes. In particular, we are interested in clustering degree programmes based on their effects on different types of student career. To this end, we propose an EM algorithm for fitting semiparametric mixed-effects models with a multinomial response. The novel semiparametric approach assumes that the random effects follow a multivariate discrete distribution with an a priori unknown number of support points, which is allowed to differ across response categories. The advantage of this modelling choice is twofold: the discrete distribution on the random effects makes it possible, first, to express the marginal density as a weighted sum, avoiding the numerical problems in the integration step that are typical of the parametric approach, and, second, to identify a latent structure at the highest level of the hierarchy, where groups are clustered into subpopulations.
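The weighted-sum form of the marginal density mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a random-intercept multinomial logit with category 0 as reference, a single joint set of support points shared by all response categories (the paper allows the number to differ across categories), and hypothetical names (`category_probs`, `group_marginal_likelihood`, `support`, `weights`) chosen only for illustration.

```python
import numpy as np

def category_probs(X, beta, c):
    """Multinomial-logit probabilities with category 0 as reference.

    X: (n, p) covariates; beta: (K-1, p) fixed effects (assumed known here);
    c: (K-1,) random intercepts of one support point.
    Returns an (n, K) array whose rows sum to 1.
    """
    eta = X @ beta.T + c                                # (n, K-1) linear predictors
    eta = np.hstack([np.zeros((X.shape[0], 1)), eta])   # reference category: eta = 0
    eta -= eta.max(axis=1, keepdims=True)               # stabilise the softmax
    expo = np.exp(eta)
    return expo / expo.sum(axis=1, keepdims=True)

def group_marginal_likelihood(X, y, beta, support, weights):
    """Marginal likelihood of one group as a finite weighted sum.

    support: (M, K-1) support points; weights: (M,) summing to 1.
    Returns (likelihood, posterior), where posterior[m] is the conditional
    probability that the group belongs to support point m.
    """
    rows = np.arange(len(y))
    # Conditional likelihood of the group's responses at each support point.
    cond = np.array([np.prod(category_probs(X, beta, c)[rows, y]) for c in support])
    joint = weights * cond
    likelihood = joint.sum()          # weighted sum replaces the integral
    return likelihood, joint / likelihood

# Toy data: 5 students in one degree programme, K = 3 categories, M = 2 support points.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))
y = np.array([0, 1, 2, 1, 0])
beta = np.array([[0.5, -0.2], [0.1, 0.3]])
support = np.array([[-1.0, 0.0], [1.0, 0.5]])
weights = np.array([0.4, 0.6])
lik, post = group_marginal_likelihood(X, y, beta, support, weights)
```

Here `post` plays the role of the membership probabilities that an EM-type procedure would use to assign each group to a subpopulation. For large groups the product of probabilities can underflow, so a log-space version would be preferable in practice.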
Files in this record:
AOAS_2022.pdf (Publisher’s version, restricted access, Adobe PDF, 669.42 kB)

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1203116