Weighting the domain of probability densities in functional data analysis

Alessandra Menafoglio
2020-01-01

Abstract

In functional data analysis, some regions of the domain of the functions can be of more interest than others owing to the quality of measurement, the relative scale of the domain, or simply some external reason (e.g. the interest of stakeholders). Weighting the domain is of particular interest for probability density functions (PDFs) derived from distributional data, which often aggregate measurements of different quality or are affected by scale effects. A weighting scheme can be embedded into the underlying sample space of PDFs when they are treated as continuous compositions within the theory of Bayes spaces. The origin of a Bayes space is determined by a given reference measure, and this can be easily changed through the well-known chain rule. This work provides a formal framework for defining weights through a reference measure and uses it to develop a weighting scheme on the bounded domain of distributional data. The impact on statistical analysis is illustrated through an application to functional principal component analysis of income distribution data. Moreover, a novel centred log-ratio transformation is proposed to map a weighted Bayes space into an unweighted L² space, enabling the use of most tools developed in functional data analysis (e.g. clustering and regression analysis) while accounting for the weighting scheme. The potential of our proposal is shown on a real case study using Italian income data.
Keywords: Bayes spaces, centred log-ratio transformation, exponential family, functional principal component analysis, probability density functions, reference measure
Files in this record:
sta4.283.pdf (restricted access). Description: main article, publisher's version. Size: 2.35 MB. Format: Adobe PDF.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1150377