Weighting the domain of probability densities in functional data analysis
Alessandra Menafoglio;
2020-01-01
Abstract
In functional data analysis, some regions of the domain of the functions can be of more interest than others owing to the quality of measurement, the relative scale of the domain, or some external reason (e.g. the interest of stakeholders). Weighting the domain is of particular interest for probability density functions (PDFs) derived from distributional data, which often aggregate measurements of different quality or are affected by scale effects. A weighting scheme can be embedded into the underlying sample space of PDFs when they are regarded as continuous compositions within the theory of Bayes spaces. The origin of a Bayes space is determined by a given reference measure, which can be easily changed through the well-known chain rule. This work provides a formal framework for defining weights through a reference measure and uses it to develop a weighting scheme on the bounded domain of distributional data. The impact on statistical analysis is illustrated through an application to functional principal component analysis of income distribution data. Moreover, a novel centred log-ratio transformation is proposed to map a weighted Bayes space into an unweighted $L^2$ space, enabling the use of most tools developed in functional data analysis (e.g. clustering and regression analysis) while accounting for the weighting scheme. The potential of our proposal is shown in a real case study using Italian income data.

File | Size | Format |
---|---|---|
sta4.283.pdf (restricted access; main article, publisher's version) | 2.35 MB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.