Critical values improvement for the standard normal homogeneity test by combining Monte Carlo and regression approaches

IEVA, FRANCESCA;
2017-01-01

Abstract

The distribution of the test statistic of a homogeneity test is often unknown, so the critical values must be estimated through Monte Carlo (MC) simulations. Computing critical values at low α, especially when the distribution of the statistic changes with the series length (sample cardinality), requires a considerable number of simulations to reach reasonable precision (i.e. 10^6 simulations or more for each series length). If, in addition, the test is computationally demanding, estimating the critical values may require unacceptably long run times. To overcome this problem, the paper proposes a regression-based refinement of an initial MC estimate of the critical values, which also allows the achieved improvement to be approximated. The paper then applies the method to two tests: SNHT (standard normal homogeneity test, widely used in climatology) and SNH2T (a version of SNHT with quadratic computational complexity). For both, the paper reports critical values for α ranging from 0.1 to 0.0001 (useful for p-value estimation) and for series lengths ranging from 10 (a size widely adopted in the climatological change-point detection literature) to 70,000 elements (nearly the length of a 200-year daily time series), estimated with coefficients of variation within 0.22%. For SNHT, the results are also compared with approximate, theoretically derived critical values; we suggest adopting those theoretical values for series exceeding 70,000 elements.
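
The two-stage idea described in the abstract (a raw MC estimate followed by a regression across series lengths) can be illustrated with a minimal sketch. It assumes the classical single-shift SNHT statistic, T0 = max over k of k·z̄1² + (n−k)·z̄2², computed on a series standardized with its own mean and sample standard deviation. The regression model used here, a low-degree polynomial in log(n), is an assumption chosen only for illustration; the abstract does not specify the paper's actual model. The function names (`snht_statistic`, `mc_critical_value`, `refine_by_regression`) are hypothetical, and the number of simulations is kept far below the 10^6 mentioned above so that the example runs quickly.

```python
import numpy as np

def snht_statistic(x):
    """SNHT statistic T0 = max_k [k*mean(z[:k])**2 + (n-k)*mean(z[k:])**2]."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Standardize with the sample mean and sample standard deviation (assumed ddof=1).
    z = (x - x.mean()) / x.std(ddof=1)
    csum = np.cumsum(z)[:-1]              # partial sums z_1..z_k for k = 1..n-1
    k = np.arange(1, n)
    mean_before = csum / k
    mean_after = (z.sum() - csum) / (n - k)
    return np.max(k * mean_before**2 + (n - k) * mean_after**2)

def mc_critical_value(n, alpha, n_sim, rng):
    """Raw Monte Carlo estimate of the (1 - alpha) quantile of T0 under H0."""
    stats = np.array([snht_statistic(rng.standard_normal(n))
                      for _ in range(n_sim)])
    return np.quantile(stats, 1.0 - alpha)

def refine_by_regression(lengths, raw_values, degree=2):
    """Smooth raw critical values with a polynomial in log(n).

    Illustrative assumption only: not the regression model of the paper.
    """
    log_n = np.log(lengths)
    coeffs = np.polyfit(log_n, raw_values, degree)
    return np.polyval(coeffs, log_n)

rng = np.random.default_rng(0)
lengths = np.array([10, 20, 40, 80, 160])
# Small n_sim for demonstration; precise estimates need far more simulations.
raw = np.array([mc_critical_value(n, 0.05, 2000, rng) for n in lengths])
refined = refine_by_regression(lengths, raw)

for n, r, s in zip(lengths, raw, refined):
    print(f"n={n:4d}  raw={r:.3f}  refined={s:.3f}")
```

Fitting across series lengths lets each refined value borrow information from the neighbouring lengths, which is why a regression step can reduce the MC noise without increasing the number of simulations per length.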

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1010772