A distributed feature selection scheme with partial information sharing

Brankovic A.;Piroddi L.
2019-01-01

Abstract

This paper introduces a novel feature selection and classification method based on vertical data partitioning and a distributed search architecture. The features are divided into subsets, each of which is associated with a dedicated processor that performs a local search. When all local selection processes are completed, each processor shares the features of its locally selected model with all other processors, and the local searches are repeated until convergence. Thanks to the vertical partitioning and the distributed selection scheme, the method can address relatively large-scale problems. The procedure is efficient, since the local processors perform the selection tasks in parallel and over much smaller search spaces. Another important feature of the proposed method is its tendency to produce simple model structures, which is generally advantageous for the interpretability and robustness of the classifier. The proposed approach is evaluated on several benchmark datasets and compared to other well-known feature selection and classification approaches from the literature. The results demonstrate its effectiveness, both in terms of classification accuracy and computational time.
2019
Classification, Distributed optimization, Feature selection, Model selection, Parallel processing
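
The abstract describes the scheme only at a high level. The following Python sketch illustrates the general idea (vertical partitioning of the features, local searches on each partition, sharing of the locally selected features, repetition until convergence); the greedy forward selection, the logistic-regression scorer with 3-fold cross-validation, and the sequential execution of the local searches are assumptions made purely for illustration and are not taken from the paper.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

def score(X, y, features):
    # Cross-validated accuracy of a classifier built on the given features.
    # The logistic-regression scorer is an assumption made for this sketch.
    if not features:
        return 0.0
    clf = LogisticRegression(max_iter=1000)
    return cross_val_score(clf, X[:, list(features)], y, cv=3).mean()

def local_search(X, y, local_features, shared_features):
    # Local selection on one vertical partition: the features received from
    # the other processors are kept, and local candidates are added greedily
    # only if they improve the score (greedy forward selection is an assumed
    # stand-in for the paper's local search procedure).
    selected = list(shared_features)
    best = score(X, y, selected)
    improved = True
    while improved:
        improved = False
        for f in local_features:
            if f in selected:
                continue
            s = score(X, y, selected + [f])
            if s > best:
                best, selected = s, selected + [f]
                improved = True
    return selected

def distributed_feature_selection(X, y, n_partitions=4, max_rounds=10):
    # Vertical partitioning: disjoint feature subsets, one per processor.
    rng = np.random.default_rng(0)
    partitions = np.array_split(rng.permutation(X.shape[1]), n_partitions)
    shared = set()
    for _ in range(max_rounds):
        # In the paper the local searches run in parallel on dedicated
        # processors; here they are executed sequentially for simplicity.
        local_models = [local_search(X, y, list(p), shared) for p in partitions]
        # Partial information sharing: each processor's locally selected
        # features are made available to all the others for the next round.
        new_shared = set().union(*(set(m) for m in local_models))
        if new_shared == shared:  # convergence: shared set no longer changes
            break
        shared = new_shared
    return sorted(shared)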
Files in this product:

File: DFS_R4.pdf
Access: Open Access since 23/05/2020
Description: final post-review draft
Type: Post-Print (DRAFT or Author's Accepted Manuscript - AAM)
Size: 285.2 kB
Format: Adobe PDF

File: 2019 - ML - BrankovicPiroddi.pdf
Access: restricted
Description: main article (publisher's version)
Type: Publisher's version
Size: 647.81 kB
Format: Adobe PDF
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1118923
Citations
  • Scopus: 5
  • Web of Science (ISI): 5