On the Portability of Trained Machine Learning Classifiers for Early Application Identification

VERTICALE, GIACOMO
2008-01-01

Abstract

The early identification of applications through the observation and fast analysis of their packet flows is a critical building block of intrusion detection and policy enforcement systems. The simple techniques currently used in practice, such as inspecting transport port numbers or the application payload, are increasingly ineffective for new applications that use random port numbers, encryption, or both. There is therefore growing interest in machine learning techniques that identify applications by examining features of the associated traffic process, such as packet lengths and interarrival times. However, these techniques require that the classification algorithm be trained with examples of the traffic generated by the applications to be identified, possibly on the link where the classifier will operate. This is an important issue, as a pre-trained portable classifier would greatly facilitate the deployment and management of the classification infrastructure. The contribution of this paper is a comparison of different sets of per-flow attributes that can be used for flow classification, and an indication of which ones are more effective when the trained classifier is operated on a different link.
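
As an illustration of the setting the abstract describes, the following minimal sketch builds a per-flow feature vector from the first few packet lengths and interarrival times, trains a classifier on flows from one link, and measures accuracy on flows from another. It is not the method evaluated in the paper: the nearest-centroid classifier, the function names, the choice of three packets per flow, and the toy traces are all assumptions made here for illustration.

```python
from statistics import mean


def flow_features(packets, n=3):
    """Build a fixed-size feature vector from the first n packets of a flow.

    packets: list of (timestamp_seconds, length_bytes) in arrival order.
    Returns n packet lengths followed by n-1 interarrival times.
    """
    first = packets[:n]
    lengths = [float(size) for _, size in first]
    gaps = [b[0] - a[0] for a, b in zip(first, first[1:])]
    # Pad short flows with zeros so every vector has the same length.
    lengths += [0.0] * (n - len(lengths))
    gaps += [0.0] * (n - 1 - len(gaps))
    return lengths + gaps


def train_centroids(flows, labels, n=3):
    """Average the feature vectors of each application class."""
    grouped = {}
    for packets, label in zip(flows, labels):
        grouped.setdefault(label, []).append(flow_features(packets, n))
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in grouped.items()}


def classify(packets, centroids, n=3):
    """Return the label of the closest centroid (squared Euclidean distance).

    Features are left unscaled here, so packet lengths dominate the distance.
    """
    x = flow_features(packets, n)

    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))

    return min(centroids, key=lambda label: dist2(centroids[label]))


def accuracy(flows, labels, centroids, n=3):
    hits = sum(classify(p, centroids, n) == y for p, y in zip(flows, labels))
    return hits / len(labels)


# Synthetic toy traces (invented for this sketch, not measured data).
link_a_flows = [
    [(0.00, 100), (0.05, 1400), (0.10, 1400)],
    [(0.00, 120), (0.04, 1380), (0.09, 1400)],
    [(0.00, 60), (0.50, 80), (1.00, 70)],
    [(0.00, 64), (0.45, 90), (0.95, 60)],
]
link_a_labels = ["bulk", "bulk", "interactive", "interactive"]

# A second link with different delays: same applications, shifted timing.
link_b_flows = [
    [(0.00, 110), (0.20, 1350), (0.40, 1400)],
    [(0.00, 70), (0.80, 85), (1.50, 66)],
]
link_b_labels = ["bulk", "interactive"]

centroids = train_centroids(link_a_flows, link_a_labels)
print("accuracy on link B:", accuracy(link_b_flows, link_b_labels, centroids))
```

In this toy case the interarrival times differ between the two links while the packet lengths do not, which hints at why some attribute sets may transfer across links better than others. Which attributes actually do so is the question the paper addresses.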
2008
Emerging Security Information, Systems and Technologies, 2008. SECURWARE '08. Second International Conference on
9780769533292
Internet; learning (artificial intelligence); pattern classification; telecommunication traffic

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/538717