Maximizing Eye Segmentation Accuracy with YOLO
De Vecchi, Arianna; Bartoli, Pietro; Paracchini, Marco; Villa, Federica
2025-01-01
Abstract
Semantic eye segmentation has proven useful in several fields, including biometrics, eye-tracking, and physiological signal extraction. Identifying the image regions that correspond to the three main parts of the eye surface (sclera, iris, and pupil) is therefore essential, as it allows a variety of meaningful information to be extracted. Most studies in the literature focus on developing custom Deep Learning models with an architecture tailored specifically to eye segmentation. Moreover, for tasks such as eye-tracking, the iris and the pupil are often modeled as ellipses, so the segmented area ends up including sections of the skin. In this paper, we propose a method for eye segmentation based on You Only Look Once (YOLO) models, which are generally employed for object detection, to address both of these issues. First, a large version of YOLOv11 (YOLOv11L-seg) was used to perform semi-automatic labeling of the dataset, starting from a small number of manually generated masks. The resulting dataset was then used to train two different versions of YOLO (YOLOv8 nano and YOLOv11 nano), both much more compact than the model used for semi-automatic labeling. The two models achieved a mean Intersection over Union (mIoU) over the three classes of 93% and 92%, respectively, while also showing strong generalization to data acquired with different hardware when compared with current state-of-the-art models.
| File | Description | Access | Size | Format |
|---|---|---|---|---|
| Maximizing_Eye_Segmentation_Accuracy_with_YOLO.pdf | Paper (publisher's version) | Restricted access | 2.31 MB | Adobe PDF |
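The abstract reports a mean Intersection over Union (mIoU) over the three classes (sclera, iris, pupil). As a minimal sketch of how such a metric can be computed for a single image, the snippet below compares a predicted label map with a reference one. The integer class ids, the function names `class_iou` and `mean_iou`, and the per-image averaging are assumptions made for illustration; the record does not state the label encoding or whether the authors average per image or over the whole dataset.

```python
import math
from typing import Dict, List, Sequence

# Hypothetical label encoding (not taken from the paper): 0 is background.
CLASSES: Dict[int, str] = {1: "sclera", 2: "iris", 3: "pupil"}

LabelMap = Sequence[Sequence[int]]


def class_iou(pred: LabelMap, target: LabelMap, class_id: int) -> float:
    """IoU of one class: |pred AND target| / |pred OR target|, counted in pixels."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            in_pred = p == class_id
            in_target = t == class_id
            intersection += int(in_pred and in_target)
            union += int(in_pred or in_target)
    # A class absent from both maps has no defined IoU.
    return intersection / union if union else math.nan


def mean_iou(pred: LabelMap, target: LabelMap,
             class_ids: Sequence[int] = tuple(CLASSES)) -> float:
    """Average the per-class IoU, skipping classes absent from both maps."""
    ious: List[float] = [class_iou(pred, target, c) for c in class_ids]
    valid = [v for v in ious if not math.isnan(v)]
    if not valid:
        raise ValueError("none of the requested classes appear in either map")
    return sum(valid) / len(valid)


# Toy 4x4 example: the prediction mislabels one iris pixel as pupil.
target = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 2, 3, 1],
    [0, 1, 1, 0],
]
pred = [
    [0, 1, 1, 0],
    [1, 2, 2, 1],
    [1, 3, 3, 1],
    [0, 1, 1, 0],
]

for class_id, name in CLASSES.items():
    print(f"{name}: IoU = {class_iou(pred, target, class_id):.3f}")
print(f"mIoU = {mean_iou(pred, target):.3f}")
```

In this toy case the sclera is matched exactly, while the single mislabeled pixel lowers both the iris and the pupil IoU, which shows why a small class such as the pupil is sensitive to boundary errors.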


