Framework for scheduling and resource management in time-constrained HPC application
MASSARI, GIUSEPPE; FORNACIARI, WILLIAM
2015-01-01
Abstract
Silicon technology continues to scale down in accordance with Moore's law. Device variability increases as a result of reduced controllability during chip fabrication. Current methodologies based on error detection and thread re-execution (rollback) may no longer suffice once the number of errors grows beyond a threshold. This dynamic scenario can be highly detrimental when running programs on HPC systems, where a correct, accurate, and time-constrained solution is expected. The objective of this paper is to show preliminary results of the Barbeque OpenSource Project (BOSP) and its potential use in HPC systems.