RE.PUBLIC@POLIMI pubblicazioni di ricerca del Politecnico di Milano

Smith-Waterman is a dynamic programming algorithm that plays a key role in the modern genomics pipeline as it is guaranteed to find the optimal local alignment between two strings of data. The state of the art presents many hardware acceleration solutions that have been implemented in order to exploit the high degree of parallelism available in this algorithm. The majority of these implementations use heuristics to increase the performance of the system at the expense of the accuracy of the result. In this work, we present an implementation of the pure version of the algorithm. We include the key architectural optimizations to achieve highest possible performance for a given platform and leverage the Berkeley roofline model to track the performance and guide the optimizations. To achieve scalability, our custom design comprises of systolic arrays, data compression features and shift registers, while a custom port mapping strategy aims to maximize performance. Our designs are built leveraging an OpenCL-based design entry, namely Xilinx SDAccel, in conjunction with a Xilinx Virtex 7 and Kintex Ultrascale platform. Our final design achieves a performance of 42.47 GCUPS (giga cell updates per second) with an energy efficiency of 1.6988 GCUPS/W. This represents an improvement of 1.72x in performance and energy efficiency over previously published FPGA implementations and 8.49x better in energy efficiency over comparable GPU implementations.

Architectural optimizations for high performance and energy efficient Smith-Waterman implementation on FPGAs using OpenCL

DI TUCCI, LORENZO;O'Brien, Kenneth;Blott, Michaela;SANTAMBROGIO, MARCO DOMENICO

2017-01-01

Abstract

Smith-Waterman is a dynamic programming algorithm that plays a key role in the modern genomics pipeline as it is guaranteed to find the optimal local alignment between two strings of data. The state of the art presents many hardware acceleration solutions that have been implemented in order to exploit the high degree of parallelism available in this algorithm. The majority of these implementations use heuristics to increase the performance of the system at the expense of the accuracy of the result. In this work, we present an implementation of the pure version of the algorithm. We include the key architectural optimizations to achieve highest possible performance for a given platform and leverage the Berkeley roofline model to track the performance and guide the optimizations. To achieve scalability, our custom design comprises of systolic arrays, data compression features and shift registers, while a custom port mapping strategy aims to maximize performance. Our designs are built leveraging an OpenCL-based design entry, namely Xilinx SDAccel, in conjunction with a Xilinx Virtex 7 and Kintex Ultrascale platform. Our final design achieves a performance of 42.47 GCUPS (giga cell updates per second) with an energy efficiency of 1.6988 GCUPS/W. This represents an improvement of 1.72x in performance and energy efficiency over previously published FPGA implementations and 8.49x better in energy efficiency over comparable GPU implementations.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2017
			
	Titolo del libro
	
				Proceedings of the 2017 Design, Automation and Test in Europe, DATE 2017
			
	ISBN (International Standard Book Number)
	
				9783981537093
			
	Parole chiave
	
				Computer Networks and Communications; Hardware and Architecture; Safety, Risk, Reliability and Quality
			
	Appare nelle tipologie:
	
				04.1 Contributo in Atti di convegno

File in questo prodotto:

File	Dimensione	Formato
smithwaterman.pdf Accesso riservato Dimensione 246.44 kB Formato Adobe PDF Visualizza/Apri	246.44 kB	Adobe PDF	Visualizza/Apri

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11311/1032248

Citazioni

ND

43

38

social impact