8 channels, 21 ps precision, 10 µs range
Time-to-Digital Converter Module

Davide Tamborini, Davide Portaluppi, Federica Villa, Member, IEEE, and Franco Zappa, Senior Member, IEEE

Abstract— We present an eight-channel Time-to-Digital Converter (TDC) instrument able to measure up to eight independent time intervals, with 21 ps rms precision and less than 1.5 % LSB rms differential non-linearity, over a 10 µs full scale range and up to 5.5 Mconv/s per channel. The module can also operate the eight channels in average- or interleaved-mode, i.e. as a single TDC converter, in order to improve timing performance or conversion rate, respectively. The on-board real-time data processing allows fast data transfer to a remote computer, through a USB 2.0 interface, and a dedicated software interface handles measurements and plots data. Thanks to excellent timing performance and just 6 W power consumption, the 8-channel TDC module is suitable for advanced cost-effective multi-channel time measurements at the ps level.

Index Terms — Time-Correlated Single Photon Counting (TCSPC), time interval meter, Time-to-Digital Converter (TDC).

I. INTRODUCTION

APPLICATIONS like Diffuse Optical Tomography (DOT) [1], fluorescence lifetime imaging (FLIM) [2], and multichannel time-resolved spectroscopy [3] make use of the Time-Correlated Single-Photon Counting (TCSPC) technique [4] to reconstruct fast, low-intensity, repetitive optical waveforms with picosecond resolution without needing GHz-bandwidth electronics. The technique is based on detecting the single photons that compose the optical signal and measuring their arrival times, synchronized to the excitation light source. A single-photon detector and a time-measurement device are therefore the core of any TCSPC setup, and both must provide high resolution (tens of ps or better) with low Differential Non-Linearity (DNL) in order to faithfully reconstruct the optical waveform.

Single-photon detectors suitable for TCSPC measurements are Superconducting Single Photon Detectors (SSPDs), Photomultiplier Tubes (PMTs), Micro-Channel Plates (MCPs), and Single-Photon Avalanche Diodes (SPADs). SSPDs [5] reach high performance in terms of efficiency and noise, but they come mostly as single- or few-pixel detectors and they work at a few kelvin, thus requiring bulky and heavy cooling systems. PMTs and MCPs perform well as single-photon detectors and come in different multi-pixel solutions; however, they are bulky, usually not very robust, and require high voltages. SPADs [6] are high-performance, solid-state detectors and they are the most practical solution for TCSPC measurements. State-of-the-art SPADs are developed in custom processes [7], but recently CMOS SPADs have also reached excellent performance [8].

Time measurement devices required by TCSPC techniques must provide very low DNL, on the order of a few percent of an LSB. State-of-the-art TCSPC boards provide picosecond resolution with less than 1% LSB rms DNL [9], [10]. Time measurement devices can be either digital (Time-to-Digital Converters, TDCs) [11], [12], which directly convert the time interval into the corresponding digital code, or analog (Time-to-Amplitude Converters, TACs) [13], which require an intermediate conversion into an analog voltage before the final Analog-to-Digital Converter (ADC). Generally speaking, analog methods reach higher resolution, and ADC linearity can be improved through dithering [14], reaching better than 1% LSB rms DNL. The disadvantages of TACs are a limited measurement range and high sensitivity to temperature variations and external disturbances. TDCs, on the other hand, are easier to implement as integrated circuits and less sensitive to external conditions. For these reasons, TDCs have received increasing interest and some reported devices reach top performance, especially by means of the sliding scale technique [14], which further improves TDC linearity.

Multichannel TCSPC instrumentation consists either of detector arrays coupled to multichannel time-measurement modules, or of systems based on microelectronic chips that integrate detector arrays with on-chip timing circuitry. Some works presented SPAD arrays [15], [16] able to reach high detection performance and to discriminate photon arrival times with better than 100 ps FWHM precision, to be coupled to multichannel timing boards [9], [10], which offer state-of-the-art performance but with high power consumption. FPGA-based time measurement systems [17], [18] are an interesting alternative, reaching good resolution (tens of ps) with up to 128 independent channels and limited power consumption, but their linearity limits their use in TCSPC applications.

There is increasing interest in developing SPAD arrays with integrated TDCs to reach thousands of channels. However, the best chips reported so far show either good timing performance but poor detection efficiency, small active area (< 20 µm), and high noise [19], [20], or excellent detection performance but weaker timing performance [21], [22].

In this paper we present an eight-channel time measurement module suitable for TCSPC applications. The module can perform 8 independent time measurements with 10 ps resolution, less than 1.5 % LSB rms DNL, and up to 5.5 Mconv/s per channel, or it can group these channels together to obtain an equivalent channel with higher conversion rate (up to 32 Mconv/s) or with better precision (down to 9 ps rms). The module performs on-board real-time data processing, hence minimizing the computation required of the host computer, and it consumes less than 6 W (about 750 mW per channel) when all 8 channels perform 5.5 Mconv/s.

The paper is organized as follows: the architecture of the developed eight-channel TDC module is described in Section II, its experimental characterization is presented in Section III, and Section IV summarizes the work.

II. MODULE DESCRIPTION

The module was conceived to measure up to eight time intervals by means of input signals provided by nine SMA connector couples, i.e. eight independent channels plus one auxiliary channel. In fact, the module can measure up to eight independent time intervals defined by the corresponding 8 channels, or it can make use of the auxiliary channel to provide a common signal to multiple channels.

For each channel we developed a compact TDC-card [23], based on the application-specific TDC chip we designed in a robust, cost-effective 0.35 µm CMOS technology, as already described in Ref. [24]. Here we provide an in-depth description of the final multichannel module, of the solutions we implemented to improve system performance (e.g. linearity, mismatch compensation, resource reuse), and of the different operating modes, such as averaging (to improve resolution down to 1.25 ps) and interleaving (for faster conversion rates).

Fig. 1: Photograph (top) and simplified block diagram (bottom) of the module: the motherboard hosts eight independent TDC-cards, an auxiliary input channel with proper signal conditioning electronics, an FPGA for data processing and a USB 2.0 controller.

Fig. 2: TDC chip operation principle: a counter accumulates the reference clock rising edges within the time interval, while two identical two-stage interpolators resolve the counter residual errors for both START and STOP signals, thus reaching 10 ps resolution.

Fig. 1 shows the module architecture. A motherboard hosts the eight TDC-cards, an auxiliary channel for implementing advanced operating modes, an FPGA for parameter settings, data read-out, and processing, a USB 2.0 interface for upload to a remote PC, and the power supply.

A. Time-to-Digital Converter Chip

The TDC chip [24] is based on a counter and two interpolators, one for START and one for STOP. Thanks to an uncorrelated reference clock, this topology inherently implements the sliding-scale technique [12], thus improving measurement linearity. The TDC operation principle is illustrated in Fig. 2. The 4-bit counter accumulates the number of rising edges of the 100 MHz reference clock occurring between START and STOP, hence providing a maximum measurable time interval (Full Scale Range, FSR) of 160 ns, i.e. 16 clock periods of 10 ns. The interpolators are identical and resolve the START and STOP occurrences within the reference clock period, reaching 10 ps resolution by means of two cascaded interpolation stages: a multiphase-clock interpolator and a single-stage Vernier delay-loop interpolator. This is a good trade-off between low power consumption, high resolution, good DNL and fast conversion time. The effective interpolator resolution is affected by process mismatches, therefore calibration coefficients are needed to properly compute measurements from raw data.

The counter output is proportional to the time $T_{\text{counter}}$, given by the reference clock period multiplied by the number $N_{\text{counter}}$ of clock rising edges that occur between the START and STOP pulses. The first interpolator stage is a multiphase-clock interpolator based on a DLL, which converts the time elapsed between the START (or STOP) pulse and the next rising edge of the reference clock with 625 ps resolution. The second interpolator stage then measures the time interval between the START (or STOP) pulse and the first following clock phase with 10 ps resolution; it consists of a Vernier delay line, folded into two delay loops.
The overall time interval $T$ is given by:

$$T = T_{\text{counter}} + T_{\text{START}} - T_{\text{STOP}}.$$ (1)

The TDC chip provides the counter data ($N_{\text{counter}}$) and the interpolator data ($N_{\text{START}}$ and $N_{\text{STOP}}$) separately. Proper logic is needed to convert these raw data into $T_{\text{START}}$ and $T_{\text{STOP}}$, taking into account the effective resolution of the interpolators, and to compute the time interval $T$ according to Eq. (1). When a time interval greater than 160 ns is measured, the TDC “coarse” counter folds back, but the interpolators remain enabled: the resulting $N_{\text{START}}$ and $N_{\text{STOP}}$ values are valid measurements, while the 4 bit counter gives a wrong (refolded) number. This behavior is exploited in Section II.B to extend the full-scale range.
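As an illustration of Eq. (1), the conversion from raw codes to a time interval might be sketched as follows. This is a minimal sketch and not the logic implemented on the card: the function name, the assumption that each interpolator delivers a single code with one calibrated effective LSB, and the example values are all hypothetical.

```python
CLOCK_PERIOD_PS = 10_000.0  # period of the 100 MHz reference clock, in ps


def interval_ps(n_counter, n_start, n_stop, lsb_start_ps=10.0, lsb_stop_ps=10.0):
    """Apply Eq. (1): T = T_counter + T_START - T_STOP, all in picoseconds.

    lsb_start_ps and lsb_stop_ps stand in for the interpolator calibration
    coefficients (assumption: one effective LSB per interpolator).
    """
    t_counter = n_counter * CLOCK_PERIOD_PS
    t_start = n_start * lsb_start_ps
    t_stop = n_stop * lsb_stop_ps
    return t_counter + t_start - t_stop


# Example: 3 counted clock edges, START code 120, STOP code 45, nominal LSBs.
print(interval_ps(3, 120, 45))  # 30750.0
```

With mismatched interpolators, the two LSB arguments would simply differ, which is why per-chip calibration is needed before Eq. (1) can be applied.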

B. Time-to-Digital Converter Card

The TDC-card [23] measures the time interval defined by the START and the STOP input signals, which can be provided through two SMA connectors or through two dedicated edge-card connector pins. The edge-card connector also provides access to the data result bus, the module configuration signals, and the power supply.

As shown in Fig. 3, the core of the TDC-card is the TDC chip, but the card also hosts a low-jitter, extremely stable 100 MHz reference clock, input signal front-end circuitry, and pre-processing electronics. The signal conditioning front-end is needed to provide a valid START-STOP couple of 3.3 V CMOS signals. To this end, two fast CMOS comparators convert the input signals provided through the SMA connectors into standard CMOS signals, with START and STOP thresholds independently settable within a $\pm 2.5$ V range. START and STOP signals can be sourced from either the input comparators or the edge-card connector by means of a small CPLD (only 40 logic elements, LE), which can also select the synchronization edge of each signal. The CPLD then outputs only valid START-STOP couples, avoiding the propagation of a STOP pulse before a START is received.

A second CPLD (with 570 LEs) performs data pre-processing and measures the average rate of the most important signals (START, STOP and valid conversions). The pre-processing logic handles the TDC data readout and computes the time interval $T$ from raw data, by applying the necessary interpolator calibration coefficients and Eq. (1). A further digital algorithm extends the conversion FSR beyond the intrinsic 160 ns limit, by exploiting the fact that the TDC chip does not stop an ongoing conversion when its range is exceeded. In this way, the on-chip TDC counter is augmented with a longer off-chip one, resulting in a maximum measurable time interval of 10 µs while preserving the intrinsic 10 ps time resolution. The TDC chip conversion time is about 150 ns, plus 10 ns of signal conditioning propagation delay and 20 ns needed by the CPLD to detect the end of conversion (EOC) and to enable the TDC; the resulting total of about 180 ns gives a maximum conversion rate of about 5.5 Mconv/s, with a power consumption of less than 0.4 W at that rate. The 20 bit time measurement results are sent out through the edge-card connector over a 100 MHz serial link.
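The range extension and the timing budget above can be illustrated with a short sketch. The text does not state how the wrapped on-chip count and the off-chip count are reconciled, so `unwrap_count` below is only one plausible approach and rests on an explicit assumption: the off-chip counter may disagree with the true count by fewer than 8 clock periods. All function names are hypothetical.

```python
CHIP_MODULUS = 16   # the 4 bit on-chip counter wraps every 16 clock periods
LSB_PS = 10         # intrinsic resolution, in ps


def unwrap_count(n_chip, n_ext):
    """Recover the full clock count from the wrapped on-chip count (n_chip)
    and a coarse off-chip count (n_ext).

    Assumption: |n_ext - true count| < 8, so the number of wraps can be
    found by rounding to the nearest multiple of 16.
    """
    wraps = (n_ext - n_chip + CHIP_MODULUS // 2) // CHIP_MODULUS
    return max(wraps, 0) * CHIP_MODULUS + n_chip


def max_rate_mconv_per_s(conversion_ns=150, conditioning_ns=10, cpld_ns=20):
    """Maximum conversion rate implied by the per-conversion time budget."""
    return 1e3 / (conversion_ns + conditioning_ns + cpld_ns)


def word_range_us(bits=20):
    """Time span covered by a result word of the given width at 10 ps LSB."""
    return (1 << bits) * LSB_PS / 1e6


true_count = 437                          # clock periods between START and STOP
n_chip = true_count % CHIP_MODULUS        # what the 4 bit counter reports
print(unwrap_count(n_chip, 438))          # off-chip counter one period ahead
print(round(max_rate_mconv_per_s(), 2))   # about 5.56 Mconv/s
print(word_range_us())                    # a 20 bit word covers about 10.49 us
```

The last function shows why a 20 bit result word is enough for the 10 µs range at 10 ps resolution.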

C. Auxiliary channel

The auxiliary channel implemented on the module motherboard (see Fig. 4) consists of front-end circuitry made of two fast CMOS comparators, two independent small CPLDs, and a DAC whose channels independently set the two input signal thresholds (again within a $\pm 2.5$ V range, with 12 bit resolution). Each comparator drives a 40-logic-element CPLD, which provides the auxiliary START (or STOP) to the TDC-cards, according to the selected operating mode.

This auxiliary channel provides three additional operating modes apart from the standard one, where all 8 channels work independently from each other (i.e. with 16 different and individual START and STOP signals). In a second operating mode, the module operates as a typical TCSPC board, where several detectors are stimulated by a single laser source: one auxiliary signal, e.g. the STOP signal, is distributed to all channels, while the individual START inputs are fed by the detectors’ outputs. In a third mode, all TDC-cards are driven by the START and STOP provided through the auxiliary inputs and all 8 results are averaged in order to obtain an equivalent single-channel time measurement with a resolution improved by a factor of 8, i.e. an eight times finer LSB (1.25 ps), and with a theoretical improvement in time measurement precision of $\sqrt{8} \approx 2.83$. A fourth mode again drives the TDC-cards through the auxiliary channel, but now routes the first START pulse to card 0, the second pulse to card 1, and so on, until the ninth pulse is returned to card 0, etc. In this way, the 8 cards are time-interleaved in order to implement an equivalent single-channel time measurement, with the same measurement performance as a single card but at an 8-fold increased conversion rate, thus reaching, in principle, up to 44 Mconv/s (8 × 5.5 Mconv/s).
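The ideal figures quoted for the averaging and interleaving modes follow from simple scaling rules, which can be sketched as below. The function names are hypothetical, and the $\sqrt{N}$ precision gain assumes that the jitter of the individual cards is uncorrelated, which a real module only approximates.

```python
import math

INTRINSIC_LSB_PS = 10.0        # single-card resolution
SINGLE_CARD_RATE_MCONV = 5.5   # single-card maximum conversion rate


def averaged_lsb_ps(n_cards):
    """LSB of the equivalent channel when the sum of n_cards results is kept."""
    return INTRINSIC_LSB_PS / n_cards


def ideal_precision_gain(n_cards):
    """Theoretical precision improvement, assuming uncorrelated card jitter."""
    return math.sqrt(n_cards)


def ideal_interleaved_rate(n_cards):
    """Upper bound on the interleaved conversion rate, ignoring routing delays."""
    return n_cards * SINGLE_CARD_RATE_MCONV


for n in (2, 4, 8):
    print(n, averaged_lsb_ps(n), round(ideal_precision_gain(n), 2),
          ideal_interleaved_rate(n))
```

These are upper bounds: any extra delay or correlated noise in the shared auxiliary path reduces the gains actually achieved.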

The CPLDs are programmed to distribute the auxiliary signals to the TDC-cards according to the four operating modes, as shown in Fig. 4. Modes two and three are easily implemented by means of an array of eight two-input AND gates: one input is the auxiliary signal, while the second one is the enable signal provided by the data-processing block. Mode four is implemented in the START auxiliary channel CPLD: the START triggers a shift register, which in turn sequentially enables one TDC-card after the other. An 8 bit bus multiplexer then selects the appropriate enable signals for the auxiliary outputs, according to the selected operating mode. A programmable frequency divider within the STOP CPLD can reduce the laser trigger frequency if needed.
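A behavioral model of this distribution logic can help in reasoning about the modes. The sketch below is a software analogy only, not the CPLD design: the class and function names are hypothetical, and the shift register is assumed to be a one-hot ring that starts at card 0.

```python
class RoundRobinEnable:
    """One-hot ring shift register: each START pulse advances the enabled card."""

    def __init__(self, n_cards=8):
        self.n_cards = n_cards
        self.state = 1  # one-hot word, bit 0 set: card 0 is enabled first

    def on_start(self):
        """Return the index of the card receiving this START, then rotate."""
        card = self.state.bit_length() - 1
        mask = (1 << self.n_cards) - 1
        # Rotate left by one position within n_cards bits.
        self.state = ((self.state << 1) | (self.state >> (self.n_cards - 1))) & mask
        return card


def gate(aux_signal, enable_mask, n_cards=8):
    """Modes two and three: per-card AND of the auxiliary signal and enable bit."""
    return [bool(aux_signal) and bool((enable_mask >> i) & 1) for i in range(n_cards)]


ring = RoundRobinEnable()
print([ring.on_start() for _ in range(10)])  # [0, 1, 2, 3, 4, 5, 6, 7, 0, 1]
print(gate(1, 0b00000101))                   # only cards 0 and 2 see the pulse
```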

D. Data processing

The motherboard hosts an ‘Artix 7’ FPGA by Xilinx, which processes and handles the TDC-card data and manages the auxiliary channel controls. As shown in Fig. 5, data processing consists of eight timing blocks (one for each card), an averaging block, an interleaving block, a control block and a host interface block, all described in the following.

The timing blocks are the core of data processing and are in charge of reading data from TDC-cards and building the histograms. Card readout is performed by de-serializing the 20 bit measurement data (see Section II.B) and resynchronizing it with the internal FPGA clock.

The histogram generator is based on the block RAM available inside the FPGA, with a 14 bit address bus and an 18 bit data bus. Therefore only a 14 bit sub-range of the 20 bit TDC-card data can be used to build the histogram. To select this 14 bit sub-range, we implemented an adder and a shifter. First, just after the deserializer, a time offset is subtracted from the received data to select the time origin of the histogram sub-range. The same adder is also used to compensate for any mismatch among the 8 TDC-cards and for the deterministic time delays from the auxiliary channel to each TDC-card (due to both the CPLDs’ internal routing and the board layout). Such offset compensation is mandatory in interleaved mode: since a bin-by-bin sum of the histograms of different channels must be performed, the different time delays need to be compensated before the sum, otherwise the resulting histogram would be grossly distorted.

The adder output then passes through a bit shifter, acting as a digital divider, which selects the bin size of the generated histogram. With no bit shift the time resolution is the intrinsic 10 ps, whereas with a 1 bit, 2 bit, or 3 bit shift the resolution is set to 20, 40, or 80 ps, respectively. The shift also sets the FSR of the histogram, from the intrinsic 160 ns (i.e. $2^{14}$ times 10 ps) to 320 ns, 640 ns, or 1.28 µs, respectively. Finally, the histogram generator uses the 14 bit output of the bit shifter to address the RAM cell and to increment its content by one.
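The offset-and-shift address computation described above can be sketched in a few lines. This is a behavioral sketch under stated assumptions: the function names are hypothetical, and codes that fall outside the selected 14 bit window are assumed to be discarded, which the text does not specify.

```python
ADDRESS_BITS = 14
N_BINS = 1 << ADDRESS_BITS     # 16384 RAM cells per timing block
INTRINSIC_LSB_PS = 10


def histogram_bin(raw_code, offset, shift):
    """Map a 20 bit TDC code to a RAM address, or None if outside the window.

    offset selects the time origin (and absorbs channel delays);
    shift selects the bin size as a power-of-two multiple of 10 ps.
    """
    address = (raw_code - offset) >> shift
    if 0 <= address < N_BINS:
        return address
    return None  # assumption: out-of-window events are dropped


def bin_width_ps(shift):
    return INTRINSIC_LSB_PS << shift


def histogram_span_ns(shift):
    return N_BINS * bin_width_ps(shift) / 1000


histogram = [0] * N_BINS
for code in (1000, 1001, 1001, 250_000):
    address = histogram_bin(code, offset=200, shift=0)
    if address is not None:
        histogram[address] += 1   # increment the addressed RAM cell by one

for shift in range(4):
    print(shift, bin_width_ps(shift), histogram_span_ns(shift))
```

The exact spans are 163.84, 327.68, 655.36, and 1310.72 ns, which the text rounds to 160 ns, 320 ns, 640 ns, and 1.28 µs.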

The averaging block averages the time results of the selected TDC-cards, in order to improve the final time resolution. Since this operation must be performed on measured data before histogram generation, the block manages the channels’ data paths to retrieve information from the bit shifters of the selected timing channels and finally to feed the histogram generators. To this end, we reuse the RAM memories and the histogram generators available inside the selected timing blocks. The averaging block adds together the 14 bit time measurements of the selected cards and provides the result as 15, 16, or 17 bit data when averaging 2, 4, or 8 cards, respectively. Averaging would then require dividing this result by 2, 4, or 8 (e.g. through a 1, 2, or 3 bit shift), respectively, to maintain the same 14 bit data width and time resolution. However, since the extra bits provide higher resolution, we decided not to discard them and to keep the sum as is, with 15, 16, or 17 bits. The resulting resolution (LSB) is the intrinsic 10 ps divided by the number of cards, i.e. 5 ps, 2.5 ps, or 1.25 ps, respectively (see Section II.C), while the data remain integer numbers. Finally, the averaging block routes the overall data into the 2, 4, or 8 histogram generators of the timing blocks: the extra (1, 2, or 3) most significant bits select the timing block to feed, while the least significant 14 bits are the standard address of the RAM cell to increment by one.
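The routing of the undivided sum into the reused histogram memories can be sketched as follows. The function names are hypothetical; the bit widths and the split into block-select and address bits follow the description above.

```python
ADDRESS_BITS = 14
ADDRESS_MASK = (1 << ADDRESS_BITS) - 1
INTRINSIC_LSB_PS = 10.0


def route_sum(codes):
    """Sum the 14 bit codes of 2, 4, or 8 cards and split the result into
    (timing block index, RAM address) without discarding the extra bits."""
    n_cards = len(codes)
    if n_cards not in (2, 4, 8):
        raise ValueError("averaging is defined for 2, 4, or 8 cards")
    if any(not 0 <= c <= ADDRESS_MASK for c in codes):
        raise ValueError("each code must fit in 14 bits")
    total = sum(codes)                # 15, 16, or 17 bit integer
    block = total >> ADDRESS_BITS     # extra MSBs select the timing block
    address = total & ADDRESS_MASK    # low 14 bits address the RAM cell
    return block, address


def sum_to_time_ps(total, n_cards):
    """Time represented by the undivided sum: its LSB is 10 ps / n_cards."""
    return total * INTRINSIC_LSB_PS / n_cards


print(route_sum([100, 101]))          # (0, 201)
print(sum_to_time_ps(201, 2))         # 1005.0 ps, a half-LSB step is resolved
print(route_sum([16383] * 8))         # (7, 16376): the largest possible sum
```

Because the largest sum of N 14 bit codes always stays below N × 2^14, the block index never exceeds N − 1, so exactly N histogram memories are needed.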

In interleaved mode, the interleaving block performs a simple bin-by-bin sum of the selected channel histograms (each one stored in the corresponding timing block’s RAM) before uploading data to the PC, since all cards contribute to the same histogram and any time-delay mismatch among channels has already been compensated by the offset adder of each timing block, as discussed above. Moreover, this block handles the transfer of histograms from all timing blocks to the host interface block, for all four modes.
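The bin-by-bin sum is straightforward; a minimal sketch (with a hypothetical function name and toy histograms) is given below, mainly to show that it only makes sense once all histograms share the same time origin.

```python
def sum_histograms(histograms):
    """Bin-by-bin sum of equally sized, already offset-aligned histograms."""
    length = len(histograms[0])
    if any(len(h) != length for h in histograms):
        raise ValueError("all histograms must have the same number of bins")
    return [sum(bins) for bins in zip(*histograms)]


aligned = [[0, 5, 0, 0], [0, 4, 0, 0]]      # offsets compensated beforehand
misaligned = [[0, 5, 0, 0], [0, 0, 4, 0]]   # one-bin residual offset
print(sum_histograms(aligned))     # [0, 9, 0, 0]: a single sharp peak
print(sum_histograms(misaligned))  # [0, 5, 4, 0]: the peak is smeared
```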

The host interface block is in charge of communication with the PC: it transmits histogram data and receives the system configuration settings. The control block uses the latter information to properly configure each channel’s front-end electronics and the internal FPGA data paths. The control block also manages the configuration of auxiliary channel’s CPLDs and acquires the main conversion rates (START, STOP, Valid START and Valid Conversion) from all channels.

E. Power supply

The module makes use of an AC-DC converter to provide ±5 V from the mains. The motherboard then hosts all power electronics needed to supply the eight TDC-cards, the auxiliary channel front-end circuitry, and the FPGA. All power converters have been optimized to maximize conversion efficiency thus minimizing power consumption. The power supply also provides stable and separate supply rails to the signal conditioning circuitry, in order to minimize propagation time-jitter and crosstalk between channels. To this aim, the FPGA is powered by a DC-DC converter, while TDC-cards and the auxiliary channel are supplied by low-dropout linear regulators after DC-DC conversion.

The motherboard also hosts a microcontroller to monitor all supply voltages and to apply the proper power-up and power-down sequence to the overall instrument. Finally, TDC-cards that are not in use are automatically turned off to save power.

III. MODULE CHARACTERIZATION

The main figures of merit for a time measurement instrument are timing precision, timing accuracy, DNL, and conversion rate. When the instrument performs time interval measurements through the eight independent channel inputs, the achieved performance exactly matches that of the stand-alone TDC-card [23]: better than 21 ps rms timing precision, better than 100 ppm accuracy, less than 1.25 % LSB rms DNL with 5 ps rms integral nonlinearity (INL), and about 5.5 Mconv/s maximum conversion rate. In the following we report the full characterization of the instrument in all operating modes. As will be shown, timing accuracy is independent of input source and/or operating mode, and no crosstalk is measured.

A. Auxiliary channel performance

In typical TCSPC setups, where one laser source excites a specimen and several detectors are employed, the auxiliary channel allows the same laser trigger to be fed to all cards. For this reason, it is important to characterize timing precision and DNL when START and STOP pulses are fed through the auxiliary channel.

The timing precision, also called single-shot precision in TCSPC measurements, is assessed by repeatedly measuring a constant START-STOP time delay and then computing the standard deviation of the distribution of conversion results. We repeat this process by sweeping the START-STOP time delay from 0 to the 10 µs FSR, in 1 ns steps. As shown in Fig. 6, the attained average precision is about 21 ps rms (i.e., about 49 ps FWHM, Full-Width at Half Maximum). Minimum and maximum precisions are 17 and 25 ps rms, respectively, i.e. a ± 4 ps rms variation. Since these values are comparable to those of the stand-alone TDC-card, this test proves that the auxiliary channel adds negligible timing jitter.
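The precision estimate and the rms-to-FWHM conversion can be sketched as follows. The function names and the sample codes are hypothetical, and the FWHM conversion assumes a Gaussian distribution of conversion results.

```python
import math
import statistics

# For a Gaussian distribution, FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.355).
FWHM_PER_SIGMA = 2.0 * math.sqrt(2.0 * math.log(2.0))


def precision_rms_ps(codes, lsb_ps=10.0):
    """Single-shot precision: standard deviation of repeated conversions
    of one constant delay, expressed in picoseconds."""
    return statistics.pstdev(codes) * lsb_ps


def rms_to_fwhm(sigma):
    """Convert an rms value to FWHM under the Gaussian assumption."""
    return sigma * FWHM_PER_SIGMA


codes = [100, 102, 98, 100]                # toy data, not measured values
print(round(precision_rms_ps(codes), 2))   # 14.14
print(round(rms_to_fwhm(21.0), 1))         # 49.5, matching the quoted ~49 ps
```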

DNL is assessed through a code density test, in which a periodic STOP signal and a uniformly distributed random START signal define random time delays. By accumulating many conversion results, the histogram reveals both the non-uniformities of the START signal distribution and the nonlinearities of the time conversion [25]; it is therefore necessary to record a sufficiently high number of events per bin in order to have an approximately uniform input distribution (at least 10 kconv per bin to get better than 1% non-uniformity). Under this condition, the acquired histogram is composed of an average number of samples per bin, which represents the ideal uniform distribution, plus the non-uniformities of the converter. Therefore, as shown in Fig. 7, the DNL is obtained by subtracting the average value from the acquired histogram and then normalizing the result to that average value. This measurement shows the same trend as the stand-alone TDC-card [23], with a worst DNL value (from 6 to 8 % LSB) at the beginning of the range and an average DNL of about 1.3 % LSB rms (i.e. 130 fs rms), slightly worse than that of the TDC-card (due to minor additional noise affecting the auxiliary channel front-end), while the INL still remains about 5 ps rms.

Fig. 6: Timing precision of the auxiliary channels across the 10 μs full-scale range with 1 ns steps: (left) trend of one channel vs. time interval; (right) minimum, average, and maximum (about 17, 21 and 25 ps rms, respectively) timing precision for all channels over the 10 μs range.

Fig. 7: DNL of the auxiliary channels, measured through a code density test employing uncorrelated START and STOP pulses, resulting in uniformly distributed random time intervals over the 10 μs full-scale range: (left) DNL trend of one channel vs. time interval; (right) average and maximum (about 1.3 % LSB rms and from 6 to 8 % LSB, respectively) DNL for all channels.
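The code density analysis described for the DNL measurement can be sketched as below. The function names and the toy histogram are hypothetical; the INL is computed as the running sum of the DNL, which is the usual definition, and the statistical floor assumes Poisson counting statistics.

```python
import math


def dnl_from_counts(counts):
    """DNL per bin, in LSB: (count - mean) / mean."""
    mean = sum(counts) / len(counts)
    return [(c - mean) / mean for c in counts]


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


def inl_from_dnl(dnl):
    """INL per bin, in LSB, as the cumulative sum of the DNL."""
    inl, running = [], 0.0
    for d in dnl:
        running += d
        inl.append(running)
    return inl


def statistical_floor(events_per_bin):
    """Relative bin-to-bin spread expected from counting statistics alone."""
    return 1.0 / math.sqrt(events_per_bin)


counts = [100, 110, 90, 100]               # toy histogram, not measured data
dnl = dnl_from_counts(counts)
print([round(d, 3) for d in dnl])          # [0.0, 0.1, -0.1, 0.0]
print(round(rms(dnl), 4))                  # 0.0707
print(statistical_floor(10_000))           # 0.01, i.e. 1 % at 10 kconv per bin
```

The last line reproduces the rule quoted in the text: about 10 kconv per bin are needed before statistical noise drops to the 1 % level.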

B. Interleaving and Averaging modes

The instrument can exploit the auxiliary channel to implement an equivalent single-channel time measurement system with improved performance, thanks to the interleaving mode (for higher conversion rate) or to the averaging mode (for higher resolution and precision). For these two operating modes we characterized the deterministic time differences between the eight START-STOP paths, as provided to the cards from the single auxiliary channel. These deterministic delays were measured by providing a fixed time interval to all 8 cards through the auxiliary channel and by collecting the resulting histograms. The delays were then computed as the time differences between the centroids of the eight reconstructed distributions. When the measurements are repeated at several positions over the whole range, the path time differences remain constant, with less than 1 LSB uncertainty. Time offsets among channels are shorter than 4.1 ns and are automatically subtracted by the FPGA when operating in these modes.
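The centroid-based estimate of the channel offsets can be sketched as follows. The function names, the choice of channel 0 as reference, and the toy histograms are assumptions made for illustration.

```python
def centroid(histogram):
    """Count-weighted mean bin index of a histogram."""
    total = sum(histogram)
    if total == 0:
        raise ValueError("empty histogram")
    return sum(i * c for i, c in enumerate(histogram)) / total


def channel_offsets(histograms, reference=0):
    """Offset of each channel, in bins, relative to the reference channel."""
    ref = centroid(histograms[reference])
    return [centroid(h) - ref for h in histograms]


# Toy example: the same fixed delay seen by two cards, one bin apart.
card0 = [0, 1, 2, 1, 0]
card1 = [0, 0, 1, 2, 1]
print(channel_offsets([card0, card1]))  # [0.0, 1.0]
```

Multiplying each offset by the bin width gives the delay to subtract in the corresponding timing block.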

Fig. 8 shows the performance achieved in interleaving mode, which reaches a maximum conversion rate of 32 Mconv/s, i.e. about 4 Mconv/s per channel (instead of 5.5 Mconv/s) due to the delay introduced by the auxiliary channel CPLD. The timing precision is about 22 ps rms, with 18 ps minimum and 26 ps maximum values; this slight worsening is due to the quantization error of the compensated time differences, which are not exact multiples of the LSB. The result is an imperfect overlap of the 8 cards’ histograms, which leads to a broader distribution. The DNL instead improves, reaching 0.6 % LSB rms (i.e., 60 fs rms) and 5 % LSB peak, because the nonlinearities of the cards average out when all of them contribute to the same histogram. Fig. 9 shows the performance achieved in averaging mode, where the resolution is 1.25 ps. The timing precision is about 9 ps rms, an improvement by a factor of 2.3. The DNL is about 1.5 % LSB rms (i.e., 18 fs rms) and the INL is less than 0.8 ps rms, i.e. a 7x improvement (since the new LSB value is 1.25 ps).

Fig. 8: Module performance in interleaving mode over the 10 µs full scale range: the timing precision is about 22 ps rms, computed sweeping a constant START-STOP delay in 1 ns steps; the DNL is about 0.6 % LSB rms, evaluated by means of a code density test.

Fig. 9: Module timing precision (left) and DNL (right) in averaging mode. The timing precision is about 9 ps rms, computed sweeping a constant START-STOP delay in 1 ns steps; the DNL is about 1.5 % LSB rms, evaluated by means of a code density test.
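The figures quoted for the two single-channel modes can be cross-checked against each other. The following is only a consistency check on numbers already given in the text, not an additional measurement:

$$\frac{32\ \text{Mconv/s}}{8} = 4\ \text{Mconv/s per card}, \qquad 0.6\,\% \times 10\ \text{ps} = 60\ \text{fs},$$

$$\frac{21\ \text{ps}}{9\ \text{ps}} \approx 2.3 < \sqrt{8} \approx 2.83, \qquad 1.5\,\% \times 1.25\ \text{ps} \approx 18.75\ \text{fs}.$$

The measured precision gain in averaging mode is therefore somewhat below the theoretical $\sqrt{8}$ limit, and the last product agrees with the value of about 18 fs rms quoted for the averaging mode DNL.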

IV. CONCLUSION

We presented an eight-channel, TDC-based time measurement instrument with 10 ps resolution, which provides high performance in terms of timing precision (21 ps rms) and DNL (better than 1.3 % LSB rms), with a 5.5 Mconv/s rate per channel and a power consumption of just 6 W, much lower than that of state-of-the-art commercial instrumentation (by Becker&Hickl [9] and PicoQuant [10]), as shown in Table 1. The module provides 8 couples of START-STOP input connectors, acting as eight fully independent time meter channels, and an auxiliary START-STOP input couple to provide a common signal to the desired channels. As shown in Table 1, the instrument reaches performance comparable with the best-in-class multichannel solutions available for TCSPC applications, but it can also be used as a single-channel measurement system, either keeping the same performance while increasing the maximum conversion rate, or running at the same conversion rate while improving the timing performance to 9 ps rms precision and 18 fs rms DNL.

<table>
<thead>
<tr>
<th>Parameter</th>
<th>[9]</th>
<th>[10]</th>
<th>This work</th>
</tr>
</thead>
<tbody>
<tr>
<td>Channel number</td>
<td>4</td>
<td>8</td>
<td>8</td>
</tr>
<tr>
<td>Resolution (LSB) [ps]</td>
<td>0.813</td>
<td>4</td>
<td>10</td>
</tr>
<tr>
<td>Measurement range [ns]</td>
<td>3.33</td>
<td>260</td>
<td>10,000</td>
</tr>
<tr>
<td>Precision (rms) [ps]</td>
<td>2.5</td>
<td>12</td>
<td>21</td>
</tr>
<tr>
<td>DNL (rms) [% LSB]</td>
<td>0.5</td>
<td>1</td>
<td>1.3</td>
</tr>
<tr>
<td>DNL (peak) [% LSB]</td>
<td>1</td>
<td>5</td>
<td>8</td>
</tr>
<tr>
<td>Max conversion rate per channel [Mconv/s]</td>
<td>5</td>
<td>1.25</td>
<td>5.5</td>
</tr>
<tr>
<td>Power consumption [W]</td>
<td>60</td>
<td>25</td>
<td>6</td>
</tr>
</tbody>
</table>

REFERENCES


Davide Tamborini was born in Angera, Italy, in 1987. He received the M.Sc. cum laude in Electronic Engineering from the Politecnico di Milano, Milano, Italy, in 2012. At present he is a Ph.D. candidate in ICT Engineering at Politecnico di Milano. His main research activities are high-precision time measurement instrumentation for time-correlated single-photon counting of fast phenomena.

Davide Portaluppi was born in Magenta, Italy, in 1989. He received the B.Sc. and M.Sc. cum laude in Electronics Engineering from the Politecnico di Milano, Milano, Italy, in 2011 and 2014 respectively. At present, his activity is focused on development of analog and digital electronics for nuclear instrumentation.

Federica Villa (M’15) was born in Milano in 1986. She received the B.Sc. in Biomedical Engineering in 2008, the M.Sc. summa cum laude in Electronic Engineering in 2010 and the Ph.D. in ICT in 2014 at Politecnico di Milano. Her present research activity aims at designing and developing CMOS SPAD arrays for 2D imaging via single-photon counting and 3D ranging through direct time-of-flight photon-timing. Since 2015, she has been a Member of the IEEE.

Franco Zappa (SM’12) was born in Milano, Italy, in 1965. He received the Master degree in electronics engineering and the Ph.D. degree in communication technology from Politecnico di Milano, in 1989 and 1993, respectively. Since 2011, he has been full Professor of electronics at the Politecnico di Milano. His research interests include microelectronic circuitry for single-photon detectors (SPAD) and CMOS SPAD imagers, for high-sensitivity time-resolved optical measurements, 2D imaging and 3D depth ranging. He is coauthor of more than 160 papers, published in peer-reviewed journals and in conference proceedings, and nine textbooks on Electronic Design and Electronic Systems. He is coauthor of four international patents. In 2004, he cofounded “Micro Photon Devices” focused on the production of SPAD modules and cameras for single photon-counting and photon-timing. Since 2007, he has been a Senior Member of the IEEE.