

# Reservoir Computing with Charge-Trap Memory Based on a MoS<sub>2</sub> Channel for Neuromorphic Engineering

Matteo Farronato, Piergiulio Mannocci, Margherita Melegari, Saverio Ricci, Christian Monzio Compagnoni, and Daniele Ielmini\*

Novel memory devices are essential for developing low-power, fast, and accurate in-memory computing and neuromorphic engineering concepts that can compete with conventional complementary metal-oxide-semiconductor (CMOS) digital processors. 2D semiconductors provide a novel platform for advanced semiconductors with atomic thickness, low-current operation, and capability of 3D integration. This work presents a charge-trap memory (CTM) device with a MoS<sub>2</sub> channel where memory operation arises from electron trapping/detrapping at interface states. Transistor operation, memory characteristics, and synaptic potentiation/depression for neuromorphic applications are demonstrated. The CTM device shows outstanding linearity of the potentiation by applied drain pulses of equal amplitude. Finally, pattern recognition is demonstrated by reservoir computing where the input pattern is applied as a stimulation of the MoS<sub>2</sub>-based CTMs, while the output current after stimulation is processed by a feedforward readout network. The good accuracy, the low-current operation, and the robustness to random bit flips in the input make the CTM device a promising technology for future high-density neuromorphic computing concepts.

## 1. Introduction

In recent years, the widespread adoption of artificial intelligence and big data analysis has raised the demand for fast and efficient computing systems capable of processing large amounts of data. Given the inherent limitations of digital computing systems, growing interest has been devoted to new computing paradigms that can go beyond the conventional von Neumann architecture.<sup>[1]</sup> In this context, neuromorphic engineering emulating the human brain functionality is one of the most promising solutions for fast, energy-efficient processing of unlabeled data.<sup>[2]</sup> Although many research efforts have been devoted to the realization of such systems by dedicated hardware with conventional complementary metal–oxide–semiconductor (CMOS) technologies, the benefits of this approach are still under question.

M. Farronato, P. Mannocci, M. Melegari, S. Ricci, C. M. Compagnoni, D. Ielmini

Dipartimento di Elettronica, Informazione e Bioingegneria (DEIB), Politecnico di Milano and IUNET

piazza L. da Vinci 32, Milano 20133, Italy

E-mail: daniele.ielmini@polimi.it

The ORCID identification number(s) for the author(s) of this article can be found under https://doi.org/10.1002/adma.202205381.

© 2022 The Authors. Advanced Materials published by Wiley-VCH GmbH. This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.

DOI: 10.1002/adma.202205381

Emerging solid-state memory devices have been proposed as ideal candidates for the realization of efficient neuromorphic systems.<sup>[3,4]</sup> Among all the available options, innovative devices based on bidimensional (2D) materials offer low-power consumption, outstanding scaling, back-end integration and the ability for 3D integration.<sup>[5,6]</sup> In the last few years, significant progress has been made regarding the deposition of 2D materials with processes like chemical vapor deposition (CVD)<sup>[7,8]</sup> or pulsed laser deposition (PLD),<sup>[9]</sup> thus enabling the fabrication of CMOS-compatible large-scale memory arrays. Such memory devices typically consist of a field-effect transistor (FET) with a 2D semiconductor as the channel material. To achieve a memory effect, various physical processes have been adopted to modulate the current–voltage characteristics of the transistor.<sup>[10]</sup> Memory mechanisms include defect migration to tune the Schottky barrier at the channel contact with the metal source and drain regions,<sup>[11]</sup> switching of the spontaneous polarization of a ferroelectric layer in the gate stack,<sup>[12]</sup> and the migration of ionic species in the channel.<sup>[13]</sup>

This work presents a charge-trap memory (CTM) based on a FET with molybdenum disulfide (MoS<sub>2</sub>) as the channel material for neuromorphic computing applications. The memory effect in the device is achieved by capture/emission of charge carriers in the microscopic defects at the interface between the channel and the gate oxide. Capture/emission processes are controlled by gate or drain voltage pulses and cause a change of the threshold voltage $V_{\rm T}$ of the FET. For a given gate voltage $V_{\rm G}$, the $V_{\rm T}$ shift results in a change of the channel conductance *G*. With respect to previously reported CTM devices based on 2D semiconductors<sup>[14-16]</sup> or van der Waals heterostructures,<sup>[17]</sup> our MoS<sub>2</sub>-based CTM displays the smallest channel length, thus being attractive for high-density applications. Within the neuromorphic computing scenario, the investigated CTM can be considered as an artificial synapse and the conductance modulation can be viewed as the potentiation or depression of the synaptic weight. Repeated weight potentiation and depression events are demonstrated with the device operating in the deep subthreshold regime, where the ultra-low values of *G* in the nS range are extremely attractive for low power consumption and large scalability of the synaptic network. In addition, the device is extremely attractive for its potential use in hardware accelerators of neural networks thanks to the high linearity of the weight update obtained with equal amplitude pulses and the ultrawide conductance window.

The suitability of the proposed CTM device for neuromorphic computing applications is validated by demonstrating a reservoir computing (RC) system for the classification of digit images<sup>[18]</sup> by the logistic regression method.<sup>[19,20]</sup> Excellent classification performance metrics are demonstrated, with an accuracy of 95.5% and a good robustness to noise corruption of the input data. These results support the proposed CTM device as a promising solution for low-power, highly scalable neuromorphic computing systems inspired by the brain.

## 2. Device Characteristics

**Figure 1**a shows a sketch illustrating the CTM device, featuring a MoS<sub>2</sub> channel deposited via mechanical exfoliation from a bulk sample on an oxidized p-doped Si substrate used as back gate.<sup>[13,21,22]</sup> The typical MoS<sub>2</sub> thickness is on the order of a few atomic layers (Figure S1, Supporting Information). Source and drain contacts are deposited on top of the MoS<sub>2</sub> flake by thermally evaporated Ag (Figure 1b and Figure S2, Supporting Information) and patterned by electron-beam lithography (see the Experimental Section). Figure 1c shows a scanning electron microscopy (SEM) image of the channel region. The triangular shape of the electrodes reduces proximity effects in the channel region during exposure. The typical channel length L ranges from 50 nm to 100 nm, as evidenced by the atomic force microscopy (AFM) image in Figure S3 (Supporting Information). The adoption of short channel lengths makes it possible to investigate the device performance in the context of high integration densities.

**Figure 1.** Transistor characteristics of the CTM device. a) Schematic of the device highlighting the MoS<sub>2</sub> flake deposited by mechanical exfoliation on top of the oxidized Si substrate and the Ag source and drain regions. b) Optical image of two devices made on a single MoS<sub>2</sub> flake. c) SEM image of the device. d) Measured $I_D-V_{GS}$ characteristic at constant $V_{DS} = 100$ mV for a forward and reverse sweep of $V_{GS}$ with a sweep rate of 3.45 V s<sup>-1</sup>. e) $I_D-V_{DS}$ characteristic measured at increasing gate voltage $V_{GS}$. f) Measured $I_D-V_{DS}$ characteristics in a log–log plot at negative $V_{GS}$. The observed linearity confirms the ohmic behavior of the CTM even for relatively large $V_{DS}$ and deep subthreshold regime. g) Qualitative band diagram of the CTM device highlighting the defects at the interface between MoS<sub>2</sub> and SiO<sub>2</sub> which are responsible for the observed hysteresis effect. h) Qualitative band diagram after the application of a positive $V_{GS}$ resulting in electron trapping at interface states. i) Qualitative band diagram after the application of a negative $V_{GS}$ resulting in electron detrapping from interface states.

www.advmat.de

ADVANCED SCIENCE NEWS \_\_\_\_\_

Figure 1d shows the drain current $I_D$ as a function of the gate voltage $V_{GS}$ for a constant drain voltage $V_{DS} = 0.1$ V and forward or reverse sweeps of V<sub>GS</sub>. A relatively large V<sub>GS</sub> was applied due to the thick SiO<sub>2</sub> gate-oxide layer of our device, which was required to easily identify the MoS<sub>2</sub> flake by optical microscopy after exfoliation. The subthreshold and on-state regimes in Figure 1d differ by more than 5 orders of magnitude, thus supporting the excellent transistor performance of the CTM device. All the measured devices display a depletion-type FET operation with negative $V_{T}$, which can be explained by an intrinsic n-type doping of MoS<sub>2</sub>, possibly due to S vacancies,<sup>[23]</sup> short channel effects and oxide charges. Figure 1e shows the measured $I_D - V_{DS}$ at relatively low $|V_{DS}| < 0.1$ V for increasing $V_{GS}$. The highly linear characteristics confirm the ohmic-type contact between Ag source/drain electrodes and MoS<sub>2</sub>.<sup>[24]</sup> The linearity of the characteristic is still observable for higher $V_{DS}$ (up to 1 V), even for very low $V_{GS}$, as can be seen in Figure 1f. The large hysteresis of the $I_D-V_{GS}$ characteristic can be attributed to trapping/detrapping at the defects either in the MoS<sub>2</sub> semiconductor,<sup>[25,26]</sup> or in the SiO<sub>2</sub> layer,<sup>[27,28]</sup> or at the SiO<sub>2</sub>-MoS<sub>2</sub> interface.<sup>[29,30]</sup> Electron/hole trapping at these defects causes a shift in the threshold voltage $V_{T}$, as illustrated in Figure 1g–i. Initially, in the pristine state (Figure 1g), defects are filled according to the equilibrium energy distribution dictated by the Fermi level $E_{\rm F}$ in the substrate. As $V_{\rm GS}$ is increased (Figure 1h), defects are filled with electrons as a result of channel inversion and the corresponding increase of $E_{\rm F}$, thus resulting in a positive $V_{\rm T}$ shift in Figure 1d. On the other hand, when a negative $V_{GS}$ sweep is applied (Figure 1i), defects are filled by holes as a result of the decrease of $E_F$, thus causing a decrease of $V_T$ in Figure 1d. Hysteretic behaviors were observed systematically in all the devices even after 650 repeated V<sub>GS</sub> sweeps (Figure S4, Supporting Information). The hysteresis characteristics can change from device to device, as can be seen in Figure S5 (Supporting Information), probably due to differences in the properties of the MoS<sub>2</sub> flake obtained by mechanical exfoliation. Interface defects generally display a wide distribution of capture/emission times, so changing the $V_{GS}$ sweep rate results in a change in the $V_{\rm T}$ hysteresis window.<sup>[31]</sup> Note that the threshold hysteresis can also be caused by gas adsorption/desorption.<sup>[16,32]</sup> This is supported by evidence of hysteresis reduction after device passivation, as shown in Figure S6 (Supporting Information). The adsorbed atoms/molecules might act as trapping sites, in addition to intrinsic interface states or bulk defects in SiO<sub>2</sub>, thus contributing to the observed hysteresis. Note that dipole orientation of adsorbed polar species, such as H<sub>2</sub>O, would result in an opposite direction of the threshold shift (see Figure S7, Supporting Information), and thus should not represent the main contribution to the V<sub>T</sub> hysteresis in our devices.

Figure S8a (Supporting Information) shows the measured $I_D-V_{GS}$ characteristics for increasing sweep rate, while Figure S8b (Supporting Information) reports the measured $V_{\rm T}$ after positive and negative $V_{\rm GS}$ sweep. The positive threshold voltage shows a larger shift, which can be explained by the large positive overdrive voltage $V_{\rm GS} - V_{\rm T}$ for $V_{\rm GS} = +40$ V compared to the negative overdrive for $V_{\rm GS} = -40$ V. Decreasing the sweep rate results in a wider hysteresis, since more traps have enough time to be charged or discharged.<sup>[27]</sup> Finally, note that the hysteresis is also present in the $I_{\rm D}-V_{\rm DS}$ characteristic for relatively large $V_{\rm DS}$ inducing potentiation, as shown in Figure S9 (Supporting Information). The increase of the drain voltage causes charge trapping and a consequent increase of conductance, which is at the origin of the hysteresis behavior.

## 3. Synaptic Characteristic

Based on the hysteresis behavior in Figure 1d, a pulsed scheme was developed to operate the CTM device as an artificial synapse with the weight corresponding to its channel conductance G. To test the synaptic potentiation, the device was first biased with a constant negative $V_{GS}$ and $V_{DS}$ = 100 mV (see Figure S10a, Supporting Information). After 20 s of constant bias, negative voltage pulses were applied to the gate while monitoring $I_{DS}$. The bias preconditioning excludes further drift during the application of pulses and brings the device into an equilibrium condition. The negative V<sub>GS</sub> pulses induce a decrease of V<sub>T</sub> by emptying some defects at the MoS<sub>2</sub>/gate-oxide interface, resulting in a growth of G which can be viewed as the potentiation of the artificial synapse. After each potentiation pulse, a read pulse at the same voltage as the pre-bias phase was applied to V<sub>GS</sub> to monitor the channel conductance G (Figure S10b, Supporting Information). All pulses have the same amplitude and duration (1 ms for both potentiation and read). Figure 2a shows the conductance G during a sequence of 100 negative pulses. A clear increase of G (potentiation of the device) can be seen, as a result of the negative shift of $V_{\rm T}$. The evolution of *G* with the number of applied potentiation pulses was reproduced with the formula<sup>[33]</sup>

$$G = G_0 \left( 1 - e^{-\nu p} \right) + G_{\min} \tag{1}$$

where *p* is the normalized number of pulses, $\nu$ is a shape factor depending on the linearity of the potentiation characteristic, $G_{\min}$ is the initial value, and $G_0$ is a fitting parameter representing the *G* dynamic range. Based on Equation (1), a relatively low value of $\nu$ indicates a good linearity of the potentiation curve. From Figure 2a, the increase of $V_{\text{GS}}$ gives rise to a larger modulation of *G* at the cost of a decrease of linearity of the potentiation trend.
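As a rough illustration of how the shape factor in Equation (1) controls linearity, the following Python sketch evaluates the model and measures how far the normalized potentiation curve departs from a straight line. The function names are hypothetical, and the shape-factor values other than 0.25 are arbitrary choices for illustration rather than fitted values from this work.

```python
import math

def conductance(p, g0, nu, g_min):
    """Equation (1): G = G0 * (1 - exp(-nu * p)) + G_min, with p normalized to [0, 1]."""
    return g0 * (1.0 - math.exp(-nu * p)) + g_min

def max_deviation_from_line(nu, n_pulses=100):
    """Largest gap between the normalized curve and the ideal line G ~ p.

    The curve is rescaled so that it runs from 0 (p = 0) to 1 (p = 1);
    a perfectly linear update would then return 0.
    """
    swing = 1.0 - math.exp(-nu)  # total change over p = 0..1 for G0 = 1
    worst = 0.0
    for k in range(n_pulses + 1):
        p = k / n_pulses
        curve = (1.0 - math.exp(-nu * p)) / swing
        worst = max(worst, abs(curve - p))
    return worst

# Illustrative shape factors (assumed values, except 0.25 quoted in the text).
for nu in (0.25, 1.0, 4.0):
    print(nu, round(max_deviation_from_line(nu), 3))
```

Under this model, the deviation from the straight line grows with the shape factor, which is why a low value is read as good linearity.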

**Figure 2.** Potentiation and depression characteristics of the CTM devices. a) Channel conductance *G* as a function of the number of pulses for negative applied $V_{GS}$ pulses at various $V_{GS}$ of the read condition. A repetition of 100 identical pulses with relative amplitude $\Delta V_{GS} = -5$ V was applied starting from the read $V_{GS}$ reported in the legend with $V_{DS} = 100$ mV. Also shown are the fitting curves according to Equation (1) to assess the linearity of the potentiation characteristics. b) Channel conductance *G* as a function of the number of pulses for positive applied $V_{DS}$ pulses at increasing read $V_{GS}$. The fitting by Equation (1) indicates an increase of linearity for increasingly deep subthreshold regime. c) Measured $I_D-V_{GS}$ characteristics after a preparation step at increasing read $V_{GS}$. The device was first biased at a constant negative $V_{GS}$ and $V_{DS} = 100$ mV. After 20 s, $V_{GS}$ was gradually increased to 35 V while monitoring the drain current.

The same potentiation behavior was obtained by applying drain pulses. Here, instead of negative gate pulses, positive drain pulses were applied at constant negative $V_{GS}$, which acts as a reference for the potentiation and the depression (Figure S11, Supporting Information). Note that the positive $V_{DS}$ is equivalent to a more negative $V_{GS}$ relative to the pre-bias condition, thus causing an increase of the trapped hole concentration, hence potentiation. On the other hand, a negative $V_{DS}$ corresponds to a more positive $V_{GS}$ relative to the pre-bias and/or potentiation bias, thus a decrease of trapped holes and a consequent depression. Similar to the gate-pulse potentiation, the device was biased for 20 s before the application of the drain pulses, to reach a steady-state initial charging of the interface states. Each drain pulse has a constant amplitude of $V_{\rm DS}$ = 2.5 V and pulse-width of 200 ms. Figure 2b shows the measured potentiation characteristics: as V<sub>GS</sub> becomes more negative, the modulation of G decreases and the update linearity increases, with a minimum shape factor $\nu = 0.25$. Note that there is a trade-off between the conductance window and the linearity of curves, as illustrated in Figure S12 (Supporting Information). Here, the shape factor increases, hence the linearity decreases, for increasing conductance window $G_{\text{max}} - G_{\text{min}}$ of potentiation. This saturation effect can be attributed to the filling of interface states at large concentrations of trapped charge. All the conductance curves have extremely low $G_{\min}$, generally below 1 nS (R > 1 G$\Omega$), even for the highest $V_{GS}$ values. The low value of $G_{\min}$ and its relative independence of the pre-bias $V_{GS}$ can be explained by the stabilization mechanism in Figure 2c. Here the device was biased with negative gate $V_{GS}$ for 20 s before sweeping $V_{GS}$ to +35 V while monitoring $I_{DS}$. The $I_{DS}$ curve shifts to negative voltages as the negative pre-bias $V_{GS}$ is increased, due to the emptying of some defects at the MoS<sub>2</sub>/gate-oxide interface. The resulting shift of $V_{\rm T}$ stabilizes $G_{\rm min}$ at a value that is almost independent of the initial $V_{\rm GS}$ value. Note that such a stabilization effect is useful to reduce possible device-to-device variations of $V_{\rm T}$.
In addition, the low conductance G in Figure 2b can be considered an ohmic conductance given the high linearity of the $I_{\rm D}-V_{\rm DS}$ characteristics in Figure 1f, even for low $V_{\rm GS}$ in the deep subthreshold regime. Note that ohmic conduction is essential for in-memory matrix-vector multiplication where the product is obtained from Ohm's law I = GV.<sup>[1]</sup> Figure S13a,b (Supporting Information) shows the distribution of conductance change $\Delta G$ measured after each pulse and the average ($\mu$) and standard deviation ($\sigma$) of the distributions. Both $\mu$ and $\sigma$ decrease for decreasing $V_{GS}$ in deep subthreshold regime. The linearity of the potentiation characteristics in Figure 2b, the very low conductance values, the linearity of the $I_{\rm D}-V_{\rm DS}$ curve and the extremely small footprint make the CTM device extremely appealing with respect to other emerging devices, including RRAM,<sup>[34,35]</sup> phase change memories (PCM),<sup>[36]</sup> electrochemical random access memory (ECRAM)<sup>[37]</sup> and MoS<sub>2</sub> heterostructures,<sup>[38]</sup> as illustrated in Figure S14 (Supporting Information). Table S1 (Supporting Information) provides an overview of synaptic transistors based on 2D materials. With respect to previously reported devices in the literature,<sup>[14–17,36–40]</sup> the proposed CTM device shows the lowest conductance levels and lowest pulse amplitude, resulting in a very low energy consumption. The latter can be further reduced by minimizing the gate oxide thickness, hence the gate operating voltages.
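The role of ohmic conduction in matrix-vector multiplication, noted above, can be sketched in a few lines of Python. The example assumes an idealized array in which each device contributes a current G·V and the currents on a shared column simply add; the function name, the array size, and all conductance and voltage values are hypothetical and are not measurements from this work.

```python
def column_currents(conductances, voltages):
    """Return I_j = sum_i G[i][j] * V[i] for every column j.

    Each term is Ohm's law for one device; the sum over i is the
    current summation on the shared column line (idealized, no wire
    resistance or device nonlinearity).
    """
    n_rows = len(conductances)
    n_cols = len(conductances[0])
    assert len(voltages) == n_rows
    return [
        sum(conductances[i][j] * voltages[i] for i in range(n_rows))
        for j in range(n_cols)
    ]

# Hypothetical 3x2 conductance matrix in siemens (nS range) and read voltages in volts.
G = [[1e-9, 2e-9],
     [0.5e-9, 1e-9],
     [2e-9, 0.0]]
V = [0.1, 0.1, 0.0]

I = column_currents(G, V)
print(I)
```

The multiplication is exact only if each current scales linearly with the applied voltage, which is why the linear $I_{\rm D}-V_{\rm DS}$ characteristic matters for this use.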


In addition to synaptic potentiation, we also studied synaptic depression by applying pulses of opposite polarity. **Figure 3**a,b shows repeated cycles of potentiation and depression, each consisting of 50 voltage pulses with  $V_{DS} = 2.5$  V and  $V_{DS} = -1$  V for potentiation and depression, respectively, for various  $V_{GS}$  bias. The  $V_{DS}$  values for potentiation and depression were selected to achieve the same dynamic range  $G_0$  for potentiation and depression within the same number of pulses, i.e., 50 pulses in Figure 3a,b. The measured characteristics show analog and bidirectional weight update with good reproducibility. Similar to data in Figure 2b, changing the gate bias results in a change of the conductance window.

www.advancedsciencenews.com

**Figure 3.** Pulsed potentiation and depression experiments. a) Repeated positive–negative drain cycles with $V_{DS} = 2.5$ V and $V_{DS} = -1$ V, respectively, demonstrating reproducible consecutive potentiation and depression characteristics. The $V_{GS}$ was kept constant at -15 V. b) Same as (a) but with $V_{GS} = -10$ V. The window increases with increasing gate bias, similar to data in Figure 2b. c) Measured G during potentiation (initial 50 pulses) followed by read at low $V_{DS}$, to monitor the spontaneous depression of the CTM. The conductance shows an exponential decay due to spontaneous re-trapping with a retention time $t_r$ around 5 s. d) Analog conductance tuning obtained by applying potentiation pulses at reduced frequency during the read phase. Applying potentiation pulses at $f_2 = 1$ Hz allows for a dynamic refresh of the conductance state.

Depression in Figure 3a,b shows a slightly higher nonlinearity compared to potentiation, which can be explained by the spontaneous discharge contributing to the depression transition. The low linearity of the depression can be overcome by a 2-CTM synapse, where potentiation of the positive CTM or potentiation of the negative CTM results in an overall potentiation or depression, respectively, as illustrated in Figure S15 (Supporting Information). To better address the retention in our CTM device, Figure 3c shows the measured G after application of 50 positive V<sub>DS</sub> pulses for potentiation, followed by a read phase with a small $V_{DS} = 150 \text{ mV}$ to monitor the spontaneous depression of conductance. Data show a spontaneous exponential decay of conductance with a retention time constant around 5 s. The decay can be understood by electron retrapping at interface states after potentiation. The asymmetric potentiation/depression in Figure 3a,b can thus be attributed to the additional contribution by the spontaneous electron retrapping during the depression phase. Spontaneous re-trapping can be mitigated by proper drain voltage pulses after potentiation. For instance, Figure 3d shows the application of 50 pulses with $V_{\rm DS} = 4$ V applied at a frequency $f_1 = 1.66$ Hz, followed by another train of 50 pulses with the same amplitude and a reduced frequency $f_2$. Adopting a frequency $f_2 = 1$ Hz compensates the spontaneous re-trapping, effectively refreshing the CTM to maintain a constant G.
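The interplay between spontaneous decay and periodic refresh can be illustrated with a simple Python sketch. It assumes a single-exponential decay with the roughly 5 s time constant quoted above and, as a further simplifying assumption not taken from the measurements, that every refresh pulse adds the same conductance increment. All function names and numerical increments are hypothetical.

```python
import math

def decayed_conductance(g_start, g_rest, t, tau=5.0):
    """Conductance t seconds after potentiation, relaxing from g_start toward g_rest."""
    return g_rest + (g_start - g_rest) * math.exp(-t / tau)

def simulate_refresh(excess0, dg, period, n_pulses, tau=5.0):
    """Excess conductance (above rest) after n_pulses refresh pulses.

    Between pulses the excess decays by exp(-period / tau); each pulse
    then adds a fixed increment dg (toy assumption).
    """
    excess = excess0
    for _ in range(n_pulses):
        excess = excess * math.exp(-period / tau) + dg
    return excess

def refresh_steady_state(dg, period, tau=5.0):
    """Excess conductance at which decay per period and refresh increment balance."""
    return dg / (1.0 - math.exp(-period / tau))

# Refresh at 1 Hz (period = 1 s) with an arbitrary unit increment.
print(simulate_refresh(0.0, 1.0, 1.0, 200))
print(refresh_steady_state(1.0, 1.0))
```

In this toy model the conductance settles at a level set by the ratio between the refresh increment and the fraction lost per period, so a lower pulse frequency or a shorter retention time gives a lower held state.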

## 4. Reservoir Computing

The proposed MoS<sub>2</sub>-based CTM was used to implement the reservoir layer of the reservoir computing (RC) system shown in **Figure 4**a. An image made of $n \times m$ pixels of binary amplitude is encoded in a spatio-temporal sequence of pulses. More specifically, for each column of pixels in the image, *n* voltage pulses of duration *T* (200 ms for experiments in Figure 4b) and amplitude equal to 0 or *V* are simultaneously produced, one per image row. The amplitude of each of these pulses encodes the amplitude of the pixel in the corresponding row, where voltages 0 and *V* represent a pixel in the off and on state, respectively. The voltage pulses are applied as input signals to an array of *n* artificial synapses made by the proposed CTM devices, which act as a reservoir layer and convert the applied pulses to a current depending on the device conductance G. The current can be sensed as the output voltage in a transimpedance amplifier (TIA) with $R_{TIA}$ feedback resistance. The output of the reservoir layer is fed to the classification network of the RC system, which is composed of a single-layer, fully connected neural network whose weight matrix can be realized using crosspoint arrays of resistive memories, enabling a fully in-memory implementation of the proposed network.<sup>[41]</sup> The output vector resulting from the matrix-vector multiplication between the $n \times 1$ reservoir output vector and the $n \times d$ weight matrix is fed to an array of *d* neurons equipped with a sigmoid-type activation function, where *d* is the number of classification labels. The final pattern classification is extracted by considering the largest neuronal output.
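The encoding described above can be sketched in Python as follows. Each image row becomes one pulse train, with one time step per image column, and each train drives one reservoir element. The device response used here is a toy fading-memory model (fixed retention factor and fixed increment per pulse), which is an assumption made only for illustration and is not the measured CTM behavior. The function names and the bitmap are hypothetical, while the −5 V amplitude follows the gate-pulse amplitude quoted in the text.

```python
def image_to_pulse_streams(image, v_on):
    """One pulse train per row: stream[r][t] is the amplitude applied to device r at step t."""
    return [[v_on if pixel else 0.0 for pixel in row] for row in image]

def toy_ctm_response(stream, g_step=1.0, retain=0.8):
    """Toy fading-memory device (assumed model).

    At every time step the state first decays by the factor `retain`,
    then an 'on' pulse adds `g_step`. The final state therefore depends
    on both the number of pulses and their order.
    """
    g = 0.0
    for v in stream:
        g = g * retain + (g_step if v != 0.0 else 0.0)
    return g

def reservoir_state(image, v_on=-5.0):
    """Final state of each device after the whole image has been streamed in."""
    return [toy_ctm_response(s) for s in image_to_pulse_streams(image, v_on)]

# Hypothetical 5x4 binary bitmap (rows x columns).
example_image = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
print(reservoir_state(example_image))
```

Because the state fades between pulses, two rows with the same number of on pixels in a different order end in different states, which is the property the readout layer relies on.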

Owing to the RC concept,<sup>[42,43]</sup> training is necessary for the classification network weights only, as the reservoir layer is a network with a fixed connectivity structure where the neurons evolve dynamically under the stimulation of the spatiotemporal input pattern. Training of the classification network was performed using the logistic regression method<sup>[19]</sup> via the pseudoinverse matrix concept (see the Experimental Section). To reduce the computation time and energy of the pseudoinverse matrix, the operation can be executed in memory directly on the crosspoint array.<sup>[44]</sup>

**Figure 4.** Demonstration of reservoir computing (RC) for image recognition. a) Schematic of the RC experiment, where an input spatial pattern is transformed to a spatiotemporal pattern which is applied to a vector of five CTM devices. The vector of CTM conductance values is used as input of a linear feedforward neural network performing the classification of the input pattern. The supervised training of the readout layer can be carried out in just one single step via logistic regression. b) Conductance response of the CTM layer (gate input) for each digit image. For the training, conductance curves for each input stream were randomly extracted from an experimental dataset to take into account the stochastic variation of CTM potentiation/depression.

To evaluate the performance of the CTM devices as reservoir elements in the RC system, we considered a toy problem consisting of the classification of $5 \times 4$ monochrome images of digits from 0 to 9. Each 4-pixel image row was fed to the corresponding CTM input channel as a sequence of four pulses applied to the gate or the drain terminal. Figure 4b shows the reservoir states measured at the end of the submission of each input pattern as negative gate pulses to the CTM. Figure S16 (Supporting Information) shows similar results for drain pulses adopted as input stimulation signals. Despite the variability of the CTM response for a given input temporal sequence, the final $5 \times 1$ output state is clearly unique for each input digit.

**Figure 5.** Results of the digit image recognition with reservoir computing. a) Confusion matrix showing the classification results for images and experimental CTM datasets in Figure 4b. The colors represent the normalized numbers of correct classifications with respect to the total number of each occurrence. b) Test accuracy of the RC system as a function of the number of flipped bits in the images. All data are extracted with the gate-pulse configuration. c) Example of a one-bit flip image dataset used as test images. d) Confusion matrix for 1-bit flip test images. Most of the digits are still correctly classified, although there is a decrease in the accuracy.

Supervised training of the feedforward readout network was achieved by using the dataset of Figure 4b and considering the variability of the reservoir output (see the Experimental Section). To evaluate the network accuracy, we performed Monte Carlo simulations by testing 2 million images uniformly sampled from the digit dataset. Figure 5a shows the resulting confusion matrix, where the large average value of about 95.5% of the diagonal elements highlights the good accuracy of the classification network, only limited by the response variability of the reservoir output. The accuracy drops to 87.7% when the input signal is submitted via drain pulses instead of gate pulses (Figure S17, Supporting Information), due to the lower conductance window (see Figures S18 and S19, Supporting Information), hence the larger sensitivity to stochastic variations. The robustness to input noise was also assessed by performing Monte Carlo simulations with corrupted images, where 1, 2, or 3 bit flips at random locations were assumed for each input stream set. The accuracy generally drops as the number of noisy pixels increases, as shown in Figure 5b, owing to the additional variation due to the random bit flips. Figure 5c shows an example of an image dataset with one random bit flip. The corresponding confusion matrix for the Monte Carlo experiment is reported in Figure 5d, supporting the capability of noise rejection up to one bit flip. Note that the loss of accuracy for one-bit-flip images is mainly caused by the failure of a single digit, confirming the robustness of the RC system (see also Figure S20, Supporting Information).
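The image corruption used in the robustness test can be sketched as follows. The Python example inverts a chosen number of pixels at random positions in a binary image; choosing distinct positions is an assumption of this sketch, since the text only states that the flips occur at random locations. The function names, the bitmap, and the seed are hypothetical.

```python
import random

def flip_random_bits(image, n_flips, rng):
    """Return a copy of a binary image with n_flips distinct pixels inverted."""
    rows, cols = len(image), len(image[0])
    noisy = [list(row) for row in image]
    # Sample distinct flat pixel indices, then map each back to (row, column).
    for idx in rng.sample(range(rows * cols), n_flips):
        r, c = divmod(idx, cols)
        noisy[r][c] = 1 - noisy[r][c]
    return noisy

def hamming_distance(a, b):
    """Number of pixels that differ between two images of equal shape."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))

# Hypothetical 5x4 bitmap used only to exercise the function.
digit_zero = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]
rng = random.Random(0)  # fixed seed so the corruption is repeatable
for n in (1, 2, 3):
    noisy = flip_random_bits(digit_zero, n, rng)
    print(n, hamming_distance(digit_zero, noisy))
```

In a Monte Carlo run, each corrupted image would then be streamed through the reservoir and classified in the same way as a clean one.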

## 5. Conclusion

We have realized a CTM device based on a MoS<sub>2</sub> channel, metallic source/drain contacts and a memory effect arising from trapping/detrapping at the interface states between MoS<sub>2</sub> and the SiO<sub>2</sub> gate dielectric. The application of voltage pulses of equal amplitude at the gate (negative amplitude) or at the drain (positive amplitude) causes a linear increase of the channel conductance due to the negative $V_{\rm T}$ drift. Synaptic potentiation characteristics with high linearity are obtained, thus supporting the potential of the CTM device for synaptic applications. In addition, the relatively long retention time, on the order of a few seconds, is extremely useful for neuromorphic applications such as gesture and speech recognition. Reservoir computing for the recognition and classification of pattern images is also demonstrated with a test accuracy of 95.5% over 2000 test images and a good robustness against random corruption of the input pattern. The low-power consumption, the high scalability and integrability into in-memory computing frameworks make the MoS<sub>2</sub>-based CTM device an attractive technology solution for in-memory computing applications.





## 6. Experimental Section

*Device Fabrication*: The MoS<sub>2</sub>-based charge-trap memory devices were fabricated on a p-doped silicon substrate with a 285 nm thick top layer of SiO<sub>2</sub> grown in an STS Multiplex PECVD deposition tool. The thickness of the oxide was measured using a J. A. Woollam VASE ellipsometer. The resulting substrate sample was then cleaned in acetone and isopropyl alcohol (IPA). To entirely remove any residual organic material, the sample was treated with oxygen plasma before the transfer of the MoS<sub>2</sub> flakes, using a Plasma Asher PVA TEPLA 300 AL, 200 W power, 2 min. The MoS<sub>2</sub> was transferred by mechanical exfoliation and the flakes of interest were selected with an optical microscope, exploiting the blue contrast obtained with the chosen oxide thickness. As a confirmation, some flakes were also characterized with an atomic force microscope (Keysight 5600LS) in contact mode. Source and drain electrodes were patterned by electron-beam lithography using poly(methyl methacrylate) (PMMA) resist and realized with thermally evaporated silver. Typical channel lengths were between 50 and 100 nm, while flake thickness was below 4 nm.

*Electrical Characterization*: The MoS<sub>2</sub> CTM was characterized in a probe station using a Keithley 4200A-SCS semiconductor parameter analyzer. Pulsed measurements were carried out using a custom Keithley library (with some modules) developed for this purpose. Experiments for the RC system were carried out by applying all 16 pulsed data streams multiple times. Each pulse consists of a program phase and a read phase. In the case of zero code, the program phase has an amplitude of 3 V and the same width as the read phase (200 ms). The same pulse width was used for gate pulses, while the amplitude was larger in magnitude (–5 V). To avoid correlation between measurements, each stream was applied after a complete sweep of the gate from –40 to +40 V.

*Classification Network Simulation*: Simulations were performed in MATLAB. Each output of the reservoir was chosen randomly from a set of measurements. For robustness against variations of the initial device state, the difference between the final and the initial conductance state ( $\Delta G$ ) was taken as the output of each CTM. The feedforward network also had an additional input (the bias), set to the mean value of the current reservoir output state. Each neuron had a sigmoid-type activation function with a characteristic slope of 0.001. Network weights were initialized to random values. Training was performed using a dataset of 2000 digit images, equally distributed among the digits. The output of the reservoir was always randomly chosen from an experimental stream set comprising 100 experiments. At the end of the classification phase, the target label matrix Y was converted to a summation matrix S by applying the sigmoid inverse. The weights were then obtained in one step using the pseudoinverse, with the operation:

$$W = X^+ \cdot S \tag{2}$$

where  $X^+$  is the pseudoinverse of the reservoir output. The test accuracy was computed by performing Monte Carlo simulations with datasets of 2 million images and random CTM response. Images were classified according to the highest neuronal output of the final classification layer. Noisy images were generated by performing random bit flips on the dataset images.
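The one-step training of Equation (2) can be sketched in a few lines. The following is a minimal illustration in Python/NumPy rather than the MATLAB code used for the simulations, and it runs on synthetic data, not on measured CTM responses. The logistic form of the activation, the clipping of the targets before inversion, the constant bias value, and all function names are assumptions introduced here for illustration. The slope is left at 1.0 for the synthetic data, whereas the text reports a characteristic slope of 0.001 for the measured quantities.

```python
import numpy as np

def sigmoid(s, slope=1.0):
    """Logistic activation; the exact form used in the paper is an assumption."""
    return 1.0 / (1.0 + np.exp(-slope * s))

def sigmoid_inverse(y, slope=1.0, eps=1e-3):
    """Inverse of the logistic function.
    Targets are clipped away from 0 and 1 (assumed) to keep the log finite."""
    y = np.clip(y, eps, 1.0 - eps)
    return np.log(y / (1.0 - y)) / slope

def train_readout(X, Y, slope=1.0):
    """One-step training, W = X^+ . S, with S the sigmoid inverse of Y."""
    S = sigmoid_inverse(Y, slope)
    return np.linalg.pinv(X) @ S

def classify(X, W, slope=1.0):
    """Return the index of the highest output neuron for each row of X."""
    return np.argmax(sigmoid(X @ W, slope), axis=1)

def flip_random_bits(image, p_flip, rng):
    """Flip each pixel of a binary image independently with probability p_flip."""
    mask = rng.random(image.shape) < p_flip
    return np.where(mask, 1 - image, image)

def synthetic_reservoir(n_per_class, centers, rng, noise=0.05):
    """Synthetic stand-in for Delta-G reservoir outputs (NOT measured data)."""
    labels = np.repeat(np.arange(centers.shape[0]), n_per_class)
    feats = centers[labels] + noise * rng.normal(size=(labels.size, centers.shape[1]))
    return feats, labels

def add_bias(feats, bias):
    """Append a constant bias column to the reservoir outputs."""
    return np.hstack([feats, np.full((feats.shape[0], 1), bias)])

rng = np.random.default_rng(0)
centers = rng.normal(size=(3, 8))      # 3 classes, 8 reservoir outputs (arbitrary sizes)
train_x, train_y = synthetic_reservoir(50, centers, rng)
test_x, test_y = synthetic_reservoir(50, centers, rng)

bias = train_x.mean()                  # one possible reading of "mean of the reservoir output"
W = train_readout(add_bias(train_x, bias), np.eye(3)[train_y])
accuracy = np.mean(classify(add_bias(test_x, bias), W) == test_y)
print(f"test accuracy on synthetic data: {accuracy:.3f}")
```

Because the readout is linear before the activation, inverting the sigmoid on the targets turns training into an ordinary least-squares problem, which the pseudoinverse solves without iteration. The `flip_random_bits` helper only illustrates how noisy binary inputs could be produced. In the actual flow the corrupted image would first be applied to the reservoir.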

*Statistical Analysis*: Conductance points in Figures 2a,b and 3a–d were obtained by averaging 80 measured current points during the read phase (sample time = 200  $\mu$s, or 20  $\mu$s for the faster pulses). Each curve is the result of a single experiment (no averaging over multiple experiments), and no pre-processing of the data was done. For the reservoir computing results, the test accuracy for a single training was obtained by testing 1000 images (100 for each digit) 2000 times. The confusion matrices show the result of 200 000 images for each digit. Figure 5b shows the median values and the 25th and 75th percentiles of the 2000 test accuracies.
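The data reduction described above can be summarized by the following minimal Python/NumPy sketch. The read voltage `v_read`, the numerical values, and the function names are hypothetical and serve only to show the averaging of the read-phase current samples, the $\Delta G$ computation, and the median/percentile summary of repeated test accuracies.

```python
import numpy as np

def read_conductance(current_samples, v_read):
    """Average the current samples of one read phase and convert to conductance."""
    return float(np.mean(current_samples)) / v_read

def delta_g(g_final, g_initial):
    """Conductance change used as the output of each CTM."""
    return g_final - g_initial

def accuracy_summary(accuracies):
    """Median and 25th/75th percentiles of repeated test accuracies."""
    q25, median, q75 = np.percentile(accuracies, [25, 50, 75])
    return median, q25, q75

# Illustrative numbers only (not measured data)
v_read = 0.1                               # hypothetical read voltage, in volts
initial = np.full(80, 1e-6)                # 80 current samples before stimulation, in amperes
final = np.full(80, 2e-6)                  # 80 current samples after stimulation, in amperes
dg = delta_g(read_conductance(final, v_read), read_conductance(initial, v_read))

accuracies = np.array([0.90, 0.92, 0.94, 0.96, 0.98])
median, q25, q75 = accuracy_summary(accuracies)
```

Reporting the median with the interquartile range, as in Figure 5b, keeps the summary insensitive to a few unusually good or bad Monte Carlo runs.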

## **Supporting Information**

Supporting Information is available from the Wiley Online Library or from the author.




## Acknowledgements

The authors would like to thank Claudio Somaschini, Marco Asa, Andrea Scaccabarozzi, Chiara Nava, and Elisa Sogne for help in the fabrication process and Edoardo Albisetti for help in the AFM characterization. This work was partially performed in Polifab, the micro- and nanofabrication facility of Politecnico di Milano. This article received funding from the European Union's Horizon 2020 research and innovation program (grant agreement no. 824164).

Open access Funding provided by Politecnico di Milano within the CRUI-CARE Agreement.

## **Conflict of Interest**

The authors declare no conflict of interest.

## **Data Availability Statement**

The data that support the findings of this study are available from the corresponding author upon reasonable request.

## **Keywords**

2D semiconductors, charge-trap memory, neural networks, neuromorphic engineering, reservoir computing

Received: June 14, 2022 Revised: September 15, 2022 Published online: November 16, 2022

- [1] A. Sebastian, M. L. Gallo, R. Khaddam-Aljameh, E. Eleftheriou, Nat. Nanotechnol. 2020, 15, 529.
- [2] D. Ielmini, G. Pedretti, Adv. Intell. Syst. 2020, 2, 2000040.
- [3] D. Ielmini, S. Ambrogio, Nanotechnology 2020, 31, 092001.
- [4] N. K. Upadhyay, H. Jiang, Z. Wang, S. Asapu, Q. Xia, J. J. Yang, Adv. Mater. Technol. 2019, 4, 1800589.
- [5] W. Choi, N. Choudhary, G. H. Han, J. Park, D. Akinwande, Y. H. Lee, *Mater. Today* 2017, 3, 116.
- [6] G. Fiori, F. Bonaccorso, G. Iannaccone, T. Palacios, D. Neumaier, A. Seabaugh, S. K. Banerjee, L. Colombo, *Nat. Nanotechnol.* 2014, 9, 768.
- [7] P. P. Tummala, C. Martella, A. Molle, A. Lamperti, Nanomaterials 2022, 12, 973.
- [8] D. Chiappe, E. Scalise, E. Cinquanta, C. Grazianetti, B. van den Broek, M. Fanciulli, M. Houssa, A. Molle, Adv. Mater. 2014, 26, 2096.
- [9] F. Tumino, C. S. Casari, M. Passoni, V. Russo, A. L. Bassi, Nanoscale Adv. 2019, 1, 643.
- [10] C. Liu, H. Chen, S. Wang, Q. Liu, Y.-G. Jiang, D. W. Zhang, M. Liu, P. Zhou, Nat. Nanotechnol. 2020, 15, 545.
- [11] V. K. Sangwan, H.-S. Lee, H. Bergeron, I. Balla, M. E. Beck, K.-S. Chen, M. C. Hersam, *Nature* **2018**, *554*, 500.
- [12] Q. Zhang, H. Xiong, Q. Wang, L. Xu, M. Deng, J. Zhang, D. Fuchs, W. Li, L. Shang, Y. Li, Z. Hu, J. Chu, *Adv. Electron. Mater.* 2022, *8*, 2101189.
- [13] M. Farronato, M. Melegari, S. Ricci, S. Hashemkhani, A. Bricalli, D. Ielmini, Adv. Electron. Mater. 2022, 8, 2101161.
- [14] B. Wang, X. Wang, E. Wang, C. Li, R. Peng, Y. Wu, Z. Xin, Y. Sun, J. Guo, S. Fan, C. Wang, J. Tang, K. Liu, *Nano Lett.* **2021**, *21*, 10400.
- [15] S. Bhattacharjee, R. Wigchering, H. G. Manning, J. J. Boland, P. K. Hurley, *Sci. Rep.* **2020**, *10*, 12178.
- [16] G. Ding, B. Yang, R. Chen, W. Mo, K. Zhou, Y. Liu, G. Shang, Y. Zhai, S. Han, Y. Zhou, Small 2021, 17, 2103175.

- [17] F.-S. Yang, M. Li, M.-P. Lee, I.-Y. Ho, J.-Y. Chen, H. Ling, Y. Li, J.-K. Chang, S.-H. Yang, Y.-M. Chang, K.-C. Lee, Y.-C. Chou, C.-H. Ho, W. Li, C.-H. Lien, Y.-F. Lin, *Nat. Commun.* **2020**, *11*, 2972.
- [18] M. Lukoševičius, H. Jaeger, Comput. Sci. Rev. 2009, 3, 127.
- [19] Z. Sun, G. Pedretti, A. Bricalli, D. Ielmini, *Sci. Adv.* **2020**, *6*, eaay2378.
- [20] T. Gokmen, Y. Vlasov, Front. Neurosci. 2016, 10, 333.
- [21] B. Radisavljevic, A. Radenovic, J. Brivio, V. Giacometti, A. Kis, Nat. Nanotechnol. 2011, 6, 147.
- [22] S. Das, H.-Y. Chen, A. V. Penumatcha, J. Appenzeller, Nano Lett. 2013, 13, 100.
- [23] D. Liu, Y. Guo, L. Fang, J. Robertson, Appl. Phys. Lett. 2013, 103, 183113.
- [24] D. S. Schulman, A. J. Arnold, S. Das, Chem. Soc. Rev. 2018, 47, 3037.
- [25] R. Addou, L. Colombo, R. M. Wallace, ACS Appl. Mater. Interfaces 2015, 7, 11921.
- [26] J. Hong, Z. Hu, M. Probert, K. Li, D. Lv, X. Yang, L. Gu, N. Mao, Q. Feng, L. Xie, J. Zhang, D. Wu, Z. Zhang, C. Jin, W. Ji, X. Zhang, J. Yuan, Z. Zhang, *Nat. Commun.* **2015**, *6*, 6293.
- [27] Y. Y. Illarionov, T. Knobloch, M. Waltl, G. Rzepa, A. Pospischil, D. K. Polyushkin, M. M. Furchi, T. Mueller, T. Grasser, 2D Mater. 2017, 4, 025108.
- [28] Y. Y. Illarionov, T. Knobloch, M. Jech, M. Lanza, D. Akinwande, M. I. Vexler, T. Mueller, M. C. Lemme, G. Fiori, F. Schwierz, T. Grasser, *Nat. Commun.* 2020, *11*, 3385.
- [29] B. Stampfer, F. Zhang, Y. Y. Illarionov, T. Knobloch, P. Wu, M. Waltl, A. Grill, J. Appenzeller, T. Grasser, ACS Nano 2018, 12, 5368.
- [30] A. Di Bartolomeo, L. Genovese, F. Giubileo, L. Iemmo, G. Luongo, T. Foller, M. Schleberger, 2D Mater. 2017, 5, 015014.

- [31] Y. Y. Illarionov, G. Rzepa, M. Waltl, T. Knobloch, A. Grill, M. M. Furchi, T. Mueller, T. Grasser, 2D Mater. 2016, 3, 035004.

- [32] F. Urban, F. Giubileo, A. Grillo, L. Iemmo, G. Luongo, M. Passacantando, T. Foller, L. Madauß, E. Pollmann, M. P. Geller, D. Oing, M. Schleberger, A. Di Bartolomeo, 2D Mater. 2019, 6, 045049.
- [33] S. Yu, Proc. IEEE 2018, 106, 260.
- [34] J. Woo, K. Moon, J. Song, S. Lee, M. Kwak, J. Park, H. Hwang, IEEE Electron Device Lett. 2016, 37, 994.
- [35] S. H. Jo, T. Chang, I. Ebong, B. B. Bhadviya, P. Mazumder, W. Lu, Nano Lett. 2010, 10, 1297.
- [36] J.-W. Jang, S. Park, G. W. Burr, H. Hwang, Y.-H. Jeong, IEEE Electron Device Lett. 2015, 36, 457.
- [37] J. Tang, D. Bishop, S. Kim, M. Copel, T. Gokmen, T. Todorov, S. Shin, K.-T. Lee, P. Solomon, K. Chan, W. Haensch, J. Rozen, 2018 IEEE Int. Electron Devices Meeting (IEDM), IEEE, Piscataway, NJ, USA 2018, https://doi.org/10.1109/IEDM.2018.8614551
- [38] S. Hao, X. Ji, F. Liu, S. Zhong, K. Y. Pang, K. G. Lim, T. C. Chong, R. Zhao, ACS Appl. Nano Mater. 2021, 4, 1766.
- [39] W. Hu, J. Jiang, D. Xie, B. Liu, J. Yang, J. He, J. Mater. Chem. C 2019, 7, 682.
- [40] A. J. Arnold, A. Razavieh, J. R. Nasr, D. S. Schulman, C. M. Eichfeld, S. Das, ACS Nano 2017, 11, 3110.
- [41] G. Milano, G. Pedretti, K. Montano, S. Ricci, S. Hashemkhani, L. Boarino, D. Ielmini, C. Ricciardi, *Nat. Mater.* 2022, 21, 195.
- [42] J. Li, C. Zhao, K. Hamedani, Y. Yi, 2017 Int. Joint Conf. on Neural Networks (IJCNN), IEEE, Piscataway, NJ, USA 2017, pp. 3439–3446.
- [43] C. Du, F. Cai, M. A. Zidan, W. Ma, S. H. Lee, W. D. Lu, Nat. Commun. 2017, 8, 2204.
- [44] P. Mannocci, G. Pedretti, E. Giannone, E. Melacarne, Z. Sun, D. Ielmini, *IEEE Trans. Circuits Syst. I* 2021, 68, 4889.