Multiple machine learning approach to characterize two-dimensional nanoelectronic devices via featurization of charge fluctuation

Lee, Kookjin; Nam, Sangjin; Ji, Hyunjin; Choi, Junhee; Jin, Jun-Eon; Kim, Yeonsu; Na, Junhong; Ryu, Min-Yeul; Cho, Young-Hoon; Lee, Hyebin; Lee, Jaewoo; Joo, Min-Kyu; Kim, Gyu-Tae

doi:10.1038/s41699-020-00186-w


Article
Open access
Published: 04 January 2021


Kookjin Lee ORCID: orcid.org/0000-0002-9896-1090^1,2,3^na1,
Sangjin Nam⁴^na1,
Hyunjin Ji⁵,
Junhee Choi⁶,
Jun-Eon Jin⁷,
Yeonsu Kim³,
Junhong Na⁸,
Min-Yeul Ryu⁸,
Young-Hoon Cho⁷,
Hyebin Lee⁷,
Jaewoo Lee³,
Min-Kyu Joo ORCID: orcid.org/0000-0001-7537-1015^9,10 &
…
Gyu-Tae Kim³

npj 2D Materials and Applications volume 5, Article number: 4 (2021)


Subjects

Electronic devices

Abstract

Two-dimensional (2D) layered materials such as graphene, molybdenum disulfide (MoS₂), tungsten diselenide (WSe₂), and black phosphorus (BP) provide unique opportunities to identify the origin of current fluctuation, mainly arising from their large surface areas compared with those of their bulk counterparts. Among numerous material characterization techniques, nondestructive low-frequency (LF) noise measurement has received significant attention as an ideal tool to identify a dominant scattering origin such as imperfect crystallinity, phonon vibration, interlayer resistance, the Schottky barrier inhomogeneity, and traps and/or defects inside the materials and dielectrics. Despite the benefits of LF noise analysis, however, the large amount of time-resolved current data and the subsequent data fitting process required generally cause difficulty in interpreting LF noise data, thereby limiting its availability and feasibility, particularly for 2D layered van der Waals hetero-structures. Here, we present several model algorithms, which enable the classification of important device information such as the type of channel materials, gate dielectrics, contact metals, and the presence of chemical and electron beam doping using more than 100 LF noise data sets under 32 conditions. Furthermore, we provide insights about the device performance by quantifying the interface trap density and Coulomb scattering parameters. Consequently, the pre-processed 2D array of Mel-frequency cepstral coefficients, converted from the LF noise data of devices undergoing the test, leads to superior efficiency and accuracy compared with that of previous approaches.


Introduction

Low-frequency (LF) 1/f noise spectroscopy is a nondestructive defect diagnosis tool, which identifies dominant scattering origins. Such scattering origins are caused by imperfect crystallinity, lattice vibration, surface trap distribution, and channel and dielectric defects, in addition to the Schottky barrier inhomogeneity at the metal-semiconductor interface in semiconductor devices^1,2,3,4,5,6. However, as the size of the channel material decreases, particularly in the case of two-dimensional (2D) layered materials, their atomically thin nature with a large surface-to-volume ratio makes it significantly difficult to investigate them using LF 1/f noise analysis as compared with their bulk silicon counterparts^7,8. Conventionally, the time-resolved current (I) variation in electronic devices has been ascribed to the carrier number and/or mobility fluctuation⁹; ${\Delta}I(t) \propto q\mu \left( {{\Delta}N} \right) + q\left( {{\Delta}\mu } \right)N$, where q, μ, and N denote the elementary unit charge, carrier mobility, and number of charge carriers, respectively. However, since the inherent vulnerability of 2D materials to surrounding interfaces considerably influences the charge fluctuation, this high sensitivity of LF noise features would reflect the individual effects of both channel and dielectric materials in addition to the presence of chemical/electrical doping^10,11.
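The proportionality quoted above can be read as a first-order expansion. As a sketch, assuming (for illustration only) that the current at fixed bias scales as the product $q\mu N$, small fluctuations in $N$ and $\mu$ give:

$$I \propto q\mu N \;\Rightarrow\; {\Delta}I \approx \frac{\partial I}{\partial N}{\Delta}N + \frac{\partial I}{\partial \mu}{\Delta}\mu \propto q\mu \left( {\Delta}N \right) + q\left( {\Delta}\mu \right)N$$

The first term is the carrier number fluctuation contribution and the second is the mobility fluctuation contribution; the second-order product ${\Delta}N{\Delta}\mu$ is neglected.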

Thus far, numerous LF noise features have been reported on 2D materials, such as the presence of electron-hole puddle induced charge scattering on graphene¹², the Coulomb scattering suppression via high-κ passivation of black phosphorus (BP)¹³, the promotion of charge fluctuation in molybdenum disulfide (MoS₂) due to the height and inhomogeneity of the Schottky barrier¹¹, the anisotropic LF noise feature of rhenium disulfide (ReS₂)¹⁴, and the thickness-dependent Coulomb scattering parameter of molybdenum ditelluride (MoTe₂)¹⁵. These studies indicate the high feasibility of LF noise spectroscopy as a tool to classify the material and device properties. Nevertheless, the origin of carrier fluctuation, occurring either in the 2D layered material itself or at the interface between the 2D layered material and the gate dielectric, has not been identified clearly. Moreover, it is significantly difficult to identify an individual noise source from the LF noise data without appropriate data processing for the model-dependent LF noise analysis.

Most recently, the combination of artificial-intelligence (AI) based approach and scientific data analysis has been widely considered in various applications such as healthcare^16,17, image recognition^18,19, voice search²⁰, and molecular/material science^21,22. Further, it has also been determined that these combined techniques are suitable for solving the problems associated with non-linear processes or enormous combinatorial spaces with high efficiency^16,22,23,24. This clearly indicates that the machine learning (ML) and deep learning (DL) approaches can provide a better optimization and decision-making by converging the scientific data and extracting interpretable models from these data automatically^22,25. Recently, studies on applying ML or DL to analysis of 2D layered materials have been widely conducted^26,27,28,29.

In this study, we introduce an effective technique to classify and infer the characteristics of current fluctuation with high efficiency and precision by combining AI and LF noise spectroscopy. Because 2D FETs are similar in fabrication process, geometry, bandgap, and mobility, classifying them using only fundamental DC analysis of transfer curves and output characteristics is very difficult. On the other hand, since LF noise data measure the tiny fluctuations of carriers in the channel over time, they make it easier to characterize each individual FET. Based on the time-resolved ΔI(t) measured from various 2D material-based field-effect transistors (FETs), 2D arrays of the Mel-frequency cepstral coefficient (MFCC) for several electronic properties were considered, and the corresponding features were obtained via a hidden Markov model (HMM). The HMM has the disadvantages that it requires a relatively large amount of data and can hardly express dependencies between hidden states. However, the HMM is suitable for processing a large amount of LF noise data because it has a strong statistical foundation and enables efficient learning from raw sequence data. This approach allows us to automatically identify essential device information such as the type of 2D channel materials and gate dielectrics, interface trap density (N_it), Coulomb scattering parameter (α_SC), and the presence of chemical and electron beam doping. The combination of factors such as channel material, gate dielectric, contact metal, and electron beam irradiation significantly affects carrier fluctuations as a function of time. This combination, which has more than 100 LF noise data sets under 32 conditions, becomes a catalyst for machine learning that automatically and effectively classifies the characteristics of various nanoelectronic devices. In addition, the obtained LF noise spectroscopy data are highly interpretable via machine learning techniques, thereby identifying the contribution of engineered features in characterizing the device information and performance.

Results

Workflow for audio and current signal classification

The decimal data type, measured in the time domain, has been used generally for ML and DL in data science; however, the Fourier transform (FT) of these data is frequently employed in ML algorithms to improve data interpretation^30,31. The process of transforming raw data into a suitable representation for a learning algorithm is often called featurization. For instance, in speech recognition, proper methodology has been widely studied to convert a signal from the time domain to the frequency domain for more accurate classification and analysis^31,32,33. A typical data representation method that extracts the characteristics of the original audio signal through the Mel-frequency cepstral coefficient is illustrated in Fig. 1a. Each speech frame of the time domain signal is first obtained through the pre-emphasis, framing, windowing, and other processing of the original audio signal as expressed in schematic (i) of Fig. 1a. Subsequently, the speech signals comprising a 30 ms frame window are fast Fourier transformed (FFTed) with a Hamming window. Further, each spectrum signal is processed by Mel filters (26 filters) to obtain the corresponding Mel-frequency spectrum. Finally, the Mel-frequency spectra are processed using the discrete cosine transform (DCT) to acquire the MFCCs in the cepstral domain as shown in schematic (ii) of Fig. 1a.
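The first stages of this pipeline (pre-emphasis, framing, and Hamming windowing) can be sketched in plain Python as follows. This is a minimal illustration and not the authors' code: the pre-emphasis coefficient of 0.97, the frame step, and all function names are assumptions chosen here for the example.

```python
import math


def pre_emphasis(signal, coeff=0.97):
    """Apply y[n] = x[n] - coeff * x[n-1]; coeff=0.97 is an assumed, commonly used value."""
    return [signal[0]] + [signal[n] - coeff * signal[n - 1] for n in range(1, len(signal))]


def frame_signal(signal, frame_len, frame_step):
    """Split a signal into overlapping frames, dropping any incomplete tail."""
    n_frames = 1 + (len(signal) - frame_len) // frame_step
    return [signal[i * frame_step: i * frame_step + frame_len] for i in range(n_frames)]


def hamming(frame_len):
    """Hamming window coefficients for a frame of frame_len samples."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]


def windowed_frames(signal, frame_len, frame_step, coeff=0.97):
    """Pre-emphasize, frame, and window a signal; returns a list of frames."""
    emphasized = pre_emphasis(signal, coeff)
    window = hamming(frame_len)
    return [[s * w for s, w in zip(frame, window)]
            for frame in frame_signal(emphasized, frame_len, frame_step)]


# Hypothetical example: 100 samples, 30-sample frames, 10-sample step.
example = windowed_frames([math.sin(0.1 * n) for n in range(100)], 30, 10)
```

Each windowed frame would then be passed to an FFT, followed by Mel filtering and the DCT.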

**Fig. 1: Workflow comparison between speech and device characteristic classification.**

We employ this data processing algorithm for a number of ΔI(t) data obtained from various 2D material based FETs which have been fabricated and analyzed under various experimental conditions such as different gate dielectrics^11,13,34,35, temperatures^11,34,36, channel materials^10,11,13,34,35,37,38,39, chemical/electron beam doping^40,41, and source/drain contact metals^11,34 (see Fig. 1b). More than 100 LF noise data sets of various 2D layered FETs were considered in this study under 32 different conditions at a particular gate (V_G) and drain (V_D) bias condition. In contrast to the audio signal shown in Fig. 1a, after performing the additional signal normalization process, each MFCC of the current signal in the cepstral domain is consequently determined via FFT and DCT as displayed in schematic (i) and (ii) of Fig. 1b. The MFCCs of the audio and current signals, which comprise the 2D array, are respectively used in speech recognition and device classification (materials/characteristics) through the inference process using ML with the optimized algorithm (see Fig. 1c). The device conditions that the trained ML algorithm distinguishes are as follows (see Fig. 1d): BP, graphene, MoS₂, ReS₂, MoTe₂, and tungsten diselenide (WSe₂) were used as channel materials; h-BN and SiO₂ were employed as gate oxides; Ti, Au, Pt, and Cr were used as the contact metals; and passivation, temperature variations, triethanolamine (TEOA) doping, and electron beam irradiation were considered as the different external factors.

Process flowchart for learning and classifying 2D transistor

The ΔI(t) of 2D material-based FETs under several conditions were measured at a particular V_G and V_D (see Fig. 2a) in a shielding metal box (see Fig. S2 and Note 2 in the Supplementary Materials for details of the LF noise measurement system)⁴². The drain current I_D can be defined as the sum of the average (DC) drain current ($\overline {I_{\mathrm{D}}}$) and the current fluctuations (ΔI_D); $I_{\mathrm{D}} = \overline {I_{\mathrm{D}}} + {\mathrm{{\Delta}}}I_{\mathrm{D}}$^3,4,5,6,9. Since the amplitude of ΔI_D is substantially smaller than I_D, ΔI_D is generally converted to the voltage signal using the low-noise current-to-voltage preamplifier, as depicted in Fig. 2b. The amplified noise signal was considered as the input ΔI_D(t) data used in Python, where the amplitude normalization and pre-emphasis processes were performed, as presented in Fig. 2c. Subsequently, the preprocessed ΔI_D(t) data were separated into specific frames with respect to the time domain, and FFT was performed on these data. The transformed data produced by each frame were expressed as power spectral density (S_I) in the frequency domain, and all S_I were filtered onto the Mel scale. This transformation of specific frames into S_I allowed the evaluation of periodic spectra, and the amount of spectral energy between frequencies could then be obtained by combining the respective frames. It was observed that the Mel-scale filter interval increased with frequency, i.e., it was narrow around low frequencies and became wider at higher frequencies, indicating that the Mel-scale filter emphasized the amount of energy around low frequencies (see Fig. S4 and Note 3 in the Supplementary Materials)^30,43,44,45,46,47.
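The narrow-to-wide spacing of the Mel filters can be illustrated with a short sketch. It uses a commonly used Mel-scale formula, mel = 2595 log10(1 + f/700); the 0 to 500 Hz range is a hypothetical choice for this example, the filter count of 26 is borrowed from the speech example above, and the function names are illustrative rather than taken from the authors' code.

```python
import math


def hz_to_mel(f_hz):
    """Convert frequency in Hz to mel using a commonly used formula."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(f_min, f_max, n_filters):
    """Centre frequencies (Hz) of n_filters filters equally spaced on the mel axis."""
    lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(n_filters)]


# Hypothetical band of 0-500 Hz covered by 26 filters.
centers = mel_center_frequencies(0.0, 500.0, 26)
# Distance between neighbouring filter centres, in Hz.
spacings = [b - a for a, b in zip(centers, centers[1:])]
```

Because the centres are equally spaced in mel, `spacings` grows monotonically in Hz, so more filters sit in the low-frequency region where the 1/f noise carries most of its information.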

**Fig. 2: Flowchart for learning and classifying characteristics of 2D transistors.**

The obtained data, which are called Mel-frequency spectra and mainly used for learning, were consequently more sensitive to the low frequency values, allowing a precise carrier scattering analysis in the devices. Subsequently, the Mel-frequency spectra were transformed through the DCT and extracted to a finite data point sequence, composed of the current MFCCs in the cepstral domain^43,44,45. Further, all S_I filtered by the Mel scale were overlapped, indicating the existence of correlations between the spectral densities. These correlations could be separated using the DCT method. The current MFCCs transformed by the DCT were expressed as the change in filter energy, and a part of them was extracted to store data as 2D arrays, as demonstrated in the schematic in Fig. 2c^30,47. Based on the research conducted thus far, the engineered current MFCC features were characterized into 2D arrays with a number between 200 and 1000 for each device class.
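The DCT step can be sketched as below. This is an unnormalised DCT-II applied to the logarithm of the filter energies, which is the usual MFCC construction; the scaling convention, the use of the natural logarithm, and the function names are assumptions made for this illustration. Filter energies are assumed to be strictly positive.

```python
import math


def dct_ii(values):
    """Unnormalised DCT-II: c[k] = sum_n v[n] * cos(pi * k * (n + 0.5) / N)."""
    n_total = len(values)
    return [
        sum(v * math.cos(math.pi * k * (n + 0.5) / n_total) for n, v in enumerate(values))
        for k in range(n_total)
    ]


def cepstral_coefficients(filter_energies, n_keep):
    """Log-compress positive filter energies, apply the DCT, keep the first n_keep terms."""
    log_energies = [math.log(e) for e in filter_energies]
    return dct_ii(log_energies)[:n_keep]
```

For a perfectly flat filter-energy profile only the zeroth coefficient is non-zero, which shows how the DCT concentrates the correlated part of overlapping filter outputs into a few coefficients.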

Every engineered current MFCC feature was stored by class, based on the device conditions, and the features were learned and classified using ML with an HMM algorithm and DL with an NN, as illustrated in Fig. 2d, e. The HMM based on the Markov chain^30,31,48,49 in Fig. 2d was the first algorithm model used for learning and classifying the data in this study. In HMM, the unaligned training sequences are processed by iteratively evaluating the data stored as current MFCCs. For all the training parameters, the estimates with prior probability distributions are assumed using a maximum a posteriori approach^30,50,51. The scores for the 32 classes were calculated as Y, as shown in Fig. 2d. Further, the class with the highest score was determined, and could be used to infer the device conditions as shown in Fig. 2f.
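The score-and-select step can be sketched with a hand-written forward algorithm. This is an illustration only: it uses a one-dimensional Gaussian-emission HMM in plain Python rather than the hmmlearn library used in this work, the two class models and all their parameters are hypothetical rather than trained, and all probabilities are assumed to be strictly positive.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def hmm_log_likelihood(obs, start, trans, means, variances):
    """Forward algorithm in the log domain; returns log P(obs | model)."""
    n_states = len(start)
    alpha = [math.log(start[s]) + log_gaussian(obs[0], means[s], variances[s])
             for s in range(n_states)]
    for x in obs[1:]:
        alpha = [
            logsumexp([alpha[p] + math.log(trans[p][s]) for p in range(n_states)])
            + log_gaussian(x, means[s], variances[s])
            for s in range(n_states)
        ]
    return logsumexp(alpha)


def classify(obs, models):
    """Score obs under every class model and return (best label, all scores)."""
    scores = {name: hmm_log_likelihood(obs, **params) for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-class example; real models would be trained per device class.
MODELS = {
    "class_A": dict(start=[0.6, 0.4], trans=[[0.7, 0.3], [0.4, 0.6]],
                    means=[0.0, 1.0], variances=[0.5, 0.5]),
    "class_B": dict(start=[0.5, 0.5], trans=[[0.9, 0.1], [0.2, 0.8]],
                    means=[3.0, 4.0], variances=[0.5, 0.5]),
}
```

The dictionary of per-class log-likelihoods plays the role of the score vector Y, and the label with the largest entry is the inferred class.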

The second method used was the NN^30,52,53, which is one of the DL methods. In this algorithm, the Y values of classes, which had been calculated by HMM, were classified by performing one additional learning step. In this method, the input score vectors, Y, were transferred to the first layer (layer-1) with 32 perceptrons, which is the number of classes, and were then transferred to the second layer (layer-2) by employing a rectified linear unit (ReLU) function as the activation function^54,55. Instead of the widely used sigmoid function, we considered a ReLU function here as the activation function because of its sparse activation property: it outputs zero for a negative input, so only part of the network is activated^30,54,55. Subsequently, the score vector data were classified using the softmax function in layer-2, which normalizes the scores into the probability of each class. The softmax function is obtained by dividing the exponential of each class score by the sum of the exponentials of all K class scores (here K = 32), as described below^30,54:

$${\mathrm{S}}\left( {y_i} \right) = \frac{{e^{y_i}}}{{\mathop {\sum}\nolimits_{j = 1}^K {e^{y_j}} }}$$

(1)

Compared to the HMM method, the second method had the advantage of classifying the score via repetitive training and learning, which was performed to determine the maximum value of the scores, Y, obtained through HMM. Finally, the device conditions could be inferred as indicated in Fig. 2f.
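The two activation functions named above can be written in a few lines. This is a generic sketch rather than the Keras layers used in this work; subtracting the maximum score before exponentiating is a standard numerical-stability step that leaves the softmax output unchanged.

```python
import math


def relu(values):
    """Rectified linear unit: negative inputs map to zero, others pass through."""
    return [max(0.0, v) for v in values]


def softmax(scores):
    """Normalise scores into probabilities that sum to one."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Because the exponential is monotonic, the class with the largest input score also receives the largest probability, so softmax changes the scale of the scores but not their ranking.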

Data featurization

Numerous electrical properties of FETs, such as the carrier type (electrons or holes), field-effect mobility, subthreshold swing, and current on/off ratio of the device-under-test (DUT) can be determined from the I_D – V_G transfer characteristics of 2D layered FETs (see Fig. 3a and Supplementary Materials Fig. S1). However, a precise classification of the 2D FETs with fundamental DC analysis is significantly challenging, due to the similarities of mobility, bandgap, geometry, and fabrication process of 2D FETs, except under a few specific conditions such as graphene FET. By contrast, ΔI_D(t) can be measured during 0.5 s at a particular V_G and V_D in a device belonging to a specific class (condition) after excluding $\overline {I_{\mathrm{D}}}$ as illustrated in Fig. 3b. It is noteworthy that we only considered ΔI_D(t) data where $\overline {I_{\mathrm{D}}}$ was larger than 100 nA to avoid a possible error caused by the minimum detection limit of our system. Subsequently, the current normalization process was performed for ΔI_D(t), and the normalized signal was divided into 11 frames with a 200 ms window. The S_I of each frame was obtained using FFT as shown in Fig. 3c and converted into a vector, x_n, possessing 100 current MFCC elements, a_nm, via Mel-scale filtering and DCT. The x_n for each frame was concatenated to create the current MFCC 2D array of each class (condition), X_(class)i, as indicated in Fig. 3d.

$$X_{(class)_i} = \left[ {x_1\,x_2\, \cdots \,x_n\, \cdots \,x_{10}\,x_{11}} \right]$$

(2)

$$x_n = \left[ {a_{n1}\,a_{n2} \cdots a_{nm} \cdots a_{n99}\,a_{n100}} \right]$$

(3)

where i depends on the specific voltage applied to the device belonging to the class (condition).
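The frame count and array shape in Eqs. (2) and (3) can be checked with a short calculation. The frame step is not stated in the text; the 30 ms step used below is an assumption, chosen because it is one value that yields 11 frames from a 0.5 s record with a 200 ms window. The function names are illustrative.

```python
def n_frames(total_ms, window_ms, step_ms):
    """Number of complete frames of window_ms taken every step_ms from total_ms."""
    return 1 + (total_ms - window_ms) // step_ms


def stack_frames(frame_vectors):
    """Stack per-frame coefficient vectors x_n into a 2D array (list of rows)."""
    width = len(frame_vectors[0])
    if any(len(v) != width for v in frame_vectors):
        raise ValueError("all frame vectors must have the same length")
    return [list(v) for v in frame_vectors]


# 0.5 s record, 200 ms window, assumed 30 ms step: 1 + (500 - 200) // 30 = 11
frames = n_frames(500, 200, 30)
# Placeholder zeros stand in for the 100 MFCC elements of each frame.
X = stack_frames([[0.0] * 100 for _ in range(frames)])
```

Under this assumption `X` has 11 rows of 100 elements, matching the 11 vectors x_1 to x_11 of 100 elements each.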

**Fig. 3: Detailed flowchart of ΔI_D featurization.**

In order to examine the high feasibility of our approach, we considered the carrier number fluctuation-correlated mobility fluctuation (CNF-CMF) model to interpret our ΔI_D(t) data (see the detailed LF noise theory in Supplementary Materials Note 2). This CNF-CMF model ascribes ΔI_D(t) to the carrier number fluctuation (CNF) caused by trapping/detrapping phenomena in the interface traps between the channel and gate dielectric in addition to the correlated mobility fluctuation. More specifically, ΔI_D(t) data can be influenced by many factors such as the carrier type of channel, interface quality and condition between the gate oxide and channel, and the presence of doping (see Fig. 3e). According to the CNF-CMF model, the drain current normalized S_I can be expressed as follows^5,6,9,56:

$$\frac{{S_{\mathrm{I}}}}{{\overline {I_{\mathrm{D}}} ^2}} = \frac{{q^2kTN_{{\mathrm{it}}}}}{{f^\gamma WLC_{{\mathrm{ox}}}^2}}\left( {1 + \frac{{\alpha _{{\mathrm{SC}}}\mu _{{\mathrm{eff}}}C_{{\mathrm{ox}}}\overline {I_{\mathrm{D}}} }}{{g_{\mathrm{m}}}}} \right)^2\left( {\frac{{g_{\mathrm{m}}}}{{\overline {I_{\mathrm{D}}} }}} \right)^2$$

(4)

where q is the carrier charge, k is the Boltzmann constant, T is the absolute temperature, f is the frequency, γ is the frequency exponent, W and L are the channel width and length, C_ox is the dielectric capacitance per unit area, g_m is the transconductance (=Δ$\overline {I_{\mathrm{D}}}$/ΔV_G), and μ_eff is the effective mobility. The first factor on the right-hand side of Eq. (4) is the flat-band voltage spectral density, S_Vfb. The trapped carriers near the channel-gate dielectric interface not only cause variations in $S_{V_{fb}}$, but also degrade electron mobility, resulting in modulation of the carrier density.
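Equation (4) can be evaluated directly, which makes its dependence on N_it, α_SC, and f easy to explore. The sketch below is a literal transcription of the formula; the function name and the parameter values in the example call are hypothetical placeholders, and the caller is assumed to supply all quantities in one self-consistent unit system.

```python
Q = 1.602176634e-19   # elementary charge (C)
K_B = 1.380649e-23    # Boltzmann constant (J/K)


def normalized_noise(f, i_d, g_m, n_it, alpha_sc, mu_eff, c_ox,
                     width, length, temp, gamma=1.0):
    """Drain-current-normalised noise S_I / I_D^2 according to Eq. (4)."""
    # Flat-band voltage spectral density (first factor of Eq. (4)).
    s_vfb = Q ** 2 * K_B * temp * n_it / (f ** gamma * width * length * c_ox ** 2)
    # Correlated mobility fluctuation term.
    cmf = 1.0 + alpha_sc * mu_eff * c_ox * i_d / g_m
    return s_vfb * cmf ** 2 * (g_m / i_d) ** 2


# Hypothetical placeholder values, for illustration only.
example = normalized_noise(f=10.0, i_d=1e-6, g_m=1e-6, n_it=1e12, alpha_sc=1e4,
                           mu_eff=10.0, c_ox=1e-8, width=1e-3, length=1e-4,
                           temp=300.0)
```

The expression is linear in N_it and, for γ = 1, inversely proportional to f, while α_SC enters only through the squared bracket.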

Figure 3f shows the representative S_I of each frame in the frequency domain among the fabricated DUTs. The observation of certain harmonics in the S_I could be attributed to the carrier trapping/de-trapping process in the gate oxide trap sites. In fact, these harmonics assisted in understanding the characteristics of each device and expressed these characteristics as spectral envelopes with specific peaks^32,33,57,58. Therefore, N_it and α_SC between the gate dielectric and the channel of each class (condition) have a significant effect on the spectral envelopes and unique characteristics of device (see Fig. 3g). As a result, the current MFCC 2D array comprises power spectral sequences for each frequency (amplified for the low frequency region), as demonstrated in Fig. 3h. The HMM algorithm that learns the previous state and infers the next state is efficient for learning the current MFCC 2D array that contains N_it, α_SC, and γ information according to the frequency sequence.

Current MFCC and classification accuracy

LF MFCC element parts (from a_n1 to a_n40) of engineered current MFCC at the same $\overline {I_{\mathrm{D}}}$ ≈ 1 μA in each class ((i) graphene on trench structure, (ii) MoS₂ on SiO₂, (iii) MoS₂ on SiO₂ after e-beam irradiation, and MoS₂ on h-BN at (iv) T = 25 K, (v) T = 100 K, and (vi) T = 200 K) are directly compared in Fig. 4a (see also Supplementary Materials Figs. S5–6 and Note 3). In all DUTs, we consistently observe the 1/f noise tendency. The current MFCC elements in the LF regime were considered to significantly contribute to learning and classification. Except for the graphene case (see (i) in Fig. 4a), the $S_{\mathrm{I}}/\overline {I_{\mathrm{D}}} ^2$ curves for all 2D materials in this study fit well to the CNF-CMF model, implying the engineered current MFCCs of graphene would have a different image than those of the other 2D layered materials. The effects of e-beam irradiating the monolayer MoS₂ FET on the SiO₂ substrate are compared in schematic (ii) and (iii) of Fig. 4a, b. The obtained N_it increases by a factor of 10 after electron beam irradiation. Moreover, the engineered current MFCC for α_SC as a function of T in the monolayer MoS₂ FET on h-BN is also demonstrated (see (iv) to (vi) in Fig. 4a, b). α_SC increases with increasing T from 3.23 × 10⁴ (T = 25 K) to 3.08 × 10⁵ V s C⁻¹ (T = 200 K)^11,36.

**Fig. 4: Engineered current MFCC study and classification accuracy.**

The frequency distributions are presented in a histogram with 20 intervals, as shown in Fig. 4a, using the normalized elements of LF MFCC of classes (i)–(vi) (see Fig. 4b). As N_it increases from condition (ii) to (iii), the highest frequency of the histogram shifts to the positive direction. A similar positive frequency shift is observed in cases (iv)–(vi) with the increasing T. Referring to Eq. (4), the S_I varies as a function of N_it and α_SC, and the corresponding current MFCCs can be extracted via featurization, consequently enabling the representation of a specific histogram tendency.
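A 20-interval histogram of normalized MFCC elements can be built as sketched below. The text does not specify the normalization, so min-max scaling to [0, 1] is an assumption here, and the function names are illustrative.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1]; an assumed normalisation scheme."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalise a constant sequence")
    return [(v - lo) / (hi - lo) for v in values]


def histogram(values, n_bins=20, lo=0.0, hi=1.0):
    """Count values in n_bins equal-width intervals on [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if not lo <= v <= hi:
            raise ValueError("value outside histogram range")
        idx = min(int((v - lo) / width), n_bins - 1)  # v == hi goes in the last bin
        counts[idx] += 1
    return counts


def peak_bin(counts):
    """Index of the most populated interval (first one on ties)."""
    return counts.index(max(counts))
```

Comparing `peak_bin` between two classes gives a simple numerical handle on the shift of the histogram maximum described above.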

The HMM algorithm, which learns by considering the correlation between the previous state and the next state, progresses under the following two learning conditions. The first learning condition is that the specific current fluctuation of each device in a specific class is due to N_it and α_SC, and the current MFCC contributes to learning by carrying this information. The second learning condition is that the HMM algorithm is trained by considering the correlation between the MFCC of the previous frequency and the MFCC of the next frequency. Thus, in Eq. (4), the exponent γ, which reflects the trap distribution, also influences the learning process with the HMM algorithm, together with N_it and α_SC. Figure 4c displays the classification accuracies and processing time obtained using the HMM algorithm and the HMM score vector learning method employing the NN for different numbers of data points. When the number of data was 7800, the HMM classification accuracy was 76.3% with f1-score and AUC value of <0.78 using fourfold cross-validation (see Supplementary Materials Fig. S3 and Note 3.11)^30,59,60. The HMM+NN classification accuracy was 85.2% with f1-score of 0.86, AUC value of 0.83, and processing time of 3 h for each fourfold cross-validation. For 22,100 data points, the HMM+NN classification accuracy increased to 95.5% with f1-score of 0.93, AUC value of 0.91, and processing time of 11 h for each fourfold cross-validation. However, for 48,800 data points, classification accuracies exhibited no further improvement, and only the processing time increased to 15 h. Moreover, the convolutional neural network (CNN) algorithm, which used the current MFCCs directly as image data without the HMM architecture, reached a classification accuracy of 93.6% with an f1-score of 0.82 and an AUC value of 0.87, comparable to the performance of HMM+NN (see Supplementary Materials Fig. S7 and Note 4). On the other hand, the logistic regression model achieved only a classification accuracy of 88.8% with an f1-score of 0.75 and an AUC value of 0.89. Therefore, provided that the performance of the CNN architecture for classifying by learning perceptrons of each layer is acceptable, transfer learning for any other channel or gate oxide materials can be possible⁶¹.

Most of the classes (or labels) were in good agreement with the CNF-CMF model with high averaged cross-validation accuracies of over 90% with f1-score of over 0.86 and AUC value of over 0.84, as presented in Fig. 4d. However, two exceptional classes, i.e., ReS₂ (blue bar) and MoS₂ (red bar) FETs fabricated on h-BN, are present in this figure with low classification accuracies of 74.2% (with f1-score of 0.79 and AUC of 0.75) and 38% (with f1-score of 0.49 and AUC of 0.51). This indicates that the current MFCCs for these classes were misinterpreted in the high current region. To interpret this misclassification clearly, the corresponding $S_{\mathrm{I}}/\overline {I_{\mathrm{D}}} ^2$ curves at f = 10 Hz for both cases are displayed in Fig. 4e. Although they are well fitted to the CNF-CMF model in most of the current regions, the additional contact resistance (R_CT) contributing towards the total LF noise behavior in the high current regions curtails the accuracies in particular¹¹. The inset in Fig. 4d shows the confusion matrix of the HMM+NN architecture. Interestingly, some classes for which the effects of additional contact resistance should be considered, such as ReS₂ and MoS₂ FETs using h-BN as gate dielectric, WSe₂ FETs using Au as contact metal, and monolayered MoS₂ FETs, are sometimes confused with each other.
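How a confusion matrix yields per-class accuracy and f1-score can be sketched in plain Python. This is an illustration of the definitions only, not the scikit-learn routines used in this work; the three labels in the example are hypothetical.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def per_class_f1(matrix):
    """F1 = 2*TP / (2*TP + FP + FN) for each class; 0.0 if the class never occurs."""
    scores = []
    for i in range(len(matrix)):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp
        fp = sum(row[i] for row in matrix) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return scores


# Hypothetical three-class example.
cm = confusion_matrix(["A", "A", "B", "B", "C"], ["A", "B", "B", "B", "C"], ["A", "B", "C"])
```

Off-diagonal entries of `cm` identify which pairs of classes are confused with each other, which is the information read from the inset of Fig. 4d.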

Discussion

Combining the LF noise spectroscopy with machine learning algorithms provides an efficient and precise approach to characterize and classify 2D layered FETs. Through the use of an NN based on the hidden Markov model algorithm, we demonstrate that MFCCs, which were converted from the LF noise data of DUTs, can be predicted more precisely than the limits of fundamental measurements. Importantly, this method of applying only a specific voltage can be considered advantageous in both classifying device information and characterizing device performance. The combination of factors such as channel material, gate dielectric, contact metal, and electron beam irradiation has a profound effect on carrier fluctuations, enabling effective learning and training. Further, the learning models using LF noise spectroscopy presented herein are highly interpretable, and aid in identifying how engineered features, including the behaviors between carriers and traps, contribute to characterizing device information and performance. Therefore, the considerable flexibility of this approach makes it adaptable to distinguishing the degree of degradation and reliability of devices and to modeling optimized fabrication conditions and device structures. The carrier transport direction, stacking order, and orientation in 2D heterostructures would be critical factors that significantly influence charge fluctuation, and this approach is expected to enable their improved interpretation in the future. Moreover, the inference of engineered current MFCC features that currently lack sufficient noise data, combined with the CNF-CMF and additional contact noise approaches, and an improved ability to build models from limited experimental data should be possible using the developed model.

Methods

Sample fabrication

An appropriately selected chemical vapor deposited monolayer MoS₂ and mechanically exfoliated 2D multilayer materials such as MoS₂, BP, ReS₂, MoTe₂, WSe₂, and h-BN were transferred onto high-quality 300 nm-thick SiO₂/p⁺-Si substrates. To make source and drain metal electrodes on them, standard electron beam lithography was used, and 80 nm-thick Au, Ti, Pt, and Cr were deposited using an electron-beam evaporation system. To suppress the contact resistance effect at the metal-semiconductor interface, all the fabricated devices were annealed under a high vacuum condition for 2 h at 473 K. The trenched graphene FETs in this study were fabricated on a pre-patterned parallel grid structure made of spin-coated poly(methyl methacrylate) A2 via conventional dry transfer methods³⁹. The Al₂O₃ passivation layer was deposited on 2D materials using an atomic layer deposition system.

In-situ measurement with e-beam irradiation

Electron-beam irradiation was conducted under high vacuum conditions (~10⁻⁶ Torr) at 300 K using a scanning electron microscope (SEM) (Quanta 3D FEG) chamber with a nano-manipulator for multilayer MoS₂, WSe₂, and monolayer MoS₂ for 30 s with 30 kV and 50 pA. Four tungsten probes installed on the nano-manipulator system were electrically connected to a semiconductor parameter analyzer.

Electrical transport measurement

All the devices, except the Al₂O₃ passivated MoS₂ FETs, were characterized in a high vacuum-probe station system¹⁰. Fundamental electrical transport characterizations were performed using semiconductor analyzers (Keithley 4200, Agilent B1500A) with a temperature controllable probe system (335, Lake-Shore). Low-frequency noise characteristics were obtained from a home-made noise measurement system (the system details are presented in Fig. S2 in the Supplementary Materials), consisting of a home-made battery box, a low noise current-to-voltage pre-amplifier (SR570, Stanford Research Systems), and a data acquisition system (DAQ-4431, National Instruments)⁴².

Data processing and training

We used the Python speech features library on Github (https://github.com/jameslyons/python_speech_features) to process the LF noise data into MFCC parameters. We only considered data where $\overline {I_{\mathrm{D}}}$ was larger than 100 nA to avoid a possible error. The optimized combination of hyperparameters was found by starting from previously studied LF noise analysis, narrowing the range, and iterating over the candidates in a for loop to find the best result. After augmenting the training MFCC dataset using Gaussian noise, we used the hmmlearn library on Github (https://github.com/hmmlearn/hmmlearn) to apply the HMM trainer function to the training MFCC data. Through HMM training, the trained data generated for each class were converted into score vectors, and these vectors were trained by a neural network based on Tensorflow Keras (https://www.tensorflow.org/guide/keras). Finally, we also trained a CNN directly on the current MFCC data, likewise based on Tensorflow Keras (https://www.tensorflow.org/guide/keras).
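Augmentation with Gaussian noise can be sketched as follows. The noise level `sigma`, the number of copies, and the function name are hypothetical parameters chosen for this illustration; the values used in this work are not stated here.

```python
import random


def augment_with_gaussian_noise(array_2d, n_copies, sigma, seed=None):
    """Return n_copies noisy copies of a 2D array (list of rows), leaving the input untouched."""
    rng = random.Random(seed)  # a seeded generator makes the augmentation reproducible
    return [
        [[value + rng.gauss(0.0, sigma) for value in row] for row in array_2d]
        for _ in range(n_copies)
    ]


# Hypothetical usage: three noisy copies of a small 2 x 2 MFCC array.
copies = augment_with_gaussian_noise([[1.0, 2.0], [3.0, 4.0]], n_copies=3, sigma=0.1, seed=1)
```

Each copy keeps the shape of the original array, so the augmented samples can be appended to the training set of the same class.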

Model validation

We used the 4-fold cross-validation method to train our MFCC dataset: the training MFCC dataset was randomly divided into 4 subsets of equal size. Of the 4 subsets, a single subset was retained as the test data for evaluating the model, and the remaining three subsets were used for training. The cross-validation process was repeated 4 times, with each of the four subsets used once for testing. The held-out test MFCC datasets were converted into score vectors to evaluate the model trained through the HMM+NN architecture. We obtained not only the accuracy, but also the confusion matrix, receiver operating characteristic (ROC) curves, area under the curve (AUC) value, and f1-score to evaluate the model performance accurately given the imbalance of the data (https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics).
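The 4-fold splitting procedure can be sketched in plain Python. This is an unstratified random split written for illustration, not the authors' implementation; the function name and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k=4, seed=0):
    """Yield (train_indices, test_indices) for each of k random folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Take every k-th shuffled index so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Hypothetical usage with 20 samples: each fold tests on 5 and trains on 15.
splits = list(k_fold_indices(20, k=4, seed=0))
```

Every sample appears in exactly one test subset, so averaging a metric over the four folds uses each sample once for evaluation.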

Data availability

Some of the LF noise data that support the findings of this study are available from GitHub (https://github.com/Kookjin-Lee/Kookjin.Sangjin.noiseML); the test LF noise data are uploaded in the folder (Test data_noise), in subfolders named after each label. All LF noise data are available from the corresponding author(s) upon reasonable request.

Code availability

All the code used to train, evaluate, and calculate the results presented in this study is available from GitHub (https://github.com/Kookjin-Lee/Kookjin.Sangjin.noiseML) in the Python file (LFnoise_ML_Classification_SJ_KJ.py).

References

Kirton, M. J. & Uren, M. J. Noise in solid-state microstructures: a new perspective on individual defects, interface states and low-frequency (1/f) noise. Adv. Phys. 38, 367–468 (1989).
Rogers, C. T. & Buhrman, R. A. Nature of single-localized-electron states derived from tunneling measurements. Phys. Rev. Lett. 55, 859 (1985).
Lee, K. et al. Understanding of aging pattern in quantum dot light-emitting diodes by low-frequency noise. Nanoscale 12, 15888–15895 (2020).
Rogers, C. T. & Buhrman, R. A. Composition of 1/f noise in metal-insulator-metal tunnel junctions. Phys. Rev. Lett. 53, 1272 (1984).
Schroder, D. K. Semiconductor Material and Device Characterization (John Wiley & Sons, Ulsan, 2006).
Hooge, F. N. 1/f noise sources. IEEE Trans. Electron Devices 41, 1926–1935 (1994).
Song, S. H., Joo, M. K., Neumann, M., Kim, H. & Lee, Y. H. Probing defect dynamics in monolayer MoS₂ via noise nanospectroscopy. Nat. Commun. 8, 1–5 (2017).
Balandin, A. A. Low-frequency 1/f noise in graphene devices. Nat. Nanotechnol. 8, 549–555 (2013).
Von Haartman, M. & Östling, M. Low-Frequency Noise in Advanced MOS Devices (Springer Science & Business Media, 2007).
Na, J. et al. Low-frequency noise in multilayer MoS₂ field-effect transistors: the effect of high-k passivation. Nanoscale 6, 433–441 (2014).
Joo, M. K. et al. Understanding coulomb scattering mechanism in monolayer MoS₂ channel in the presence of h-BN buffer layer. ACS Appl. Mater. Interfaces 9, 5006–5013 (2017).
Xu, G. et al. Effect of spatial charge inhomogeneity on 1/f noise behavior in graphene. Nano Lett. 10, 3312–3317 (2010).
Na, J. et al. Few-layer black phosphorus field-effect transistors with reduced current fluctuation. ACS Nano 8, 11753–11762 (2014).
Mitra, R., Jariwala, B. & Bhattacharya, A. Probing in-plane anisotropy in few-layer ReS₂ using low frequency noise measurement. Nanotechnology 29, 145706 (2018).
Ji, H. et al. Thickness-dependent carrier mobility of ambipolar MoTe₂: interplay between interface trap and Coulomb scattering. Appl. Phys. Lett. 110, 183501 (2017).
Esteva, A. et al. A guide to deep learning in healthcare. Nat. Med. 25, 24–29 (2019).
Gopnik, A. Making AI more human. Sci. Am. 316, 60–65 (2017).
Lawrence, S., Giles, C. L., Tsoi, A. C. & Back, A. D. Face recognition: a convolutional neural-network approach. IEEE Trans. Neural Netw. 8, 98–113 (1997).
de Albuquerque, V. H. C., Cortez, P. C., de Alexandria, A. R. & Tavares, J. M. R. S. A new solution for automatic microstructures analysis from images based on a backpropagation artificial neural network. Nondestruct. Test. Eval. 23, 273–283 (2008).
Vignal, C., Mathevon, N. & Mottin, S. Audience drives male songbird response to partner’s voice. Nature 430, 448–451 (2004).
Butler, K. T. et al. Machine learning for molecular and materials science. Nature 559, 547–555 (2018).
Lee, K. et al. Detection and Accurate Classification of Mixed Gases using Machine Learning with Impedance Data. Adv. Theory Simul. 3, 2000012 (2020).
Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 521, 452–459 (2015).
Jordan, M. I. & Mitchell, T. M. Machine learning: trends, perspectives, and prospects. Science 349, 255–260 (2015).
Kireeva, N. et al. Generative topographic mapping (GTM): universal tool for data visualization, structure-activity modeling and dataset comparison. Mol. Inform. 31, 301–312 (2012).
Mortazavi, B. et al. Exploring phononic properties of two-dimensional materials using machine learning interatomic potentials. Appl. Mater. Today 20, 100685 (2020).
Das, S. et al. Machine learning in materials modeling—fundamentals and the opportunities in 2D materials. Synth. Model. Charact. 2D Mater. their Heterostruct. 445–468 https://doi.org/10.1016/B978-0-12-818475-2.00019-2 (2020).
Brown, K. A., Brittman, S., Maccaferri, N., Jariwala, D. & Celano, U. Machine learning in nanoscience: big data at small scales. Nano Lett. 20, 2–10 (2020).
Mortazavi, B. et al. Machine-learning interatomic potentials enable first-principles multiscale modeling of lattice thermal conductivity in graphene/borophene heterostructures. Mater. Horiz. 7, 2359–2367 (2020).
Borovcnik, M., Bentz, H.-J. & Kapadia, R. A Probabilistic Perspective. Chance Encounters: Probability in Education (Springer, Dordrecht, 1991).
Fine, S., Singer, Y. & Tishby, N. The hierarchical hidden Markov model: analysis and applications. Mach. Learn. 32, 41–62 (1998).
Wu, J. & Yu, J. An improved arithmetic of MFCC in speech recognition system. 2011 Int. Conf. Electron. Commun. Control. ICECC 2011-Proc. 719–722 (2011).
Shao, X. & Milner, B. MAP prediction of pitch from mfcc vectors for speech reconstruction. 8th Int. Conf. Spok. Lang. Process. ICSLP 2004 1, 2425–2428 (2004).
Ji, H. et al. Suppression of Interfacial Current Fluctuation in MoTe₂ Transistors with Different Dielectrics. ACS Appl. Mater. Interfaces 8, 19092–19099 (2016).
Ko, S. P. et al. Low frequency noise reduction in multilayer WSe₂ field effect transistors. IEEE-NANO 2015 - 15th Int. Conf. Nanotechnol. 1118–1121 (2015).
Joo, M. et al. Electron excess doping and effective schottky barrier reduction on the MoS₂/h‑BN heterostructure. Nano Lett. 16, 6383–6389 (2016).
Cho, Y. H. et al. Soft-type trap-induced degradation of MoS₂ field effect transistors. Nanotechnology 29, 1–8 (2018).
Jin, J. E. et al. Catalytic etching of monolayer graphene at low temperature via carbon oxidation. Phys. Chem. Chem. Phys. 18, 101–109 (2016).
Jin, J. E. et al. Surface modulation of graphene field effect transistors on periodic trench structure. ACS Appl. Mater. Interfaces 8, 18513–18518 (2016).
Ryu, M. Y. et al. Triethanolamine doped multilayer MoS₂ field effect transistors. Phys. Chem. Chem. Phys. 19, 13133–13139 (2017).
Lee, K. et al. Real-time effect of electron beam on MoS₂ field-effect transistors. Nanotechnology 31, 455202 (2020).
Joo, M. K. et al. A dual analyzer for real-time impedance and noise spectroscopy of nanoscale devices. Rev. Sci. Instrum. 82, 034702 (2011).
Sigurdsson, S., Petersen, K. B. & Lehn-Schiøler, T. Mel frequency cepstral coefficients: An evaluation of robustness of MP3 encoded music. ISMIR 2006 - 7th Int. Conf. Music Inf. Retr. 286–289 (2006).
Hasan, R., Jamil, M., Rabbani, G. & Rahman, S. Speaker Identification Using Mel Frequency Cepstral Coefficients. 3rd Int. Conf. Electr. Comput. Eng. ICECE 2004 28–30 (2004).
Farooq, O. & Datta, S. Mel filter-like admissible wavelet packet structure for speech recognition. IEEE Signal Process. Lett. 8, 196–198 (2001).
Molau, S., Pitz, M., Schlüter, R. & Ney, H. Computing mel-frequency cepstral coefficients on the power spectrum. ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process. - Proc. 1, 73–76 (2001).
Churi, A., Bhat, A., Mohite, R. & Churi, P. P. E-zip: An electronic lock for secured system. 2016 IEEE Int. Conf. Adv. Electron. Commun. Comput. Technol. ICAECCT 2016 2, 45–49 (2017).
Beal, M. J. et al. The Infinite Hidden Markov Model. NIPS 14, 577–584 (2012).
Schuller, B. et al. Hidden Markov model-based speech emotion recognition. Proc. - IEEE Int. Conf. Multimed. Expo. 1, I401–I404 (2003).
Sonnhammer, E. L., von Heijne, G. & Krogh, A. A hidden Markov model for predicting transmembrane helices in protein sequences. Proc. Int. Conf. Intell. Syst. Mol. Biol. 6, 175–182 (1998).
Zhang, J. & Sclaroff, S. Saliency detection: A boolean map approach. Proc. IEEE Int. Conf. Comput. Vis. 153–160 (2013).
Rahman, M. A. & Hoque, M. A. Online adaptive artificial neural network based vector control of permanent magnet synchronous motors. IEEE Power Eng. Rev. 17, 28 (1997).
Lin, C. F. & Wang, S. De. Fuzzy support vector machines. IEEE Trans. Neural Netw. 13, 464–471 (2002).
Agarap, A. F. Deep Learning using Rectified Linear Units (ReLU). arxiv 2–8. Preprint at https://arxiv.org/abs/1803.08375 (2018).
Hara, K., Saito, D. & Shouno, H. Analysis of function of rectified linear unit used in deep learning. Proc. Int. Jt. Conf. Neural Networks 1–8 (2015).
Ghibaudo, G., Roux, O., Nguyen-Duc, C., Balestra, F. & Brini, J. Improved analysis of low frequency noise in field-effect MOS transistors. Phys. Status Solidi 124, 571–581 (1991).
Juvela, L. et al. Speech waveform synthesis from MFCC sequences with generative adversarial networks. ICASSP, IEEE Int. Conf. Acoust. Speech Signal Process 5679–5683 (2018).
Jensen, J. H. et al. Evaluation of MFCC estimation techniques for music similarity. European Signal Processing Conference 2219–5491 (2006).
Schaffer, C. Technical Note: Selecting a Classification Method by Cross-Validation. Mach. Learn. 13, 135–143 (1993).
Moore, A. W. & Lee, M. S. Efficient Algorithms for Minimizing Cross Validation Error. Machine Learning Proceedings 1994 (Morgan Kaufmann Publishers, Inc., 1994).
Deepak, S. & Ameer, P. M. Brain tumor classification using deep CNN features via transfer learning. Comput. Biol. Med. 111, 103345 (2019).

Acknowledgements

This research was supported by Nano-Material Technology Development Program through the National Research Foundation of Korea (NRF) funded by Ministry of Science and ICT and also supported by the Future Semiconductor Device Technology Development Program funded by Ministry of Trade, Industry & Energy (MOTIE) and Korea Semiconductor Research Consortium (KSRC) (NRF-2017M3A7B4049119 & Grant 10067739, G.-T.K.) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation). Further, M.-K.J. also wishes to acknowledge the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2019R1C1C1003467 & NRF-2019K2A9A1A06083674). We appreciate Hyunwoo J. Kim for helping us design and develop the code.

Author information

These authors contributed equally: Kookjin Lee, Sangjin Nam.

Authors and Affiliations

imec, Leuven, Belgium
Kookjin Lee
Department of Materials Science, KU Leuven, Leuven, Belgium
Kookjin Lee
School of Electrical Engineering, Korea University, Seoul, Republic of Korea
Kookjin Lee, Yeonsu Kim, Jaewoo Lee & Gyu-Tae Kim
Department of Computer Science and Engineering, Korea University, Seoul, Republic of Korea
Sangjin Nam
School of Electrical Engineering, University of Ulsan, Ulsan, Republic of Korea
Hyunjin Ji
Defense Agency for Technology and Quality, Jinju-si, Gyeongsangnam-do, Republic of Korea
Junhee Choi
Samsung Electronics Co. Ltd, 1 Samsung-ro, Yongin-si, Gyeonggi-do, Republic of Korea
Jun-Eon Jin, Young-Hoon Cho & Hyebin Lee
Samsung Display Co. Ltd, 1 Samsung-ro, Yongin-si, Gyeonggi-do, 17113, Republic of Korea
Junhong Na & Min-Yeul Ryu
Department of Applied Physics, Sookmyung Women’s University, Seoul, Republic of Korea
Min-Kyu Joo
Institute of Advanced Materials and Systems, Sookmyung Women’s University, Seoul, 04310, Republic of Korea
Min-Kyu Joo

Contributions

K.L. and S.N. initiated and developed the work under the supervision of M.-K.J. and G.-T.K. K.L. and S.N. designed and developed the code and conducted the analysis. K.L., H.J., J.C., J.-E.J., Y.K., J.N., M.-Y.R., Y.-H.C., H.L., J.L., and M.-K.J. fabricated the samples and measured the low-frequency noise data. K.L., S.N., M.-K.J., and G.-T.K. wrote the manuscript and discussed it with all the authors.

Corresponding authors

Correspondence to Min-Kyu Joo or Gyu-Tae Kim.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Lee, K., Nam, S., Ji, H. et al. Multiple machine learning approach to characterize two-dimensional nanoelectronic devices via featurization of charge fluctuation. npj 2D Mater Appl 5, 4 (2021). https://doi.org/10.1038/s41699-020-00186-w

Received: 12 August 2020
Accepted: 18 November 2020
Published: 04 January 2021
DOI: https://doi.org/10.1038/s41699-020-00186-w