Multi-channel masked autoencoder and comprehensive evaluations for reconstructing 12-lead ECG from arbitrary single-lead ECG

Introduction

Cardiovascular disease (CVD)1,2 is the leading cause of mortality worldwide. Moreover, its prevalence has continued to rise in developing areas over the past decades3, posing a great challenge for researchers and cardiologists. In clinical practice, clinicians rely on characterization tools4 to diagnose cardiovascular disease, and one of the most popular is the standard 12-lead electrocardiogram (ECG). Advances in deep learning have enabled researchers to develop models that reach cardiologist-level proficiency in interpreting 12-lead ECGs; for instance, Ribeiro et al.5 trained such a model. In short, the 12-lead ECG provides comprehensive cardiac information from multiple views for both doctors and classification models, and it plays an essential role in cardiac healthcare.

However, collecting a 12-lead ECG requires placing at least 10 electrodes on the body surface, which causes considerable inconvenience and discomfort and makes long-term cardiac health monitoring difficult. To date, the standard 12-lead ECG has mainly been used in hospitals for short-term diagnosis, usually lasting about 1 min, whereas long-term monitoring6 is essential for capturing paroxysmal cardiac abnormalities. Consequently, user-friendly devices that can capture ECG signals anywhere are a priority for both researchers and the market, including patches6,7,8, smartwatches9,10,11, and armbands12,13,14. Further, single-lead ECG has been used for cardiac abnormality classification, such as lead I for atrial fibrillation15, lead V1 for Brugada syndrome16, and lead aVR for sinus bradycardia17. While wearable devices offer the advantage of ambulatory monitoring by collecting single-lead ECG signals, they do not match the diagnostic depth of a standard 12-lead ECG. The limitation arises because these devices capture the heart's electrical activity from a restricted subset of perspectives, which may not provide a comprehensive assessment of cardiac health.

It is important to strike a balance between clinical effectiveness and practical feasibility. On the one hand, the clinical standard 12-lead ECG can comprehensively measure cardiac health5, but it causes some inconvenience and discomfort. On the other hand, wearable devices are a popular choice for users, but they have limited clinical value. Many researchers have therefore tried to narrow the gap between reduced-lead and 12-lead ECG, as in the challenge proposed by Reyna et al.18. The challenge asks participants to assess the diagnostic potential of reduced-lead ECG, including 6-lead, 4-lead, 3-lead, and 2-lead ECG. In this challenge, Nejedly et al.19 adopted ensemble learning, a residual network, and an attention mechanism to achieve state-of-the-art performance, and similar approaches appear in other studies20,21,22,23. Unfortunately, these studies focus only on classification performance and thus provide only an indirect way to narrow the gap between reduced-lead and 12-lead ECG.

Subsequently, some researchers have pursued a direct approach to narrowing the gap between reduced-lead (specifically, single-lead) and 12-lead ECG, namely reconstructing the 12-lead ECG from the reduced-lead ECG24,25,26,27,28,29,30,31,32,33. Early works explored the transformation between the Frank leads and the standard 12-lead ECG; Edenbrandt et al.24 published the inverse Dower matrix, which converts a 12-lead ECG into a 3-dimensional vectorcardiogram. Nelwan et al.25 attempted to reconstruct the 12-lead ECG from reduced lead sets, and their experiments indicate a strong correlation coefficient of ~0.932 when one or two precordial leads are excluded from the lead set. Maheshwari et al.26 reconstructed the 12-lead ECG from 3-lead ECG, with a reconstruction score of about 0.9187 in the testing phase. However, the assumption of a predominantly linear relationship between ECG vectors does not fit the electrical conduction system of the human heart. Other researchers adopted autoencoders with different model architectures. Atoui et al.27 proposed an artificial neural network (ANN) that generates the remaining 5 chest leads from 3-lead ECG. This work and those that followed all adopt the autoencoder training idea: Sohn et al.28 used LSTM, Gundlapalle et al.29 combined CNN and LSTM, and Garg et al.30 added an attention mechanism to the autoencoder, thereby improving its feature representation ability. Generative adversarial networks (GANs)34 have also attracted considerable research attention, such as refs. 31–33. Lee et al.31 adopted a conditional generative adversarial network (CGAN) to explore the feasibility of converting limb leads into chest leads. It is worth mentioning that the input of the CGAN is an ECG rather than the random noise used in a traditional GAN. The average structural similarity index (SSIM) between the generated and real ECG signals is 0.92, and the percent root mean square difference (PRD) is only 7.21%. Seo et al.32 also used a CGAN to reconstruct the 12-lead ECG, and the mean absolute error (MAE) between the generated and real ECG signals is only 0.25. Joo et al.33 proposed a novel CGAN consisting of two generators and achieved good reconstruction performance, with a root mean square error of 0.32 between the generated and real 12-lead ECG. Additionally, our previous work35 also used this method to reconstruct the 12-lead ECG from lead I. However, training instability and poor diversity make it difficult for generative adversarial networks to address this reconstruction task, and most of the above studies offer limited flexibility, since they only work on a fixed limb lead30,32,35. Chen et al.36 proposed a novel framework to establish an Electrocardio panorama; however, only the 12-lead ECG signals are considered useful, while the remaining non-standard lead signals are deemed meaningless. Consequently, there is a critical need to investigate methods for reconstructing the 12-lead ECG from an arbitrary single-lead ECG. While the methods above can approximately reconstruct a 12-lead ECG from limited-lead inputs, a significant research gap remains in 12-lead ECG reconstruction. Firstly, traditional generative models usually focus on a fixed single lead rather than an arbitrary single-lead ECG. Secondly, the related works27,28,29,30,31,32,33,35 lack a comprehensive evaluation benchmark and mainly focus on signal-level evaluation. Therefore, the contributions of this study are as follows:

  • This study proposes a multi-channel masked autoencoder, MCMA, which can convert an arbitrary single-lead ECG into a 12-lead ECG.

  • This study designs a comprehensive evaluation benchmark, ECGGenEval, including signal-level, feature-level, and diagnostic-level evaluation.

  • MCMA can achieve state-of-the-art reconstruction performance in the ECGGenEval across the internal and external testing datasets, with a mean square error of 0.0175 and a Pearson correlation coefficient of 0.7772 in the internal testing dataset.

In summary, MCMA demonstrates its efficacy in reconstructing a 12-lead ECG from a single lead, offering significant potential to extend the capabilities of wearable health monitoring devices in the digital health era. This advance may improve the diagnostic and monitoring capabilities of these devices and make health assessments more accurate and accessible for users.

Method

ECG background

The ECG captures the electrical activity of the heart and is characterized by distinct waveforms such as the P-wave, QRS-complex, and T-wave. The standard 12-lead ECG has been a prevalent diagnostic tool in clinical practice due to its ability to provide a comprehensive view of cardiac function. This tool, however, requires the placement of 10 electrodes on the body's surface. The electrode positioning in the 12-lead ECG is detailed in Table 1.

Table 1 ECG background: the standard electrode configuration in the standard 12-lead ECG

Dataset

This study assembles a large-scale 12-lead ECG dataset consisting of 28,833 recordings from three public 12-lead ECG datasets, i.e., PTB-XL37,38, CPSC201839, and CODE-test5. The proposed framework is first trained and validated on PTB-XL, and one internal and two external testing datasets are then used to further demonstrate its feasibility.

PTB-XL37,38 is used for model training, validation, and testing. As a large dataset, PTB-XL contains 21,799 clinical 10-s 12-lead ECG signals sampled at 500 Hz. Based on the clinical standard, this dataset includes 71 kinds of ECG statements. As recommended, this study adopts the standard tenfold setting, in which folds 1 to 8 form the training set, and the 9th and 10th folds act as the validation set and testing set, respectively. The training:validation:testing ratio is about 8:1:1.
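The tenfold split described above can be sketched as follows. This is a minimal illustration and not the authors' code: it assumes each record carries an integer fold label from 1 to 10, and the function name `split_by_fold` is hypothetical.

```python
import numpy as np

def split_by_fold(folds):
    """Return record indices for training (folds 1-8), validation (fold 9)
    and testing (fold 10). `folds` holds one fold label per record
    (assumed to be integers from 1 to 10)."""
    folds = np.asarray(folds)
    train_idx = np.flatnonzero(folds <= 8)
    val_idx = np.flatnonzero(folds == 9)
    test_idx = np.flatnonzero(folds == 10)
    return train_idx, val_idx, test_idx

# Toy example: 30 records with fold labels cycling through 1..10.
toy_folds = np.tile(np.arange(1, 11), 3)
train_idx, val_idx, test_idx = split_by_fold(toy_folds)
```

With evenly sized folds this reproduces the 8:1:1 ratio mentioned in the text.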

CPSC201839 is used as an external testing set, since neither its data distribution nor any of its information is seen during model training and selection. CPSC2018 contains 6877 12-lead ECG recordings, whose lengths vary from 6 s to 60 s, sampled at 500 Hz.

CODE-test is also used as an external testing set, particularly for diagnostic-level evaluation. CODE-test includes 827 12-lead ECG recordings collected from different patients with different arrhythmias. Ribeiro et al.5 contributed a trained cardiologist-level classification model for this testing dataset.

Table 2 presents the data distribution for the signal-level and feature-level evaluation in PTB-XL and CPSC2018. Table 3 presents the data distribution for the diagnostic-level evaluation in CODE-test, which covers 6 distinct arrhythmia types.

Table 2 The data distribution of PTB-XL and CPSC2018, and these datasets are used for signal-level and feature-level evaluation
Table 3 The data distribution of CODE-test, and it is used for the diagnostic-level evaluation

MCMA

The Multi-Channel Masked Autoencoder (MCMA) masks 11 of the 12 leads and uses the remaining single-lead ECG to generate the standard 12-lead ECG. MCMA takes a single-lead ECG as input and produces a 12-lead ECG as output, both with a signal length of 1024. An overview of MCMA is shown in Fig. 1. In this study, no preprocessing steps such as filtering or scaling are applied, so that the ECG signals are not altered. Additionally, MCMA uses a multi-channel masked configuration to reduce training and inference costs, requiring only one model, which sets it apart from the related approaches in prior works30,32,33,35.

Fig. 1: The 12-lead ECG generation with single-lead ECG.

Top-left: the input single-lead ECG can be arbitrary, including I, II, III, aVR, aVL, aVF, V1, V2, V3, V4, V5, V6. Top-right: the detailed process, taking lead I as an example. Bottom: the evaluation benchmark, including signal-level, feature-level, and diagnostic-level evaluation.

Model architecture

The architecture of MCMA is shown in Fig. 2 and is motivated by ResNet40 and UNet41. The model includes two modules, the downsampling and upsampling modules, which are composed of multi-convolution blocks (MCBlock) and multi-convolution-transpose blocks (MCTBlock), respectively. The kernel size (k) is 5 and the window size (s), i.e., the stride, is 2. A kernel size of 5 in the MCBlock and MCTBlock layers aims at effective feature extraction, particularly for data with rich spatial hierarchies. A window size of 2 is the usual choice for striding, which reduces the feature dimension and improves the learning ability. The activation function is GELU. The experimental results with different hyperparameters can be found in the Supplementary materials. To improve gradient stability, layer normalization (LN) and instance normalization (IN) are used in each block. The skip connections speed up the convergence of the model and improve its representation ability. Additionally, the basic training recipe is provided in Table 4.

Fig. 2: The detailed model architecture, the proposed model mainly includes MCBlock and MCTBlock.

Left: the situation of each layer and shape changes from input to output. Top-right: composition of MCBlock, including two branches, which will achieve downsampling; Bottom-right: composition of MCTBlock, including two branches, which will achieve upsampling.

Table 4 The hyperparameters configuration in the MCMA training process

MCMA implementation

Padding strategy

MCMA uses a zero-padding strategy to retain the spatial information of each single-lead ECG. The single-channel ECG is placed into a 12-channel format at the position of its own lead, while all other channels are set to zero, as seen in Eq. (1).

$$P({ecg}_{12},i)={I}_{z}\times {ecg}_{12}[i]$$
(1)

In Eq. (1), the index matrix for zero-padding, Iz, has shape 12 × 1, with Iz(i) = 1 and all other elements zero. The output shape equals the input shape: ecg12 has shape 12 × N and ecg12[i] has shape 1 × N, so the output also has shape 12 × N. With zero-padding, MCMA can adapt to different input leads. To highlight its advantages, a copy-padding strategy, in which the single-lead ECG is copied 12 times, serves as a comparison; in its index matrix, Ic, all elements are 1. The arbitrary input lead and the fixed input lead (lead I) are also compared. During training, the full 12-lead ECG is available, and the padding strategy replaces the other 11 leads either with zeros (zero-padding) or with the remaining single-lead ECG (copy-padding). In real-world applications, only the single-lead ECG exists, and it is passed through the same padding strategy before entering the proposed framework.
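The two padding strategies of Eq. (1) can be sketched in NumPy as below. This is an illustrative sketch, not the authors' implementation: the function name `pad_single_lead` is hypothetical, and the lead index is 0-based here whereas the text counts leads from 1.

```python
import numpy as np

def pad_single_lead(ecg12, i, strategy="zero"):
    """Build the 12 x N model input from lead i (0-based) of a 12 x N ECG.

    strategy="zero": index matrix I_z (12 x 1), one at row i, zeros elsewhere.
    strategy="copy": index matrix I_c (12 x 1), all ones.
    """
    ecg12 = np.asarray(ecg12, dtype=float)
    n_leads = ecg12.shape[0]
    lead = ecg12[i:i + 1, :]                      # shape 1 x N
    if strategy == "zero":
        index = np.zeros((n_leads, 1))
        index[i, 0] = 1.0
    elif strategy == "copy":
        index = np.ones((n_leads, 1))
    else:
        raise ValueError("strategy must be 'zero' or 'copy'")
    return index @ lead                           # (12 x 1) @ (1 x N) -> 12 x N

# Toy 12 x 3 signal; keep only the third lead (index 2).
toy = np.arange(36, dtype=float).reshape(12, 3)
zero_padded = pad_single_lead(toy, 2, "zero")
copy_padded = pad_single_lead(toy, 2, "copy")
```

With zero-padding the row position tells the model which lead it received, while copy-padding produces the same input regardless of which lead was supplied; this is one plausible reading of why the text says zero-padding retains the spatial information.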

Loss function

Generative models mainly include the autoencoder (AE)42, the generative adversarial network (GAN)34, and the diffusion model43. Although the diffusion model has shown great potential in various tasks, its sampling speed44 is a challenge. GAN32,33,35 and AE30 have been studied in previous works. It is worth mentioning that the traditional GAN, which converts random noise into generated signals, is not sufficient for this task. Therefore, researchers working on this task adopted conditional generative adversarial networks, including Seo et al.32, Joo et al.33, and our previous study35. In this study, the autoencoder is chosen as a feasible solution for the 12-lead ECG reconstruction task because of its training stability. Further, the proposed framework is compared with the GAN-based32,33,35 and AE-based30 methods.

The autoencoder (AE) can extract the latent representation from the raw data and convert the latent representation into the target output. The common loss function (L) is shown in Eq. (2).

$$L=\| {ecg}_{12}-AE({ecg}_{1}){\| }^{2}$$
(2)

In Eq. (2), the 12-lead and single-lead ECG signals are represented by ecg12 and ecg1, respectively. During training, the single-lead input is obtained from ecg12 with the padding strategy P of Eq. (1), where the lead index i varies from 1 to 12. MCMA employs the zero-padding strategy by default, while copy-padding is used for comparison in the ablation study.
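As a rough sketch of Eq. (2) combined with the zero-padding of Eq. (1), the loss for one training sample could be computed as follows. The callable `ae` is a stand-in for the trained network, and the names `reconstruction_loss`, `identity_ae`, and `oracle_ae` are hypothetical; in practice a deep learning framework would compute this on batches, often as a mean rather than a sum.

```python
import numpy as np

def reconstruction_loss(ecg12, i, ae):
    """Squared L2 distance between the real 12-lead ECG and the
    reconstruction from lead i (0-based) under zero-padding."""
    ecg12 = np.asarray(ecg12, dtype=float)
    masked = np.zeros_like(ecg12)
    masked[i] = ecg12[i]              # keep lead i, zero the other 11 leads
    return float(np.sum((ecg12 - ae(masked)) ** 2))

# Two toy "models" used only to exercise the function.
toy = np.ones((12, 4))
identity_ae = lambda x: x             # returns its masked input unchanged
oracle_ae = lambda x: toy             # returns the true signal
loss_identity = reconstruction_loss(toy, 0, identity_ae)
loss_oracle = reconstruction_loss(toy, 0, oracle_ae)
```

The identity stand-in leaves the 11 masked leads at zero, so its loss equals the energy of those leads (44 for this toy signal), while a perfect reconstruction gives zero.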

Inferencing MCMA

After training, MCMA can be used in real-world applications, i.e., in the inference (testing) process. The single-lead ECG with the zero-padding strategy is the input of MCMA. The application process for MCMA is given in Eq. (3).

$${g}_{ecg}=AE({I}_{z}\times {ecg}_{1})$$
(3)

In Eq. (3), gecg is the 12-lead ECG generated by MCMA, ecg1 is the single-lead ECG collected by a wearable device, and Iz converts ecg1 into the input of the AE.

Comprehensive evaluations of ECG reconstruction

This study introduces ECGGenEval, a comprehensive evaluation benchmark for 12-lead ECG reconstruction, including three distinct dimensions: signal-level, feature-level, and diagnostic-level.

Signal-level evaluations

This study adopts the Pearson correlation coefficient (PCC) and mean square error (MSE) for the signal-level evaluation. The real and generated ECG signals are denoted as recg and gecg, respectively. The definitions of PCC and MSE are given in Eqs. (4) and (5).

$$PCC({r}_{ecg},{g}_{ecg})=\frac{\mu ({r}_{ecg}\times {g}_{ecg})-\mu ({r}_{ecg})\mu ({g}_{ecg})}{\sigma ({r}_{ecg})\sigma ({g}_{ecg})}$$
(4)
$$MSE({r}_{ecg},{g}_{ecg})=\mu ({({r}_{ecg}-{g}_{ecg})}^{2})$$
(5)

In Eqs. (4) and (5), μ(*) and σ(*) denote the mean value and standard deviation, respectively. PCC varies from −1 to 1, and MSE is non-negative. PCC is positively related to generation performance, while MSE is negatively related to it. For the signal-level evaluation, a better generative model should therefore have a higher PCC and a lower MSE.
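Eqs. (4) and (5) translate directly into NumPy. The sketch below is illustrative only; the function names `pcc` and `mse` are chosen here, and the population standard deviation (NumPy's default, `ddof=0`) is assumed for σ so that the formula matches the usual Pearson coefficient.

```python
import numpy as np

def pcc(r_ecg, g_ecg):
    """Pearson correlation coefficient, following Eq. (4).
    Undefined (division by zero) if either signal is constant."""
    r = np.asarray(r_ecg, dtype=float)
    g = np.asarray(g_ecg, dtype=float)
    covariance = np.mean(r * g) - np.mean(r) * np.mean(g)
    return covariance / (np.std(r) * np.std(g))

def mse(r_ecg, g_ecg):
    """Mean square error, following Eq. (5)."""
    r = np.asarray(r_ecg, dtype=float)
    g = np.asarray(g_ecg, dtype=float)
    return np.mean((r - g) ** 2)

real = np.array([0.0, 1.0, 0.5, -0.2, 0.1])
generated = np.array([0.1, 0.8, 0.6, -0.1, 0.0])
signal_pcc = pcc(real, generated)
signal_mse = mse(real, generated)
```

Note that PCC is insensitive to amplitude scaling and offset while MSE is not, which is one reason the two metrics are reported together.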

Feature-level evaluations

Furthermore, this study adopts the heart rate estimated from the generated 12-lead ECG for the feature-level evaluation. Because all leads of a real 12-lead ECG record the same heartbeats simultaneously, the heart rate should in theory be identical across leads, and the generated signals should meet this requirement. The mean heart rate (MHR) at the jth lead is calculated as shown in Eq. (6).

$$MHR(j)=\frac{60\times (n-1)}{\sum_{i=1}^{n-1}(R(i+1,j)-R(i,j))}$$
(6)

In Eq. (6), R(i, j) denotes the time, in seconds, of the ith detected R-wave in the jth lead, and n is the number of R-waves detected in that lead. MHR therefore represents heartbeats per minute. From the 12 MHR values of the 12 leads, the average value MMHR is computed with Eq. (7). The feature-level evaluation then involves the standard deviation (SD), the range (the difference between maximum and minimum), and the coefficient of variation (CV), expressed as MHRSD, MHRRange, and MHRCV, respectively. Their calculations are given in Eqs. (8), (9), and (10), respectively.

$$MMHR=\frac{1}{12}\sum_{j=1}^{12}MHR(j)$$
(7)
$${MHR}_{SD}=\sqrt{\frac{1}{12}\sum_{j=1}^{12}{(MHR(j)-MMHR)}^{2}}$$
(8)
$${MHR}_{Range}=\max (MHR)-\min (MHR)$$
(9)
$${MHR}_{CV}=\frac{{MHR}_{SD}}{MMHR}$$
(10)

The reference values are estimated from the original 12-lead ECG. The feature-level evaluation is good if the inter-lead heart rates are consistent, i.e., if these three metrics are small.
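Eqs. (6) to (10) can be sketched as below, assuming the R-wave times (in seconds) have already been detected for each lead by some R-peak detector. The function names and the dictionary keys are hypothetical, and the CV is returned as a fraction rather than the percentage reported in the tables.

```python
import numpy as np

def mean_heart_rate(r_times):
    """Eq. (6): beats per minute from the R-wave times (seconds) of one lead."""
    r = np.asarray(r_times, dtype=float)
    n = r.size
    return 60.0 * (n - 1) / np.sum(np.diff(r))

def heart_rate_consistency(r_times_per_lead):
    """Eqs. (7)-(10): inter-lead consistency of the mean heart rate."""
    mhr = np.array([mean_heart_rate(r) for r in r_times_per_lead])
    mmhr = mhr.mean()                                   # Eq. (7)
    sd = np.sqrt(np.mean((mhr - mmhr) ** 2))            # Eq. (8)
    value_range = mhr.max() - mhr.min()                 # Eq. (9)
    cv = sd / mmhr                                      # Eq. (10), as a fraction
    return {"MMHR": mmhr, "SD": sd, "Range": value_range, "CV": cv}

# Toy case: 11 leads beating once per second, one lead with a doubled rate.
regular = [0.0, 1.0, 2.0, 3.0]          # 60 beats per minute
fast = [0.0, 0.5, 1.0]                  # 120 beats per minute
consistent = heart_rate_consistency([regular] * 12)
inconsistent = heart_rate_consistency([regular] * 11 + [fast])
```

A perfectly consistent recording gives zero for all three spread metrics, while a single outlying lead, for example one with spurious R-peak detections, increases them.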

Diagnostic-level evaluations

This study also adopts a diagnostic-level evaluation for the 12-lead ECG reconstruction task. MCMA converts limited-lead (even single-lead) ECG into 12-lead ECG, which bridges limited-lead ECG to classifiers that were trained with 12-lead ECG as input. Therefore, the generated 12-lead ECG can be evaluated by classification performance, including precision (Pre), recall (Rec), specificity (Spe), and F1-score (F1), as in the literature5. These classification metrics are computed as in Eqs. (11), (12), (13), and (14).

$$Pre=\frac{TP}{TP+FP}$$
(11)
$$Rec=\frac{TP}{TP+FN}$$
(12)
$$Spe=\frac{TN}{TN+FP}$$
(13)
$${F}_{1}=\frac{2\times TP}{2\times TP+FN+FP}$$
(14)

In addition, the classification performance on the real 12-lead ECG serves as the standard reference, and the 12-lead ECG generated by other methods30,32,33,35 is used for comparison.
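For one arrhythmia class, Eqs. (11) to (14) can be computed from binary labels as in the sketch below. It is illustrative only: the function name `classification_metrics` is hypothetical, and no guard is included for classes without positive or negative samples, where a denominator would be zero.

```python
import numpy as np

def classification_metrics(y_true, y_pred):
    """Precision, recall, specificity and F1 for one binary class."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = int(np.sum(y_true & y_pred))      # true positives
    fp = int(np.sum(~y_true & y_pred))     # false positives
    tn = int(np.sum(~y_true & ~y_pred))    # true negatives
    fn = int(np.sum(y_true & ~y_pred))     # false negatives
    return {
        "Pre": tp / (tp + fp),                     # Eq. (11)
        "Rec": tp / (tp + fn),                     # Eq. (12)
        "Spe": tn / (tn + fp),                     # Eq. (13)
        "F1": 2 * tp / (2 * tp + fn + fp),         # Eq. (14)
    }

labels = [1, 1, 1, 0, 0, 0, 0, 1]
predictions = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = classification_metrics(labels, predictions)
```

In this toy example there are 3 true positives, 1 false positive, 3 true negatives, and 1 false negative, so all four metrics equal 0.75. For a multi-class report, the function would be applied to each class in turn.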

Results

Signal-level performance

First, the signal-level evaluation, based on MSE and PCC, is the primary evaluation. In contrast to conventional approaches, this scheme offers a distinct advantage: it converts an arbitrary single-lead ECG into a 12-lead ECG without training multiple generative models. The MSE and PCC results are shown in Table 5, where the horizontal direction represents the output lead and the vertical direction represents the input lead. The reconstruction performance on the external dataset, CPSC2018, is shown in Table 6.

Table 5 The signal-level evaluation of mean square error (MSE) and Pearson correlation coefficient (PCC) between the generated and real 12-lead ECG in the internal testing set, PTB-XL
Table 6 The signal-level evaluation of mean square error (MSE) and Pearson correlation coefficient (PCC) between the generated and real 12-lead ECG in an external testing set, CPSC2018

Feature-level performance

This study also provides the feature-level evaluation results for MCMA, including the standard deviation MHRSD, the range MHRRange, and the coefficient of variation MHRCV. The results on the internal testing set PTB-XL and the external testing set CPSC2018 are shown in Table 7 and Table 8, respectively. In both tables, the first group gives the reference values of the original 12-lead ECG. The R-peaks are detected with the algorithm in ref. 45.

Table 7 The feature-level evaluation for the generated and real 12-lead ECG in the internal testing dataset, PTB-XL
Table 8 The feature-level evaluation for the generated and real 12-lead ECG in the external testing dataset, CPSC2018

Diagnostic-level performance

Lastly, this study demonstrates the diagnostic-level performance of MCMA. The classifier, trained and validated by Ribeiro et al.5, accepts only 12-lead ECG as input, so it is essential to report its classification performance on the generated 12-lead ECG. For example, Table 9 shows the classification performance of the 12-lead ECG generated from lead I. The detailed diagnostic-level evaluations are shown in Table 10, including the original 12-lead ECG (as the reference), the single-lead ECG (i.e., the MCMA input), and the generated 12-lead ECG (i.e., the MCMA output), which directly shows the gain in the arrhythmia classification task.

Table 9 The diagnostic-level evaluation for MCMA, as the generated 12-lead ECG is from lead I ECG, CODE-test
Table 10 The diagnostic-level evaluation for the generated 12-lead ECG in another external testing dataset, CODE-test

Comparison with other methods

MCMA is compared with other research works, including Garg et al.30, Seo et al.32, and Joo et al.33. Garg et al.30 adopt lead II as input, while Seo et al.32 and Joo et al.33 use lead I. In contrast, MCMA can convert an arbitrary single-lead ECG into the standard 12-lead ECG. The signal-level, feature-level, and diagnostic-level comparisons are shown in Table 11, Table 12, and Table 13, respectively.

Table 11 The signal-level comparison of different methods in PTB-XL and CPSC2018
Table 12 The feature-level comparison of different methods in PTB-XL and CPSC2018
Table 13 The diagnostic-level comparison of different methods in CODE-test

Ablation study

MCMA relies on two key design choices: support for arbitrary single-lead ECG input and the zero-padding strategy. It is therefore necessary to compare it with alternative settings, namely the fixed-channel setting (with lead I as an example) and the copy-padding strategy. The signal-level evaluation metrics are the mean square error (MSE) and the Pearson correlation coefficient (PCC). The results for the different settings are shown in Table 14, for lead I and for the average over the 12 single-lead inputs. In most cases, MCMA achieves excellent results in the 12-lead ECG reconstruction task.

Table 14 The ablation study for the proposed framework, MCMA, which adopts the zero-padding strategy and supports arbitrary single-lead ECG as input

Case study

The training process of MCMA is illustrated in Fig. 3. To show the advantages of the proposed framework, the generated and real 12-lead ECG are shown in Fig. 4, in which the generated and real signals are colored blue and red, respectively. Figure 4 demonstrates the strong generation ability of the proposed framework. For this example, the average MSE and PCC between the generated and real 12-lead ECG are 0.0032 and 0.9560, indicating that the generator can produce a 12-lead ECG from a single-lead ECG. Besides the internal testing dataset (i.e., PTB-XL), the reconstruction performance on the external testing dataset (i.e., CPSC2018) demonstrates the advantages of the proposed framework from another aspect, as seen in Fig. 5.

Fig. 3: The mean square error and Pearson correlation coefficient in the training process.

The red circle means training loss, the blue star means validation loss, the black circle means training Pearson correlation coefficient (PCC), and the black star means validation Pearson correlation coefficient (PCC).

Fig. 4: The 12-lead ECG reconstruction performance in the internal testing set PTB-XL, the red lines are the real signals while the blue lines represent the generated signals.

Top: the real and reconstructed signal of lead I, lead II, lead III. Middle: the real and reconstructed signal of lead aVR, lead aVL, lead aVF, lead V1, lead V2, lead V3. Bottom: the real and reconstructed signal of lead V4, lead V5, lead V6.

Fig. 5: The 12-lead ECG reconstruction performance in the external testing set CPSC2018, the red lines are the real signals while the blue lines represent the generated signals.

Top: the real and reconstructed signal of lead I, lead II, lead III. Middle: the real and reconstructed signal of lead aVR, lead aVL, lead aVF, lead V1, lead V2, lead V3. Bottom: the real and reconstructed signal of lead V4, lead V5, lead V6.

The results in Fig. 4 and Fig. 5 show that the multi-channel masked autoencoder (MCMA) can reconstruct the 12-lead ECG from a single-lead ECG. In clinical practice, the ECG collected by wearable devices can have varying signal lengths rather than a fixed length. It is therefore necessary to demonstrate that the proposed framework also works with variable-duration ECG signals, and the reconstruction result for a 10-s ECG is shown in Fig. 6. In this case, the 5000 points are extended with 120 extra points to reach 5120 points, which are then treated as 5 individual samples of length 1024 for MCMA to reconstruct the 12-lead ECG from the single-lead input.
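The segmentation of a 10-s recording (5000 points) into five 1024-point samples can be sketched as follows. The text does not say where the 120 extra points are placed or what values they take, so appending zeros at the end is an assumption of this sketch; the function names are hypothetical.

```python
import numpy as np

def split_into_segments(ecg, seg_len=1024):
    """Pad a (leads x N) signal at the end with zeros (an assumption) so that
    N becomes a multiple of seg_len, then cut it into segments.
    Returns an array of shape (segments, leads, seg_len) and the pad length."""
    ecg = np.asarray(ecg, dtype=float)
    n_leads, n = ecg.shape
    n_seg = -(-n // seg_len)                       # ceiling division
    pad = n_seg * seg_len - n
    padded = np.pad(ecg, ((0, 0), (0, pad)))       # zero fill on the time axis
    segments = padded.reshape(n_leads, n_seg, seg_len).transpose(1, 0, 2)
    return segments, pad

def merge_segments(segments, original_len):
    """Inverse of split_into_segments: concatenate segments and drop the padding."""
    n_seg, n_leads, seg_len = segments.shape
    merged = segments.transpose(1, 0, 2).reshape(n_leads, n_seg * seg_len)
    return merged[:, :original_len]

# Toy 10-s, 12-lead recording at 500 Hz.
recording = np.arange(12 * 5000, dtype=float).reshape(12, 5000)
segments, pad = split_into_segments(recording)
restored = merge_segments(segments, recording.shape[1])
```

Each of the five segments would be passed through the model separately, and the five generated 12-lead outputs would then be merged and trimmed back to 5000 points in the same way.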

Fig. 6: The generated and real 10-s 12-lead ECG, demonstrating its advantages for variable-duration ECG reconstruction, the red lines are the real signals while the blue lines represent the generated signals.

From the top to the bottom, the signals are lead I, lead II, lead III, lead aVR, lead aVL, lead aVF, lead V1, lead V2, lead V3, lead V4, lead V5, lead V6.

Discussion

In this work, we propose a multi-channel masked autoencoder, MCMA, for generating the standard 12-lead ECG from an arbitrary single-lead ECG. Further, this study establishes a comprehensive evaluation benchmark, ECGGenEval, including signal-level, feature-level, and diagnostic-level evaluation. MCMA works well in ECGGenEval, achieving state-of-the-art performance. First, MCMA can convert an arbitrary single-lead ECG into a 12-lead ECG, rather than only a fixed-lead ECG30,32,33,35. Second, we provide multiple-level evaluation results on one internal and two external testing datasets, and the details are as follows.

Firstly, the signal-level evaluation results in Tables 5 and 6 show that the proposed framework can reconstruct a high-fidelity 12-lead ECG from a single-lead ECG. The average MSE and PCC in PTB-XL are 0.0175 and 0.7772, while the average MSE and PCC in CPSC2018 are 0.0654 and 0.7287, respectively. The reconstruction performance on the internal and external testing datasets demonstrates that MCMA can reconstruct the standard 12-lead ECG with an arbitrary single-lead ECG as input. Therefore, the proposed method can provide a feasible solution when collecting the standard 12-lead ECG is inconvenient or difficult, as in remote cardiac healthcare. In the signal-level comparison, the MSE and PCC for generating the 12-lead ECG from lead II are 0.0178 and 0.8088, better than Garg et al.30, with an MSE of 0.0292 and a PCC of 0.7981. Therefore, MCMA can be used for 12-lead ECG reconstruction whenever a single-lead ECG is collected, and the signal-level evaluation supports it as a novel solution for real-world cardiac healthcare applications.

Secondly, Tables 7 and 8 report the feature-level evaluation. For the internal testing dataset, PTB-XL, Table 7 demonstrates that the heart rates estimated from different leads of the generated 12-lead ECG are similar, and even more consistent than those of the original 12-lead ECG. The heart rates estimated from the real 12-lead ECG may differ across leads because noise exists in particular channels. Table 7 shows that the average MHRSD, MHRCV, and MHRRange are 1.0481, 1.58%, and 3.2874, and the optimal result comes from the 12-lead ECG generated from lead V4. Table 8 shows the external evaluation in CPSC2018, where the average MHRSD, MHRCV, and MHRRange are 0.9483, 1.24%, and 3.0916, and the optimal result comes from the 12-lead ECG generated from lead II. The 12-lead ECG generated from an arbitrary single-lead ECG shows good heart rate consistency across leads, and it can even be better than the original 12-lead ECG in some cases, due to the ECG denoising effect of the proposed framework. Table 12 demonstrates the advantages of MCMA over the other methods, with the best results highlighted in red. Therefore, the feature-level evaluation also demonstrates the advantages of MCMA.

Based on Table 9, the classifier can use the generated 12-lead ECG for arrhythmia classification. The average F1-score over 6 classes is 0.8319. This shows that MCMA can convert the single-lead ECG into a 12-lead ECG that retains pathological information, an aspect that the signal-level and feature-level evaluations do not capture. Therefore, with the multi-channel masked autoencoder, it is possible to perform arrhythmia classification with a single-lead ECG, such as lead I in Table 9. Further, according to Table 10, the classification performance of the generated 12-lead ECG is better than that of the single-lead ECG and similar to that of the real 12-lead ECG, which demonstrates the classification gain brought by MCMA. The 12-lead ECG generated from lead I provides the closest classification performance, with an average F1 of 0.8319, which exceeds the other cases. According to Table 13, the classification performance with the generated 12-lead ECG is improved. For example, taking lead II as input, Garg et al.30 achieve an F1 of 0.7807, lower than the proposed method. Similarly, with lead I as input, Seo et al.32 and Joo et al.33 have an F1 of 0.8299 and 0.7730, respectively, while MCMA has an F1 of 0.8223. From the perspective of the classification task, the results in the above tables demonstrate that the generated 12-lead ECG can be used for cardiac abnormality detection, which shows the advantage of MCMA in bridging single-lead and 12-lead ECG and its effectiveness in generating pathological information from a single-lead input.

As Table 14 shows, the proposed framework is effective. The multi-channel strategy allows an arbitrary single lead to generate the 12-lead ECG, and its reconstruction performance for lead I is comparable to that of the fixed-channel setting: with lead I as input, the fixed-channel model has an MSE of 0.0176 and a PCC of 0.7879, while MCMA has an MSE of 0.0177 and a PCC of 0.7906. However, the fixed-channel setting cannot easily reconstruct the 12-lead ECG from other leads, and supporting all leads would require training and storing 12 separate models, at a much higher training and inference cost. Further, the zero-padding strategy is better than the copy-padding strategy, although both strategies support 12-lead reconstruction from an arbitrary single-lead ECG. The mean MSE and PCC in MCMA are 0.0175 and 0.7772, while the mean MSE and PCC with copy-padding are 0.0197 and 0.7198, respectively.

This study has the following advantages from the engineering and clinical perspectives. Firstly, the generated signal is similar to the original signal, with mean square errors of 0.0175 and 0.0654 and correlation coefficients of 0.7772 and 0.7287 in the signal-level evaluation. Secondly, the generated signal can be used for arrhythmia classification, with average F1 scores of 0.8233 and 0.8410 for the two generated 12-lead ECGs in the diagnostic-level evaluation. Based on these advantages, the contributions are as follows:

Further, this study is expected to offer a feasible solution for wearable ECG monitoring and to increase the clinical value of arbitrary single-lead ECG. The experiments in this study were conducted on public datasets, namely PTB-XL and CPSC2018. Naturally, the study has some limitations that should be addressed in future work. First, the quality of ECG signal acquisition can significantly affect reconstruction performance, which may be addressed in the sensing layer46 or the algorithmic layer43. Second, the generated signals need to be evaluated by professional clinicians to determine whether they are a viable long-term substitute for the conventional 12-lead ECG in continuous monitoring scenarios. In other words, the question is whether a physician can reach an equivalent diagnosis using the 12-lead ECG generated by MCMA. Additional research is therefore needed to resolve these problems and to realize the clinical relevance and practical utility of the framework.

In summary, this study proposes a novel generative framework, the multi-channel masked autoencoder (MCMA), to reconstruct the 12-lead ECG from a single-lead ECG, and it makes two main contributions. Firstly, unlike other methods, the proposed framework can convert an arbitrary single-lead ECG into the standard 12-lead ECG. Secondly, the experimental results show that the proposed framework achieves state-of-the-art performance on the proposed benchmark, ECGGenEval, which includes signal-level, feature-level, and diagnostic-level evaluation. For example, the average Pearson correlation coefficients on the internal and external testing sets are 0.7772 and 0.7287, outperforming the related approaches. Additionally, the zero-padding strategy plays an important role in the proposed framework and outperforms the copy-padding strategy. In the future, high-quality ECG acquisition and clinical validation need to be studied so that the proposed framework, which provides a novel and feasible solution for long-term cardiac health monitoring, can play an important role in clinical practice.
