Virtual sensing-enabled digital twin framework for real-time monitoring of nuclear systems leveraging deep neural operators

Introduction
Proper monitoring and inspection of in-service components in nuclear reactors are essential for long-term safety and efficiency, as these components are continuously subjected to extreme temperatures, pressures, and radiation. Among these components, the primary circuit is particularly important, as it removes the immense heat generated in the reactor core, acting as the central component of the cooling system. This loop carries highly pressurized water at high velocities, creating significant mechanical stresses on the piping system1,2. Turbulent coolant flow, especially in areas with bends and joints, leads to localized disturbances3,4,5 that can induce material degradation mechanisms such as erosion, fatigue, and stress corrosion cracking. Given the significant role of the primary coolant loop, real-time monitoring is essential for detecting early signs of degradation and preventing failures. Tracking key parameters such as pressure, velocity, and turbulence enables the detection of deviations from normal operational conditions, such as flow reductions, vibrations, or pressure drops, which can serve as early indicators of potential material degradation. These insights provide early warnings, often before structural damage occurs, ensuring safe and efficient reactor operation while reducing risks associated with material degradation.
Non-destructive testing (NDT) methods, such as eddy-current, ultrasonic, radiographic, and visual inspections, are considered standard practices for degradation monitoring in nuclear reactors6,7. These methods effectively identify various forms of material degradation, including cracks, corrosion, and fatigue, without damaging the system itself. However, they are usually performed during scheduled outages, resulting in significant revenue losses. Additionally, nuclear piping systems often extend over several miles, making it challenging to inspect the entire system within a limited outage period7. In-line inspections (ILI) are widely recognized as a standard approach for assessing internal pipeline conditions, including wall thinning, corrosion, and structural integrity8,9. However, they are also conducted periodically and share the same limitations as NDT methods. Thus, while traditional periodic inspection methods remain valuable, they present challenges in addressing material degradation during routine operations. Continuous monitoring of coolant flow in real-time offers a promising alternative to scheduled inspections. By identifying anomalies in operational parameters as they occur, real-time monitoring enables prompt scheduling of inspections to address specific concerns before they escalate into critical issues. Current pipeline monitoring systems use advanced simulation models to enable real-time monitoring of pressure conditions. These models offer valuable insights into operational parameters, particularly in areas where direct measurements are either impractical or impossible9. However, such simulation models are slow to produce results, which makes them unsuitable for continuous real-time monitoring. Real-time monitoring can address the shortcomings of periodic inspections if it can provide uninterrupted oversight of system parameters.
This uninterrupted view of the entire reactor can be achieved through Digital Twin technology. A Digital Twin creates a realistic virtual replica of the physical asset, allowing for real-time and synchronized monitoring, control, and prediction of system behavior10,11,12. By dynamically updating itself with real-time data, a Digital Twin provides a comprehensive and synchronized view of the reactor’s operational state. This capability enables proactive maintenance, fault detection, and optimization of reactor operations, reducing the reliance on periodic inspection and minimizing operational risks. By continuously integrating operational data, Digital Twin frameworks enable predictive modeling of degradation processes, such as material fatigue and erosion-corrosion, ensuring timely interventions. A Digital Twin framework relies heavily on sensors, which serve as the connection between the physical reactor and its digital counterpart13. Sensors for coolant flow monitoring in nuclear reactors include ultrasonic, electromagnetic, and thermal dispersion flow meters14,15, as well as sensors for key parameters like temperature, pressure, and radiation16,17,18. These sensors are strategically placed at critical locations, such as the inlet and outlet of the reactor core, steam generators, and other components, to ensure continuous monitoring of system conditions19, as well as to preserve sensor health and minimize degradation20,21. However, for a successful Digital Twin framework, comprehensive coverage of the entire reactor system is needed, which cannot be achieved solely through physical sensors. While placing sensors at the inlet and outlet of piping segments is feasible, installing them within pipe sections is impractical, as it would disturb the flow and poses logistical challenges. Though distributed fiber-optic (DFO) sensing can address coverage limitations, it is very expensive and cannot be implemented in existing reactors.
Virtual sensors can address these limitations of physical sensors. They are software-based models that estimate physical quantities using data and simulations rather than direct physical measurements22,23,24. They mimic the behavior of physical sensors, providing readings without requiring the installation of actual hardware25,26,27. Integrated within a Digital Twin, virtual sensors utilize data from sporadically placed physical sensors to predict values in unseen or unmonitored areas, providing critical insights into system conditions in real-time and offering a complete view of the reactor’s operational state. Unlike physical sensors, virtual sensor-based Digital Twin frameworks are not constrained by installation or environmental limitations, making them ideal for monitoring hard-to-reach or harsh reactor environments28. Furthermore, Digital Twin frameworks utilize existing sensor networks to operate virtual sensors, reducing the need for modifications to established reactor designs while still expanding monitoring capabilities. They also improve data reliability by reducing signal interference and are easily adjustable to accommodate changes in system requirements29,30. This adaptability ensures the Digital Twin remains accurate and reliable, even when physical sensors degrade or fail.
While virtual sensing is still a relatively new concept in nuclear applications, its potential to enhance system reliability and efficiency has been demonstrated over the past years. For instance, Sevilla et al.31 showed how neural networks could estimate variables in pressurized water reactors by optimizing input selection and network architecture. Ahmed et al.28 developed virtual sensor networks for accident monitoring in nuclear plants, while Tipireddy et al.32 introduced Gaussian process-based virtual sensors to replace faulty physical sensors, reducing unscheduled downtime. These examples underscore the critical role virtual sensing plays in Digital Twin frameworks, enabling real-time monitoring, fault tolerance, and system optimization in nuclear reactors.
Emerging neural network technologies, such as the Deep Operator Network (DeepONet)33, Fourier neural operators (FNO)34, and physics-informed neural networks (PINN)35 provide powerful tools for developing virtual sensors. These models are adept at solving partial differential equations (PDEs) by approximating nonlinear operators and learning mappings between functional spaces. This makes them highly suitable for modeling and predicting the behavior of complex, interdependent systems, such as nuclear reactors. Among these algorithms, the DeepONet has demonstrated greater efficiency compared to others. Lu et al., in their recent paper36, show that DeepONet exhibits superior adaptability over FNO in accommodating diverse problem formulations and dataset structures. PINNs, likewise, require separate simulations or retraining for each new parameter, and they struggle to approximate PDEs with the strong non-linearity commonly found in practical fluid flow problems. In contrast, DeepONet alleviates the need for retraining by learning operators that map entire functions, rather than specific input-output pairs, across different conditions. Once trained, DeepONet can generalize to a wide range of new inputs without requiring retraining for each new scenario, as it understands the underlying functional relationships between inputs and outputs. This is advantageous for real-time applications, adapting to varying conditions without frequent retraining. DeepONet approximates both linear and nonlinear PDE solution operators by leveraging parametric functions as inputs and mapping them to corresponding output spaces, thereby eliminating retraining requirements37.
A well-trained neural operator offers the computational efficiency needed for real-time or near real-time predictions, which is crucial for control system optimization12,38. These advantages make DeepONet our preferred algorithm for establishing the Digital Twin. Unlike traditional finite element/volume (FEM/FVM) simulations, which are computationally intensive and time-consuming, DeepONet generates predictions orders of magnitude faster, making it ideal for real-time applications. By training on high-fidelity simulations and experimental data, DeepONet accurately predicts coolant flow behavior under various operating conditions, which is central to developing a virtual sensor framework for nuclear reactors. The key advantage of using DeepONet for our problem is its trunk network, which processes spatial coordinates to evaluate the output function33,38. This network predicts parameters at various pipe locations based on input data from a single location, providing comprehensive pipe condition inference. Combined with near real-time capability, DeepONet provides highly accurate estimates of flow velocity, turbulence, pressure, and temperature at key primary circuit locations without physical sensors.
In this work, we study the hot leg as the representation of the primary coolant system to demonstrate DeepONet’s effectiveness. The hot leg, consisting of horizontal and vertical conduits with elbows, transports high-pressure, high-temperature water from the reactor pressure vessel (RPV) to the steam generator (SG). Leakage in the hot leg can cause serious issues, including a loss of coolant accident (LOCA), impairing the reactor’s cooling capacity, and posing radiological risks. Monitoring coolant conditions in the hot leg is crucial to ensure proper heat transfer and reactor safety. However, real-time monitoring of key parameters like pressure, velocity, and turbulence kinetic energy inside the hot leg is challenging due to critical operating conditions. We focus on developing a real-time monitoring model of thermal-hydraulic conditions for the hot leg of the AP1000. We use the coolant inlet velocity, already monitored through existing sensors, assuming normal PWR operational conditions. Based on this information, we develop a model to predict thermal-hydraulic parameters (pressure, velocity, and turbulence kinetic energy) in the hot leg’s central plane. Monitoring these parameters is important for assessing heat transfer performance and ensuring structural integrity. For instance, high or fluctuating pressure stresses piping, causing wear, fatigue, and cracking, while sudden drops in pressure may signal blockages or leaks, accelerating degradation. Monitoring velocity is equally important, as high coolant flow velocity, particularly in areas where the flow direction changes, can cause flow-induced vibrations. These vibrations lead to mechanical fatigue and increase the risk of erosion and corrosion, gradually thinning the pipe walls. Similarly, turbulence, especially near bends and joints, amplifies stress on the piping structure, accelerating fatigue and the erosion-corrosion process. It also introduces hydraulic shocks and uneven temperature distributions, further weakening material integrity over time. Monitoring these parameters in real-time enables operators to correlate abnormal flow patterns with potential degradation, allowing for timely maintenance decisions. Given their role in degradation, our goal is to develop a real-time thermal-hydraulic monitoring model for the hot leg of a PWR.
The key contribution of our work is demonstrating the feasibility and benefits of using DeepONet within a Digital Twin framework for real-time monitoring of coolant flow in the primary circuit of a nuclear reactor. By integrating virtual sensors, our study highlights the potential of Digital Twins to significantly improve the monitoring and inspection of reactor components. Through comprehensive simulations and experimental validation, we show how this approach delivers accurate and timely data, enabling the detection of operational conditions indicative of degradation, facilitating proactive maintenance, and optimizing reactor operations.
Results
Data processing
The data for this study was generated using ANSYS Fluent, focusing on fluid dynamics within the hot leg elbow joint of an AP-1000 LWR nuclear reactor. The average inlet velocity range for the simulations was set between 0.63 and 0.83. The computational mesh consisted of a total of 11,340 nodes, distributed along the central plane of the hot leg to capture the coolant flow characteristics throughout the pipe section. For this study, we performed a total of 5000 simulations. This dataset was subsequently divided into a training and testing set in an 80%–20% ratio, resulting in 4000 scenarios for training and 1000 scenarios for testing. Importantly, the 1000 test data points were never touched during the training or validation process. They remained completely unseen by the model and were used exclusively at the final stage to evaluate the model’s performance on unseen data. This setup was designed to monitor coolant behavior in near-real-time, capturing detailed velocity, turbulence, and pressure variations to detect any anomalies effectively.
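A minimal sketch of this data partition is shown below, assuming the 5000 scenarios are held as NumPy arrays; the array names and the use of scikit-learn's train_test_split are illustrative assumptions, not the original pipeline.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder arrays standing in for the 5000 simulated scenarios:
# one average inlet velocity per scenario and three fields at 11,340 nodes.
inlet_velocities = np.random.uniform(0.63, 0.83, size=(5000, 1))
fields = np.random.rand(5000, 3, 11_340)  # (pressure, velocity, TKE), placeholder values

# 80/20 split; the 1000 test scenarios are held out until the final evaluation.
u_train, u_test, y_train, y_test = train_test_split(
    inlet_velocities, fields, test_size=0.2, random_state=42
)
print(u_train.shape, u_test.shape)  # (4000, 1) (1000, 1)
```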
To ensure the DeepONet model properly learns the values of different parameters, we addressed the issue of varying ranges among these parameters. For instance, the pressure values range between approximately −231.25 and 132.7, while the turbulence kinetic energy values range between approximately 0.000875 and 0.019015. Such discrepancies in value ranges can hinder the model’s learning process. To mitigate this difference, we applied min-max scaling to normalize all the values between 0 and 1.
$$X_{\mathrm{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}},$$

where $X$ is the original value, $X_{\min}$ is the minimum value in the dataset, $X_{\max}$ is the maximum value in the dataset, and $X_{\mathrm{scaled}}$ is the normalized value in the range [0, 1].
Figure 1 provides a visual representation of this scaling process. The histograms in the top row display the original values of turbulence kinetic energy, pressure, and velocity. The bottom row shows the scaled values of these parameters after applying min-max scaling. From these histograms, it is evident that the original values of turbulence (a), pressure (b), and velocity (c) are distributed across different ranges. The original turbulence values are densely packed near the lower end of the range, with a significant peak around 0.0025. The pressure values exhibit a broader distribution with notable peaks around −50 and 50. The velocity values are concentrated between 0.6 and 0.8, with a pronounced peak near 0.75. After scaling, the turbulence values (d) are now spread across the [0,1] range, maintaining the original distribution shape but normalized. The scaled pressure (e) and velocity (f) values are also within the range of [0,1], preserving the distribution characteristics seen in the original data. By scaling these parameters, we ensure that the DeepONet model can learn effectively from the data, with each parameter contributing proportionately to the training process. This step helps improve the model’s accuracy and reliability in predicting coolant behavior.

This figure displays the histograms of the turbulence, pressure, and velocity parameters before and after min-max scaling. a–c The original distributions of a turbulence, b pressure, and c velocity, each exhibiting distinct distribution patterns. d–f The min-max scaled distributions of d turbulence, e pressure, and f velocity, normalized to the range [0, 1] for consistent analysis.
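For illustration, the per-parameter min-max scaling can be written compactly as follows; this is a minimal sketch assuming the raw CFD fields are held in NumPy arrays keyed by parameter name, and the function names are hypothetical.

```python
import numpy as np

def min_max_scale(fields):
    """Scale each parameter's values to [0, 1] and keep the min/max to invert later.

    `fields` maps a parameter name (e.g., "pressure") to an array of shape
    (n_scenarios, n_nodes) holding its raw values at the 11,340 nodes.
    """
    scaled, stats = {}, {}
    for name, values in fields.items():
        v_min, v_max = values.min(), values.max()
        scaled[name] = (values - v_min) / (v_max - v_min)
        stats[name] = (v_min, v_max)
    return scaled, stats

def inverse_scale(scaled_values, v_min, v_max):
    """Map model outputs in [0, 1] back to physical units."""
    return scaled_values * (v_max - v_min) + v_min
```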
Training process
The training process for this study is designed to ensure that the model fits the training data well while also generalizing effectively to unseen data. Initially, the training dataset is split into two parts: 80% for training and 20% for validation. The final model, including the linear layer, was trained for 100 epochs to ensure convergence with this specific dataset, activation function, and optimizer. The number of hidden layers for the final model was fixed at three, but the number of neurons in each layer was determined through hyperparameter tuning.
Hyperparameter tuning for the final model with linear layers was achieved through Bayesian optimization using the Optuna framework, which efficiently searches for optimal parameters. The loss function employed was a mean squared error (MSE), and the model was optimized using the Adam optimizer with L2 regularization to prevent overfitting. Training was conducted over 100 epochs, with early stopping implemented to halt training if performance did not improve within 5 consecutive epochs, ensuring efficient use of computational resources. The model underwent 5-fold cross-validation to assess its generalization capabilities across different subsets of the dataset. This process ensures that each data point is used for both training and validation, providing a comprehensive evaluation of the model’s performance. The average performance across all five folds gives a reliable estimate of how well the model will generalize to new data. The total training duration was 19 hours and 38 minutes. The computations for the training and evaluation tasks were performed on a computational node with a single NVIDIA A100 GPU within the Delta cluster hosted by the National Center for Supercomputing Applications (NCSA). While hyperparameter tuning determined most parameters, the number of linear layers was decided through manual inspection based on observed improvements in performance. Table 1 summarizes the ranges and best values found for each parameter.
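The sketch below outlines how such a search could be set up with Optuna; the search ranges and trial budget are illustrative rather than the exact values in Table 1, and train_and_validate is a placeholder for the 5-fold training routine (100 epochs, Adam with L2 regularization, early stopping with a patience of 5).

```python
import optuna

def train_and_validate(config):
    # Placeholder: train the DeepONet with `config` over 5 folds and return
    # the mean validation MSE. A dummy value keeps the sketch runnable.
    return config["dropout"] + config["lr"]

def objective(trial):
    config = {
        "hidden_size":  trial.suggest_categorical("hidden_size", [128, 256, 512]),
        "dropout":      trial.suggest_float("dropout", 0.0, 0.5),
        "weight_decay": trial.suggest_float("weight_decay", 1e-8, 1e-3, log=True),
        "lr":           trial.suggest_float("lr", 1e-4, 1e-2, log=True),
    }
    return train_and_validate(config)

study = optuna.create_study(direction="minimize")  # TPE (Bayesian) sampler by default
study.optimize(objective, n_trials=50)             # trial budget is an assumption
print(study.best_params)
```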
After confirming that the model generalizes well through cross-validation, the next step is to train the model on the entire dataset for an extended period of 1000 epochs. For this final training phase, the hidden layer configuration was [512, 512, 512], the dropout rate was set to 0.0, the weight decay rate was 10⁻⁸, and the learning rate was 0.001. Training on the whole dataset allows the model to learn from all available data, maximizing its potential to capture complex patterns. The extended training period ensures that the model has sufficient time to converge to an optimal solution.
After training the model on the entire dataset, the best model is then used to make predictions on the test dataset. The test dataset consists of data points that were not used during training or validation, providing an unbiased evaluation of the model’s performance. These 1000 test data points remained completely unseen by the model and were used exclusively at this stage to evaluate the model’s performance on unseen data. To evaluate the model on the test dataset, the following quantities were calculated for each test scenario:

$$\mathrm{MSE}=\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2,\qquad \mathrm{MAE}=\frac{1}{N}\sum_{i=1}^{N}\left|y_i-\hat{y}_i\right|,\qquad \text{Relative } L_2 \text{ error}=\frac{\lVert y-\hat{y}\rVert_2}{\lVert y\rVert_2},$$

where $y_i$ represents the true values generated through the finite volume method, $\hat{y}_i$ represents the predicted values from the neural network, and $N$ is the total number of data points.
The values reported in the tables are calculated as the averages over the 1000 test scenarios in the test set, ensuring an unbiased evaluation of model performance. The average values for mean square error (MSE), mean absolute error (MAE), and Relative L2 error are computed as follows:
$$\overline{\mathrm{Metric}}=\frac{1}{M}\sum_{j=1}^{M}\mathrm{Metric}_j,$$

where $\mathrm{Metric}_j$ is the value of the metric (MSE, Relative L2 error, MAE) for the j-th test scenario and $M$ is the total number of test scenarios (1000 in this case). The formula aggregates the metric over all test scenarios and divides by the total number of scenarios to compute the average.
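A compact sketch of how these per-scenario metrics and their averages could be computed is given below; y_true and y_pred are hypothetical arrays holding, for one parameter, the FVM reference values and the DeepONet predictions for each test scenario.

```python
import numpy as np

def scenario_metrics(y_true, y_pred):
    """Per-scenario MSE, MAE, and Relative L2 error.

    Both arrays have shape (M, n_nodes): M test scenarios, one value per node.
    """
    err = y_pred - y_true
    mse = np.mean(err**2, axis=1)
    mae = np.mean(np.abs(err), axis=1)
    rel_l2 = np.linalg.norm(err, axis=1) / np.linalg.norm(y_true, axis=1)
    return mse, mae, rel_l2

def averaged_metrics(y_true, y_pred):
    """Average each metric over the M test scenarios, as reported in the tables."""
    mse, mae, rel_l2 = scenario_metrics(y_true, y_pred)
    return mse.mean(), mae.mean(), rel_l2.mean()
```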
Vanilla DeepONet vs DeepONet with linear layers
To support our modification of the original model with added linear layers for each parameter separately, we present a comparison between the DeepONet model with and without these linear layers. This comparison is conducted purely to evaluate the impact of adding linear layers, and the Vanilla DeepONet model is used solely for this purpose.
To match the total number of learnable parameters between the two models, the Vanilla DeepONet was configured with 11 hidden layers in the branch network and 10 in the trunk network, each hidden layer containing 4096 neurons, giving a total of 365,431,887 learnable parameters. Our model with linear layers, in contrast, has 3 hidden layers with 512 neurons each; with the three additional linear layers at the end, its total number of learnable parameters is 361,499,727. This makes the two models comparable. Table 2 shows the average value, standard deviation, and maximum value of the mean square error (MSE) and Relative L2 error on the test dataset for each model, evaluated separately for the three parameters: pressure, velocity, and turbulence kinetic energy. It is important to note that no hyperparameter tuning was performed for the Vanilla DeepONet model. The parameters for this model were taken to match those of the DeepONet model with linear layers, as described in the Methods section. Since the Vanilla DeepONet model is not used for further evaluation or visualization, hyperparameter tuning was not necessary.
Analyzing the data in Table 2, it is evident that the model with linear layers consistently outperforms the original DeepONet model across all parameters. For pressure (P), the MSE average decreases from 0.00048 to 0.00026, and the Relative L2 error average drops from 0.03106 to 0.0204, indicating a significant improvement in predictive accuracy. Similarly, for velocity (Vo), the model with linear layers shows a reduction in the MSE average from 0.00281 to 0.00141, and a decrease in the Relative L2 error average from 0.07934 to 0.05184. Turbulence kinetic energy (k) also benefits from the linear layers, with the MSE average reducing from 0.00105 to 0.00054, and the Relative L2 error average declining from 0.15924 to 0.10573. These enhancements demonstrate the efficacy of the added linear layers in refining the model’s alignment with target values, thus validating our modifications.
Impact of data splits and node counts on model accuracy
To evaluate the impact of different train-test splits on the model’s performance, the dataset was divided into the following ratios: 70–30%, 80–20%, and 90–10%. The model was trained on these varying fractions of the training set, and the average mean squared error (MSE) and Relative L2 error were calculated for each test split. It was hypothesized that the model’s performance would degrade with a smaller training dataset, as less data typically leads to poorer model performance. However, the results were surprisingly consistent across different splits, as shown in Table 3.
As seen in Table 3, the performance metrics (MSE and Relative L2 error) for pressure, velocity, and turbulence remain very close across different training data fractions. This consistent performance across various training sizes demonstrates that the model is inherently robust and capable of achieving high performance even with varying amounts of training data. The model has effectively learned the underlying patterns of the dataset, ensuring reliable predictions. Each of the training datasets, from 70% to 90%, contains enough data for the model to generalize well, thus explaining the minimal performance differences. The model’s effective regularization techniques prevent overfitting, ensuring stable performance across different training sizes. These findings confirm that the model has been trained with a sufficient amount of data and performs well even with reduced training data. This robust performance implies that the model can be effectively used in scenarios with limited data availability. Moreover, the ability to maintain accuracy with varying training sizes indicates that the model is versatile and reliable. For the rest of the studies and demonstrations, we have chosen the 80–20 train-test split as it is a standard practice in the field, providing a good balance between training and testing data.
In a separate experiment, the train-test split was kept constant at 80%–20% while varying the number of nodes used for training. The original number of nodes was 11,340, and this was compared to a reduced number of 2835 nodes, which is one-fourth of the original number. The results, presented in Table 4, show that the model provided consistent performance even when the number of nodes was reduced by three-fourths.
As shown in Table 4, the performance metrics (MSE and Relative L2 error) for pressure, velocity, and turbulence remain consistent across the different numbers of nodes. This suggests that the model’s performance is not heavily dependent on the number of nodes used for training, indicating that the model is capable of maintaining accuracy even with fewer nodes. This robustness implies that the model can be effectively scaled down, which can be advantageous for computational efficiency without sacrificing predictive accuracy.
These two experiments demonstrate the consistency of the model’s performance across varying training data sizes and different numbers of nodes, indicating its potential reliability in diverse scenarios. These findings highlight that the model is well-trained with the available data and can still perform effectively even with smaller datasets, making it practical for various applications and computationally efficient.
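As a concrete illustration of the node-reduction experiment, the sketch below subsamples the trunk-network coordinate set from 11,340 to 2835 nodes; the paper does not state how the reduced set was chosen, so uniform random subsampling and the placeholder arrays are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_total, n_reduced = 11_340, 2_835

coords = np.random.rand(n_total, 3)          # placeholder (x, y, z) node coordinates
targets = np.random.rand(4000, 3, n_total)   # placeholder scaled training targets

subset = rng.choice(n_total, size=n_reduced, replace=False)
coords_reduced = coords[subset]              # nodes passed to the trunk network
targets_reduced = targets[:, :, subset]      # matching target values
```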
Performance evaluation
Building on our previous studies—comparing Vanilla DeepONet to DeepONet with Linear Layers, and analyzing the impact of data splits and node counts on model accuracy—we now focus on a comprehensive analysis of the modified DeepONet with linear layers using an 80–20 train-test split. This section provides a comprehensive evaluation of the model’s performance across the key parameters of turbulence kinetic energy, velocity, and pressure. Table 5 summarizes the average mean absolute error (MAE), mean squared error (MSE), and Relative L2 error for the individual parameters on the test dataset. The results indicate that the model performs best in predicting pressure, as evidenced by the lowest values in MSE, MAE, and Relative L2 error. In contrast, while the MSE and MAE for velocity and turbulence remain relatively small, the model shows higher errors for velocity. Turbulence exhibits the worst performance in terms of Relative L2 error, with a relative error of 10.58%.
To further illustrate the model’s performance, we provide a detailed error distribution analysis using histograms in Fig. 2. These histograms help us better understand the frequency distribution of MSE and Relative L2 error across the test dataset for each parameter, offering deeper insight into the variability and consistency of the model’s predictions. Panel (a) presents the distribution of MSE values for pressure predictions, indicating a concentration of lower error values, which suggests a strong model performance in pressure estimation. Similarly, panel (c) shows the model’s MSE distribution for velocity predictions and panel (e) for the turbulence kinetic energy. The right-hand panels, (b), (d), and (f), illustrate the Relative L2 error percentages. The embedded statistics within these panels (mean, standard deviation, and quantiles) summarize the error distribution, offering a comprehensive perspective on the model’s performance.

a, b The histograms for pressure predictions. a displays the MSE distribution, with most errors concentrated at lower values. b shows the Relative L2 Error, which reflects the model’s consistent predictive accuracy relative to the magnitude of true pressure values. c, d The histograms for velocity predictions. c presents the MSE distribution, revealing a wider spread of errors compared to pressure. d shows the Relative L2 Error. e, f The histograms for turbulence kinetic energy predictions. e shows the MSE distribution, indicating moderate absolute errors. f presents the Relative L2 Error, which is higher compared to pressure and velocity.
The histograms for pressure predictions (Panels (a) and (b)) show a concentration of lower error values, indicating a strong and consistent model performance in estimating pressure. The narrow spread of MSE and Relative L2 error values reflects the model’s ability to make accurate and reliable pressure predictions with minimal variability. For velocity predictions (Panels (c) and (d)), the MSE values exhibit a larger spread, suggesting greater variability in the model’s accuracy for this parameter. The higher MSE for velocity indicates that, on average, the errors in velocity predictions are larger compared to those in pressure predictions. However, the Relative L2 error for velocity is lower than that for turbulence, which means that, relative to the magnitude of the true values, the errors in velocity predictions are smaller compared to the errors in turbulence predictions. Turbulence kinetic energy predictions (Panels (e) and (f)) present the most significant challenge for the model. Although the MSE for turbulence is lower than that for velocity, the Relative L2 error is higher. This discrepancy signifies that while the mean square error in turbulence predictions might be smaller, the error relative to the actual values of turbulence is larger. In other words, the model’s errors in predicting turbulence are more significant when viewed in the context of the scale of the turbulence values. This high Relative L2 error indicates that the model struggles more with predicting turbulence accurately compared to velocity and pressure, reflecting the complex nature of turbulence and the inherent difficulty in modeling it accurately.
A detailed visual analysis of the model’s predictive performance, including best-case, worst-case, and transitional scenarios, is presented in the section “Local variable analysis and discussion” to further investigate the model’s strengths and limitations.
Inference time
DeepONet is well-known for its short inference time, especially when compared to traditional numerical methods such as the finite volume method (FVM). Once fully trained, a DeepONet model can generate predictions almost instantaneously for new inputs. This is because it bypasses the need for iterative solving of differential equations, which is computationally expensive in traditional methods like FVM.
Table 6 shows the inference time for DeepONet in comparison to FVM simulation. As we can see, DeepONet is approximately 1481 times faster than FVM, which requires 200 seconds for a single simulation. This drastic reduction in prediction time makes DeepONet particularly useful for virtual sensing. This further reinforces our choice of utilizing DeepONet for this study.
Note: Both the FVM simulation and the DeepONet inference were performed on the same machine to ensure a fair comparison. The machine was equipped with an Intel Core i7 CPU and 16 GB of RAM.
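The timing comparison could be reproduced with a snippet like the following; the call signature deeponet(branch_input, trunk_coords) follows the architecture sketch in Methods and is an assumption, and ~200 s is the reported FVM runtime on the same machine.

```python
import time
import torch

def time_inference(deeponet, branch_input, trunk_coords, fvm_seconds=200.0):
    """Time one DeepONet forward pass and compare it with the reported FVM runtime."""
    deeponet.eval()
    with torch.no_grad():
        start = time.perf_counter()
        _ = deeponet(branch_input, trunk_coords)
        elapsed = time.perf_counter() - start
    speedup = fvm_seconds / max(elapsed, 1e-9)
    print(f"DeepONet: {elapsed:.4f} s | FVM: {fvm_seconds:.0f} s | speed-up ~{speedup:.0f}x")
    return elapsed
```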
Local variable analysis and discussion
To provide a deeper understanding of the model’s predictive performance, we include a comprehensive visual analysis in Figs. 3–5. These figures showcase the ANSYS simulated data, the neural network’s predictions, and the corresponding errors. For each parameter—pressure, velocity, and turbulence kinetic energy—we include five plots: the best-case scenario, the worst-case scenario, and three intermediate cases, all selected based on Relative L2 Error. The best-case plots highlight scenarios where the model’s predictions closely align with FVM data, demonstrating optimal performance. Conversely, the worst-case plots reveal significant discrepancies and prediction errors. The intermediate plots show the gradual transition between these extremes, providing insights into regions where the model’s performance deteriorates. This detailed visual analysis will not only help us understand which parameters the model predicts poorly but also identify the specific sections of the pipe where these inaccuracies occur. We can gain valuable insights into the model’s limitations and highlight areas for refinement to enhance its overall accuracy and reliability.

a–c These panels represent the a true velocity, b predicted velocity, and c error percentage for the Best-Case Scenario. d–f These panels represent the d true velocity, e predicted velocity, and f error percentage for the 25% Best Case. g–i These panels represent the g true velocity, h predicted velocity, and i error percentage for the 50% Best Case. j–l These panels represent the j true velocity, k predicted velocity, and l error percentage for the 75% Best Case. m–o These panels represent the m true velocity, n predicted velocity, and o error percentage for the Worst-Case Scenario.

a–c These panels represent the a true turbulence kinetic energy, b predicted turbulence kinetic energy, and c error percentage for the Best-Case Scenario. d–f These panels represent the d true turbulence kinetic energy, e predicted turbulence kinetic energy, and f error percentage for the 25% Best Case. g–i These panels represent the g true turbulence kinetic energy, h predicted turbulence kinetic energy, and i error percentage for the 50% Best Case. j–l These panels represent the j true turbulence kinetic energy, k predicted turbulence kinetic energy, and l error percentage for the 75% Best Case. m–o These panels represent the m true turbulence kinetic energy, n predicted turbulence kinetic energy, and o error percentage for the Worst-Case Scenario.

a–c These panels represent the a true pressure, b predicted pressure, and c error percentage for the Best-Case Scenario. d–f These panels represent the d true pressure, e predicted pressure, and f error percentage for the 25% Best Case. g–i These panels represent the g true pressure, h predicted pressure, and i error percentage for the 50% Best Case. j–l These panels represent the j true pressure, k predicted pressure, and l error percentage for the 75% Best Case. m–o These panels represent the m true pressure, n predicted pressure, and o error percentage for the Worst-Case Scenario.
Figure 3 presents the velocity contours for both the true and predicted values, along with the associated error distribution, on a selected diagnostic plane. The figure showcases the results for five representative cases, ranging from the best-performing case to the worst-performing case, with the intermediate cases selected at 25%, 50%, and 75% intervals of the ranked Relative L2 error distribution. The figure demonstrates that DeepONet, while generally proficient at predicting velocity patterns, encounters difficulties in regions with abrupt velocity transitions, particularly near the elbow joint where steep gradients and discontinuities are prevalent due to flow separation from the inner radius. In these areas, the model exhibits higher prediction errors compared to regions with smoother flow fields. This discrepancy can be attributed to the inherent challenge of capturing sharp gradients and discontinuities using a data-driven model like DeepONet. While the flow behavior in these zones is governed by intricate physical laws, DeepONet primarily relies on learning patterns from training data. Consequently, the model struggles to achieve high accuracy in regions where flow physics plays a dominant role. This limitation is consistently observed across all five cases presented.
Analysis of turbulence kinetic energy (TKE) in Fig. 4 reveals that DeepONet accurately predicts turbulence patterns. However, error patterns for TKE are inversely related to those observed in velocity predictions. TKE magnitude increases downstream of the elbow due to flow separation and recirculation, resulting in pressure gradients and increased turbulence. DeepONet excels at predicting local zones with high TKE values, but accuracy diminishes in regions of low TKE, even with smooth flow fields. This could be attributed to the nature of TKE magnitudes, which are near zero in smooth flows and significantly higher in turbulent regions. DeepONet struggles to capture these subtle variations near zero, leading to the observed error pattern reversal compared to velocity fields. Furthermore, the contour plots reveal instances of exceptionally high errors at specific nodes within the diagnostic plane, suggesting the presence of outliers in the predicted data. These outliers, characterized by significant deviations from the true values, could arise from various factors such as noise in the training data, model limitations in handling extreme or rare flow conditions, or numerical instabilities during the prediction process.
Figure 5 represents the comparison of pressure between the CFD simulation and DeepONet. In the 120° elbow bend, centrifugal force causes the fluid to bulge towards the convex wall, decreasing the flow rate and creating a high-pressure region. Conversely, the concave wall experiences contraction effects and increased velocity, resulting in a low-pressure zone. This pressure difference between the convex and concave sides drives a secondary flow from the convex to the concave wall along the bend’s circumference. The investigation into pressure field predictions echoes previous observations regarding DeepONet’s performance in capturing velocity fields. While the model demonstrates general proficiency, it encounters challenges in regions characterized by abrupt pressure transitions, particularly near the elbow joint where steep gradients and discontinuities are prevalent. In these areas, the model exhibits elevated prediction errors compared to regions with smoother pressure distributions. This aligns with the inherent difficulty of capturing sharp gradients and discontinuities using a data-driven model like DeepONet. The pressure behavior in these zones is strongly influenced by complex flow physics, whereas DeepONet primarily learns patterns from training data.
In general, the DeepONet model demonstrated a strong ability to accurately capture local fluctuations and patterns within the hydrodynamic variables. However, a disparity in prediction accuracy was observed across the different variables. Specifically, the highest errors were encountered in the turbulence kinetic energy predictions, followed by the pressure fields, with the velocity fields exhibiting the lowest errors. This suggests that the model’s capacity to learn and generalize from the training data may be influenced by the inherent complexity and variability of the different flow variables. While DeepONet effectively captures the more readily observable velocity patterns, it struggles to achieve the same level of precision for the more nuanced and less directly measurable quantities such as turbulence kinetic energy and pressure. This could be attributed to the intricate physical relationships and nonlinear interactions governing these variables, which may not be fully captured by the current model architecture or training dataset.
Further research and model refinement, potentially incorporating additional physical constraints or specialized loss functions, may improve the accuracy of predictions for these more challenging flow variables.
Discussion
In this study, we have explored the integration of Digital Twin technology with advanced neural operator models, specifically DeepONet, to develop a virtual sensing-enabled framework for real-time parameter prediction in nuclear reactors. This section elaborates on the significance of our findings, the role of DeepONet within a Digital Twin framework, and the implications of our results for degradation monitoring. In addition, we address the limitations of our approach and discuss potential future directions.
DeepONet-based Digital Twin
The DeepONet developed in this study serves as a digital twin framework for plant operations, aligning with the U.S. Nuclear Regulatory Commission’s (NRC’s) definition of a digital twin through its integration of real-time inference, adaptability to operational condition changes, and synchronization with the physical system39,40.
- Dynamic Digital Representation and Synchronization with the Physical System: DeepONet provides a dynamic representation of the system it models, offering real-time insights into the system’s behavior. Synchronization between the physical system and its digital representation is a key requirement for realizing digital twin technology. DeepONet addresses this need by processing real-time signals through a supporting data pipeline and dynamically updating its predictions, enabling seamless interaction with the physical system40. DeepONet models can be trained using simulation data (such as ANSYS simulations used in this study) and later fine-tuned with real-world data, bridging the gap between digital and physical systems. This capability aligns with the key feature of digital twins to adapt and evolve alongside their physical counterparts. As shown in our study, the branch network processes real-time sensor data (average inlet velocity) ensuring synchronization, while the trunk network provides spatial details, functioning as a virtual sensor network. This setup allows DeepONet to augment physical sensor data by providing full-field predictions, such as flow velocity and turbulence, at critical locations like pipe bends and elbow joints, where physical sensors are impractical. The model’s predictive capabilities, as shown in Figs. 3–5, demonstrate its effectiveness in replicating system behavior. These results validate DeepONet’s role as a reliable and efficient component of the Digital Twin framework.
- Real-Time Inference: A key feature of digital twins is real-time or near-real-time inference to provide operational insights and enhance decision-making during dynamic conditions. As shown in Table 6, DeepONet achieves unprecedented speed, making predictions 1481 times faster than traditional finite volume simulations. This allows operators to monitor critical thermal-hydraulic parameters—pressure, velocity, and turbulence kinetic energy—almost instantaneously. The real-time inference capability ensures the model remains an active, dynamic component of the digital twin framework, crucial for nuclear plant operations where quick decision-making is essential.
- Adaptability to Operational Condition Changes: Digital twins must adapt to changing operational conditions and maintain accuracy across diverse scenarios without constant retraining. This study demonstrates that DeepONet fulfills this requirement by learning the underlying functional relationships between input parameters and spatial coordinates. Its architecture ensures robust predictions under varying conditions, without requiring retraining. As shown in Fig. 2, DeepONet performs consistently across diverse scenarios. It effectively predicts spatial distributions of velocity, turbulence, and pressure across different inlet conditions, showcasing its adaptability and reliability for monitoring dynamic systems. This capability aligns with the NRC’s definition of digital twins, emphasizing adaptability and synchronization with evolving physical systems.
These findings validate DeepONet’s role in enabling a Digital Twin framework, bridging real-time data and predictive modeling.
Monitoring operational conditions indicative of degradation
Degradation, such as material/wall thinning, stress corrosion cracking, or fatigue, develops gradually due to persistent mechanical stresses and flow irregularities. Certain operational conditions, however, may act as early indicators of these processes and can be monitored to prevent long-term damage. For instance, flow-induced vibrations, caused by turbulent flow around pipe bends and fittings, are a major contributor to fatigue and erosion-corrosion in nuclear piping systems. These vibrations can lead to high-frequency stresses, overloading pipe supports or nozzles, and accelerating material degradation. Similarly, transient phenomena such as water hammer or condensation-induced hammer generate impact loads with high dynamic factors, which can weaken pipe walls or connections over time. Monitoring turbulence patterns, pressure fluctuations, and velocity distributions in such areas provides actionable insights into regions where degradation risks are elevated41.
While our study does not directly simulate material degradation or wall thinning, the ability to monitor turbulence and pressure in areas prone to stress highlights DeepONet’s usefulness in degradation-informed decision-making. Elevated turbulence near pipe bends is a known precursor to erosion-corrosion, while fluctuating pressure often signals transient phenomena such as water hammer, both of which can severely impact system integrity. For example, a rise in turbulence intensity near elbow joints (Fig. 4) over time could signal heightened mechanical stress in those regions, suggesting targeted inspections or preventive maintenance. Similarly, deviations in pressure (as in Fig. 5) or velocity distributions (shown in Fig. 3) can identify developing flow-induced vibrations that may accelerate wear, particularly in locations vulnerable to acoustic resonance.
Within a Digital Twin, these data can be collected over time to form flow profiles, revealing how conditions within a pipe evolve dynamically. Such profiles are invaluable for identifying patterns indicative of emerging degradation risks, such as increasing pressure in certain regions or abnormal turbulence patterns near pipe bends. Furthermore, by illuminating unmonitored regions with accurate predictions, DeepONet enhances the visibility of the entire system, enabling reliable condition-based maintenance strategies that can reduce maintenance costs and enhance reactor safety.
Limitations and future directions
While the results of this study show the effectiveness of DeepONet in predicting key thermal-hydraulic parameters, certain limitations must be acknowledged. One notable challenge is the spectral bias inherent in data-driven models like DeepONet. Neural networks tend to prioritize learning low-frequency, smoother patterns over high-frequency, more complex ones, such as those observed in turbulent regions. This spectral bias, coupled with the imbalanced nature of our dataset, evident in Figs. 3–5, where larger, smoother flow patterns dominate over the relatively smaller turbulent flow patterns in the bend region, likely contributed to the increased error observed in predicting turbulence. We assume that the model is smoothing out high-frequency patterns. For example, the model exhibited higher errors in areas with intense turbulence, such as near elbow joints and bends, where high-frequency flow patterns dominate. These regions are most important for identifying flow-induced vibrations and other precursors to degradation, highlighting a potential limitation of the current approach.
To address this, future research could explore combining neural operators with diffusion models42. Future directions could also include designing hybrid frameworks, where specialized models can be trained for different regions of the system based on the initial CFD modeling. This targeted approach could improve the performance by allowing each model to focus on specific flow characteristics, thereby addressing challenges such as spectral bias and ensuring better accuracy in regions with high-frequency turbulent flows.
Methods
Data generation
The hot leg in an AP1000 reactor plays a critical role in its operation, acting as the primary channel for transferring heat generated in the reactor core to the steam generators. Hence, the core could overheat without proper flow through the hot leg, leading to a potential meltdown. The AP1000 reactor cooling system includes hot and cold leg pipes connecting the reactor vessel, steam generators, and reactor coolant pumps. Each loop has three pipes, including a 787.4 mm inner diameter pipe between the reactor vessel outlet and steam generator inlet with a length of 2.3 m43. While system-level codes provide a useful tool for macroscopic thermal-hydraulic analysis of nuclear reactors, their limitations in resolving detailed flow features such as flow re-circulation and turbulence within the hot leg necessitate the use of more advanced methods like computational fluid dynamics (CFD). CFD, with its ability to resolve complex geometries and flow physics, can provide a more accurate and detailed understanding of the fluid dynamics within the hot leg. However, due to the complexity of the full-scale geometry, a scaled-down model was used for CFD analysis. Geometric scaling was employed, ensuring a constant flow rate per unit volume between the actual case and the model. In this study, geometric scaling was implemented according to the following conditions:
$$\frac{Q_m}{V_m}=\frac{Q_a}{V_a},$$

where $Q_m$ and $Q_a$ are the volumetric flow rates of the model and the actual case, and $V_m$ and $V_a$ are the corresponding volumes of the model and the actual case. Hence we can write

$$\frac{v_m}{l_m}=\frac{v_a}{l_a},$$

where $v_m$ and $v_a$ are the velocities of the model and the actual case, and $l_m$ and $l_a$ represent the flow lengths of the model and the actual hot leg. The diameter of the hot leg pipe ($d_a$) was scaled down by a factor of $\lambda = 31.5$, which gives a model diameter $d_m$ of 25 mm. The flow length of the model was kept at 150 mm, i.e., $l_m = 6\,d_m$. Hence, the relationship between the Reynolds number (Re) for the model and the actual scenario becomes

$$\frac{\mathrm{Re}_m}{\mathrm{Re}_a}=\frac{v_m\,d_m}{v_a\,d_a}.$$
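As a worked check of these scaling relations (not part of the original workflow), the reported numbers can be reproduced under the stated assumptions of constant flow rate per unit volume and identical coolant properties:

```python
# Worked example of the geometric scaling relations above.
d_a, l_a = 787.4e-3, 2.3                # actual hot-leg inner diameter and flow length [m]
lam = 31.5                              # diameter scaling factor lambda
d_m = d_a / lam                         # ~0.025 m, i.e. the 25 mm model diameter
l_m = 0.150                             # model flow length [m]

velocity_ratio = l_m / l_a              # v_m / v_a from Q/V = const (~0.065)
re_ratio = velocity_ratio * d_m / d_a   # Re_m / Re_a = (v_m d_m) / (v_a d_a)
print(f"d_m = {d_m*1e3:.1f} mm, v_m/v_a = {velocity_ratio:.3f}, Re_m/Re_a = {re_ratio:.4f}")
```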
As shown in Fig. 6, the elbow joint angle (θc) was kept at 120°, the same as in the actual hot leg.

a The side view of the geometry shows the coolant flow direction, inlet velocity (vin), outlet pressure (pout), the no-slip wall condition, and the bend angle (θc). b The diagnostic measurement plane of interest is highlighted.
The walls of the flow domain were treated as adiabatic and subject to no-slip conditions. This implies that these surfaces neither gained nor lost heat, and fluid velocity at the wall interface was zero. In terms of thermal conditions, the inlet fluid temperature was maintained at 594.3 K. The outlet was set to a gauge pressure of zero, ensuring it remained at ambient atmospheric pressure. The inlet velocity (vin) was changed according to the magnitude of Re, ensuring turbulent flow. Figure 6 shows the essential boundary conditions and the plane of interest where the hydrodynamics parameters are evaluated through CFD simulations to train the ML model.
In the numerical analyses, the coolant was represented as a Newtonian fluid with constant viscosity and density, ensuring a linear correlation between the shear stress and the strain rate. Simulations were performed under steady-state conditions for forced turbulent flow. The three-dimensional (3D) Navier-Stokes equations were solved using the Finite Volume Method (FVM). The governing equations are as follows44:
Continuity equation:

$$\frac{\partial u_i}{\partial x_i}=0$$

Momentum equation:

$$\rho u_j\frac{\partial u_i}{\partial x_j}=-\frac{\partial p}{\partial x_i}+\frac{\partial}{\partial x_j}\left[\left(\mu+\mu_t\right)\left(\frac{\partial u_i}{\partial x_j}+\frac{\partial u_j}{\partial x_i}\right)\right]$$

Energy equation:

$$\rho c_p u_j\frac{\partial T}{\partial x_j}=\frac{\partial}{\partial x_j}\left(k_{\mathrm{eff}}\frac{\partial T}{\partial x_j}\right)$$

where $u_i$ are the velocity components, $p$ is the pressure, $T$ is the temperature, $\mu$ and $\mu_t$ are the molecular and turbulent viscosities, $c_p$ is the specific heat, and $k_{\mathrm{eff}}$ is the effective thermal conductivity.
In the present study, forced turbulent flows were assessed employing the RNG k–ε model, the transport equations for which are presented below45,46:

$$\frac{\partial}{\partial x_i}\left(\rho k u_i\right)=\frac{\partial}{\partial x_j}\left(\alpha_k\mu_{\mathrm{eff}}\frac{\partial k}{\partial x_j}\right)+G_k-\rho\varepsilon$$

$$\frac{\partial}{\partial x_i}\left(\rho\varepsilon u_i\right)=\frac{\partial}{\partial x_j}\left(\alpha_\varepsilon\mu_{\mathrm{eff}}\frac{\partial\varepsilon}{\partial x_j}\right)+C_{1\varepsilon}\frac{\varepsilon}{k}G_k-C_{2\varepsilon}^{*}\rho\frac{\varepsilon^{2}}{k}$$

where $k$ is the turbulence kinetic energy, $\varepsilon$ is its dissipation rate, $G_k$ is the production of turbulence kinetic energy due to mean velocity gradients, $\mu_{\mathrm{eff}}$ is the effective viscosity, $\alpha_k$ and $\alpha_\varepsilon$ are the inverse effective Prandtl numbers for $k$ and $\varepsilon$, and $C_{1\varepsilon}$ and $C_{2\varepsilon}^{*}$ are model constants.
In the FVM, constructing a computational grid over the geometry is crucial for achieving solution convergence. This study utilized a combination of hexahedral and tetrahedral meshes across the entire computational domain, with a strong emphasis on enhanced wall treatment to improve accuracy. A refined wall treatment approach, incorporating ten boundary layers, was implemented to ensure convergence. Near-critical regions featured finely sized mesh elements of 3.75 mm. Additionally, mesh quality metrics were rigorously maintained, with skewness controlled at 0.11 and orthogonal quality at 0.96 across all channel configurations44. Figure 7 represents the grid generation over the fluid domain.

a The side view of the grid generation in the elbow section, illustrating the structured grid distribution for accurate simulation. b The cross-sectional view of the grid shows the meshing pattern in the circular region.
This study utilized an implicit time-marching scheme to solve the governing equations. The pressure–velocity coupling was achieved through the SIMPLEC algorithm. For the discretization of mass, momentum, and energy conservation equations, a second-order upwind scheme was implemented, balancing accuracy and computational efficiency. The turbulence kinetic energy and dissipation rate equations were discretized using a first-order upwind scheme, chosen for its robustness and stability in turbulent flow simulations. The turbulence intensity (I) at the inlet was specified based on the following empirical relationship47:

$$I=0.16\,\mathrm{Re}^{-1/8}$$
DeepONet architecture
In this study, we have used an unstacked DeepONet developed and described by Lu Lu et al.33, which consists of a single branch network and a trunk network. Figure 8 shows the model architecture used in this work. The DeepONet architecture lays the foundation for capturing the interplay between dynamic operational input parameters and spatially distributed system behaviors (spatial coordinates), a core characteristic of digital twin frameworks. The branch network processes the input function, which, in our case, represents the average initial velocity. This input, belonging to an infinite-dimensional functional space, is characterized by n control points. Here, n = 1, as we are considering the average value. The trunk network, on the other hand, handles the spatial information. It takes as input a collection of N points within the domain, each defined by its (x, y, z) coordinates. These points correspond to 11,340 virtual sensors (N = 11,340) distributed throughout the pipe section. By encoding spatial dependencies and integrating dynamic operational data, the trunk network ensures that the digital twin remains synchronized with the physical system.

This figure illustrates the architecture of the DeepONet model used for predicting thermal-hydraulic parameters in a reactor system. Both of the models consist of a single branch and trunk network. The branch network takes the average inlet velocity (u) as input, and the trunk network takes the spatial domain coordinates (x, y, z). The output quantities are distributions of coolant pressure (P), velocity (Vo), and turbulence kinetic energy (k). a The schematic shows the original DeepONet architecture. The Branch network has 11 hidden layers and the trunk network has 10 hidden layers with 4096 neurons each. b The schematic illustrates the modified architecture with additional linear layers for each parameter. The branch network has layer sizes of [n, 512, 512, 512, N] where n = 1, and the trunk network has layer sizes of [3, 512, 512, 256, 3], both utilizing ReLU activation. At the end, there are three linear layers, each with sizes [N, N], where N = 11,340, without any activation. All abbreviations and symbols used in this figure are defined as follows. The term bs refers to batch size, N represents the number of spatial nodes in the computational domain, and dim indicates the dimensionality of the data. The circle with a dot (⨀) represents the element-wise multiplication operation, and the square boxes denote computational layers such as linear layers or solution operators.
The core capability of DeepONet, which justifies its role as a digital twin, lies in the fusion of real-time information from the branch network and spatial information from trunk networks. This integration is achieved through element-wise multiplication48,49 of the outputs from Branch and Trunk Networks. The resulting output, G(u)(y), is a function of the spatial coordinates y conditioned on the input function u. This formulation enables the network to learn complex mappings between the input and output spaces, empowering real-time predictions of critical thermal-hydraulic parameters such as pressure, velocity, and turbulence kinetic energy. This aligns with the NRC’s definition39 of a digital twin, which emphasizes real-time synchronization, predictive capabilities, and comprehensive system representation.
To enhance the model’s predictive capabilities, we added additional linear layers to the original DeepONet architecture. The concept of utilizing linear layers was inspired by the work of Kazuma et al.50. These linear layers consist of three independent layers, each dedicated to refining one of the predicted parameters—velocity, turbulence kinetic energy, or pressure. Each linear layer takes an input vector corresponding to the spatial domain (in our case 11,340 spatial points) and outputs a vector of the same size, ensuring that the predictions retain the spatial resolution required for monitoring. The weights of these layers are initialized using Xavier initialization, which ensures stable gradients and efficient training, while biases are initialized to zero to prevent unnecessary offsets in the early stages. This modular design allows each linear layer to specialize in its respective parameter, enabling parameter-specific transformations that improve prediction accuracy. These layers enable scaling and shifting of the output, improving alignment with the target values. Figure 8 displays the original model architecture introduced by Lu Lu et al.33 (a), alongside the modified architecture used in this study (b). After the element-wise multiplication of the outputs from the Branch and Trunk Networks, the resulting tensor is split into three separate components, with each component corresponding to a specific parameter (velocity, turbulence, or pressure). Each component is then passed through its respective linear layer, which applies a final transformation. For a batch of data, the input tensor to the linear layers has dimensions (batch-size, no-of-parameters, number-of-spatial-points). After processing, each linear layer outputs a tensor of size (batch-size, number-of-spatial-points), in our case (512, 11340), preserving spatial resolution by mapping input vectors directly to corresponding output vectors. In the section “Vanilla DeepONet vs DeepONet with linear layers”, we present a comparison between the models with and without the additional linear layers to further justify these modifications. The DeepONet model was trained to predict three primary output functions: turbulence, pressure, and velocity distributions in the central plane of the hot leg. These output functions reside in distinct functional spaces, denoted as S1, S2, and S3, where k ∈ S1, p ∈ S2, and vo ∈ S3 represent turbulence, pressure, and velocity values in the central plane. The mapping solution operators of the DeepONet can be defined as G1, G2, and G3 for three different parameters.
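To make the architecture description concrete, the following is a minimal PyTorch sketch consistent with the layer sizes reported for Fig. 8b; the exact broadcasting used to combine the branch and trunk outputs, the ordering of the three output fields, and all class and function names are assumptions rather than the authors' implementation.

```python
import torch
import torch.nn as nn

def mlp(sizes):
    """Feedforward network with ReLU between hidden layers and a linear output."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

class DeepONetWithLinearHeads(nn.Module):
    """Sketch of the modified DeepONet with one linear head per output field."""

    def __init__(self, n_nodes=11_340, n_params=3):
        super().__init__()
        self.branch = mlp([1, 512, 512, 512, n_nodes])   # input: average inlet velocity
        self.trunk = mlp([3, 512, 512, 256, n_params])   # input: (x, y, z) per node
        # One [N, N] linear head per field (pressure, velocity, TKE), no activation;
        # Xavier weights and zero biases as described in the text.
        self.heads = nn.ModuleList(
            [nn.Linear(n_nodes, n_nodes) for _ in range(n_params)]
        )
        for head in self.heads:
            nn.init.xavier_uniform_(head.weight)
            nn.init.zeros_(head.bias)

    def forward(self, u, coords):
        # u: (batch, 1) average inlet velocity; coords: (n_nodes, 3)
        b = self.branch(u)                                # (batch, n_nodes)
        t = self.trunk(coords)                            # (n_nodes, n_params)
        combined = b.unsqueeze(1) * t.T.unsqueeze(0)      # (batch, n_params, n_nodes)
        outputs = [head(combined[:, j, :]) for j, head in enumerate(self.heads)]
        return torch.stack(outputs, dim=1)                # (batch, n_params, n_nodes)
```

In this sketch, a forward pass with a batch of average inlet velocities of shape (batch, 1) and the fixed 11,340 × 3 coordinate array returns a (batch, 3, 11,340) tensor containing the predicted pressure, velocity, and turbulence kinetic energy fields.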
For both branch and trunk networks, we employed feedforward neural networks (FNNs) due to their simplicity and effectiveness in approximating nonlinear functions. The architecture of these FNNs included three hidden layers, with the number of neurons in each layer determined through hyperparameter tuning. This hyperparameter optimization process was crucial in achieving optimal performance. The activation function of choice for the branch and trunk networks was the rectified linear unit (ReLU). The Adam optimizer, known for its efficiency and stability, was employed to update the network parameters iteratively. The training process involved minimizing the scaled mean squared error (MSE) loss function:
$$\mathcal{L}=\frac{1}{N}\sum_{i=1}^{N}\left(y_i-\hat{y}_i\right)^2,$$

where $N$ represents the number of data points (number of nodes), $y_i$ is the predicted value, and $\hat{y}_i$ is the true value.
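A minimal training-step sketch for this objective, reusing the DeepONetWithLinearHeads class from the architecture sketch above, is given below; the Adam settings mirror the reported learning rate and weight decay, while the batching details are assumptions.

```python
import torch

model = DeepONetWithLinearHeads()  # from the architecture sketch above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-8)
loss_fn = torch.nn.MSELoss()

def train_step(u_batch, coords, targets):
    """One optimization step on min-max scaled targets.

    u_batch: (batch, 1) average inlet velocities; coords: (11_340, 3);
    targets: (batch, 3, 11_340) scaled pressure, velocity, and TKE fields.
    """
    optimizer.zero_grad()
    preds = model(u_batch, coords)
    loss = loss_fn(preds, targets)
    loss.backward()
    optimizer.step()
    return loss.item()
```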
Declaration of generative AI and AI-assisted technologies in the writing process
During the preparation of this work, the author(s) utilized a Large Language Model (ChatGPT) to assist solely with language editing and refinement. The AI tool was employed strictly and only to enhance the clarity, grammar, and overall readability of the manuscript, ensuring the effective communication of ideas. It is important to note that no scientific content, original ideas, or conceptual contributions were generated, altered, or influenced by the AI.