Deep-learning real-time phase retrieval of imperfect diffraction patterns from X-ray free-electron lasers

Introduction

The phase problem, a well-known inverse problem involving the extraction of phase information hidden in the interference fringes of measured intensities, is prevalent in nature and complicates the direct interpretation of diffraction signals from target objects1. Its impact spans research modalities including X-ray crystallography and high-resolution imaging, driving active development of methods to recover the lost phase information. While several methods with good performance have been developed, they often require significant data-handling time, and their effectiveness depends on the completeness of the measured data, noise contamination, and technique-specific challenges in data collection. Deep learning (DL) has shown considerable potential in addressing these issues2,3. After appropriate training, it reduces processing time by replacing conventional iterative approaches with non-iterative operations that can be accelerated with graphics processing units (GPUs). Owing to these advantages, substantial efforts have been directed toward using DL for denoising, classification, and phase retrieval4,5,6,7,8,9,10,11,12, though these tasks remain challenging in X-ray diffraction.

Recent advancements in the development of brighter X-ray sources that provide ultrashort X-ray pulses, such as X-ray free-electron lasers (XFELs), have significantly enhanced the ability to observe ultrafast molecular bonding processes, transient material dynamics, and hidden material phases in strongly driven nonequilibrium states13,14,15,16. Diffraction imaging, which retrieves phase information through numerical iterations as an optimization process, holds great promise for determining the structure of single specimens. However, the diffraction signals, often plagued by low signal-to-noise ratios due to limited photons and data imperfections, have constrained the practical application of DL for interpreting experimental data17.

In this study, we propose a new deep neural network (DNN) for phase retrieval of imperfect diffraction patterns, enabling real-time image reconstruction for single-particle imaging experiments using XFELs. The network is based on a residual neural network (ResNet) with weight-corrected convolution layers designed to handle diffraction signals18. It was trained using masked diffraction patterns as inputs, which were generated from pseudo-random objects. We imposed a nonnegative real-value constraint on the images, which is legitimate in X-ray diffraction from nanoparticles with weak scattering and negligible absorption19,20. We demonstrated the network's excellent performance on simulated data by comparing it to conventional iterative phase retrieval algorithms. After verifying its effectiveness, we applied the network to single-pulse diffraction data obtained with XFELs, where it exhibited robust real-time image reconstruction with improved image quality. By providing a solution to the phase problem in X-ray diffraction imaging, this study addresses a significant bottleneck in data processing time by eliminating the need for computationally expensive iterative phase retrievals. With advancements in the development of new light sources that offer high brilliance and repetition rates, data accumulation rates have increased exponentially, necessitating rapid data processing. We believe that the proposed DNN method will serve as a crucial basis for advancing scientific discoveries through effective data mining.

Results

DNN for the phase problem of diffraction patterns

DNNs deliver superior performance on computer vision tasks using pretrained parameters and have become increasingly efficient through combinations of convolution operations21. Various convolution types have been developed to enhance performance for specific applications. Depth-wise separable convolution (DSC) is an efficient convolution method, typically using ten times fewer parameters than plain convolution22. Partial convolution (PC) functions as a mask-aware convolution, allocating occluded data based on known data23. Fast Fourier convolution (FFC) provides a global receptive field through an additional convolution of the Fourier transform of the input24. Recently, a ResNet-based DNN named LaMa has demonstrated exceptional performance in image inpainting, even with large masks25. LaMa’s straightforward architecture, which includes downscaling layers, residual blocks, and upscaling layers, employs FFC across the entire network to leverage the global receptive field. Building on this basic architecture, we introduce deep phase retrieval (DPR), a new DNN featuring an encoder–decoder architecture with two novel operations: an encoder with weighted partial convolutions (WPC) and a two-stage decoder with intermediate Fourier modulation (Fig. 1b and Supplementary Table 1). DPR is a promising network for the immediate reconstruction of imperfect, noisy diffraction patterns, utilizing WPC and FFC to reflect the nature of X-ray diffraction.

Fig. 1: DNN for real-time phase retrieval of imperfect single-pulse diffraction patterns.

a Schematic diagram of data generation and the network training. b Schematic diagram of DPR. The network consists of a WPC-based encoder and a two-stage decoder, including a base decoder and a diffraction-compensated decoder (+D). c, d Evolution of validation loss during training iterations (c) and evaluation metrics (d) for WPC-, PC-, and FFC-based encoders with and without +D. The boxes and whiskers represent the average and standard errors of each metric, respectively. Differences with WPC + D are verified by the Mann–Whitney U test (not indicated, p ≤ 10⁻⁸; ***10⁻⁸ < p ≤ 0.001; **0.001 < p ≤ 0.01; *0.01 < p ≤ 0.05; ns, p > 0.05).


While PC distributes known information equally to missing values within convolving regions, WPC employs a physics-based approach, assigning position-dependent weights based on the Guinier–Porod model, which describes the radial intensity distribution in the small-angle region (see Methods)26. As the diffraction intensity typically decreases as \(Q^{-4}\) in the momentum transfer \({\boldsymbol{Q}}\) \((={{\boldsymbol{k}}}_{f}-{{\boldsymbol{k}}}_{i})\), where \({{\boldsymbol{k}}}_{i}\) and \({{\boldsymbol{k}}}_{f}\) are the wave vectors of the incoming and outgoing light, WPC assigns \(Q\)-dependent weight factors from the Guinier–Porod model for a smooth sphere to the known values during the PC operation. In the two-stage decoder, the diffraction-compensated decoder, which performs Fourier modulation before the unit blocks of the residual structure, is connected serially to the base decoder. The Fourier modulation replaces the Fourier magnitudes of the primary outputs with the measured input magnitudes. This operation restores the initial diffraction patterns, which attenuate through the deep layers of the network, aiding the accurate generation of Fourier-transform pairs. Additionally, FFC operates analogously to conventional phase retrieval algorithms, which iteratively connect Fourier-space information using the discrete Fourier transform (DFT) and inverse DFT between diffraction patterns and objects. These components work together to reconstruct the lost phase information from imperfect, photon-limited diffraction data.
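In numpy terms, the magnitude-replacement step can be sketched as follows. This is a minimal illustration of the operation only, not the authors' network code, and the function name is ours:

```python
import numpy as np

def fourier_modulate(x, measured_mag):
    """Replace the Fourier magnitude of an intermediate output `x` with the
    measured diffraction magnitude while keeping the current phase estimate."""
    F = np.fft.fft2(x)
    phase = np.exp(1j * np.angle(F))
    return np.fft.ifft2(measured_mag * phase)
```

Re-imposing the input magnitudes in this way restores diffraction information that would otherwise attenuate through the deep layers of the network.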

For dataset generation, we constructed a diffraction model reflecting experimental conditions and applied it to pseudo-random objects (see Methods). The shape and density of the objects were assigned separately from two pre-existing datasets, EMNIST (handwritten character digits) and CIFAR-100 (images of real-world objects), respectively, with proper image transformations27,28. These objects maximally exclude physical or human biases and, for the goal of DPR (finding real-space pairs corresponding to measured diffraction patterns from general objects), are more appropriate than projected images of plausible three-dimensional objects based on specific models. Diffraction patterns were generated by performing a fast Fourier transform (FFT) according to the diffraction model, with additional operations accounting for experimental conditions such as spatial coherence, limited photon counts, and measurement noise. The X-ray scattering of the sample was treated within the first-order Born approximation, ignoring multiple scattering, which is valid in the X-ray regime19. The approximation holds when the sample thickness satisfies \(D\lesssim 2\pi \lambda C/\left|1-n\right|\), where \(n\) is the complex refractive index and \(C\approx 0.2\); this limit is typically a few microns in the X-ray regime, much larger than the size of typical nanoparticles29. The resulting patterns were then partially obscured with irregular masks, simulating data loss due to pixel arrangements on a detector and a central beam stop that blocks the intense direct beam in actual measurements. These patterns were fed into the network as inputs (Fig. 1a). We used the AdamW optimizer with a custom-designed loss function for backpropagation through the network (see Methods)30.
The loss function consisted of the mean absolute error (MAE) of the final outputs, \({\mathcal{L}}\), the MAE of their gradients, \({{\mathcal{L}}}_{\rm{grad}}\), the perceptual loss, \({{\mathcal{L}}}_{\rm{perc}}\), and the R-factor with the ground-truth Fourier magnitudes, \({R}_{\rm{F}}^{\rm{GT}}\)31. After an ablation study of each component, the final loss function was settled as \({{\mathcal{L}}}_{\rm{total}}={\mathcal{L}}+10{{\mathcal{L}}}_{\rm{grad}}+0.1{{\mathcal{L}}}_{\rm{perc}}+0.01{R}_{\rm{F}}^{\rm{GT}}\) (Supplementary Fig. 1).

DPR in phase retrievals and evaluation of its performance

We first validated the improved performance of the WPC-based encoder and the diffraction-compensated decoder within the DPR framework. Compared to encoders based on PC and FFC, the WPC-based encoder exhibited lower validation loss during training and showed significant improvements in the R-factor (\({R}_{\rm{F}}\)) while maintaining or even surpassing the peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) (see Methods, Fig. 1c, d). PSNR and SSIM are well-known metrics commonly used to evaluate image quality against given reference images. Because noise, including intrinsic shot noise, dominates single-pulse diffraction signals, evaluating the validity of image reconstructions solely with \({R}_{\rm{F}}\), a normalized MAE of the Fourier magnitudes, is insufficient. To avoid such inaccurate evaluation, we employed PSNR and SSIM as additional metrics that assess validity in real space using ground-truth images, complementing the limitations of \({R}_{\rm{F}}\). Additionally, the two-stage decoder including the diffraction-compensated decoder outperformed a single base decoder with doubled residual blocks, despite having 43% fewer total trainable parameters (Fig. 1c, d). This confirms that DPR provides enhanced performance with an efficient architecture and effective handling of imperfect diffraction signals.

We also examined how the reconstruction performance of DPR depends on the size of the training dataset by successively doubling it from 12,000 to 192,000 (Fig. 2a). Although slight improvements were observed in \({R}_{\rm{F}}\), PSNR, and SSIM, the network performance quickly saturated with larger training datasets. This indicates that the current amount of training data is sufficient for DPR, given the efficiency of network training. After validating the DPR architecture, we compared its phase retrieval performance with that of conventional iterative projection algorithms, specifically hybrid input–output (HIO) and generalized proximal smoothing (GPS) (see Methods)32,33. We simulated diffraction patterns using the same diffraction model as for the training datasets, applying irregular masks. Additionally, we generated a second set of test data with a constant central mask covering 16 × 16 pixels at the center to compare scenarios with and without substantial data loss. We compared the image reconstructions from DPR, DPR with refinement, and the two conventional phase retrieval algorithms, HIO and GPS. Refinement for DPR involved a few iterations of GPS to obtain the final images (see Methods). The results demonstrate that DPR significantly outperformed HIO and GPS in reconstructing real-space images (Fig. 2b). While HIO and GPS produced indistinct images, especially with large masks, DPR consistently provided superior performance regardless of the masked areas. Moreover, DPR with refinement reconstructed more detailed features, though at the cost of increased noise from the diffraction signals.
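For context, a single HIO iteration can be sketched as follows. This is a textbook-style numpy sketch with an assumed known support and feedback parameter β, not the authors' implementation (their HIO settings are described in Methods):

```python
import numpy as np

def hio_step(x, measured_mag, support, beta=0.9):
    """One hybrid input-output (HIO) iteration with a nonnegativity constraint.
    x: current real-space estimate; measured_mag: measured Fourier magnitudes;
    support: boolean mask of the allowed object region."""
    # Fourier-space projection: impose measured magnitudes, keep current phases
    F = np.fft.fft2(x)
    x_p = np.real(np.fft.ifft2(measured_mag * np.exp(1j * np.angle(F))))
    # real-space update: accept x_p where constraints hold, damp it elsewhere
    ok = support & (x_p >= 0)
    return np.where(ok, x_p, x - beta * x_p)
```

Iterating this update from a random start drives the estimate toward consistency with both the measured magnitudes and the support, which is the baseline behavior DPR is compared against.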

Fig. 2: Phase retrieval performance of DPR on simulated data.

a Change in evaluation metrics for different sizes of the training dataset. b Examples of real-space image reconstructions from simulated data. Diffraction patterns were simulated using the diffraction model with irregular masks and a constant central mask. Reconstructed real-space images from each algorithm are compared with ground-truth (GT) images. Because all real-space images were constrained to be real-valued, the centrosymmetry of the Fourier magnitudes effectively reduces the impact of the irregular masks. c, d Comparisons of evaluation metrics (c) and processing times (d) between HIO and GPS, as representative conventional algorithms, DPR trained with the AdamW optimizer (DPR0), DPR trained with the AdamWR optimizer and ASAM (DPR), and DPR with refinement (DPR + R). The boxes and whiskers represent the average and standard errors of each metric, respectively. Differences with DPR are verified by the Mann–Whitney U test (not indicated, p ≤ 10⁻⁸; ***10⁻⁸ < p ≤ 0.001; **0.001 < p ≤ 0.01; ns, p > 0.05).


Overall, DPR demonstrated significantly better image quality, with higher PSNR and SSIM than GPS, indicating robust performance on noisy diffraction signals (Fig. 2c). Additionally, DPR exhibited similarly high PSNR and SSIM for both the irregular and constant central mask cases, highlighting its superior performance in scenarios involving partial data loss. DPR with refinement achieved a significantly lower \({R}_{\rm{F}}\), even lower than that of GPS, though at the expense of image quality. For further application to experimental data, the AdamWR optimizer with adaptive sharpness-aware minimization (ASAM) was employed to mitigate failures by improving generalization (see Methods)30,34. This approach also led to a notable improvement in the quality of the reconstructed images from the test data (Fig. 2c).

DPR also benefits from having fewer trainable parameters (1.52 × 10⁷ in total) than conventional deep convolutional neural networks, despite the large input size of 512 × 512. It efficiently addresses complex phase problems using WPC and FFC, handles imperfect diffraction signals with an appropriate physical model, and utilizes Fourier-space information in a manner similar to conventional phase retrieval methods. The processing times were 9.02 ± 0.00215 ms and 52.2 ± 2.45 ms per pattern for DPR and DPR with refinement, respectively, using a single NVIDIA GeForce RTX 3090 GPU (Fig. 2d). This represents a more than 1000-fold speedup over conventional iterative phase retrieval algorithms, underscoring the suitability of DPR for real-time processing of data from upcoming MHz-repetition-rate XFELs35.

DPR in phase retrievals of experimental data from the XFEL

After demonstrating its performance on simulated data, DPR was applied to experimental data obtained from XFELs. Single-pulse X-ray diffraction imaging experiments were conducted at the Pohang Accelerator Laboratory-XFEL (PAL-XFEL) (see Methods). X-rays with a photon energy of 5 keV and a spectral bandwidth of \(\Delta E/E\approx 5\times 10^{-3}\) were focused into a 5 µm (horizontal) × 7 µm (vertical) area, giving an effective photon flux of 8 × 10⁹ photons·µm⁻² per pulse. Single-pulse diffraction patterns were recorded by a 1-megapixel multi-port charge-coupled device (CCD) detector with a pixel size of 50 × 50 µm² positioned 1.60 m downstream of the sample, providing a pixel resolution of 15.5 nm for a 512 × 512 window. Specimens of Ag nanoparticles, with characteristic flower and cube shapes, were randomly dispersed and mounted on thin Si₃N₄ membranes for measurement.

Real-space images were directly obtained from the diffraction signals using the as-trained DPR without any fine-tuning or additional data processing (Fig. 3a). The central regions of the diffraction patterns were obscured by a beam stop and strong parasitic scattering near the direct beam. The fringe oscillations from the specimen appeared blurry due to the experimental conditions, including imperfect spatial coherence and other signal contaminations. Despite these challenges, DPR successfully extracted accurate images from the measured diffraction patterns. The images obtained with DPR and DPR with refinement showed distinct shapes with relatively high contrast compared to the results from the conventional iterative algorithms, HIO and GPS. Since reference real-space images are unavailable for experimental data, a local \({R}_{\rm{F}}\), i.e., a pixelwise \({R}_{\rm{F}}\), was introduced for a more detailed evaluation of the results (see Methods). Notably, DPR is not biased toward low-\(Q\) signals near the diffraction center, which represent a significant portion of the total diffraction intensity, resulting in a lower local \({R}_{\rm{F}}\) for high-\(Q\) signals (Fig. 3a and Supplementary Fig. 2a). Since the diffraction signals in the high-\(Q\) region provide high-resolution information on internal structures, DPR produces real-space images with clearer shapes and more detailed structures.

Fig. 3: DPR on data from single-pulse X-ray diffraction imaging experiments using the XFEL.

a Comparisons of reconstruction results from diffraction patterns of Ag flower, double Ag flower, and Ag cube nanoparticles measured at PAL-XFEL. The first row displays the measured single-pulse diffraction patterns, with the central part blocked by the beam stop and missing data along vertical line gaps due to detector chip arrangements. Phase retrievals using DPR, DPR with refinement (DPR + R), HIO, and GPS were performed, and the reconstructed images are shown in the left column beneath each diffraction pattern. Local \({R}_{\rm{F}}\) distributions, derived from the Fourier magnitudes of the images, are presented in color in the right columns. Scale bars represent 150 nm. b Pairwise Pearson correlation coefficients (PCCs) for all pairs of results obtained using each method. The boxes and whiskers represent PCCs and their confidence intervals at a 95% confidence level.


The results showed strong positive correlations (above 0.8) with those from conventional algorithms, indicating a high degree of agreement in their morphologies (Fig. 3b). An important advantage of the DNN method is that DPR does not require support constraints: it directly converts Fourier-space data into corresponding real-space objects without additional information, unlike conventional phase retrieval algorithms that require support estimation for real-space constraints. As a hybrid option, refined DPR, with 50 iterations of GPS after DPR, achieved an improved \({R}_{\rm{F}}\), even lower than the GPS result after a thousand iterations (Supplementary Fig. 2a). This indicates that DPR enables efficient optimization by providing starting points already close to the global minimum.

To further evaluate the phase retrieval performance of DPR on general single-pulse diffraction data, we applied it to public datasets from the Coherent X-ray Imaging Data Bank (CXIDB)36. We obtained three datasets, i.e., chlorovirus PBCV-1, bacteriophage T4, and Fe₂O₃ ellipsoid nanoparticles, from the repository37. In these experiments, X-rays with a photon energy of 1.2 keV were used, and diffraction signals were measured with a 1-megapixel pnCCD with a pixel size of 75 × 75 µm² positioned 0.738 m downstream of the sample. This setup provided ideal pixel resolutions of 19.9 nm with a 512 × 512 window and 9.93 nm after 2 × 2 binning for the Fe₂O₃ ellipsoid dataset.

Despite the completely different samples and experimental conditions, real-space images were successfully obtained using DPR (Fig. 4a). The images produced by DPR exhibited distinct shapes with clear internal structures and higher contrast compared to those from conventional algorithms. DPR generated real-space objects that were better aligned with high-\(Q\) diffraction signals and demonstrated strong positive correlations with the results from conventional algorithms, similar to the findings from our own experiments (Fig. 4a, b, and Supplementary Fig. 2b). Thus, DPR was validated as effective for extracting real-space information from diffraction patterns, showing robustness to experimental noise and partial data loss. Notably, DPR was trained without any physical bias and did not require fine-tuning for different types of samples. The consistently improved performance across various datasets confirms the general applicability of DPR. This method enables rapid reconstruction of real-space images from imperfect, noisy diffraction patterns within 10 ms using a single GPU, regardless of experimental conditions or sample types, achieving real-time phase retrieval for XFEL data. In addition, although multiple-scattering effects are ignored in the present DPR, they can be incorporated by modifying the diffraction model with the multislice method38. As DNNs have shown superior performance on various types of data buried in strong noise, even succeeding at phase retrieval of very-low-photon-count measurements in the optical regime, DPR also has the potential to handle diffraction data with extreme noise levels when trained on much weaker diffraction datasets39. Moreover, the techniques employed in DPR, such as WPC, are not limited to phase retrieval but are applicable to various problems in X-ray diffraction experiments, including classification and denoising of measured data.

Fig. 4: DPR on publicly available single-pulse X-ray diffraction data.

a Comparisons of reconstruction results from diffraction patterns of chlorovirus PBCV-1, bacteriophage T4, and Fe₂O₃ ellipsoid nanoparticles from CXIDB. Measured diffraction patterns served as inputs, with masked regions not recorded due to detector chip arrangements. For Fe₂O₃ ellipsoid nanoparticles, 2 × 2 binning was applied to reduce the oversampling ratio before phase retrieval. Phase retrievals using DPR, DPR with refinement (DPR + R), HIO, and GPS were compared. Reconstructed images for each phase retrieval method are displayed in the left column beneath the diffraction patterns in the first row. Local \({R}_{\rm{F}}\) distributions, derived from the Fourier magnitudes of the images, are shown in color in the right columns. Scale bars represent 200 nm. b Pairwise PCCs for all pairs of results obtained using each method. The boxes and whiskers represent PCCs and their confidence intervals at a 95% confidence level.


Discussion

The DNN with the newly proposed architecture excels in solving the phase problem, demonstrating outstanding performance in the phase retrieval of X-ray diffraction patterns. Notably, this network shows excellent tolerance to experimental noise and partial data loss. When applied to single-pulse XFEL diffraction patterns, it achieves rapid and direct reconstruction of real-space images, enabling real-time phase retrieval. High-speed data processing is increasingly important because next-generation X-ray sources generate large volumes of data in a short time, and DPR is well suited to handle such extensive datasets. While the present form of DPR is not designed for complex-valued or 3D objects, it can be readily adapted to such datasets with simple modifications: generating two-channel outputs, with the channels corresponding to the real and imaginary parts, for complex-valued objects, or replacing all 2D operations in DPR with 3D ones for 3D diffraction data. It is also possible to reconstruct 3D objects from 2D images by using real-space tomographic reconstruction algorithms such as RESIRE40. On the other hand, because DPR does not account for correlations between input diffraction patterns, it cannot yet reconstruct 3D objects from correlated 2D diffraction patterns of a specimen in random orientations, which limits its use in 3D single-particle imaging41,42.

The WPC, which utilizes the Guinier–Porod model to guide lost information, highlights the importance of properly handling diffraction data to extract structural information. Despite the WPC-based encoder comprising only 10% of the total trainable parameters in DPR, this approach can be easily adapted to various types of incomplete experimental data, such as X-ray absorption or emission data, by applying appropriate physical models for further improvements in DL-based operations. Thus, DPR not only provides real-time phase retrieval for imperfect diffraction patterns but also represents a novel method for managing partially damaged data from various experiments with distinctive characteristics. The approach is particularly relevant for time-resolved diffraction imaging with high-repetition-rate XFELs, allowing observation of femtosecond dynamics in systems driven far from equilibrium, thus revealing hidden material phases not accessible through equilibrium thermodynamics. DPR is poised to significantly advance this research area by fully utilizing massive datasets in parallel with data collection.

Methods

Weighted partial convolution

Building on the concept of PC, WPC incorporates position-dependent weights based on the Guinier–Porod model, which describes the radial intensity distribution in small-angle scattering data23,26. For an ideal sphere with a smooth surface, the Guinier–Porod model provides the relationship between intensity \(I\) and momentum transfer \(Q\) as follows:

$$I\left(Q\right)=\left\{\begin{array}{ll}G\exp\left(-\dfrac{R^{2}Q^{2}}{5}\right), & \mathrm{for}\ Q\le Q_{1}\\ G\exp\left(-\dfrac{R^{2}Q_{1}^{2}}{5}\right)\left(\dfrac{Q_{1}}{Q}\right)^{4}, & \mathrm{for}\ Q>Q_{1}\end{array}\right.$$
(1)

where \(G\) is the Guinier scale factor; \(R\) is the sphere radius; and \(Q_{1}\) is the boundary momentum transfer between the Guinier and Porod regimes, defined as \(Q_{1}=\sqrt{10}/R\). Here, \(R\) is given by \(\pi/\sigma\), where \(\sigma\) is the oversampling ratio along an axis, to match a unit of momentum transfer with a pixel of the measured diffraction pattern. The position-dependent weights of the WPC were determined using Eq. (1), with \(\sigma=\min\left(H,W\right)/64\), where \(H\) and \(W\) are the height and width of the input for each layer, respectively, and 64 represents the matrix size allocated for the final real-space images. The operation of WPC with convolution kernel \({\boldsymbol{K}}\) is defined as

$$x^{\prime}=\left\{\begin{array}{ll}{\boldsymbol{K}}^{T}\left({\boldsymbol{X}}\odot{\boldsymbol{M}}\right)\dfrac{\sum_{i}{\boldsymbol{W}}_{i}}{\sum_{{\boldsymbol{M}}_{i}\ne 0}{\boldsymbol{W}}_{i}}, & \mathrm{if}\ \sum_{i}{\boldsymbol{M}}_{i}\ne 0\\ 0, & \mathrm{otherwise}\end{array}\right.$$
(2)

where \({\boldsymbol{X}}\) is the input, \({\boldsymbol{M}}\) is the binary mask for the valid data points, \({\boldsymbol{W}}\) is the weight in the region covered by the kernel during convolution, and \(\odot\) denotes element-wise multiplication.
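A minimal numpy sketch of Eqs. (1) and (2) for a single-channel input is given below. The \(Q\)-grid scaling and the naive loop are illustrative simplifications, and the names are ours, not the authors' implementation:

```python
import numpy as np

def guinier_porod_weights(H, W):
    """Position-dependent weights from the Guinier-Porod model (Eq. 1)
    for a smooth sphere, with G = 1 and R = pi / sigma."""
    sigma = min(H, W) / 64                 # oversampling ratio along an axis
    R = np.pi / sigma
    Q1 = np.sqrt(10) / R                   # Guinier-Porod crossover
    y, x = np.indices((H, W))
    # assumed Q grid: radial pixel distance from the pattern center
    Q = np.hypot(y - H / 2, x - W / 2) * (2 * np.pi / max(H, W))
    guinier = np.exp(-R**2 * Q**2 / 5)
    porod = np.exp(-R**2 * Q1**2 / 5) * (Q1 / np.clip(Q, 1e-12, None))**4
    return np.where(Q <= Q1, guinier, porod)

def weighted_partial_conv(X, M, K, Wt):
    """Naive single-kernel WPC (Eq. 2): the convolution over valid pixels is
    rescaled by the ratio of total to valid Guinier-Porod weights."""
    kH, kW = K.shape
    H, W = X.shape
    out = np.zeros((H - kH + 1, W - kW + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            Xp = X[i:i + kH, j:j + kW]
            Mp = M[i:i + kH, j:j + kW]
            Wp = Wt[i:i + kH, j:j + kW]
            if Mp.any():
                out[i, j] = np.sum(K * Xp * Mp) * Wp.sum() / Wp[Mp > 0].sum()
    return out
```

With a fully valid mask the rescaling factor is 1 and WPC reduces to a plain convolution; occluded regions are compensated according to the physically expected intensity falloff rather than uniformly, as in PC.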

Diffraction model

The diffraction model generates diffraction patterns from objects, reflecting the properties of single-pulse X-ray diffraction imaging experiments using XFELs. Basic diffraction patterns are produced by taking the absolute square of the FFT of pseudo-random objects derived from a combination of EMNIST and CIFAR-100 datasets27,28. EMNIST consists of handwritten character digits that define the shapes of the objects, while CIFAR-100 includes images of real-world objects in 100 classes that provide internal density distributions. Specifically, EMNIST images are enlarged using maximum filters with random widths ranging from 3 to 7 pixels, then modified by affine transforms with random angles (0° to 90°) and scales (0.8 to 1.5), and finally cropped to 64 × 64 pixels. This results in an oversampling ratio of approximately 10 to 20 within a 512 × 512 window. CIFAR-100 images are cropped with random scales and aspect ratios ranging from 0.08 to 1 and 0.75 to 1.33, respectively, and then resized to 64 × 64 pixels. After generating the basic diffraction patterns, the Gaussian Schell model is used to account for the finite spatial coherence length of the radiation from the XFELs, as follows:

$${\boldsymbol{I}}^{\prime}=\left|\mathrm{FT}\left\{\mathrm{FT}^{-1}\left[{\boldsymbol{I}}\right]\odot\exp\left(-\dfrac{{\boldsymbol{r}}^{2}}{4\sigma_{\mu}^{2}}\right)\right\}\right|$$
(3)

where \({\boldsymbol{I}}\) is the diffraction pattern, \({\boldsymbol{r}}\) is the matrix of radial distances from the center, and \(\sigma_{\mu}\) is the spatial coherence length43. \(\sigma_{\mu}\) is set to 200 pixels with 10% random deviations. The diffraction patterns are then scaled to total diffraction intensities in the range of 10⁶–10⁷, and mixed Poisson–Gaussian noise is added to the patterns as follows:

$${\boldsymbol{I}}_{i}^{\prime}=\mathrm{Pois}\left({\boldsymbol{I}}_{i}\cdot\dfrac{{\boldsymbol{I}}_{\mathrm{total}}}{\sum_{j}{\boldsymbol{I}}_{j}}\right)+\mathcal{N}\left(0,\sigma\right)$$
(4)

where \(\mathrm{Pois}\left(\lambda\right)\) generates random values from a Poisson distribution with rate \(\lambda\), and \(\mathcal{N}\left(\mu,\sigma\right)\) generates random values from a normal distribution with mean \(\mu\) and standard deviation \(\sigma\). \(\sigma\) was set to 1/2.35482, giving a full width at half maximum of 1 for the Gaussian distribution. The final diffraction patterns were paired with random masks. These masks were created using a combination of center masks with random radii (ranging from 8 to 32 pixels) and positional deviations (ranging from −8 to 8 pixels along each axis), along with irregular masks from the NVIDIA Irregular Mask Dataset23. The occlusion ratio for the irregular masks was limited to 50%. The total number of generated patterns was 96,000 for training, 12,000 for validation, and 12,000 for testing.
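The pipeline of Eqs. (3) and (4) can be condensed into a short numpy sketch. The FFT-ordered radius grid and the fixed parameter values are our simplifications of the randomized settings described above, not the authors' exact generator:

```python
import numpy as np

def simulate_diffraction(obj, sigma_mu=200.0, total_intensity=1e6,
                         sigma_noise=1 / 2.35482, rng=None):
    """Diffraction model sketch: |FFT|^2 of the object, partial-coherence
    damping (Gaussian Schell model, Eq. 3), scaling to a photon budget,
    then mixed Poisson-Gaussian noise (Eq. 4)."""
    rng = np.random.default_rng() if rng is None else rng
    I = np.abs(np.fft.fft2(obj, s=(512, 512)))**2
    # Eq. (3): damp the autocorrelation by a Gaussian of the coherence length
    H, W = I.shape
    y, x = np.indices((H, W))
    r2 = np.minimum(y, H - y)**2 + np.minimum(x, W - x)**2  # FFT-ordered radii
    I = np.abs(np.fft.fft2(np.fft.ifft2(I) * np.exp(-r2 / (4 * sigma_mu**2))))
    # Eq. (4): scale to the target total intensity, add shot + detector noise
    I *= total_intensity / I.sum()
    return rng.poisson(I) + rng.normal(0.0, sigma_noise, I.shape)
```

Each generated pattern would then be multiplied by a random mask (center mask plus irregular mask) before being fed to the network.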

Loss function and network training

The loss function comprises the MAE, the MAE of image gradients, the perceptual loss, and \({R}_{\rm{F}}\) with the ground-truth Fourier magnitudes. These functions are defined as

$$\begin{array}{c}\mathcal{L}({\boldsymbol{X}},{\boldsymbol{Y}})=\dfrac{1}{N}\sum_{i=1}^{N}\left|{\boldsymbol{X}}_{i}-{\boldsymbol{Y}}_{i}\right|,\qquad \mathcal{L}_{\mathrm{grad}}({\boldsymbol{X}},{\boldsymbol{Y}})=\dfrac{\sum_{{\boldsymbol{Y}}_{i}\ne 0}\left\|\nabla{\boldsymbol{X}}_{i}-\nabla{\boldsymbol{Y}}_{i}\right\|_{1}}{\sum_{{\boldsymbol{Y}}_{i}\ne 0}1}\\ \mathcal{L}_{\mathrm{perc}}({\boldsymbol{X}},{\boldsymbol{Y}})=\mathcal{L}(\Phi[{\boldsymbol{X}}],\Phi[{\boldsymbol{Y}}]),\qquad R_{\mathrm{F}}^{\mathrm{GT}}({\boldsymbol{X}},{\boldsymbol{Y}})=\dfrac{\sum_{i}\left|\left|\mathrm{FT}[{\boldsymbol{X}}]\right|_{i}-\left|\mathrm{FT}[{\boldsymbol{Y}}]\right|_{i}\right|}{\sum_{i}\left|\mathrm{FT}[{\boldsymbol{Y}}]\right|_{i}}\end{array}$$
(5)

where \({\boldsymbol{X}}\) is the output from the network, \({\boldsymbol{Y}}\) is the target, and \(\Phi\) is the pretrained neural network. For the perceptual loss, intermediate outputs after the 4th and 5th blocks of ImageNet-pretrained VGG-19 were used31. Additional weights based on the square root of the total diffraction intensity were applied to the outputs to reduce the influence of weak data. Based on this loss function, the network was trained with the AdamW optimizer with \(\beta_{1}=0.9\), \(\beta_{2}=0.999\), and a weight decay of 0.0001 for 500 epochs followed by 100 epochs, with learning rates of 0.001 and 0.0001, respectively30. For the case using the AdamWR optimizer with ASAM, the ASAM parameters were set to \(\rho=0.2\) and \(\eta=0.01\); the learning rate was determined by cosine annealing with a warm-restart scheduler as \(\alpha_{i}=\alpha_{\min}+0.5\left(\alpha_{\max}-\alpha_{\min}\right)\left(1+\cos\left(\pi T/T_{i}\right)\right)\), where \(\alpha_{\min}=10^{-8}\), \(\alpha_{\max}=0.005\), \(T\) is the number of epochs after the most recent restart, and \(T_{i}\) is the number of epochs between two restarts, initially set to 40 and doubled after each restart34. Twelve NVIDIA GeForce RTX 3090 GPUs were used for network training.
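The warm-restart schedule can be written directly from the formula above (a minimal Python sketch; PyTorch's `CosineAnnealingWarmRestarts` provides an equivalent scheduler, and the function name here is ours):

```python
import math

def warm_restart_lr(epoch, alpha_min=1e-8, alpha_max=0.005, T0=40):
    """Learning rate for cosine annealing with warm restarts: the cycle
    length starts at T0 epochs and doubles after each restart."""
    T_i, t = T0, epoch
    while t >= T_i:          # locate the current restart cycle
        t -= T_i
        T_i *= 2
    return alpha_min + 0.5 * (alpha_max - alpha_min) * (1 + math.cos(math.pi * t / T_i))
```

The rate starts at \(\alpha_{\max}\), decays to \(\alpha_{\min}\) over each cycle, and jumps back to \(\alpha_{\max}\) at every restart (epochs 40, 120, 280, ...).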

Evaluation metrics

The performance of DPR was evaluated using three metrics: \(R_{\mathrm{F}}\), PSNR, and SSIM44. The metrics are defined as follows:

$$\begin{array}{c}R_{\mathrm{F}}\left(\boldsymbol{X},\boldsymbol{I}\right)=\frac{\sum_{i,\mathrm{valid}}\left|\,|\mathrm{FT}[\boldsymbol{X}]|_{i}-\sqrt{\boldsymbol{I}_{i}}\right|}{\sum_{i,\mathrm{valid}}\sqrt{\boldsymbol{I}_{i}}},\qquad \mathrm{PSNR}\left(\boldsymbol{X},\boldsymbol{Y}\right)=20\log_{10}\frac{\max\left(\boldsymbol{Y}\right)}{\sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(\boldsymbol{X}_{i}-\boldsymbol{Y}_{i}\right)^{2}}}\\ \mathrm{SSIM}\left(\boldsymbol{X},\boldsymbol{Y}\right)=\frac{\left(2\mu_{\boldsymbol{X}}\mu_{\boldsymbol{Y}}+c_{1}\right)\left(2\sigma_{\boldsymbol{XY}}+c_{2}\right)}{\left(\mu_{\boldsymbol{X}}^{2}+\mu_{\boldsymbol{Y}}^{2}+c_{1}\right)\left(\sigma_{\boldsymbol{X}}^{2}+\sigma_{\boldsymbol{Y}}^{2}+c_{2}\right)}\end{array}$$
(6)

where \(\boldsymbol{X}\) is the output from the network, \(\boldsymbol{Y}\) is the target, \(\boldsymbol{I}\) is the diffraction pattern, \(\mu_{\boldsymbol{X}}\) is the mean of \(\boldsymbol{X}\), \(\sigma_{\boldsymbol{X}}^{2}\) is the variance of \(\boldsymbol{X}\), and \(\sigma_{\boldsymbol{XY}}\) is the covariance of \(\boldsymbol{X}\) and \(\boldsymbol{Y}\). \(c_{1}\) and \(c_{2}\) in SSIM are given by \(\left(0.01\max\left(\boldsymbol{Y}\right)\right)^{2}\) and \(\left(0.03\max\left(\boldsymbol{Y}\right)\right)^{2}\), respectively. A two-sided Mann–Whitney U test was also performed on the evaluation metrics to identify statistical differences in DPR. For cases involving experimental data, the local \(R_{\mathrm{F}}\) and Pearson correlation coefficients (PCCs) for all pairs were calculated. The local \(R_{\mathrm{F}}\) was calculated pixelwise for data points with photon counts exceeding 0.5, while the PCC was defined as \(\mathrm{PCC}\left(\boldsymbol{X},\boldsymbol{Y}\right)=\sigma_{\boldsymbol{XY}}/\sigma_{\boldsymbol{X}}\sigma_{\boldsymbol{Y}}\).
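The metrics in Eq. (6) can be sketched in a few lines of numpy (a simplified illustration, not the evaluation code used in the paper; SSIM is computed here over the whole image rather than in sliding windows):

```python
import numpy as np

def psnr(x, y):
    """Peak signal-to-noise ratio in dB, with the peak taken as max(Y)."""
    rmse = np.sqrt(np.mean((x - y) ** 2))
    return 20 * np.log10(np.max(y) / rmse)

def ssim_global(x, y):
    """Single-window SSIM with c1 = (0.01 max Y)^2 and c2 = (0.03 max Y)^2."""
    c1 = (0.01 * np.max(y)) ** 2
    c2 = (0.03 * np.max(y)) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def pcc(x, y):
    """Pearson correlation coefficient between two images."""
    return np.corrcoef(x.ravel(), y.ravel())[0, 1]
```

Note that a constant offset leaves the PCC at 1 but lowers the PSNR, which is why the paper reports several complementary metrics.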

Phase retrieval parameters

For phase retrieval using HIO and GPS, 1000 iterations were performed with 100 initial random phases32,33. The HIO algorithm was employed with \(\beta=0.9\), and the error-reduction algorithm accounted for 10% of the total iterations. GPS was executed as its R variant (GPS-R) with the following parameters: \(t=1\), \(s=0.9\), \(\sigma\) increasing from 0.01 by a factor of 10 at 40% and 70% of the total iterations, and \(\gamma=1/2\alpha^{2}\) with \(\alpha\) linearly decreasing from 1024 by 10% every 100 iterations. Both algorithms also used the shrink-wrap algorithm, with \(\sigma\) linearly decreasing from 3 pixels by 1% and a threshold of 20% of the maximum value to update the support constraints every 50 iterations. The initial supports were 60 × 60 pixels for the test data and 30 × 30 pixels for the experimental data. The final images were selected based on \(R_{\mathrm{F}}\): a single image for the test data and an average of five images for the experimental data. To refine the outputs from DPR, the support constraints were derived from the output images by thresholding at 1% of the 99th percentile values. Using these supports, GPS-R was conducted for 50 iterations with the following parameters: \(t=1\), \(s=0.9\), \(\sigma\) increasing from 0.1 to 1 at 40% of the total iterations, and \(\gamma=1/2\alpha^{2}\) with \(\alpha\) linearly decreasing from 1024 by 20% every 10 iterations.
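A single HIO iteration can be sketched as follows (a minimal illustration with the paper's \(\beta=0.9\) as the default; it omits the error-reduction steps, GPS-R, and the shrink-wrap support update, and the function name is hypothetical):

```python
import numpy as np

def hio_iteration(g, magnitudes, support, beta=0.9):
    """One hybrid input-output (HIO) iteration: enforce the measured
    Fourier magnitudes while keeping the current phases, then keep the
    result inside the support and relax pixels outside it."""
    G = np.fft.fft2(g)
    # Replace the modulus with the measured one, keep the phase.
    G_prime = magnitudes * np.exp(1j * np.angle(G))
    g_prime = np.real(np.fft.ifft2(G_prime))
    # Inside the support: accept g'; outside: feedback update g - beta * g'.
    return np.where(support, g_prime, g - beta * g_prime)
```

When the current estimate already satisfies both constraints, the update is a fixed point: feeding in the true object with its own Fourier magnitudes returns it unchanged.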

Single-pulse X-ray diffraction imaging experiments

The experiments were conducted at the nanocrystallography and coherent imaging (NCI) beamline of the PAL-XFEL45. X-ray pulses from self-amplified spontaneous emission with a nominal photon energy of 5 keV and a bandwidth of \(\Delta E/E\) ≈ 5 × 10−3 were used for the experiments. The X-ray pulses were focused into a 5 µm (horizontal) × 7 µm (vertical) area by a pair of Kirkpatrick–Baez mirrors installed 5 m upstream of the sample position, giving an effective photon flux of approximately 8 × 109 photons·μm−2 per pulse. Diffraction patterns were recorded using a 1-megapixel multi-port CCD with a pixel size of 50 × 50 μm2, located 1.6 m downstream of the sample position. A beam stop was placed in front of the detector to block the direct X-ray beam; it covered a quadrant of the detector plane. The samples were Ag flower and cube nanoparticles with approximate widths of 250 nm and 150 nm, respectively. These were spread on 100 nm-thick Si3N4 membranes and loaded into the imaging chamber. All beam paths, including the imaging chamber, were kept under vacuum during the measurements. Background signals were subtracted from the measured diffraction patterns, and multiple scattering effects were ignored based on the first-order Born approximation19. Missing values were substituted with values at centrosymmetric positions in accordance with Friedel’s law, ignoring the imaginary parts of the atomic form factors, which are much smaller than their real parts in these experiments. As the Ewald sphere curvature introduced differences of at most 3.20 × 10−3% and 3.38 × 10−2% in the in-plane components of the momentum transfers for the data from PAL-XFEL and CXIDB, respectively, its contribution was ignored. The out-of-plane components of the momentum transfers arising from the Ewald sphere curvature, up to 8.11 × 10−4 nm−1 and 2.06 × 10−3 nm−1, respectively, were also negligible considering the size of the samples (Supplementary Note 1).
