Calibrating quantum gates up to 52 qubits in a superconducting processor

Introduction
Realistic quantum computers suffer from noise, which hinders them from demonstrating an advantage over their classical counterparts1,2,3,4,5. Quantum error correction is the key to reducing noise6,7,8,9,10, yet its implementation hinges on the availability of high-fidelity quantum gates and weak correlations among different gate operations11,12,13. Thus, diagnosing the noise level and assessing the interactions within quantum circuits is essential to improving their performance and, hence, realizing fault-tolerant quantum computing. Within one step of a quantum circuit, single-qubit and two-qubit gates are implemented in parallel to reduce the circuit depth. Due to potential crosstalk and residual coupling between the qubits14, the local gate fidelities may be insufficient to give an overall evaluation of the global gate. In contrast, the global gate fidelity contains information about correlations among different local gates, providing an overall performance metric.
As quantum hardware advances and platforms grow, large-scale quantum gate calibration becomes increasingly crucial. Researchers have made great efforts to develop practical benchmarking methods15,16,17 and enlarge the gate calibration size. Currently, the most frequently used method to evaluate the quantum gate fidelity is randomized benchmarking (RB)18,19,20,21,22, which enjoys the advantages of low complexity and robustness to state preparation and measurement errors. Nonetheless, the original RB protocol20 has only been realized for up to three qubits23 owing to the difficulty of compiling global Clifford gates. Several RB variants24,25,26 were proposed to circumvent the compiling issues and significantly advanced the benchmarking size, to 10 qubits for an individual gate via cycle benchmarking25 and 27 qubits for a gate set via mirror RB26.
Nonetheless, a gap exists between these achievements and the scale of state-of-the-art quantum computers. Currently, quantum computers have been realized with tens or even hundreds of qubits in superconducting27,28,29, ion trap30,31, and neutral atom systems32. Calibrating larger gates is needed to assess and enhance the performance of current quantum computers. Meanwhile, a noticeable challenge to achieving this goal is the low fidelity often associated with large-scale gates. Many RB protocols require repeated execution of the target gate. However, noise can prevent the gate from being repeated without significant signal loss, which can render the protocol ineffective. Resolving this issue requires both enhancing benchmarking protocols so that they remain effective in short-depth circuits and realizing higher-performance quantum gates.
In this work, to achieve large-scale gate calibration, we utilize the character-average benchmarking (CAB)33 protocol. This method evaluates the fidelity of an individual Clifford gate, up to a local unitary transformation, in a manner that is robust to state preparation and measurement errors and scalable with respect to the system size. Compared to cycle benchmarking25, which also aims to evaluate the individual gate fidelity, CAB requires a shorter circuit depth, can tolerate higher gate errors, and can benchmark larger gates. Additionally, CAB exhibits smaller statistical fluctuations than cycle benchmarking, with more details shown in the Supplemental Material (See Supplementary Material for more detailed experimental data of the comparison between character-average benchmarking and cycle benchmarking, the fluctuation analysis of the experimental results, and additional benchmarking and optimization results, which includes references25,33). Meanwhile, in experiments, we improve gate fidelity and reduce a significant portion of gate control errors via qubit frequency adjustment and pre-calibration of gate parameters, as elaborated in Methods. The high fidelity and nearly depolarizing noise of the quantum gates are crucial to our success in realizing large-scale CAB experiments.
Here, we consider two important types of gates for experimental benchmarking. One is a fully connected gate composed of two layers of CZ gates and two layers of local Clifford gates. This gate is a part of the brickwise architecture circuit with an efficient realization scheme on current devices and hence is favored in variational quantum algorithms34. It also plays an essential role in other quantum information processing tasks like simulating a nearest-neighbor interacting Hamiltonian evolution. We benchmark such gates up to 46 qubits and get a fidelity of 17.42% ± 0.45% dressed with local twirling gates.
We also benchmark the parallel CZ gate, composed of local CZ gates implemented in parallel, which excels in generating entanglement across multiple parties and is essential in preparing graph states and executing numerous quantum algorithms35,36. We benchmark such gates from 4 to 44 qubits. The average fidelity of a single local CZ gate is about 98% and does not decrease as the qubit number increases.
With the global fidelity of the parallel CZ gate, we characterize the correlation among the local CZ gates. This procedure can be done simultaneously when benchmarking global fidelity without extra experiments. The correlation data provides the interaction information within the circuit and helps to evaluate and optimize the gate performance. The correlation of the 44-qubit parallel CZ gate turns out to be weak and constantly positive. In contrast, the results of the 52-qubit parallel CZ gate present some negative values. The experimental results are well explained by our established composite noise model, which incorporates local depolarizing and ZZ-coupling noises. The correlation magnitude positively depends on the coupling strength. Interestingly, the correlation sign between two gates varies in two cases: staying positive in two-gate coupling yet turning negative in three-gate coupling when one gate strongly couples with a third one. The correlation results help identify coupling gates and advance the study of correlated noise.
Since an important application of fidelity benchmarking is gate optimization, we demonstrate optimization experiments for parallel CZ gates and compare the outcomes when the target function is set to the parallel CZ gate fidelity and to the individual local CZ gate fidelities. We observe better results with the former. This result validates the effectiveness of CAB and suggests that global fidelity is more effective in optimizing quantum circuit performance because it contains more correlation information. The optimization result is consistent with the ZZ-coupling noise model. When three gates couple, improving the fidelity of one local CZ gate may reduce the fidelities of the others. This antagonistic relationship demonstrates the limitations of local fidelities.
Results
Preliminary
Let us start by briefly revisiting the concept of gate fidelity. Any noisy quantum gate \(\widetilde{U}\) can be treated as a composite of its noise channel \(\Lambda\) and the ideal gate \(U\), expressed as \(\widetilde{U}=U\circ\Lambda\). In this work, the fidelity of gate \(U\) refers to the process fidelity of \(\Lambda\), defined by
\[F(\Lambda)=4^{-n}\sum_{P\in\mathsf{P}_{n}}2^{-n}\,\mathrm{Tr}\left(P\Lambda(P)\right), \tag{1}\]
where \(n\) is the number of qubits and \(\mathsf{P}_{n}=\{\mathbb{I},X,Y,Z\}^{\otimes n}\) represents the \(n\)-qubit Pauli group. This process fidelity is a linear function of the average fidelity, a metric commonly employed in quantum gate benchmarking studies. The summation in Eq. (1), initially spanning \(4^{n}\) terms, can be effectively restructured into \(2^{n}\) terms as follows:
\[F(\Lambda)=\sum_{w\in\{0,1\}^{n}}2^{-2n}3^{|w|}\lambda_{w},\qquad \lambda_{w}=3^{-|w|}\sum_{P\in\mathsf{P}_{n}:\,\mathrm{pt}(P)=w}2^{-n}\,\mathrm{Tr}\left(P\Lambda(P)\right), \tag{2}\]
where \(\mathrm{pt}(P)\) denotes a bitstring that takes 0 on bit \(i\) if \(P\) acts as the identity \(\mathbb{I}\) on qubit \(i\) and takes 1 if \(P\) acts non-trivially, and \(|w|\) signifies the weight of the bitstring \(w\). The quantity \(\lambda_{w}\) is termed the weighted quality parameter of \(\Lambda\) with weight \(2^{-2n}3^{|w|}\), and below, we concisely refer to it as the quality parameter of \(\Lambda\).
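As a small numerical check of this regrouping, the Python sketch below compares a sum over all \(4^{n}\) Paulis with a weighted sum over the \(2^{n}\) support patterns. It assumes a toy noise channel, a tensor product of single-qubit depolarizing channels chosen purely for illustration and not taken from the experiment, and it takes the process fidelity to be the mean of the Pauli eigenvalues \(2^{-n}\mathrm{Tr}(P\Lambda(P))\). The function names and the per-qubit eigenvalues `f` are hypothetical.

```python
import itertools
import math


def pauli_eigenvalue(pauli, f):
    """Return 2^{-n} Tr(P Lambda(P)) for a product of single-qubit
    depolarizing channels (an assumed toy model).

    pauli: tuple of 'I', 'X', 'Y', 'Z', one entry per qubit.
    f[i]:  eigenvalue of the channel on qubit i for any non-identity Pauli.
    """
    return math.prod(f[i] for i, p in enumerate(pauli) if p != "I")


def fidelity_pauli_sum(f):
    """Process fidelity as the mean eigenvalue over all 4^n Paulis."""
    n = len(f)
    total = sum(pauli_eigenvalue(P, f)
                for P in itertools.product("IXYZ", repeat=n))
    return total / 4 ** n


def quality_parameters(f):
    """Group Paulis by support pattern w = pt(P) and average within each
    group; each group with weight |w| contains 3^{|w|} Paulis."""
    n = len(f)
    sums = {}
    for P in itertools.product("IXYZ", repeat=n):
        w = tuple(int(p != "I") for p in P)
        sums[w] = sums.get(w, 0.0) + pauli_eigenvalue(P, f)
    return {w: s / 3 ** sum(w) for w, s in sums.items()}


def fidelity_from_quality(lam):
    """Process fidelity as the weighted sum of 2^n quality parameters,
    with weights 2^{-2n} 3^{|w|}."""
    n = len(next(iter(lam)))
    return sum(3 ** sum(w) * value for w, value in lam.items()) / 4 ** n


f = [0.98, 0.95, 0.90]            # hypothetical per-qubit eigenvalues
lam = quality_parameters(f)
print(fidelity_pauli_sum(f), fidelity_from_quality(lam))
```

For this product channel, both sums should agree with the closed form \(\prod_{i}(1+3f_{i})/4\), which gives a quick sanity check on the weights.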
The core of CAB lies in assessing the quality parameter of the channel \(\sqrt{U^{-1}\Lambda^{\prime}_{U}U\Lambda_{U}}\) through the circuit depicted in Fig. 1(a), with \(\Lambda_{U}\) and \(\Lambda^{\prime}_{U}\) being the Pauli-twirled noise channels of \(U\) and \(U^{-1}\), respectively. This approach yields the CAB fidelity of \(U\), closely approximating \(U\)'s fidelity under physically reasonable conditions33. Hereafter, we refer to the CAB fidelity simply as the gate fidelity unless stated otherwise. While estimating all quality parameters for fidelity evaluation demands exponential resources, the number of quality parameters necessary for accurate fidelity estimation within acceptable error margins and confidence levels is independent of the qubit count, thereby ensuring the scalability of the protocol. The whole procedure is shown in Fig. 1 and elaborated in Methods. Note that the fidelity evaluated by this procedure includes the noise of \(U\) and that of the local twirling gates adjacent to \(U\) and \(U^{-1}\), and is referred to as the dressed fidelity. To isolate \(U\)'s fidelity, one can employ interleaved RB techniques22, comparing the dressed fidelity against the local twirling gate fidelity. In particular, the fidelity of \(U\) is given by22
\[F_{U}=\frac{F_{\mathrm{dress}}}{F_{\mathrm{twirl}}}, \tag{3}\]
with \(F_{\mathrm{dress}}\) and \(F_{\mathrm{twirl}}\) the dressed and twirling gate fidelities, respectively. The local twirling gate fidelity itself is determined through CAB, with the identity operation as the target gate.

Fig. 1. a shows the circuit, beginning with the preparation of the state \(\left\vert 0\right\rangle^{\otimes n}\), followed by a random local Clifford gate \(\bigotimes_{i=1}^{n}C_{i}\). Subsequently, \(2m\) layers of random Pauli gates \(\bigotimes_{j=1}^{n}P_{j}^{(i)}\) are interleaved with alternate sequences of \(U\) and \(U^{-1}\). The inverse gate \(U_{inv}=\left(\prod_{i=1}^{m}\left(U^{-1}\bigotimes_{j=1}^{n}P_{j}^{(2i)}\,U\bigotimes_{j=1}^{n}P_{j}^{(2i-1)}\right)\right)^{-1}\) and the inverse of the local Clifford gate \(\bigotimes_{i=1}^{n}C_{i}^{-1}\) are applied thereafter. Finally, one applies computational-basis measurements and records the outcome. One needs to sample \(K_{r}\) random sequences, and for each sequence, one measures \(K_{s}\) times. The outcome statistics are counted for each sequence, as in (b). After that, one randomly chooses \(K_{q}\) observables \(Z_{w}\in\{\mathbb{I},Z\}^{\otimes n}\), where \(w\in\{0,1\}^{n}\) and \(\mathrm{pt}(Z_{w})=w\), with probability \(2^{-2n}3^{|w|}\). One estimates the expectation values of these chosen observables, and the expectation values need to be averaged across the \(K_{r}\) random sequences. The above procedure is repeated for different \(m\) from a circuit depth set, \(\{m_{1},m_{2},\cdots,m_{M}\}\). For all \(m\), the chosen observables have to be the same. c demonstrates the expectation values of different observables for different sequence lengths. The expectation value \(O_{w}(m)\) is approximately proportional to \(\lambda_{w}^{2m}\), which can be fit to \(A\lambda^{2m}\) to determine the quality parameter \(\lambda_{w}\), as in (d). e shows the last step. The final fidelity estimation is the average of the fitting values, and subsequently, one can further evaluate the gate correlation and employ gate optimization. Practical experimental settings choose constant values for \(K_{r}\), \(K_{s}\), and \(K_{q}\) independent of the qubit count.
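The classical post-processing described in this caption can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: measurement outcomes are assumed to be stored as a dictionary from bitstrings to counts, and the helper names are hypothetical. Drawing each bit of \(w\) independently as 1 with probability 3/4 reproduces the stated sampling probability, since \((3/4)^{|w|}(1/4)^{n-|w|}=2^{-2n}3^{|w|}\).

```python
import random


def sample_observable(n, rng):
    """Sample the support w of Z_w with probability 2^{-2n} 3^{|w|}:
    each bit is 1 with probability 3/4, independently of the others."""
    return tuple(int(rng.random() < 0.75) for _ in range(n))


def expectation_from_counts(counts, w):
    """Estimate <Z_w> from computational-basis counts.

    counts: dict mapping outcome bitstrings such as '0110' to shot counts
            (an assumed data layout).
    w:      tuple of 0/1 marking the qubits on which Z_w acts as Z.
    """
    shots = sum(counts.values())
    total = 0
    for bits, count in counts.items():
        # Parity of the measured bits restricted to the support of w.
        parity = sum(int(b) for b, wi in zip(bits, w) if wi) % 2
        total += count * (1 - 2 * parity)
    return total / shots


rng = random.Random(7)                      # fixed seed for reproducibility
observables = [sample_observable(4, rng) for _ in range(5)]
toy_counts = {"0000": 900, "0001": 60, "1000": 40}   # illustrative only
for w in observables:
    print(w, expectation_from_counts(toy_counts, w))
```

In a full run, the same set of sampled observables would be reused for every circuit depth, and each expectation value would be averaged over the \(K_{r}\) random sequences before fitting.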
Fully connected gate and parallel CZ gate benchmarking
Our CAB experiments for the fully connected gate and the parallel CZ gate are conducted on a 54-qubit superconducting quantum computer. On a two-qubit system, the CZ gate is \(\mathrm{CZ}=\left\vert 0\right\rangle\left\langle 0\right\vert\otimes\mathbb{I}+\left\vert 1\right\rangle\left\langle 1\right\vert\otimes Z\), and its parallel extension across multiple qubits is defined by \(U=\bigotimes_{k=1}^{r}\mathrm{CZ}^{(i_{k},j_{k})}\), where \(\mathrm{CZ}^{(i_{k},j_{k})}\) denotes the CZ gate acting on a specific qubit pair, \((i_{k},j_{k})\), making \(U\) a \(2r\)-qubit gate. The fully connected gate comprises two layers of different parallel CZ gates intertwined with two layers of single-qubit gates, as shown in Fig. 2(a). One of the parallel CZ gates connects qubits 1 and 2, 3 and 4, … and the other connects qubits 2 and 3, 4 and 5, …

Fig. 2. a The left figure demonstrates the qubits and CZ gates used to realize the fully connected gate with 46 qubits. We select a ring on the two-dimensional quantum processor, corresponding to a one-dimensional quantum system. The ring contains two patterns of parallel CZ gates, shown in red and green, respectively. The right figure shows the circuit structure for the fully connected quantum gate on \(n=2r\) qubits, where \(Q_{i}\) represents the \(i\)-th qubit. The gate comprises two layers of single-qubit gates, \(\bigotimes_{i=1}^{n}V_{1}^{i}\) and \(\bigotimes_{i=1}^{n}V_{2}^{i}\), and two layers of different parallel CZ gates. b The quality parameters and the fidelities of the dressed gates, including the error from the local twirling gates and the target gates. The distributions of the noise channel quality parameters are shown with violin plots. Their mean values equal the gate fidelities, shown with a point with an error bar. The length of the error bar equals the standard error of the fidelity estimation.
Fully connected gate benchmarking
The single-qubit gates in the fully connected gate, like \(V_{1}^{i}\) and \(V_{2}^{i}\) in Fig. 2(a), are free to vary. In quantum information tasks like variational quantum algorithms, they are normally chosen according to specific problems. In our experiments, we randomly sample all the single-qubit gates from the Clifford group so that the global gate is a Clifford gate and can be benchmarked with CAB. It is worth mentioning that cycle benchmarking requires implementing the target gate \(U\) a number of times proportional to the gate order, defined as the smallest positive integer \(p\) such that \(U^{p}=\mathbb{I}\). The order of the fully connected gate generally increases rapidly with the qubit number and already reaches the thousands on average for 16 qubits, as we show in the Supplemental Material. The resulting long-depth circuit leads to extremely noisy experimental results and cannot provide effective benchmarking. Instead, by using the target gate and its inverse, this gate can be benchmarked within a short-depth circuit using CAB.
We realized the fully connected gate with qubit numbers from 16 to 46 and benchmarked its dressed fidelity. The dressed fidelity of each target gate is estimated with circuit depths of {0, 1}, using 50 circuit samples per circuit depth and 20,000 measurements per circuit. Theoretically, larger circuit depth differences can reduce fidelity estimation fluctuations, but practical constraints limit how deep the circuits can be. To reduce the impact of gate noise on the measurement results, we choose circuit depths below 2.
Based on the measurement results, we estimated 100 quality parameters and averaged them to compute the gate fidelity. The results are shown in Fig. 2(b), ranging from 63.49% ± 0.07% to 17.42% ± 0.45% for qubit numbers from 16 to 46, with the full data available in the Supplemental Material. The quality parameters are distributed near the mean value, meaning the noise is close to depolarizing noise. This feature is extremely useful in realizing CAB experiments in a low-fidelity region. If the noise is far from depolarizing noise, as is the case for unitary noise, and the fidelity is low, the quality parameters may be below 0. Since a quality parameter \(\lambda\) is obtained by fitting to \(A\lambda^{2m}\) as shown in Fig. 1, a negative quality parameter \(\lambda\) is indistinguishable from \(-\lambda\) and cannot be obtained correctly by the exponential fitting.
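The fitting step and its sign ambiguity can be illustrated with a short Python sketch. It assumes the decay model \(O(m)=A\lambda^{2m}\) and exactly two circuit depths, in which case the fit reduces to a ratio of the two expectation values; the function names and the numerical values are hypothetical and are not experimental data.

```python
def fit_quality_parameter(o_first, o_second, m_first, m_second):
    """Two-point fit of O(m) = A * lam**(2*m).

    Only lam**2 is identifiable from the decay, so the non-negative root
    is returned; a negative quality parameter would be reported with the
    wrong sign.
    """
    ratio = o_second / o_first
    if ratio <= 0:
        raise ValueError("expectation values must share a sign for the fit")
    return ratio ** (1.0 / (2 * (m_second - m_first)))


def fidelity_estimate(quality_params):
    """Average the fitted quality parameters of the sampled observables.

    A plain mean suffices because the observables are assumed to have been
    sampled with probability equal to their weight in the fidelity sum.
    """
    return sum(quality_params) / len(quality_params)


A = 0.9                                   # hypothetical SPAM prefactor
for lam in (0.8, -0.8):                   # hypothetical quality parameters
    o0 = A * lam ** 0                     # depth m = 0
    o1 = A * lam ** 2                     # depth m = 1
    print(lam, fit_quality_parameter(o0, o1, 0, 1))
```

Both inputs yield the same fitted value, which is why the protocol relies on the noise being close to depolarizing when the fidelity is low. The prefactor \(A\) cancels in the ratio, reflecting the robustness to state preparation and measurement errors.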
Parallel CZ gate benchmarking
The benchmarking for the parallel CZ gates contains two patterns, depicted in Fig. 3(a) using two distinct colors. We first evaluate the fidelities of the gates within the orange pattern, consisting of 22 pairs of CZ gates aligned in the same physical direction. This benchmarking was conducted progressively, starting with 2 pairs of CZ gates and incrementally including more gates up to the full set of 22 pairs. Subsequently, we evaluate the gate correlations within the orange and the blue patterns. The latter incorporates 26 pairs of CZ gates and engages almost the entire quantum processor.

Fig. 3. a Two parallel CZ gate patterns, colored orange (22 CZ gates) and blue (26 CZ gates) on a 54-qubit quantum processor, with available qubits shown in black. b Benchmarking fidelities for the parallel CZ gate within the orange pattern. The benchmarking is done progressively from 2 to 22 CZ gates. Brown and purple violin plots illustrate the noise channel quality parameters for the dressed and local twirling gates, respectively. The point with an error bar in each violin plot represents the mean value equal to the gate fidelity, and the length of the error bar equals the standard error of the fidelity estimation. Green diamond-shaped points denote individual fidelities of target gates, fitting well with the theoretical curve \(F^{n/2}\), where \(n\) is the qubit number and \(F=97.94\%\). c A heatmap displays the pairwise correlation among CZ gates within the orange pattern, revealing weak positive correlations among several neighboring gates. We label the indexes of the CZ gates on the two axes. CZAB means the CZ gate between qubits A and B. For instance, CZ0802 is the CZ gate between qubits 2 and 8. The scatter diagram shows that the absolute value of the correlation decreases with the distance between the CZ gates and is mainly large between nearby CZ gates. The CZ gate distance is measured as the minimal line count connecting qubit pairs in the processor layout from (a). d The heatmap and scatter plot for the blue pattern, with the same color scale and interpretation as in (c), indicate more significant correlations compared to the orange pattern and show that large correlations also exist for two remote CZ gates.
The settings for benchmarking parallel CZ gates within the orange pattern are the same as those for the fully connected gate, except that the circuit depths are changed to {0, 2}. In this experiment, we benchmark both the dressed fidelity and the local twirling gate fidelity. The local twirling gate fidelity is obtained by replacing the target gate \(U\) with the identity operation. We then use the interleaved technique22 and Eq. (3) to isolate the pure parallel CZ gate fidelity.
In Fig. 3(b), we show the noise channel quality parameter distribution for the dressed parallel CZ gates and the local gates with violin plots. The mean value equals the fidelity and is shown with a point. The green diamond-shaped points represent the pure fidelities of the parallel CZ gates within the orange pattern. The largest one, the 22-pair parallel CZ gate, possesses a fidelity of 63.09% ± 0.23%. These fidelities have been fit using the function \(F^{n/2}\), where \(n\) is the qubit number. The fit aligns closely with our experimental data, suggesting a fidelity value of approximately 97.94% for a single CZ gate. This indicates that, in the orange pattern, the fidelity of individual CZ gates remains nearly constant and is not affected by an increase in qubit number. This observation implies that the crosstalk among these parallel CZ gates is either limited to short-range interactions or is remarkably minimal. Such a characteristic is critical for the implementation of quantum error correction. The detailed fidelity and standard error data are available in the Supplemental Material.
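The scaling \(F^{n/2}\) can be checked against the two numbers quoted above with a short Python sketch. The one-parameter fit shown here, a least-squares line through the origin in log space, is an assumption for illustration; the text does not specify the fitting procedure that was actually used, and the function names are hypothetical.

```python
import math


def predicted_parallel_fidelity(single_gate_fidelity, num_cz_gates):
    """Fidelity of n/2 uncorrelated CZ gates, each with fidelity F: F**(n/2)."""
    return single_gate_fidelity ** num_cz_gates


def fit_single_gate_fidelity(num_cz_gates, parallel_fidelities):
    """Fit F in ln(F_parallel) = (n/2) * ln(F) by least squares through
    the origin (an assumed, illustrative fitting choice)."""
    numerator = sum(r * math.log(f)
                    for r, f in zip(num_cz_gates, parallel_fidelities))
    denominator = sum(r * r for r in num_cz_gates)
    return math.exp(numerator / denominator)


# Values quoted in the text: F = 97.94% per CZ gate, 22 CZ gates (n = 44).
prediction = predicted_parallel_fidelity(0.9794, 22)
print(f"predicted: {prediction:.4f}, measured: 0.6309")
```

The prediction lands at roughly 63%, within about 0.2 percentage points of the measured 63.09% ± 0.23%, which is the sense in which the single-gate fidelity is not degraded as more gates are added.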
Correlation benchmarking
Beyond gate fidelity, our analysis extends to examining correlations within parallel CZ gates. We define this correlation as the deviation of the parallel CZ gate fidelity from the product of the fidelities of its comprising local CZ gates. The concept of correlation is elaborated in Methods. Any nonzero correlation value emerges as an indicator of interactions among the local gates.
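The verbal definition above can be written as a one-line computation. The Python sketch below assumes the sign convention "joint fidelity minus product of local fidelities", which is one reading of the sentence above; the precise definition is the one given in Methods. The function name and the numerical values are hypothetical.

```python
import math


def gate_correlation(joint_fidelity, local_fidelities):
    """Deviation of the joint fidelity from the product of local fidelities.

    A value of zero is what independent (uncorrelated) local noise would give.
    """
    return joint_fidelity - math.prod(local_fidelities)


# Hypothetical numbers for two CZ gates, for illustration only.
local = [0.98, 0.98]
print(gate_correlation(0.9704, local))   # joint fidelity above the product
print(gate_correlation(0.9604, local))   # joint fidelity equal to the product
```

Under this convention, a positive value means the gates perform better together than independent noise would predict, and a negative value means they perform worse.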
The correlations between every pair of CZ gates within the orange and blue patterns are visualized through heatmaps in Fig. 3(c) and (d), respectively. Additionally, we plot the correlation magnitudes as a function of the physical distance between CZ gates in the processor. For the orange pattern, we used the pure CZ gate fidelities from the parallel CZ gate benchmarking experiment to evaluate the correlation. The result for the blue pattern is obtained from another experiment. The orange pattern exhibits a notable feature: correlations between CZ gates are consistently positive and pronounced only when the gates are close to each other. This observation suggests a negligible presence of long-range interactions in the implementation. However, the 26-pair parallel CZ gate within the blue pattern behaves differently, with substantially higher correlations even for distant CZ gates, indicating the existence of long-range interactions within this configuration. The difference between the two patterns can be explained by whether the CZ gates were calibrated in parallel before benchmarking. Before the experiments, gates in the same direction, such as all the two-qubit gates in the orange pattern, were calibrated in parallel using the method detailed in the “Parallel calibration of controlled-Z gate parameters with back probability” subsection in Methods. To maximize the qubit number of a pattern, the blue pattern is formed by combining gates in two directions. The results show that correlation benchmarking serves as an additional indicator of gate performance, alongside fidelity.
In Methods, we establish a noise model composed of depolarizing and ZZ-coupling noises to explain the correlation benchmarking results. The model first introduces a local depolarizing noise on each gate, followed by a unitary correlated noise. The Hamiltonian of the correlated noise is a summation of pairwise ZZ on each pair of CZ gates, where the coefficients depend on the coupling strengths. Our analysis focuses on the two-gate and three-gate coupling cases. A natural result is that the correlation value between two gates positively depends on their own coupling strength. Weak correlation implies weak coupling. Interestingly, we find that when only two gates couple with each other, the correlation is always positive. The correlation value is normally less than 0.001 for uncorrelated gates and can be on the order of 0.01 for correlated gates, which is consistent with the experimental data. Nonetheless, if one of the two gates couples with a third gate, the correlation between the original two gates can become negative. Particularly, when one gate is strongly coupled with the third gate, and the other gate is weakly coupled to it, the negative correlation becomes significant.
Applying the theoretical analysis to the experimental data, we observe that most correlation values are positive, corresponding to weak coupling or two-gate coupling cases. The results of the orange pattern can be fully explained by two-gate coupling. The pair of CZ2216 and CZ1004 and the pair of CZ4337 and CZ3125 exhibit the strongest couplings. The CZ gates in each of the two pairs are nearest neighbors. For the blue pattern, negative correlation values are observed, indicating that the two-gate coupling model is not sufficient to explain the data. Taking the pair of CZ4943 and CZ1812 as an example, the negative correlation value implies that, besides the coupling between CZ4943 and CZ1812, there exists a third gate strongly coupled with CZ4943 or CZ1812. From the correlation data, we infer that a neighbor of CZ4943, like CZ3731, couples strongly with CZ4943 but not with CZ1812. The coupling relationship among these three CZ gates can lead to a negative correlation between CZ4943 and CZ1812. Note that in real experiments, couplings can involve more than three gates, making the origins of negative correlations more complicated than the three-gate model described here. We expect our results to inspire more explorations into the correlation and coupling among quantum gates.
Parallel CZ gate optimization
In addition to benchmarking, we conducted optimization on parallel CZ gates, employing two distinct approaches: optimizing with global fidelity and optimizing with individual local CZ gate fidelities. In both cases, the Nelder-Mead algorithm is utilized for optimization37,38. Note that the noise of the target gate is much larger than that of the local twirling gates, and the fidelities of the local twirling gates are stable. The noise of the target gate dominates the dressed fidelity, making this quantity sufficient for optimization. To reduce benchmarking time, we optimize using the dressed fidelity, avoiding additional benchmarking of local twirling gates and the interleaved procedure. To minimize the influence of other unstable factors on gate fidelity, we measure both the fidelity of the parameters being iterated (iterative fidelity) and the fidelity of the initial parameters as a reference (reference fidelity) throughout the optimization process. The optimization target function is defined as the difference between these two fidelities. Before this optimization experiment, each local CZ gate was calibrated with the “fast calibration” and “parallel calibration” approaches shown in Methods.
In our experiments, the parallel CZ gate comprises 2n optimizable parameters, with n the qubit number. We chose n as 4 and 6 so that the number of parameters is suitable for the Nelder-Mead algorithm to work. Optimizing gates with tens of qubits requires more scalable optimization algorithms. Figure 4 presents the optimization results for a parallel CZ gate comprising 3 local CZ gates on 6 qubits. The topology of the three CZ gates and the optimization procedure are shown in Fig. 4(a). The three gates are relatively close to each other, so they are more likely to be correlated. Meanwhile, the readout channels of these three gates are relatively stable compared to those of other gates, ensuring minimal influence of environmental noise on the optimization procedure.

Fig. 4. a The left figure shows the topology of 3-pair parallel CZ gates on the quantum processor. The three CZ gates are labeled with A, B, and C. During the optimization, we benchmark the fidelities of the target gate associated with two sets of parameters via CAB. The reference fidelity, fluctuating due to environmental factors, is based on reference parameters, and the iterative fidelity, mainly influenced by varying CZ gate parameters, corresponds to iterative parameters. The target function subtracts the reference fidelity from the iterative fidelity, thus mitigating the interference of environmental factors. The optimization algorithm is the Nelder-Mead algorithm. b The data of global fidelity during the optimization procedure. The line is the fidelity estimation, and the shadow above and below represents the value of one standard error away from the fidelity. The blue and orange lines correspond to the optimization results utilizing global and local CZ gate fidelities, respectively. The left and right figures show the reference and iterative fidelities, respectively. The small figure shows the probability density distribution of the fidelities within iterations 100–180, obtained by kernel density estimation. During this range, two reference fidelities remain stable and close, and iterative fidelities converge. Comparing the iterative fidelities in this range allows for a fair assessment, with the right figure indicating more effective fidelity improvements when global fidelity is used for optimization. c The distribution of gate correlations derived from reference and iterative fidelities within iterations 100–180 in (b). For a 3-pair parallel CZ gate, correlations are observed among all three gates (CorrelationABC) and between each pair (CorrelationAB, CorrelationAC, and CorrelationBC). For instance, CorrelationAB refers to the correlation between CZ gates A and B. The average values and standard deviations of correlations from reference fidelities are similar for both objective functions, indicating consistent environmental influences. However, for iterative fidelities, except CorrelationBC, employing global fidelity as the target function generally leads to improved outcomes.
The progression of fidelities during the optimization is depicted in Fig. 4(b), and the inter-gate correlations, calculated based on data from iterations 100–180, are illustrated in Fig. 4(c). This specific iteration range is chosen as it is the phase of the iterative parameter convergence and stable reference fidelities, indicating a reduced impact from other fluctuating factors. Further experimental details and results of a 4-qubit setup are available in the Supplemental Material.
Below, we compare the optimization results employing global fidelity and the local gate fidelities with data from iterations 100-180. The fidelity is improved to 92.04% and 87.65%, and the correlation is reduced to 3.22% and 3.53% when using global fidelity and the local gate fidelities for optimization, respectively. It is clear that using global fidelity for optimization more effectively enhances fidelity and reduces correlation. Using local gate fidelities tends to yield inferior optimization outcomes, attributable to the lack of correlation information within the target function. In Methods, we use the ZZ-coupling model to explain the difference between global and local gate fidelities. When three gates are mutually coupled, optimizing one local fidelity may cause a decrease in the other two. This antagonistic relationship between local fidelities can trap the optimization in a local region. In contrast, global fidelity incorporates all correlation information, allowing the optimization to proceed monotonically. This finding underscores the significance of correlation in optimizing parallel gates and demonstrates the crucial advantage and essence of benchmarking large-scale quantum gates.
Discussion
In conclusion, we utilize CAB to conduct large-scale benchmarking experiments on the fully connected and parallel CZ gates. The benchmarking of gate correlation provides a quantity for evaluating quantum gate performance beyond gate fidelity, allowing the detection of long-range interactions, which may further be useful in studying many-body physics. Combined with the ZZ-coupling noise model, correlation benchmarking makes it feasible to detect the coupling pattern of the parallel gates. The established noise model also provides insight into further study of near-term quantum devices and quantum error correction.
The results highlight the crucial role of correlation in optimizing parallel quantum gates. Practically, one can decompose the circuit into multiple layers and optimize each layer with improved gate fidelity and reduced inter-gate correlation. This approach is more effective than optimizing each local gate individually, as evidenced by our optimization results. Meanwhile, one can first identify strongly coupled gates through correlation benchmarking and divide them into distinct groups. Gates within each group are strongly correlated, while groups themselves are weakly correlated. By optimizing each group individually, the overall performance of the entire layer can be enhanced. This approach simplifies the optimization process by focusing on the dominant gate correlations.
Our experimental methodology for benchmarking and optimizing large-scale gates applies to enhancing all Clifford circuits up to a local gauge transformation33, including the essential case of quantum error-correcting code circuits39. Typical quantum encoding schemes like surface codes9,10 or, more generally, quantum low-density-parity-check codes40, involve tens or even hundreds of qubits. Optimizing such large batches involves managing many gate parameters, necessitating the development of scalable optimization algorithms rather than solely scalable benchmarking. In the future, advanced large-scale optimization algorithms, particularly gradient-free ones, will help to explore large-gate optimization, ultimately contributing to realizing universal fault-tolerant quantum computers.
Methods
Procedure of character-average benchmarking
Box 1 describes the procedure of the CAB protocol for benchmarking an n-qubit Clifford gate, corresponding to Fig. 1. For non-Clifford target gates and further protocol details, one can refer to Ref. 33.
Note that the procedure in Box 1 differs from the original one in Ref. 33. The main modification lies in step 5. In Ref. 33, one does not sample observables but takes all observables from \(\{\mathbb{I},Z\}^{\otimes n}\), which requires \(K_q = 2^n\). Here, we only need to set \(K_q=O(-\epsilon^{-2}\log \delta)\) to ensure that Eq. (4) differs from the fidelity estimated by traversing all observables by at most a small quantity, ϵ, with a high confidence level, 1 − δ. This follows from Hoeffding's inequality. Since each λi lies in the interval [−1, 1], the mean of the Kq sampled values deviates from its expectation by ϵ or more with probability at most
\[\Pr\left(\left|\frac{1}{K_q}\sum_{i=1}^{K_q}\lambda_i-\mathbb{E}[\lambda]\right|\ge \epsilon\right)\le 2\exp\left(-\frac{K_q\epsilon^2}{2}\right). \tag{5}\]
Setting \(2\exp(-\frac{K_q\epsilon^2}{2})=\delta\), we get \(K_q=2\epsilon^{-2}\log(2\delta^{-1})\). Note that Kq is independent of the qubit number n. Thus, the complexity of the classical postprocessing is independent of n. The number of sampled sequences for fidelity estimation is also independent of n, as proved in Ref. 33. Thus, the whole benchmarking protocol is scalable.
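As a quick numerical illustration of the bound above, the following sketch computes the smallest integer Kq satisfying 2 exp(−Kq ϵ²/2) ≤ δ. The function name and the example values of ϵ and δ are illustrative choices, not parameters used in the experiment.

```python
import math


def num_sampled_observables(epsilon, delta):
    """Smallest integer K_q with 2 * exp(-K_q * epsilon**2 / 2) <= delta."""
    return math.ceil(2.0 * math.log(2.0 / delta) / epsilon**2)


# Illustrative accuracy and confidence targets (assumed values).
k_q = num_sampled_observables(epsilon=0.1, delta=0.05)
```

The qubit number n does not enter the function, which reflects the statement that the postprocessing cost is independent of n.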
Correlation
Here, we introduce the formal definition of the correlation of a parallel gate. We consider a parallel gate, \(U=\bigotimes_{i=1}^{g}U_i\), where Ui acts on fewer qubits than U. In experiments, Ui is normally a one-local or two-local gate, that is, it acts on only one or two qubits. Via CAB, one can simultaneously get the fidelity of U and the fidelities of Ui, denoted as F(U) and F(Ui), respectively. If there is no correlation among Ui, the noise channel of U can be expressed as \(\Lambda=\bigotimes_{i=1}^{g}\Lambda_i\), where Λi is the individual noise of Ui. Then, the global fidelity, F(U), would be equal to the product of local gate fidelities, \(\prod_{i=1}^{g}F(U_i)\). In reality, the interaction among different gates makes the two values differ. We define the following quantity to characterize the total correlation among {Ui, 1 ≤ i ≤ g},
\[C=\frac{F(U)-\prod_{i=1}^{g}F(U_i)}{\sqrt{F(U)\prod_{i=1}^{g}F(U_i)}}. \tag{6}\]
The denominator is a normalization factor. When the correlation is positive, the global fidelity is larger than the product of the local fidelities, indicating that the correlation helps to increase the global fidelity. Since Eq. (6) is defined among g gates, we call it the g-correlation. Besides the g-correlation, one can also obtain the j-correlation among any j gates in {Ui, 1 ≤ i ≤ g}, where 2 ≤ j ≤ g − 1, for the parallel gate \(U=\bigotimes_{i=1}^{g}U_i\). Note that since F(U) and F(Ui) are obtained simultaneously from the same experimental data, the gate correlation can be evaluated concurrently without any additional experimental effort.
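A minimal sketch of how the g-correlation and the j-correlations could be evaluated from a table of measured fidelities is given below. It assumes the correlation is the gap between the joint fidelity and the product of local fidelities, normalized by the square root of their product, a form consistent with the two-gate approximation given later in Methods. The fidelity values and function names are hypothetical.

```python
import math
from itertools import combinations


def correlation(f_joint, f_locals):
    """Normalized gap between a joint fidelity and the product of local ones."""
    prod = math.prod(f_locals)
    return (f_joint - prod) / math.sqrt(f_joint * prod)


def j_correlations(fidelity, g, j):
    """Correlation of every j-gate subset of g gates.

    `fidelity` maps a tuple of gate indices to the fidelity of that subset.
    """
    return {s: correlation(fidelity[s], [fidelity[(i,)] for i in s])
            for s in combinations(range(g), j)}


# Hypothetical fidelities for three parallel gates (illustrative numbers only).
fid = {(0,): 0.97, (1,): 0.96, (2,): 0.98,
       (0, 1): 0.94, (0, 2): 0.951, (1, 2): 0.9408,
       (0, 1, 2): 0.93}
pair_corr = j_correlations(fid, g=3, j=2)
total_corr = correlation(fid[(0, 1, 2)], [fid[(i,)] for i in range(3)])
```

In this toy table, gates 1 and 2 have a joint fidelity equal to the product of their local fidelities, so their 2-correlation vanishes, while the other pairs and the full set show a positive correlation.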
Experimental platform
In this work, we utilized a processor with the same design as the Zuchongzhi2.0 processor41 and selected up to 52 qubits for our experiments. The basic performance of the processor is shown in Table 1, where the single-qubit gate error and the two-qubit CZ gate error have medians of 0.24% and 3.21%, respectively, as measured by cross-entropy benchmarking (XEB)42,43,44. We realize the two-qubit CZ gate with an all-microwave scheme on the coupler, with a fixed gate time of 110 ns45. The approach involves applying a microwave signal with an envelope A(t) and a driving frequency of ωt to the tunable coupler, with an extra flux given by \(\Phi(t)=A(t)\cos(\omega_t t+\phi_0)\). When the driving frequency ωt matches the energy difference between the \(\vert 11\rangle\) and \(\vert 02\rangle\) states, i.e., ωt = ω11 − ω02, resonance takes place between these two states. Here, \(\vert 11\rangle\) is a computational-basis state with each qubit in state \(\vert 1\rangle\), and \(\vert 02\rangle\) is a state outside the computational subspace. However, due to the nonlinear relationship between the extra flux and the coupling strength, although Φ(t) is a clean single-frequency signal, once transformed into the coupling strength it contains significant frequency components not only at ωt, but also at 2ωt and 4ωt. Therefore, when designing the frequency layout of the qubits, it is necessary to prevent ωt, 2ωt, and 4ωt from coinciding with either Δ01,10 = ∣ω01 − ω10∣ or Δ11,20 = ω11 − ω20.
Before our experiments, we adjusted the qubit frequency and calibrated the parameters for each local gate to make the processor perform well. In the following subsections, we will elaborate on this procedure in detail.
Frequency conflict and frequency adjustment
The distribution of qubit frequencies is typically constrained within a range of 0-400 MHz due to magnetic flux noise and the bandwidth of the digital-to-analog converter. When arranging the qubit frequencies, the following factors need to be considered and balanced: (1) Energy relaxation time T1 and dephasing time Tϕ. (2) Spacing of frequencies between neighboring qubits and next-to-nearest neighboring qubits. (3) Two-qubit gate frequencies, as well as their second and fourth harmonic frequencies, and the frequency conflicts with Δ01,10 = ∣ω01 − ω10∣ and Δ11,20 = ω11 − ω20. (4) Maximum frequencies for each qubit, which represent the available frequency range for each qubit. By defining the above factors as error functions and frequency domains, we can obtain a set of theoretically optimal frequencies. After adjusting the frequencies of all qubits to the optimized arrangement, a majority of single-qubit gates and two-qubit gates can achieve high fidelity through standard calibration. However, local fine-tuning is still required for poorly performing gates. Additionally, the performance of a quantum processor may deteriorate during certain time intervals due to long-term periodic frequency variations in two-level systems. This also requires fine-tuning of the corresponding qubits. Qubit frequency tuning is relatively frequent and tedious, as adjusting the frequency of one qubit requires re-calibrating the two-qubit gates associated with it. Therefore, it is crucial to calibrate single-qubit gates and two-qubit gates efficiently in this process. On the other hand, the success of subsequent benchmarking relies on a good initial adjustment of qubit frequencies, as a higher gate fidelity improves the benchmarking accuracy and stability. The configuration of qubit frequencies is therefore key to the success of our experiments.
Fast calibration of controlled-Z gates
After fine-tuning the frequency of a qubit, we need to recalibrate the CZ gates related to it. The main parameters for calibrating the CZ gates are the microwave frequency, the microwave amplitude, and the dynamic phases of the two relevant qubits. First, we roughly determine the microwave frequency and amplitude through the circuit in Fig. 5(a) with N typically set to 0. Within this circuit, the two qubits Q1 and Q2 are initially set at \(\vert 0\rangle\). We first flip these two qubits by applying the pulse Xπ. Then, we apply the microwave pulse once, and after that, we measure the probability of the two qubits returning to the \(\vert 11\rangle\) state. While the microwave pulse is applied, the \(\vert 11\rangle\) and \(\vert 02\rangle\) states are exchanged, and we search for the microwave pulse parameters that maximize the probability of ending in \(\vert 11\rangle\). Then, we fine-tune the microwave amplitude by implementing circuit (a) again. To amplify the errors caused by the parameters, we superimpose 2N + 1 CZ gates, with N > 0 this time. As the conditional phase is relatively sensitive to the frequency, we fine-tune the microwave frequency through the circuit in Fig. 5(b). When the β of Xβ changes, the probability of Q1, or the first qubit, changes as follows:
By fitting P(β) and obtaining ϕI and ϕx, we can get the conditional phase ϕ = ϕx − ϕI. The optimal microwave frequency is the frequency at which ϕ = π is satisfied. We then repeat circuit (a) to calibrate the microwave amplitude further with a large N. The optimal microwave amplitude is the amplitude that makes the probability of the \(\vert 11\rangle\) state closest to 1. The dynamic phases of the two relevant qubits can be calibrated through the circuit in Fig. 5(c). We first vary the dynamic phase Zϕ of Q1 and find the point where the probability of the \(\vert 1\rangle\) state is closest to 1 to complete the dynamic phase compensation for Q1. This process is subsequently repeated for Q2, with \(X_{\frac{\pi}{2}}\) and Zϕ applied to Q2 in circuit (c).

Fig. 5: C represents the coupler. \(X_{\beta}=e^{-\frac{i\beta X}{2}}\), where X is the Pauli X operator. N normally takes the values 1, 3, 5, 7.
Parallel calibration of controlled-Z gate parameters with back probability
When we calibrate CZ gate parameters by amplifying errors through the superposition of multiple layers of CZ gate circuits, we can attain a high level of fidelity for most CZ gates. To further refine the gate parameters, we resort to the Nelder-Mead algorithm to continue the search for CZ gate parameters. Particularly, for each local CZ gate, we input \(\vert 00\rangle\), implement a random sequence of two-qubit Clifford gates, and record the probability of the final state returning to \(\vert 00\rangle\). Each two-qubit Clifford gate is decomposed into CZ and single-qubit gates in implementation. Additionally, we alternate running the same set of random circuits with the reference and iterative parameters. The difference between their outcomes is used as the target function of the Nelder-Mead algorithm to mitigate environmental influence. To quickly calibrate a set of CZ gates where no two gates share a common qubit, we execute the above procedure in parallel for each local CZ gate. Each gate is calibrated independently using its own back probability. This parallel calibration method typically yields better gate parameters than the parameter scanning approach described in the previous subsection.
Fidelity and correlation analysis of depolarizing and correlated noise
In this part, we establish a simple but physical noise model to explain the experimental results. Note that CAB faithfully evaluates the process fidelity of quantum gates33. We consider how the process fidelity behaves under a combination of local depolarizing noise and gate interaction. Particularly, for a unitary gate \(U=\bigotimes_{i=1}^{g}U_i\), we consider the following noise model.
where \(\Lambda_{p_i}\) is a depolarizing noise on the i-th gate with parameter pi such that
Here, di is the dimension of gate Ui, and \(\mathbb{I}_i\) is the identity operator on this subsystem. The noise ΛV is a correlated noise among all local gates Ui, modeled as a unitary evolution:
The form of Λ incorporates both decoherence on individual gates and interactions between gates. The decoherence on each gate is set as a depolarizing type, which is a standard setting in studies of near-term quantum devices3,4,5,28. Different from previous works, we introduce an additional interaction term V, which makes the noise model more physical. The form of V depends on the implemented gate U, which will be specified later. When considering the action of V on a subsystem S, the remainder of the system is treated as a maximally mixed state or a thermalized state at infinite temperature. That is, for state ρS on subsystem S,
Here, ΛV∣S is the restriction of ΛV to S, \(\bar{S}\) is the complementary subsystem of S, \(d_{\bar{S}}\) is the dimension of \(\bar{S}\), and \(\mathbb{I}_{\bar{S}}\) is the identity operator on \(\bar{S}\).
Since the gate \(U=\bigotimes_{i=1}^{g}U_i\) comprises g gates {U1, U2, ⋯ , Ug}, we use [g] = {1, 2, ⋯ , g} to denote the whole system. Given a subset S ⊆ [g], we can represent a part of the gate U, ⨂i∈SUi, whose fidelity is given by \(F(\Lambda_V|_S\circ\bigotimes_{i\in S}\Lambda_{p_i})\) and denoted as FS. Note that the process fidelity has the expression \(F(\Lambda)=\mathrm{tr}(\vert\Phi^{+}\rangle\langle\Phi^{+}\vert\,\Lambda(\vert\Phi^{+}\rangle\langle\Phi^{+}\vert))\), where \(\vert\Phi^{+}\rangle\) is a maximally entangled state on two copies of the system. Through direct calculation, we have that
where L is a subset of S, \(S\setminus L\) is the complementary set of L in S, dL and dS are the dimensions of subsystems L and S, respectively, and
Note that \(\bar{L}\) is the complementary set of L in [g]. From Eq. (13), we can obtain the global fidelity and local gate fidelities, and hence evaluate the correlation and investigate how fidelities depend on noise parameters.
In our experiments, we mainly consider the parallel CZ gate \(U=\bigotimes_{k=1}^{r}\mathrm{CZ}^{(i_k,j_k)}\). Based on experimental observations, the correlated noise mainly arises from the ZZ coupling among qubits. Particularly, we consider a simplified noise model where only one qubit from each CZ gate couples with the corresponding qubits of the other gates. Without loss of generality, we set this qubit as ik. Meanwhile, the Hamiltonian of the correlated noise only contains two-local terms, and the strength between ik and il is set as γkl. Thus, the correlated noise V is
The strength parameter is determined by γkl = gklt, where t is the evolution time, and gkl is the coupling strength between two qubits. In our experiments, t corresponds to the two-qubit gate time, which is 110 ns. For two physically isolated qubits, their coupling strength is typically less than 0.3 MHz, resulting in γkl less than 0.033 for two uncorrelated gates. For two coupled qubits, γkl can be on the order of 0.1.
Substituting Eq. (15) into Eq. (13) gives the global fidelity and local CZ gate fidelities. We provide the results when r = 2 and r = 3, which relates to our correlation benchmarking and gate optimization results. More general cases can be straightforwardly obtained using Eq. (13).
When r = 2, the fidelities of the two local CZ gates are
The global fidelity of the two CZ gates is
The above gives the correlation when only two CZ gates correlate with each other via the ZZ coupling:
In the limit that p1 and p2 are close to 1, \(\sqrt{F_{[2]}F_1F_2}\) is approximately \(p_1p_2\cos^{3}\gamma_{12}\). Then, the correlation is approximately \(\sin\gamma_{12}\tan\gamma_{12}\). For γ12 equal to 0.033 and 0.1, the correlation is approximately 0.001 and 0.01, respectively. This is consistent with our experimental results.
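The quoted numbers can be reproduced with a few lines of Python. This sketch uses the convention γ = g·t stated above and the approximation sin γ tan γ for the two-gate correlation. The function names are illustrative.

```python
import math

GATE_TIME_S = 110e-9  # two-qubit gate time quoted in the text


def coupling_angle(g_hz, t_s=GATE_TIME_S):
    """gamma = g * t, following the convention gamma_kl = g_kl * t."""
    return g_hz * t_s


def two_gate_correlation(gamma):
    """Two-gate correlation in the limit p1, p2 -> 1: sin(gamma) * tan(gamma)."""
    return math.sin(gamma) * math.tan(gamma)


gamma_isolated = coupling_angle(0.3e6)  # 0.3 MHz coupling over 110 ns
corr_isolated = two_gate_correlation(gamma_isolated)
corr_coupled = two_gate_correlation(0.1)
```

For small γ the correlation scales as γ², so a threefold increase in coupling strength raises the correlation by roughly an order of magnitude.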
When r = 3, we give the fidelities of the three local CZ gates, the fidelities of each pair of CZ gates, and the global fidelity.
Here, \(\cos\overrightarrow{\lambda}=\cos\gamma_{12}\cos\gamma_{13}\cos\gamma_{23}\) and \(\sin\overrightarrow{\lambda}=\sin\gamma_{12}\sin\gamma_{13}\sin\gamma_{23}\). In this case, the correlation between the first and second local CZ gates is given by
Note that when p1 and p2 are close to 1, the first term in the formula of fidelities dominates, and we get the above approximation. It is interesting that the sign of the correlation depends on \((\cos^{2}\gamma_{13}-\sin^{2}\gamma_{13})(\cos^{2}\gamma_{23}-\sin^{2}\gamma_{23})\). A negative correlation value implies that one of γ13 and γ23 is larger than π/4.
To further investigate how the correlation depends on parameters γ12, γ13, and γ23, we fix γ12 with values in {0, π/32, π/16, 3π/32, π/8, 5π/32} and depict the correlation with respect to γ13 and γ23. The results are available in Fig. 6.

Fig. 6: Particularly, we assign fixed values to γ12, selecting from the set {0, π/32, π/16, 3π/32, π/8, 5π/32}. The values of γ13 and γ23 range from 0 to 5π/16.
From the expressions in Eqs. (19) and (27), we observe the natural result that the correlation between two gates increases with their own coupling strength. In the two-gate coupling case, the correlation is \(\sin\gamma_{12}\tan\gamma_{12}\), which increases monotonically with γ12 for 0 ≤ γ12 < π/2. In the three-gate coupling case, the numerator of the correlation is proportional to \(\cos^{2}\gamma_{12}\sin^{2}\gamma_{12}\), which also increases monotonically with γ12 up to γ12 = π/4, a range that covers the values of γ12 considered in Fig. 6. Thus, the correlation magnitude directly reflects the coupling strength.
Examining the sign of the correlation reveals distinct behaviors. If only two CZ gates are coupled, their correlation is always positive. However, in a three-gate coupling scenario, the situation changes. The correlation is still positive for a low-strength correlated noise when all parameters are less than π/4. Nonetheless, if two CZ gates exhibit strong ZZ coupling, one of the CZ gates will have a negative correlation with the third CZ gate. From Eq. (27) and Fig. 6, we can see that the negativity becomes particularly pronounced when one coupling is weak while the other is strong. This characteristic is useful for identifying the strongly coupled pair of gates.
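The sign change can be checked numerically with a brute-force sketch. The code below makes several assumptions that are not spelled out in the text: the depolarizing noise is switched off (pi = 1), the correlated noise is taken as the unitary exp(−i Σ γkl Zk Zl) acting on one qubit per gate, and the correlation is the normalized difference between the pair fidelity and the product of local fidelities. All function names and coupling values are illustrative.

```python
import cmath
import math
from itertools import product


def subset_fidelity(gammas, subset, g=3):
    """Process fidelity of exp(-i * sum gamma_kl Z_k Z_l) restricted to `subset`.

    Assumes p_i = 1 and that gates outside `subset` are maximally mixed, so
    the restricted channel is an equal mixture of diagonal unitaries labelled
    by the Z eigenvalues of the outside qubits.
    """
    rest = [k for k in range(g) if k not in subset]
    total = 0.0
    for env in product((1, -1), repeat=len(rest)):
        amp = 0.0
        for inner in product((1, -1), repeat=len(subset)):
            z = dict(zip(subset, inner))
            z.update(zip(rest, env))
            phase = sum(gam * z[k] * z[l] for (k, l), gam in gammas.items())
            amp += cmath.exp(-1j * phase)
        # |tr(U)/d|^2 for the diagonal unitary selected by `env`
        total += abs(amp / 2 ** len(subset)) ** 2
    return total / 2 ** len(rest)


def pair_correlation(gammas, a, b):
    """Normalized correlation between gates a and b."""
    f_ab = subset_fidelity(gammas, (a, b))
    f_a = subset_fidelity(gammas, (a,))
    f_b = subset_fidelity(gammas, (b,))
    return (f_ab - f_a * f_b) / math.sqrt(f_ab * f_a * f_b)


# Keys (0, 1), (0, 2), (1, 2) play the roles of gamma_12, gamma_13, gamma_23.
weak = {(0, 1): 0.1, (0, 2): 0.1, (1, 2): 0.1}
strong = {(0, 1): 0.1, (0, 2): 1.0, (1, 2): 0.1}  # gamma_13 > pi/4
c_weak = pair_correlation(weak, 0, 1)
c_strong = pair_correlation(strong, 0, 1)
```

Under these assumptions, `c_weak` is positive while `c_strong` is negative: a strong coupling between the first and third gates flips the sign of the correlation between the first and second gates, in line with the behavior described above.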
Beyond correlation analysis, the dependence of local gate fidelities and global fidelity on the coupling strength helps explain the differences between these two types of fidelities in gate optimization. In the case of r = 2, this difference is small. Both kinds of fidelities are proportional to \(\cos^{2}\gamma_{12}\), and optimization naturally reduces γ12 to improve performance. Nonetheless, for the three-gate coupling model, the situation can be different. Each local fidelity depends only on two coupling parameters. When optimizing F1, the coupling parameters γ12 and γ13 tend to become smaller, making F1 higher and the correlation weaker. However, given a fixed γ23, the decrease of γ12 or γ13 can also decrease F2 or F3. For instance, \(F_2=p_2(\cos^{2}\gamma_{12}(\cos^{2}\gamma_{23}-\sin^{2}\gamma_{23})+\sin^{2}\gamma_{23})+\frac{1-p_2}{16}\). When \(\cos^{2}\gamma_{23}-\sin^{2}\gamma_{23} < 0\), the decrease of γ12 makes F2 also decrease. This creates a competition between optimizing one local fidelity and another, often leading to the optimization being trapped in a local region. In contrast, global fidelity incorporates all coupling parameters, allowing the optimization process to simultaneously reduce all coupling strengths.
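The competition between local fidelities can be illustrated by evaluating the expression for F2 given above. The value of p2 and the coupling angles below are illustrative choices rather than fitted experimental parameters.

```python
import math


def local_fidelity_f2(p2, gamma12, gamma23):
    """F2 as written in the text for the three-gate ZZ-coupling model."""
    c23 = math.cos(gamma23) ** 2 - math.sin(gamma23) ** 2
    return p2 * (math.cos(gamma12) ** 2 * c23 + math.sin(gamma23) ** 2) + (1 - p2) / 16


p2 = 0.99  # assumed depolarizing parameter

# gamma23 > pi/4, so cos^2 - sin^2 < 0: reducing gamma12 lowers F2.
f2_before = local_fidelity_f2(p2, gamma12=0.3, gamma23=1.0)
f2_after = local_fidelity_f2(p2, gamma12=0.1, gamma23=1.0)

# gamma23 < pi/4: reducing gamma12 raises F2, as in the two-gate case.
f2_weak_before = local_fidelity_f2(p2, gamma12=0.3, gamma23=0.1)
f2_weak_after = local_fidelity_f2(p2, gamma12=0.1, gamma23=0.1)
```

In the strongly coupled setting, an optimizer that lowers γ12 to improve F1 is penalized through F2, which is the antagonism that can trap an optimization driven by local fidelities.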
Responses