Accurate piezoelectric tensor prediction with equivariant attention tensor graph neural network

Introduction
The absence of inversion symmetry and a non-conducting character are necessary for a material to exhibit a piezoelectric effect. The piezoelectric tensor describes the change in electric polarization induced by mechanical stress or strain or, conversely, the change in stress or strain induced by an external electric field. Many bulk piezoelectric materials have been extensively investigated and applied, such as quartz, Rochelle salt, barium titanate, and lead zirconate titanate (PZT)1. These piezoelectric materials drive a multi-billion dollar industry through their applications as sensors, actuators, and energy harvesters. Recently, the discovery of lead-free, high-performance piezoelectric materials has attracted great attention for next-generation eco-friendly and efficient devices.
Conventional discovery of new piezoelectric materials has relied on experimental efforts and density functional theory (DFT) calculations. The former often requires substantial time for experimental synthesis, while the latter demands increasingly vast computational resources as the system size grows, especially for the complicated calculation of third-rank piezoelectric tensors. Recently, machine learning based on existing data has emerged as a new paradigm in materials science research, with greatly enhanced efficiency and reduced resource demand. It has achieved success in predicting various material properties, such as band gaps2, crystal structures3, free energies4, and dielectric constants5. Generally, there are two classes of machine learning approaches in the materials field. First, methods based on feature descriptors, such as Matminer6 and classical force-field-inspired descriptors (CFID)7, have attained considerable success. These methods generate feature descriptors for each specific material and combine them with traditional tree models or other models to predict material properties. They often perform well for scalar predictions and small datasets. Second, graph neural networks (GNNs) play a crucial role in crystalline materials, with representatives like CGCNN8 and ALIGNN9 mapping crystal structures to crystal graphs for neural network training. They display decent performance in predicting most material properties, albeit typically requiring large datasets for training. To improve model performance, several advanced techniques from the field of machine learning, such as equivariance, multi-head attention mechanisms, and layer normalization10, have been introduced into materials science research.
Recently, several efforts have attempted to predict the piezoelectric constants and piezoelectric tensors of crystals, employing conventional machine learning or invariant scalar graph neural networks11,12,13. However, predicting the full piezoelectric tensor, rather than a single piezoelectric component, offers a more comprehensive understanding of a material’s piezoelectric properties. The full tensor provides crucial information about longitudinal, transverse, and shear piezoelectric effects, enabling the accurate description of material responses in all crystallographic directions. Additionally, the signs of different tensor components carry important physical significance. For example, when the longitudinal and transverse components share the same sign, the material may exhibit the electric auxetic effect14, where an electric field induces simultaneous expansion or contraction in all directions. The piezoelectric tensor is typically represented as a 3 × 6 matrix due to index symmetry15, and its representation depends strongly on the choice of coordinate system and on the crystal symmetry. Conventional invariant scalar models struggle to capture the relationship between crystal symmetry and the target tensor, as they are not designed to handle the directional and symmetry-related characteristics of tensors.
To address the limitations of traditional invariant models, equivariant neural networks, which incorporate spatial rotational symmetry, have been proposed and have demonstrated a significant impact in various domains of physical property prediction10,16,17,18,19. Unlike its invariant counterpart, an equivariant model is independent of the reference frame and preserves material symmetry under rotation. Thus, equivariant models can inherently identify materials with inversion symmetry and produce piezoelectric tensors whose elements are all zero. Furthermore, these models generate piezoelectric tensors that conform to the symmetry operations of the various space groups18. Achieving such outcomes with conventional non-equivariant machine learning models is challenging and requires data augmentation to approximate this behavior. Equivariant models are therefore highly suitable for machine learning of piezoelectric tensors.
In this work, we propose an equivariant attention tensor graph neural network (EATGNN) that incorporates multi-head attention mechanisms and layer normalization into the equivariant graph neural network10,17,20. EATGNN is equivariant to rotation transformations of the crystal structure in order to predict the complete piezoelectric tensor of both two-dimensional (2D) materials and bulk crystals. The output of this model is independent of the reference frame, so the choice of reference frame does not affect the model’s results. Moreover, the model can intrinsically capture the symmetry of the material, which is reflected in the output piezoelectric tensor, thereby preserving the material’s symmetry. Compared to the previously developed equivariant neural network for predicting bulk tensor properties18,21, our model incorporates multi-head attention mechanisms and enhanced encoding of atomic attributes. These enhancements strengthen the learning capabilities of the equivariant model, resulting in improved performance in learning piezoelectric tensors. Besides its strong performance for bulk crystals, our model also predicts low-dimensional materials well. This work is significant for accurately predicting piezoelectric tensors and for identifying more potential high-performance piezoelectric materials.
Results
Equivariant attention tensor graph neural network
EATGNN can establish the relationship between crystal structures and their properties. The architecture and specific details of EATGNN are shown in Fig. 1. It takes the crystal structure as input22, represents it as a crystal graph \(\mathcal{G}(V,E)\), and uses message passing and a multi-head attention mechanism to update the graph across multiple graph attention layers. Ultimately, the model outputs the piezoelectric tensor in its irreducible representation, which is then post-processed to obtain the target piezoelectric tensor.

The model converts the input crystal structure into a graph representation, which is then processed by an equivariant graph attention neural network to predict the piezoelectric tensor as the output. In this figure, “\(\oplus\)” denotes addition, “\(\otimes\)” denotes multiplication, and “TP” means tensor product.
The crystal graph \(\mathcal{G}(V,E)\) is composed of nodes and edges, where the nodes represent atoms and the edges represent atomic bonds. The feature vector \(f_{i}\) characterizes atom \(i\), and the initial value of \(f_{i}\) is obtained from Magpie23,24, which describes the physical properties of an atom using a vector that contains 21 atomic physical attributes, including atomic weight, covalent radius, electronegativity, group number, period number, and other information. We then perform one-hot encoding on these 21 atomic attribute values, resulting in a 119-dimensional vector that represents the nodes’ attributes and initial features25. The cutoff radius \(r_{cut}\) varies with the unit cell size to better capture the full symmetry of the crystal, and edges are created to connect atoms within the cutoff radius, taking periodic boundary conditions into account during the search for neighboring atoms. The edge embedding is processed in two parts: first, the radial component \(\|\vec{r}_{ij}\|\), where \(\vec{r}_{ij}\) is the vector between atoms \(i\) and \(j\), is expanded using smooth finite basis functions; second, the angular component \(\hat{\vec{r}}_{ij}\) is expanded using spherical harmonics \(Y_{l}^{m}\).
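The neighbor search under periodic boundary conditions can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `build_edges`, the convention that lattice vectors are stored as rows, and the requirement that fractional coordinates are wrapped into [0, 1) are all choices made for the example.

```python
import itertools
import numpy as np

def build_edges(lattice, frac_coords, r_cut):
    """Return (i, j, r_ij) for every atom pair within r_cut, periodic images included."""
    lattice = np.asarray(lattice, dtype=float)          # rows are the lattice vectors
    cart = np.asarray(frac_coords, dtype=float) @ lattice
    # Spacing between lattice planes along each axis; it bounds how many
    # image cells must be scanned so that no neighbor inside r_cut is missed.
    heights = 1.0 / np.linalg.norm(np.linalg.inv(lattice), axis=0)
    n_img = np.ceil(r_cut / heights).astype(int)
    edges = []
    for shift in itertools.product(*(range(-n, n + 1) for n in n_img)):
        offset = np.array(shift) @ lattice
        for i, j in itertools.product(range(len(cart)), repeat=2):
            if i == j and not any(shift):
                continue                                 # an atom is not its own neighbor
            r_ij = cart[j] + offset - cart[i]
            if np.linalg.norm(r_ij) <= r_cut:
                edges.append((i, j, r_ij))
    return edges

# Simple cubic cell (a = 3 Å) with one atom: only the six nearest images lie within 3.1 Å.
cubic = 3.0 * np.eye(3)
edges = build_edges(cubic, [[0.0, 0.0, 0.0]], r_cut=3.1)
print(len(edges))
```

Each returned relative vector can then be split into its length (for the radial expansion) and its direction (for the spherical harmonics expansion).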
GNNs typically update information through a message-passing mechanism.
Equation (1) represents a general message-passing formula for equivariant graph neural networks, where \(f_{j}\) and \(f_{i}^{\prime}\) are the node input and output features, \(\partial(i)\) is the set of average neighbors of the atom \(i\) in the training dataset, \(\vec{r}_{ij}\) is the relative vector, \(h\) is an MLP that takes the distance between atoms as input, \(Y\) denotes the spherical harmonics, and \(x \otimes y\) is the tensor product of \(x\) and \(y\) parametrized by some weights \(w\)10,18. The tensor product \(\otimes\) is defined in Eq. (2), where \(C\) denotes the Clebsch-Gordan coefficients. Equation (1) is an equivariant function, so if the input is rotated, the output is transformed accordingly. Unlike traditional GNNs, the atomic features in EATGNN include not only scalars but also higher-order irreps tensors, which enhances the model’s generalization capabilities and allows it to achieve better results with less data18.
In this work, we replace the summation in Eq. (1) with a multi-head attention mechanism. The attention mechanism transforms features sent from one node to another with input-dependent weights10, and enhances the model’s ability to update node and graph information by leveraging the neighbors of each node. We perform a tensor product of the node features and node attributes to obtain the query (\(q_{i}\)), and a tensor product of the node features and edge attributes to obtain the key (\(k_{i}\)) and value (\(v_{i}\)) required for the attention computation. The \(q_{i}\), \(k_{i}\), and \(v_{i}\) are each split according to the number of heads10, giving \(q_{im}\), \(k_{im}\), and \(v_{im}\) for each head. We then calculate the attention coefficients \(\alpha_{ijm}\), defined by Eq. (3), compute the output node features \(f_{im}^{\prime}\) as in Eq. (4), and concatenate the representations from all heads17. Afterward, the node features are subjected to nonlinear activation and equivariant layer normalization10. Before the model outputs the target quantity, average pooling is performed across all nodes, since our target physical quantity, the piezoelectric tensor, is independent of the number of atoms in the unit cell.
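The split-attend-concatenate-pool flow can be illustrated with a small NumPy sketch. It is deliberately simplified and rests on assumptions: features are plain scalar channels rather than irreps, the score is a scaled dot product followed by a softmax over incoming edges (a common choice that may differ from Eqs. (3) and (4)), and the name `attention_update` is hypothetical.

```python
import numpy as np

def attention_update(q, k, v, edge_dst, n_heads):
    """Aggregate per-edge values into per-node outputs with multi-head softmax attention.

    q: (n_nodes, d) queries, one per receiving node.
    k, v: (n_edges, d) keys and values, one per edge.
    edge_dst: (n_edges,) index of the node that receives each edge.
    """
    n_nodes, d = q.shape
    d_head = d // n_heads
    qh = q.reshape(n_nodes, n_heads, d_head)             # split channels into heads
    kh = k.reshape(-1, n_heads, d_head)
    vh = v.reshape(-1, n_heads, d_head)
    # Assumed score: scaled dot product between receiver query and edge key.
    scores = np.einsum("ehd,ehd->eh", qh[edge_dst], kh) / np.sqrt(d_head)
    out = np.zeros((n_nodes, n_heads, d_head))
    for i in range(n_nodes):
        mask = edge_dst == i
        if not mask.any():
            continue                                      # isolated node keeps zeros
        s = scores[mask]
        w = np.exp(s - s.max(axis=0))                     # numerically stable softmax
        alpha = w / w.sum(axis=0)                         # weights sum to 1 per head
        out[i] = np.einsum("eh,ehd->hd", alpha, vh[mask])
    return out.reshape(n_nodes, d)                        # concatenate the heads

rng = np.random.default_rng(0)
edge_dst = np.array([0, 0, 1, 1, 1])                      # two nodes, five incoming edges
q = rng.normal(size=(2, 8))
k = rng.normal(size=(5, 8))
v = rng.normal(size=(5, 8))
node_out = attention_update(q, k, v, edge_dst, n_heads=2)
graph_out = node_out.mean(axis=0)                         # mean pooling over nodes
```

Because the attention weights of each node sum to one and the final pooling is a mean, the output scale does not grow with the number of neighbors or atoms, which is the property the text relies on.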
In summary, EATGNN can be considered as an equivariant function to the group transformation, such that:
where \(D_{x}(g)\) and \(D_{y}(g)\) are the transformation matrices for the crystal structure and the piezoelectric tensor, respectively. Equation (5) ensures the model’s capability for equivariant recognition, accommodating the symmetry operations of major space groups and outputting a piezoelectric tensor that conforms to the material’s intrinsic symmetry. Under these symmetry constraints, many components of the piezoelectric tensor are restricted, thereby significantly enhancing the model’s predictive accuracy for piezoelectric tensors.
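For reference, an equivariance condition of this kind is commonly written as follows. The symbols \(f\) (the model viewed as a map from a crystal structure to a piezoelectric tensor), \(x\) (the input crystal structure), and \(G\) (the transformation group) are introduced here for illustration only:

```latex
f\bigl(D_{x}(g)\,x\bigr) = D_{y}(g)\,f(x), \qquad \forall\, g \in G
```

In words, transforming the structure first and then predicting gives the same tensor as predicting first and then transforming the prediction.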
Irreducible representation of piezoelectric tensor
The piezoelectric tensor \(e_{ijk}\) is a third-rank tensor with a total of 27 components, and it satisfies the index symmetry relation \(e_{ijk}=e_{ikj}\). Therefore, a piezoelectric tensor has a total of 18 independent components. The intrinsic symmetry requirements of materials further restrict the number of independent components in the piezoelectric tensor (Fig. 2a). It is worth noting that the specific representation of the third-rank piezoelectric tensor depends on the choice of the coordinate axes. Thus, a rotation of the reference coordinate axes changes the values of the tensor elements. The piezoelectric component varies with the rotation of the coordinate axes as follows:
where \(A_{il}\) are the elements of the rotation matrix relating the two coordinate axes26. Therefore, when we know the piezoelectric tensor in one coordinate system, the formula can be used to find the representation of the piezoelectric tensor in any other coordinate system. Due to the constraints imposed by the intrinsic symmetry of the crystals and in conjunction with Eq. (6), we obtain the distribution of independent elements in the piezoelectric tensors across all crystal systems and point groups, as illustrated in Fig. 2a. Nevertheless, the numerical values of the final piezoelectric tensor matrix still depend on the choice of coordinate system and cannot be directly expressed in an equivariant form with respect to the crystal structure.
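The coordinate transformation described here is the standard law for a third-rank tensor, \(e^{\prime}_{ijk} = A_{il}A_{jm}A_{kn}e_{lmn}\), and it maps directly onto a single `numpy.einsum` call. The sketch below is illustrative: the helper names, the random test tensor, and the Voigt ordering (xx, yy, zz, yz, xz, xy, with no factor of 2 on shear columns) are assumptions made for the example.

```python
import numpy as np

# Voigt pairs for the last two indices: xx, yy, zz, yz, xz, xy.
VOIGT = [(0, 0), (1, 1), (2, 2), (1, 2), (0, 2), (0, 1)]

def rotation_z(theta):
    """Rotation matrix for an angle theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rotate_tensor(e, A):
    """Apply e'_{ijk} = A_il A_jm A_kn e_lmn to a (3, 3, 3) tensor."""
    return np.einsum("il,jm,kn,lmn->ijk", A, A, A, e)

def to_voigt(e):
    """Collapse the symmetric index pair (j, k) into six columns: (3, 3, 3) -> (3, 6)."""
    return np.array([[e[i, j, k] for (j, k) in VOIGT] for i in range(3)])

# Random tensor with the index symmetry e_ijk = e_ikj (18 independent numbers).
rng = np.random.default_rng(0)
e = rng.normal(size=(3, 3, 3))
e = 0.5 * (e + e.transpose(0, 2, 1))

A = rotation_z(np.pi / 4)
e_rot = rotate_tensor(e, A)
print(to_voigt(e_rot).shape)   # the 3 x 6 matrix form in the rotated frame
```

The individual entries of `to_voigt(e_rot)` differ from those of `to_voigt(e)`, while the index symmetry and the overall norm of the tensor are unchanged by the rotation.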

a Symmetry classes and independent components of the piezoelectric tensor in each crystal system and point group. b Irreducible representation decomposition of the piezoelectric tensor.
To address this issue, we performed a harmonic decomposition of the piezoelectric tensor27. Then, the space of piezoelectric tensors is factored into the direct sum of irreducible representations. The piezoelectric tensor can be decomposed into four irreducible subspaces:
where \(\mathcal{H}^{n}\) is the space of \(n\)th-order symmetric and traceless tensors28; a symmetric and traceless tensor is called a deviator. For example, a scalar and a vector are zeroth- and first-order deviators, respectively. The piezoelectric tensor \(e_{ijk}\) has the orthogonal irreducible decomposition27,29:
where \(u=(u_{i})\) and \(v=(v_{j})\) are vectors with three independent components each, \(D=(D_{ij})\) is a second-order symmetric and traceless tensor with five independent components, \(D_{ijk}\) is a third-order symmetric and traceless tensor with seven independent components, \(\epsilon_{ijk}\) is the Levi-Civita symbol, and \(\delta_{ij}\) is the Kronecker delta27,28,29, as shown in Fig. 2b. In Eq. (8), each component of the decomposed piezoelectric tensor is equivariant to rotational operations. This decomposition provides a powerful framework for physically understanding the piezoelectric behavior of various materials based on the contributions from these irreducible subspaces.
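A quick consistency check on this decomposition is to count components. Using the standard fact that an \(n\)th-order deviator in three dimensions has \(2n+1\) independent components, the four subspaces named above (one third-order deviator, one second-order deviator, and two vectors) account for exactly the 18 independent components of the 3 × 6 piezoelectric matrix:

```latex
\dim \mathcal{H}^{n} = 2n + 1
\quad\Longrightarrow\quad
\underbrace{7}_{\mathcal{H}^{3}}
+ \underbrace{5}_{\mathcal{H}^{2}}
+ \underbrace{3}_{\mathcal{H}^{1}}
+ \underbrace{3}_{\mathcal{H}^{1}}
= 18 = 3 \times 6
```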
With the irreducible representation of the piezoelectric tensor, we can now overcome the challenges that the choice of coordinate system posed for piezoelectric tensor prediction in previous reports11,12,13,18. Therefore, by combining the irreducible representation of the piezoelectric tensor and the multi-head attention mechanism with an equivariant model, we obtain a highly accurate AI model for predicting the piezoelectric tensor of crystals. This can significantly accelerate the discovery of new high-performance piezoelectric materials and devices.
Piezoelectric tensor machine learning for bulk crystals
The piezoelectric tensor data of bulk crystals in this work was obtained from the Materials Project21 and a recent high-throughput computational study on piezoelectric tensors24,30. Due to the presence of a significant number of outliers in the piezoelectric tensors, we performed data resampling before applying machine learning techniques to prevent overfitting. Additionally, we restricted the materials used in this work to unit cells containing no more than 25 atoms to ensure a reasonable distribution of data. We examined every material in the dataset to ensure that the target piezoelectric tensors are consistent with the material’s symmetry. Furthermore, we mathematically adjusted those piezoelectric tensors that did not conform to the material’s symmetry so that they are invariant under crystallographic symmetry operations. Ultimately, we obtained a dataset comprising 3444 entries, and these data were randomly divided into training and test sets in a 9:1 ratio.
Figure 3a shows the distribution of the absolute values of the maximum piezoelectric components in the dataset after resampling. The mean absolute error (MAE) of each piezoelectric tensor element in Voigt notation is displayed in Fig. 3b using a heatmap. Unlike other models for piezoelectric tensor prediction, EATGNN can directly predict complete piezoelectric tensors that conform to crystal symmetries, and each tensor element exhibits a relatively small error. The complete piezoelectric tensor allows the maximum piezoelectric components to be determined, including the maximum longitudinal, transverse, and shear piezoelectric effects. Furthermore, the root mean squared error (RMSE) and MAE of EATGNN on the test set are 0.154 C/m² and 0.141 C/m², respectively. Table 1 compares the performance of EATGNN with the equivariant model MatTen18, the invariant model ALIGNN, and the invariant model based on CFID descriptors combined with XGBoost (XGB)31. EATGNN outperforms the other models in both MAE and RMSE. Additionally, we evaluated the impact of the multi-head attention mechanism on the learning capability of the equivariant model. The results in Table 1 indicate that incorporating this mechanism further enhances the model’s predictive performance. This demonstrates the effectiveness of the multi-head attention mechanism and of the harmonic decomposition of the tensor.

a Histogram of the distribution of \(|e_{ij}|_{\max}\) for bulk materials. b Heatmap of the MAE of the piezoelectric tensors for bulk materials in units of \(\mathrm{C}/\mathrm{m}^{2}\). c Histogram of the distribution of \(|e_{ij}|_{\max}\) for 2D materials. d Regression plot of the maximum piezoelectric component in 2D materials. e Regression plot of the maximum longitudinal piezoelectric component in 2D materials. f The maximum piezoelectric component of MoS2 under various rotations of the coordinate system. The coincidence of the solid and dashed lines indicates that EATGNN satisfies equivariance. The green curve represents the maximum piezoelectric components of the initial output piezoelectric tensor at various rotation angles, while the yellow stars denote the maximum piezoelectric components output by EATGNN at each angle.
Piezoelectric tensor machine learning for 2D crystals
The high demand for flexible nanodevices drives interest in 2D piezoelectric materials, which are essential for precision actuators, wearable sensors, and smart materials. These materials serve as nano-generators, offering a practical alternative to micro-scale battery packs for powering nanoscale devices. In contrast to bulk systems, the periodicity along the \(c\)-direction is lost in 2D materials, and consequently the 2D piezoelectric tensor can be further simplified to a 3 × 3 matrix32. To date, no machine learning model has been reported that predicts piezoelectric tensors for both bulk and 2D crystals.
Next, we show that our EATGNN model is also applicable to 2D materials with good accuracy. The piezoelectric tensor data of 2D materials was obtained from the C2DB database33,34, comprising a total of 1382 data points. After analyzing and removing some extreme outliers to prevent the model from learning an inaccurate data distribution, 1350 data points were retained and randomly split into training and test sets in a 9:1 ratio. Figure 3c shows the distribution of the absolute values of the maximum piezoelectric components in the dataset after resampling. EATGNN requires that the output maintains equivariance when the input data undergoes a group transformation. We first checked whether the piezoelectric tensors in the dataset satisfy the symmetry of their corresponding crystal structures. This was achieved by applying the crystal structure’s symmetry operations to the tensor and checking whether it remains invariant under these operations. If a change occurred, we transformed the tensor into a symmetry-compliant form that respects the lattice symmetry22. Generally, this operation uses the symmetry constraints to restrict certain tensor elements to zero.
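One common way to carry out such a check and correction is to average the tensor over all operations of the point group, which projects it onto the symmetry-invariant subspace. The sketch below illustrates that idea; group averaging is an assumption here and is not necessarily the procedure used in this work, and the function names are hypothetical. The list of operations must form a complete (closed) group for the projection to be exact.

```python
import numpy as np

def apply_operation(e, R):
    """Transform a (3, 3, 3) tensor by an orthogonal matrix R."""
    return np.einsum("il,jm,kn,lmn->ijk", R, R, R, e)

def is_invariant(e, operations, tol=1e-8):
    """True if every symmetry operation leaves the tensor unchanged."""
    return all(np.allclose(apply_operation(e, R), e, atol=tol) for R in operations)

def symmetrize(e, operations):
    """Project the tensor onto the invariant subspace by averaging over the group."""
    return np.mean([apply_operation(e, R) for R in operations], axis=0)

def rotation_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
e_raw = rng.normal(size=(3, 3, 3))
e_raw = 0.5 * (e_raw + e_raw.transpose(0, 2, 1))        # impose e_ijk = e_ikj

# Threefold rotation group about z: identity, 120 degrees, 240 degrees.
c3_group = [rotation_z(2.0 * np.pi * n / 3.0) for n in range(3)]
e_c3 = symmetrize(e_raw, c3_group)

# Group containing inversion: every third-rank tensor averages to zero.
inversion_group = [np.eye(3), -np.eye(3)]
e_centro = symmetrize(e_raw, inversion_group)
```

The second case reproduces the statement that centrosymmetric structures have a vanishing piezoelectric tensor: under inversion a third-rank tensor changes sign, so its group average is zero.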
EATGNN can output the full piezoelectric tensor, and the RMSE and MAE are used to evaluate the performance of this model. The RMSE and MAE of our model are 50.6 pC/m and 16.8 pC/m, respectively. Table 1 compares the performance of EATGNN on 2D materials with that of the equivariant model MatTen, the invariant model ALIGNN, and the invariant model based on CFID descriptors combined with XGB, as in the comparison for bulk materials. EATGNN outperforms the other models in both MAE and RMSE, with the incorporation of the multi-head attention mechanism further enhancing its predictive capability.
We also computed several commonly used piezoelectric properties: the maximum piezoelectric component and the maximum longitudinal component, both taken as absolute values (\(|e_{ij}|_{\max}\) and \(|e_{ii}|_{\max}\)). Model performance on the test set is shown in Fig. 3d and e. Additionally, we generated material descriptors using CFID and trained scalar models with XGB and Random Forests (RF)35 for comparison. The results, shown in Table 2, indicate that for predicting the highly symmetry-dependent piezoelectric tensor, EATGNN demonstrates a significant advantage over the descriptor-based methods.
To verify the model’s equivariance, we rotated the crystal structure and checked that the predicted piezoelectric tensor transformed correspondingly. Using Eq. (6), we calculated the piezoelectric tensor under rotational operations at various angles and checked whether it was consistent with the model’s output. As shown in Fig. 3f, we calculated the maximum piezoelectric component \(|e_{ij}|_{\max}\) of MoS2 under arbitrary in-plane rotations of the coordinate system. Simultaneously, we rotated the crystal structure accordingly and employed the model to predict the corresponding rotated piezoelectric tensor and its maximum piezoelectric component. By comparing the predicted and calculated values, we verified the equivariance of the model. The model predictions match the calculated values, confirming that this model is indeed an equivariant model for piezoelectric tensor prediction.
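The structure of such an equivariance test can be sketched as follows. Since the trained network is not available here, `toy_model` is a stand-in that is equivariant by construction (a sum of outer products of atomic position vectors); it is purely hypothetical and only serves to show the two paths being compared: rotate the structure and then predict, versus predict and then rotate the tensor.

```python
import numpy as np

def rotation_z(theta):
    """In-plane rotation by theta (radians) about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rotate_tensor(e, A):
    """Third-rank transformation law e'_{ijk} = A_il A_jm A_kn e_lmn."""
    return np.einsum("il,jm,kn,lmn->ijk", A, A, A, e)

def toy_model(positions):
    """Hypothetical stand-in for a trained model: sum over atoms of r (x) r (x) r."""
    return np.einsum("ai,aj,ak->ijk", positions, positions, positions)

def equivariance_error(model, positions, A):
    """Largest deviation between model(rotated input) and rotated model(input)."""
    predicted = model(positions @ A.T)          # rotate every atom: r' = A r
    expected = rotate_tensor(model(positions), A)
    return np.abs(predicted - expected).max()

rng = np.random.default_rng(1)
positions = rng.normal(size=(4, 3))             # four atoms at arbitrary positions
angles = np.linspace(0.0, 2.0 * np.pi, 13)
errors = [equivariance_error(toy_model, positions, rotation_z(t)) for t in angles]
max_error = max(errors)                         # floating-point noise for an equivariant model
```

A model that is not equivariant, for example one that takes the element-wise absolute value of its output, produces an error well above floating-point noise in the same test.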
Screening of crystals with large piezoelectricity
To discover new potential high-performance piezoelectric materials, we apply our model to 3D materials in the Materials Project and to 2D materials in the 2DMatPedia database36. Our selection criteria are 1) not in the training datasets, 2) inversion asymmetry, 3) \(E_{\mathrm{gap}}\) > 0.1 eV, and 4) \(N_{\mathrm{uc}}\) < 30 atoms. Using our model, we identify several good candidates (see Tables 3 and 4 and the Supplementary Material), which are further validated using density functional perturbation theory (DFPT) calculations. Figure 4a shows the distribution of the maximum piezoelectric components of these newly discovered bulk materials, obtained through first-principles calculations. The result for one of the best piezoelectric materials, CsBiNb2O7, is shown in Fig. 4b, with its piezoelectric tensor components \(e_{22}\) and \(e_{26}\) being 2.01 and 2.72 C/m², respectively. The piezoelectric tensor can be further decomposed into two parts: a clamped-ion term and an internal strain term37. The total \(e_{22}\) is dominated by the large internal strain term, whereas the clamped-ion contribution is nearly zero. Moreover, since the piezoelectric tensor depends on the choice of coordinate system, rotating the structure by 45 degrees around the z-axis can yield a larger piezoelectric tensor component, \(e_{11} = 3.14\) C/m². Based on this property, a Z-45° cut slice can be fabricated in experiments to achieve a greater longitudinal piezoelectric effect.
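The four selection criteria translate into a simple filter. The sketch below is illustrative only: the `Candidate` record, its field names, and the example identifiers are hypothetical and do not correspond to actual database entries or to the database APIs.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    material_id: str      # hypothetical identifier, not a real database key
    has_inversion: bool   # True if the structure is centrosymmetric
    band_gap_ev: float    # E_gap in eV
    n_atoms: int          # N_uc, atoms in the unit cell

def passes_screening(c, training_ids, min_gap_ev=0.1, max_atoms=30):
    """Apply criteria 1-4: unseen, non-centrosymmetric, gapped, small cell."""
    return (
        c.material_id not in training_ids
        and not c.has_inversion
        and c.band_gap_ev > min_gap_ev
        and c.n_atoms < max_atoms
    )

training_ids = {"cand-1"}
pool = [
    Candidate("cand-1", False, 1.2, 10),   # already in the training data
    Candidate("cand-2", False, 2.0, 12),   # satisfies all four criteria
    Candidate("cand-3", True, 3.1, 8),     # centrosymmetric
    Candidate("cand-4", False, 0.05, 6),   # band gap too small
    Candidate("cand-5", False, 1.5, 40),   # unit cell too large
]
selected = [c for c in pool if passes_screening(c, training_ids)]
print([c.material_id for c in selected])
```

Only the structures that pass this cheap filter need to be sent to the model, and only the top-ranked predictions then need DFPT validation.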

a Histogram of the distribution of \(|e_{ij}|_{\max}\) of new potential bulk piezoelectric materials. b The variation of polarization with strain along the b axis in \(\mathrm{CsBiNb}_{2}\mathrm{O}_{7}\). The inset image shows the geometric structure of CsBiNb2O7.
Discussion
In this work, we decomposed the piezoelectric tensors into irreducible representations, enabling them to serve as learning targets for neural networks. Our EATGNN, which uses a multi-head attention mechanism and layer normalization, is constructed specifically for machine learning applications involving piezoelectric tensors. From the predicted full piezoelectric tensor, various piezoelectric performance metrics can be extracted, such as the maximum piezoelectric component and the maximum longitudinal piezoelectric component. By combining the model predictions with DFPT calculations, we screened both bulk and 2D material databases and identified new materials exhibiting large piezoelectric components that have not been reported before.
One of the key novelties in this work lies in the formulation of the piezoelectric tensor prediction as an equivariant learning task, achieved through the irreducible representation decomposition. This approach, implemented via the equivariant graph attention neural network, captures the relationship between the crystal structure and piezoelectric tensor, respecting the inherent symmetries. Consequently, the model can identify materials with inversion symmetric structures, which lack piezoelectric effects, and can also accurately predict the full tensor along with its transformations during coordinate system rotations. This represents a significant advantage over scalar models that are constrained by symmetry limitations.
Furthermore, the synergy between machine learning predictions and first-principles calculations facilitated efficient high-throughput screening, resulting in the discovery of new materials with exceptional piezoelectric performance, which have potential applications in sensors, energy harvesting, and other piezoelectric devices.
The equivariant machine learning approach utilized in this study effectively predicts the piezoelectric stress tensor \(e\) for both 2D and bulk materials. Theoretically, this model is also expected to perform well in predicting the piezoelectric strain tensor \(d\). There are two potential methods to obtain \(d\). The first is to construct a dataset specifically for \(d\) and train a similar equivariant graph attention neural network model to predict it directly. The second is to predict the elastic tensor \(C\) using an equivariant attention graph neural network model. Once \(e\) and \(C\) are obtained, the piezoelectric strain tensor \(d\) can be derived. Therefore, this equivariant machine learning framework is not limited to the piezoelectric stress tensor \(e\), but can also be extended to the piezoelectric strain tensor \(d\) and other tensor properties closely related to crystal structure and symmetries. By capturing the relationship between the crystal structure and target tensors while respecting symmetry constraints, equivariant attention graph neural networks have demonstrated exceptional performance and generalization capabilities. Moreover, this approach provides a powerful method for efficient and accurate machine learning prediction of various material tensor properties governed by symmetries, with potentially significant impact on material design, property optimization, and related fields.
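The second route can be made concrete with a short numerical sketch. It assumes the usual Voigt-notation relation \(e_{ij} = \sum_{k} d_{ik} C_{kj}\), so that \(d\) follows from solving a linear system; the function name, the unit handling, and all numerical values below are illustrative and are not results from this work. Conventions for factors of 2 on the shear columns of \(d\) vary between references and are not addressed here.

```python
import numpy as np

def strain_coefficients(e_voigt, c_voigt_gpa):
    """Solve e = d C for d.

    e_voigt: (3, 6) piezoelectric stress tensor in C/m^2.
    c_voigt_gpa: (6, 6) elastic tensor in GPa.
    Returns d as a (3, 6) array in pC/N.
    """
    # e = d C  <=>  C^T d^T = e^T, solved without forming the inverse explicitly.
    d = np.linalg.solve(c_voigt_gpa.T, e_voigt.T).T
    # (C/m^2) / GPa = 1e-9 C/N = 1e3 pC/N
    return 1.0e3 * d

# Illustrative cubic-like elastic tensor: C11 = 200, C12 = 100, C44 = 50 GPa.
C = np.zeros((6, 6))
C[:3, :3] = 100.0
C[np.arange(3), np.arange(3)] = 200.0
C[np.arange(3, 6), np.arange(3, 6)] = 50.0

# Illustrative piezoelectric stress tensor in C/m^2.
e = np.zeros((3, 6))
e[2, 2] = 1.5
e[2, 0] = e[2, 1] = -0.5
e[0, 4] = e[1, 3] = 0.4

d = strain_coefficients(e, C)
```

Errors in a predicted \(C\) propagate into \(d\) through the linear solve, so the accuracy of this route depends on both models.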
Methods
DFT ab initio calculations for piezoelectric tensor
We use the Vienna Ab Initio Simulation Package (VASP)38 for ab initio calculations, combined with the Atomic Simulation Environment (ASE)39 code to run an automated computational workflow that computes the majority of the properties. The exchange-correlation functional is treated using the Perdew-Burke-Ernzerhof (PBE) parametrization of the generalized gradient approximation (GGA)40. The plane-wave cutoff energy is fixed at 520 eV. The k-point sampling depends on the lattice constants and symmetry of the specific material being calculated.
Density functional perturbation theory (DFPT)41 is employed to calculate the piezoelectric stress tensor \(e_{ijk}\). To simulate the strain, the lattice constant along the strain direction is fixed, while the lattice constants in the other directions are fully relaxed. The polarization change induced by applied strain or stress is then determined by the piezoelectric stress tensor and the piezoelectric strain tensor, respectively. These two tensors can be represented by the following formulas:
where \(i,j,k \in \{1,2,3\}\), \(\sigma\) is strain and \(\eta\) is stress. Using Voigt notation, the piezoelectric stress tensor \(e_{ijk}\) and piezoelectric strain tensor \(d_{ijk}\) can be abbreviated as \(e_{ij}\) and \(d_{ij}\). \(e\) and \(d\) can be related through the elastic tensor \(C\), with the specific formula as follows32:
Dataset
The training data for the model presented in this work is sourced from the Computational 2D Materials Database (C2DB)33,34, the Materials Project21, and a recent high-throughput computational study on piezoelectric tensors30.
Model architecture
The EATGNN model used in this work to study the piezoelectric tensors of crystals operates on crystal graphs, which are constructed with periodic boundary conditions. In this work, the model uses the following form for the cutoff radius:
Here, the initial cutoff radius \(r_{cut}\) is set to a relatively small value to prevent overfitting. In this work, \(r_{cut}\) is set to 5 Å, and \(a\) and \(b\) represent the lattice constants of the crystal.
The unit vector \(\hat{\vec{r}}_{ij}\) is expanded in a spherical harmonic basis up to degree \(l_{\max}\), and the distance \(r_{ij}\) is expanded using the “smooth_finite” function from e3nn20. In this work, the nonlinearity for scalars is chosen to be the SiLU function42, and for each non-scalar part \(x\) of the atom features, the gated nonlinearity is adopted43,
where \(f\) is a nonlinearity function, and \(s\) is a scalar obtained from Eq. (1). Then, an equivariant layer normalization is applied to accelerate the convergence of model training10.
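The reason a gate is used for non-scalar features is that an element-wise nonlinearity would break equivariance, whereas multiplying a vector by a scalar does not. The sketch below illustrates this with NumPy under stated assumptions: the gate is taken to be \(x \mapsto f(s)\,x\) with a sigmoid as \(f\), which is a common choice but not confirmed by the text, and the function names are hypothetical.

```python
import numpy as np

def silu(x):
    """SiLU activation, x * sigmoid(x), applied to scalar channels."""
    return x / (1.0 + np.exp(-x))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_nonlinearity(scalars, gates, vectors):
    """Activate scalars directly; scale each vector by a gate computed from a scalar.

    scalars: (n,) scalar channels.
    gates:   (m,) one gate scalar per vector channel.
    vectors: (m, 3) vector (l = 1) channels.
    """
    return silu(scalars), sigmoid(gates)[:, None] * vectors

rng = np.random.default_rng(0)
scalars = rng.normal(size=4)
gates = rng.normal(size=2)
vectors = rng.normal(size=(2, 3))

theta = 0.5
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta), np.cos(theta), 0.0],
              [0.0, 0.0, 1.0]])

s_out, v_out = gated_nonlinearity(scalars, gates, vectors)
# Scalars and gates are rotation invariant, so only the vectors are rotated.
_, v_out_rotated_input = gated_nonlinearity(scalars, gates, vectors @ R.T)
```

Rotating the input vectors and then gating gives the same result as gating and then rotating, which is exactly the equivariance property the layer must preserve.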
After updating the information in the graph, a pooling operation is required on the graph’s nodes. Since the piezoelectric tensor is independent of the number of atoms in the unit cell, we apply mean pooling and connect it to the irreducible representation (“\(2\times 1\mathrm{o}+1\times 2\mathrm{o}+1\times 3\mathrm{o}\)” in e3nn notation) of the piezoelectric tensor through an equivariant linear layer.
Training
Networks are trained using a mean-squared-error loss function based on the complete piezoelectric tensor:
where \(e_{nijk}^{\prime}\) is the predicted piezoelectric tensor, \(e_{nijk}\) is the target piezoelectric tensor, and \(N\) is the batch size. The MAE and RMSE of the piezoelectric tensor are defined as:
We train this model with the AdamW optimizer44 to minimize the loss function with a mini-batch size of 16. The learning rate is set to 0.01 for the 2D material piezoelectric tensor and to 0.0003 for the bulk material piezoelectric tensor. The number of heads in this work is 2, the number of graph attention layers is 2 for bulk materials and 3 for 2D materials, and the dimension of the atom embedding is 32. The query and key of this network are set to 16x0e + 16x0o + 8x1e + 8x1o + 4x2e + 4x2o + 2x3e + 2x3o. A scheduler is employed to adjust the learning rate, setting it to decay by a factor of 0.95 each epoch, over a total of 30 training epochs.
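The error metrics and the learning-rate schedule can be sketched as follows. The exact normalization of the loss, MAE, and RMSE is an assumption here: the sketch averages over every component of every tensor in the batch. The function names are hypothetical, and the schedule is written in closed form rather than through a particular optimizer library.

```python
import numpy as np

def tensor_metrics(predicted, target):
    """MSE, MAE and RMSE over all components of a batch of tensors.

    predicted, target: arrays of shape (N, 3, 3, 3).
    """
    diff = predicted - target
    mse = np.mean(diff ** 2)
    return {"mse": mse, "mae": np.mean(np.abs(diff)), "rmse": np.sqrt(mse)}

def learning_rate(initial_lr, epoch, decay=0.95):
    """Learning rate after `epoch` completed epochs of exponential decay."""
    return initial_lr * decay ** epoch

rng = np.random.default_rng(0)
target = rng.normal(size=(16, 3, 3, 3))                    # one mini-batch of 16 tensors
predicted = target + 0.1 * rng.normal(size=target.shape)   # synthetic predictions
metrics = tensor_metrics(predicted, target)

# Schedule for the 2D setting: start at 0.01 and decay by 0.95 per epoch for 30 epochs.
schedule = [learning_rate(0.01, epoch) for epoch in range(30)]
```

For any set of errors the RMSE is at least as large as the MAE, and a large gap between the two signals that a few tensors with very large errors dominate the RMSE.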
Responses