Related Articles
The design space of E(3)-equivariant atom-centred interatomic potentials
Molecular dynamics simulation is an important tool in computational materials science and chemistry, and in the past decade it has been revolutionized by machine learning. This rapid progress in machine learning interatomic potentials has produced a number of new architectures in just the past few years. Particularly notable among these are the atomic cluster expansion, which unified many of the earlier ideas around atom-density-based descriptors, and Neural Equivariant Interatomic Potentials (NequIP), a message-passing neural network with equivariant features that exhibited state-of-the-art accuracy at the time. Here we construct a mathematical framework that unifies these models: atomic cluster expansion is extended and recast as one layer of a multi-layer architecture, while the linearized version of NequIP is understood as a particular sparsification of a much larger polynomial model. Our framework also provides a practical tool for systematically probing different choices in this unified design space. An ablation study of NequIP, via a set of experiments looking at in- and out-of-domain accuracy and smooth extrapolation very far from the training data, sheds some light on which design choices are critical to achieving high accuracy. A much-simplified version of NequIP, which we call BOTnet (for body-ordered tensor network), has an interpretable architecture and maintains its accuracy on benchmark datasets.
Myeloid neoplasms with PHF6 mutations: context-dependent genomic and prognostic characterization in 176 informative cases
Recent reports suggest a favorable prognosis for PHF6 mutation (PHF6MUT) in chronic myelomonocytic leukemia (CMML) and unfavorable in acute myeloid leukemia (AML). We accessed 176 consecutive patients with a spectrum of myeloid neoplasms with PHF6MUT, including AML (N = 67), CMML (N = 49), myelodysplastic syndromes (MDS; N = 36), myeloproliferative neoplasms (MPN; N = 16), and MDS/MPN (N = 8). PHF6 mutations were classified as nonsense (43%) or frameshift (30%) with the PHD2 domain being the most frequently (64%) affected region. Median follow-up was 25 months with 110 (63%) deaths and 44 allogeneic transplants. Our top-line observations include (i) a distinctly superior overall survival (OS; 81 vs. 18 months; p < 0.01) and blast transformation-free survival (BTFS; “not reached” vs. 44 months; p < 0.01) in patients with CMML vs. those with other myeloid neoplasms, (ii) a higher than expected frequency of isolated loss of Y chromosome in the setting of CMML (16% vs. expected 6%) and MDS (8% vs. expected 2.5%), (iii) a significant association, in MDS, between PHF6MUT variant allele fraction (VAF) > 20% and inferior OS (HR 3.0, 95% CI 1.1–8.1, multivariate p = 0.02) as well as between female gender and inferior BTFS (HR 26.8, 95% CI 1.9–368.3, multivariate p = 0.01), and (iv) a relatively favorable median post-transplant survival of 46 months. Multivariable analysis also identified high-risk karyotype (HR 5.1, 95% CI 1.2–20.9, p = 0.02) and hemoglobin <10 g/dL (HR 2.7, 95% CI 1.0–7.2, p = 0.04) as independent predictors of inferior OS in patients with MDS. The current study provides disease-specific information on genotype and prognosis of PHF6-mutated myeloid neoplasms.
A data-driven generative strategy to avoid reward hacking in multi-objective molecular design
Molecular design using data-driven generative models has emerged as a promising technology, impacting various fields such as drug discovery and the development of functional materials. However, this approach is often susceptible to optimization failure due to reward hacking, where prediction models fail to extrapolate, i.e., fail to accurately predict properties for designed molecules that considerably deviate from the training data. While methods for estimating prediction reliability, such as the applicability domain (AD), have been used to mitigate reward hacking, applying them in multi-objective optimization is challenging. The difficulty arises from the need to determine in advance whether the multiple ADs with some reliability levels overlap in chemical space, and to appropriately adjust the reliability levels for each property prediction. Herein, we propose a reliable design framework to perform multi-objective optimization using generative models while preventing reward hacking. To demonstrate the effectiveness of the proposed framework, we designed candidates for anticancer drugs as a typical example of multi-objective optimization. We successfully designed molecules with high predicted values and reliabilities, including an approved drug. In addition, the reliability levels can be automatically adjusted according to the property prioritization specified by the user without any detailed settings.
Accurately adjusted phenothiazine conformations: reversible conformation transformation at room temperature and self-recoverable stimuli-responsive phosphorescence
Conformational flexibility is essential to the stimuli-responsive properties of organic materials, but achieving reversible molecular transformation in functional materials remains challenging because of high energy barriers and restriction by intermolecular interactions. Herein, various steric hindrances were incorporated into phenothiazine derivatives at different positions and in different quantities to tune the molecular conformations through adjustable repulsive forces, so that the folded angles gradually changed from 180° to 90° across 17 compounds. When the angle was 112°, with a moderate steric effect, dynamic and reversible transformation of conformations under mechanical force was achieved owing to the low energy barriers and mutually regulated molecular motions, resulting in both self-recoverable and stimuli-responsive phosphorescence properties for the first time. This opens up a new way to realize self-recovery in organic materials, which can facilitate multi-functional smart materials and may inspire work in other fields.
The contribution of genetic determinants of blood gene expression and splicing to molecular phenotypes and health outcomes
The biological mechanisms through which most nonprotein-coding genetic variants affect disease risk are unknown. To investigate gene-regulatory mechanisms, we mapped blood gene expression and splicing quantitative trait loci (QTLs) through bulk RNA sequencing in 4,732 participants and integrated protein, metabolite and lipid data from the same individuals. We identified cis-QTLs for the expression of 17,233 genes and 29,514 splicing events (in 6,853 genes). Colocalization analyses revealed 3,430 proteomic and metabolomic traits with a shared association signal with either gene expression or splicing. We quantified the relative contribution of the genetic effects at loci with shared etiology, observing 222 molecular phenotypes significantly mediated by gene expression or splicing. We uncovered gene-regulatory mechanisms at disease loci with therapeutic implications, such as WARS1 in hypertension, IL7R in dermatitis and IFNAR2 in COVID-19. Our study provides an open-access resource on the shared genetic etiology across transcriptional phenotypes, molecular traits and health outcomes in humans (https://IntervalRNA.org.uk).