Proteome selectivity profiling of photoaffinity probes derived from imidazopyrazine-kinase inhibitors
Introduction
Protein phosphorylation, catalyzed by protein kinases, represents one of the most common post-translational modifications (PTMs) and regulates a wide variety of signaling networks in the cell1. Changes in protein phosphorylation mediate processes such as transcription, apoptosis, metabolism, cell proliferation, and cell differentiation1,2. Unsurprisingly, dysregulation of kinase activity contributes to various human pathologies, including inflammatory diseases3, neurodegenerative diseases4 and cancer5.
The human genome codes for more than 500 kinases2, which are subdivided into 7 subgroups (Fig. 1A). The kinase domains consist of two lobes, the N- and the C-terminal lobe, connected by a hinge region (Fig. 1B). The ATP-binding pocket is located between the two lobes, and the substrate binding site is located on the C-terminal lobe. Interactions at this site determine, in part, the selectivity of kinases towards certain substrates6. Because of their role in various human diseases, kinase inhibitors have received considerable attention in drug discovery programs. This has been quite successful: approximately 80 kinase inhibitors have been approved by the Food and Drug Administration (FDA), predominantly for treatment of cancer (http://www.brimr.org/PKI/PKIs.htm). Nevertheless, the development of selective kinase inhibitors remains highly challenging. One of the main reasons is that most kinase inhibitors mimic the ATP structure and bind into the conserved ATP binding site. Many medicinal chemistry efforts have focused on small differences in the ATP binding site in order to develop kinase inhibitors that are selective for a kinase target of interest7. Conversely, there is also evidence that a certain degree of promiscuity (i.e., targeting multiple kinase targets) may be beneficial for treatment of disease because of redundancies of kinases involved in signaling pathways8.

A Kinome tree with indication of BTK, IRE1, and IGF1R, and the 7 kinase subgroups (TK = tyrosine kinases; TKL = TK-like kinases; STE = STE kinases; homologous to yeast STE20, −11, and −7 kinases; CK1 = casein kinase 1 homologs; AGC = protein kinase (A, G, C) families; CAMK = Ca2+/calmodulin-dependent protein kinases, CMGC = diverse group containing cyclin-dependent kinases, mitogen-activated protein kinases, glycogen synthase kinases, and Cdk-like kinases). B Structure of a kinase domain (here exemplified by IGF-1R; PDB code: 3D94), with indication of the N- and C-lobes and the ATP-binding pocket. Protein in cartoon mode with helices in cyan, β-sheets in magenta, and random coils in pink. Picture rendered with PyMol45. C Structure of KIRA6 with indicated numbering of the imidazopyrazine scaffold, and the KIRA6-derived photoaffinity probe 1. D Structures of acalabrutinib and linsitinib. E Co-crystal structure of acalabrutinib and murine BTK (PDB: 8FD9) reveals that the N-substituent on the 5-membered ring points towards the solvent. Acalabrutinib and the covalently-bound cysteine C481 are depicted in sticks, and the protein in cartoon and semi-transparent surface representation. Picture rendered with PyMol.
Various mass spectrometry (MS) techniques exist to identify the target(s) of a kinase inhibitor from whole proteomes. They include thermal proteome profiling9, kinobeads10,11, and photoaffinity labeling (PAL)12. The benefit (in comparison to evaluation against purified kinases) is that some of these techniques also identify non-kinase targets. Through a PAL approach, we have recently shown that the imidazo[1,5-a]pyrazine-based kinase inhibitor KIRA6 (Fig. 1C), which is a reported inhibitor of IRE1α13, has a wide range of other non-kinase targets14, including the ATP-binding HSP6015. Interestingly, the imidazopyrazine core structure occurs in the FDA-approved drug acalabrutinib, which inhibits Bruton’s tyrosine kinase (BTK), in the drug candidate linsitinib, which targets insulin-like growth factor 1 receptor (IGF-1R) (Fig. 1D), and in several other reported kinase inhibitors, for example for activated Cdc42-associated kinase (ACK1)16 and Plasmodium falciparum calcium-dependent protein kinase 117. To provide more insight into the selectivity of this scaffold and how it is influenced by the different substituents, we designed, synthesized and evaluated imidazopyrazine photoaffinity probes. In line with our previous study14, we found that imidazopyrazine-based small molecules target proteins outside the kinome. In silico analysis further suggests that their proteome selectivity (i.e., the number of off-targets) is likely influenced by the size and rigidity of the substituents interacting with the binding pocket (primarily the C1 substituent), as well as the overall three-dimensional conformation in solution. Overall, the presented data may offer insight on how to tune the proteome selectivity of imidazopyrazine-based inhibitors.
Results and discussion
Besides our already available KIRA6-based photoaffinity probe 1 (Fig. 1C), we set out to develop two additional probes that share the imidazopyrazine core, based on the ATP-competitive kinase inhibitors linsitinib and acalabrutinib. Both inhibitors exhibited high selectivity for their respective primary targets when assessed in traditional biochemical assays (against panels of purified protein kinases)18,19. Linsitinib is a reversible kinase inhibitor—currently in a phase 2 clinical trial (NCT05276063)20—that targets insulin-like growth factor 1 receptor (IGF-1R) and a homolog kinase, the insulin receptor (IR). In contrast, acalabrutinib is a second generation, irreversible BTK inhibitor that was approved by FDA in 201721. It bears a butynamide moiety that covalently reacts with Cys481 located just outside the ATP pocket of BTK. In this study, our focus was to understand how the proteome selectivity of imidazopyrazine-based inhibitors is influenced by ATP pocket-binding substituents, rather than additional chemical features that define the mode of binding. Therefore, the probe derived from the acalabrutinib scaffold was equipped with a photoaffinity group, as done previously for ibrutinib22. Additionally, a photoaffinity group would allow direct comparison with the other PAL probes in this study. Similar to our previous KIRA6 probe 114, we decided to introduce the same minimalist alkyne diazirine building block23. Although diazirine has some preference for acidic side chains24, which may contribute to lower photoaffinity labeling efficiencies, it is also small and leads to low non-specific protein modification25. The photoaffinity building block was placed at the C3 position of the imidazopyrazine scaffold, because crystal structures of acalabrutinib bound to murine BTK26 (Fig. 1E) and of a near-identical linsitinib derivative to IGFR27 (Fig. S1) showed that substituents at this position are solvent-exposed. 
Although this placement may result in missing some of the off-targets of the parent compound, we reasoned that this design likely retains the ability to interact with nucleotide binding pockets. Therefore, the minimalist photoaffinity handle was designed to replace the butynamide moiety on acalabrutinib and the hydroxyl group on linsitinib. For both molecules, the stereochemistry of the C3-substituent on the PAL probes was maintained as in the parent compounds.
The synthesis of the probes is outlined in Fig. 2. In brief, chloropyrazine 2 was coupled to Cbz-protected proline (10a; for acalabrutinib) or cyclobutane derivative 10b (for linsitinib), followed by Bischler-Napieralski cyclization to obtain the imidazopyrazine cores 4a-b28. Next, iodination at the C1 position and nucleophilic aromatic substitution of the chlorine by ammonia yielded compounds 6a-b. Suzuki coupling was employed to introduce the C1 substituent, giving 7a-b. Finally, temporary protection of the exocyclic amine with Fmoc, followed by Cbz removal, coupling with diazirine building block 12 and final Fmoc deprotection furnished the desired acalabrutinib-based (9a) and linsitinib-based (9b) PAL probes.

(a) Cmp 10a or 10b, EDC.HCl, DCM, rt, o/n; (b) PCl5, MeCN, 0 °C, then 50 °C; (c) N-iodosuccinimide, DMF, 60 °C, o/n; (d) 35% NH4OH, dioxane, 100 °C; (e) Cmp 11a or 11b, K2CO3, Pd(PPh3)4, water/DME, 80 °C, o/n; (f) Fmoc-Cl, pyridine, DCM, rt, o/n; (g) 33% HBr in AcOH, 0 °C, then 1 h at rt; (h) Cmp 12, HATU, DIEA, DMF, rt, o/n; (i) 1% DBU in DMF, 1 h, rt.
We next set out to demonstrate the ability of probes 9a-b to covalently label proteins upon irradiation. To this end, an increasing concentration of probe was incubated in lysates of MCF7 breast cancer cells or Ramos cells, a lymphoma-derived cell line. Irradiation at 365 nm was followed by click chemistry-mediated tagging of the covalently modified proteins with an azido-TAMRA derivative. Proteins were then resolved by SDS-PAGE and visualized on a flatbed fluorescent scanner. As expected, an increasing probe concentration led to more intense labeling, which was largely competed out by co-incubation with an excess of the parent kinase inhibitor (Fig. 3A). Multiple protein species were labeled by acalabrutinib probe 9a and linsitinib probe 9b, which was in line with our previous experiments using KIRA6-based probe 114. Note that some background labeling occurred even when irradiation was omitted (Fig. 3A, right lanes). This may be explained by an azide-alkyne-thiol reaction as recently reported by Wiest & Kielkowski29. A direct comparison of the three PAL probes in lysates of several different cell lines showed that the overall labeling pattern appeared to be similar, with some minor differences for specific probes and lysates (Fig. 3B). For example, in cell lysates of the A375 melanoma cell line, probe 9a gave labeling of an approximately 30 kDa protein that was not or only weakly labeled by the other probes. This was similar, but less apparent in MCF7 cells. In Ramos cells, probe 1 led to a higher intensity labeling of a protein of approximately 75 kDa. Nevertheless, there appears to be a substantial number of labeled proteins, which represent potential off-targets. Competitive protein profiling experiments using the different probes in competition with different inhibitors (Fig. 3C) showed that not only the parent compound, but also the other imidazopyrazine inhibitors lead to a reduction of labeling intensity, suggesting a (partial) overlap in their targets.

A Increasing labeling intensity by probes 9a and 9b, which is largely outcompeted by the parent compound. Non-competable signal in the SDS-PAGE may be due to non-specific diazirine photolabeling25. Click chemistry background (in absence of probe and irradiation) is detected in the rightmost gel lane. B Labeling by the three imidazopyrazine probes (10 μM) in lysates of three different cell lines. Red asterisks indicate some differentially labeled gel bands. C Competitive protein profiling (10 μM probe) in different cell lysates using the parent kinase inhibitors and the pan-kinase inhibitor staurosporine (at 10 × probe concentration). Coomassie stains of gels in Fig. S2–S4.
To identify the imidazopyrazine targets, we followed a chemical proteomics workflow as outlined in Fig. 4A. We performed these experiments in cell lysates to minimize the influence of different cell permeabilities of the three probes. Although lysis may influence enzyme activities or protein-protein interactions, which could affect probe labeling, the SDS-PAGE-based read-out had already shown that a wide variety of proteins were detected as targets. Hence, lysates of A431 cells were UV irradiated in the presence of 10 μM of the PAL probe, using DMSO (blank) and PAL probe with excess of parent inhibitor (competition) as controls, and samples were subjected to bioorthogonal click chemistry with a TAMRA-biotin-azide tag (Fig. S5 for quality control of replicates). After removal of the excess click reagents, labeled proteins were enriched on immobilized streptavidin. Next, samples were processed by tryptic digestion before they were analyzed by LC-MS/MS30 and quantified by label-free quantification (LFQ). Volcano plots of probe versus DMSO (Fig. S6) and of probe versus competition (Fig. 4B) revealed significant enrichment (2-fold or more) for various proteins (see Supplemental Data D1 for full lists). We applied stringent selection criteria for hits: proteins were only considered as targets (a) if they were significantly enriched versus the DMSO control and also showed at least a 2-fold reduction in the competition control, and (b) if they were identified with at least two unique peptides. This led to a final target list of 10 proteins for KIRA6 probe 1, 32 proteins for acalabrutinib probe 9a and 42 proteins for linsitinib probe 9b (Supplemental Data D1), with some overlap in the identified targets (Fig. 4C). The primary targets for acalabrutinib (BTK), linsitinib (IGF1R), and KIRA6 (IRE1α) were not identified, possibly because of very low expression in A431 cells (Fig. S7), although we cannot exclude reasons that relate to probe design. 
Introduction of the minimalist photoaffinity handle, for example, may interfere with binding to the primary target. Alternatively, upon irradiation the photoreactive group may fail to form a covalent bond with the target when not located close enough to a crosslinkable residue. Despite not finding the primary targets, various proteins were significantly enriched versus both control samples. In total, we identified 2 kinases for KIRA6 probe 1, 4 kinases for acalabrutinib probe 9a, and 1 kinase for linsitinib probe 9b (Fig. 4D), although less stringent criteria led to a slightly higher number of kinase hits (Table S1). Notably, kinase targets of probe 1 had been described before as off-targets of parent compound KIRA631. Moreover, the kinase targets of probes 9a and 9b have been reported as targets (or share 99% sequence similarity with targets) of acalabrutinib18,33 and linsitinib32, respectively. Given that the imidazopyrazine scaffold mimics the adenine moiety, we classified targets as kinases, nucleotide or nucleoside binding proteins, or others (lacking such binding sites). As a potential explanation for the targets in the last category, we reasoned that some of these proteins may interact with kinases and as such have co-enriched or undergone reaction with solvent-exposed diazirine. We therefore checked whether these identified targets are annotated as kinase interaction partners using the IntAct database (Fig. S8)34. We noted that between 45% (for 9b) and more than 90% (for 9a) of the kinase interacting partners were interactors of the kinases identified in the enrichment (Table S2). Altogether, over half of all identified proteins represent kinases, nucleotide/nucleoside binding proteins or kinase interaction partners (Fig. 4D). Interestingly, several kinases acting as interactors of the hits identified here have also been reported as off-targets of KIRA6, acalabrutinib or linsitinib18,31,32,33,35,36. We hypothesize that such interactions may have led to their enrichment in the proteomics workflow.

A Schematic workflow of the target identification: photoaffinity labeling with 10 μM probe was followed by click chemistry to introduce a biotin, enrichment on immobilized streptavidin, and on-bead digestion. Resulting peptides were analyzed by LC-MS/MS. All conditions were executed in triplicate from one batch of A431 lysate. B Volcano plots of significantly enriched proteins in the indicated probe sample versus the competition with the parent inhibitor. Two-fold enrichment and a p-value of 0.01 were taken as cut-offs to determine hits in the upper right quadrant. Note that black dots denote proteins that were not significantly enriched in probe versus DMSO (see Fig. S6 for volcano plots versus DMSO). Also note that the few negatively-enriched proteins (upper left section) may either be false positives or may have undergone enhanced labeling because of an increased probe availability by competition with the parent compound. C Venn diagram of the number of targets that were significantly enriched versus DMSO and versus parent inhibitor. D Classification of target proteins for each probe: these were classified as kinases, nucleotide or nucleoside binding, kinase interactors or other function. E Volume distribution calculated from the MD trajectories. The most frequent conformation of the KIRA6 scaffold has a bounding box volume of 628 ų, while the extreme conformations reach up to 998 ų. In comparison, the other scaffolds show smaller volume ranges, with a most frequent conformation volume of 219 ų and extreme conformation volumes up to 360 ų for the linsitinib scaffold, and a most frequent conformation volume of 351 ų and extreme conformation volumes up to 414 ų for the acalabrutinib scaffold (numerical source data in Supplemental Data D2). F Principal component analysis of the MD trajectories reveals the largest variation in conformations for acalabrutinib and the smallest variation for linsitinib (numerical source data in Supplemental Data D3).
KIRA6 probe 1 exhibited the highest selectivity compared with the other imidazopyrazine-based probes. To gain insight into how the molecular structure of the probes may explain their differences, we performed MD simulations37 (50 ns in water) of the imidazopyrazine scaffold including the large substituent at the 1-position, which interacts with the ATP-binding pocket, but omitting the solvent-exposed 3-substituent (Fig. S9). Calculation of the molecular volumes from the MD trajectories (see Movies S1–S3) reveals that KIRA6 exhibits the largest three-dimensional size of the three scaffolds (Fig. S10) and displays substantial conformational flexibility (Figs. 4E and S10). KIRA6 is reported as a type II kinase inhibitor that stabilizes the inactive IRE1α kinase conformation13. Type II inhibitors generally display larger substituents that protrude deep into the ATP binding cleft of the inactive conformation, a feature that was considered to improve selectivity across the kinome38. The larger 1-substituent of KIRA6 may also cause steric conflicts in protein families other than kinases, which could explain the overall lower number of identified targets for the KIRA6 probe 1.
A principal component analysis of the different conformations of the KIRA6, acalabrutinib, and linsitinib scaffolds revealed that linsitinib has a very rigid conformation, in contrast to KIRA6 and acalabrutinib, which occupy a larger conformational space (Fig. 4F; see also Movies S1–S3). To further explore how the acalabrutinib and linsitinib probes bind to their targets, we decided to perform further in silico analysis by flexible molecular docking. First, we used the primary targets BTK and IGF1R to verify that docking in the ATP binding pocket produces reasonable results. Indeed, good overlap of docked imidazopyrazine scaffolds with crystallized ligands was found (Fig. S11), whereas docking against a negative control protein did not give sensible results (Fig. S12). Next, we selected all nucleotide-binding protein targets identified by proteomics for which crystal structures were available in the Protein Data Bank and performed molecular docking using conformations extracted from the MD trajectories as starting points (see methods section in supporting information for details). We consistently found that the linsitinib scaffold displayed higher calculated affinity compared with the acalabrutinib scaffold (Fig. S13). These in silico results indicate that the compact structure of linsitinib in solution (Fig. 4E), along with the distinct orientation of its substituent (Fig. S10), may favor its binding to a higher number of off-targets that bind nucleotides or nucleosides. Additionally, the more rigid linsitinib scaffold with its very defined conformation may undergo a lower entropic penalty when going to the bound state, compared to the highly flexible acalabrutinib (Fig. 4F). Collectively, the present study highlights that the proteome selectivity of kinase inhibitors sharing the imidazopyrazine core is likely defined by the size, rigidity and the spatial arrangement of the substituents to be accommodated within a protein pocket.
Conclusion
In conclusion, we have synthesized and evaluated photoaffinity probes based on imidazopyrazine kinase inhibitors, functionalized with a minimalist diazirine alkyne linker. Using gel-based and chemical proteomics experiments, we showed that these probes display a substantial number of off-targets. The differences in selectivity may be explained by the size and flexibility of the substituent at the 1-position, and future analysis of a wider set of compounds may provide further evidence for this idea. Moreover, the utilized strategy may be more generally applied for future evaluation of the selectivity of other kinase inhibitor scaffolds.
Methods
Chemistry
Chemical syntheses and characterizations are provided as Supplemental Methods in the supporting information.
Cell culture
Cells were grown at 37 °C under a humidified 5% CO2 atmosphere. A375, A431, Ramos, and MCF-7 cells were grown in high-glucose DMEM medium (Sigma) in culture flasks. All media were supplemented with 10% FBS (VWR), 100 U/mL penicillin/streptomycin (Fisher Scientific), and 2 mM GlutaMax (Thermo Fisher), when not included in the medium.
For generating cell lysates for A375, A431, Ramos, and MCF-7 cells, cells were grown in T175 flasks up to ∼90% confluence. The growth medium was aspirated, and the cells were washed twice with DPBS (VWR), followed by the addition of fresh DPBS. The cells were then harvested by scraping. For generating cell lysates from Ramos cells, cells were grown in T175 flasks up to 2 × 10⁶ cells/mL. The cells were spun down to remove the growth medium and washed three times with DPBS. Cell pellets were lysed by adding lysis buffer containing 100 mM Hepes, 10% sucrose, 1% TritonX100 (pH 7.4), and EDTA-free protease inhibitor cocktail (Sigma). Cells were kept on ice for 30 min and briefly vortexed every 10 min. Then cells were passed through a 26 G needle 10 times and the cell debris was removed by centrifugation. The lysates were aliquoted and stored at −80 °C. Protein concentrations for each sample were determined using the BCA assay (Fisher Scientific).
Whole cell lysate labeling
Whole cell lysates (for all cell lines) were normalized to a concentration of 1 mg/mL in a volume of 30 μL using reaction buffer (50 mM HEPES, 150 mM NaCl, and 0.03% v/v Triton-X-100, pH 7.4). Samples were then treated with various concentrations of probe or DMSO, mixed by vortexing, and immediately irradiated for 6 min at RT. UV irradiation was performed with a UVP Blak-Ray B-100 Series High-Intensity UV Lamp (Fisher Scientific), by pre-warming the lamp for 4 min prior to photoaffinity labeling, followed by placing the samples approximately 4 cm under the lamp. For competition experiments, samples were pre-treated with 10 μM of the indicated probe and a 10-fold excess of inhibitors KIRA6 (Bio-Connect), acalabrutinib (MedChemExpress), linsitinib (MedChemExpress) or staurosporine (Thermo Fisher) and incubated at 37 °C for 1 h prior to irradiation. After irradiation, probes were clicked onto TAMRA-azide (Carl Roth). The click reaction was performed using the following conditions: 25 μM of tag-azide, 50 μM of THPTA (Sigma Aldrich), 1 mM of CuSO4 (freshly prepared), and 1 mM of sodium ascorbate (freshly prepared). The click reaction was incubated for 1 h at RT. The reaction was quenched by addition of 10 μL of 4 × SDS loading buffer. Samples were resolved by 10% SDS-PAGE and visualized using a Typhoon FLA 9500 fluorescence scanner. Images were processed and analyzed using ImageJ39. Following visualization, gels were stained with Coomassie using ROTI®Blue (Carl Roth).
On-bead digestion
Seven hundred fifty micrograms of total protein from A431 whole cell lysates were normalized to a concentration of 1 mg/mL in reaction buffer. All samples were prepared in triplicate. Samples were pre-treated with competitor (KIRA6, acalabrutinib, or linsitinib; 10-fold excess compared with probe) or DMSO and incubated at 37 °C for 1 h. Samples were then treated with 10 μM probe or DMSO, mixed by vortexing, and immediately irradiated for 6 min at RT. After irradiation, the probes were clicked onto TAMRA-azide-PEG-biotin as described above. The excess reagents from the sample were then removed by acetone precipitation (twice). Protein pellets were resuspended in 230 μL of 6 M urea and 50 mM ammonium bicarbonate, followed by reduction (30 min, 37 °C), alkylation (30 min, 37 °C, dark), and quenching (30 min, 37 °C, dark) in the presence of 10 mM DTT, 25 mM iodoacetamide and 25 mM DTT, respectively. Samples were diluted to 1 M urea using 0.1% SDS in 50 mM ammonium bicarbonate and incubated with 20 μL of pre-washed streptavidin beads (ThermoFisher) for 1 h with mixing at RT. The supernatant was removed and beads were sequentially washed with 0.33% SDS in PBS (2 × 1 mL), 1 M NaCl in PBS (2 × 1 mL), and 50 mM ammonium bicarbonate (2 × 1 mL). Beads were resuspended in 50 μL on-bead digestion buffer (50 mM ammonium bicarbonate, 0.01% ProteaseMax (Promega), and 5% (vol/vol) acetonitrile) in the presence of 0.5 µg trypsin (Thermo Scientific). After overnight digestion, the supernatant was brought to 0.75% TFA and loaded on C18 spin columns (Thermo Scientific) for desalting.
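The dilution from 6 M to 1 M urea follows from C1·V1 = C2·V2. The following is a minimal sketch of that arithmetic, assuming the 230 μL resuspension volume is the full starting volume (the small volumes added with DTT and iodoacetamide are ignored here); the function name diluent_volume is hypothetical and not part of the original protocol.

```python
def diluent_volume(stock_conc, stock_volume, target_conc):
    """Volume of diluent to add so that stock_conc drops to target_conc.

    Uses C1 * V1 = C2 * V2; concentrations share one unit, volumes another.
    """
    if target_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target_conc must be positive and not exceed stock_conc")
    final_volume = stock_conc * stock_volume / target_conc
    return final_volume - stock_volume


# Assumption: 230 uL of 6 M urea, diluted to 1 M with 0.1% SDS in
# 50 mM ammonium bicarbonate (reagent additions neglected).
urea_diluent_ul = diluent_volume(6.0, 230.0, 1.0)
print(urea_diluent_ul)  # 1150.0
```

Under this assumption, about 1150 μL of diluent brings the sample to a final volume of 1380 μL before bead incubation.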
Mass spectrometry and data analysis
The resulting peptide mixture was analyzed by high-resolution LC-MS/MS using an Ultimate 3000 Nano Ultra High-Pressure Chromatography (UPLC) system interfaced with a Q Exactive Hybrid Quadrupole-Orbitrap mass spectrometer via an EASY-spray (C-18, 15 cm) column (Thermo Fisher Scientific). Peptides were identified by MASCOT 2.2.07 (Matrix Science; www.matrixscience.com) using the Homo sapiens database (204052 entries, 21/03/2024), adopting the following MASCOT search parameters: trypsin/P digestion, 2 missed cleavages allowed, carbamidomethyl (C) as fixed modification, oxidation (M) as variable modification. Peptide tolerance was set at 10 ppm for MS and at 20 mmu for MS/MS. Mascot XML output files were further analyzed using Progenesis QI for proteomics (Waters). Only MS/MS spectra with rank below 8 and PSMs with a score of at least 29 were taken into account. Relative quantification using Hi3 peptides was adopted. Normalization of quantified protein abundances was executed according to the “Normalize to all proteins” default normalization method of the Progenesis software. All quantified keratins and albumin were removed. Probe-enriched targets were defined as proteins with a minimum of 2 unique peptides and with an LFQ ratio >2 between probe and DMSO as well as between probe and competition sample, and with a p-value < 0.01 by 2-sided Student’s t-test. Statistical analysis was executed by Qlucore software using log2 transformed protein abundances. The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE40 partner repository with the dataset identifier PXD053278.
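The hit criteria above can be illustrated with a short Python sketch. It is not the Qlucore workflow used in this study; it assumes that the LFQ ratio is taken as the difference of mean log2 abundances and that the p-value threshold applies to both comparisons. The function name is_hit and the abundance values are hypothetical.

```python
import numpy as np
from scipy import stats


def is_hit(probe, dmso, competition, unique_peptides,
           min_ratio=2.0, alpha=0.01, min_peptides=2):
    """Apply the enrichment criteria to one protein (triplicate abundances)."""
    if unique_peptides < min_peptides:
        return False
    # Statistics are computed on log2-transformed abundances.
    probe, dmso, competition = (
        np.log2(np.asarray(x, dtype=float)) for x in (probe, dmso, competition)
    )
    for control in (dmso, competition):
        log2_ratio = probe.mean() - control.mean()
        # Two-sided Student's t-test (equal variances assumed).
        p_value = stats.ttest_ind(probe, control).pvalue
        if log2_ratio <= np.log2(min_ratio) or p_value >= alpha:
            return False
    return True


# Hypothetical abundances, for illustration only.
enriched = is_hit([8000, 8200, 7900], [1000, 1100, 950], [2000, 2100, 1900], 3)
not_competed = is_hit([8000, 8200, 7900], [1000, 1100, 950], [7800, 8100, 8000], 3)
print(enriched, not_competed)  # True False
```

A protein that is enriched over DMSO but not reduced by the parent inhibitor is rejected, which mirrors the requirement that both comparisons pass.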
Computational analysis
All automated analyses were performed using custom Jupyter Notebooks41 on the Google Colab framework42. The analysis encompassed several types of data processing and statistical evaluations, detailed as follows:
Data Processing: preprocessing of raw data, including normalization and transformation, was carried out using Python libraries such as NumPy and Pandas.
Visualization: data visualization (i.e., plotting) was achieved through the use of Matplotlib and Seaborn.
Molecular dynamics analysis: extraction of conformations from the molecular dynamics (MD) trajectory was performed using the mdtraj library. Conformation clustering was conducted using the KMeans algorithm from the sklearn.cluster module.
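As an illustration of the clustering step, the sketch below groups trajectory frames with KMeans and picks the frame closest to each cluster centre. It is a minimal example under stated assumptions, not the notebook used in this study: the structural features (all pairwise atom distances), the number of clusters, and the synthetic coordinates are assumptions. The input array has the shape (n_frames, n_atoms, 3), as provided by the xyz attribute of an mdtraj trajectory.

```python
import numpy as np
from sklearn.cluster import KMeans


def pairwise_distance_features(xyz):
    """Per-frame vector of all atom-atom distances (rotation invariant)."""
    i, j = np.triu_indices(xyz.shape[1], k=1)
    return np.linalg.norm(xyz[:, i, :] - xyz[:, j, :], axis=-1)


def representative_frames(xyz, n_clusters=3, seed=0):
    """Cluster frames and return labels plus one frame index per cluster."""
    features = pairwise_distance_features(xyz)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(features)
    reps = []
    for c in range(n_clusters):
        members = np.flatnonzero(km.labels_ == c)
        dist = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        reps.append(int(members[np.argmin(dist)]))
    return km.labels_, reps


# Synthetic stand-in for a trajectory: three conformers, 20 noisy frames each.
rng = np.random.default_rng(0)
conformers = rng.normal(size=(3, 5, 3))
xyz = np.concatenate([c + rng.normal(scale=0.01, size=(20, 5, 3)) for c in conformers])
labels, reps = representative_frames(xyz)
print(reps)
```

Pairwise distances are used here so that overall rotation or translation of the molecule does not affect the clustering; other feature choices (for example dihedral angles) would be equally reasonable.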
In silico studies
Molecular dynamics simulations
Simulations were performed on a cloud-based (Google Colab) Jupyter notebook (MD_Small_Molecules.ipynb) developed previously37. Briefly, small molecules were prepared using SMILES as input and energy-minimized with TorchANI. The molecular topology was generated using the General Amber Force Field 2 (GAFF2) within the AMBER software suite. The simulation box was prepared by solvating the system with the TIP3P water model and adding 0.15 M NaCl to mimic physiological conditions. The simulations were performed using OpenMM. Conformations used for docking were extracted from the MD trajectories based on clustering using the KMeans algorithm (extracting structural features from each frame and grouping them into clusters) and energy ranking (i.e., most energetically favorable conformations). The initial and final atom coordinates before and after MD simulation, respectively, are supplied as Supplemental Data D4.
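The bounding box volumes derived from the MD trajectories can be sketched as follows. This is a minimal example that assumes an axis-aligned box around the atom centres, without padding for atomic radii and without prior alignment of the molecule; the exact definition used for the reported volumes may differ. Function names and coordinates are hypothetical.

```python
import numpy as np


def bounding_box_volume(coords):
    """Axis-aligned bounding box volume for one conformation (n_atoms, 3)."""
    coords = np.asarray(coords, dtype=float)
    extents = coords.max(axis=0) - coords.min(axis=0)
    return float(np.prod(extents))


def trajectory_volumes(xyz):
    """Bounding box volume per frame for an array of shape (n_frames, n_atoms, 3)."""
    xyz = np.asarray(xyz, dtype=float)
    extents = xyz.max(axis=1) - xyz.min(axis=1)
    return extents.prod(axis=1)


# Hypothetical coordinates in angstrom, for illustration only.
coords = np.array([[0, 0, 0], [2, 0, 0], [0, 3, 0], [0, 0, 4]], dtype=float)
print(bounding_box_volume(coords))                          # 24.0
print(trajectory_volumes(np.stack([coords, 2 * coords])))   # [ 24. 192.]
```

Note that mdtraj stores coordinates in nanometres, so volumes computed directly from its coordinate array would need to be multiplied by 1000 to obtain ų.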
Molecular docking
The following crystal structures of human proteins were used for docking: PDB 6GZD for CSNK1A1, PDB 4HNF for CSNK1D, PDB 5TDH for GNAI1, PDB 2P0W for HAT1, PDB 4DRZ for RAB14, PDB 6JJU for ATP2A2, PDB 4G5O for GNAI3, PDB 6VFZ for IDH2, PDB 6LPF for LARS, and PDB 6JTG for OPA1. Protein structures for docking were prepared using AutoDockTools 1.5.643. Co-crystallized ligands were used to define the grid box. Acalabrutinib and linsitinib structures in PDB format (extracted from the MD trajectory) were processed using the prepare_ligand4.py script from AutoDockTools44 to generate PDBQT files. This preparation included adding flexible torsions to the ligands. Flexible docking simulations were performed using AutoDock Vina 1.2.5 for all proteins and ligand conformations. The results were visualized using the PyMOL molecular viewer, and data analysis was performed in a custom Jupyter Notebook on the Google Colab framework. Error bars in the plot showing binding affinities indicate variability in the predicted affinity values, specifically represented by whiskers in the boxplot. Significance tests were performed to compare the binding affinities between acalabrutinib and linsitinib for each protein, using Welch’s t-test to accommodate any unequal variance. Additionally, Mann–Whitney U tests were used to evaluate differences across protein pairs for each molecule to confirm statistical significance of binding preferences. The statistical significance of comparisons is indicated by p-values and marked with stars. p ≤ 0.05: * (significant), p ≤ 0.01: ** (highly significant), p ≤ 0.001: *** (very highly significant), p ≤ 0.0001: **** (extremely significant).
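The significance testing and star annotation described above can be sketched with SciPy. This is an illustration rather than the notebook used in this study: the affinity values are hypothetical, and the "ns" label for non-significant comparisons is an assumption, since only the star thresholds are specified.

```python
from scipy import stats


def p_to_stars(p):
    """Map a p-value to the star notation used for the docking comparisons."""
    for threshold, stars in ((1e-4, "****"), (1e-3, "***"), (1e-2, "**"), (5e-2, "*")):
        if p <= threshold:
            return stars
    return "ns"  # assumption: label for p > 0.05


# Hypothetical predicted affinities (kcal/mol) for one protein.
acalabrutinib = [-8.1, -8.3, -7.9, -8.0, -8.2]
linsitinib = [-9.4, -9.6, -9.2, -9.5, -9.3]

# Welch's t-test: does not assume equal variances.
welch_p = stats.ttest_ind(acalabrutinib, linsitinib, equal_var=False).pvalue
# Mann-Whitney U test: non-parametric, two-sided.
mwu_p = stats.mannwhitneyu(acalabrutinib, linsitinib, alternative="two-sided").pvalue

print(p_to_stars(welch_p), p_to_stars(mwu_p))
```

With only a handful of conformations per ligand, the rank-based Mann–Whitney test cannot reach very small p-values, so the two tests can yield different star levels for the same data.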
Reporting summary
Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.