Journal of Chemical Information and Modeling

Journal name: Journal of Chemical Information and Modeling

ISSN: 1549-9596

Official website: http://pubs.acs.org/journal/jcisd8

Publisher: American Chemical Society (ACS)

Publication frequency: Bimonthly

Impact factor: 6.162

First published: 2005

Articles per year: 227

Open access: No

Pharmaceutical Cocrystal Discovery via 3D-SMINBR: A New Network Recommendation Tool Augmented by 3D Molecular Conformations

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-03, DOI: 10.1021/acs.jcim.3c00066

Lulu Zheng, Bin Zhu, Zengrui Wu, Mei Guo, Jinyao Chen, Minghuang Hong, Guixia Liu, Weihua Li, Guobin Ren, Yun Tang

Cocrystals have significant potential in various fields such as chemistry, materials, and medicine. For instance, pharmaceutical cocrystals can address issues associated with physicochemical and biopharmaceutical properties. However, it can be challenging to find proper coformers to form cocrystals with drugs of interest. Herein, a new in silico tool called 3D substructure–molecular-interaction network-based recommendation (3D-SMINBR) has been developed to address this problem. This tool first integrated 3D molecular conformations with a weighted network-based recommendation model to prioritize potential coformers for target drugs. In cross-validation, the performance of 3D-SMINBR surpassed that of the 2D substructure-based predictive model SMINBR from our previous study. Additionally, the generalization capability of 3D-SMINBR was confirmed by testing on unseen cocrystal data. The practicality of this tool was further demonstrated by case studies on cocrystal screening of armillarisin A (Arm) and isoimperatorin (iIM). The obtained Arm–piperazine and iIM–salicylamide cocrystals present improved solubility and dissolution rate compared to their parent drugs. Overall, 3D-SMINBR augmented by 3D molecular conformations would be a useful network-based tool for cocrystal discovery. A web server for 3D-SMINBR can be freely accessed at http://lmmd.ecust.edu.cn/netcorecsys/.
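The abstract does not spell out the recommendation model, so the following is only a minimal sketch of generic network-based inference (two-step resource diffusion on a bipartite network), the family of methods that such tools build on. It is unweighted and ignores substructures and 3D conformations. The function name `nbi_scores` and the toy drug and coformer labels are hypothetical.

```python
from collections import defaultdict

def nbi_scores(pairs, target):
    """Rank candidate coformers for `target` by two-step resource diffusion.

    `pairs` is an iterable of (drug, coformer) known cocrystal pairs.
    This is plain, unweighted network-based inference, not 3D-SMINBR itself.
    """
    drug_to_cof = defaultdict(set)
    cof_to_drug = defaultdict(set)
    for drug, cof in pairs:
        drug_to_cof[drug].add(cof)
        cof_to_drug[cof].add(drug)

    # Initial resource: one unit on each known coformer of the target drug.
    resource_cof = {c: 1.0 for c in drug_to_cof[target]}

    # Step 1: each coformer splits its resource equally among its drugs.
    resource_drug = defaultdict(float)
    for c, r in resource_cof.items():
        for d in cof_to_drug[c]:
            resource_drug[d] += r / len(cof_to_drug[c])

    # Step 2: each drug splits its resource equally among its coformers.
    final = defaultdict(float)
    for d, r in resource_drug.items():
        for c in drug_to_cof[d]:
            final[c] += r / len(drug_to_cof[d])

    # Keep only coformers not already known for the target, best first.
    known = drug_to_cof[target]
    return sorted(((c, s) for c, s in final.items() if c not in known),
                  key=lambda cs: (-cs[1], cs[0]))

# Hypothetical toy network of known drug-coformer pairs.
toy_pairs = [("drugA", "c1"), ("drugA", "c2"), ("drugB", "c1"),
             ("drugB", "c3"), ("drugC", "c2"), ("drugC", "c3"),
             ("drugC", "c4")]
ranking = nbi_scores(toy_pairs, "drugA")
```

In this toy network, coformers that share more drugs with the target's known coformers receive more diffused resource and therefore rank higher.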

Machine Learning-Based Prediction of Drug-Induced Hepatotoxicity: An OvA-QSTR Approach

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-26, DOI: 10.1021/acs.jcim.3c00687

Feyza Kelleci Çelik, Gül Karaduman

Drug-induced hepatotoxicity, also known as drug-induced liver injury (DILI), is among the possible adverse effects of pharmacotherapy. This clinical condition is accepted as one of the factors leading to patient mortality and morbidity. The LiverTox database was built by the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) to predict potential liver damage from medications and take appropriate precautions. The database has classified medicines into seven risk categories (A, B, C, D, E, E*, and X) to avoid medicine-induced liver toxicity. The hepatic damage risk decreases from group A to group E. This study did not include the E* and X classes because they contained unverified and unknown data groups. Our study aims to predict potential liver damage of new drug molecules without using experimental animals. Using our one-vs-all quantitative structure–toxicity relationship (OvA-QSTR) model, we predict which LiverTox risk category a drug with unknown liver toxicity potential will fall into. Our dataset, consisting of 678 organic drug molecules from different pharmacological classes, was collected from LiverTox. The OvA-QSTR models implemented with a Bayesian network (BayesNet) performed well based on the selected descriptors, with precision–recall curve (PRC) areas ranging from 0.718 to 0.869. Our OvA-QSTR models provide a reliable premarketing risk evaluation of pharmaceutical-induced liver damage potential and offer predictions for different risk levels in DILI.
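The one-vs-all scheme can be illustrated with a small sketch: one binary scorer is trained per risk category ("this class" versus "all others"), and a new molecule is assigned to the category whose scorer responds most strongly. The study used BayesNet classifiers; the nearest-centroid scorer below is a stand-in chosen only to keep the example self-contained, and the function names and the two-dimensional toy descriptors are hypothetical.

```python
def centroid(rows):
    """Component-wise mean of a list of equal-length descriptor vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def train_one_vs_all(X, y):
    """Build one binary model per class: (centroid of class, centroid of rest)."""
    models = {}
    for cls in sorted(set(y)):
        pos = [x for x, label in zip(X, y) if label == cls]
        neg = [x for x, label in zip(X, y) if label != cls]
        models[cls] = (centroid(pos), centroid(neg))
    return models

def predict(models, x):
    """Return the class whose binary scorer gives the highest score.

    Score = how much closer x lies to the class than to everything else.
    """
    scores = {cls: sq_dist(x, neg) - sq_dist(x, pos)
              for cls, (pos, neg) in models.items()}
    return max(scores, key=scores.get)

# Hypothetical toy descriptors for three risk categories.
X = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
y = ["A", "A", "B", "B", "C", "C"]
models = train_one_vs_all(X, y)
```

A per-class precision–recall analysis, as reported in the abstract, would then be run on each binary scorer separately.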

Transformer-Based Molecular Generative Model for Antiviral Drug Design

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-06-27, DOI: 10.1021/acs.jcim.3c00536

Jiashun Mao, Jianmin Wang, Amir Zeb, Kwang-Hwi Cho, Haiyan Jin, Jongwan Kim, Onju Lee, Yunyun Wang, Kyoung Tai No

The Simplified Molecular Input Line Entry System (SMILES) is oriented toward the atomic-level representation of molecules and is neither easy for humans to read nor easy to edit. IUPAC nomenclature, by contrast, is the closest to natural language and is well suited to human reading and molecular editing. We can therefore manipulate IUPAC names to generate new molecules and then produce the corresponding programming-friendly SMILES forms. In addition, for antiviral drug design, especially analogue-based drug design, it is more appropriate to edit and design directly at the functional group level of IUPAC than at the atomic level of SMILES, since designing analogues involves altering the R group only, which is closer to the knowledge-based molecular design of a chemist. Herein, we present a novel data-driven self-supervised pretraining generative model called “TransAntivirus” to make select-and-replace edits and convert organic molecules toward the desired properties for the design of antiviral candidate analogues. The results indicated that TransAntivirus is significantly superior to the control models in terms of novelty, validity, uniqueness, and diversity. TransAntivirus showed excellent performance in the design and optimization of nucleoside and non-nucleoside analogues according to chemical space analysis and property prediction analysis. Furthermore, to validate the applicability of TransAntivirus in the design of antiviral drugs, we conducted two case studies on the design of nucleoside analogues and non-nucleoside analogues and screened four candidate lead compounds against coronavirus disease (COVID-19). Finally, we recommend this framework for accelerating antiviral drug discovery.

Understanding the Excited-State Relaxation Mechanisms of Xanthophyll Lutein by Multi-configurational Electronic Structure Calculations

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-25, DOI: 10.1021/acs.jcim.3c00640

Bo-Wen Yin, Jie-Lei Wang, Pu-Jie Xue, Teng-Shuo Zhang, Bin-Bin Xie, Lin Shen, Wei-Hai Fang

The contradictory behaviors in light harvesting and non-photochemical quenching make xanthophyll lutein the most attractive functional molecule in photosynthesis. Despite several theoretical simulations of the spectral properties and excited-state dynamics, the atomic-level photophysical mechanisms need to be further studied and established, especially for an accurate description of the geometric and electronic structures of conical intersections for the lowest several electronic states of lutein. In the present work, semiempirical OM2/MRCI and multi-configurational restricted active space self-consistent field methods were used to optimize the minima and conical intersections in and between the 1Ag–, 2Ag–, 1Bu+, and 1Bu– states. Meanwhile, the relative energies were refined by MS-CASPT2(10,8)/6-31G*, which can reproduce the electronic state properties observed in spectroscopic experiments. Based on these calculation results, we proposed a possible excited-state relaxation mechanism for lutein from its initially populated 1Bu+ state. Once excited to the optically bright 1Bu+ state, the system will propagate along the key reaction coordinate, i.e., the stretching vibration of the conjugated carbon chain. During this period, the 1Bu– state will participate and form a resonance state between the 1Bu– and 1Bu+ states. Later, the system will rapidly hop to the 2Ag– state via the 1Bu+/2Ag– conical intersection. Finally, the lutein molecule will survive in the 2Ag– state for a relatively long time before it internally converts to the ground state directly or via a twisted S1/S0 conical intersection. Notably, though the photophysical picture may be very different in solvents and proteins, the current theoretical study proposed a promising calculation protocol and also provided many valuable mechanistic insights for lutein and similar carotenoids.

Molecular Dynamics Simulations Elucidate the Molecular Basis of Pre-mRNA Translocation by the Prp2 Spliceosomal Helicase

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-06-28, DOI: 10.1021/acs.jcim.3c00585

Sefora Naomi Agrò, Riccardo Rozza, Santiago Movilla, Jana Aupič, Alessandra Magistrato

The spliceosome machinery catalyzes precursor-messenger RNA (pre-mRNA) splicing by undergoing at each splicing cycle assembly, activation, catalysis, and disassembly processes, thanks to the concerted action of specific RNA-dependent ATPases/helicases. Prp2, a member of the DExH-box ATPase/helicase family, harnesses the energy of ATP hydrolysis to translocate a single pre-mRNA strand in the 5′ to 3′ direction, thus promoting spliceosome remodeling to its catalytic-competent state. Here, we established the functional coupling between ATPase and helicase activities of Prp2. Namely, extensive multi-μs molecular dynamics simulations allowed us to unlock how, after pre-mRNA selection, ATP binding, hydrolysis, and dissociation induce a functional typewriter-like rotation of the Prp2 C-terminal domain. This movement, endorsed by an iterative swing of interactions established between specific Prp2 residues with the nucleobases at 5′- and 3′-ends of pre-mRNA, promotes pre-mRNA translocation. Notably, some of these Prp2 residues are conserved in the DExH-box family, suggesting that the translocation mechanism elucidated here may be applicable to all DExH-box helicases.

AlvaBuilder: A Software for De Novo Molecular Design

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-03, DOI: 10.1021/acs.jcim.3c00610

Andrea Mauri, Matteo Bertola

AlvaBuilder is a software tool for de novo molecular design and can be used to generate novel molecules having desirable characteristics. Such characteristics can be defined using a simple step-by-step graphical interface, and they can be based on molecular descriptors, on predictions of QSAR/QSPR models, and on matching molecular fragments, or they can be used to design compounds similar to a given one. The molecules generated are always syntactically valid since they are composed by combining fragments of molecules taken from a training data set chosen by the user. In this paper, we demonstrate how the software can be used to design new compounds for a defined case study. AlvaBuilder is available at http://www.alvascience.com/alvabuilder/.

charmm2gmx: An Automated Method to Port the CHARMM Additive Force Field to GROMACS

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-03, DOI: 10.1021/acs.jcim.3c00860

András F. Wacha, Justin A. Lemkul

CHARMM is one of the most widely used biomolecular force fields. Although developed in close connection with a dedicated molecular simulation engine of the same name, it is also usable with other codes. GROMACS is a well-established, highly optimized, and multipurpose software for molecular dynamics, versatile enough to accommodate many different force field potential functions and the associated algorithms. Due to conceptual differences related to software design and the large amount of numeric data inherent to residue topologies and parameter sets, conversion from one software format to another is not straightforward. Here, we present an automated and validated means to port the CHARMM force field to a format read by the GROMACS engine, harmonizing the different capabilities of the two codes in a self-documenting and reproducible way with a bare minimum of user interaction required. Being based entirely on the upstream data files, the presented approach does not involve any hard-coded data, in contrast with previous attempts to solve the same problem. The heuristic approach used for perceiving the local internal geometry is directly applicable to analogous transformations of other force fields.

Transfer Learning of Full Molecular Weight Distributions via High-Throughput Computer-Controlled Polymerization

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-11, DOI: 10.1021/acs.jcim.3c00504

Jin Da Tan, Balamurugan Ramalingam, Swee Liang Wong, Jayce Jian Wei Cheng, Yee-Fun Lim, Vijila Chellappan, Saif A. Khan, Jatin Kumar, Kedar Hippalgaonkar

The skew and shape of the molecular weight distribution (MWD) of polymers have a significant impact on polymer physical properties. Standard summary metrics statistically derived from the MWD provide only an incomplete picture of the polymer MWD. Machine learning (ML) methods coupled with high-throughput experimentation (HTE) could potentially allow for the prediction of the entire polymer MWD without information loss. In our work, we demonstrate a computer-controlled HTE platform that is able to run up to 8 unique variable conditions in parallel for the free radical polymerization of styrene. The segmented-flow HTE system was equipped with an inline Raman spectrometer and offline size exclusion chromatography (SEC) to obtain time-dependent conversion and MWD, respectively. Using ML forward models, we first predict monomer conversion, intrinsically learning the polymerization kinetics that change with each experimental condition. In addition, we predict entire MWDs, including the skew and shape, and use SHAP analysis to interpret the dependence on reagent concentrations and reaction time. We then used a transfer learning approach to apply the data from our high-throughput flow reactor to the prediction of batch polymerization MWDs with only three additional data points. Overall, we demonstrate that the combination of HTE and ML provides a high level of predictive accuracy in determining polymerization outcomes. Transfer learning can allow efficient exploration outside existing parameter spaces, providing polymer chemists with the ability to target the synthesis of polymers with desired properties.
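To make the point about summary metrics concrete, the sketch below computes the standard number-average molecular weight Mn = Σ(N·M)/Σ(N), the weight-average Mw = Σ(N·M²)/Σ(N·M), and the dispersity Mw/Mn for two toy distributions. The distributions are hypothetical and were chosen so that a symmetric and a strongly skewed MWD share identical summary values. The function name `mwd_summary` is an assumption for illustration.

```python
def mwd_summary(masses, counts):
    """Return (Mn, Mw, dispersity) for a discrete chain-count distribution.

    masses: molar mass of each chain population (e.g., in kg/mol)
    counts: number of chains in each population
    """
    total_n = sum(counts)
    total_nm = sum(n * m for m, n in zip(masses, counts))
    total_nm2 = sum(n * m * m for m, n in zip(masses, counts))
    mn = total_nm / total_n      # number-average molecular weight
    mw = total_nm2 / total_nm    # weight-average molecular weight
    return mn, mw, mw / mn

# Symmetric toy distribution: equal numbers of short and long chains.
symmetric = mwd_summary([10, 30], [1, 1])
# Skewed toy distribution: many short chains plus a high-mass tail.
skewed = mwd_summary([15, 40], [4, 1])
```

Both toy distributions give the same Mn, Mw, and dispersity even though their shapes differ, which is the kind of information loss that predicting the full MWD avoids.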

Quantum Descriptors for Predicting and Understanding the Structure–Activity Relationships of Michael Acceptor Warheads

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-18, DOI: 10.1021/acs.jcim.3c00720

Ruibin Liu, Erik A. Vázquez-Montelongo, Shuhua Ma, Jana Shen

Predictive modeling and understanding of chemical warhead reactivities have the potential to accelerate targeted covalent drug discovery. Recently, the carbanion formation free energies as well as other ground-state electronic properties from density functional theory (DFT) calculations have been proposed as predictors of glutathione reactivities of Michael acceptors; however, no clear consensus exists. By profiling the thiol-Michael reactions of a diverse set of singly- and doubly-activated olefins, including several model warheads related to afatinib, here we reexamined the question of whether low-cost electronic properties can be used as predictors of reaction barriers. The electronic properties related to the carbanion intermediate were found to be strong predictors, e.g., the change in the Cβ charge accompanying carbanion formation. The least expensive reactant-only properties, namely the electrophilicity index and the Cβ charge, also show strong rank correlations, suggesting their utility as quantum descriptors. A second objective of the work is to clarify the effect of the β-dimethylaminomethyl (DMAM) substitution, which is incorporated in the warheads of several FDA-approved covalent drugs. Our data suggest that the β-DMAM substitution is cationic at neutral pH in solution and promotes acrylamide’s intrinsic reactivity by enhancing the charge accumulation at Cα upon carbanion formation. In contrast, the inductive effect of the β-trimethylaminomethyl substitution is diminished due to steric hindrance. Together, these results reconcile the current views of the intrinsic reactivities of acrylamides and contribute to large-scale predictive modeling and an understanding of the structure–activity relationships of Michael acceptors for rational targeted covalent inhibitor (TCI) design.

Identification of Robust Antibiotic Subgroups by Integrating Multi-Species Drug–Drug Interactions

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-17, DOI: 10.1021/acs.jcim.3c00937

Ji Lv, Guixia Liu, Yuan Ju, Houhou Huang, Dalin Li, Ying Sun

Previous studies have shown that antibiotics can be divided into groups and that drug–drug interactions (DDI) depend on these groups. However, these studies focused on a specific bacterial strain (i.e., Escherichia coli BW25113). Existing datasets often contain noise, and noisy labeled data may degrade the clustering results. To address this problem, we developed a multi-source information fusion method for integrating DDI information from multiple bacterial strains. Specifically, we calculated drug similarities based on the DDI network of each bacterial strain and then fused these drug similarity matrices to obtain a new fused similarity matrix. Combining the fused similarity matrix with the t-distributed stochastic neighbor embedding (t-SNE) algorithm and a hierarchical clustering algorithm effectively identifies antibiotic subgroups. These antibiotic subgroups are strongly correlated with known antibiotic classifications, and group–group interactions are almost monochromatic. In summary, our method provides a promising framework for understanding the mechanism of action of antibiotics and exploring multi-species group–group interactions.
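The pipeline described above (per-strain similarity, fusion, clustering) can be sketched as follows. The abstract does not specify the similarity measure or the fusion rule, so this sketch assumes cosine similarity between interaction profiles and a plain element-wise mean, and it omits the t-SNE step entirely. All function names and the toy interaction matrices (+1 synergy, -1 antagonism, 0 no interaction) are hypothetical.

```python
import math

def cosine_similarity_matrix(ddi):
    """Cosine similarity between the rows (interaction profiles) of a DDI matrix."""
    n = len(ddi)
    norms = [math.sqrt(sum(v * v for v in row)) for row in ddi]
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if norms[i] and norms[j]:
                dot = sum(a * b for a, b in zip(ddi[i], ddi[j]))
                sim[i][j] = dot / (norms[i] * norms[j])
    return sim

def fuse(similarities):
    """Element-wise mean of per-strain similarity matrices (assumed fusion rule)."""
    n = len(similarities[0])
    k = len(similarities)
    return [[sum(s[i][j] for s in similarities) / k for j in range(n)]
            for i in range(n)]

def average_linkage(sim, n_clusters):
    """Agglomerative clustering: repeatedly merge the most similar cluster pair."""
    clusters = [[i] for i in range(len(sim))]

    def link(a, b):
        return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        a, b = max(((x, y) for x in range(len(clusters))
                    for y in range(x + 1, len(clusters))),
                   key=lambda p: link(clusters[p[0]], clusters[p[1]]))
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return sorted(sorted(c) for c in clusters)

# Two toy strains; the second has one missing (noisy) interaction.
strain_1 = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0], [1, 1, 0, 0]]
strain_2 = [[0, 0, -1, -1], [0, 0, -1, 0], [-1, -1, 0, 0], [-1, 0, 0, 0]]
fused = fuse([cosine_similarity_matrix(strain_1),
              cosine_similarity_matrix(strain_2)])
groups = average_linkage(fused, n_clusters=2)
```

Averaging across strains dampens the effect of the noisy entry in the second strain, so drugs 0 and 1 still group together, as do drugs 2 and 3.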

Water Network-Augmented Two-State Model for Protein–Ligand Binding Affinity Prediction

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-11, DOI: 10.1021/acs.jcim.3c00567

Xiaoyang Qu, Lina Dong, Ding Luo, Yubing Si, Binju Wang

Water network rearrangement from the ligand-unbound state to the ligand-bound state is known to have significant effects on the protein–ligand binding interactions, but most of the current machine learning-based scoring functions overlook these effects. In this study, we endeavor to construct a comprehensive and realistic deep learning model by incorporating water network information into both ligand-unbound and -bound states. In particular, extended connectivity interaction features were integrated into graph representation, and graph transformer operator was employed to extract features of the ligand-unbound and -bound states. Through these efforts, we developed a water network-augmented two-state model called ECIFGraph::HM-Holo-Apo. Our new model exhibits satisfactory performance in terms of scoring, ranking, docking, screening, and reverse screening power tests on the CASF-2016 benchmark. In addition, it can achieve superior performance in large-scale docking-based virtual screening tests on the DEKOIS2.0 data set. Our study highlights that the use of a water network-augmented two-state model can be an effective strategy to bolster the robustness and applicability of machine learning-based scoring functions, particularly for targets with hydrophilic or solvent-exposed binding pockets.

A Conserved Local Structural Motif Controls the Kinetics of PTP1B Catalysis

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-06-28, DOI: 10.1021/acs.jcim.3c00286

Christine Y. Yeh, Jesus A. Izaguirre, Jack B. Greisman, Lindsay Willmore, Paul Maragakis, David E. Shaw

Protein tyrosine phosphatase 1B (PTP1B) is a negative regulator of the insulin and leptin signaling pathways, making it a highly attractive target for the treatment of type II diabetes. For PTP1B to perform its enzymatic function, a loop referred to as the “WPD loop” must transition between open (catalytically incompetent) and closed (catalytically competent) conformations, which have both been resolved by X-ray crystallography. Although prior studies have established this transition as the rate-limiting step for catalysis, the transition mechanism for PTP1B and other PTPs has been unclear. Here we present an atomically detailed model of WPD loop transitions in PTP1B based on unbiased, long-timescale molecular dynamics simulations and weighted ensemble simulations. We found that a specific WPD loop region, the PDFG motif, acted as the key conformational switch, with structural changes to the motif being necessary and sufficient for transitions between long-lived open and closed states of the loop. Simulations starting from the closed state repeatedly visited open states of the loop that quickly closed again unless the infrequent conformational switching of the motif stabilized the open state. The functional importance of the PDFG motif is supported by the fact that it is well conserved across PTPs. Bioinformatic analysis shows that the PDFG motif is also conserved, and adopts two distinct conformations, in deiminases, and the related DFG motif is known to function as a conformational switch in many kinases, suggesting that PDFG-like motifs may control transitions between structurally distinct, long-lived conformational states in multiple protein families.

Linear-Scaling Kernels for Protein Sequences and Small Molecules Outperform Deep Learning While Providing Uncertainty Quantitation and Improved Interpretability

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-27, DOI: 10.1021/acs.jcim.3c00601

Jonathan Parkinson, Wei Wang

Gaussian processes (GPs) are Bayesian models that provide several advantages for regression tasks in machine learning, such as reliable quantitation of uncertainty and improved interpretability. Their adoption has been precluded by their excessive computational cost and by the difficulty in adapting them for analyzing sequences (e.g., amino acid sequences) and graphs (e.g., small molecules). In this study, we introduce a group of random feature-approximated kernels for sequences and graphs that exhibit linear scaling with both the size of the training set and the size of the sequences or graphs. We incorporate these new kernels into our new Python library for GP regression, xGPR, and develop an efficient and scalable algorithm for fitting GPs equipped with these kernels to large datasets. We compare the performance of xGPR on 17 different benchmarks with both standard and state-of-the-art deep learning models and find that GP regression achieves highly competitive accuracy for these tasks while providing well-calibrated uncertainty quantitation and improved interpretability. Finally, in a simple experiment, we illustrate how xGPR may be used as part of an active learning strategy to engineer a protein with a desired property in an automated way without human intervention.
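The idea behind random feature approximation can be shown with the classic random Fourier feature construction for an RBF kernel on fixed-length vectors. This is a generic illustration only: it is not the xGPR API and not the sequence or graph kernels introduced in the paper, and the names `make_rff` and `rbf_kernel` are hypothetical. Each input is mapped to a finite feature vector whose dot products approximate the kernel, so a model can be fitted in feature space at a cost that grows linearly with the number of training points.

```python
import math
import random

def make_rff(dim, n_features, lengthscale, seed=0):
    """Return a function mapping a vector to random Fourier features.

    Dot products of the returned features approximate an RBF kernel
    with the given lengthscale.
    """
    rng = random.Random(seed)
    # Frequencies drawn from the spectral density of the RBF kernel.
    weights = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def features(x):
        return [scale * math.cos(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(weights, offsets)]

    return features

def rbf_kernel(x, y, lengthscale):
    """Exact RBF kernel, used here only as a reference value."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-d2 / (2.0 * lengthscale ** 2))

features = make_rff(dim=2, n_features=2000, lengthscale=1.0)
x, y = [0.1, 0.2], [0.4, -0.3]
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = rbf_kernel(x, y, lengthscale=1.0)
```

The approximation error shrinks roughly as one over the square root of the number of random features, which is the trade-off between cost and accuracy in this family of methods.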

Interaction of Radiopharmaceuticals with Somatostatin Receptor 2 Revealed by Molecular Dynamics Simulations

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-19, DOI: 10.1021/acs.jcim.3c00712

Silvia Gervasoni, Işılay Öztürk, Camilla Guccione, Andrea Bosin, Paolo Ruggerone, Giuliano Malloci

The development of drugs targeting somatostatin receptor 2 (SSTR2), generally overexpressed in neuroendocrine tumors, is the focus of intense research. A few molecules in conjugation with radionuclides are in clinical use for both diagnostic and therapeutic purposes. These radiopharmaceuticals are composed of a somatostatin analogue biovector conjugated to a chelator moiety bearing the radionuclide. To date, despite valuable efforts, a detailed molecular-level description of the interaction of radiopharmaceuticals in complex with SSTR2 has not yet been accomplished. Therefore, in this work, we carefully analyzed the key dynamical features and detailed molecular interactions of SSTR2 in complex with six radiopharmaceutical compounds selected among the few already in use (64Cu/68Ga-DOTATATE, 68Ga-DOTATOC, 64Cu-SARTATE) and some in clinical development (68Ga-DOTANOC, 64Cu-TETATATE). Through molecular dynamics simulations and exploiting recently available structures of SSTR2, we explored the influence of the different portions of the compounds (peptide, radionuclide, and chelator) on the interaction with the receptor. We identified the most stable binding modes and found distinct interaction patterns characterizing the six compounds. We thus unveiled detailed molecular interactions crucial for the recognition of this class of radiopharmaceuticals. The microscopically well-founded analysis presented in this study provides guidelines for the design of new potent ligands targeting SSTR2.

Dissecting the Effect of Temperature on Hyperthermophilic Pf2001 Esterase Dimerization by Molecular Dynamics

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-15, DOI: 10.1021/acs.jcim.3c00415

Xue Zhang, Lei Li, Qingchuan Zheng

Pf2001 esterase (Pf2001) from Pyrococcus furiosus has hyperthermophilic properties and exerts a biocatalytic function in a dimeric state. Crystal structures revealed that the structural rearrangement of the cap domain is responsible for the Pf2001 dimer formation. However, the details of the cap domain remodeling and the effects of temperature on the dimerization process remain elusive at the molecular level, because it is difficult for experimental methods to capture the dynamic process of dimerization. Herein, four dimer models based on the monomeric crystal structure (PDB ID: 5G59) were constructed to investigate the conformational transition details and temperature effects in the dimerization by conventional molecular dynamics and accelerated molecular dynamics simulations. Our simulation results indicate that the monomer undergoes a conformational change into a “preparatory state” at high temperatures, which is more favorable for its transformation into a stable dimer. The subsequent free energy landscape analysis further identifies four intermediate states (from separated state to dimeric state) and discloses that a more accessible α-helix driven by stronger hydrophobic interactions induces a rearrangement of the cap domain, displaying a “tic-tac-toe” activation feature that is important for stabilizing the dimer interface and facilitating the formation of hydrophobic pockets. In addition, the electrostatic potential surface analysis illustrates that the weaker electrostatic repulsion (Lys and Arg) in the dimer interface at high temperatures is also a key factor for dimer stabilization. Altogether, our results provide molecular-level insight into the dimer formation process of hyperthermophilic esterase and should be useful for understanding the enzymatic specificity of α/β-hydrolases.

Preprocessing of Single Cell RNA Sequencing Data Using Correlated Clustering and Projection

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-04, DOI: 10.1021/acs.jcim.3c00674

Yuta Hozumi, Kiyoto Aramis Tanemura, Guo-Wei Wei

Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell–cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing the downstream analysis. We present Correlated Clustering and Projection (CCP), a new data-domain dimensionality reduction method, for the first time. CCP projects each cluster of similar genes into a supergene defined as the accumulated pairwise nonlinear gene–gene correlations among all cells. Using 14 benchmark data sets, we demonstrate that CCP has significant advantages over classical principal component analysis (PCA) for clustering and/or classification problems with intrinsically high dimensionality. In addition, we introduce the Residue-Similarity index (RSI) as a novel metric for clustering and classification and the R-S plot as a new visualization tool. We show that the RSI correlates with accuracy without requiring the knowledge of the true labels. The R-S plot provides a unique alternative to the uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) for data with a large number of cell types.

A Highly Sensitive Model Based on Graph Neural Networks for Enzyme Key Catalytic Residue Prediction

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-03, DOI: 10.1021/acs.jcim.3c00273

Xiaowei Shen, Shiding Zhang, Jianyu Long, Changjing Chen, Meng Wang, Ziheng Cui, Biqiang Chen, Tianwei Tan

Determining the catalytic site of enzymes is a great help for understanding the relationship between protein sequence, structure, and function, which provides the basis and targets for designing, modifying, and enhancing enzyme activity. The unique local spatial configuration bound to the substrate at the active center of the enzyme determines the catalytic ability of enzymes and plays an important role in the catalytic site prediction. As a suitable tool, the graph neural network can better understand and identify the residue sites with unique local spatial configurations due to its remarkable ability to characterize the three-dimensional structural features of proteins. Consequently, a novel model for predicting enzyme catalytic sites has been developed, which incorporates a uniquely designed adaptive edge-gated graph attention neural network (AEGAN). This model is capable of effectively handling sequential and structural characteristics of proteins at various levels, and the extracted features enable an accurate description of the local spatial configuration of the enzyme active site by sampling the local space around candidate residues and special design of amino acid physical and chemical properties. To evaluate its performance, the model was compared with existing catalytic site prediction models using different benchmark datasets and achieved the best results on each benchmark dataset. The model exhibited a sensitivity of 0.9659, accuracy of 0.9226, and area under the precision-recall curve (AUPRC) of 0.9241 on the independent test set constructed for evaluation. Furthermore, the F1-score of this model is nearly four times higher than that of the best-performing similar model in previous studies. This research can serve as a valuable tool to help researchers understand protein sequence–structure–function relationships while facilitating the characterization of novel enzymes of unknown function.
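The evaluation metrics named in the abstract follow directly from confusion-matrix counts. The sketch below uses hypothetical counts (not the paper's data) for a residue-level task in which catalytic residues are rare. It illustrates why sensitivity and accuracy can both look strong while the F1-score stays low when false positives outnumber true positives. The function name `classification_metrics` is an assumption for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts."""
    sensitivity = tp / (tp + fn)              # recall on the positive class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"sensitivity": sensitivity, "precision": precision,
            "accuracy": accuracy, "f1": f1}

# Hypothetical counts: 10 true catalytic residues among 1000 residues.
metrics = classification_metrics(tp=8, fp=40, tn=950, fn=2)
```

Because F1 combines precision and sensitivity, it is the more informative single number for imbalanced tasks such as catalytic residue prediction.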

Kinetic Barrier to Enzyme Inhibition Is Manipulated by Dynamical Local Interactions in E. coli DHFR

Journal of Chemical Information and Modeling (IF 6.162), Pub Date: 2023-07-25, DOI: 10.1021/acs.jcim.3c00818

Ebru Cetin, Tandac F. Guclu, Isik Kantarcioglu, Ilona K. Gaszek, Erdal Toprak, Ali Rana Atilgan, Burcu Dedeoglu, Canan Atilgan

Dihydrofolate reductase (DHFR) is an important drug target and a highly studied model protein for understanding enzyme dynamics. DHFR’s crucial role in folate synthesis renders it an ideal candidate for understanding protein function and protein evolution mechanisms. In this study, to understand how a newly proposed DHFR inhibitor, 4′-deoxy methyl trimethoprim (4′-DTMP), alters evolutionary trajectories, we studied the interactions that lead to its superior performance over that of trimethoprim (TMP). To elucidate the inhibition mechanism of 4′-DTMP, we first confirmed, both computationally and experimentally, that the relative binding free energy cost for the mutation of TMP and 4′-DTMP is the same, indicating that the characteristic differences are kinetic rather than thermodynamic in origin. We then employed an interaction-based analysis by focusing first on the active site and then on the whole enzyme. We confirmed that the polar modification in 4′-DTMP induces additional local interactions with the enzyme, particularly with the M20 loop. These changes are propagated to the whole enzyme as shifts in the hydrogen bond networks. To shed light on the allosteric interactions, we support our analysis with network-based community analysis and show that segmentation of the loop domain of inhibitor-bound DHFR must be avoided by a successful inhibitor.

How to Compute Atomistic Insight in DFT Clusters: The REG-IQA Approach

Journal of Chemical Information and Modeling ( IF 6.162 ) Pub Date : 2023-07-10 , DOI: 10.1021/acs.jcim.3c00404

Fabio Falcioni, Paul L. A. Popelier

The relative energy gradient (REG) method is paired with the topological energy partitioning method interacting quantum atoms (IQA), as REG-IQA, to provide detailed and unbiased knowledge on the intra- and interatomic interactions. REG operates on a sequence of geometries representing a dynamical change of a system. Its recent application to peptide hydrolysis of the human immunodeficiency virus-1 (HIV-1) protease (PDB code: 4HVP) has demonstrated its full potential in recovering reaction mechanisms and through-space electrostatic and exchange–correlation effects, making it a compelling tool for analyzing enzymatic reactions. In this study, the computational efficiency of the REG-IQA method for the 133-atom HIV-1 protease quantum mechanical system is analyzed in every detail and substantially improved by means of three different approaches. The first approach of smaller integration grids for IQA integrations reduces the computational overhead by about a factor of 3. The second approach uses the line-simplification Ramer–Douglas–Peucker (RDP) algorithm, which outputs the minimal number of geometries necessary for the REG-IQA analysis for a predetermined root mean squared error (RMSE) tolerance. This cuts the computational time of the whole REG analysis by a factor of 2 if an RMSE of 0.5 kJ/mol is considered. The third approach consists of a “biased” or “unbiased” selection of a specific subset of atoms of the whole initial quantum mechanical model wave-function, which results in more than a 10-fold speed-up per geometry for the IQA calculation, without deterioration of the outcome of the REG-IQA analysis. Finally, to show the capability of these approaches, the findings gathered from the HIV-1 protease system are also applied to a different system named haloalcohol dehalogenase (HheC). In summary, this study takes the REG-IQA method to a computationally feasible and highly accurate level, making it viable for the analysis of a multitude of enzymatic systems.
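The line-simplification step mentioned above can be illustrated with a minimal sketch of the classic Ramer–Douglas–Peucker algorithm. This is not the authors' implementation: the textbook version below keeps points whose perpendicular distance from the current chord exceeds a tolerance `epsilon`, whereas the abstract describes an RMSE-based tolerance on the REG-IQA analysis. The function names and the sample (coordinate, energy) profile are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def perpendicular_distance(p: Point, a: Point, b: Point) -> float:
    """Distance from point p to the straight line through a and b."""
    (x, y), (x1, y1), (x2, y2) = p, a, b
    dx, dy = x2 - x1, y2 - y1
    length = math.hypot(dx, dy)
    if length == 0.0:
        # a and b coincide: fall back to the point-to-point distance.
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / length


def rdp(points: Sequence[Point], epsilon: float) -> List[Point]:
    """Return a subset of points approximating the polyline within epsilon."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    # Find the interior point farthest from the chord first -> last.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        # Every interior point is within tolerance: keep only the endpoints.
        return [first, last]
    # Otherwise keep the farthest point and simplify both halves recursively.
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


if __name__ == "__main__":
    # Hypothetical (reaction coordinate, energy) profile for illustration only.
    profile = [(0.0, 0.0), (1.0, 0.0), (2.0, 5.0), (3.0, 0.0), (4.0, 0.0)]
    kept = rdp(profile, epsilon=1.0)
    print(f"kept {len(kept)} of {len(profile)} geometries: {kept}")
```

In this setting each retained point would correspond to a geometry for which the expensive IQA calculation is actually run, so a looser tolerance trades fidelity of the energy profile for fewer calculations.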

PREFER: A New Predictive Modeling Framework for Molecular Discovery

Journal of Chemical Information and Modeling ( IF 6.162 ) Pub Date : 2023-07-24 , DOI: 10.1021/acs.jcim.3c00523

Jessica Lanini, Gianluca Santarossa, Finton Sirockin, Richard Lewis, Nikolas Fechner, Hubert Misztela, Sarah Lewis, Krzysztof Maziarz, Megan Stanley, Marwin Segler, Nikolaus Stiefl, Nadine Schneider

Machine-learning and deep-learning models have been extensively used in cheminformatics to predict molecular properties, to reduce the need for direct measurements, and to accelerate compound prioritization. However, different setups and frameworks and the large number of molecular representations make it difficult to properly evaluate, reproduce, and compare them. Here we present a new PREdictive modeling FramEwoRk for molecular discovery (PREFER), written in Python (version 3.7.7) and based on AutoSklearn (version 0.14.7), that allows comparison between different molecular representations and common machine-learning models. We provide an overview of the design of our framework and show exemplary use cases and results of several representation–model combinations on diverse data sets, both public and in-house. Finally, we discuss the use of PREFER on small data sets. The code of the framework is freely available on GitHub.

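Chinese Academy of Sciences (CAS) SCI Journal Ranking

Major category	Subcategory	TOP	Review
Chemistry, Zone 2	CHEMISTRY, MEDICINAL (Medicinal Chemistry), Zone 2	No	No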

Supplementary Information

Self-citation rate	H-index	SCI indexing status	PubMed Central (PMC)
7.80	131	Science Citation Index; Science Citation Index Expanded	No

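Submission Guide

Submission site: http://acs.manuscriptcentral.com/acs
Author guidelines: http://publish.acs.org/publish/author_guidelines?coden=jcisd8
Manuscript templates: http://pubs.acs.org/page/jcisd8/submission/authors.html#TEMPLATES
Reference format: http://pubs.acs.org/page/jcisd8/submission/reference-guidelines.html
Scope: Journal of Chemical Information and Modeling publishes new methods and important applications in cheminformatics and molecular modeling. Its primary audience is researchers in chemistry, computing, and information science, who follow the journal for original research, programming innovations, software reviews, and other current developments in the field. Research areas covered include: representation and computer-based searching of chemical databases; molecular modeling; computer-aided molecular design of new materials, catalysts, or ligands; development of new or more efficient algorithms for chemical software; and biopharmaceutical chemistry (including analyses of biological activity and reports related to drug discovery).
Article types: Viewpoints, Articles, Perspectives, Reviews, Letters, Application Notes, Additions and Corrections, Retractions, Expressions of Concern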