Poster

Integrating Molecular Dynamics Simulations and Machine Learning to Uncover Important Determinants of Protein-Protein Interactions

Abstract

Wnt proteins are a family of secreted glycoproteins whose signaling is essential for regulating cell development. Dysregulated Wnt signaling is implicated in several diseases, in particular cancers arising from uncontrolled cell proliferation. To activate signaling, Wnt must bind WLS, a transmembrane protein that facilitates its transport to the cell membrane for secretion. This binding requires that Wnt undergo palmitoylation (PAM), which adds a lipid group to a highly conserved serine residue and enables Wnt to interact with a specific hydrophobic pocket within WLS. How Wnt, WLS, and PAM assemble into a complex that is both energetically favorable and efficient for extracellular transport is poorly understood. While structural studies have provided important insights into signal transduction, structural information is available for only a few of the human Wnt family proteins. Using data from molecular dynamics (MD) simulations, we developed machine learning (ML) models to characterize residue-level differences among four human Wnt family proteins in complex with WLS. We ran 1.5-microsecond MD simulations using crystal structures of Wnt3a and Wnt8a and homology models of Wnt1 and Wnt5a, producing 75,000 frames per protein. Features were extracted as Wnt-WLS residue pairs within a 12 Å contact distance (n=985), and the final training and testing sets contained 160,000 and 70,000 frames, respectively. Subclustering within regions of known structural significance reduced the feature set to 210 residue pairs. The reduced feature set was used to train a Random Forest multiclass classification model, optimized through hyperparameter tuning and 10-fold cross-validation, which achieved an accuracy of 95.9%. Permutation feature importance recovered 42% of the residue pairs previously shown in biochemical experiments to play a vital role in Wnt signaling.
These findings offer new insight into the intricate specificity of Wnt-WLS interactions and demonstrate the value of combining MD with ML to identify critical molecular determinants of Wnt signaling.
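
The workflow summarized above (residue-pair contact features, a Random Forest multiclass classifier with 10-fold cross-validation, and permutation feature importance) can be illustrated with a minimal sketch. This is not the authors' code. It assumes scikit-learn and NumPy, replaces the MD trajectories with synthetic coordinates, and assumes that each feature is a per-frame distance for a pair that comes within the 12 Å cutoff in at least one frame; the abstract does not say how the features were encoded. The residue counts, frame counts, noise levels, and the helper `pair_distances` are hypothetical, and hyperparameter tuning and subclustering are omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import cross_val_score, train_test_split

rng = np.random.default_rng(0)

N_CLASSES = 4           # stand-ins for the four Wnt proteins
FRAMES_PER_CLASS = 200  # toy size; the abstract reports 75,000 frames per protein
N_WNT, N_WLS = 6, 5     # hypothetical residue counts
CUTOFF = 12.0           # contact distance in angstroms, taken from the abstract


def pair_distances(wnt_xyz, wls_xyz):
    """Per-frame distance for every Wnt-WLS residue pair (hypothetical helper).

    wnt_xyz: (frames, n_wnt, 3); wls_xyz: (frames, n_wls, 3)
    returns: (frames, n_wnt * n_wls), ordered Wnt-major.
    """
    diff = wnt_xyz[:, :, None, :] - wls_xyz[:, None, :, :]
    return np.linalg.norm(diff, axis=-1).reshape(len(wnt_xyz), -1)


# Synthetic "trajectories": a shared base geometry plus small fluctuations.
# Each class displaces one Wnt residue, mimicking a protein-specific contact.
wls_base = rng.uniform(0.0, 8.0, size=(N_WLS, 3))
wnt_base = rng.uniform(0.0, 8.0, size=(N_WNT, 3)) + np.array([0.0, 0.0, 5.0])

X_parts, y_parts = [], []
for label in range(N_CLASSES):
    shift = np.zeros((N_WNT, 3))
    shift[label, 2] = 3.0  # class-specific displacement (assumption)
    wnt = wnt_base + shift + rng.normal(scale=0.3, size=(FRAMES_PER_CLASS, N_WNT, 3))
    wls = wls_base + rng.normal(scale=0.3, size=(FRAMES_PER_CLASS, N_WLS, 3))
    X_parts.append(pair_distances(wnt, wls))
    y_parts.append(np.full(FRAMES_PER_CLASS, label))
X_all = np.vstack(X_parts)
y = np.concatenate(y_parts)

# Keep only pairs that come within the cutoff in at least one frame.
pairs = [(i, j) for i in range(N_WNT) for j in range(N_WLS)]
in_contact = (X_all < CUTOFF).any(axis=0)
X = X_all[:, in_contact]
kept_pairs = [p for p, keep in zip(pairs, in_contact) if keep]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
cv_scores = cross_val_score(model, X_train, y_train, cv=10)  # 10-fold CV
model.fit(X_train, y_train)
test_accuracy = model.score(X_test, y_test)

# Permutation importance: drop in held-out accuracy when one feature is shuffled.
result = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=0)
ranking = np.argsort(result.importances_mean)[::-1]

print(f"mean CV accuracy: {cv_scores.mean():.3f}, test accuracy: {test_accuracy:.3f}")
for idx in ranking[:5]:
    wnt_res, wls_res = kept_pairs[idx]
    print(f"Wnt residue {wnt_res} - WLS residue {wls_res}: "
          f"{result.importances_mean[idx]:.4f}")
```

Distances that share a residue are correlated, so permutation importance can be spread thinly across them: shuffling one feature costs little accuracy when a correlated feature carries the same signal. Grouping related pairs before ranking is one way to account for this.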
