Session TOE. There are 4 abstracts in this session.

Session: Informatics: Emerging and New Approaches, time: 3:00 - 3:25 pm

Semantic Computing for Protein Knowledge Network Discovery

Cathy H Wu
University of Delaware, Newark, DE

To realize the value of genome-scale data for disease understanding, we have developed a semantic computing framework that connects text mining, data mining and biomedical ontology for protein knowledge discovery. We have employed natural language processing and machine learning approaches with linguistic generalization to develop text mining tools applicable to various types of entities and relations, including post-translational modification (PTM) enzyme-substrate-site relationships and their protein-protein interactions, as well as associations with diseases, genomic anomalies and drug responses. To foster large-scale text analytics across documents, we have further developed iTextMine with an automated workflow. To support protein-centric semantic integration of biomedical data of increasing volume and complexity, we have developed the Protein Ontology (PRO) as a reference ontology in the Open Biological and Biomedical Ontologies Foundry, allowing both human understanding and computational reasoning of proteins in biological contexts. Through federated SPARQL queries of multiple ontologies and knowledge sources, including site-specific functional annotations of PTM proteoforms in PRO, we have proposed mechanisms connecting PTMs, variants, and cancer. The rich PTM knowledge is being integrated in iPTMnet to support exploration of PTM enzyme-substrate relationships, regulation of PTM enzymes, cross-talk, and conservation across species. Our tools are FAIR (Findable, Accessible, Interoperable, Reusable), accessible programmatically via RESTful API, dockerized, and available from project websites, public repositories and cloud-based environments.
This talk will highlight projects involving PTM enzyme (kinase) enrichment analysis and the use of LINCS (Library of Integrated Network-based Cellular Signatures) for interpretation of large-scale proteomic data to identify the upstream signaling pathways that are responsible for the observed PTM state of the cell and further our understanding of the impact of kinase inhibitor drugs on signaling pathways in cancer therapy.
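
The abstract does not say how kinase enrichment is computed. As a rough illustration only, the sketch below assumes a common over-representation approach: a one-sided hypergeometric test asking whether the known substrate sites of a kinase are over-represented among the regulated sites of an experiment. The function names, the input layout and the site identifiers are hypothetical and are not taken from iPTMnet or any of the tools named above.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when drawing N items from a population of M that holds n successes."""
    upper = min(n, N)
    total = comb(M, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total


def kinase_enrichment(regulated_sites, background_sites, kinase_to_substrates):
    """Return a p-value per kinase (hypothetical input layout, for illustration).

    regulated_sites: sites whose PTM level changed in the experiment
    background_sites: all sites quantified in the experiment
    kinase_to_substrates: dict mapping kinase name -> known substrate sites
    """
    background = set(background_sites)
    regulated = set(regulated_sites) & background
    p_values = {}
    for kinase, substrates in kinase_to_substrates.items():
        known = set(substrates) & background      # substrates we could observe
        overlap = len(known & regulated)          # substrates that are regulated
        p_values[kinase] = hypergeom_sf(
            overlap, len(background), len(known), len(regulated)
        )
    return p_values


# Toy example with made-up site identifiers.
background = [f"s{i}" for i in range(10)]
regulated = ["s0", "s1", "s2"]
annotations = {"KinaseA": ["s0", "s1", "s2"], "KinaseB": ["s7", "s8", "s9"]}
p_values = kinase_enrichment(regulated, background, annotations)
```

In the toy data, all three regulated sites are annotated substrates of the hypothetical KinaseA, so it receives a small p-value, while KinaseB has no regulated substrates. A real analysis would also correct for multiple testing across kinases.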

Tips and Tricks (if submitted):

Session: Informatics: Emerging and New Approaches, time: 3:25 - 3:50 pm

Algorithms and databases for mining insight from quantitative mass spectrometry experiments of post-translational modifications

Kristen Naegle
University of Virginia, Charlottesville, VA

My research lab is deeply interested in predicting and testing the function of tyrosine phosphorylation in proteins and protein networks. In the pursuit of this work, we have developed new algorithms and resources for identifying, analyzing, and inferring meaning about post-translational modifications. In this talk, I will share methods for ensemble clustering, which have identified new protein-protein interactions from quantitative mass spectrometry data, and introduce ProteomeScout, our resource for improving the accessibility and analysis of quantitative experiments and whole-proteome-level information about post-translational modifications.

Tips and Tricks (if submitted):

Session: Informatics: Emerging and New Approaches, time: 3:50 - 4:05 pm

Topological Scoring of Protein Interaction Networks

Michael Washburn
Stowers Institute for Medical Research, Kansas City, MO

It remains a significant challenge to define individual protein associations within networks where an individual protein can directly interact with other proteins and/or be part of large complexes, which contain functional modules. Here we demonstrate the topological scoring (TopS) algorithm for quantitative proteomic analyses of affinity purifications. Data are analyzed in a parallel fashion where a bait protein is scored in an individual affinity purification by aggregating information from the entire dataset. A broad range of scores is obtained, indicating the enrichment of an individual protein in every bait protein analyzed. TopS was applied to interaction networks derived from human DNA repair proteins and yeast chromatin remodeling complexes. TopS captured direct protein interactions and modules within complexes. TopS is a rapid method for the efficient and informative computational analysis of datasets, is complementary to existing analysis pipelines, and provides new insights into protein interaction networks.
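
The abstract does not state the scoring formula. As a hedged illustration of scoring one protein in one purification by aggregating over the whole dataset, the sketch below assumes a likelihood-ratio-style score Q * ln(Q / E), where Q is the observed quantity (for example, a spectral count) and E is the value expected from the row, column and table totals. This formula, the function name and the input layout are assumptions made for illustration and should not be read as the TopS implementation.

```python
from math import log


def topological_scores(counts):
    """Score every (protein, bait) cell against the whole table.

    counts: dict mapping protein -> dict mapping bait -> observed quantity
            (hypothetical layout; rows are detected proteins, columns are
            bait purifications).
    Returns the same nested layout with Q * ln(Q / E) in each cell, where
    E = row_total * column_total / table_total (assumed formula).
    """
    row_totals = {protein: sum(row.values()) for protein, row in counts.items()}
    col_totals = {}
    for row in counts.values():
        for bait, q in row.items():
            col_totals[bait] = col_totals.get(bait, 0.0) + q
    table_total = sum(row_totals.values())

    scores = {}
    for protein, row in counts.items():
        scores[protein] = {}
        for bait, q in row.items():
            if q <= 0:
                # Nothing observed: treat the cell as uninformative.
                scores[protein][bait] = 0.0
                continue
            expected = row_totals[protein] * col_totals[bait] / table_total
            scores[protein][bait] = q * log(q / expected)
    return scores


# Toy table with made-up counts: P1 is enriched with bait B1, P2 with B2.
toy = {"P1": {"B1": 9, "B2": 1}, "P2": {"B1": 1, "B2": 9}}
toy_scores = topological_scores(toy)
```

Under this assumed score, a cell is positive when a protein is seen more often with a bait than the table totals predict and negative when it is seen less often, which yields a broad range of values across the dataset.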

Tips and Tricks (if submitted):

Session: Informatics: Emerging and New Approaches, time: 4:05 - 4:20 pm

High-Throughput Identification of MS-Cleavable and Non-cleavable Chemically Crosslinked Peptides with MetaMorpheus

Lei Lu; Michael R. Shortreed; Robert J. Millikin; Lloyd M. Smith
University of Wisconsin, Madison, WI

Protein chemical cross-linking combined with mass spectrometry has become an important technique for the analysis of protein structure and protein–protein interactions. Reliable, rapid, and user-friendly tools for large-scale analysis of cross-linked proteins, however, are still needed. MetaMorpheus has recently been updated to identify both MS-cleavable and non-cleavable cross-linked peptides. The MetaMorpheus cross-link search does not require the presence of signature fragment ions, a major advantage compared with similar programs. One complication associated with the need for signature ions from cleavable cross-linkers such as DSSO (disuccinimidyl sulfoxide) is the requirement for multiple fragmentation types and energy combinations, which is not necessary for MetaMorpheus. MetaMorpheus can, however, search fragmentation data from multiple dissociation types for the same precursor (e.g. CID & ETD, CID & HCD), as well as MS2/MS3 data, for users who want such information. Another significant advantage of MetaMorpheus is the ability to perform proteome-wide analysis. MetaMorpheus is also faster than other currently available MS-cleavable cross-link search software programs. Finally, MetaMorpheus provides immediate and straightforward MS2 annotation of each assignment, which can be exported in Portable Document Format (.pdf). This feature enables users to manually validate identifications.

Tips and Tricks (if submitted):