Combining machine learning and targeted mass spectrometry to validate protein isoforms

Bioinformatics Lab News Proteomics Research

Feb. 16, 2021, 10:53 p.m.


Why identify protein isoforms?

Alternative splicing plays a very important role in the heart. In previous work, we found that the heart is among the examined tissues most affected by alternative splicing (link). Important heart proteins with splice variants include tropomyosin 1 and titin, whose splicing has been implicated in congenital heart diseases.

A current blind spot of alternative splicing research is that most isoforms have been defined only at the mRNA level, and few technologies can detect and distinguish the different protein isoforms. This matters if we want to know whether the spliced isoforms are correctly translated and what their potential molecular functions (localization, interactions) are. Sometimes isoforms can be distinguished by their gel migration patterns, but they often have very similar molecular weights and so may not separate on a gel.

What is spectrum prediction?

We previously developed an "RNA-guided proteomics" approach to identify candidate protein isoforms in the heart from proteomics data (link). However, identifying "non-canonical" peptide sequences (i.e., protein products not encoded by the most common/prominent version of a gene) from shotgun proteomics data can be prone to false positives and so requires careful validation. Targeted mass spectrometry could provide an avenue to verify isoform discoveries, by allowing the isoform to be targeted and detected again in additional samples. But building targeted mass spectrometry assays is labor intensive and often requires expensive stable isotope labeled peptide standards to verify peptide identity.
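To illustrate what makes a peptide isoform-specific, here is a minimal Python sketch. It digests two protein sequences in silico with trypsin rules and keeps the peptides found only in the alternative isoform. The two sequences are made-up toy examples, and the function names are hypothetical. This is not the pipeline used in our study, only a sketch of the idea.

```python
def tryptic_digest(sequence, min_length=6):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        blocked_by_proline = not is_last and sequence[i + 1] == "P"
        if is_last or (residue in "KR" and not blocked_by_proline):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    # Very short peptides are rarely useful for identification, so drop them.
    return [p for p in peptides if len(p) >= min_length]


def isoform_specific_peptides(canonical, alternative):
    """Return peptides of the alternative isoform absent from the canonical one."""
    canonical_peptides = set(tryptic_digest(canonical))
    return [p for p in tryptic_digest(alternative) if p not in canonical_peptides]


# Toy sequences (hypothetical): the isoforms differ only in a middle segment.
CANONICAL = "MASLTEEQKAELGDVIRSWEETLNPKGGHYTFR"
ALTERNATIVE = "MASLTEEQKAELGDVIRTWDDSINPKGGHYTFR"

print(isoform_specific_peptides(CANONICAL, ALTERNATIVE))
```

Only the peptide spanning the differing segment is reported. Peptides shared by both isoforms cannot tell them apart, which is why isoform validation hinges on a small number of unique sequences.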

Several approaches have now been described that allow the fragmentation spectrum of a peptide to be predicted in silico. This could mean an easier way to build targeted mass spectrometry assays without using expensive peptide standards. These prediction approaches usually fall into two camps: those that use a detailed physicochemical model of peptide behavior, and those that use a data-driven, deep learning-based approach to predict the fragmentation pattern from existing experimental data. Prosit is one such deep learning algorithm that has been shown to perform exceptionally well in predicting the fragmentation spectra of peptides.
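To make "fragmentation spectrum" a little more concrete: the m/z positions of the common b and y fragment ions follow directly from the peptide sequence, while the relative intensities of those peaks are the part that is hard to know in advance and that prediction tools aim to estimate. The Python sketch below computes only the singly charged b and y ion m/z values for an unmodified peptide, using standard monoisotopic residue masses. The function name is hypothetical, and the sketch does not reproduce Prosit or any other predictor.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_mz(peptide):
    """Return {ion label: m/z} for singly charged b and y ions."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    ions = {}
    for n in range(1, len(peptide)):
        # b ions hold the first n residues; y ions hold the last n plus water.
        ions[f"b{n}"] = sum(masses[:n]) + PROTON
        ions[f"y{n}"] = sum(masses[-n:]) + WATER + PROTON
    return ions


for label, mz in sorted(fragment_mz("TWDDSINPK").items()):
    print(f"{label}\t{mz:.4f}")
```

A targeted assay monitors a handful of these fragments for each peptide, so knowing which ones are expected to be most intense helps decide which to monitor.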

Does this work with alternative peptides?

The current Prosit model was trained against a large library of synthetic peptides, and it was not completely clear whether it works well for alternative or novel isoform sequences. We verified this in our study by comparing Prosit predictions to a number of synthetic peptide standards for the isoform sequences we identified, then used the results to build a number of "computation-assisted" targeted mass spectrometry assays. We showed that these assays allow some candidate protein isoforms to be reliably re-identified in human heart tissue as well as in cultured human AC16 cells, suggesting they have good potential to be employed for isoform quantification and functional studies.
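One common way to score how well a predicted spectrum agrees with a measured one is the normalized spectral contrast angle, which is 1 for identical intensity patterns and 0 for patterns with no overlap. The Python sketch below shows that calculation on made-up intensities. We present it as an illustration of the general idea, not necessarily the exact scoring used in the paper, and the function name and numbers are hypothetical.

```python
import math


def spectral_angle(predicted, observed):
    """Normalized spectral contrast angle between two {ion label: intensity} maps."""
    # Align on the union of fragment ions; a missing ion counts as zero intensity.
    ions = sorted(set(predicted) | set(observed))
    p = [predicted.get(ion, 0.0) for ion in ions]
    o = [observed.get(ion, 0.0) for ion in ions]
    norm_p = math.sqrt(sum(x * x for x in p))
    norm_o = math.sqrt(sum(x * x for x in o))
    if norm_p == 0 or norm_o == 0:
        return 0.0
    cosine = sum(a * b for a, b in zip(p, o)) / (norm_p * norm_o)
    cosine = max(-1.0, min(1.0, cosine))  # guard against rounding error
    return 1 - 2 * math.acos(cosine) / math.pi


# Made-up relative intensities for a few fragment ions of one peptide.
predicted = {"y3": 0.20, "y4": 1.00, "y5": 0.55, "y6": 0.30, "b2": 0.15}
observed = {"y3": 0.25, "y4": 1.00, "y5": 0.48, "y6": 0.35}

print(f"spectral angle: {spectral_angle(predicted, observed):.3f}")
```

Because the score depends only on the pattern of relative intensities, it is unaffected by the overall signal level, which makes it convenient for comparing a prediction against a synthetic standard or against signal in a tissue sample.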

To read more, check out this new paper by Erin, Juliana, and others online at JMCC (link)!