PROTEIN QUANTITATION PIPELINE FOR TOP DOWN PROTEOMICS
Advisor:
Dr. Khalid Mehmood Ul Hasan
Abstract:
Proteins are the single most important class of molecules sustaining life on Earth, with
major implications for almost every aspect of human life. Over the last few decades, the general
public of Pakistan has been exposed to slow poisoning caused by contaminated water. This water
contains hazardous amounts of heavy metals, which are absorbed by bacteria through the
process of biosorption, leading to the broader phenomenon of microbial fixation. Due to rapid
industrialization, the fixation process is deteriorating at an alarming rate, and seepage
through drains puts the underground water table at risk. Identification and quantitation of
the proteins responsible for microbial fixation, the only natural process that removes hazardous
heavy metals from water, can help us unlock a plethora of ways to combat heavy metal
contamination in Pakistan. Determining the protein content of these bacteria can lead. Our
tool is a free, open-source, open-architecture web service that provides an intuitive search
environment for identifying and quantifying proteins. In this work, we extend our tool for
analyzing top-down proteomics data by incorporating a much-needed top-down protein
quantitation tool. Currently, its salient features include: (i) an intensity-weighted
sliding-window protocol for intact protein mass tuning, (ii) de novo peptide sequence tag
extraction and scoring, and (iii) abundance-weighted in silico spectral comparison. We have
further extended its capabilities by adding state-of-the-art algorithms for deconvolution and
quantitation. Label-free quantitation techniques are incorporated in the first phase, comprising
(i) polynomial regression for missing peaks, (ii) spectral counting, a relative quantitation
approach, and (iii) extracted-ion chromatograms (XICs), used for simple peak detection and thus
quantitation. We have
also ported its implementation from the CPU to emerging Graphics Processing Unit (GPU)
technology. Our further aims include developing a top-down protein identification and
quantitation engine implemented in the ASP.NET framework: a GPU-based, open-source,
open-architecture, publicly available web service that will use NVIDIA's CUDA toolkit to
implement state-of-the-art algorithms for this purpose.
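
To make the XIC-based label-free quantitation mentioned above more concrete, the following is a
minimal Python sketch of the general idea: sum the intensity near a target m/z in each scan, then
integrate the resulting trace over retention time. It is not the tool's actual implementation.
The function names (extract_xic, xic_area), the scan representation, and the 10 ppm tolerance
are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical scan representation: (retention time, [(m/z, intensity), ...]).
Peak = Tuple[float, float]
Scan = Tuple[float, List[Peak]]


def extract_xic(scans: List[Scan], target_mz: float,
                ppm_tol: float = 10.0) -> List[Tuple[float, float]]:
    """Build an extracted-ion chromatogram for target_mz.

    For every scan, sum the intensities of all peaks whose m/z lies
    within ppm_tol parts per million of target_mz.
    """
    half_width = target_mz * ppm_tol / 1e6  # tolerance window in m/z units
    xic = []
    for rt, peaks in scans:
        total = sum((inten for mz, inten in peaks
                     if abs(mz - target_mz) <= half_width), 0.0)
        xic.append((rt, total))
    return xic


def xic_area(xic: List[Tuple[float, float]]) -> float:
    """Integrate the XIC over retention time with the trapezoidal rule.

    The area serves as a simple label-free abundance estimate.
    """
    area = 0.0
    for (rt0, i0), (rt1, i1) in zip(xic, xic[1:]):
        area += 0.5 * (i0 + i1) * (rt1 - rt0)
    return area


if __name__ == "__main__":
    # Made-up scans for illustration only; not real measurements.
    scans = [
        (0.0, [(1000.000, 0.0), (1200.500, 50.0)]),
        (1.0, [(1000.002, 100.0)]),
        (2.0, [(1000.001, 200.0), (1000.500, 999.0)]),
        (3.0, [(999.999, 100.0)]),
        (4.0, []),
    ]
    xic = extract_xic(scans, target_mz=1000.0)
    print(xic_area(xic))  # 400.0
```

Comparing the areas obtained for the same m/z in two runs gives a relative abundance estimate.
A real pipeline would typically also smooth the trace and restrict integration to detected
peak boundaries, which this sketch omits.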