paSNPg: A GBS-Based Pipeline for Protein-Associated SNP Discovery and Genotyping in Non-Model Species.

Fu, Y.B. and Dong, Y. (2015). "paSNPg: A GBS-Based Pipeline for Protein-Associated SNP Discovery and Genotyping in Non-Model Species." Journal of Proteomics & Bioinformatics. doi: 10.4172/jpb.1000368

Abstract

Genotyping-by-sequencing (GBS) has recently emerged as a feasible genomic approach for exploring genome-wide genetic variation in population and evolutionary genomic analyses of non-model species. To facilitate the acquisition of function-associated genetic variation data in natural populations, we present paSNPg, a GBS-based pipeline for protein-associated SNP (paSNP) discovery and genotyping in non-model organisms. The pipeline extends the published npGeno utility to perform three steps: assemble nuclear contigs from raw GBS sequence data, separate protein-associated contigs from all assembled contigs using published PEP (Predictions on Entire Proteomes) sequence data sets, and call paSNPs across the assayed samples from the protein-associated contigs. Tests of the pipeline with two GBS sequence data sets, from Arabidopsis thaliana and Oryza sativa, revealed its potential for exploring protein-associated genetic variation in genomic DNA of non-model species.
