Abstract
The Galaxy web-based platform for Next Generation Sequencing (NGS) data analysis provides
unprecedented opportunities to characterize, analyze and computationally visualize genomic
landscapes with limited resources. This study explores an NGS data-analysis pipeline
on the Galaxy platform, chosen for its relative accessibility, reproducibility, transparency and scalability.
Methods: Variant calling and associated workflows were executed on pooled NGS (Pool-seq) data from 12 Pakistani
Teddy goats. The pipeline used FastQC for quality checks, Trimmomatic for read trimming,
SAM/BAM tools for file format conversion, Picard tools for marking duplicates, VCFtools/FreeBayes for
genomic variant detection and SnpSift for variant annotation.
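The tool chain named in the Methods can be written down as an ordered list of stages. The following is a minimal sketch, not the authors' Galaxy workflow: the `Stage` type and the `describe` helper are hypothetical names introduced for illustration, and the stage order simply follows the order in which the tools are listed, which is assumed to match the execution order. No read aligner is listed because the abstract does not name one.

```python
from typing import List, NamedTuple


class Stage(NamedTuple):
    """One step of the pipeline: the tool and what it is used for."""
    tool: str
    purpose: str


# Order follows the Methods sentence; treating it as the execution
# order is an assumption, not something the abstract states.
PIPELINE: List[Stage] = [
    Stage("FastQC", "quality checks"),
    Stage("Trimmomatic", "read trimming"),
    Stage("SAM/BAM tools", "file format conversion"),
    Stage("Picard tools", "marking duplicates"),
    Stage("VCFtools/FreeBayes", "genomic variant detection"),
    Stage("SnpSift", "variant annotation"),
]


def describe(pipeline: List[Stage]) -> List[str]:
    """Return one numbered, human-readable line per stage."""
    return [f"{i}. {s.tool}: {s.purpose}" for i, s in enumerate(pipeline, 1)]


for line in describe(PIPELINE):
    print(line)
```

Keeping the stages as plain data makes it easy to compare this list against the step order of an actual Galaxy workflow export.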
Results: A total of 43,712 highly associated, functionally non-trivial loci carrying 87,510 alleles were retained after filtering. In addition,
1,548 variants comprising 1,134 SNPs, 23 mixed variants, 76 MNPs, 183 insertions and 132 deletions were observed in
the Teddy breed relative to the San Clemente ARS1 reference genome. Furthermore, 1,283 homozygous and 265
heterozygous variants were identified out of 43,447 loci. These variants are likely to underlie the general
phenotypic traits of the Teddy breed, such as smaller body size, tender meat and agility, along with other breed-specific
traits.
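The variant categories reported above can be illustrated with a short sketch that classifies a variant from its reference and alternate alleles and checks that the reported counts are internally consistent. The classification rules in the hypothetical `classify` function are a common length-based convention assumed here for illustration; they are not necessarily the exact definitions used by FreeBayes or SnpSift. The counts are taken directly from the Results.

```python
from typing import Sequence


def classify(ref: str, alts: Sequence[str]) -> str:
    """Classify a variant by comparing allele lengths (assumed convention)."""
    if not alts:
        raise ValueError("at least one alternate allele is required")
    kinds = set()
    for alt in alts:
        if len(alt) == len(ref):
            # Same length: single-base or multi-base substitution.
            kinds.add("SNP" if len(ref) == 1 else "MNP")
        elif len(alt) > len(ref):
            kinds.add("insertion")
        else:
            kinds.add("deletion")
    # Alternate alleles of different kinds at one site count as "mixed".
    return kinds.pop() if len(kinds) == 1 else "mixed"


# Counts as reported in the Results.
REPORTED_TYPES = {
    "SNP": 1134, "mixed": 23, "MNP": 76, "insertion": 183, "deletion": 132,
}
REPORTED_ZYGOSITY = {"homozygous": 1283, "heterozygous": 265}
TOTAL_VARIANTS = 1548

# Both breakdowns should add up to the reported total of variants.
assert sum(REPORTED_TYPES.values()) == TOTAL_VARIANTS
assert sum(REPORTED_ZYGOSITY.values()) == TOTAL_VARIANTS

print(classify("A", ["G"]))        # single-base substitution
print(classify("A", ["G", "AT"]))  # substitution plus insertion at one site
```

Both the per-type and the per-zygosity breakdowns sum to the 1,548 variants stated in the text.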
Conclusion: Galaxy fulfills the core functions of reproducibility and easy accessibility by closing the gap
between large-scale data analysis and its interpretation. This variant calling pipeline reveals the genomic differences
underlying Teddy-specific characteristics as compared with the ARS1 reference genome.
Rashid Saif, Aniqa Ejaz, Tania Mehmood, Fatima Asif, Suliman Mohammad Alghanem, Talha Saleem Ahmad. (2019) Introduction to Galaxy Platform for NGS Variant Calling Pipeline, Advancements in Life Sciences, Volume 7, Issue 3.