Course Overview
This course offers a comprehensive introduction to Next-Generation Sequencing (NGS) and whole exome sequencing (WES) with a focus on Python-based automated variant calling pipelines. Designed for students, researchers, and early-career professionals, it provides both theoretical knowledge and practical hands-on training to process and analyze exome sequencing data efficiently. Participants will learn to identify and interpret genomic variations, including single nucleotide polymorphisms (SNPs) and insertions/deletions (indels), using Python scripts and industry-standard bioinformatics tools.
The course begins with a foundational introduction to NGS and WES, covering sequencing platforms and technologies along with their advantages and limitations. Participants learn how to obtain raw exome datasets for an organism or disease context of interest and how to reason about key concepts such as haploid vs. diploid organisms, ploidy in disease research, germline vs. somatic mutations, and the significance of SNPs, structural variations, and copy number variations (CNVs).
The practical component focuses on Python-based workflows: installing Anaconda, setting up computational environments, and building automated pipelines for variant calling. Students perform quality control, read trimming, mapping against a reference genome, post-alignment processing, and variant identification to generate reliable Variant Call Format (VCF) files. Hands-on sessions cover tools such as GATK, FreeBayes, VCFtools, SnpSift, SnpEff, and the Ensembl Variant Effect Predictor (VEP) for calling, filtering, annotating, and predicting the effects of variants.
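The workflow described above can be sketched as a small Python function that assembles one command line per stage. This is a minimal sketch, not the course's actual pipeline: FastQC, fastp, BWA-MEM, and samtools are assumed stand-ins for the quality control, trimming, mapping, and sorting steps (the description does not name tools for them), while GATK is among the tools listed. The function name `build_wes_commands`, the file names, and the output layout are hypothetical.

```python
import shlex
from pathlib import Path


def build_wes_commands(sample, reference, fastq_r1, fastq_r2,
                       outdir="results", threads=4):
    """Return the pipeline as ordered (step, argv) pairs; nothing is executed."""
    out = Path(outdir)
    trimmed_r1 = str(out / f"{sample}_R1.trimmed.fastq.gz")
    trimmed_r2 = str(out / f"{sample}_R2.trimmed.fastq.gz")
    sam = str(out / f"{sample}.sam")
    sorted_bam = str(out / f"{sample}.sorted.bam")
    dedup_bam = str(out / f"{sample}.dedup.bam")
    metrics = str(out / f"{sample}.dup_metrics.txt")
    vcf = str(out / f"{sample}.vcf.gz")
    # GATK requires read-group information; bwa expands the literal "\t".
    read_group = f"@RG\\tID:{sample}\\tSM:{sample}\\tPL:ILLUMINA"

    return [
        ("quality_control", ["fastqc", fastq_r1, fastq_r2, "-o", str(out)]),
        ("trimming", ["fastp", "-i", fastq_r1, "-I", fastq_r2,
                      "-o", trimmed_r1, "-O", trimmed_r2]),
        ("mapping", ["bwa", "mem", "-t", str(threads), "-R", read_group,
                     "-o", sam, reference, trimmed_r1, trimmed_r2]),
        ("sorting", ["samtools", "sort", "-o", sorted_bam, sam]),
        ("duplicate_marking", ["gatk", "MarkDuplicates", "-I", sorted_bam,
                               "-O", dedup_bam, "-M", metrics]),
        ("indexing", ["samtools", "index", dedup_bam]),
        ("variant_calling", ["gatk", "HaplotypeCaller", "-R", reference,
                             "-I", dedup_bam, "-O", vcf]),
    ]


if __name__ == "__main__":
    # Hypothetical sample and file names, printed for inspection only.
    for step, argv in build_wes_commands("sample1", "ref.fa",
                                         "sample1_R1.fastq.gz",
                                         "sample1_R2.fastq.gz"):
        print(f"{step}: {shlex.join(argv)}")
```

To run the steps rather than print them, each `argv` list could be passed to `subprocess.run(argv, check=True)`. This assumes the tools are installed, the output directory exists, and the reference has already been indexed for BWA, samtools, and GATK.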
The final segment emphasizes downstream functional analysis, teaching participants how to perform gene ontology (GO) enrichment, pathway analysis using KEGG, PANTHER, and Reactome, and data retrieval from genomic repositories such as ArrayExpress, Gene Expression Omnibus (GEO), and other NCBI databases. Students also gain experience working with genome annotation and alignment files (BED, GTF/GFF, SAM/BAM), so that they understand the complete WES workflow from raw reads to interpretation.
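Enrichment analysis of this kind is commonly framed as an over-representation test: given a list of genes carrying variants, is a GO term or pathway represented more often than chance would predict? A minimal sketch using the hypergeometric distribution (equivalent to a one-sided Fisher's exact test) follows. The function name and the example counts are hypothetical, and the services named above may use different statistics or background sets.

```python
from math import comb


def enrichment_pvalue(background_size, term_size, list_size, overlap):
    """P(X >= overlap) under the hypergeometric distribution.

    background_size: genes in the background (e.g. all annotated genes)
    term_size:       background genes annotated to the term or pathway
    list_size:       genes in the study list (e.g. genes with variants)
    overlap:         study-list genes annotated to the term
    """
    upper = min(term_size, list_size)
    # Count the ways to draw at least `overlap` annotated genes.
    favourable = sum(
        comb(term_size, i) * comb(background_size - term_size, list_size - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / comb(background_size, list_size)


if __name__ == "__main__":
    # Hypothetical counts: 20,000 background genes, 150 in the term,
    # 300 genes in the study list, 9 of them annotated to the term.
    p = enrichment_pvalue(20000, 150, 300, 9)
    print(f"p-value: {p:.3g}")
```

In practice many terms are tested at once, so the resulting p-values would need a multiple-testing correction such as Benjamini-Hochberg before interpretation.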
What You'll Learn
- Fundamentals of NGS and whole exome sequencing
- Retrieval and management of raw WES datasets
- Quality control, trimming, and mapping of reads
- Variant calling (SNPs, indels, CNVs, structural variants) using Python-based pipelines
- Filtering, annotation, and functional effect prediction of variants
- Development of fully automated Python pipelines for WES analysis
- Downstream functional enrichment analysis: GO and pathway analysis
- Handling genome annotation and alignment files (BED, GTF/GFF, SAM/BAM)
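As a small illustration of the variant filtering and SNP/indel distinction listed above, the following sketch reads VCF data lines, keeps records above a quality threshold, and labels each alternate allele. It is a simplified example that uses only the fixed VCF columns; the threshold of 30 and the example records are arbitrary assumptions, and dedicated tools such as VCFtools or SnpSift handle far more cases.

```python
def classify_variant(ref, alt):
    """Label an allele pair as 'SNP', 'indel', or 'other'."""
    if alt.startswith("<") or alt == "*":
        return "other"  # symbolic or spanning-deletion allele
    if len(ref) == 1 and len(alt) == 1:
        return "SNP"
    if len(ref) != len(alt):
        return "indel"
    return "other"  # e.g. multi-nucleotide substitution


def filter_vcf_lines(lines, min_qual=30.0):
    """Return (chrom, pos, ref, alt, kind) for records with QUAL >= min_qual."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header lines
        # Fixed VCF columns: CHROM, POS, ID, REF, ALT, QUAL, ...
        chrom, pos, _id, ref, alt_field, qual = line.rstrip("\n").split("\t")[:6]
        if qual == "." or float(qual) < min_qual:
            continue  # missing or low quality
        for alt in alt_field.split(","):  # one entry per alternate allele
            kept.append((chrom, int(pos), ref, alt, classify_variant(ref, alt)))
    return kept


if __name__ == "__main__":
    # Made-up records for illustration only.
    example = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
        "chr1\t100\t.\tA\tG\t50\tPASS\t.",
        "chr1\t200\t.\tAT\tA\t45\tPASS\t.",
        "chr1\t300\t.\tC\tT\t10\tPASS\t.",
    ]
    for record in filter_vcf_lines(example):
        print(record)
```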
Who Should Enroll
This course is suitable for students, graduates, and researchers in Bioinformatics, Biotechnology, Molecular Biology, Genetics, and related life sciences seeking practical expertise in automated variant calling and advanced WES data analysis using Python.