Bioinformatics and Genomics

Mission and Principles

Mission

Penn State’s Graduate Program in Bioinformatics and Genomics trains scientists to work at the intersection of biology and computer science/statistics. We teach students to generate, analyze, and use large and complex biological data so they can pursue answers to some of the most pressing questions of our time. Students’ joint fluency in the contemporary life sciences and in the use of computational and statistical tools allows them to keep pace with the quickly evolving landscape of high-throughput “omics” technologies and gives them a unique interdisciplinary advantage in the job market.

Core Principles

The Bioinformatics and Genomics program instills a high level of scientific integrity in its graduate students. This training helps students grow into scientists of the highest caliber who are well respected in their fields. Specifically, students in the program are expected to produce research that is:

  • Transparent: All steps in data acquisition, processing, and analysis must be clearly described using documented methods and freely available tools. 
  • Reproducible: All steps in data acquisition, processing, and analysis must be repeatable by independent parties and must generate equivalent results.
  • Statistically Sound: Results must be reached by means of sound and appropriate statistical methodology.
  • Robust: Results must be robust to arbitrary choices in the data acquisition, processing, and analysis pipeline. 
  • Efficient: Computational and statistical tools intended for community use must be optimized for running time and memory usage.
  • Valid: Conclusions drawn from results must improve our understanding of processes and mechanisms, and thus yield biological insights. Conclusions must also be amenable to experimental validation in biological systems.

A set of courses and training activities is designed to train students in the following areas:

  1. Foundations of genomics, molecular genetics
  2. Sequencing technologies, genome assembly, alignments, read mapping
  3. Basic programming and scripting for bioinformatics
  4. Algorithm development in bioinformatics
  5. Applied statistics, and statistical methods for “omics” data
  6. Competence in R and equivalent statistical software
  7. Transcriptome analyses and techniques (microarray, RNA-seq)
  8. Comparative genomics, molecular evolution, function inferred from signatures of negative and positive selection
  9. Finding and functional analysis of protein-coding genes
  10. Genome variation, mutagenesis, connections to phenotypes
  11. Genome mapping, Mendelian inheritance of genes and DNA markers