Tools | Bioinformatics and Genomics | The Huck Institutes

The Roar supercomputer is Penn State's high-performance research cloud, managed by the Institute for Computational and Data Sciences (ICDS). Free and paid allocations are available to Penn State researchers, including participants in the BG program. More information and instructions for access are available on the ICDS website.

A further list of bioinformatics tools, platforms, and software developed by Penn State researchers for biological data analysis is available below.

Galaxy, an open, web-based platform for accessible, reproducible and transparent computational biomedical research

Biostars-Bioinformatics Explained, a forum to explore bioinformatics, computational genomics and biological data analyses

Biostars-Galaxy Explained, a forum to explore Galaxy

Neurostars, a forum for engaging the neuroinformatics community

Bioconductor User Support

PyBlue, a simple static site generator

Genetrack, a bioinformatics software package for sorting, querying and visualizing interval-oriented data

BooleanNet, a Boolean network simulation software for life science

Genome browser with erythroid transcription factor occupancy and other features of gene regulation, genome-wide in mouse

KmerGenie, k-mer size selection for genome assembly

TwoPaCo, de Bruijn graph construction from complete genomes

bcalm, de Bruijn graph compaction in low memory

FlowgramFixer, base caller for IonTorrent sequencing data

SPRITE, a parallel SNP detection pipeline

FASCIA, parallel subgraph counting for determining approximate counts of tree-structured subgraphs in large networks

BEAM (Source code), BEAM2 (Source code), BEAM3 (Source code; compiling requires the GNU Scientific Library) and BEAMimpute, for SNP-SNP interaction association mapping

PASS, PASS2 (Source code), peak calling in ChIP data based on Poisson de-clumping; controls FWER and FDR

GPASS, for detecting SNP-disease associations in case-control studies

dCaP, a joint peak caller and differential binding detector for ChIP-Seq data in multiple samples

DBM, a dynamic Bayesian Markov model for genotype calling, haplotype inference, de novo inference of population structure and local admixture for next-gen sequencing data

TIPS, a tree-based Bayesian detection method for subtle population structures

CHB, coalescence-guided Bayesian inference of haplotypes from genotype data

EulerAlign, alignment of DNA sequences using Eulerian graphs

MultiGPS, a framework for analyzing collections of multi-condition ChIP-seq datasets

STAMP, a webserver resource for aligning transcription factor DNA binding motifs

PipMaker and MultiPipMaker server software (bzipped tar file of source code; beta version; latest release: 2011-Aug-12)

LASTZ alignment program (latest release: 1.02.00, 2010-Jan-12)

VennGenerator (latest release: 2009-Jul-23)

DIAL (gzipped tar file of source code; latest release: 2011-Jun-06)

YASRA, Yet Another Short Read Assembler (gzipped tar file of source code; latest release: 2014-Mar-27)

CHAP (fast version; gzipped tar file; 71 Mb; 2011-Aug-02)

CHAP 2 (link to GitHub)

StructureFold, at Galaxy, for RNA secondary structure mapping and reconstruction

ShortStack, for comprehensive annotation and quantification of small RNA genes

PrISE: Prediction of protein-protein Interface residues using Structural Elements

EnsembleGly: glycosylation site prediction

PRIDB: The Protein-RNA Interaction Database

ProtinDB - PROTein-protein INterface residues Data Base

INDUS - INtelligent Data Understanding System

Flynotyper, a quantitative tool for functional genetic analysis in D. melanogaster

Phenogram, for creating chromosomal ideograms

PheWAS-View, for visually integrating PheWAS results

PLATO, a Platform for the Analysis, Translation and Organization of large-scale data

Synthesis-View, for data visualization

Spectrum/EPP: Estimation and Projection Package is used to estimate and project adult HIV prevalence and incidence from surveillance data.

IMIS: R package for Incremental Mixture Importance Sampling. Reference: Raftery and Bao (2010) Biometrics.

LiBaC: Identifies positively selected sites when the process of evolution is highly heterogeneous among sites. Reference: Bao, Gu, Dunn and Bielawski (2008) Molecular Biology and Evolution.

Precursor Identifier, identifies biomass precursors that are not produced upon essential (synthetic lethal) gene deletion

OptCom, a comprehensive modeling framework for the flux balance analysis of microbial communities

OptForce, identifies the minimal set of genetic interventions that shape the metabolism of a microorganism

SL Finder, identifies synthetic lethal genes or reactions in genome-scale metabolic models

EMU generator, Elementary Metabolite Unit generation code for isotope mapping models

GrowMatch, reconciling in silico predictions with in vivo growth observations

GapFind/GapFill, identifying and filling network gaps for genome-scale metabolic models

OptKnock, strain redesign for overproduction using gene/reaction deletions

IPRO, integrated environment for various protein engineering tasks

MAPs, a database of Modular Antibody Parts for predicting and designing antibody variable domains

OptZyme, enzyme redesign through the use of transition state analogues

OptCDR, de novo design of antibody Complementarity Determining Regions for binding targeted epitopes in antigens

eShuffle, prediction of crossover distributions using DNA shuffling