The program's major thrusts are Big Data Analytics, Biomedical Informatics, and Translational Health Informatics.

The curriculum for the B2D2K program includes (optional) bridge, required and elective courses designed to provide specialized training. These are supplemented by participation in seminars, symposia, responsible conduct of research and professional development activities.

Bridge Courses (optional)

Trainees are expected to enter the program with strong undergraduate preparation in quantitative or life sciences disciplines. For life science students, two bridge courses will be offered:

Elements of Algorithmic Thinking

This course provides an introduction to computational problem solving and programming. It covers elements of data structures, algorithms, and complexity, as well as programming in Python and hands-on data analysis exercises.

Elements of Statistical Thinking and Data Analysis

This course teaches probability, sampling techniques, data summarization, common sampling distributions, statistical inference and hypothesis testing, regression, and non-parametric inference.

Required Courses (12 Credits)

Machine Learning and Predictive Modeling: Statistical and algorithmic models of learning; data representation, feature selection, dimensionality reduction; probabilistic generative models (including Bayesian networks, dynamic Bayesian networks, topic models) and kernel methods for clustering, classification and regression; time series analysis. Applications. Practicum.

Data Management: Relational Databases, RDF and RDFS, OWL and Description Logics, Biomedical Ontologies (GO, ICD-10). Applications. Practicum.

Data Privacy: Statistical methods for data privacy, confidentiality, and disclosure limitation; computational methods (differential privacy, cryptography, k-anonymity); privacy-preserving data analytics. Applications. Practicum.

Big Data Analytics: Data analytics at scale. Hadoop, MapReduce, Scala, Spark and other tools for Big Data analytics. Practicum.


Recommended Biomedical and Life Sciences Electives (9 Credits)

B2D2K trainees will select 3 Biomedical and Life Sciences electives based on their research interests and recommendations from their co-major professors and thesis committee. Examples include: Genomics; Foundations in Data Driven Life Sciences; Statistical Analysis of Genomic Data; Systems Biology and Networks; Functional Genomics; Applied Bioinformatics; Human Development Across the Lifespan; Biological Systems in Developmental Context; Human Development Intervention; Design and Evaluation of Prevention and Health Promotion Programs; Best Practices in Preventive Intervention; Multivariate Study of Change and Human Development; Methods of Statistical Analysis in Human Development; Strategies for Data Analysis in Developmental Research; Measurement in Human Development; Social Epidemiology; Observational Methodologies for Development; Dynamical Systems Analysis for Behavioral Sciences; Person-Specific Data Analysis.

Recommended Advanced Data Sciences Electives (9 Credits)

Upon completion of the Core coursework, B2D2K trainees will select 3 Advanced Data Sciences electives based on their research interests and recommendations from their co-major professors and thesis committee. Examples include: Principles of Artificial Intelligence, Data Mining, Causal Inference, Advanced Data Mining, Network Analytics, Advanced Network Analytics, Linear Regression Methods, Design & Analysis of Experiments, Analysis of Discrete Data, Applied Time Series Analysis, Longitudinal Data Analysis, Spatial Data Analysis, Visual Analytics, Optimization for Machine Learning, Information Retrieval and Search, Statistical Inference, Nonparametric and Semiparametric Models, Multivariate Analysis, Feature Screening and Selection, Latent Class Analysis, Longitudinal Structural Equation Modeling, Parallel Computing, Advanced Topics in Data Sciences.

Modular Short Courses on Experimental and Computational Techniques

B2D2K fellows can choose from optional short modules on specific experimental or computational techniques (e.g., Molecular Biology Techniques, Statistical Methods, Programming in R or Python, High Performance Computing). These are hands-on introductions to the use of specific techniques in a laboratory setting.