Faculty Research

JUN S. LIU'S BIOINFORMATICS LAB

Biological Sequence Analysis and Motif Discovery Research Group

We focus on developing novel statistical and computational tools for recognizing patterns in protein or DNA sequences and for understanding the functional or structural relationships among these sequences. Popular algorithms resulting from this work include the Gibbs Motif Sampler for discovering regulatory binding motifs in DNA, PROBE for detecting remote protein homologs and performing multiple alignment, BayesAligner for pairwise sequence alignment, and BioProspector, a significantly improved motif-detection method.

Genetics Analysis Based on SNPs and Haplotypes

We develop methods for handling large numbers of linked SNPs and Bayesian algorithms for the fine mapping of disease genes using population linkage disequilibrium data. We have developed two novel algorithms, BLADE and Haplotyper.

Clustering Methods

We study methods for clustering high-dimensional data, both numerical and categorical, such as microarray data, sequence motif data, and molecular image data.

Advanced Monte Carlo and Optimization Methods

We work on developing novel techniques for efficient Monte Carlo computation. Topics include the design and study of novel Markov chain Monte Carlo algorithms, Monte Carlo methods for estimating normalizing constants (e.g., partition functions and Bayes factors), sequential Monte Carlo strategies, evolutionary Monte Carlo, and dynamic weighting methods.
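As a generic illustration of what estimating a normalizing constant by Monte Carlo involves, the sketch below uses plain importance sampling on a toy problem. It is not one of the group's algorithms: the target exp(-x^2/2), the Normal(0, 2^2) proposal, and all function names are assumptions chosen so that the exact answer, sqrt(2*pi), is known and can be compared against.

```python
import math
import random


def unnormalized_density(x):
    # Toy target (assumption): exp(-x^2 / 2), whose true normalizing
    # constant is sqrt(2 * pi).
    return math.exp(-0.5 * x * x)


def proposal_pdf(x, scale):
    # Density of the Normal(0, scale^2) proposal distribution.
    return math.exp(-0.5 * (x / scale) ** 2) / (scale * math.sqrt(2.0 * math.pi))


def estimate_normalizing_constant(n_samples, scale=2.0, seed=0):
    # Importance sampling: Z = E_proposal[target(x) / proposal(x)],
    # approximated by the average weight over draws from the proposal.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = rng.gauss(0.0, scale)
        total += unnormalized_density(x) / proposal_pdf(x, scale)
    return total / n_samples


estimate = estimate_normalizing_constant(100_000)
exact = math.sqrt(2.0 * math.pi)
print(f"estimate = {estimate:.4f}, exact = {exact:.4f}")
```

The proposal is deliberately wider than the target so that the importance weights stay bounded. With a proposal narrower than the target, the same estimator can have very large or infinite variance, which is one reason more refined methods are of interest.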

Single-molecule Studies and Statistics

In collaboration with Professor Xie's group in the Chemistry Department, we investigate the intricate statistical issues arising from recent experimental breakthroughs in single-molecule studies. These issues include the statistical modeling of various experimental limitations, the information content and limitations of the data, models for the underlying stochastic processes, inference on the autocorrelation curve, and discrimination among competing models.


DONALD B. RUBIN

The U.S. Census

Based on data from the 1990 Census and 1990 Post-Enumeration Survey, we are researching several methods for improving estimates of undercounting/overcounting, including strategies for downweighting influential clusters, the potential use of substitution sampling, and triple system estimation through the use of administrative records.
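For readers unfamiliar with coverage estimation, the sketch below shows the basic two-list (dual-system) estimator that underlies comparisons of a census with a post-enumeration survey. It is a simplified illustration, not the methods under study here: it assumes the two lists capture people independently, the function names are hypothetical, and the counts are made up for the example. Triple-system estimation extends the idea by adding a third list, such as administrative records.

```python
def dual_system_estimate(census_count, survey_count, matched_count):
    # Two-list (Lincoln-Petersen) estimate of the true population size:
    # N_hat = n1 * n2 / m, where m is the number of people found in both
    # the census and the survey. Assumes the two lists are independent.
    if matched_count <= 0:
        raise ValueError("matched_count must be positive")
    return census_count * survey_count / matched_count


def net_undercount_rate(census_count, population_estimate):
    # Share of the estimated population that the census missed
    # (a negative value would indicate a net overcount).
    return (population_estimate - census_count) / population_estimate


# Made-up counts for one small area (illustrative only).
n_hat = dual_system_estimate(census_count=900, survey_count=500, matched_count=450)
rate = net_undercount_rate(900, n_hat)
print(f"estimated population = {n_hat:.0f}, net undercount rate = {rate:.1%}")
```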


ALAN M. ZASLAVSKY

Statistical Methods for Health Services Research

Research is focused on the development, implementation, and analysis of CAHPS, a comprehensive program comprising a survey instrument for eliciting enrollees' reports and ratings of their health plans and the care they receive through them, a standard analysis package, and a set of templates for reporting results back to potential enrollees and purchasers. A particularly useful line of investigation has grown out of work on the implementation of CAHPS for the Medicare managed care population. Dr. Zaslavsky has developed a case-mix adjustment model for this population and recently elucidated the dimensions and sources of variability in several aspects of quality ("access", "customer service", "medical services", and "advice"). Currently, he is investigating the potential of health status measures in the same survey to quantify adverse selection. A second component of his work on quality has centered on the development and evaluation of HEDIS clinical measures of the quality of care provided by plans, primarily involving preventive services, screening, and management of chronic conditions.