Research
Monte Carlo Methods
|
With recent advances in computing power, many problems that until
recently were infeasible to solve can now be answered. In many
situations, the answers can be obtained through the use of Monte Carlo
methods. While the most popular simulation-based approach today is Markov
chain Monte Carlo (MCMC), my interests are in the area of sequential
importance sampling (SIS), also known as sequential imputation or particle
filtering. In SIS, the random variables to be sampled are decomposed
into blocks, and each block is sampled sequentially based on more and
more of the observed data. Whereas MCMC samplers generate equally
weighted but correlated samples, SIS samplers generate independent
but unequally weighted samples. MCMC and SIS can be seen as
complementary procedures. Often in problems where SIS is useful, MCMC
will be less efficient, or perhaps infeasible. The opposite also
occurs: MCMC can give answers in problems where SIS cannot.
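As a rough illustration of the block-by-block sampling described above, here is a minimal sketch of SIS for a toy state-space model. The model (an AR(1) state observed with Gaussian noise), the coefficient `phi`, the function names, and the choice of the transition density as the trial distribution are all assumptions made for this example. They are not taken from any of the applications discussed on this page.

```python
import math
import random


def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    z = (x - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def sis(observations, n_particles=2000, phi=0.9, seed=1):
    """Sequential importance sampling for a hypothetical toy model:
    x_0 ~ N(0, 1), x_t = phi * x_{t-1} + N(0, 1), y_t = x_t + N(0, 1).
    Returns the final-time particles and their normalised weights."""
    rng = random.Random(seed)
    # Each particle is generated independently of the others.
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_weights = [0.0] * n_particles
    for y in observations:
        for i in range(n_particles):
            # Sample the next block (the state at this time) from the
            # transition density, used here as the trial distribution.
            particles[i] = phi * particles[i] + rng.gauss(0.0, 1.0)
            # Update the weight using the newly included observation.
            log_weights[i] += normal_logpdf(y, particles[i], 1.0)
    # Normalise on the log scale to avoid numerical underflow.
    top = max(log_weights)
    weights = [math.exp(lw - top) for lw in log_weights]
    total = sum(weights)
    return particles, [w / total for w in weights]


particles, weights = sis([0.5, 1.0, 1.5])
# Weighted estimate of the posterior mean of the final state.
posterior_mean = sum(w * x for w, x in zip(weights, particles))
print(posterior_mean)
```

The particles are independent of one another, but their weights are unequal, which is the trade-off relative to MCMC noted above.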
An important issue with SIS is finding a useful decomposition of the
probability structure, so that sampling can be done easily and quickly
and the importance sampling weights are well behaved. Usually
these are competing issues. Often the faster, easier sampling schemes
tend to have more poorly behaved importance weights, while the schemes that
have better behaved weights are slower and more difficult to implement.
The key is to find a balance between these two issues and to minimize
the standard errors of the quantities estimated from the sample.
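One common way to judge whether importance weights are well behaved is the effective sample size, (sum of w)^2 / (sum of w^2). This diagnostic is not named in the text above, so the sketch below is only one possible illustration. The standard normal target, the two normal trial distributions, and the function names are assumptions chosen for the example.

```python
import math
import random


def effective_sample_size(weights):
    """(sum w)^2 / sum(w^2): equals n for equal weights and approaches 1
    when a single weight dominates."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def importance_weights(trial_mean, trial_sd, n=5000, seed=2):
    """Unnormalised weights for a standard normal target when sampling
    from a normal trial distribution (hypothetical example)."""
    rng = random.Random(seed)
    weights = []
    for _ in range(n):
        x = rng.gauss(trial_mean, trial_sd)
        log_target = -0.5 * x * x
        z = (x - trial_mean) / trial_sd
        log_trial = -0.5 * z * z - math.log(trial_sd)
        # Shared constants cancel, so only the ratio's shape matters.
        weights.append(math.exp(log_target - log_trial))
    return weights


# A trial distribution close to the target keeps the weights nearly equal.
well_matched = effective_sample_size(importance_weights(0.0, 1.2))
# A trial distribution centred far from the target gives a few huge weights.
poorly_matched = effective_sample_size(importance_weights(3.0, 1.0))
print(round(well_matched), round(poorly_matched))
```

A low effective sample size signals that a few samples carry most of the weight, which inflates the standard errors of any weighted estimate.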
|
Statistical Genetics
|
Much of my interest in Monte Carlo methods comes from my work in
statistical genetics. A problem of great interest is how to calculate
linkage statistics in large pedigrees with a moderate to large number of
loci. Exact calculation methods break down in these situations.
For example, peeling becomes infeasible with a moderate number of loci, while
the hidden Markov model, as implemented in programs such as GENEHUNTER, cannot
handle moderately sized pedigrees. One approach that has been
successful in dealing with large simple pedigrees is SIS, which has
been implemented in the package SIMPLE.
|
Command and Control
|
The estimation of the level of threat in a battlespace is an important
consideration for a battle commander. One possible measure of threat
is the danger field, which describes the expected damage due to
explosive weapons in the battlespace (Irwin et al., 2002). The figure
on the right shows the evolution of the danger field as five tanks move
through the battlespace over a five-hour period.
A threat of particular interest to aviators is that posed by mobile
anti-aircraft launchers. Knowledge of how these launchers could be
deployed is important in the planning of airstrikes into enemy
territory. Underlying the forecasting of launcher locations are the
historical location patterns used, and how and when they change.
Changes in launcher locations can be investigated via tests based on
summaries of the launcher intensity patterns (Kornak et al., 2006),
with the intensity estimated by a kernel intensity estimator.
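For readers unfamiliar with kernel intensity estimation, the following is a minimal sketch of a Gaussian kernel estimate of a two-dimensional point-pattern intensity. The kernel choice, the bandwidth, the absence of edge correction, and the coordinates are all assumptions made for illustration. They are not the settings or data used in Kornak et al. (2006).

```python
import math


def kernel_intensity(location, points, bandwidth):
    """Gaussian kernel estimate of the intensity (expected number of
    points per unit area) at `location`. No edge correction is applied."""
    sx, sy = location
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2)
    total = 0.0
    for px, py in points:
        d2 = (sx - px) ** 2 + (sy - py) ** 2
        # Each observed point contributes one unit of mass, spread
        # around its location by the kernel.
        total += norm * math.exp(-0.5 * d2 / bandwidth ** 2)
    return total


# Hypothetical locations, in arbitrary map units.
sightings = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (4.0, 4.0)]
near_cluster = kernel_intensity((1.0, 1.0), sightings, bandwidth=0.5)
far_away = kernel_intensity((2.5, 2.5), sightings, bandwidth=0.5)
print(near_cluster, far_away)
```

Comparing such estimated surfaces between two time periods is one way to summarise how the location pattern has changed.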
|
Hierarchical Modeling
|
A powerful technique for dealing with many problems in statistics today is the
use of hierarchical models. They can be used in a wide range of problems,
as they allow complex processes to be described by simpler, more easily
understood subprocesses. For example, in pedigree analysis, one level
of the hierarchy uses Mendelian laws with models of interference to
describe how genetic material is passed through the family. The next
level of the hierarchy describes the observed trait data
conditional on the pedigree members' genetic makeup.
Another example is the danger field example mentioned in the previous section.
The movement of the tanks is described by a Markov model under the
assumption that each of the tanks has three space-time waypoints as part
of its path. Then, conditional on the tank positions at each time,
there is a model describing the possible attack locations and the
potential damage at each location.
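The two-level structure of the pedigree example can be sketched for the simplest possible case: two parents, one child, and a single biallelic locus. With one locus there is no interference to model, so the first level reduces to Mendelian segregation. The penetrance values and function names below are hypothetical and serve only to show how the levels combine.

```python
from itertools import product


def transmit(genotype):
    """Level 1 (Mendelian segregation): a parent carrying `genotype`
    copies (0, 1, or 2) of the disease allele passes on 0 or 1 copy."""
    p = genotype / 2.0
    return {0: 1.0 - p, 1: p}


def child_genotype_dist(mother, father):
    """Distribution of the child's genotype given both parents' genotypes."""
    dist = {0: 0.0, 1: 0.0, 2: 0.0}
    pairs = product(transmit(mother).items(), transmit(father).items())
    for (m, pm), (f, pf) in pairs:
        dist[m + f] += pm * pf
    return dist


# Level 2 (trait model): hypothetical probabilities of being affected
# given the number of copies of the disease allele.
PENETRANCE = {0: 0.01, 1: 0.01, 2: 0.90}


def prob_child_affected(mother, father):
    """Combine the two levels by summing over the unobserved genotype."""
    dist = child_genotype_dist(mother, father)
    return sum(p * PENETRANCE[g] for g, p in dist.items())


print(child_genotype_dist(1, 1))
print(prob_child_affected(1, 1))
```

Each level is simple on its own. The complexity of a real pedigree analysis comes from summing over the unobserved genotypes of many related individuals at many loci.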
One field where hierarchical modeling is particularly useful is that of
environmental problems. The hierarchical structure allows for
consistent modeling of the space-time structure and the interaction of
a wide range of factors. For example, hierarchical modeling has been
used for ozone forecasting in a five-state region around Lake Michigan
(McMillan et al., 2005). The hierarchical model used involves ozone
transport, meteorology (winds, temperature, air pressure, etc.), and an
ozone regime-switching scheme, which describes whether the
region is in a period of high or low ozone levels. Another example is
sea surface temperature forecasting in the tropical Pacific Ocean
(Berliner et al., 2000). Their model involves observed sea surface
temperatures, the Southern Oscillation Index (sea level pressure), and
zonal wind data. This model is based on three regimes, and the
hierarchical structure allows for regime-based forecasts. To view
their forecasts, go to http://www.stat.ohio-state.edu/~sses/collab_enso.php.