New Mathematical Challenges from Molecular Biology and Genetics (09w5062)
Organizers
Richard Durrett (Cornell University)
Ed Perkins (University of British Columbia)
Objectives
I. Phenomena in need of models
1. Regulatory sequence and gene expression evolution.
Transcription factors bind to short sequences 6-9 letters long within
a few thousand nucleotides of the start of a gene. Much information
has accumulated about transcription factors and the sequences to which
they bind, but we are only starting to understand how they evolve and
how the mechanisms depend on the organism's population size. One
motivation for this problem is to understand to what extent changes in gene
expression patterns contribute to differences between humans and
other primates. However, in order to understand microarray data
related to this question, one needs models for the change of
expression differences.
2. Gene Networks
Gene networks have traditionally been modeled by large systems of
differential equations. However, work of Albert and Othmer on the
segment polarity network in Drosophila has shown that Boolean
networks with each gene simply on (1) or off (0) can produce very
accurate models. Understanding the wiring diagram of a particular
gene network is a difficult experimental problem, but mathematics can
add insights by studying random Boolean networks and the evolution of
idealized models.
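
As a rough illustration of the kind of random Boolean network meant here, the following Python sketch builds a small network in which each gene reads a fixed number of randomly chosen input genes through a random truth table, updates all genes synchronously, and reports the transient and cycle length reached from one starting state. The network size, the number of inputs per gene, the synchronous update rule, and all function names are assumptions made for the example; it is not the segment polarity model.

```python
import random

def make_random_network(n_genes, k_inputs, rng):
    """Give each gene k distinct random input genes and a random truth table."""
    inputs = [rng.sample(range(n_genes), k_inputs) for _ in range(n_genes)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k_inputs)]
              for _ in range(n_genes)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every gene from the current states of its inputs."""
    new_state = []
    for gene_inputs, table in zip(inputs, tables):
        index = 0
        for g in gene_inputs:
            # Pack the input states (0 or 1) into a truth-table index.
            index = (index << 1) | state[g]
        new_state.append(table[index])
    return tuple(new_state)

def find_attractor(state, inputs, tables):
    """Iterate until a state repeats; return (transient length, cycle length)."""
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return seen[state], t - seen[state]

rng = random.Random(0)  # fixed seed so the run is reproducible
inputs, tables = make_random_network(n_genes=8, k_inputs=2, rng=rng)
start = tuple(rng.randint(0, 1) for _ in range(8))
transient, cycle = find_attractor(start, inputs, tables)
print("transient:", transient, "cycle length:", cycle)
```

Because the state space is finite and the update is deterministic, every trajectory must eventually enter a cycle; the distribution of such cycle lengths over random wirings is one of the quantities studied for these idealized models.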
3. Recombination Hot Spots
Recombination hot spots have been documented in various parts of the
genome by sperm typing experiments, but statistical study of
polymorphism data has suggested these may account for much of the
recombination that occurs. Mathematical work is needed to try to
obtain a rigorous understanding of the inference procedures. Some of
the well-documented hotspots in humans are not present in
chimpanzees. Others are present in some population samples but not in
others. Models are needed to study their evolution and help identify
mechanisms that govern the gain and loss of hotspots.
II. Recent useful probabilistic methodology
A. Detecting adaptive evolution from patterns of genetic variability
is an old and important problem. Recent results in the probability
community have given new insights into the footprints that the
fixation of an adaptive mutation may have in nearby regions on the
chromosome. For example, see work of Durrett and Schweinsberg in the
context of finite population Moran models, and Etheridge, Pfaffelhuber
and Wakolbinger in the context of infinite population Fisher-Wright
models.
B. New coalescent models.
The coalescents with multiple and simultaneous multiple collisions
were invented more or less as a mathematical curiosity, but play an
important role in the developments mentioned in A, and have recently
found a new motivation in the study of marine species, where the
offspring distribution has heavy tails with infinite variance. These
features invalidate the use of Kingman's coalescent and the related
Ewens sampling formula. Recent work of Birkner, Blath, Capaldo,
Etheridge, Möhle, Schweinsberg and Wakolbinger has shown how beta
coalescents give the ancestry of heavy tailed branching processes,
extending the use of Kingman's coalescent in the case of finite
variance distributions. The coalescent has been adapted to the new
setting and some first steps have been taken toward understanding
properties of samples, for example in recent preprints by Birkner and
Blath and also work of J. Berestycki, N. Berestycki and
J. Schweinsberg.
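
To make the contrast with Kingman's coalescent concrete, the sketch below computes merger rates for a beta coalescent under the standard parametrization in which the driving measure is Beta(2 - alpha, alpha) with 0 < alpha < 2. In that case, when b lineages are present, each particular set of k of them merges at rate B(k - alpha, b - k + alpha) / B(2 - alpha, alpha), where B is the beta function. The parametrization and the function names are assumptions chosen for the example, not notation taken from the papers cited above.

```python
from math import comb, exp, lgamma

def log_beta(a, b):
    """Logarithm of the beta function B(a, b) for a, b > 0."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_coalescent_rate(b, k, alpha):
    """Rate at which one particular set of k out of b lineages merges
    when the driving measure is Beta(2 - alpha, alpha)."""
    if not 2 <= k <= b:
        raise ValueError("need 2 <= k <= b")
    if not 0 < alpha < 2:
        raise ValueError("need 0 < alpha < 2")
    return exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))

def total_merger_rate(b, alpha):
    """Total rate of leaving a state with b lineages, summed over all mergers."""
    return sum(comb(b, k) * beta_coalescent_rate(b, k, alpha)
               for k in range(2, b + 1))

# Compare a heavy-tailed case with one close to the Kingman limit.
for alpha in (1.2, 1.999):
    rates = [beta_coalescent_rate(5, k, alpha) for k in range(2, 6)]
    print(alpha, [round(r, 4) for r in rates], round(total_merger_rate(5, alpha), 4))
```

As alpha approaches 2, the pairwise rate tends to 1 and the rates for mergers of three or more lineages tend to 0, which recovers Kingman's coalescent; for smaller alpha, multiple mergers carry a substantial share of the total rate.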
C. Implications of Spatial Structure.
The stepping stone model has been studied for more than 50 years, but
the last five have seen new results from probabilists which show that
"isolation by distance" in the human population shapes patterns of
variability in the human genome (see, for example, recent work in
"Genetics" by Matsen and Wakeley and also De and Durrett).
These results raise questions about the validity of tests of positive
selection and accuracy of recombination estimates in I.3 above made on
the basis of a model that assumes that the human population is
homogeneously mixing. In the other direction, new theoretical work is
needed to understand what dispersal rates and kernels will cause a
system to behave like a homogeneously mixing population.
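
One simple way to explore the effect of dispersal is to follow two ancestral lineages backward in time on a ring of demes and record when they coalesce. The Python sketch below does this under strong simplifying assumptions that are made for the example only: haploid demes of equal size, nearest-neighbor migration with a single migration probability per generation, and coalescence with probability 1/deme_size whenever both lineages sit in the same deme. The function and parameter names are hypothetical, and the ring geometry is not claimed to match the models in the work cited above.

```python
import random

def coalescence_time(n_demes, deme_size, migration_prob, start_distance,
                     rng, max_generations=10**6):
    """Generations until two lineages coalesce on a ring of demes,
    or None if they have not coalesced within max_generations."""
    pos_a, pos_b = 0, start_distance % n_demes
    for generation in range(1, max_generations + 1):
        # Each lineage independently moves to a neighboring deme.
        if rng.random() < migration_prob:
            pos_a = (pos_a + rng.choice((-1, 1))) % n_demes
        if rng.random() < migration_prob:
            pos_b = (pos_b + rng.choice((-1, 1))) % n_demes
        # Lineages in the same deme share a parent with probability 1/deme_size.
        if pos_a == pos_b and rng.random() < 1.0 / deme_size:
            return generation
    return None

def mean_coalescence_time(n_reps, **kwargs):
    """Average coalescence time over replicates that did coalesce."""
    times = [coalescence_time(**kwargs) for _ in range(n_reps)]
    times = [t for t in times if t is not None]
    return sum(times) / len(times) if times else None

rng = random.Random(1)  # fixed seed so the run is reproducible
for distance in (0, 5):
    print(distance, mean_coalescence_time(
        200, n_demes=10, deme_size=20, migration_prob=0.1,
        start_distance=distance, rng=rng, max_generations=100000))
```

Comparing the averages for different starting distances and migration probabilities gives a crude picture of when the starting separation stops mattering, which is the regime in which the system resembles a homogeneously mixing population.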





