Emerging Statistical Challenges in Genome and Translational Research (08w5062)

Arriving Sunday, June 1 and departing Friday, June 6, 2008


Jennifer Bryan (University of British Columbia)
Sandrine Dudoit (University of California, Berkeley)
Jane Fridlyand (Genentech Inc.)
Darlene Goldstein (Ecole Polytechnique Federale de Lausanne)
Sunduz Keles (University of Wisconsin, Madison)
Katherine S. Pollard (University of California, Davis)


The primary objectives of this workshop are

(1) to address emerging statistical problems
in the analysis and combination of diverse datasets arising from
genome-scale assays applied in clinical and molecular genetic research;
(2) to facilitate meaningful interactions
between the experimental biologists and research-oriented clinicians
who produce genome-scale data and the statisticians who develop
and implement appropriate analytical methodology.
Substantive collaborations between these groups are
vital for transforming the massive amount of data produced by new
technologies into important biological discoveries and translational research.

A secondary aim is to honor and celebrate the achievements and ongoing
contributions to this field of Professor Terry Speed,
who turns 65 in 2008.
We believe that an appropriate recognition is
to carry forward his first-rate example of forging productive,
hands-on collaborations between statisticians and biologists,
and that this workshop provides an effective means of doing so.

The workshop is intended to foster deeper connections between the
life science and statistical research communities and to be a forum for
(1) the dissemination of cutting-edge biotechnical and methodological developments
and (2) the identification of open data analysis problems.
The challenges include not only analyzing
genotypes, gene and protein expression, and DNA-protein interaction data,
but also relating these to phenotypic data,
such as clinical outcomes, and further relating
all of these to existing databases containing different types of meta-data.

We anticipate that this workshop will enable statisticians to
articulate theoretically grounded statistical formulations of existing and
emerging computational biological and clinical problems; create an exceptional
opportunity for exchanging ideas between the communities; and help to
shape the future of this dynamic field.
Input from life scientists is absolutely crucial for
development of relevant statistical methodologies.
We therefore target areas that are relatively
new to statisticians, as well as areas
that have already been greatly influenced by statistical approaches.

The five targeted areas for the workshop are classification of patients based on genetic and genomic data,
computational population genetics, pharmacogenomics,
emerging technologies, and data integration.
We expect that the interaction between statisticians
and biologists will lead to major advances in
the analysis and integration of genome-scale data sets and in translational research.
In addition, this relatively new and rapidly developing field
enjoys an excellent representation of both young researchers and women.

It is now well accepted that the capacity to generate genome-wide data
has far outpaced the ability to analyze and interpret them.
The rapid development of new high-throughput technologies
allows biological investigations on an ever-growing scale.
Statistical genomics has adapted well to these changes,
due to the great interest of statisticians in the methodological challenges
inherent in a quickly evolving domain.
Addressing the new statistical demands has clear relevance
for continued progress in biological and biomedical research
predicated on genome-scale assays.

Genome-scale data are rising in prominence and are rapidly becoming
critical to the study of human disease, most strikingly in cancer.
Yet without sound methodology, accompanied by computationally feasible implementations,
we risk missing, or misinterpreting, important information contained in these data.
This workshop will help to enable transformation of the vast data resources
emanating from multiple, diverse types
of high-throughput assays into realizable health benefits,
including improved diagnostics,
prognostication, risk assessment and individualized treatment.

There are several well-established computational biology conferences
where the primary quantitative discipline is computer science, not statistical science.
These meetings are often focused more narrowly on databases,
algorithmic aspects, or specific software,
and have a weaker interdisciplinary component.
Likewise, even though major statistical conferences often have
sessions on computational biology,
the audience is almost exclusively statisticians.
Opportunities for true interdisciplinary interaction are few.
Yet in this field, rapid communication between
biologists, clinicians and statisticians is absolutely vital.

Our aim is to organize a more fully interdisciplinary
workshop to address the challenges posed by the
enormous need for quantitative data integration and modeling in biology.
Although the field is very broad,
our intent is to focus on the statistical,
mathematical, and computing aspects without
losing sight of the underlying biology.
To this end, biologists have an essential role to play.
A workshop that specifically brings life scientists and statisticians
together would certainly be an important
development in the field,
and BIRS provides an unbeatable environment.