Schedule for: 18w5144 - Mathematical and Statistical Challenges in Bridging Model Development, Parameter Identification and Model Selection in the Biological Sciences

Arriving in Banff, Alberta on Sunday, November 11 and departing Friday, November 16, 2018
Sunday, November 11
16:00 - 17:30 Check-in begins at 16:00 on Sunday and is open 24 hours (Front Desk - Professional Development Centre)
18:00 - 19:30 Dinner (Vistas Dining Room)
20:00 - 22:00 Informal gathering (Corbett Hall Lounge (CH 2110))
Monday, November 12
07:00 - 08:45 Breakfast
Breakfast is served daily between 7 and 9am in the Vistas Dining Room, the top floor of the Sally Borden Building.
(Vistas Dining Room)
08:45 - 09:00 Introduction and Welcome by BIRS Staff
A brief introduction to BIRS with important logistical information, technology instruction, and opportunity for participants to ask questions.
(TCPL 201)
09:00 - 10:00 Introduction by the organisers (TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 11:00 Darren Wilkinson: Scalable algorithms for Markov process parameter inference
Inferring the parameters of continuous-time Markov process models using partial discrete-time observations is an important practical problem in many fields of scientific research. Such models are very often "intractable", in the sense that the transition kernel of the process cannot be described in closed form, and is difficult to approximate well. Nevertheless, it is often possible to forward simulate realisations of trajectories of the process using stochastic simulation. There have been a number of recent developments in the literature relevant to the parameter estimation problem, involving a mixture of approximate, sequential and Markov chain Monte Carlo methods. This talk will compare some of the different "likelihood free" algorithms that have been proposed, including sequential ABC and particle marginal Metropolis Hastings, paying particular attention to how well they scale with model complexity. Emphasis will be placed on the problem of Bayesian parameter inference for the rate constants of stochastic biochemical network models, using noisy, partial high-resolution time course data.
(TCPL 201)
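As an illustration of the forward simulation the abstract refers to, here is a minimal sketch of Gillespie's stochastic simulation algorithm in Python. The immigration-death network, the rate names `k_birth` and `k_death`, and the helper `state_at` are assumptions chosen for illustration; they are not taken from the talk.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, rng):
    """Simulate one trajectory of an immigration-death process.

    Hypothetical example network (not from the talk):
        0 -> X   with hazard k_birth
        X -> 0   with hazard k_death * x
    Returns the event times and the state after each event.
    """
    t, x = 0.0, x0
    times, states = [t], [x]
    while True:
        h_birth = k_birth
        h_death = k_death * x
        h_total = h_birth + h_death
        if h_total <= 0.0:
            break  # no reaction can fire
        # Waiting time to the next reaction is exponential with rate h_total.
        t += rng.expovariate(h_total)
        if t > t_end:
            break
        # Choose which reaction fires, with probability proportional to hazard.
        if rng.random() * h_total < h_birth:
            x += 1
        else:
            x -= 1
        times.append(t)
        states.append(x)
    return times, states


def state_at(times, states, t_obs):
    """Value of the piecewise-constant trajectory at time t_obs."""
    value = states[0]
    for t, x in zip(times, states):
        if t > t_obs:
            break
        value = x
    return value


if __name__ == "__main__":
    rng = random.Random(1)
    times, states = gillespie_birth_death(10.0, 0.5, 0, 20.0, rng)
    # Partial discrete-time observations of the simulated path.
    print([state_at(times, states, t) for t in (5.0, 10.0, 15.0, 20.0)])
```

Simulators of this kind are the only model access that likelihood-free methods need, which is why they suit models whose transition kernel has no closed form.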
11:00 - 11:30 Matthias Chung: From parameter and uncertainty estimation to optimal experimental design: challenges in biological dynamical systems inference
Inference through data and mathematical modeling is particularly challenging for biological systems with noisy data, model uncertainties, and unknown mechanisms. Here, parameter and uncertainty estimation problems are typically ill-posed, meaning solutions do not exist, are not unique, or do not depend continuously on the data. Furthermore, experimentalists face the dilemma between accuracy and costs of an experiment. In this talk we will discuss new developments for parameter and uncertainty estimation for dynamical systems as well as discuss novel techniques for optimal experimental design.
(TCPL 201)
11:30 - 12:00 Adelle Coster: Building models that encode both the known and the unknown
In biological systems, the mechanisms and the players are often largely unknown. Quantitative measurements shed light on the properties of some aspects; however, many parts remain hidden. Datasets are often sparse, and so it is not always possible to infer the underlying structure from the data alone. This is where quantitative modelling comes in; it allows us to test possible mechanistic hypotheses. I will outline some of the protocols I have been using to build and assess deterministic models of biological systems using data from multiple experimental modalities. These use the traditional least-squares approach, overlaid by simultaneous optimisation of multiple datasets. My motivation is both to understand how the biology might work under normal operating conditions and to try to identify what changes might be invoked when the system is perturbed. Stochastic models, however, need to be assessed differently to deterministic ones. I will discuss my exploration of queueing models for cellular transport and the ideas I have been exploring both to optimise and assess these systems. It is my aim to develop some protocols that are mathematically robust and that also enable the assessment of different models for perturbations in real-world systems that are often constrained by limited datasets. Thus audience participation is warmly welcomed and greatly encouraged!
(TCPL 201)
12:00 - 13:30 Lunch
Lunch is served daily between 11:30am and 1:30pm in the Vistas Dining Room, the top floor of the Sally Borden Building.
(Vistas Dining Room)
13:45 - 14:00 Group Photo
Meet in foyer of TCPL to participate in the BIRS group photo. The photograph will be taken outdoors, so dress appropriately for the weather. Please don't be late, or you might not be in the official group photo!
(TCPL 201)
14:00 - 14:30 Gary Mirams: Challenges in ion channel model calibration, selection and discrepancy
Mathematical models of ion channels were first proposed in the Nobel Prize winning work of Hodgkin and Huxley in 1952. Their generalisation to allow dependence in gating processes brings us ‘Markov state models’ with extra flexibility, but potentially a lot more parameters and possible choices for model structure in terms of the number of states the channels can occupy. This complexity brings challenges in both model training/calibration and model selection, as illustrated by the range of possible models that are proposed in the literature for the same ion channel. I will present our work on designing and testing novel information-rich experiments to facilitate model calibration and selection for the hERG/IKr potassium channel, which plays an important role when considering drug action on the heart. We compare model selection via penalising model complexity with information criteria or Bayes factors to running independent validation experiments. I will outline the ongoing challenges in choosing the best validation experiments for our models and assessing the role of model discrepancy when we come to make predictions in previously unseen and perhaps safety-critical situations.
(TCPL 201)
14:30 - 15:00 Adam MacLean: Hybrid modeling and parameter inference reveals branching constraints for kidney morphogenesis
Many organs, including the kidney, lungs, and vasculature, display intricate branching structures that are critical for function. In the kidney, initiation of branching in the ureteric bud is driven by subcellular interactions between the epithelium and mesenchymal cells that surround the tips of outgrowing branches. To capture the spatial heterogeneities observed during branching, we introduce a hybrid model with agent-based cell dynamics and a PDE description of GDNF, a crucial signaling factor for branching. Due to the simulation cost of the model, we implement a recent method, "Approximate Approximate Bayesian Computation", to enable parameter inference. By comparison to ex vivo kidney ureteric branching data, we show that GDNF-regulated growth mechanisms can explain early epithelial cell branching only if epithelial cell division depends in a switch-like way on the local growth factor concentration. These predictions refine the roles (both via proliferation control and chemotaxis) through which GDNF regulates branching morphogenesis.
(TCPL 201)
15:00 - 15:30 Coffee Break (TCPL Foyer)
15:30 - 16:00 Alexander Browning: A Bayesian sequential learning framework to parametrise a model of melanoma invasion into human skin
We present a novel framework to parameterise a mathematical model of melanoma cell invasion into human skin. Our technique uses a suite of increasingly sophisticated experimental data to sequentially estimate the proliferation rate, diffusivity and a parameter that quantifies the invasion of the cells into human skin. Our Bayesian sequential learning approach is simple to implement, computationally efficient and leads to well-defined parameter estimates. In contrast, taking a naive approach that attempts to estimate all parameters from a single set of data from the same experiment fails to produce meaningful results.
(TCPL 201)
16:00 - 16:30 Michael Plank: Spatial moment models for collective cell behaviour
Many PDE-based models of collective cell behaviour implicitly assume that the population of cells is ‘well mixed’. This is called a spatial mean-field assumption. In reality, populations often have a more complex spatial structure, such as clusters and/or spatial segregation of cells. This spatial structure is both a cause and an effect of non-local interactions among cells and can make a significant difference to model predictions about, for example, cell densities and invasion speeds. I will describe an individual-based model of collective cell behaviour that is based on interactions between pairs of cells, including a neighbour-dependent directional bias. I will then show how a continuum approximation to the individual-based model can be derived using spatial moment dynamics. This approximation tracks important features of the population spatial structure and incorporates non-local interactions that affect processes such as movement and mortality. I will show how experimental data from cell imaging experiments can be used to estimate model parameters and highlight some interesting problems.
(TCPL 201)
18:00 - 19:30 Dinner (Vistas Dining Room)
Tuesday, November 13
07:00 - 09:00 Breakfast (Vistas Dining Room)
09:00 - 09:30 Rob Deardon: Emulation-based methods for parameterizing spatial infectious disease models
Statistical inference for spatial models of infectious disease spread is often very computationally expensive. Such models are generally fitted in a Bayesian Markov chain Monte Carlo (MCMC) framework, which requires multiple calculations of what is often a computationally cumbersome likelihood function. This problem is especially severe when there are large numbers of latent variables to compute. Here, we propose two methods of inference based on so-called emulation techniques. One method consists of approximating the likelihood via a Gaussian process built using "ABC-style" summary statistics. The second method consists of approximating the likelihood directly with the Gaussian process, but using pseudo-marginal approximations to allow for latent variables such as infectious periods. These methods are set in a Bayesian MCMC context, but avoid calculation of the computationally expensive likelihood function by replacing it with the aforementioned Gaussian processes. We show that such methods can be used to infer the model parameters and underlying characteristics of spatial disease systems, and that this can be done in a much more computationally efficient manner than full Bayesian MCMC allows.
(TCPL 201)
09:30 - 10:00 Dennis Prangle: Variational inference for stochastic differential equations
A stochastic differential equation (SDE) defines a random function of time, known as a diffusion process, by describing its instantaneous behaviour. As such, SDEs are powerful modelling tools in fields such as econometrics, biology, physics and epidemiology. This talk considers the common problem where SDEs involve unknown parameters which we wish to infer from partial noisy observations of the diffusion process. Parameter inference for SDEs is challenging due to the latent diffusion process. Working with discretised SDEs, we approximate the joint posterior distribution of parameters and latent diffusion using variational inference. That is, we introduce a flexible family of approximations to the posterior distribution and use optimisation to select the member closest to the true posterior. The novelty of our approach is using a recurrent neural network to approximate the posterior for the latent diffusion conditional on the parameters. This neural network learns how to provide diffusion paths which bridge between observations in the same way the diffusion process does, conditional on particular parameter values. Overall the method provides a fast and generic approach to SDE inference. The talk will describe this method and illustrate it on population dynamics and epidemic examples. Future extensions will also be briefly discussed, including application beyond SDEs and use for model selection and improvement.
(TCPL 201)
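To make "working with discretised SDEs" concrete, the sketch below applies the standard Euler-Maruyama scheme to a generic scalar SDE dX = drift(X) dt + diffusion(X) dW. The Ornstein-Uhlenbeck example and all function and parameter names are assumptions for illustration; the abstract does not say which discretisation or models the talk uses, and the variational inference step itself is not shown.

```python
import math
import random


def euler_maruyama(drift, diffusion, x0, t_end, n_steps, rng):
    """Simulate one path of dX = drift(X) dt + diffusion(X) dW on [0, t_end].

    Returns the path as a list of n_steps + 1 values, starting at x0.
    """
    dt = t_end / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        # Brownian increment over a step of length dt: N(0, dt).
        dw = rng.gauss(0.0, math.sqrt(dt))
        x = x + drift(x) * dt + diffusion(x) * dw
        path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical Ornstein-Uhlenbeck example: mean-reversion to 1.0.
    theta, mu, sigma = 2.0, 1.0, 0.3
    path = euler_maruyama(lambda x: theta * (mu - x),
                          lambda x: sigma,
                          0.0, 5.0, 500, rng)
    print(path[-1])
```

On the discretised path, each transition is conditionally Gaussian, which is what makes a joint density over parameters and latent path available to optimise against.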
10:00 - 10:30 David Campbell: Testing for statistical parameter identifiability
Parameter identifiability describes the ability to estimate parameters from a model. Lack of identifiability can arise in a variety of ways. The model might be set up so that it’s not structurally possible to estimate parameters even with perfect data, the parameters might have impractically large confidence intervals, or the data noise might line up in such a way that the data carry no statistical information about the parameters. Lack of statistical identifiability paradoxically requires parameter estimates in order to determine if parameters are estimable. In this talk we consider methods for assessing identifiability using two strategies: an ANOVA test for identifiability and a generalized additive regression model selection approach. This is joint work with Subhash Lele and Peter Solymos from the University of Alberta.
(TCPL 201)
10:30 - 11:00 Coffee Break (TCPL Foyer)
11:00 - 11:30 Alexandre Bouchard-Cote: Bayesian computational biology
The Bayesian paradigm offers in principle a nice framework to incorporate various kinds of data, and to perform model selection, inference and prediction. However, Bayesian methods have been historically time-costly, both in terms of the analyst and in terms of computational runtime. I will present some of our recent work which alleviates the analyst's and computational cost of Bayesian analysis, thanks to advances in non-reversible Monte Carlo methods and their implementation in a probabilistic programming framework. I will also illustrate how we apply these non-reversible methods to challenging inferential problems in single-cell genomics.
(TCPL 201)
11:30 - 12:00 Discussion (TCPL 201)
12:00 - 13:30 Lunch (Vistas Dining Room)
13:30 - 14:00 Thomas Prescott: Multifidelity approaches to approximate Bayesian computation
A vital stage in the mathematical modelling of real-world systems is the calibration of a model's parameters to observed data. Likelihood-free parameter inference methods, such as approximate Bayesian computation (ABC), build Monte Carlo samples of the uncertain parameter distribution by comparing the data with large numbers of model simulations. However, the computational expense of generating these simulations forms a significant bottleneck in the practical application of such methods. In this talk we identify how simulations of cheaper, lower fidelity models have been used separately in two complementary ways to reduce the computational expense of building these samples, at the cost of introducing additional variance. We explore how these can be unified so that the cost and benefit are optimally balanced, and we characterise the optimal choice of how often to use cheap, low fidelity simulations in place of high fidelity simulations in the Monte Carlo algorithm. The resulting multifidelity ABC algorithm gives improved performance, in terms of maximising the ratio of effective sample size to computational time, over existing multifidelity and high fidelity approaches.
(TCPL 201)
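For orientation, the following is a minimal sketch of plain rejection ABC, the baseline that compares data with many model simulations. It is not the multifidelity algorithm of the talk. The exponential toy model, the uniform prior bounds, the sample-mean summary and the tolerance `epsilon` are all assumptions for illustration.

```python
import random


def simulate(theta, n, rng):
    """Forward-simulate n exponential waiting times with rate theta (toy model)."""
    return [rng.expovariate(theta) for _ in range(n)]


def summary(data):
    """Summary statistic: the sample mean."""
    return sum(data) / len(data)


def abc_rejection(observed, n_samples, epsilon, rng, prior=(0.1, 5.0)):
    """Draw n_samples approximate posterior samples by rejection ABC.

    A proposal theta ~ Uniform(prior) is kept when the simulated summary
    lies within epsilon of the observed summary.
    Returns the accepted samples and the number of simulations used.
    """
    s_obs = summary(observed)
    accepted = []
    n_sims = 0
    while len(accepted) < n_samples:
        theta = rng.uniform(*prior)
        s_sim = summary(simulate(theta, len(observed), rng))
        n_sims += 1
        if abs(s_sim - s_obs) <= epsilon:
            accepted.append(theta)
    return accepted, n_sims


if __name__ == "__main__":
    rng = random.Random(42)
    observed = simulate(2.0, 200, rng)  # synthetic data, true rate 2.0
    samples, n_sims = abc_rejection(observed, 100, 0.05, rng)
    print(sum(samples) / len(samples), n_sims)
```

The returned simulation count makes the bottleneck visible: every accepted sample costs many full simulations, and that is the cost that cheaper low fidelity simulations aim to reduce.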
14:00 - 14:30 Ramon Grima: Computationally efficient parameter estimation for gene regulatory networks
There is a large literature on methods for inferring the rate parameters of a gene regulatory network from single-cell data. The most popular and powerful methods rely on approximating the likelihood using the linear-noise or moment-closure approximations and then obtaining the posterior distribution of parameters within a Bayesian approach by means of Markov Chain Monte Carlo (MCMC). Because of the likelihood approximations made, these methods are accurate for systems with linear reaction rates or those with weakly nonlinear rates. The time taken for the MCMC to terminate and lead to a reliable posterior is typically of the order of hours, making this a computationally expensive task, especially for large data sets. In this talk I will describe a new inference approach which leads to reliable parameter estimates for nonlinear systems (such as genetic feedback loops) in a small fraction of the time taken by MCMC-based methods.
(TCPL 201)
14:30 - 15:00 Simon Cotter: Transport map-accelerated adaptive importance sampling for inverse problems of multiscale stochastic chemical networks
In this talk we will consider how transport maps can simplify the problem of sampling from complex posterior distributions arising from inverse problems in biology, using adaptive importance sampling methods. Adaptive importance sampling methods use a mixture distribution to approximate the posterior, in order to produce an efficient importance sampling scheme. However, if there are complex structures such as strong correlations or sharp ridges in the posterior, these methods require a large increase in the number of ensemble members, or they may become unstable. In this work, we investigate the use of transport maps to stabilise and speed up sampling using these methods for inverse problems of multiscale stochastic chemical networks, where the posterior is often concentrated close to a lower-dimensional manifold, without the need for an increase in the number of particles.
(TCPL 201)
15:00 - 15:30 Coffee Break (TCPL Foyer)
15:30 - 16:00 Barbel Finkenstadt: Inference for stochastic oscillators with distributed delays

The time evolution of molecular species involved in biochemical reaction networks often arises from complex stochastic processes involving many species and reaction events. Inference for such systems is profoundly challenged by the relative sparseness of experimental data, as measurements are often limited to a small subset of the participating species measured at discrete time points. The need for model reduction can be realistically achieved for oscillatory dynamics resulting from negative translational and transcriptional feedback loops (TTFLs) by the introduction of probabilistic time-delays. Although this approach yields a simplified model, inference is challenging and subject to ongoing research. The linear noise approximation (LNA) has recently been proposed to address such systems in stochastic form and will be exploited here. We develop a novel filtering approach for the LNA in stochastic systems with distributed delays, which allows the parameter values and unobserved states of a stochastic negative feedback model to be inferred from univariate time-series data. The performance of the methods is tested for simulated data. Results are obtained for real data when the model is fitted to imaging data on Cry1, a key gene involved in the mammalian central circadian clock, observed via a luciferase reporter construct in a mouse suprachiasmatic nucleus (SCN).
(TCPL 201)
16:00 - 16:30 Priscilla Greenwood: Stochastic vs. deterministic modeling in bio-science
What is a natural way to write a stochastic model starting from a deterministic dynamical system representing, say, a population or epidemic process? What are some pros and cons of stochastic vs. deterministic models in this context? A useful approximation result about stochastically sustained oscillations applies to many such models, enabling explicit computations.
(TCPL 201)
18:00 - 19:30 Dinner (Vistas Dining Room)
Wednesday, November 14
07:00 - 09:00 Breakfast (Vistas Dining Room)
09:00 - 09:30 Jonathan Dushoff: Bridging between statistics and science: Some philosophical claptrap
Generation intervals have played second fiddle in the history of dynamical disease modeling. They are needed for model fitting, but we pay little attention to them. The West African Ebola epidemic brought the importance of generation intervals into sharper focus. I will discuss some issues in interpreting and estimating generation intervals, and how they affect model fitting and forecasting.
(TCPL 201)
09:30 - 10:00 Mark Lewis: Study design and parameter estimability for spatial and temporal ecological models using data cloning
Here, we show how a new statistical computing method called data cloning can be used to inform study design by assessing the estimability of parameters under different spatial and temporal scales of sampling. A case study of parasite transmission from farmed to wild salmon highlights that assessing the estimability of ecologically relevant parameters should be a key step when designing studies in which fitting complex mechanistic models is the end goal.
(TCPL 201)
10:00 - 10:30 Aaron King: Forward-in-time phylodynamics via sequential Monte Carlo
Phylodynamics seeks to glean information about infectious disease dynamics from pathogen sequences. Most currently available methods first estimate phylogenetic trees from sequence data, then estimate a transmission model conditional on these phylogenies. Outside limited classes of models, existing methods are unable to enforce logical consistency between the model of transmission and that underlying the phylogenetic reconstruction. Such conflicts in assumptions can lead to bias in the resulting inferences. Here, I describe a general, statistically efficient, plug-and-play method to jointly estimate transmission and phylogeny. This method explicitly connects the model of transmission and the model of phylogeny so as to avoid the aforementioned inconsistency. I demonstrate the feasibility of our approach through simulation and apply it to estimate stage-specific infectiousness in a subepidemic of HIV in Detroit, Michigan.
(TCPL 201)
10:30 - 11:00 Coffee Break (TCPL Foyer)
11:00 - 11:30 Oksana Chkrebtii: Identifying individual disease dynamics in a stochastic multi-pathogen model from aggregated reports and laboratory data
Influenza and respiratory syncytial virus are the leading etiologic agents of seasonal acute respiratory infections around the world. Medical doctors usually base the diagnosis of acute respiratory infections on patients' symptoms and do not always conduct laboratory tests necessary to identify individual viruses due to cost constraints. We establish a framework that enables the identification of individual pathogen dynamics given aggregate reports and a small number of laboratory tests for influenza and respiratory syncytial virus in a sample of patients, which can be obtained at relatively small additional cost. We consider a stochastic Susceptible-Infected-Recovered model of two interacting epidemics and infer the parameters defining their relationship in a Bayesian hierarchical setting as well as the posterior trajectories of infections for each illness over multiple years from the available data. We consider a case study based on data collected from a sentinel program at a general hospital in San Luis Potosi, Mexico.
(TCPL 201)
11:30 - 12:00 Discussion (TCPL 201)
12:00 - 13:30 Lunch (Vistas Dining Room)
13:30 - 17:30 Free Afternoon (Banff National Park)
18:00 - 19:30 Dinner (Vistas Dining Room)
Thursday, November 15
07:00 - 09:00 Breakfast (Vistas Dining Room)
09:00 - 09:30 Mike Dowd: Sequential Monte Carlo approaches for inference in dynamical systems: application to spatio-temporal models of ocean biogeochemistry
State space models are widely used for integrating ecological data with dynamical models in order to estimate the system state and parameters. Many such dynamical models are stochastic with strong nonlinearities, and the observations used are frequently non-Gaussian. The consequence is that sampling-based inference must be used for many realistic state space models. The fundamental building blocks for doing this are sequential Monte Carlo methods, such as the particle filter, which provide the basis for likelihood-based methods (like multiple iterated filtering) as well as Bayesian approaches (like particle MCMC). Unfortunately, basic particle filters do not scale well as the dimensionality of the state space increases, requiring exponentially larger sample sizes. In practice, this means that purely time-dependent ecological models have been emphasized, and the spatial aspects either ignored or approximated. In this talk, I present work on the extension of state space models so that spatio-temporal dynamic ecological systems can be effectively treated. The approaches involve approximations, along with the design of novel sequential Monte Carlo approaches. The problems that motivate, and are used to illustrate, this work are the marine plankton models used in biological oceanography (the so-called PZND - phytoplankton, zooplankton, nutrient, detritus - class of models). These time-dependent models are generally embedded within ocean circulation models that provide the spatial context. Here, we will consider both one- and three-dimensional ocean models.
(TCPL 201)
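The basic particle filter mentioned in the abstract can be sketched as follows. This is a textbook bootstrap filter on a scalar AR(1) state with Gaussian observations, chosen only so the example is self-contained; the model, the parameter names `a`, `q`, `r` and the helper `simulate_ar1` are assumptions and have nothing to do with the plankton models of the talk.

```python
import math
import random


def normal_logpdf(y, mean, sd):
    """Log density of N(mean, sd^2) evaluated at y."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - 0.5 * ((y - mean) / sd) ** 2


def simulate_ar1(a, q, r, n_steps, rng):
    """Toy state space model: x_t = a x_{t-1} + N(0, q^2), y_t = x_t + N(0, r^2)."""
    x = rng.gauss(0.0, 1.0)
    xs, ys = [], []
    for _ in range(n_steps):
        x = a * x + rng.gauss(0.0, q)
        xs.append(x)
        ys.append(x + rng.gauss(0.0, r))
    return xs, ys


def bootstrap_particle_filter(ys, a, q, r, n_particles, rng):
    """Return a log-likelihood estimate and the filtered state means."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    log_lik = 0.0
    filtered_means = []
    for y in ys:
        # Propagate each particle through the state dynamics.
        particles = [a * x + rng.gauss(0.0, q) for x in particles]
        # Weight by the observation density, shifted by the max for stability.
        log_w = [normal_logpdf(y, x, r) for x in particles]
        max_lw = max(log_w)
        w = [math.exp(lw - max_lw) for lw in log_w]
        total = sum(w)
        # The average unnormalised weight estimates p(y_t | y_{1:t-1}).
        log_lik += max_lw + math.log(total / n_particles)
        filtered_means.append(sum(wi * x for wi, x in zip(w, particles)) / total)
        # Multinomial resampling.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return log_lik, filtered_means


if __name__ == "__main__":
    rng = random.Random(7)
    xs, ys = simulate_ar1(0.9, 0.5, 0.5, 50, rng)
    log_lik, means = bootstrap_particle_filter(ys, 0.9, 0.5, 0.5, 500, rng)
    print(log_lik)
```

The log-likelihood estimate returned here is the quantity that iterated filtering and particle MCMC build on. With a one-dimensional state a few hundred particles suffice; the scaling problem described in the abstract appears as the state dimension grows.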
09:30 - 10:00 Oliver Maclaren: Lessons for biological parameter estimation from large-scale engineering inverse problems
This talk is about transferring methods and lessons from the field of large-scale engineering inverse problems to the types of models used in computational and mathematical biology. This is motivated by my recent work with geothermal reservoir engineers and inverse problems experts on estimating parameter fields in large-scale simulation models of geothermal systems. Initial attempts to transfer techniques from my work on biological parameter estimation to these problems were somewhat successful, but new challenges quickly appeared. In particular, the simulation models used in these areas are particularly expensive to compute, parameters are frequently in the form of complex spatial fields, and a diverse range of regularisation types and prior models are required to capture known or suspected structures. To address these issues we have worked to a) replace expensive models with cheaper models while accounting for the associated bias and overconfidence that this can introduce, b) compute model derivatives efficiently using the adjoint method and related techniques and c) implement nonlinear parameter identifiability analysis and regularisation parameter selection methods. I will discuss how the methods and lessons I've learned in tackling these new problems might be transferred back to parameter estimation and uncertainty quantification problems in computational and mathematical biology.
(TCPL 201)
10:00 - 10:30 Coffee Break (TCPL Foyer)
10:30 - 11:00 Barbara Holland: Assessing model adequacy in molecular phylogenetics
Molecular phylogenetics is concerned with estimating the evolutionary relationships amongst species on the basis of molecular sequence data (typically aligned DNA or amino acid sequences). Statistical approaches to this problem have been developed where evolution is modelled as a Markov process acting on a tree. Given a specific tree along with parameter choices in our Markov model, these models allow us to calculate the probability of seeing any particular pattern of nucleotides at the tips of the tree (i.e. in the modern-day species). We can then calculate a likelihood score for a sequence alignment given our model; from here we can use maximum likelihood or Bayesian approaches to find choices of tree and model parameters that give the best explanation of the sequence data. This provides a nice framework for inference, but there are some interesting challenges that arise. For instance:
• Our model is a mix of combinatorial structure (the branching pattern in the tree) and continuous parameters (e.g. rates of change from one particular nucleotide to another, lengths of edges in the tree). This makes it difficult to assess the uncertainty in our estimates, i.e. what is a 95% CI for a tree?
• It is a common maxim that just because a model fits “best” doesn’t mean it fits “well”. However, it is surprisingly difficult to do anything analogous to residual diagnostics in a phylogenetic context. Our models basically give us a multinomial distribution on site patterns. Even for just 10 species there are $4^{10}$ ~ 1 million possible site patterns, and in any particular sequence alignment most will not be observed. We need more tools to help us visualise how well our models fit and where they break down.
• How sure are we that evolution is tree-like? There are many biological processes (hybridisation, recombination, lateral gene transfer) that cannot be well modelled by a tree. As soon as we move away from the tree assumption we need to think harder about model selection issues, as statistical identifiability can become a problem.
This talk will give an introduction to statistical phylogenetics and survey some of the above issues.
(TCPL 201)
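The site-pattern count quoted in the abstract can be checked directly. With four nucleotides and one nucleotide per species at each alignment site, the number of possible patterns for $n$ species is $4^n$. The symbols $n$ and $L$ (alignment length) are introduced here for illustration only.

```latex
\[
  \#\{\text{site patterns}\} = 4^{n},
  \qquad
  4^{10} = 2^{20} = 1\,048\,576 \approx 10^{6}.
\]
% An alignment with L sites can show at most L distinct patterns,
% so whenever L is much smaller than 4^n most multinomial cells are empty.
```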
11:00 - 11:30 Paul Francois: Untangling the hairball: fitness based reduction of biological networks
Complex mathematical models of interaction networks are routinely used for prediction in systems biology. However, it is difficult to reconcile network complexities with a formal understanding of their behavior. I will describe a simple procedure to reduce biological models to functional submodules, using statistical mechanics of complex systems combined with a fitness-based approach inspired by in silico evolution. I will illustrate our approach on different models of immune recognition by T cells. An intractable model of immune recognition with close to a hundred individual transition rates is reduced to a simple two-parameter model. We identify three different mechanisms for early immune recognition, and automatically discover similar functional modules in different models of the same process, allowing for model classification and comparison. Our procedure can be applied to biological networks based on rate equations using a fitness function that quantifies phenotypic performance.
(TCPL 201)
11:30 - 12:00 Discussion (TCPL 201)
12:00 - 13:30 Lunch (Vistas Dining Room)
13:30 - 14:00 Jill Gallaher: Systemic dynamics and effects from multiple metastases during adaptive therapy in prostate cancer
Despite the growing acknowledgement that heterogeneity is driving treatment failure in advanced cancers, it is not often recognized that a successful treatment must be designed with the evolutionary response of the disease in mind. Adaptive therapy is an evolutionary-based treatment strategy that aims to balance cell kill with toxicity, by exploiting the competition between the resistant and sensitive populations. The aim is to keep a constant tumor volume by adjusting the dose such that a shrinking tumor will receive a lower dose while a growing tumor will receive a higher dose. It has been shown to be effective in pre-clinical mouse studies of triple negative breast cancer and clinical trials of metastatic castrate-resistant prostate cancer. Decision-making in the clinic is based on a systemic marker of tumor burden (prostate specific antigen, PSA, in prostate cancer). However, a systemic measure of disease ignores effects from multiple distinct metastatic lesions. We use an off-lattice agent-based model calibrated to the timescales of the prostate cancer trial to investigate how the number, size and composition of multiple metastatic lesions treated with adaptive therapy affect the systemic dynamics of disease burden.
(TCPL 201)
14:00 - 14:30 Susanna Röblitz: Empirical Bayes methods for prior estimation in systems biology modelling
One of the main goals of mathematical modelling in systems biology related to medical applications is to obtain patient-specific parameterizations and model predictions. In clinical practice, however, the number of available measurements for single patients is usually limited due to time and cost restrictions. This hampers the process of making patient-specific predictions about the outcome of a treatment. On the other hand, data are often available for many patients, in particular if extensive clinical studies have been performed. Therefore, before applying Bayes’ rule separately to the data of each patient (which is typically performed using a non-informative prior), it is meaningful to use empirical Bayes methods in order to construct an informative prior from all available data. 
In the non-parametric case, the maximum likelihood estimate is known to overfit the data, an issue that is commonly tackled by regularization. However, the majority of regularizations are ad-hoc choices which lack invariance under reparametrization of the model and hence result in inconsistent estimates for equivalent models.
We introduce the empirical reference prior, a non-parametric, transformation-invariant estimator for the prior distribution, which represents a symbiosis between the objective and empirical Bayes methodologies.
We demonstrate the performance of this approach by applying it to an ordinary differential equation model for the human menstrual cycle, a typical example from systems biology modelling.
(TCPL 201)
14:30 - 15:00 Coffee Break (TCPL Foyer)
15:00 - 15:30 Jonathan Harrison: Experimental verification of a coarse-grained model predicts that production is rate-limiting for mRNA localization
Identifying bottlenecks in a cellular process indicates key targets for regulation by the cell. However, in many cases, these rate-limiting steps are not identified or well understood. mRNA localization by molecular-motor-driven transport is crucial for cell polarization, but the rate-limiting processes underlying localization are not fully understood. To make progress on this important problem, we use a combined experiment-theory approach to examine the rate-limiting steps in the localization of gurken/TGF-alpha mRNA in Drosophila egg chambers. We construct a coarse-grained model of the localization that encodes simplified descriptions of the range of steps involved, including production and transport between and within cells. Using Bayesian inference, we relate this model to quantitative single molecule fluorescence in situ hybridization data, and draw three main conclusions. First, we characterize the formation of higher order assemblies of RNA-protein complexes in the oocyte. Second, by analysing steady state behaviour in the model, we estimate the extent of the bias in transport directionality through ring canals between cells. Finally, by parameterizing the full dynamic model, we provide estimates for the rates of the different steps of localization, and predict that the rate of mRNA production, rather than transport, is rate-limiting. Together, our results strongly suggest that production is rate-limiting for gurken mRNA localization in Drosophila development, but that mRNA localization is a tightly regulated process.
(TCPL 201)
15:30 - 16:00 John Fricks: Estimating velocity from time traces of molecular motors
How does one measure the velocity of an object? Seems like a simple question. However, in cell biology – with lots of measurement error, Brownian dynamics, and attachment-detachment dynamics – this can be anything but simple. As a case-study, we will look at molecular motors, such as kinesin and dynein, which in a laboratory setting carry a cargo along a microtubule until detachment after a random time. Should we take the total displacement over the total time until detachment and average these velocities over different paths? Should the shorter paths be discounted in the analysis? Should we concatenate paths instead? Should we use a mean squared displacement analysis, and how would observational error affect this approach? These and other possible approaches will be explored.
(TCPL 201)
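Two of the options raised in the abstract above, averaging per-path velocities versus concatenating paths, can be compared on simulated data. The sketch below is only an illustration under assumed conditions: each run is modelled as drift plus Brownian noise with an exponentially distributed time to detachment, and every name and parameter value (velocity, noise level, mean run duration) is hypothetical rather than taken from the talk.

```python
import random


def simulate_runs(n_runs, velocity, sigma, mean_duration, seed=0):
    """Simulate (displacement, duration) pairs for motor runs.

    Assumed model: duration ~ Exponential(mean_duration) and
    displacement = velocity * duration + sigma * sqrt(duration) * N(0, 1).
    """
    rng = random.Random(seed)
    runs = []
    while len(runs) < n_runs:
        duration = rng.expovariate(1.0 / mean_duration)
        if duration <= 0.0:
            continue  # skip a degenerate zero-length run
        noise = sigma * duration ** 0.5 * rng.gauss(0.0, 1.0)
        runs.append((velocity * duration + noise, duration))
    return runs


def mean_of_run_velocities(runs):
    """Average the per-path velocities; short runs count as much as long ones."""
    return sum(d / t for d, t in runs) / len(runs)


def pooled_velocity(runs):
    """Concatenate paths: total displacement divided by total time."""
    return sum(d for d, _ in runs) / sum(t for _, t in runs)


if __name__ == "__main__":
    runs = simulate_runs(2000, velocity=800.0, sigma=50.0, mean_duration=1.0, seed=1)
    print("mean of per-run velocities:", mean_of_run_velocities(runs))
    print("pooled (concatenated) velocity:", pooled_velocity(runs))
```

Under this assumed model the per-path velocity of a very short run is dominated by noise, so the simple average can be far less stable than the pooled estimate. This is one way to see why the treatment of short paths matters.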
16:00 - 16:30 David Umulis: Three-dimensional finite element modeling of dynamic BMP gradient formation in zebrafish embryonic development
Bone Morphogenetic Proteins (BMPs) play a significant role in dorsal-ventral (DV) patterning of the early zebrafish embryo. BMP signaling is regulated by extracellular, intracellular, and cell membrane components. BMPs pattern the embryo during development at the same time that cells grow and divide to enclose the yolk during a process called epiboly. We developed a new three-dimensional growing finite element model to simulate the BMP patterning and epiboly process during the blastula stage. Quantitative whole mount RNAscope data of BMP2b and phosphorylated-SMAD data are collected and analyzed to precisely test the hypotheses of gradient formation in our model. We found that the growth model results in consistent spatially and temporally evolving BMP signaling dynamics within a range of biophysical parameters including a minimal rate of ligand diffusion.
(TCPL 201)
18:00 - 19:30 Dinner (Vistas Dining Room)
Friday, November 16
07:00 - 09:00 Breakfast (Vistas Dining Room)
09:00 - 10:30 Structured discussion and emergent topics (TCPL 201)
10:30 - 11:00 Coffee Break (TCPL Foyer)
11:00 - 11:30 Discussion and concluding remarks (TCPL 201)
11:30 - 12:00 Checkout by Noon
5-day workshop participants are welcome to use BIRS facilities (BIRS Coffee Lounge, TCPL and Reading Room) until 3 pm on Friday, although participants are still required to check out of the guest rooms by 12 noon.
(Front Desk - Professional Development Centre)
12:00 - 13:30 Lunch from 11:30 to 13:30 (Vistas Dining Room)