# Mathematics and physics of polymer entanglement: Emerging concepts and biomedical applications (10w5100)

Arriving in Banff, Alberta Sunday, January 10 and departing Friday January 15, 2010

## Organizers

Hue Sun Chan (University of Toronto)

Eric Rawdon (University of Saint Thomas)

Christine Soteros (University of Saskatchewan)

Lynn Zechiedrich (Baylor College of Medicine)

## Objectives

This workshop will focus on the mathematics associated with a very specific array of cutting-edge problems at the interface between the mathematical, physical, and biological sciences, namely those that show the most promise for immediate progress on questions arising from molecular biology studies of DNA and other biopolymers.

In the last decade or so, tremendous advances in the understanding of DNA behavior, including the effects of (i) storage (in viral capsids, eukaryotic nuclei, or bacterial cells), (ii) entanglement (knots and links), (iii) replication, (iv) transcription into RNA, and (v) repair and recombination (including site-specific and general), have been made by researchers working at the interface of mathematics, biology, and physics. Not only has the understanding of DNA as a biopolymer advanced rapidly, but emerging concepts, particularly from the Buck, Chan, Liu, Stasiak, and Zechiedrich groups, have reached beyond the scope of DNA to a general understanding of the previously little-explored basic relationship between the local geometry of chain juxtaposition and global topology in polymer chains. Numerical simulations of lattice models as well as continuum freely-jointed and wormlike chain models demonstrated convincingly that the degree of 'hookedness' of an observed local juxtaposition correlates well with global topological complexity and with the likelihood that a topoisomerase-like segment passage at the given juxtaposition would disentangle the chain. This is a new paradigm opening up many avenues of computational and experimental research. Moreover, these novel numerical results also suggest a wealth of questions and conjectures that may be fruitfully addressed by field theory arguments from physics and by rigorous mathematics. Indeed, during the same time, we (the proposed co-organizers) have noted a marked increase in the precision of the language used by biologists, with their incorporation of such important concepts as 'conjecture', 'hypothesis', and 'theory' following the traditional mathematical usage. An improved understanding of the languages of each of the three disciplines improves communication and, as such, mutual understanding.

Increasing the awareness of mathematicians of (i) the complexity of the biological problems, as well as (ii) the cutting-edge research results, even before they are published, will facilitate an increased understanding of biopolymers, which is the goal of this conference. Some progress on these fronts was achieved as a result of the 2007 BIRS workshop 07w5095, The Mathematics of Knotting and Linking in Polymer Physics and Molecular Biology. For the 2010 workshop we plan to include more biologists and experimentalists than before, and we expect that this will be the first opportunity for many of the invitees from different disciplines to meet each other. Thus the proposed workshop will not only enable the advancement of existing collaborations at the interface between biology, mathematics, and physics but will also encourage the development of new ones.

The conference is timely for an additional important reason. Research funding for the pure mathematical and physical sciences has decreased recently. However, together with this troubling trend, there is an increase in funding opportunities for mathematicians and physicists working at the interface of the biological sciences, perhaps particularly in regard to medically relevant research. Rich in important problems only answerable with an interdisciplinary approach, the study of DNA polymer science has had extraordinary successes quite recently, with the vast majority of these occurring at the interface of disciplines. Bringing a cadre of researchers working at the interface of polymer science to the Banff International Research Station for Mathematical Innovation and Discovery provides the opportunity to bridge these fields. We expect to forge new collaborations as well as exchange knowledge in this proposed cutting-edge conference. The specific topics to be included are:

1. Linking Number (Lk) of DNA:

Recent emerging results, particularly from the Maddox, Harris, and Zechiedrich groups, have been obtained from all-atom simulations of DNA. Whereas coarse-grained models have been extremely useful for understanding the behavior of these polymers (for example, knotting, linking, and in general how they are packed into small spaces), the next series of questions must begin to include the surprising way that the change in linking number, Lk, is manifested in DNA. The observed bimodal response of DNA to Lk shows complete collapse of the DNA helix in sequence-dependent localized regions of the biopolymer with a concomitant relaxation back to B-form DNA in the rest of the biopolymer. At the same time, for the overwound helix, elastic polymer rod models work perfectly well. Mathematically and physically, this means that, at least in the helix-unwinding direction, the assumptions of elastic rod theory are wrong, and it suggests that perhaps an asymmetric torsional potential would be physically more appropriate.

2. Electrostatics of DNA:

A highly charged polymer, DNA contains -2 net charges/base pair. Mathematical modeling of DNA has begun to include the charge, but biologically the counterions (monovalent and divalent cations, together with their respective anions) and water surrounding the DNA effectively shield the charged polymer so that DNA-DNA interactions are common. Recent advances and emerging concepts here include the finding that counterion concentrations increase with increasing Lk of DNA. These alterations in counterion concentrations, in turn, affect how the DNA is stored, replicated, metabolized, etc.

3. DNA sequence effects:

Developed mostly by the Olson group, the so-called "base-pair step parameters" provide remarkable predictive power with regard to the conformation of a DNA polymer. Future approaches should start to include not only nearest-neighbor effects, but even next-nearest-neighbor effects. How to model this mathematically and computationally is an enormous yet exciting new challenge. The DNA sequence, of course, dictates both the structural deformations that occur as a consequence of underwinding and overwinding DNA and the electrostatics. In addition, the DNA sequence, as well as Lk, counterions, and water, all come into play in the formation of the so-called "alternative secondary structure of DNA". The K. Vazquez and Yang groups have made great inroads into the understanding of these structures and how important they are for DNA. Medically, the structures that result can cause human suffering and account for several important and fairly common human diseases.

4. DNA replication factories:

Rather than free, unconstrained DNA filling up space in a cell, the proteins that replicate and transcribe DNA are in fact "fixed" in the cell in what biologists have named "factories". During replication, for example, this means that it is the DNA that moves, at a rate of 100-1000 base pairs/second. In front of the factories, the DNA will have to be transiently overwound, and this overwinding is unlikely to be allowed to adopt the geometric configuration of writhe. White's adaptation to DNA of Lk = Tw + Wr, therefore, must now be modeled to limit writhe, and mathematical considerations of variations in twist and writhe should aid in the understanding of this important biological phenomenon. During replication, the single double-helix train track in front of the factory has, behind the factory, split into two train tracks with partial gaps on one side and a nick on the other. The Schvartzman, Stasiak, and Wang groups study this process. The topological interplay between linking number and catenation is likely to be governed by mathematical and physical principles. Thus replicating DNA involves extraordinary structures with tremendous topological complications, and this is a mechanism desperately in need of improved mathematical modeling. At the same time, it is the long-established concepts in DNA topology and knot theory that have helped guide the understanding of this remarkable biopolymer. The mathematics involved includes tangle, braid, knot, link, and polymer modeling. The study of both equilibrium and kinetic aspects of DNA now includes geometric, spatial, and topological facets that may be implicated in these mechanisms, as well as the characteristics of polymers under a variety of solvent conditions. While these studies require advances in computational methods to fully illuminate the equilibrium properties, sufficient information appears already to be available to inform an understanding of experimental observations.
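
To make the writhe term in Lk = Tw + Wr concrete, the following is a minimal sketch, not the method of any of the groups named above, that estimates Wr for a closed polygonal chain by a midpoint-rule discretization of the Gauss double integral. The function names `writhe` and `figure_eight` and the test curve itself are hypothetical choices made only for illustration; the midpoint rule is a crude approximation that is reasonable only when edges are short compared with the distance between strands.

```python
import math


def writhe(points):
    """Approximate the writhe of a closed polygon (list of (x, y, z) vertices).

    Assumption: midpoint-rule discretization of the Gauss double integral,
    Wr = (1 / 4*pi) * sum over ordered edge pairs of
         (e_i x e_j) . (m_i - m_j) / |m_i - m_j|^3,
    where e_i are edge vectors and m_i are edge midpoints.
    """
    n = len(points)
    edges = [tuple(points[(i + 1) % n][k] - points[i][k] for k in range(3))
             for i in range(n)]
    mids = [tuple(0.5 * (points[(i + 1) % n][k] + points[i][k]) for k in range(3))
            for i in range(n)]
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r = tuple(mids[i][k] - mids[j][k] for k in range(3))
            dist = math.sqrt(sum(c * c for c in r))
            a, b = edges[i], edges[j]
            cross = (a[1] * b[2] - a[2] * b[1],
                     a[2] * b[0] - a[0] * b[2],
                     a[0] * b[1] - a[1] * b[0])
            total += sum(cross[k] * r[k] for k in range(3)) / dist ** 3
    # The summand is symmetric in (i, j), so the sum over unordered pairs
    # is doubled: 2 / (4 * pi) = 1 / (2 * pi).
    return total / (2.0 * math.pi)


def figure_eight(n, height):
    """Hypothetical test curve: a figure-eight whose single crossing is
    separated vertically by 2 * height."""
    return [(math.sin(t), math.sin(t) * math.cos(t), height * math.cos(t))
            for t in (2.0 * math.pi * k / n for k in range(n))]


# A planar circle has zero writhe; a lifted figure-eight does not.
circle = [(math.cos(2.0 * math.pi * k / 100), math.sin(2.0 * math.pi * k / 100), 0.0)
          for k in range(100)]
print("planar circle:", writhe(circle))
print("figure-eight:", writhe(figure_eight(400, 0.1)))
```

If writhe is suppressed in front of a factory, as the text suggests, a change in Lk must appear almost entirely as a change in twist. As a rough scale, assuming the standard value of about 10.5 base pairs per helical turn, a rate of 100-1000 base pairs/second corresponds to roughly 10 to 100 helical turns per second that must be absorbed or removed.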

5. DNA length effects:

The typical length of DNA in a cell ranges from thousands of base pairs in a virus, through ~4 megabase pairs in bacteria, to ~3 billion base pairs in mammals, or equivalently from ~10 to ~10 million Kuhn lengths. How does the length of DNA influence its topological and geometric properties such as knotting, linking, and supercoiling? Is an organism's natural length of DNA optimal in terms of minimizing the possibility of topological obstructions to vital cellular processes such as replication and transcription while maximizing the amount of information that can be stored? In order to address this kind of question, theorists investigate the length dependence of the topological and geometric properties of model polymers. For lattice models of polymers, one can obtain mathematical proofs for the limiting behavior of, for example, knotting and linking probabilities as polymer length goes to infinity (e.g. the works of Whittington, Sumners, Soteros, Orlandini, Tesi, Janse van Rensburg). Well-established statistical mechanics and field theory arguments can also be used to predict the finite-length scaling behavior of polymer properties such as the knotting probability or the average squared radius of gyration. Determining the length scale for which this scaling behavior is relevant, however, requires computer simulations and comparison to experiments. The Stella and Orlandini groups have studied lattice model ring polymers (self-avoiding polygons) in a good solvent up to length 200,000 and in a poor solvent up to length 2,000. Off-lattice, Rawdon, Millett, and Deguchi have studied polygons up to length 500. The single-molecule DNA experiments of Dietler's group have started to bridge the gap between theory and experiment regarding the metric properties of knotted versus unknotted DNA while at the same time raising new questions. In general, much work remains on both the theoretical and experimental sides in order to further bridge the gap.

The mathematical facet of this work brings together topologists, geometers, statisticians, and computational scientists. One challenge is the development of new computer methods for rapidly identifying the knots and links that are generated in the course of Monte Carlo simulations. One of the most powerful tools is the calculation of the HOMFLY polynomial. The program used for this calculation was written in 1985 for applications to knots and links with no more than 128 crossings. While it has stood the test of time well, this tool is now being used to study polygonal models with many thousands of crossings by using special pre-analysis programs attributed to Millett and to Thistlethwaite. It is time to revisit the strategies that have been employed in the smaller ranges and to develop new ones that will enable more rapid study of knotted and linked configurations as complex as those which arise in simulation and under actual physical experimental conditions. Another major challenge on the theoretical side is the need to incorporate greater complexity in the models in order to explore the effects listed in items 1-4 above and at the same time explore the length dependence of these effects.
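
The conversion between base pairs and Kuhn lengths quoted at the start of this topic can be checked with a short calculation. This is a sketch under stated assumptions: a rise of about 0.34 nm per base pair for B-form DNA and a Kuhn length of about 100 nm (twice a persistence length of about 50 nm), which gives roughly 300 base pairs per Kuhn length. The function name `kuhn_segments` and the example genome sizes are illustrative only.

```python
# Assumed standard values, not taken from the workshop text:
RISE_PER_BP_NM = 0.34   # contour length per base pair of B-form DNA
KUHN_LENGTH_NM = 100.0  # about twice the ~50 nm persistence length


def kuhn_segments(base_pairs):
    """Approximate number of Kuhn lengths in a DNA of the given size."""
    return base_pairs * RISE_PER_BP_NM / KUHN_LENGTH_NM


# Illustrative sizes matching the ranges mentioned in the text.
examples = {
    "small virus (3 kbp)": 3e3,
    "bacterium (4 Mbp)": 4e6,
    "mammal (3 Gbp)": 3e9,
}
for name, bp in examples.items():
    print(f"{name}: ~{kuhn_segments(bp):.3g} Kuhn lengths")
```

On these assumptions, the lattice polygons of length 200,000 mentioned above are far longer, in statistical segments, than a bacterial genome, while off-lattice polygons of length 500 sit between viral and bacterial scales. The mapping between a lattice step and a Kuhn length is itself model-dependent, so this comparison is only indicative.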

6. DNA confinement effects:

A principal application of the technology discussed above and any new methods developed will be to macromolecules in confined geometries, for example polymers between two parallel planes as in models of steric stabilization of dispersions or DNA molecules contained in a capsid (Arsuaga, Roca, and Sumners are leaders in this field). Macromolecules so confined exhibit significantly different average and individual structure in comparison with those in free environments. Excellent mathematical modeling of this is being done by the Rawdon, Soteros, and Whittington groups. Effective confinement also arises in the case of macromolecules that have specific hydrophobic and hydrophilic regions or when regions have restricted flexibility or torsion. While, in general, one might believe that great progress has occurred in the understanding of storing, knotting, and winding of polymers, in fact rather little is known rigorously and many fundamental questions seem just beyond our grasp, both theoretically and via numerical studies. Further effort is clearly needed and promising steps are being taken in these areas.
