# Theory and Practice of Satisfiability Solving (18w5208)

Arriving in Oaxaca, Mexico Sunday, August 26 and departing Friday August 31, 2018

## Organizers

Jakob Nordstrom (KTH Royal Institute of Technology)

Sam Buss (University of California, San Diego)

Daniel Le Berre (Université d'Artois)

Moshe Vardi (Rice University)

## Objectives

The satisfiability (SAT) problem has been a central problem in computer science for most of the existence of this field of research, and has more recently been acknowledged as a major challenge also for modern mathematics (in the form of one of the Millennium Prize Problems). In order to make progress in our understanding of this problem, a strong case can be made for the importance of increased exchange between applied SAT solving and theoretical computer science (in particular, computational complexity and proof complexity). Below, we try to outline some of the questions that have served as motivation for earlier workshops focused on the theory and practice of satisfiability solving held at BIRS in 2014, Dagstuhl in 2015, and the Fields Institute in 2016 --- questions that still remain formidable challenges. We want to stress that the list below is far from exhaustive, and in fact we believe one important outcome of a meeting in Banff in 2018 would be to stimulate the process of uncovering further questions of common interest.\\

WHAT MAKES FORMULAS HARD OR EASY IN PRACTICE?\\

The best SAT solvers today, known as conflict-driven clause learning (CDCL) solvers, are essentially searching for proofs in the resolution proof system. Given the progress during the last decade on solving large-scale instances, it is natural to ask how the spectacular success of the CDCL paradigm can be explained. And given that there are still very small formulas that resist even the most powerful CDCL solvers (even though, intriguingly, some of these formulas even have very short resolution proofs), a complementary interesting question is how one can determine whether a particular formula is infeasible or tractable. Somewhat unexpectedly, very little turns out to be known about these questions (other than that the standard worst case measure used in computational complexity does not seem to be very meaningful here).\\

Different candidate explanations have been offered in terms of parameterized complexity, proof complexity, or community structure, but while these approaches seem intuitively plausible a rigorous understanding has so far remained elusive. It seems clear that the only way CDCL solvers can solve real-world instances with millions of variables is that they somehow exploit the structure of these instances, but we would want to understand what exactly this structure is and precisely how it is exploited. As a concrete example, in real-world instances the formulas often arise from an iterative unfolding over time of a state transition function (for instance, in bounded model checking). Can anything be learned from theoretical research about such an iterative logic structure that can then be exploited in practical solvers?\\

CAN COMPUTATIONAL COMPLEXITY THEORY SHED LIGHT ON CRUCIAL SAT SOLVING ISSUES?\\

In addition to the fundamental but quite general question of understanding the hardness or easiness of proving formulas in practice, there are also many slightly more specific questions which seem to hold great potential for combined theoretical and applied research. We list some such questions below.\\

- A critical issue for modern CDCL solvers is memory management. The answers to the questions of (a) how many clauses to learn and keep, (b) how quickly to increase this limit, and (c) which clauses to throw away once the limit has been reached, are all crucial design choices. There are well-developed heuristics that seem to perform well in many practical settings, but these heuristics are poorly understood, or not understood at all, from a theoretical point of view. If we do not comprehend when or why heuristics are good or bad, this makes it hard to develop and improve them further.\\

- In addition to clause learning, the concept of restarts is empirically known to have a decisive impact on performance, but again there is scant theoretical explanation for this. It would be highly desirable to understand why restarts are so important. The reason why clause learning increases efficiency greatly is fairly clear --- without it the solver will only generate so-called tree-like proofs, and tree-like resolution is known to be exponentially weaker than general resolution. However, the question of whether frequent research actually strengthens the theoretical reasoning power, or is just a helpful way of getting a second chance to start over when some other heuristic went wrong, is still wide open.\\

- Given the advent of computers with multi-core processors, a highly topical question is whether this (rather coarse-grained) parallelization can be used to speed up SAT solving. Though efficient parallelization seems to hold out great promise for performance improvements, the actual results so far are quite disappointing. In fact, in the parallel track of the SAT competition 2016 many of the solvers were *slower* when run on multiple cores compared to a single core! This is a barrier for further adoption of SAT technology already today and will become a more substantial problem as thousands of cores and cloud computing are becoming the dominant computing platforms. A theoretical understanding on how SAT can be parallelized will be essential to develop new parallelization strategies to adapt SAT to this new computing paradigm.\\

CAN MORE EFFICIENT SAT SOLVERS BE BUILT USING STRONGER PROOF SYSTEMS THAN RESOLUTION?\\

Although the performance of CDCL SAT solvers is impressive, it is nevertheless astonishing, not to say disappointing, that the state-of-the-art solvers are still resolution-based. Resolution lies very close to the bottom in the hierarchy of propositional proof systems, and there many other proof systems based on different forms of mathematical reasoning that are known to be strictly stronger. Some of these appear to be natural candidates for serving as a basis for stronger SAT solvers than those using CDCL. \\

In particular, proof systems such as polynomial calculus (based on Groebner basis calculations) and cutting planes (based on integer linear programming) are known to be exponentially more powerful than resolution. While there has been some work on building SAT solvers on top of these proof systems, progress has been quite limited. It is hard to say whether this is due to fundamental barriers or whether it is simply a question of more time needing to be invested (comparing to the huge amount of work that has gone into resolution-based solvers ever since the 1960s). \\

One important part of the success of CDCL solvers seems to be that many different aspects of solver behaviour are governed by conflicts (i.e., when partial truth value assignments are discovered to falsify the formula) and analysis of such conflicts. This is true not only for the clause learning --- as is clear from the term “conflict-driven clause learning” --- but also for, e.g., branching and restart heuristics, and indeed it has been proposed that the proper way of referring to what these solvers do is that they perform “conflict-driven search.” In order to build efficient algebraic solvers based on Groebner bases or so-called pseudo-Boolean solvers based on integer linear programming, it would be desirable to find a good way of lifting the concept of conflict-driven search to these stronger methods of reasoning.\\

Another important part of this work would seem to be to gain a deeper theoretical understanding of the power and limitations of the underlying proof systems polynomial calculus and cutting planes. Here there are many fairly long-standing open theoretical questions. At the same time, during the last decade there has been substantial progress in proof complexity, which gives hope that the time might have come for decisive breakthroughs.

WHAT MAKES FORMULAS HARD OR EASY IN PRACTICE?\\

The best SAT solvers today, known as conflict-driven clause learning (CDCL) solvers, are essentially searching for proofs in the resolution proof system. Given the progress during the last decade on solving large-scale instances, it is natural to ask how the spectacular success of the CDCL paradigm can be explained. And given that there are still very small formulas that resist even the most powerful CDCL solvers (even though, intriguingly, some of these formulas even have very short resolution proofs), a complementary interesting question is how one can determine whether a particular formula is infeasible or tractable. Somewhat unexpectedly, very little turns out to be known about these questions (other than that the standard worst case measure used in computational complexity does not seem to be very meaningful here).\\

Different candidate explanations have been offered in terms of parameterized complexity, proof complexity, or community structure, but while these approaches seem intuitively plausible a rigorous understanding has so far remained elusive. It seems clear that the only way CDCL solvers can solve real-world instances with millions of variables is that they somehow exploit the structure of these instances, but we would want to understand what exactly this structure is and precisely how it is exploited. As a concrete example, in real-world instances the formulas often arise from an iterative unfolding over time of a state transition function (for instance, in bounded model checking). Can anything be learned from theoretical research about such an iterative logic structure that can then be exploited in practical solvers?\\

CAN COMPUTATIONAL COMPLEXITY THEORY SHED LIGHT ON CRUCIAL SAT SOLVING ISSUES?\\

In addition to the fundamental but quite general question of understanding the hardness or easiness of proving formulas in practice, there are also many slightly more specific questions which seem to hold great potential for combined theoretical and applied research. We list some such questions below.\\

- A critical issue for modern CDCL solvers is memory management. The answers to the questions of (a) how many clauses to learn and keep, (b) how quickly to increase this limit, and (c) which clauses to throw away once the limit has been reached, are all crucial design choices. There are well-developed heuristics that seem to perform well in many practical settings, but these heuristics are poorly understood, or not understood at all, from a theoretical point of view. If we do not comprehend when or why heuristics are good or bad, this makes it hard to develop and improve them further.\\

- In addition to clause learning, the concept of restarts is empirically known to have a decisive impact on performance, but again there is scant theoretical explanation for this. It would be highly desirable to understand why restarts are so important. The reason why clause learning increases efficiency greatly is fairly clear --- without it the solver will only generate so-called tree-like proofs, and tree-like resolution is known to be exponentially weaker than general resolution. However, the question of whether frequent research actually strengthens the theoretical reasoning power, or is just a helpful way of getting a second chance to start over when some other heuristic went wrong, is still wide open.\\

- Given the advent of computers with multi-core processors, a highly topical question is whether this (rather coarse-grained) parallelization can be used to speed up SAT solving. Though efficient parallelization seems to hold out great promise for performance improvements, the actual results so far are quite disappointing. In fact, in the parallel track of the SAT competition 2016 many of the solvers were *slower* when run on multiple cores compared to a single core! This is a barrier for further adoption of SAT technology already today and will become a more substantial problem as thousands of cores and cloud computing are becoming the dominant computing platforms. A theoretical understanding on how SAT can be parallelized will be essential to develop new parallelization strategies to adapt SAT to this new computing paradigm.\\

CAN MORE EFFICIENT SAT SOLVERS BE BUILT USING STRONGER PROOF SYSTEMS THAN RESOLUTION?\\

Although the performance of CDCL SAT solvers is impressive, it is nevertheless astonishing, not to say disappointing, that the state-of-the-art solvers are still resolution-based. Resolution lies very close to the bottom in the hierarchy of propositional proof systems, and there many other proof systems based on different forms of mathematical reasoning that are known to be strictly stronger. Some of these appear to be natural candidates for serving as a basis for stronger SAT solvers than those using CDCL. \\

In particular, proof systems such as polynomial calculus (based on Groebner basis calculations) and cutting planes (based on integer linear programming) are known to be exponentially more powerful than resolution. While there has been some work on building SAT solvers on top of these proof systems, progress has been quite limited. It is hard to say whether this is due to fundamental barriers or whether it is simply a question of more time needing to be invested (comparing to the huge amount of work that has gone into resolution-based solvers ever since the 1960s). \\

One important part of the success of CDCL solvers seems to be that many different aspects of solver behaviour are governed by conflicts (i.e., when partial truth value assignments are discovered to falsify the formula) and analysis of such conflicts. This is true not only for the clause learning --- as is clear from the term “conflict-driven clause learning” --- but also for, e.g., branching and restart heuristics, and indeed it has been proposed that the proper way of referring to what these solvers do is that they perform “conflict-driven search.” In order to build efficient algebraic solvers based on Groebner bases or so-called pseudo-Boolean solvers based on integer linear programming, it would be desirable to find a good way of lifting the concept of conflict-driven search to these stronger methods of reasoning.\\

Another important part of this work would seem to be to gain a deeper theoretical understanding of the power and limitations of the underlying proof systems polynomial calculus and cutting planes. Here there are many fairly long-standing open theoretical questions. At the same time, during the last decade there has been substantial progress in proof complexity, which gives hope that the time might have come for decisive breakthroughs.