The Quest for Efficient Boolean Satisfiability Solvers (lecture presentations), page 2
To satisfy a CNF formula, each clause must be satisfied individually. If there exists a clause in the formula that has all its literals assigned value 0, then the current variable assignment, or any variable assignment that contains it, will not be able to satisfy the formula. A clause that has all its literals assigned value 0 is called a conflicting clause.

    DPLL(formula, assignment) {
        necessary = deduction(formula, assignment);
        new_asgnmnt = union(necessary, assignment);
        if (is_satisfied(formula, new_asgnmnt))
            return SATISFIABLE;
        else if (is_conflicting(formula, new_asgnmnt))
            return CONFLICT;
        var = choose_free_variable(formula, new_asgnmnt);
        asgn1 = union(new_asgnmnt, assign(var, 1));
        if (DPLL(formula, asgn1) == SATISFIABLE)
            return SATISFIABLE;
        else {
            asgn2 = union(new_asgnmnt, assign(var, 0));
            return DPLL(formula, asgn2);
        }
    }

Fig. 1. The recursive description of the DPLL algorithm

Traditionally, the DPLL algorithm is written in a recursive manner, as shown in Fig. 1. Function DPLL() is called with a formula and a set of variable assignments. Function deduction() returns a set of the necessary variable assignments that can be deduced from the existing variable assignments. The recursion ends if the formula is either satisfied (i.e. evaluates to 1, or true) or unsatisfied (i.e. evaluates to 0, or false) under the current variable assignment. Otherwise, the algorithm chooses an unassigned variable from the formula and branches on it for both phases.
The solution process begins by calling the function DPLL() with an empty set of variable assignments.

In [25], the authors generalized many of the actual implementations of various solvers based on DPLL and rewrote the algorithm in an iterative manner, as shown in Fig. 2. The algorithm described in Fig. 2 improves on the algorithm in Fig. 1 in that it allows the solver to backtrack non-chronologically, as we will see in the following sections. Different solvers based on DPLL differ mainly in the detailed implementation of each of the functions shown in Fig. 2. We will use the framework of Fig. 2 as the foundation for the discussions that follow.

The algorithm described in Fig. 2 is a branch-and-search algorithm. Initially, none of the variables is assigned a value. We call unassigned variables free variables. First the solver does some preprocessing on the instance to be solved, performed by the function preprocess() in Fig. 2.
If preprocessing cannot determine the outcome, the main loop begins with a branch on a free variable by assigning it a value. We call this operation a decision on a variable, and the variable will have a decision level associated with it, starting from 1 and incremented with subsequent decisions. This is done by the function decide_next_branch() in Fig. 2. After the branch, the problem is simplified as a result of this decision and its consequences.
The function deduce() performs some reasoning to determine variable assignments that are needed for the problem to be satisfiable given the current set of decisions. Variables that are assigned as a consequence of this deduction after a branch assume the same decision level as the decision variable. After the deduction, if all the clauses are satisfied, then the instance is satisfiable; if there exists a conflicting clause, then the current branch chosen cannot lead to a satisfying assignment, so the solver backtracks (i.e. undoes certain branches). Which decision level to backtrack to is determined by the function analyze_conflict(). Backtracking to level 0 indicates that even without any branching the instance is still unsatisfiable; in that case, the solver declares that the instance is unsatisfiable. Within the function analyze_conflict(), the solver may do some analysis and record some information from the current conflict in order to prune the search space in the future. This process is called conflict-driven learning. If the instance is neither satisfied nor conflicting under the current variable assignments, the solver chooses another variable to branch on and repeats the process.

    status = preprocess();
    if (status != UNKNOWN) return status;
    while (1) {
        decide_next_branch();
        while (true) {
            status = deduce();
            if (status == CONFLICT) {
                blevel = analyze_conflict();
                if (blevel == 0)
                    return UNSATISFIABLE;
                else
                    backtrack(blevel);
            }
            else if (status == SATISFIABLE)
                return SATISFIABLE;
            else break;
        }
    }

Fig. 2. The iterative description of the DPLL algorithm

3. The Components of a DPLL SAT Solver

In this section of the paper, we discuss each of the components of a DPLL SAT solver.
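As a runnable illustration of the Fig. 2 control flow, consider the following simplified Python sketch. It is not the paper's code: the clause representation (sets of signed integers) is ours, deduce() is reduced to a satisfied/conflicting/unknown status check, and analyze_conflict()/backtrack() are collapsed into chronological backtracking (flipping the most recent untried decision), so non-chronological backtracking and learning are not modeled.

```python
# A simplified iterative branch-and-search loop in the spirit of Fig. 2.
# Clause representation and all helper logic are illustrative assumptions;
# conflict analysis is replaced by plain chronological backtracking.

def solve(formula):
    n_vars = max(abs(lit) for clause in formula for lit in clause)

    def status(asg):
        if any(all(-lit in asg for lit in clause) for clause in formula):
            return "CONFLICT"
        if all(any(lit in asg for lit in clause) for clause in formula):
            return "SATISFIABLE"
        return "UNKNOWN"

    trail = []      # decision trail: (literal, already_flipped?)
    asg = set()     # current assignment, as a set of true literals
    while True:
        s = status(asg)
        if s == "SATISFIABLE":
            return True
        if s == "CONFLICT":
            # stand-in for analyze_conflict()/backtrack(): undo decisions
            # until one can still be flipped to its other phase
            while trail and trail[-1][1]:
                asg.discard(trail.pop()[0])
            if not trail:
                return False    # conflict at decision level 0: UNSAT
            lit = trail.pop()[0]
            asg.discard(lit)
            trail.append((-lit, True))
            asg.add(-lit)
        else:
            # decide_next_branch(): first free variable, positive phase first
            var = next(v for v in range(1, n_vars + 1)
                       if v not in asg and -v not in asg)
            trail.append((var, False))
            asg.add(var)
```

The nested structure mirrors the figure: the outer loop alternates decisions with the inner deduce-and-backtrack loop, and returning once the trail is empty corresponds to blevel == 0.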
Each of these components has been the subject of much scrutiny over the years. This section focuses on the main lessons learnt in this process.

3.1 The Branching Heuristics

Branching occurs in the function decide_next_branch() in Fig. 2. When no more deduction is possible, the function chooses one variable from all the free variables and assigns it a value. The importance of choosing good branching variables is well known: different branching heuristics may produce drastically different sized search trees for the same basic algorithm, thus significantly affecting the efficiency of the solver. Over the years, many different branching heuristics have been proposed by different researchers.
Not surprisingly, comparative experimental evaluations have also been done (e.g. [26, 27]). Early branching heuristics such as Bohm's Heuristic (reported in [28]), Maximum Occurrences on Minimum sized clauses (MOM) (e.g. [14]), and Jeroslow-Wang [29] can be regarded as greedy algorithms that try to make the next branch generate the largest number of implications or satisfy the most clauses.
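As a concrete instance of such a greedy estimator, the one-sided Jeroslow-Wang heuristic scores each literal l as J(l) = sum over the clauses c containing l of 2^(-|c|), so occurrences in short clauses count exponentially more, and branches on the literal with the highest score. A sketch, under our clause representation (sets of signed integers):

```python
# One-sided Jeroslow-Wang scoring: J(l) = sum of 2**(-len(c)) over the
# clauses c that contain literal l. The clause representation is an
# illustrative assumption, not from the paper.

from collections import defaultdict

def jeroslow_wang(formula):
    score = defaultdict(float)
    for clause in formula:
        weight = 2.0 ** -len(clause)      # short clauses weigh more
        for lit in clause:
            score[lit] += weight
    return max(score, key=score.__getitem__)
```

For example, on [{1, 2, 3}, {-1, 2}, {2, 3}, {-2}] the literal 2 scores 1/8 + 1/4 + 1/4 = 0.625, beating -2 at 0.5, so the heuristic branches on 2.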
All these heuristics use some function to estimate the effect of branching on each free variable, and choose the variable that has the maximum function value. These heuristics work well for certain classes of instances. However, all of the functions are based on statistics of the clause database, such as clause length. These statistics, though useful for random SAT instances, usually do not capture relevant information about structured problems.

In [26], the author proposed the use of literal count heuristics. Literal count heuristics count the number of unresolved (i.e. unsatisfied) clauses in which a given variable appears in either phase. In particular, the author found that the heuristic that chooses the variable with the dynamic largest combined sum (DLIS) of literal counts in both phases gives quite good results for the benchmarks tested. Notice that the counts are state-dependent, in the sense that different variable assignments will give different counts: whether a clause is unresolved (unsatisfied) depends on the current variable assignment. Because the count is state-dependent, each time the function decide_next_branch() is called, the counts for all the free variables need to be recalculated. As solvers become more and more efficient, calculating counts for branching comes to dominate the run time.
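A direct implementation of this literal-count scheme makes the cost visible: every call walks the entire clause database. A sketch, under our clause representation (sets of signed integers, with the assignment as a set of true literals):

```python
# State-dependent literal counting: for each free variable, count its
# occurrences (both phases combined) over the clauses not yet satisfied
# by the current assignment, and branch on the variable with the largest
# count. Recomputing from scratch on every call, as here, is exactly the
# cost that comes to dominate the run time.

from collections import Counter

def literal_count_branch(formula, assignment):
    counts = Counter()
    for clause in formula:
        if any(lit in assignment for lit in clause):
            continue                      # clause resolved (satisfied): skip
        for lit in clause:
            var = abs(lit)
            if var not in assignment and -var not in assignment:
                counts[var] += 1          # both phases counted together
    return counts.most_common(1)[0][0] if counts else None
```

Because the counts depend on which clauses are currently satisfied, the result changes as the assignment grows: on [{1, 2}, {-1, 2}, {-2, 3}] the empty assignment selects variable 2, but once 2 is assigned true the first two clauses drop out and variable 3 is selected instead.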
Therefore, more efficient and effective branching heuristics are needed. In [20], the authors proposed the heuristic called Variable State Independent Decaying Sum (VSIDS). VSIDS keeps a score for each phase of a variable. Initially, the scores are the number of occurrences of a literal in the initial problem. Because modern SAT solvers have a learning mechanism, clauses are added to the clause database as the search progresses. VSIDS increases the score of a variable by a constant whenever an added clause contains the variable. Moreover, as the search progresses, periodically all the scores are divided by a constant number. In effect, the VSIDS score is a literal occurrence count with higher weight on the more recently added clauses.
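The bookkeeping just described can be sketched as a small score keeper: one score per literal (i.e. per phase of a variable), initialized from occurrence counts, bumped when a learned clause contains that literal, and periodically decayed. The bump amount, decay divisor, and decay period below are illustrative placeholders, not values from [20], and combining the two phase scores when selecting a branch variable is one common reading of the scheme.

```python
# A schematic VSIDS score keeper. All numeric parameters are illustrative
# placeholders; real solvers tune the bump, decay, and period carefully.

from collections import defaultdict

class VSIDS:
    def __init__(self, formula, bump=1.0, decay=2.0, period=256):
        self.score = defaultdict(float)
        for clause in formula:            # initial scores: occurrence counts
            for lit in clause:
                self.score[lit] += 1.0
        self.bump, self.decay, self.period = bump, decay, period
        self.conflicts = 0

    def on_learned_clause(self, clause):
        for lit in clause:                # bump literals of the added clause
            self.score[lit] += self.bump
        self.conflicts += 1
        if self.conflicts % self.period == 0:
            for lit in self.score:        # periodic decay favours recency
                self.score[lit] /= self.decay

    def pick(self, assignment):
        """Free variable with the highest combined (both-phase) score."""
        free = {abs(lit) for lit in self.score
                if abs(lit) not in assignment and -abs(lit) not in assignment}
        return max(free,
                   key=lambda v: self.score.get(v, 0.0) + self.score.get(-v, 0.0),
                   default=None)
```

Because scores change only when clauses are learned or decayed, not when variables are assigned, pick() never needs a recount over the clause database, which is the point of state independence.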
VSIDS chooses the free variable with the highest combined score to branch on. Experiments show that VSIDS is quite competitive compared with other branching heuristics on the number of branches needed to solve a problem. Because VSIDS is state-independent (i.e. scores do not depend on the variable assignments), it is cheap to maintain. Experiments show that the decision procedure using VSIDS takes a very small percentage of the total run time, even for problems with millions of variables. More recently, [21] proposed another decision scheme that pushes the idea of VSIDS further. Like VSIDS, the decision strategy tries to decide on the variables that are “active recently”.
In VSIDS, the activity of a variable is captured by a score related to the literal's occurrences. In [21], the authors propose to capture activity by conflicts. More precisely, when a conflict occurs, all the literals in the clauses that are responsible for the conflict have their scores increased. A clause is responsible for a conflict if it is involved in the resolution process of generating the learned clauses (described in the following sections). In VSIDS, the focus on the “recent” is captured by decaying the scores periodically. In [21], the scores are also decayed periodically. Moreover, the decision heuristic limits the decision variable to be among the literals that occur in the last added clause that is unresolved. The experiments seem to indicate that the new decision scheme is more robust compared with VSIDS on the benchmarks tested.

In other efforts, satz [22] proposed the use of look-ahead heuristics for branching, and cnfs [23] proposed the use of backbone-directed heuristics for branching. They share the common feature that both seem to be quite effective on difficult random problems.