Learning Rate Based Branching Heuristic for SAT Solvers (page 3)
We will demonstrate empirically the effectiveness of LRB at solving the benchmarks from the 4 previous SAT Competitions.

5.1 Exponential Recency Weighted Average (ERWA)

We will explain how to apply ERWA as a branching heuristic through the MAB abstraction. First we will provide a conceptual explanation that is easier to comprehend. Then we will provide a complementary explanation from the implementation's perspective, which is equivalent to the conceptual explanation but provides more details.

Conceptually, each variable v maintains its own time series ts_v containing the observed rewards for v. Whenever a variable v transitions from assigned to unassigned, ERWA will calculate the LR r for v (see Section 3) and append the reward r to the time series by updating ts_v ← append(ts_v, r).
When the solver requests the next branching variable, ERWA will select the variable v* where v* = argmax_{v ∈ U} ema_α(ts_v) and U is the set of currently unassigned variables.

The actual implementation takes advantage of the incrementality of EMA to avoid storing the time series ts; see Algorithm 1 for pseudocode of the implementation. In this alternative to the above description, each variable v maintains a floating point number Q_v representing ema_α(ts_v). When v receives reward r, the implementation updates Q_v using the incrementality of EMA, that is, Q_v ← (1 − α) · Q_v + α · r (see line 24 of Algorithm 1).
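The equivalence between the two views can be illustrated with a toy Python sketch (our own illustration, not code from any real solver; all names are ours): each variable keeps one float q updated incrementally, and the stored-time-series view is kept alongside only to check that the incremental update matches the closed-form EMA with a zero initial estimate.

```python
# Toy sketch of ERWA as a branching heuristic (hypothetical illustration).
# Each variable keeps one float q (the implementation view); the reward
# series ts is stored here only to verify the conceptual/implementation
# equivalence claimed in the text.

class ToyERWA:
    def __init__(self, num_vars, alpha=0.4):
        self.alpha = alpha
        self.q = [0.0] * num_vars                # incremental EMA per variable
        self.ts = [[] for _ in range(num_vars)]  # conceptual reward series (for checking)

    def on_unassign(self, v, reward):
        """Apply the incremental EMA update when v receives its LR reward."""
        self.ts[v].append(reward)
        self.q[v] = (1 - self.alpha) * self.q[v] + self.alpha * reward

    def pick_branch_var(self, unassigned):
        """Select the argmax of Q_v over the currently unassigned variables."""
        return max(unassigned, key=lambda v: self.q[v])

def ema_closed_form(alpha, ts):
    """EMA of a stored reward series, assuming a zero initial estimate."""
    n = len(ts)
    return sum(alpha * (1 - alpha) ** (n - 1 - i) * r for i, r in enumerate(ts))

erwa = ToyERWA(num_vars=3)
for v, r in [(0, 0.5), (1, 0.25), (0, 1.0), (2, 0.0), (0, 0.75)]:
    erwa.on_unassign(v, r)

# The single float agrees with the EMA of the full series for every variable.
assert all(abs(erwa.q[v] - ema_closed_form(0.4, erwa.ts[v])) < 1e-12
           for v in range(3))
assert erwa.pick_branch_var({0, 1, 2}) == 0  # variable 0 has the highest Q
```

A real solver would of course keep only the float, exactly as the text says; the series is retained here purely to make the equivalence checkable.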
When the solver requests the next branching variable, the implementation will select the variable v* where v* = argmax_{v ∈ U} Q_v and U is the set of currently unassigned variables (see line 28 of Algorithm 1). Note that Q_v can be stored in a priority queue over all unassigned variables v, hence finding the maximum will take logarithmic time in the worst case. The implementation is equivalent to the prior conceptual description, but significantly more efficient in both memory and time.

For our experiments, we initialize the step-size α = 0.4. We follow the convention of typical ERWA to decrease the step-size over time [31]. After each conflict, the step-size is decreased by 10^−6 until it reaches 0.06 (see line 14 in Algorithm 1), and remains at 0.06 for the remainder of the run. This step-size management is equivalent to the one in CHB [19] and is similar to how the Glucose solver manages the VSIDS decay factor by increasing it over time [4].

5.2 Extension: Reason Side Rate (RSR)

Recall that LR measures the participation rate of variables in generating learnt clauses.
That is, variables with high LR are the ones that frequently appear in the generated learnt clause and/or the conflict side of the implication graph. If a variable appears on the reason side near the learnt clause, then these variables just missed the mark. We show that accounting for these close-proximity variables, in conjunction with the ERWA heuristic, optimizes the LR further.

More precisely, if a variable v appears in a reason clause of a variable in a learnt clause l, but does not occur in l, then we say that v reasons in generating the learnt clause l. We define I as the interval of time between the assignment of v until v transitions back to being unassigned. Let A be the number of learnt clauses which v reasons in generating in interval I and let L be the number of learnt clauses generated in interval I. The reason side rate (RSR) of variable v at interval I is defined as A/L.

Recall that in ERWA, the estimates are updated incrementally as Q_v ← (1 − α) · Q_v + α · r where r is the LR of v.
This extension modifies the update to Q_v ← (1 − α) · Q_v + α · (r + A/L) where A/L is the RSR of v (see line 20 in Algorithm 2). Note that we did not change the definition of the reward. The extension simply encourages the algorithm to select variables with high RSR when deciding to branch. We hypothesize that variables observed to have high RSR are likely to have high LR as well.

Algorithm 1 Pseudocode for ERWA as a branching heuristic using our MAB abstraction for maximizing LR.

 1: procedure Initialize  ▷ Called once at the start of the solver.
 2:   α ← 0.4  ▷ The step-size.
 3:   LearntCounter ← 0  ▷ The number of learnt clauses generated by the solver.
 4:   for v ∈ Vars do  ▷ Vars is the set of Boolean variables in the input CNF.
 5:     Q_v ← 0  ▷ The EMA estimate of v.
 6:     Assigned_v ← 0  ▷ When v was last assigned.
 7:     Participated_v ← 0  ▷ The number of learnt clauses v participated in generating since Assigned_v.
 8:
 9: procedure AfterConflictAnalysis(learntClauseVars ⊆ Vars, conflictSide ⊆ Vars)  ▷ Called after a learnt clause is generated from conflict analysis.
10:   LearntCounter ← LearntCounter + 1
11:   for v ∈ conflictSide ∪ learntClauseVars do
12:     Participated_v ← Participated_v + 1
13:   if α > 0.06 then
14:     α ← α − 10^−6
15:
16: procedure OnAssign(v ∈ Vars)  ▷ Called when v is assigned by branching or propagation.
17:   Assigned_v ← LearntCounter
18:   Participated_v ← 0
19:
20: procedure OnUnassign(v ∈ Vars)  ▷ Called when v is unassigned by backtracking or restart.
21:   Interval ← LearntCounter − Assigned_v
22:   if Interval > 0 then  ▷ Interval = 0 is possible due to restarts.
23:     r ← Participated_v / Interval  ▷ r is the LR.
24:     Q_v ← (1 − α) · Q_v + α · r  ▷ Update the EMA incrementally.
25:
26: function PickBranchLit  ▷ Called when the solver requests the next branching variable.
27:   U ← {v ∈ Vars | isUnassigned(v)}
28:   return argmax_{v ∈ U} Q_v  ▷ Use a priority queue for better performance.

5.3 Extension: Locality

Recent research shows that VSIDS exhibits locality [20], defined with respect to the community structure of the input CNF instance [20, 25, 1].
Intuitively, if the solver is currently working within a community, it is best to continue focusing on the same community rather than exploring another. We hypothesize that high LR variables also exhibit locality, that is, the branching heuristic can achieve higher LR by restricting exploration.

Inspired by the VSIDS decay, this extension multiplies the Q_v of every unassigned variable v by 0.95 after each conflict (see line 5 in Algorithm 3). Again, we did not change the definition of the reward.
The extension simply discourages the algorithm from exploring inactive variables. This extension is similar to the decay reinforcement model [13, 32] where unplayed arms are penalized by a multiplicative decay.

Algorithm 2 Pseudocode for ERWA as a branching heuristic with the RSR extension. The pseudocode Algorithm1.method(...) is calling out to the code in Algorithm 1. The procedure PickBranchLit is unchanged.

 1: procedure Initialize
 2:   Algorithm1.Initialize()
 3:   for v ∈ Vars do  ▷ Vars is the set of Boolean variables in the input CNF.
 4:     Reasoned_v ← 0  ▷ The number of learnt clauses v reasoned in generating since Assigned_v.
 5:
 6: procedure AfterConflictAnalysis(learntClauseVars ⊆ Vars, conflictSide ⊆ Vars)
 7:   Algorithm1.AfterConflictAnalysis(learntClauseVars, conflictSide)
 8:   for v ∈ (⋃_{u ∈ learntClauseVars} reason(u)) \ learntClauseVars do
 9:     Reasoned_v ← Reasoned_v + 1
10:
11: procedure OnAssign(v ∈ Vars)
12:   Algorithm1.OnAssign(v)
13:   Reasoned_v ← 0
14:
15: procedure OnUnassign(v ∈ Vars)
16:   Interval ← LearntCounter − Assigned_v
17:   if Interval > 0 then  ▷ Interval = 0 is possible due to restarts.
18:     r ← Participated_v / Interval  ▷ r is the LR.
19:     rsr ← Reasoned_v / Interval  ▷ rsr is the RSR.
20:     Q_v ← (1 − α) · Q_v + α · (r + rsr)  ▷ Update the EMA incrementally.

Algorithm 3 Pseudocode for ERWA as a branching heuristic with the locality extension. AfterConflictAnalysis is the only procedure modified.

 1: procedure AfterConflictAnalysis(learntClauseVars ⊆ Vars, conflictSide ⊆ Vars)
 2:   Algorithm2.AfterConflictAnalysis(learntClauseVars, conflictSide)
 3:   U ← {v ∈ Vars | isUnassigned(v)}
 4:   for v ∈ U do
 5:     Q_v ← 0.95 × Q_v
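The two extensions can be sketched together in a few lines of Python (our own hypothetical illustration; the function names are ours, not from any real solver): the RSR term is added to the reward in the unassign-time update, and the per-conflict locality decay is applied lazily in one batched multiplication.

```python
# Toy sketch (hypothetical, not solver code) of the RSR-extended update and
# the locality decay described in the text.

DECAY = 0.95  # per-conflict multiplicative decay of Q_v for unassigned v

def on_unassign_update(q, alpha, participated, reasoned, interval):
    """New EMA estimate Q_v when v transitions back to unassigned.

    participated : learnt clauses v participated in during the interval
    reasoned     : learnt clauses v reasoned in generating during the interval
    interval     : learnt clauses generated while v was assigned
    """
    if interval == 0:            # possible due to restarts: skip the update
        return q
    r = participated / interval  # LR of v over the interval
    rsr = reasoned / interval    # RSR of v over the interval
    return (1 - alpha) * q + alpha * (r + rsr)

def lazy_decay(q, k):
    """Batched locality decay: one multiply by DECAY**k for k conflicts."""
    return q * DECAY ** k

# RSR example: alpha = 0.4, Q_v = 0.5, v participated in 3 and reasoned in 2
# of the 10 clauses learnt while assigned: 0.6*0.5 + 0.4*(0.3 + 0.2) = 0.5.
assert abs(on_unassign_update(0.5, 0.4, 3, 2, 10) - 0.5) < 1e-12

# Decay example: one batched multiply equals k sequential multiplies.
q = 0.8
for _ in range(50):
    q *= DECAY
assert abs(q - lazy_decay(0.8, 50)) < 1e-12
```

The batched form of the decay anticipates the implementation optimization discussed next: mathematically, k successive multiplications by 0.95 and a single multiplication by 0.95^k are identical up to floating-point rounding.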
The implementation is optimized to do the multiplications in batch. For example, suppose variable v is unassigned for k conflicts. Rather than executing k updates of Q_v ← 0.95 × Q_v, the implementation simply updates once using Q_v ← 0.95^k × Q_v.

5.4 Putting It All Together to Obtain the Learning Rate Branching (LRB) Heuristic

The learning rate branching (LRB) heuristic refers to ERWA in the MAB abstraction with the RSR and locality extensions. We show that LRB is better at optimizing LR than the other branching heuristics considered, and subsequently has the best overall performance of the bunch.

6 Experimental Results

In this section, we discuss the detailed and comprehensive experiments we performed to evaluate LRB.
First, we justify the extensions of LRB by demonstrating their performance vis-à-vis improvements in learning rate. Second, we show that LRB outperforms the state-of-the-art VSIDS and CHB branching heuristics. Third, we show that LRB achieves higher rewards/LR than VSIDS, CHB, and LRB sans the extensions. Fourth, we show the effectiveness of LRB within a state-of-the-art CDCL solver, namely, CryptoMiniSat [29]. To better gauge the results of these experiments, we quote two leading SAT solver developers, Professors Audemard and Simon [3]:

  "We must also say, as a preliminary, that improving SAT solvers is often a cruel world. To give an idea, improving a solver by solving at least ten more instances (on a fixed set of benchmarks of a competition) is generally showing a critical new feature. In general, the winner of a competition is decided based on a couple of additional solved benchmarks."

6.1 Setup

The experiments are performed by running CDCL solvers with various branching heuristics on StarExec [30], a platform designed for evaluating logic solvers. The StarExec platform uses the Intel Xeon CPU E5-2609 at 2.40GHz with 10240 KB cache and 24 GB of main memory, running on Red Hat Enterprise Linux Workstation release 6.3 with Linux kernel 2.6.32-431.1.2.el6.x86_64.
The benchmarks for the experiments consist of all the instances from the previous 4 SAT Competitions (2014, 2013, 2011, and 2009), in both the application and hard combinatorial categories. For each instance, the solver is given 5000 seconds of CPU time and 7.5 GB of RAM, abiding by the SAT Competition 2013 limits.

Our experiments test different branching heuristics on a base CDCL solver, where the only modification is to the branching heuristic, to give a fair apples-to-apples comparison. Our base solver is MiniSat version 2.2.0 [12] (simp version) with one modification to use the popular aggressive LBD-based clause deletion proposed by the authors of the Glucose solver in 2009 [2]. Since MiniSat is a relatively simple solver with very few features, it is ideal for our base solver to better isolate the effects of swapping branching heuristics in our experiments. Additionally, MiniSat is the basis of many competitive solvers and aggressive LBD-based clause deletion is almost universally implemented, hence we believe the results of our experiments will generalize to other solver implementations.

6.2 Experiment: Efficacy of Extensions to ERWA

In this experiment, we demonstrate the effectiveness of the extensions we proposed for LRB.