Fredrik Kahl and I have a paper accepted to ICCV titled Generalized Roof Duality for Pseudo-Boolean Optimization.
Just as in the previous post, the problem of interest is to minimize degree- polynomials of the form:
The previous post showed that many different reductions to the quadratic case are possible. Our paper describes a generalization of roof duality which does not rely on reductions and is in a suitable sense optimal.
1. Submodular Relaxations
The reason for requirement (1) is that we want the minimum of to be a lower bound to the minimum of . By minimizing instead of , some information of the minimum of is still obtained. Since for all , it is certainly true that the minimum of is a lower bound.
The requirement (2) is also fairly obvious. Since we must be able to minimize , requiring that is submodular is natural.
The last requirement, (3), is not obvious at all. It is included for two reasons:
- One can prove that restricting ourselves to this class of symmetric functions does not affect the obtained lower bound, i.e. the optimal relaxation is symmetric.
- For a symmetric we can prove persistency, an extremely useful property.
Let denote the unknown minimum value of , that is, . Ideally, we would like for all points . This is evidently not possible in general. However, one could try to maximize the lower bound of , , hence,
A relaxation that provides the maximum lower bound will be called optimal. We prove in the paper that when , the lower bound coincides with the roof duality bound and therefore this maximum lower bound will be referred to as generalized roof duality.
A minimizer to the relaxation will have the persistency property. Let be a minimizer of . Further, let be the set of indices satisfying . Then there exists a minimizer to the original problem such that for all .
3. Efficient formulation
For a pseudo-boolean function in variables, solving the optimization problem (4) is not tractable since the required number of constraints is exponential in . Two obvious heuristic alternatives are:
- Decompose into a sum of the form and compute an optimal relaxation for each term . However, the sum of optimal relaxations is generally not optimal.
- Use a subset of the points in for to get an approximate optimal relaxation.
Neither of these approaches are satisfactory. One may even wonder if the optimal relaxation is polynomial time computable at all?
Persistency holds for any bisubmodular relaxation — optimal or not. Instead of solving (4), which, although possible, may require a large number of constraints, one can consider a simpler problem:
Instead of maximizing , we are only maximizing . This problem is considerably less arduous and can be solved in polynomial time (provided we can determine submodularity with a polynomial number of constraints). Given the minimizer , which is also polynomial time computable via graph cuts, we can make the following important observations:
- If is non-zero, then we can use persistency to get a partial solution and reduce the number of variables in .
- Otherwise, as the optimum is indeed the trivial solution, and as is maximized in the construction of , then we can conclude that is an optimal relaxation and we have obtained the generalized roof duality bound.
These observations lead to the following algorithm that computes the generalized roof duality bound:
- Construct by solving (5).
- Compute .
- If is non-zero, use persistency to simplify and start over from 1. Otherwise, stop.
Theorem 1 A solution that attains the generalized roof duality bound for a pseudo-boolean function can be computed in polynomial time.
Proof: If is non-zero, persistency can be used to simplify the original function . This is equivalent to adding constraints of the type to . This can only increase the optimal value. If on the other hand , the current optimization was solved. Therefore, the final bound is at least equal to the optimal value of (4).
Remark. As suggested by the proof, the iterative approach can obtain better bounds than the “optimal” value defined by (4). We have observed this in practice for small problems where solving (4) is feasible. One example is
for which (4) gives the lower bound and the iterative method above gives , which is optimal.
4. Full paper and source code
The paper can be downloaded from our web page:
The source code will be made available in the next month or so.
This post will be followed up by a post containing the experimental results.