Slightly Improved Approximation Algorithm For Metric TSP
University of Washington
Abstract
For some ǫ > 10^{−36} we give a randomized 3/2 − ǫ approximation algorithm for metric TSP.
* [email protected]. Research supported by Air Force Office of Scientific Research grant FA9550-20-1-0212
and NSF grant CCF-1813135.
† [email protected]. Research supported in part by NSF grants CCF-1813135 and CCF-1552097.
‡ [email protected]. Research supported by Air Force Office of Scientific Research grant FA9550-20-1-0212,
NSF grants CCF-1552097, CCF-1907845, ONR YIP grant N00014-17-1-2429, and a Sloan fellowship.
Contents
1 Introduction 1
1.1 Algorithm 1
1.2 New Techniques 3
1.2.1 Polygon Structure for Near Minimum Cuts Crossed on one Side 3
1.2.2 Generalized Gurvits’ Lemma 3
1.2.3 Conditioning while Preserving Marginals 4
2 Preliminaries 5
2.1 Notation 5
2.2 Polyhedral Background 6
2.3 Structure of Near Minimum Cuts 7
2.4 Strongly Rayleigh Distributions and λ-uniform Spanning Tree Distributions 7
2.5 Sum of Bernoullis 10
2.6 Random Spanning Trees 14
3 Overview of Proof 17
3.1 Ideas underlying proof of Theorem 3.1 19
3.2 Proof ideas for Theorem 3.2 22
5 Probabilistic statements 46
5.1 Gurvits’ Machinery and Generalizations 46
5.2 Max Flow 49
5.3 Good Edges 53
5.4 2-1-1 Good Edges 57
6 Matching 59
Theorem 1.1. For some absolute constant ǫ > 10^{−36}, there is a randomized algorithm that outputs a tour with expected cost at most 3/2 − ǫ times the cost of the optimum solution.
We note that while the algorithm makes use of the Held-Karp relaxation, we do not prove that
the integrality gap of this polytope is bounded away from 3/2. We also remark that although our
approximation factor is only slightly better than Christofides-Serdyukov, we are not aware of any
example where the approximation ratio of the algorithm we analyze exceeds 4/3 in expectation.
Following a new exciting result of Traub, Vygen, Zenklusen [TVZ20] we also get the following
theorem.
Theorem 1.2. For some absolute constant ǫ > 0 there is a randomized algorithm that outputs a TSP path with expected cost at most 3/2 − ǫ times the cost of the optimum solution.
1.1 Algorithm
First, we recall the classical Christofides-Serdyukov algorithm: Given an instance of TSP, choose a minimum spanning tree and then add the minimum cost matching on the odd degree vertices of the tree. The algorithm we study is very similar, except we choose a random spanning tree based on the standard linear programming relaxation of TSP.

¹Given such an Eulerian cycle, we can use the triangle inequality to shortcut vertices visited more than once to get a Hamiltonian cycle.
Let x0 be an optimum solution of the following TSP linear programming relaxation [DFJ59; HK70]:
min ∑_e c(e) x_e
s.t. x(δ(S)) ≥ 2,   ∀ ∅ ≠ S ⊊ V,
     x(δ(v)) = 2,   ∀ v ∈ V,
     x_e ≥ 0,   ∀ e.   (1)
Given x0 , we pick an arbitrary node, u, split it into two nodes u0 , v0 and set x(u0 ,v0 ) = 1, c(u0 , v0 ) =
0 and we assign half of every edge incident to u to u0 and the other half to v0 . This allows us to
assume without loss of generality that x0 has an edge e0 = (u0 , v0 ) such that xe0 = 1, c(e0 ) = 0.
Let E0 = E ∪ {e0 } be the support of x0 and let x be x0 restricted to E and G = (V, E). x0
restricted to E is in the spanning tree polytope (3).
For a vector λ : E → R≥0, a λ-uniform distribution µλ over spanning trees of G = (V, E) is the distribution where for every spanning tree T ⊆ E,
P_{µλ}[T] = (∏_{e∈T} λ_e) / (∑_{T′} ∏_{e∈T′} λ_e).
Now, find a vector λ such that for every edge e ∈ E, P_{µλ}[e ∈ T] = x_e(1 ± ǫ), for some ǫ < 2^{−n}. Such a vector λ
can be found using the multiplicative weight update algorithm [Asa+10] or by applying interior
point methods [SV12] or the ellipsoid method [Asa+10]. (We note that the multiplicative weight
update method can only guarantee ǫ < 1/poly(n) in polynomial time.)
Theorem 1.3 ([Asa+10]). Let z be a point in the spanning tree polytope (see (3)) of a graph G = (V, E).
For any ǫ > 0, a vector λ : E → R ≥0 can be found such that the corresponding λ-uniform spanning tree
distribution, µλ , satisfies
∑_{T∈𝒯 : T∋e} P_{µλ}[T] ≤ (1 + ε) z_e,   ∀e ∈ E,
i.e., the marginals are approximately preserved. In the above, 𝒯 is the set of all spanning trees of (V, E). The running time is polynomial in n = |V|, −log min_{e∈E} z_e, and log(1/ǫ).
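To make the quantities in Theorem 1.3 concrete, the sketch below illustrates one way such a λ can be approximated on toy instances: under µλ the marginal of e equals λ_e times its effective resistance when the edges are viewed as conductances λ (weighted matrix-tree theorem), so the marginals can be read off the pseudo-inverse of the weighted Laplacian, and a simple multiplicative update pushes them toward the target z. This is only an illustration of the idea; the update rule, step size, iteration count, and the toy instance are assumptions of this sketch, not the algorithm of [Asa+10] or [SV12].

```python
import numpy as np

def spanning_tree_marginals(n, edges, lam):
    # P_{mu_lambda}[e in T] = lam_e * R_eff(e), where R_eff is the effective
    # resistance of e when every edge f is a conductance lam_f.
    L = np.zeros((n, n))
    for (u, v), l in zip(edges, lam):
        L[u, u] += l; L[v, v] += l
        L[u, v] -= l; L[v, u] -= l
    Lp = np.linalg.pinv(L)
    return np.array([l * (Lp[u, u] + Lp[v, v] - 2 * Lp[u, v])
                     for (u, v), l in zip(edges, lam)])

def fit_lambda(n, edges, z, iters=2000, step=0.5):
    # Multiplicative-update heuristic: raise lam_e when its marginal is below z_e.
    lam = np.ones(len(edges))
    z = np.array(z, dtype=float)
    for _ in range(iters):
        lam *= np.exp(step * (z - spanning_tree_marginals(n, edges, lam)))
    return lam

# Toy instance: the 4-cycle with target marginals z_e = 3/4 (a point in the polytope (3)).
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
lam = fit_lambda(4, edges, [0.75] * 4)
print(spanning_tree_marginals(4, edges, lam))   # approx [0.75 0.75 0.75 0.75]
```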
Finally, we sample a tree T ∼ µλ and then add the minimum cost matching on the odd
degree vertices of T. The above algorithm is a slight modification of the algorithm proposed in
[OSS11]. We refer the interested reader to exciting work of Genova and Williamson [GW17] on
the empirical performance of the max-entropy rounding algorithm. We also remark that although
the algorithm implemented in [GW17] is slightly different from the above algorithm, we expect
the performance to be similar.
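For completeness, here is a minimal sketch of the sampling step, using Wilson's loop-erased random-walk algorithm: when the walk leaves a vertex along each incident edge with probability proportional to λ_e, the resulting spanning tree is distributed as µλ. The specific graph, weights, and root below are placeholders for illustration; the sampled tree is then completed exactly as described above, by adding a minimum cost matching on the odd degree vertices of T and shortcutting the Eulerian tour.

```python
import random
from collections import defaultdict

def sample_lambda_uniform_tree(n, edges, lam, root=0, rng=random.Random(0)):
    # Wilson's algorithm with a lambda-weighted random walk: the loop-erased walks
    # produce a spanning tree T with probability proportional to prod_{e in T} lam_e.
    nbrs = defaultdict(list)                       # vertex -> list of (neighbor, edge index)
    for i, (u, v) in enumerate(edges):
        nbrs[u].append((v, i)); nbrs[v].append((u, i))
    in_tree = [False] * n
    in_tree[root] = True
    parent_edge = [None] * n
    for s in range(n):
        nxt = {}                                   # loop erasure: keep only the last exit from each vertex
        u = s
        while not in_tree[u]:
            v, i = rng.choices(nbrs[u], weights=[lam[j] for (_, j) in nbrs[u]], k=1)[0]
            nxt[u] = (v, i)
            u = v
        u = s                                      # retrace the loop-erased path and attach it to the tree
        while not in_tree[u]:
            in_tree[u] = True
            v, i = nxt[u]
            parent_edge[u] = i
            u = v
    return sorted(i for i in parent_edge if i is not None)

# Toy run: a 4-cycle plus a chord, with the chord twice as heavy.
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
print(sample_lambda_uniform_tree(4, edges, [1.0, 1.0, 1.0, 1.0, 2.0]))
```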
1.2 New Techniques
Here we discuss new machinery and technical tools that we developed for this result which could
be of independent interest.
1.2.1 Polygon Structure for Near Minimum Cuts Crossed on one Side.
Let G = (V, E, x) be an undirected graph equipped with a weight function x : E → R≥0 such that for any cut (S, S) with u0, v0 ∉ S, we have x(δ(S)) ≥ 2.
For some (small) η ≥ 0, consider the family of η-near min cuts of G. Let C be a connected com-
ponent of crossing η-near min cuts. Given C we can partition vertices of G into sets a0 , . . . , am−1
(called atoms); this is the coarsest partition such that for each ai , and each (S, S) ∈ C , we have
ai ⊆ S or ai ⊆ S. Here a0 is the atom that contains u0 , v0 .
There has been several works studying the structure of edges between these atoms and the
structure of cuts in C w.r.t. the ai ’s. The cactus structure (see [DKL76]) shows that if η = 0, then
we can arrange the ai ’s around a cycle, say a1 , . . . , am (after renaming), such that x( E( ai , ai+1 )) = 1
for all i.
Benczúr and Goemans [Ben95; BG08] studied the case when η ≤ 6/5 and introduced the
notion of polygon representation, in which case atoms can be placed on the sides of an equilateral
polygon and some atoms placed inside the polygon, such that every cut in C can be represented
by a diagonal of this polygon. Later, [OSS11] studied the structure of edges of G in this polygon
when η < 1/100.
In this paper, we show it suffices to study the structure of edges in a special family of polygon
representations: Suppose we have a polygon representation for a connected component C of η-
near min cuts of G such that
• No atom is mapped inside,
• If we identify each cut (S, S) ∈ C with the interval along the polygon that does not contain
a0 , then any interval is only crossed on one side (only on the left or only on the right).
Then, we have (i) for any atom ai, x(δ(ai)) ≤ 2 + O(η), and (ii) for any pair of consecutive atoms ai, ai+1, x(E(ai, ai+1)) ≥ 1 − O(η) (see Theorem 4.9 for details).
We expect to see further applications of our theorem in studying variants of TSP.
(n!/n^n) · inf_{z>0} p(z_1, . . . , z_n)/(z_1 ⋯ z_n) ≤ ∂_{z_1} ⋯ ∂_{z_n} p|_{z=0} ≤ inf_{z>0} p(z_1, . . . , z_n)/(z_1 ⋯ z_n).   (2)
As an immediate consequence, one can prove the following theorem about strongly Rayleigh
(SR) distributions.
Theorem 1.4. Let µ : 2^{[n]} → R≥0 be SR and A_1, . . . , A_m be random variables corresponding to the number of elements sampled in m disjoint subsets of [n] such that E[A_i] = n_i for all i. If n_i = 1 for all 1 ≤ i ≤ m, then P[∀i, A_i = 1] ≥ m!/m^m.
One can ask what happens if the vector ~n = (n1 , . . . , nm ) in the above theorem is not equal
but close to the all ones vector, 1.
A related theorem was proved in [OSS11].
Theorem 1.5. Let µ : 2[n] → R ≥0 be SR and A, B be random variables corresponding to the num-
ber of elements sampled in two disjoint sets. If P [ A + B = 2] ≥ ǫ, P [ A ≤ 1] , P [ B ≤ 1] ≥ α and
P [ A ≥ 1] , P [ B ≥ 1] ≥ β then P [ A = B = 1] ≥ ǫαβ/3.
We prove a generalization of both of the above statements; roughly speaking, we show that as long as ‖~n − 1‖_1 < 1 − ǫ, then P[∀i, A_i = 1] ≥ f(ǫ, m), where f(ǫ, m) has no dependence on n, the number of underlying elements in the support of µ.
Theorem 1.6 (Informal version of Proposition 5.1). Let µ : 2^{[n]} → R≥0 be SR and let A_1, . . . , A_m be random variables corresponding to the number of elements sampled in m disjoint subsets of [n]. Suppose that there are integers n_1, . . . , n_m such that for any set S ⊆ [m], P[∑_{i∈S} A_i = ∑_{i∈S} n_i] ≥ ǫ. Then, P[∀i, A_i = n_i] ≥ f(ǫ, m) for a function f that does not depend on n.
The above statement is even stronger than Theorem 1.4 as we only require P [ ∑i∈S Ai = ∑i∈S ni ]
to be bounded away from 0 for any set S ⊆ [m] and we don’t need a bound on the expectation.
Our proof of the above theorem has a doubly exponential dependence on ǫ. We leave it as an open problem to find the optimal dependence on ǫ. Furthermore, our proof of the above theorem
is probabilistic in nature; we expect that an algebraic proof based on the theory of real stable
polynomials will provide a significantly improved lower bound. Unlike the above theorem, such
a proof may possibly extend to the more general class of completely log-concave distributions
[AOV18].
Theorem 1.7 (Informal version of Proposition 5.6). Let µ : 2^{[n]} → R≥0 be an SR distribution and let A, B ⊆ [n] be two disjoint subsets such that E[A_T], E[B_T] ≈ 1. For any α ≪ 1 there is an event E_{A,B} such that P[E_{A,B}] ≥ Ω(α²) and
• P[A_T = B_T = 1 | E_{A,B}] = 1,
• ∑_{i∈A} |P[i] − P[i | E_{A,B}]| ≤ α,
• ∑_{i∈B} |P[i] − P[i | E_{A,B}]| ≤ α.
We remark that the quadratic lower bound on α is necessary in the above theorem for a
sufficiently small α > 0. The above theorem can be seen as a generalization of Theorem 1.4 in the
special case of two sets.
We leave it as an open problem to extend the above theorem to arbitrary k disjoint sets. We suspect that in such a case the ideal event E_{A_1,...,A_k} occurs with probability Ω(α)^k and preserves all marginals of elements in each of the sets A_1, . . . , A_k up to a total variation distance of α.
2 Preliminaries
2.1 Notation
We write [n] := {1, . . . , n} to denote the set of integers from 1 to n. For a set of edges A ⊆ E and
(a tree) T ⊆ E, we write
A T = | A ∩ T |.
For a set S ⊆ V, we write
E(S) = {(u, v) ∈ E : u, v ∈ S}
to denote the set of edges inside S, and we write
δ(S) = {(u, v) ∈ E : u ∈ S, v ∉ S}
to denote the set of edges that leave S. For two disjoint sets of vertices A, B ⊆ V, we write
E(A, B) = {(u, v) ∈ E : u ∈ A, v ∈ B}.
For a set of edges A ⊆ E, we write
x(A) := ∑_{e∈A} x_e.
For two sets A, B ⊆ V, we say A crosses B if all of the following sets are non-empty:
A ∩ B, A r B, B r A, V r (A ∪ B).
Throughout, we work with a graph G = (V, E, x) together with two special vertices u0, v0 such that
x(δ(S)) ≥ 2,   ∀S ⊊ V : u0, v0 ∉ S.
For such a graph, we say a cut S ⊆ V is an η-near min cut w.r.t. x (or simply η-near min cut when x is understood) if x(δ(S)) ≤ 2 + η. Unless otherwise specified, in any statement about a cut (S, S) in G, we assume u0, v0 ∉ S.
2.2 Polyhedral Background
For any graph G = (V, E), Edmonds [Edm70] gave the following description for the convex hull
of spanning trees of a graph G = (V, E), known as the spanning tree polytope.
z ( E ) = |V | − 1
z( E(S)) ≤ |S| − 1 ∀S ⊆ V (3)
ze ≥ 0 ∀e ∈ E.
Edmonds [Edm70] proved that the extreme point solutions of this polytope are the characteristic
vectors of the spanning trees of G.
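On very small graphs, membership in (3) can be checked by brute force; the sketch below simply enumerates Edmonds' constraints (it is exponential in |V| and purely illustrative, and the instance at the bottom is an arbitrary toy example, not one from the paper).

```python
from itertools import combinations

def in_spanning_tree_polytope(n, edges, z, tol=1e-9):
    # Brute-force check of (3): z(E) = |V|-1, z(E(S)) <= |S|-1 for all S, z >= 0.
    if any(ze < -tol for ze in z):
        return False
    if abs(sum(z) - (n - 1)) > tol:
        return False
    for size in range(2, n):                      # |S| = 1 and |S| = n are implied
        for S in combinations(range(n), size):
            S = set(S)
            zES = sum(ze for (u, v), ze in zip(edges, z) if u in S and v in S)
            if zES > len(S) - 1 + tol:
                return False
    return True

# Toy example: the 4-cycle with z_e = 3/4 on each edge.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(in_spanning_tree_polytope(4, edges, [0.75] * 4))   # True
```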
Fact 2.1. Let x0 be a feasible solution of (1) such that x0e0 = 1 with support E0 = E ∪ {e0 }. Let x be x0
restricted to E; then x is in the spanning tree polytope of G = (V, E).
Proof. For any set S ⊆ V such that u0, v0 ∉ S,
x(E(S)) = (2|S| − x0(δ(S)))/2 ≤ |S| − 1.
If u0 ∈ S, v0 ∉ S, then
x(E(S)) = (2|S| − 1 − (x0(δ(S)) − 1))/2 ≤ |S| − 1.
Finally, if u0, v0 ∈ S, then
x(E(S)) = (2|S| − 2 − x0(δ(S)))/2 ≤ |S| − 2.
The claim follows because x(E) = x0(E0) − 1 = n − 1.
Since c(e0 ) = 0, the following fact is immediate.
Fact 2.2. Let G = (V, E, x) where x is in the spanning tree polytope. Let µ be any distribution of spanning
trees with marginals x, then E T ∼µ [ c( T ∪ e0 )] = c( x).
To bound the cost of the min-cost matching on the set O of odd degree vertices of the tree
T, we use the following characterization of the O-join polytope2 due to Edmonds and Johnson
[EJ73].
Proposition 2.3. For any graph G = (V, E), cost function c : E → R + , and a set O ⊆ V with an even
number of vertices, the minimum weight of an O-join equals the optimum value of the following integral
linear program.
min c( y)
s.t. y(δ(S)) ≥ 1 ∀S ⊆ V, |S ∩ O| odd (4)
ye ≥ 0 ∀e ∈ E
Definition 2.4 (Satisfied cuts). For a set S ⊆ V such that u0, v0 ∉ S and a spanning tree T ⊆ E, we say a vector y : E → R≥0 satisfies S if one of the following holds:
• δ(S)_T is even, or
• y(δ(S)) ≥ 1.
To analyze our algorithm, we will see that the main challenge is to construct a (random)
vector y that satisfies all cuts and E [ c(y)] ≤ (1/2 − ǫ)OPT.
² The standard name for this is the T-join polytope. Because we reserve T to represent our tree, we call this the O-join polytope, where O represents the set of odd vertices in the tree.
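A brute-force verifier for Definition 2.4 on toy instances is given below (exponential in |V|, for illustration only); it checks that a candidate y satisfies every cut whose tree-parity is odd, which is exactly feasibility in (4) once the cuts containing u0, v0 are handled separately. The instance at the bottom is a placeholder, not one from the paper.

```python
from itertools import combinations

def cut_edges(S, edges):
    S = set(S)
    return [i for i, (u, v) in enumerate(edges) if (u in S) != (v in S)]

def satisfies_all_odd_cuts(n, edges, tree, y, skip=(), tol=1e-9):
    # Definition 2.4 / LP (4): every cut S (avoiding the vertices in `skip`) with
    # |delta(S) ∩ T| odd must have y(delta(S)) >= 1.
    tree = set(tree)
    allowed = [v for v in range(n) if v not in skip]
    for size in range(1, len(allowed) + 1):
        for S in combinations(allowed, size):
            dS = cut_edges(S, edges)
            if sum(1 for i in dS if i in tree) % 2 == 1 and sum(y[i] for i in dS) < 1 - tol:
                return False
    return True

# Toy check: 4-cycle with x_e = 1, tree = a Hamiltonian path, y = x/2.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(satisfies_all_odd_cuts(4, edges, tree=[0, 1, 2], y=[0.5] * 4))   # True
```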
2.3 Structure of Near Minimum Cuts
Lemma 2.5 ([OSS11]). For G = (V, E, x), let A, B ( V be two crossing ǫ A , ǫB near min cuts respectively.
Then, A ∩ B, A ∪ B, A r B, B r A are ǫ A + ǫB near min cuts.
Proof. We prove the lemma only for A ∩ B; the rest of the cases can be proved similarly. By
submodularity,
x(δ( A ∩ B)) + x(δ( A ∪ B)) ≤ x(δ( A)) + x(δ( B)) ≤ 4 + ǫ A + ǫB .
Since x(δ( A ∪ B)) ≥ 2, we have x(δ( A ∩ B)) ≤ 2 + ǫ A + ǫB , as desired.
We say µ is a strongly Rayleigh distribution if its generating polynomial does not vanish on the upper half-plane, i.e., g_µ(z) ≠ 0 for all z ∈ C^E with Im(z_e) > 0 for every e ∈ E. We say µ is d-homogeneous if for any λ ∈ R, g_µ(λz) = λ^d g_µ(z). Strongly Rayleigh (SR) distributions were defined in [BBL09], where it was shown that any λ-uniform spanning tree distribution is strongly Rayleigh. In this subsection we recall several properties of SR distributions proved in [BBL09; OSS11] which will be useful to us.
Closure Operations of SR Distributions. SR distributions are closed under the following oper-
ations.
• Projection. For any µ ∈ B_E and any F ⊆ E, the projection of µ onto F is the measure µ_F where for any A ⊆ F,
µ_F(A) = ∑_{S : S∩F=A} µ(S).
• Product. For any two disjoint sets E, F, and µ_E ∈ B_E, µ_F ∈ B_F, the product measure µ_{E×F} is the measure where for any A ⊆ E, B ⊆ F, µ_{E×F}(A ∪ B) = µ_E(A) µ_F(B).
Throughout this paper we will repeatedly apply the above operations. We remark that SR distributions are not necessarily closed under truncation to a subset, i.e., if we require exactly k elements from some F ⊊ E.
Since λ-uniform spanning tree distributions are special classes of SR distributions, if we per-
form any of the above operations on a λ-uniform spanning tree distribution µ we get another SR
distribution. Below, we see that by performing the following particular operations we still have
a λ-uniform spanning tree distribution (perhaps with a different λ).
Fact 2.8. For a graph G = (V, E) and a vector λ(G) : E → R≥0, let µ_{λ(G)} be the corresponding λ-uniform spanning tree distribution. Then for any S ⊊ V, conditioned on the event that T ∩ E(S) is a spanning tree of G[S], the restriction of T to G[S] and the restriction of T to G/S are independent, and each is itself a λ-uniform spanning tree distribution (of G[S] and G/S, respectively).
Proof. Intuitively, this holds because in the max entropy distribution, conditioned on S being
a tree, any tree chosen inside S can be composed with any tree chosen on G/S to obtain a
spanning tree on G. So, to maximize the entropy these trees should be chosen independently.
More formally, writing λ^{T′} := ∏_{e∈T′} λ_e, for any T1 ∈ G[S] and T2 ∈ G/S,
P[T = T1 ∪ T2 | S is a tree] = λ^{T1} λ^{T2} / ∑_{T1′∈G[S], T2′∈G/S} λ^{T1′} λ^{T2′}
 = (λ^{T1} / ∑_{T1′∈G[S]} λ^{T1′}) · (λ^{T2} / ∑_{T2′∈G/S} λ^{T2′})
 = P_{T1′∼G[S]}[T1′ = T1] · P_{T2′∼G/S}[T2′ = T2],
giving independence.
Definition 2.9 (Negative Association). A measure µ ∈ B E is negatively associated if for any increas-
ing functions f , g : 2E → R, that depend on disjoint sets of edges,
E µ [ f ] · E µ [ g] ≥ E µ [ f · g]
It is shown in [BBL09; FM92] that strongly Rayleigh measures are negatively associated.
We say that ν stochastically dominates µ, written µ ⪯ ν, if there is a coupling ρ : 2^E × 2^E → R≥0 such that
∑_B ρ(A, B) = µ(A),   ∀A ∈ 2^E,
∑_A ρ(A, B) = ν(B),   ∀B ∈ 2^E,
and ρ(A, B) > 0 only if A ⊆ B.
Theorem 2.10 (BBL). If µ is strongly Rayleigh and µ_k, µ_{k+1} (the truncations of µ to k and k + 1 elements) are well-defined, then µ_k ⪯ µ_{k+1}.
Note that in the above particular case the coupling ρ satisfies the following: For any A, B ⊆ E where ρ(A, B) > 0, B ⊇ A and |B r A| = 1, i.e., B has exactly one more element.
Let µ be a strongly Rayleigh measure on edges of G. Recall that for a set A ⊆ E, we write
A T = | A ∩ T | to denote the random variable indicating the number of edges in A chosen in a
random sample T of µ. The following facts immediately follow from the negative association
and stochastic dominance properties. We will use these facts repeatedly in this paper.
Fact 2.11. Let µ be any SR distribution on E. Then for any F ⊆ E and any integer k:
1. (Negative association) If e ∉ F, then P_µ[e | F_T ≥ k] ≤ P_µ[e] and P_µ[e | F_T ≤ k] ≥ P_µ[e].
2. (Stochastic dominance) If e ∈ F, then P_µ[e | F_T ≥ k] ≥ P_µ[e] and P_µ[e | F_T ≤ k] ≤ P_µ[e].
Fact 2.12. Let µ be a homogenous SR distribution on E. Then,
• (Negative association with homogeneity) For any A ⊆ E, and any B ⊆ A
E µ [ B T | A T = 0] ≤ E µ [ B T ] + E µ [ A T ] (5)
• Suppose that µ is a spanning tree distribution. For S ⊆ V, let q := |S| − 1 − E_µ[E(S)_T]. For any A ⊆ E(S) and B ⊆ E r E(S),
E_µ[B_T] − q ≤ E_µ[B_T | S is a tree] ≤ E_µ[B_T]   (negative association and homogeneity)
E_µ[A_T] ≤ E_µ[A_T | S is a tree] ≤ E_µ[A_T] + q   (stochastic dominance and tree)
Fact 2.16. Let B_1, . . . , B_n be independent Bernoulli random variables, each with expectation 0 ≤ p ≤ 1. Then
P[∑_i B_i is even] = (1/2)(1 + (1 − 2p)^n).
Corollary 2.17. Given a BS(q) random variable (a sum of independent Bernoullis with expectations summing to q) with 0 < q ≤ 1.2, then
P[BS(q) even] ≤ (1/2)(1 + e^{−2q}).
Proof. First, if q ≤ 1, then by Hoeffding's theorem we can write BS(q) as a sum of n Bernoullis with success probability p = q/n. If n = 1, then the statement obviously holds. Otherwise, by the previous fact, we have (for some n)
P[BS(q) even] ≤ (1/2)(1 + (1 − 2p)^n) ≤ (1/2)(1 + e^{−2q}),
where we used that |1 − 2p| ≤ e^{−2p} for p ≤ 1/2.
So, now assume q > 1. Write BS(q) as the sum of n Bernoullis, each with success probability 1 or p. First assume we have no ones. Then, either we have only two non-zero Bernoullis, each with success probability q/2, in which case P[BS(q) even] ≤ 0.6² + 0.4² and we are done; otherwise, n ≥ 3, so p ≤ 1/2 and, similarly to the previous case, we get P[BS(q) even] ≤ (1/2)(1 + e^{−2q}).
Finally, if q > 1 and one of the Bernoullis is always 1, i.e. BS(q) = BS(q − 1) + 1, then we get
P[BS(q) even] = P[BS(q − 1) odd] = (1/2)(1 − (1 − 2p)^{n−1}) ≤ (1/2)(1 − e^{−4(q−1)}) ≤ 0.3,
where we used that 1 − x ≥ e^{−2x} for 0 ≤ x ≤ 0.2.
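Both statements are easy to sanity-check numerically in the i.i.d. case; the short script below compares the exact parity probability of n i.i.d. Bernoulli(q/n) variables against the closed form of Fact 2.16 and the bound of Corollary 2.17 (an illustration only; the grid of test values is arbitrary).

```python
import math
from itertools import product

def parity_even_exact(n, p):
    # Exact P[sum of n i.i.d. Bernoulli(p) is even], by enumeration (small n only).
    total = 0.0
    for outcome in product([0, 1], repeat=n):
        pr = 1.0
        for b in outcome:
            pr *= p if b else 1 - p
        if sum(outcome) % 2 == 0:
            total += pr
    return total

for n in (1, 2, 5, 8):
    for q in (0.3, 0.8, 1.2):
        p = q / n
        if p <= 1:
            exact = parity_even_exact(n, p)
            closed = 0.5 * (1 + (1 - 2 * p) ** n)      # Fact 2.16
            bound = 0.5 * (1 + math.exp(-2 * q))        # Corollary 2.17
            assert abs(exact - closed) < 1e-12 and exact <= bound + 1e-12
print("Fact 2.16 and the Corollary 2.17 bound check out on this grid")
```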
Lemma 2.18. Let p_0, . . . , p_n be a log-concave sequence and suppose that for some index i, p_{i+1} ≤ γ p_i for some γ < 1. Then,
∑_{j=k}^{n} p_j ≤ p_k/(1 − γ),   ∀k ≥ i,
∑_{j=i+1}^{n} j · p_j ≤ (p_{i+1}/(1 − γ)) · (i + 1 + γ/(1 − γ)).
Proof. Since the sequence is log-concave, we can write
1/γ ≤ p_i/p_{i+1} ≤ p_{i+1}/p_{i+2} ≤ . . .   (6)
Since all of the above ratios are at least 1/γ, for all k ≥ 1 we can write p_{i+k} ≤ γ^k p_i. Therefore, the first statement is immediate and the second one follows:
∑_{j=i+1}^{n} j · p_j ≤ ∑_{k=0}^{∞} γ^k p_{i+1} (i + k + 1) = p_{i+1} ((i + 1)/(1 − γ) + γ/(1 − γ)²).
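A quick numerical illustration of Lemma 2.18, using the binomial distribution (a log-concave sequence) as p_0, . . . , p_n; the particular n, q, and index i below are arbitrary test values, chosen so that p_{i+1} < p_i.

```python
from math import comb

def binom_pmf(n, q):
    return [comb(n, j) * q**j * (1 - q)**(n - j) for j in range(n + 1)]

n, q, i = 30, 0.2, 12                      # arbitrary test values
p = binom_pmf(n, q)
gamma = p[i + 1] / p[i]                    # any gamma >= p_{i+1}/p_i with gamma < 1 works
assert gamma < 1

# First claim: sum_{j>=k} p_j <= p_k / (1 - gamma) for all k >= i.
for k in range(i, n + 1):
    assert sum(p[k:]) <= p[k] / (1 - gamma) + 1e-12

# Second claim: sum_{j>=i+1} j*p_j <= p_{i+1}/(1-gamma) * (i + 1 + gamma/(1-gamma)).
lhs = sum(j * p[j] for j in range(i + 1, n + 1))
rhs = p[i + 1] / (1 - gamma) * (i + 1 + gamma / (1 - gamma))
assert lhs <= rhs + 1e-12
print("Lemma 2.18 bounds hold for this binomial test sequence")
```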
Corollary 2.19. Let X be a BS(q) random variable such that P[X = k] ≥ 1 − ǫ for some integer k ≥ 1 and ǫ < 1/10. Then, k(1 − ǫ) ≤ q ≤ k(1 + ǫ) + 3ǫ.
Proof. The left inequality simply follows since X ≥ 0. Since P[X = k + 1] ≤ ǫ, we can apply Lemma 2.18 with γ = ǫ/(1 − ǫ) to get
E[X | X ≥ k + 1] · P[X ≥ k + 1] ≤ (ǫ(1 − ǫ)/(1 − 2ǫ)) · (k + 1 + ǫ/(1 − 2ǫ)).
Therefore,
q = E[X] ≤ k(1 − ǫ) + (ǫ(1 − ǫ)/(1 − 2ǫ)) · (k + 1 + ǫ/(1 − 2ǫ)) ≤ k(1 + ǫ) + 3ǫ,
as desired.
Proof. We show that the LHS is a decreasing function of t. Since ln is monotone, it is enough to show
0 ≥ ∂_t ln(LHS) = ∂_t ( ∑_{i=1}^{k−1} ln(1 − i/t) + (t − k) ln(1 − p/t) )
 = ∑_{i=1}^{k−1} 1/(t²(1/i − 1/t)) + ln(1 − p/t) + (t − k)p/(t(t − p)).
Using ∑_{i=1}^{k−2} 1/(t²/i − t) ≤ ∫_0^{k−1} dx/(t²/x − t) = −(k − 1)/t − ln(1 − (k − 1)/t), it is enough to show
0 ≥ −(k − 1)/t − ln(1 − (k − 1)/t) + ln(1 − p/t) + (t − k)p/(t(t − p)) + 1/(t²(1/(k − 1) − 1/t))
 = ln((t − p)/(t − k + 1)) + (p − k)/(t − p) + 1/t + (k − 1)/(t(t − k + 1)).
Rearranging, it is equivalent to show
ln(1 + (p − k + 1)/(t − p)) ≥ (p − k)/(t − p) + 1/(t − k + 1).
Since p > k − 1, using the Taylor series of ln, to prove the above it is enough to show
(p − k + 1)/(t − p) − (p − k + 1)²/(2(t − p)²) ≥ (p − k)/(t − p) + 1/(t − k + 1),
which is equivalent to
(p − k + 1)/((t − p)(t − k + 1)) ≥ (p − k + 1)²/(2(t − p)²)  ⇔  1/(t − k + 1) ≥ (p − k + 1)/(2(t − p)).
Let Poi( p, k) = e− p pk /k! be the probability that a Poisson random variable with rate p is
exactly k; similarly, define Poi( p, ≤ k), Poi( p, ≥ k) as the probability that a Poisson with rate p is
at most k or at least k.
Lemma 2.21. Let X be a Bernoulli sum BS(p) for some n. For any integer k ≥ 0 such that k − 1 < p < k + 1, the following holds:
P[X = k] ≥ min_{0≤ℓ≤p,k} Poi(p − ℓ, k − ℓ) · (1 − (p − ℓ)/(k − ℓ + 1))^{(p−k)^+},
where the minimum is over all nonnegative integers ℓ ≤ p, k, and for z ∈ R, z^+ = max{z, 0}.
Proof. First suppose that k − 1 < p ≤ k and that, by Hoeffding's theorem, each Bernoulli has success probability p/n. Then
P[X = k] = \binom{n}{k} (p/n)^k (1 − p/n)^{n−k} = ∏_{i=1}^{k−1}(1 − i/n) · (p^k/k!) (1 − p/n)^{n−k} ≥ (p^k/k!) e^{−p} = Poi(p, k),
where in the inequality we used Fact 2.20 (also note if n = k the inequality follows from Stirling's formula and that p ≥ k − 1). If k < p < k + 1, then as above
P[X = k] = ∏_{i=1}^{k−1}(1 − i/n) · (p^k/k!) (1 − p/n)^{n−p} (1 − p/n)^{p−k} ≥ Poi(p, k)(1 − p/n)^{p−k},
Note that if we further know X ≥ a with probability 1 we can restrict ℓ in the statement to be
in the interval [ a, min ( p, k)].
Lemma 2.22. Let X be a Bernoulli sum BS(p) and let k = ⌈p⌉. Then,
P[X ≥ k] ≥ min_{0≤ℓ≤p} Poi(p − ℓ, ≥ k − ℓ),
and therefore
P[X ≥ k] ≥ min_{0≤ℓ<p} Poi(p − ℓ, ≥ k − ℓ),
where ℓ ranges over nonnegative integers.
Corollary 2.24. Let A, B ⊆ V be disjoint sets such that A, B, A ∪ B are ǫ A , ǫB , ǫ A∪ B -near minimum cuts
w.r.t., x respectively, where none of them contain endpoints of e0 . Then for any distribution µ of spanning
trees on E with marginals x,
P T ∼µ [ E( A, B) T = 1] ≥ 1 − (ǫ A + ǫB + ǫ A∪ B )/2.
Proof. By the union bound, with probability at least 1 − (ǫ A + ǫB + ǫ A∪ B )/2, A, B, and A ∪ B are
trees. But this implies that we must have exactly one edge between A, B.
Fact 2.25. Let G = (V, E, x) and let µ be a distribution over spanning trees with marginals x. For any
set A ⊆ E , we have
P T ∼ µ [ T ∩ A = ∅ ] ≥ 1 − x ( A ).
Lemma 2.26. Let G = (V, E, x), and let µ be a λ-uniform random spanning tree distribution with marginals x. For any edge e = (u, v) and any vertex w ≠ u, v we have
E[W_T | e ∉ T] ≤ E[W_T] + P[w ∈ P_{u,v}(T) | e ∉ T] · P[e ∈ T],
where W_T = |T ∩ δ(w)| and, for a spanning tree T and vertices u, v ∈ V, P_{u,v}(T) is the set of vertices on the path from u to v in T.
Proof. Define E′ = E r {e}. Let µ′ = µ|_{E′} be µ projected on all edges except e. Define µ_i = µ′_{n−2} (corresponding to e in the tree) and µ_o = µ′_{n−1} (corresponding to e out of the tree). Observe that any tree T has positive measure in exactly one of these distributions.
By Theorem 2.10, µ_i ⪯ µ_o, so there exists a coupling ρ : 2^{E′} × 2^{E′} between them such that for any T_i, T_o with ρ(T_i, T_o) > 0, the tree T_o has exactly one more edge than T_i. Also, observe that T_o is always a spanning tree, while T_i is not, but T_i ∪ {e} is a spanning tree. The added edge (i.e., the edge in T_o r T_i) is always along the unique path from u to v in T_o.
For intuition for the rest of the proof, observe that if w is not on the path from u to v in T_o, then the same set of edges is incident to w in both T_i and T_o. So, if w is almost never on the path from u to v, the distribution of W_T is almost independent of e. On the other hand, whenever w is on the path from u to v, then in the worst case, we may replace e with one of the edges incident to w, so conditioned on e out, W_T increases by at most the probability that e is in the tree.
Say x_e is the marginal of e. Then,
E[W_T] = E[W_T | e ∉ T](1 − x_e) + E[W_T | e ∈ T] x_e
 = ∑_{T_i,T_o} ρ(T_i, T_o) W_o (1 − x_e) + ∑_{T_i,T_o} ρ(T_i, T_o) W_i x_e,   (8)
where W_o := |T_o ∩ δ(w)| and W_i := |T_i ∩ δ(w)|. Moreover,
E[W_T | e ∉ T] = ∑_{T_i,T_o} ρ(T_i, T_o) W_o
 = ∑_{T_i,T_o : w ∈ P_{u,v}(T_o)} ρ(T_i, T_o) W_o + ∑_{T_i,T_o : w ∉ P_{u,v}(T_o)} ρ(T_i, T_o) W_o
 ≤ ∑_{T_i,T_o : w ∈ P_{u,v}(T_o)} ρ(T_i, T_o)(x_e(W_i + 1) + (1 − x_e)W_o) + ∑_{T_i,T_o : w ∉ P_{u,v}(T_o)} ρ(T_i, T_o)(x_e W_i + (1 − x_e)W_o)
 = E[W_T] + ∑_{T_i,T_o : w ∈ P_{u,v}(T_o)} ρ(T_i, T_o) x_e
 = E[W_T] + ∑_{T_o : w ∈ P_{u,v}(T_o)} µ_o(T_o) x_e
 = E[W_T] + P[w ∈ P_{u,v} | e out] · P[e in],
where in the inequality we used the following: when w ∉ P_{u,v}(T_o) we have W_i = W_o, and when w ∈ P_{u,v}(T_o) we have W_o ≤ W_i + 1. Finally, in the third to last equality we used (8).
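On a graph small enough to enumerate every spanning tree, the inequality of Lemma 2.26 can be checked exactly for the uniform case λ ≡ 1; the sketch below does this on K4, with an arbitrary choice of e = (u, v) and w (illustration only).

```python
from itertools import combinations

def spanning_trees(n, edges):
    # Enumerate all spanning trees as sets of edge indices (union-find acyclicity check).
    for T in combinations(range(len(edges)), n - 1):
        parent = list(range(n))
        def find(a):
            while parent[a] != a:
                parent[a] = parent[parent[a]]; a = parent[a]
            return a
        ok = True
        for i in T:
            u, v = edges[i]
            ru, rv = find(u), find(v)
            if ru == rv:
                ok = False; break
            parent[ru] = rv
        if ok:
            yield set(T)

def path_vertices(T, edges, u, v):
    # Vertices on the unique u-v path in tree T (DFS).
    adj = {}
    for i in T:
        a, b = edges[i]
        adj.setdefault(a, []).append(b); adj.setdefault(b, []).append(a)
    def dfs(a, target, seen):
        if a == target:
            return [a]
        seen.add(a)
        for b in adj.get(a, []):
            if b not in seen:
                p = dfs(b, target, seen)
                if p is not None:
                    return [a] + p
        return None
    return set(dfs(u, v, set()))

edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]   # K4
trees = list(spanning_trees(4, edges))                      # 16 uniform spanning trees
e, (u, v), w = 0, (0, 1), 2                                 # arbitrary test edge e = (u, v) and vertex w
W = lambda T: sum(1 for i in T if w in edges[i])            # W_T = |T ∩ delta(w)|
EW = sum(W(T) for T in trees) / len(trees)
no_e = [T for T in trees if e not in T]
EW_no_e = sum(W(T) for T in no_e) / len(no_e)
p_e = 1 - len(no_e) / len(trees)                            # P[e in T]
p_w_on_path = sum(1 for T in no_e if w in path_vertices(T, edges, u, v)) / len(no_e)
assert EW_no_e <= EW + p_w_on_path * p_e + 1e-12            # Lemma 2.26
print(EW_no_e, EW + p_w_on_path * p_e)
```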
Figure 1: Vertices u, v, w with edges e = (u, v) and f = (v, w); U = δ(u) − e and W = δ(w) − f.
Lemma 2.27. Let G = (V, E, x), and let µ be a λ-uniform spanning tree distribution with marginals x. For any pair of edges e = (u, v), f = (v, w) such that |P[e] − 1/2|, |P[f] − 1/2| < ǫ (see Fig. 1), if ǫ < 1/1000, then
E[W_T | e ∉ T] + E[U_T | f ∉ T] ≤ E[W_T + U_T] + 0.81,
where U = δ(u) − e and W = δ(w) − f.
Proof. All probabilistic statements are with respect to ν so we drop the subscript. First, by
Lemma 2.26, and negative association we can write,
or that when this inequality fails, a different argument yields the lemma.
The main observation is that in any tree it cannot be that both u is on the v − w path and w is
on the u − v path. Therefore
P [ u ∈ Pv,w | e, f 6∈ T ] + P [w ∈ Pu,v | e, f 6∈ T ] ≤ 1
So, we have
P[e ∉ T ∧ w ∈ P_{u,v}] + P[f ∉ T ∧ u ∈ P_{v,w}]
 ≤ P[e, f ∉ T ∧ w ∈ P_{u,v}] + P[e ∉ T, f ∈ T] + P[e, f ∉ T ∧ u ∈ P_{v,w}] + P[f ∉ T, e ∈ T]
 ≤ P[e, f ∉ T] + P[e ∉ T, f ∈ T] + P[f ∉ T, e ∈ T]
 = 1 − P[e, f ∈ T].
P[e, f ∈ T] = P[f ∈ T] − P[f ∈ T, e ∉ T] ≥ 1/2 − ǫ − (1/2 + ǫ)α.
If α ≤ 0.6, then P[e, f ∈ T] ≥ 0.198 (using ǫ < 0.001) and the claim follows. Otherwise, P[f ∈ T | e ∉ T] ≥ 0.6. Similarly, P[e ∈ T | f ∉ T] ≥ 0.6. But, by negative association,
3 Overview of Proof
As alluded to earlier, the crux of the proof of Theorem 1.1 is to show that the expected cost of the
minimum cost matching on the odd degree vertices of the sampled tree is at most OPT (1/2 − ǫ).
We do this by showing the existence of a cheap feasible O-join solution to (4).
First, recall that if we only wanted to get an O-join solution of value at most OPT/2, to satisfy
all cuts, it is enough to set ye := xe /2 for each edge [Wol80]. To do better, we want to take
advantage of the fact that we only need to satisfy a constraint in the O-join for S when δ(S) T
is odd. Here, we are aided by the fact that the sampled tree is likely to have many even cuts
because it is drawn from a strongly Rayleigh distribution.
If an edge e is exclusively on even cuts then ye can be reduced below xe /2. This, more or
less, was the approach in [OSS11] for graphic TSP, where it was shown that a constant fraction
of LP edges will be exclusively on even near min cuts with constant probability. The difficulty in
implementing this approach in the metric case comes from the fact that a high cost edge can be
on many cuts and it may be exceedingly unlikely that all of these cuts will be even simultaneously.
Overall, our approach to addressing this is to start with ye := xe /2 and then modify it with a
random3 slack vector s : E → R: When certain special (few) cuts that e is on are even we let
se = − xe η/8 (for a carefully chosen constant η > 0); for other cuts that contain e, whenever they
are odd, we will increase the slack of other edges on that cut to satisfy them. The bulk of our
effort is to show that we can do this while guaranteeing that E [ se ] < −ǫηxe for some ǫ > 0.
3 where the randomness comes from the random sampling of the tree
One thing we do not need to worry about if we perform the reduction just described is any
cut S such that x(δ(S)) > 2(1 + η ). Since we always have se ≥ − xe η/8, any such cut is always
satisfied, even if every edge in δ(S) is decreased and no edge is increased.
Let OPT be the optimum TSP tour, i.e., a Hamiltonian cycle, with set of edges E∗ ; throughout
the paper, we write e∗ to denote an edge in E∗ . To bound the expected cost of the O-join for a
random spanning tree T ∼ µλ , we also construct a random slack vector s∗ : E∗ → R ≥0 such that
(x + OPT)/4 + s + s∗ is feasible for Eq. (4) with probability 1. In Section 3.1 we explain how to
use s∗ to satisfy all but a linear number of near mincuts.
Theorem 3.1 (Main Technical Theorem). Let x0 be a solution of LP (1) with support E0 = E ∪ {e0}, and x be x0 restricted to E. Let z := (x + OPT)/2, η ≤ 10^{−12}, and let µ be the max-entropy distribution with marginals x. Also, let E∗ denote the support of OPT. There are two functions s : E0 → R and s∗ : E∗ → R≥0 (as functions of T ∼ µ) such that
i) For each edge e ∈ E, s_e ≥ −x_e η/8.
ii) For each η-near min cut S of z, if δ(S)_T is odd, then s(δ(S)) + s∗(δ(S)) ≥ 0.
iii) For every OPT edge e∗, E[s∗_{e∗}] ≤ 28η², and for every LP edge e ≠ e0, E[s_e] ≤ −(5/16) x_e ǫ_P η for ǫ_P = 3.9 · 10^{−17} (defined in (33)).
In the next subsection, we explain the main ideas needed to prove this technical theorem. But
first, we show how our main theorem follows readily from Theorem 3.1.
Proof of Theorem 1.1. Let x0 be an extreme point solution of LP (1) with support E0 and let x be x0 restricted to E. By Fact 2.1, x is in the spanning tree polytope. Let µ = µ_{λ∗} be the max entropy distribution with marginals x, and let s, s∗ be as defined in Theorem 3.1. We will define y : E0 → R≥0 and y∗ : E∗ → R≥0. Let
y_e = x_e/4 + s_e if e ∈ E,   and   y_{e0} = ∞.
We also let y∗_{e∗} = 1/4 + s∗_{e∗} for any edge e∗ ∈ E∗. We will show that y + y∗ is a feasible solution⁴ to (4). First, observe that for any S where e0 ∈ δ(S), we have y(δ(S)) + y∗(δ(S)) ≥ 1. Otherwise, we assume u0, v0 ∉ S. If S is an η-near min cut w.r.t. z and δ(S)_T is odd, then by property (ii) of Theorem 3.1, we have
y(δ(S)) + y∗(δ(S)) = z(δ(S))/2 + s(δ(S)) + s∗(δ(S)) ≥ 1.
On the other hand, if S is not an η-near min cut (w.r.t. z),
y(δ(S)) + y∗(δ(S)) ≥ z(δ(S))/2 − (η/8) x(δ(S))
 ≥ z(δ(S))/2 − (η/8) · 2(z(δ(S)) − 1)
 ≥ z(δ(S))(1/2 − η/4) + η/4 ≥ (2 + η)(1/2 − η/4) + η/4 ≥ 1,
⁴ Recall that we merely need to prove the existence of a cheap O-join solution. The actual optimal O-join solution can be found in polynomial time.
where in the first inequality we used property (i) of Theorem 3.1, which says that s_e ≥ −x_e η/8 with probability 1 for all LP edges, and that s∗_{e∗} ≥ 0 with probability 1. In the second inequality we used that z = (x + OPT)/2, so, since OPT ≥ 2 across any cut, x(δ(S)) ≤ 2(z(δ(S)) − 1).
Therefore, y + y∗ is a feasible O-join solution.
Finally, using c(e0) = 0 and part (iii) of Theorem 3.1, we can bound E_{T∼µ}[c(y) + c(y∗)].
By stability of maximum entropy distributions (see [SV19, Thm 4] and references therein), we have that ‖µ − µλ‖_1 ≤ O(n⁴δ) =: q. Therefore, for some δ ≪ n^{−4} we get ‖µ − µλ‖_1 = q ≤ ǫ_P η/100. That means that
E_{T∼µλ}[min cost matching] ≤ E_{T∼µ}[c(y) + c(y∗)] + q · (OPT/2) ≤ (1/2 − (5/32) ǫ_P η + ǫ_P η/100) OPT,
where we used that for any spanning tree the cost of the minimum cost matching on odd degree vertices is at most OPT/2. Finally, since E_{T∼µλ}[c(T)] ≤ OPT(1 + δ), ǫ_P = 3.9 · 10^{−17}, and η = 2.17 · 10^{−19} (from (9)), we get a 3/2 − 10^{−36} approximation algorithm for TSP.
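The final constant can be traced by plugging in the numbers above; the short computation below evaluates only the coefficient in the displayed matching bound (it does not re-derive the lower-order δ and η² contributions handled earlier in the proof), as a quick sanity check that the gain indeed exceeds 10^{−36}.

```python
eps_P = 3.9e-17
eta = 2.17e-19
# Gain over 3/2 read off the displayed matching bound: (5/32) eps_P eta from the slack,
# minus eps_P eta / 100 lost to the total-variation term q.
gain = (5 / 32) * eps_P * eta - eps_P * eta / 100
print(gain)   # about 1.2e-36, consistent with the claimed 3/2 - 10^{-36} factor
```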
Theorem 3.2 (Main Payment Theorem (informal)). Let G = (V, E, x) for LP solution x and let µ be
the max-entropy distribution with marginals x. Given a hierarchy H, there is a slack vector s : E → R
such that
5 This is really a family of near-min-cuts, but for the purpose of this overview, assume η = 0
Figure 2: An example of part of a hierarchy with three triangles. The graph on the left shows
part of a feasible LP solution where dashed (and sometimes colored) edges have fraction 1/2
and solid edges have fraction 1. The dotted ellipses on the left show the min-cuts u1 , u2 , u3 in
the graph. (Each vertex is also a min-cut). On the right is a representation of the corresponding
hierarchy. Triangle u1 corresponds to the cut {a, b}, u2 corresponds to {c, d} and u3 corresponds
to {a, b, c, d}. Note that, for example, the edge ( a, c), represented in green, is in δ(u1 ), δ(u3 ), and
inside u3 . For triangle u1 , we have A = δ( a) r ( a, b) and B = δ(b) r (b, d).
In the following subsection, we discuss how to prove this theorem. Here we explain at a high
level how to define the hierarchy and reduce Theorem 3.1 to this theorem. The details are in
Section 4.
First, observe that, given Theorem 3.2, cuts in H will automatically satisfy (ii) of Theorem 3.1.
The approach we take to satisfying all other cuts is to introduce additional slack, the vector s∗ ,
on OPT edges.
Consider the set of all near-min-cuts of z, where z := ( x + OPT )/2. Starting with z rather
than x allows us to restrict attention to a significantly more structured collection of near-min-cuts.
The key observation here is that in OPT, all min-cuts have value 2, and any non-min-cut has
value at least 4. Therefore averaging x with OPT guarantees that every η-near min-cut of z must
consist of a contiguous sequence of vertices (an interval) along the OPT cycle. Moreover, each of these
cuts is a 2η-near min-cut of x. Arranging the vertices in the OPT cycle around a circle, we identify
every such cut with the interval of vertices that does not contain (u0 , v0 ). Also, we say that a cut
is crossed on both sides if it is crossed on the left and on the right.
To ensure that any cut S that is crossed on both sides is satisfied, we first observe that S is odd
with probability O(η). To see this, let S_L and S_R be the cuts crossing S on the left and right with minimum intersection with S, and consider the two (bad) events {E(S ∩ S_L, S_L r S)_T ≠ 1} and {E(S ∩ S_R, S_R r S)_T ≠ 1}. Recall that if A, B and A ∪ B are all near-min-cuts, then P[E(A, B)_T ≠ 1] = O(η) (see Corollary 2.24). Applying this fact to the two aforementioned bad events implies that each of them has probability O(η). Therefore, we will let the two OPT edges in δ(S) be responsible for these two events, i.e., we will increase the slack s∗ on these two OPT edges by O(η) when the respective bad events happen. This gives E[s∗(e∗)] = O(η²) for each
OPT edge e∗ . As we will see, this simple step will reduce the number of near-min-cuts of z that
we need to worry about satisfying to O(n).
Next, we consider the set of near-min-cuts of z that are crossed on at most one side. Partition
these into maximal connected components of crossing cuts. Each such component corresponds
to an interval along the OPT cycle and, by definition, these intervals form a laminar family.
A single connected component C of at least two crossing cuts is called a polygon. We prove
the following structural theorem about the polygons induced by z:
Theorem 3.3 (Polygons look like cycles (Informal version of Theorem 4.9)). Given a connected
component C of near-min-cuts of z that are crossed on one side, consider the coarsest partition of vertices
of the OPT cycle into a sequence a1 , . . . , am−1 of sets called atoms (together with a0 which is the set of
vertices not contained in any cut of C ). Then
• Every cut in C is the union of some number of consecutive atoms in a1 , . . . , am−1 .
Remarks: The reduction that we sketched above only uses the fact that µ is an arbitrary distri-
bution of spanning trees with marginals x and not necessarily a maximum-entropy distribution.
We also observe that to prove Theorem 1.1, we crucially used that 28η ≪ ǫ. This forces
us to take η very small, which is why we get only a “very slightly” improved approximation
algorithm for TSP. Furthermore, since we use OPT edges in our construction, we don’t get a
6 Roughly, this corresponds to the definition of the polygon being left-happy.
new upper bound on the integrality gap. We leave it as an open problem to find a reduction
to the “cactus” case that doesn’t involve using a slack vector for OPT (or a completely different
approach).
me,u + me,v = xe ,
and (c) the fractional value of edges in δ→ (u) := δ(u) r δ↑ (u) matched to edges in δ↑ (u) is equal
to xu . That is, for each u ∈ S, ∑f∈δ→ (u) mf,u = xu .
The plan is for e ∈ S to be tasked with part of the responsibility for fixing the cuts δ(u) and
δ(v) when they are odd and edges going higher are reduced. Specifically, se is increased to
compensate for an me,u fraction of the reductions in edges in δ↑ (u) when δ(u) T is odd. (And
7 For example, in Fig. 2, p( a, c) = u3 , and ( a, c) is a bottom edge.
Unfortunately, we do not currently have a good handle on the parity of δ↑ (u) T conditioned on
g reduced. However, we can use the following simple but crucial property: Since x(δ(S)) = 2,
by Lemma 2.23, T consists of two independent trees, one on S and one on V r S, each with the
corresponding marginals of x. Therefore, we can write
P [ δ(u) T even| g reduced] ≥ min(P [(δ→ (u)) T even] , P [(δ→ (u)) T odd]).
If we could prove (11) for every edge f in the support of x, that would complete the proof that
the expected cost of the min O-join for a random spanning tree T ∼ µ is at most (1/2 − ǫ)OPT.
Remark: Throughout this paper, we repeatedly use a mild generalization of the above "inde-
pendent trees fact": that if S is a cut with x(δ(S)) ≤ 2 + ǫ, then ST is very likely to be a tree.
Conditioned on this fact, marginals inside S and outside S are nearly preserved and the trees
inside S and outside S are sampled independently (see Lemma 2.23).
Ideal reduction: In the example, we were able to show that P [ δ(u) T odd| g reduced] was bounded
away from 1 for every edge g ∈ δ↑ (u), and this is how we proved that the expected reduction for
each edge was greater than the expected increase on each edge, yielding negative expected slack.
This motivates the following definition: A reduction for an edge g is k-ideal if, conditioned on
g reduced, every cut S that is in the top k levels of cuts containing g is odd with probability that
is bounded away from 1.
Moving away from an idealized setting: In Example 3.4, we oversimplified in four ways:
(a) We assumed that it would be possible to show that each top edge is good. That is, that its
top two cuts are even simultaneously with constant probability.
(b) We considered only top edge bundles (i.e., edges whose top cuts were inside a degree cut).
(d) We assumed the existence of a nice matching between edges whose top two cuts were
children of S and the edges in δ(S).
Our proof needs to address all four anomalies that result from deviating from these assumptions.
Figure 3: An Example with Bad Edges. A feasible solution of (1) is shown; dashed edges have fraction 1/2 and solid edges have fraction 1. Writing the LP solution restricted to E = E0 r {e0} as a maximum entropy distribution µ, we get the following: edges (a, b), (c, d) must be completely negatively correlated (and independent of all other edges). So, (b, u0), (a, u0) are also completely negatively correlated. This implies (a, b) is a bad edge.
Bad edges. Consider first (a). Unfortunately, it is not the case that all top edges are good.
Indeed, some are bad. However, it turns out that bad edges are rare in the following senses: First,
for an edge to be bad, it must be a half edge, where we say that an edge e is a half edge if
xe ∈ 1/2 ± ǫ1/2 for a suitably chosen constant ǫ1/2 . Second, of any two half edge bundles sharing
a common endpoint in the hierarchy, at least one is good. For example, in Fig. 3, ( a, u0 ) and
(b, u0 ) are good half-edge bundles. We advise the reader to ignore half edges in the first reading
of the paper. Correspondingly, we note that our proofs would be much simpler if half-edge
bundles never showed up in the hierarchy. It may not be a coincidence that half edges are hard
to deal with, as it is conjectured that TSP instances with half-integral LP solutions are the hardest
to round [SWZ12; SWZ13].
Our solution is to never reduce bad edges. But this in turn poses two problems. First, it means
that we need to address the possibility that the bad edges constitute most of the cost of the LP
Figure 4: In the triangle u corresponding to the cut δ(a1 ∪ a2), when A_T and B_T are odd, all 3 cuts δ(a1)_T, δ(a2)_T and δ(a1 ∪ a2)_T = δ(u)_T are even (since f_T is always 1). (Recall also that the edges in the bundle e must have one endpoint in {a1 ∪ a2} and one endpoint in {a3 ∪ a4}, as was the case, e.g., for the edge (a, c) in Fig. 2.)
solution. Second, our objective is to get negative expected slack on each good edge and non-
positive expected slack on bad edges. Therefore, if we never reduce bad edges, we can’t increase
them either, which means that the responsibility for fixing an odd cut with reduced edges going
higher will have to be split amongst fewer edges (the incident good ones).
We deal with the first problem by showing that in every cut u in the hierarchy at least 3/4 of
the fractional mass in δ(u) is good and these edges suffice to compensate for reductions on the
edges going higher. Moreover, because there are sufficiently many good edges incident to each
cut, we can show that either using the slack vector {se } gives us a low-cost O-join, or we can
average it out with another O-join solution concentrated on bad edges to obtain a reduced cost
matching of odd degree vertices.
We deal with the second problem by proving Lemma 6.2, which guarantees a matching be-
tween good edge bundles e = (u, v) and fractions me,u , me,v of edges in δ↑ (u), δ↑ (v) such that,
roughly, me,u + me,v = (1 + O(ǫ1/2 )) xe .
Dealing with triangles. Turning to (b), consider a triangle cut S, for example δ(a1 ∪ a2) in Fig. 4. Recall that in a triangle, we can assume that there is an edge of fractional value 1 connecting a1 and a2 in the tree, and this is why we defined the cut to be happy when A_T and B_T are odd: this guarantees that all 3 cuts defined by the triangle (δ(a1), δ(a2), δ(a1 ∪ a2)) are even.
Now suppose that e = (u, v) is a top edge bundle, where u and v are both triangles, as
shown in Fig. 4. Then we’d like to reduce se when both cuts u and v are happy. But this would
require more than simply both cuts being even. This would require all of A T , BT , A′T , BT′ to be
odd. Note that if, for whatever reason, e is reduced only when δ(u1 ) T and δ(u2 ) T are both even,
then it could be, for example, that this only happens when A T and BT are both even. In this case,
both δ( a1 ) T and δ( a2 ) T will be odd with probability 1 (recalling that fT = 1), which would then
necessitate an increase in sf whenever e is reduced. In other words, the reduction will not even
be 1-ideal.
It turns out to be easier for us to get a 1-ideal reduction rule for e as follows: Say that e is
2-1-1 happy with respect to u if δ(u) T is even and both A′T , BT′ are odd. We reduce e with probability
p/2 when it is 2-1-1 happy with respect to u and with probability p/2 when it is 2-1-1 happy
with respect to v. This means that when e is reduced, half of the time no increase in sf is needed
since u is happy. Similarly for v.
The 2-1-1 criterion for reduction introduces a new kind of bad edge: a half edge that is good,
but not 2-1-1 good. We are able to show that non-half-edge bundles are 2-1-1 good (Lemmas 5.22
and 5.23), and that if there are two half edges which are both in A or are both in B, then at
least one of them is 2-1-1 good (Lemma 5.25). Finally, we show that if there are two half edges,
where one is in A and the other is in B, and neither is 2-1-1 good, then we can apply a different
reduction criterion that we call 2-2-2 good. When the latter applies, we are guaranteed to decrease
both of the half edge bundles simultaneously. All together, the various considerations discussed
in this paragraph force us to come up with a relatively more complicated set of rules under
which we reduce se for a top edge bundle e whose children are triangle cuts. Section 5 focuses
on developing the relevant probabilistic statements.
Thus, so long as 2τq < β − ǫ, we get the expected reduction in sf that we seek.
The discussion so far suggests that we need to take τ smaller than β/2q, which is β/2 if q
is 1, for example. On the other hand, if τ = β/2, then when a top edge needs to fix a cut due
to reductions on bottom edges, we have the opposite problem – their expected increase will be
greater than their expected reduction, and we are back to square one.
Coming to our aid is the second key idea, already discussed in Section 1.2.3. We reduce
bottom edges only when A T = BT = 1 and the marginals of edges in A, B are approximately
preserved (conditioned on A T = BT = 1). This allows us to get much stronger upper bounds on
the probability that a lower cut a bottom edge is on is odd, given that the bottom edge is reduced,
and enables us to show that bottom edge reduction is ∞-ideal.
It turns out that the combined effects of (a) choosing τ = 0.571β, and (b) getting better bounds
on the probability that a lower cut is even given that a bottom edge is reduced, suffice to deal
with the interaction between the reductions and the increases in slack for top and bottom edges.
Example 3.5. [Bottom-bottom case] To see how preserving marginals helps us handle the inter-
action between bottom edges at consecutive levels, consider a triangle cut a1′ = {a1 , a2 } whose
parent cut Ŝ = {a1′ , a2′ } is also a triangle cut (as shown in Fig. 5). Let’s analyze E [ sf ] where
f = ( a1 , a2 ). Observe first that A→ ∪ B→ is a bottom edge bundle in the triangle Ŝ and all edges
in this bundle are reduced simultaneously when  T = B̂T = 1 and marginals of all edges in  ∪ B̂
are approximately preserved. (For the purposes of this overview, we’ll assume they are preserved
exactly). Let x( A↑ ) = α. Then since A = A↑ ∪ A→ and x( A) = 1, we have x( A→ ) = 1 − α. More-
over, since  = A↑ ∪ B↑ and x( Â) = 1, we also have x( B↑ ) = 1 − α and x( B→ ) = α.
Figure 5: Setting of Example 3.5. Note that the set A = δ( a1 ) ∩ δ( a1′ ) decomposes into two sets of
edges, A↑ , those that are also in δ(S), and the rest, which we call A→ . Similarly for B.
Therefore, using the fact that when A→ ∪ B→ is reduced, exactly one edge in A↑ ∪ B↑ is
selected (and also exactly one edge in A→ ∪ B→ is selected since it is a bottom edge bundle), and
marginals are preserved given the reduction, we conclude that
Now, we calculate E [ sf ]. First, note that f may have to increase to compensate either for reduced
edges in A↑ ∪ B ↑ or in A→ ∪ B→ . For the sake of this discussion, suppose that A↑ ∪ B↑ is a set of
top edges. Then, in the worst case we need to increase f by pτ in expectation to fix the cuts a1 , a2
due to the reduction in A↑ ∪ B↑ . Now, we calculate the expected increase due to the reduction in
A→ ∪ B→ . The crucial observation is that edges in A→ ∪ B→ are reduced simultaneously, so both
cuts δ( a1 ) and δ( a2 ) can be fixed simultaneously by an increase in sf . Therefore, when they are
both odd, it suffices for f to increase by
max{ x( A→ ), x( B→ )} β = max{α, 1 − α} β,
8
= pβ(−1 + + 0.571) = −0.13pβ.
Dealing with x_u close to 1.⁸ Now, suppose that e = (u, v) is a top edge bundle with x_u := x(δ↑(u)) close to 1. Then, the analysis in Example 3.4, bounding r := P[δ(u)_T odd | g reduced] away from 1 for an edge g ∈ δ↑(u), doesn't hold. To address this, we consider two cases: The first case is that the edges in δ↑(u) break up into many groups that end at different levels in the hierarchy. In this case, we can analyze r separately for the edges that end at any given
8 Some portions of this discussion might be easier to understand after reading the rest of the paper.
level, taking advantage of the independence between the trees chosen at different levels of the
hierarchy.
The second case is when nearly all of the edges in δ↑ (u) end at the same level, for example,
they are all in δ→ (u′ ) where p(u′ ) is a degree cut. In this case, we introduce a more complex
(2-1-1) reduction rule for these edges. The observation is that from the perspective of these edges
u′ is a "pseudo-triangle". That is, it looks like a triangle cut, with atoms u and u′ r u where
δ(u) ∩ δ(u′ ) corresponds to the “A”-side of the triangle.
Now, we define this more complex 2-1-1 reduction rule: Consider a top edge f = (u′, v′) ∈ δ→(u′). So far, we only considered the following reduction rule for f: If both u′, v′ are degree cuts, f reduces when they are both even in the tree; otherwise, if say u′ is a triangle cut, f reduces when it is 2-1-1 good w.r.t. u′ (and similarly for v′). But clearly these rules ignore the pseudo triangle.
The simplest adjustment is, if u′ is a pseudo triangle with partition (u, u′ r u), to require f to
reduce when A T = BT = 1 and v′ is happy. However, as stated, it is not clear that the sets A and
B are well-defined. For example, u′ could be an actual triangle or there could be multiple ways
to see u′ as a pseudo triangle only one of which is (u, u′ r u). Our solution is to find the smallest
disjoint pair of cuts a, b ⊂ u′ in the hierarchy such that x(δ( a) ∩ δ(u′ )), x(δ(b) ∩ δ(u′ )) ≥ 1 − ǫ1/1 ,
where ǫ1/1 is a fixed universal constant, and then let A = δ( a) ∩ δ(u′ ), B = δ(b) ∩ δ(u′ ) and C =
δ(u′ ) r A r B (see Fig. 6 for an example). Then, we say f is 2-1-1 happy w.r.t., u′ if A T = BT = 1
and CT = 0.
A few observations are in order:
• Since u is a candidate for, say, a, it must be that a is a descendant of u in the hierarchy (or equal to u). In addition, b cannot simultaneously be inside u, since a ∩ b = ∅ and x(δ(u) ∩ δ(u′)) ≤ 1 by Lemma 2.7. So, when f is 2-1-1 happy w.r.t. u′ we get (δ(u) ∩ δ(u′))_T = 1.
• If u′ = (X, Y) is an actual triangle cut, then we must have a ⊆ X, b ⊆ Y. So, when f is 2-1-1 happy w.r.t. u′, we know that u′ is a happy triangle, i.e., (δ(X) ∩ δ(u′))_T = 1 and (δ(Y) ∩ δ(u′))_T = 1.
Now, suppose for simplicity that all top edges in δ(u′) are 2-1-1 good w.r.t. u′. Then, when an edge g ∈ δ(u) ∩ δ(u′) is reduced, (δ(u) ∩ δ(u′))_T = 1, so
P[δ(u)_T odd | g reduced] ≤ P[E(u, u′ r u)_T even | g reduced] ≤ 0.57,
since edges in E(u, u′ r u) are in the tree independent of the reduction and E[E(u, u′ r u)_T] ≈ 1.
Dealing with xu close to 0 and the matching. We already discussed how the matching is
modified to handle the existence of bad edges. We now observe that we can handle the case
xu ≈ 0 by further modifying the matching. The key observation is that in this case, x(δ→ (u)) ≫
x(δ↑ (u)). Roughly speaking, this enables us to find a matching in which each edge in δ→ (u) has
to increase about half as much as would normally be expected to fix the cut of u. This eliminates
the need to prove a nontrivial bound on P [δ(u) T odd| g reduced]. The details of the matching
are in Section 6.
Figure 6: Part of the hierarchy of the graph is shown on top. Edges of the same color have the same fraction and ǫ ≫ η is a small constant. u1 corresponds to the degree cut {a1, a2, a3}, u2 corresponds to the triangle cut {u1, a4}, and u corresponds to the degree cut containing all of the vertices shown. Observe that edges in δ↑(a1) are top edges in the degree cut u. If ǫ < ǫ_{1/1}/2, then the (A, B, C)-degree partitioning of edges in δ(u2) is as follows: A = δ(a1) ∩ δ(u2) are the blue highlighted edges, each of fractional value 1/2 − ǫ; B = δ(a4) ∩ δ(u2) are the green highlighted edges of total fractional value 1; and C are the red highlighted edges, each of fractional value ǫ. The cuts that contain edge (a1, c1) are highlighted in the hierarchy at the bottom.
We write E∗ to denote the edges of OPT and we write e∗ to denote an edge of OPT. Analogously, we use s∗ : E∗ → R≥0 to denote the slack vector that we will construct for OPT edges.
Throughout this section we study η-near minimum cuts of G = (V, E, z). Note that these cuts are 2η-near minimum cuts w.r.t. x. For every such near minimum cut (S, S), we identify the cut with the side, say S, such that u0, v0 ∉ S. Equivalently, we can identify these cuts with an interval along the optimum cycle, OPT, that does not contain u0, v0.
We will use “left" synonymously with “clockwise" and “right" synonymously with “counter-
clockwise." We say a vertex is to the left of another vertex if it is to the left of that vertex and to
the right of edge e0 = (u0 , v0 ). Otherwise, we say it is to the right (including the root itself in this
case).
Definition 4.1 (Crossed on the Left/Right, Crossed on Both Sides). For two crossing near minimum cuts S, S′, we say S crosses S′ on the left if the leftmost endpoint of S on the optimal cycle is to the left of the leftmost endpoint of S′. Otherwise, we say S crosses S′ on the right.
A near minimum cut is crossed on both sides if it is crossed on both the left and the right. We also say a near minimum cut is crossed on one side if it is either crossed on the left or on the right, but not both.
Theorem 4.2. Let OPT be a TSP tour with set of edges E∗, let x0 be a feasible LP solution of (1) with support E0 = E ∪ {e0}, and let x be x0 restricted to E. For any distribution µ of spanning trees with marginals x, if η < 1/100, then there is a random vector s∗ : E∗ → R≥0 (the randomness in s∗ depends exclusively on T ∼ µ) such that
• For any vector s : E → R where s_e ≥ −x_e η/8 for all e, and for any η-near minimum cut S w.r.t. z = (x + OPT)/2 crossed on both sides where δ(S)_T is odd, we have s(δ(S)) + s∗(δ(S)) ≥ 0;
Figure 7: An OPT edge e∗ = (u, v) together with the cuts L(e∗) and R(e∗).
For an OPT edge e∗ = (u, v), let L(e∗ ) be the largest η-near minimum cut (w.r.t. z) containing
u and not v which is crossed on both sides. Let R(e∗ ) be the largest near minimum cut containing
v and not u which is crossed on both sides. For example, see Fig. 7.
Definition 4.3. For a near minimum cut S that is crossed on both sides, let S_L be the near minimum cut crossing S on the left which minimizes the intersection with S, and similarly for S_R; if there are multiple sets crossing S on the left with the same minimum intersection, choose the smallest one to be S_L (and do similarly for S_R).
We partition δ(S) into three sets δ(S) L , δ(S) R and δ(S)O as in Fig. 8 such that
δ(S) L = E(S ∩ S L , S L r S)
δ(S) R = E(S ∩ SR , SR r S)
δ ( S )O = δ ( S ) r ( δ ( S ) L ∪ δ ( S ) R )
Figure 8: S is crossed on the left by S L and on the right by SR . In green are edges in δ(S) L , in
blue edges in δ(S) R , and in red are edges in δ(S)O .
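Given S, S_L, and S_R as vertex sets, this partition is straightforward to compute; a minimal sketch (on an ad hoc toy instance, not one of the paper's figures) is below.

```python
def delta(S, edges):
    S = set(S)
    return {e for e in edges if (e[0] in S) != (e[1] in S)}

def partition_delta(S, S_L, S_R, edges):
    # delta(S)_L = E(S ∩ S_L, S_L \ S),  delta(S)_R = E(S ∩ S_R, S_R \ S),
    # delta(S)_O = the remaining edges of delta(S).
    S, S_L, S_R = set(S), set(S_L), set(S_R)
    between = lambda A, B: {e for e in edges
                            if (e[0] in A and e[1] in B) or (e[0] in B and e[1] in A)}
    dL = between(S & S_L, S_L - S)
    dR = between(S & S_R, S_R - S)
    return dL, dR, delta(S, edges) - dL - dR

# Toy instance: a 6-cycle plus the chord (2, 5); S = {2, 3} crossed by S_L = {1, 2}, S_R = {3, 4}.
edges = [(i, (i + 1) % 6) for i in range(6)] + [(2, 5)]
print(partition_delta({2, 3}, {1, 2}, {3, 4}, edges))
# ({(1, 2)}, {(3, 4)}, {(2, 5)})
```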
For an OPT edge e∗ define an (increase) event (of second type) I2 (e∗ ) as the event that at least
one of the following does not hold.
In the proof of Theorem 4.2 we will increase an OPT edge e∗ whenever I2 (e∗ ) occurs.
Proof. Fix e∗. To simplify notation we abbreviate L(e∗), R(e∗) to L, R. Since L is crossed on both sides, L_L, L_R are well defined. Since by Lemma 2.5 L_L ∩ L, L_L r L are 4η-near min cuts and L is a 2η-near min cut with respect to x, by Corollary 2.24, P[|T ∩ δ(L)_L| = 1] ≥ 1 − 5η. Similarly, P[|T ∩ δ(R)_L| = 1] ≥ 1 − 5η. On the other hand, since L, L_L, L_R are 2η-near min cuts, by Lemma 2.6, x(E(L ∩ L_R, L_R)), x(E(L ∩ L_L, L_L)) ≥ 1 − η. Therefore, combining these bounds via a union bound over the events defining I2(e∗) yields the claim.
Lemma 4.5. Let S be a cut which is crossed on both sides and let e∗L , e∗R be the OPT edges on its interval
where e∗L is the edge further clockwise. Then, if δ(S) T 6= 2, at least one of I2 (e∗L ), I2 (e∗R ) occurs.
Proof. We prove by contradiction. Suppose none of I2 (e∗L ), I2 (e∗R ) occur; we will show that this
implies δ(S) T = 2.
Let R = R(e∗L ); note that S is a candidate for R(e∗L ), so S ⊆ R. Therefore, S L = R L and we
have
δ( R) L = E( R ∩ R L , R L r R) = E( R ∩ S L , S L r R) = δ(S) L .
where we used S ∩ S L = R ∩ S L and that S L r S = S L r R. Similarly let L = L(e∗R ), and, we have
δ( L) R = δ(S) R .
Figure 9: Setting of Lemma 4.5. Here we zoom in on a portion of the optimal cycle and assume
the root is not shown. If I2 (e∗L ) does not occur then E(S ∩ S L , S L r S) T = 1.
Now, since I2(e∗_L) has not occurred, 1 = |T ∩ δ(R)_L| = |T ∩ δ(S)_L|, and since I2(e∗_R) has not occurred, 1 = |T ∩ δ(L)_R| = |T ∩ δ(S)_R|, where L = L(e∗_R). So, to get δ(S)_T = 2, it remains to show that T ∩ δ(S)_O = ∅. Consider any edge e = (u, v) ∈ δ(S)_O where u ∈ S. We need to show e ∉ T. Assume that v is to the left of S (the other case can be proven similarly). Then e ∈ δ(R). So, since e goes to the left of R, either e ∈ E(R ∩ R_L, R_L r R) or e ∈ δ(R)_O. But since e ∉ δ(S)_L = δ(R)_L, we must have e ∈ δ(R)_O. So, since I2(e∗_L) has not occurred, e ∉ T, as desired.
Proof of Theorem 4.2. For any OPT edge e∗, whenever I2(e∗) occurs, define s∗e∗ = η/3.9. Then, by Lemma 4.4, E[s∗e∗] ≤ 18η²/3.9, and for any 2η-near min cut S (w.r.t. x) that is crossed on both sides, if δ(S) T is odd then, by Lemma 4.5, at least one of I2(e∗L), I2(e∗R) occurs, so s(δ(S)) + s∗(δ(S)) ≥ 0.
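(To verify the last inequality, a routine check that is not verbatim from the source: since S is a 2η-near min cut w.r.t. x, s(δ(S)) ≥ −(η/8) x(δ(S)) ≥ −(η/8)(2 + 2η), while s∗(δ(S)) ≥ η/3.9 whenever one of the two increase events occurs, because e∗L, e∗R ∈ δ(S). Hence s(δ(S)) + s∗(δ(S)) ≥ η(1/3.9 − (2 + 2η)/8) > 0 for η < 1/100.)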
Theorem 4.6. Let x0 be a feasible solution of LP (1) with support E0 = E ∪ {e0 } and x be x0 restricted to E.
Let µ be the max entropy distribution with marginals x. For η ≤ 10−12 , there is a set Eg ⊂ E r δ({u0 , v0 })
of good edges and two functions s : E0 → R and s∗ : E∗ → R ≥0 (as functions of T ∼ µ) such that
(ii) For each η-near-min-cut S w.r.t. z, if δ(S) T is odd, then s(δ(S)) + s∗ (δ(S)) ≥ 0.
(iii) We have E[se] ≤ −ǫP ηxe for all edges e ∈ Eg and E[s∗e∗] ≤ 28η² for all OPT edges e∗ ∈ E∗, for ǫP defined in (33).
(iv) For every η-near minimum cut S of z crossed on (at most) one side such that S 6= V r {u0 , v0 },
x(δ(S) ∩ Eg ) ≥ 3/4.
Before proving this theorem we use it to prove the main technical theorem from the previous
section.
Theorem 3.1 (Main Technical Theorem). Let x0 be a solution of LP (1) with support E0 = E ∪ {e0}, and let x be x0 restricted to E. Let z := (x + OPT)/2, η ≤ 10−12, and let µ be the max-entropy distribution with marginals x. Also, let E∗ denote the support of OPT. There are two functions s : E0 → R and s∗ : E∗ → R≥0 (as functions of T ∼ µ) such that
iii) For every OPT edge e∗, E[s∗e∗] ≤ 28η², and for every LP edge e ≠ e0, E[se] ≤ −(5/16) xe ǫP η for ǫP = 3.9 · 10−17 (defined in (33)).
Proof of Theorem 3.1. Let Eg be the good edges defined in Theorem 4.6 and let Eb := E r Eg be
the set of bad edges; in particular, note all edges in δ({u0 , v0 }) are bad edges. We define a new
vector s̃ : E ∪ {e0} → R as follows:
            ∞                          if e = e0,
s̃(e) ←     −xe (η/10)(1 − 2η)          if e ∈ Eb,            (13)
            xe η/6                      otherwise.
Let s̃∗ be the vector s∗ from Theorem 4.2. We claim that for any η-near minimum cut S such that
δ(S) T is odd, we have
s̃(δ(S)) + s̃∗ (δ(S)) ≥ 0.
To check this, note that by (iv) of Theorem 4.6, for every set S ≠ V r {u0, v0} crossed on at most one side we have x(Eg ∩ δ(S)) ≥ 3/4, so
s̃(δ(S)) + s̃∗(δ(S)) ≥ s̃(δ(S)) = (η/6) x(Eg ∩ δ(S)) − (η/10)(1 − 2η) x(Eb ∩ δ(S)) ≥ 0.     (14)
For S = V r {u0 , v0 }, we have δ(S) T = δ(u0 ) T + δ(v0 ) T = 2 with probability 1, so condition
ii) is satisfied for these cuts as well. Finally, consider cuts S which are crossed on both sides. By
Theorem 4.2,
s̃(δ(S)) + s̃∗ (δ(S)) ≥ 0 (15)
since s̃e ≥ −ηxe /10 ≥ −ηxe /8 for all e.
Now, we are ready to define s, s∗ . Let ŝ, ŝ∗ be the s, s∗ of Theorem 4.6 respectively. Define
s = γs̃ + (1 − γ)ŝ and similarly define s∗ = γs̃∗ + (1 − γ)ŝ∗ for some γ that we choose later. We
prove all three conclusions for s, s∗ . (i) follows by (i) of Theorem 4.6 and Eq. (13). (ii) follows by (ii)
of Theorem 4.6 and Eq. (14) above. It remains to verify (iii). For any OPT edge e∗ , E [ s∗e∗ ] ≤ 28η 2
by (iii) of Theorem 4.6 and the construction of s̃∗ . On the other hand, by (iii) of Theorem 4.6 and
Eq. (13),
E[se] ≤ xe (γη/6 − (1 − γ)ǫP η)     for all e ∈ Eg,
E[se] = −xe γ (η/10)(1 − 2η)        for all e ∈ Eb.
Setting γ = (15/4)ǫP we get E[se] ≤ −(5/16) ǫP ηxe for e ∈ Eg and E[se] ≤ −(5/16) xe ηǫP for e ∈ Eb, as desired.
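(To spell out the arithmetic, a routine check not verbatim from the source: for e ∈ Eg, xe(γη/6 − (1 − γ)ǫP η) = xe ηǫP (15/24 − 1 + (15/4)ǫP) ≤ −(3/8 − (15/4)ǫP) xe ηǫP ≤ −(5/16) xe ηǫP, and for e ∈ Eb, −xe γ (η/10)(1 − 2η) = −(3/8)(1 − 2η) xe ηǫP ≤ −(5/16) xe ηǫP since η ≤ 10−12.)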
4.3 Structure of Polygons of Cuts Crossed on One Side
Definition 4.7 (Connected Component of Crossing Cuts). Given a family of cuts crossed on at most one side, construct a graph where two cuts are connected by an edge if they cross, and partition this graph into maximal connected components. We call a path in this graph a path of crossing cuts.
In the rest of this section we will focus on a single connected component C of cuts crossed on
(at most) one side.
Definition 4.8 (Polygon). For a connected component C of crossing near min cuts that are crossed on
one side, let a0 , . . . , am−1 be the coarsest partition of the vertices V , such that for all 0 ≤ i ≤ m − 1 and
for any A ∈ C either ai ⊆ A or ai ∩ A = ∅. These are called atoms. We assume a0 is the atom that
contains the special edge e0 , and we call it the root. Note that for any A ∈ C , a0 ∩ A = ∅.
Since every cut A ∈ C corresponds to an interval of vertices in V in the optimum Hamiltonian cycle,
we can arrange a0 , . . . , am−1 around a cycle (in the counter clockwise order). We label the arcs in this cycle
from 1 to m, where i + 1 is the arc connecting ai and ai+1 (and m is the name of the arc connecting am−1
and a0 ). Then every cut A ∈ C can be identified by the two arcs surrounding its atoms. Specifically, A is
identified with arcs i, j (where i < j) if A contains atoms ai , . . . , a j−1 , and we write ℓ( A) = i, r( A) = j.
Note that A does not contain the root a0 .
By construction for every arc 1 ≤ i ≤ m, there exists a cut A such that ℓ( A) = i or r( A) = i.
Furthermore, A, B ∈ C (with ℓ( A) ≤ ℓ( B)) cross iff ℓ( A) < ℓ( B) < r( A) < r( B).
See Fig. 10 for a visual example.
Notice that every atom of a polygon is an interval of the optimal cycle. In this section, we
prove the following structural theorem about polygons of near minimum cuts crossed on one
side.
Theorem 4.9 (Polygon Structure). For ǫη ≥ 14η and any polygon with atoms a0, . . . , am−1 (where a0 is the root) the following holds:
• For all adjacent atoms ai , ai+1 (also including a0 , am−1 ), we have x( E( ai , ai+1 )) ≥ 1 − ǫη .
Figure 10: An example of a polygon with contracted atoms. In black are the cuts in the left
polygon hierarchy, in red the cuts in the right polygon hierarchy. OPT edges around the cycle are
shown in green. Here R1 is an ancestor of R2 , however it is not a strict ancestor of R2 since they
have the same right endpoint. L1 is a strict ancestor and the strict parent of L3 . By Theorem 4.9,
every edge in the bottom picture represents a set of LP edges of total fraction at least 1 − ǫη .
Fact 4.11. If A, B are in the same hierarchy and they are not ancestors of each other, then A ∩ B = ∅.
Proof. If A ∩ B ≠ ∅ then, since neither is an ancestor of the other, A and B cross; but then they cannot both be open on the same side, a contradiction.
This fact immediately implies that the cuts in each of the left (and right) hierarchies form a laminar family.
Lemma 4.12. For A, B ∈ R where B is a strict parent of A, there exists a cut C ∈ L that crosses both
A, B. Similarly, if A, B ∈ L and B is a strict parent of A, there exists a cut C ∈ R that crosses A, B.
Proof. Since we have a connected component of near min cuts, there exists a path of crossing cuts
from A to B. Let P = ( A = C0 , C1 , . . . , Ck = B) be the shortest such path. We need to show that
k = 2.
First, since C1 crosses C0 and C0 is open on the right, we have
Let I be the closed interval [ℓ(C1), r(C0)]. Note that Ck = B has an endpoint that does not belong to I. Let Ci be the first cut in the path with an endpoint not in I (note that i > 1). This means
Ci−1 ⊆ I; so, since Ci−1 crosses Ci , exactly one of the endpoints of Ci is strictly inside I. We
consider two cases:
Case 1: r(Ci) > r(C0). In this case, Ci must be crossed on the left (by Ci−1), so Ci ∈ R and it does not cross C0. So, C0 ⊊ Ci and
where the first inequality uses that the left endpoint of Ci is strictly inside I. Therefore, C1 crosses
both of C0 , Ci , and Ci is a strict ancestor of A = C0 . If Ci = B we are done, otherwise, A ⊆ B ⊆ Ci ,
but since C1 crosses both A and Ci , it also crosses B and we are done.
Case 2: ℓ(Ci ) < ℓ(C1 ). In this case, Ci must be crossed on the right (by Ci−1 ) and Ci ∈ L and
it does not cross C1 . So, we must have
where the second inequality uses that the right endpoint of Ci is strictly inside I. But, this implies
that Ci also crosses C0 . So, we can obtain a shorter path by excluding all cuts C1 , . . . , Ci−1 and
that is a contradiction.
Lemma 4.13. Let A, B ∈ R such that A ∩ B = ∅, i.e., they are not ancestors of each other. Then, they
have a common ancestor, i.e., there exists a set C ∈ R such that A, B ⊆ C.
Proof. WLOG assume r( A) ≤ ℓ( B). Let C be the highest ancestor of A in the hierarchy, i.e., C
has no ancestor. For the sake of contradiction suppose B ∩ C = ∅ (otherwise, C is an ancestor
of B and we are done). So, r(C ) ≤ ℓ( B). Consider the path of crossing cuts from C to B, say
C = C0 , . . . , Ck = B.
Let Ci be the first cut in this path such that r(Ci ) > r(C0 ). Note that such a cut always exists
as r( B) > r(C ). Since Ci−1 crosses Ci and r(Ci−1 ) ≤ r(C0 ), Ci−1 crosses Ci on the left and Ci
is open on the right. We show that Ci is an ancestor of C = C0 and we get a contradiction to
C0 having no ancestors (in R). If ℓ(C0 ) < ℓ(Ci ), then Ci crosses C0 on the right and that is a
contradiction. So, we must have C0 ⊆ Ci , i.e., Ci is an ancestor of C0 .
It follows from the above lemma that each of the left and right hierarchies has a unique cut with no ancestors.
Lemma 4.14. If A is a cut in R such that r( A) < m, then A has a strict ancestor. And, similarly, if
A ∈ L satisfies ℓ( A) > 1, then it has a strict ancestor.
Proof. Fix a cut A ∈ R. If there is a cut B ∈ R such that r(B) > r(A), then either B is a strict ancestor of A, in which case we are done, or A ∩ B = ∅; but then by Lemma 4.13 A, B have a common ancestor C, and C must be a strict ancestor of A, so again we are done.
Now suppose that r(R) ≤ r(A) for every R ∈ R. Then there must be a cut B ∈ L such that r(B) > r(A) (otherwise we would have fewer than m atoms in our polygon). The cut B must be crossed on the right by a cut C ∈ R. But then we must have r(C) > r(B) > r(A), which is a contradiction.
Proof. Let A ∈ L and B ∈ R be the unique cuts in the left/right hierarchy with no ancestors.
Note that A and B are crossing (because there is a cut C that crosses A on the right, and B is
an ancestor of C). Therefore, since A, B are both 2 + 2η near min cuts, by Lemma 2.5, A ∪ B is a
2 + 4η near min cut.
Proof. Here we prove x( E( a0 , a1 )) ≥ 1 − 2η. One can prove x( E( a0 , am−1 )) ≥ 1 − 2η similarly. Let
A ∈ L and B ∈ R be the unique cuts in the left/right hierarchy with no ancestors. First, observe
that if ℓ( B) = 2, then since A, B are crossing, by Lemma 2.6 we have
x( E( A r B, A ∪ B)) = x( E( a1 , a0 )) ≥ 1 − η.
as desired.
By definition of atoms, there exists a cut C ∈ C such that either ℓ(C ) = 2 or r(C ) = 2; but if
r(C ) = 2 we must have ℓ(C ) = 1 in which case C cannot be crossed, so this does not happen. So,
we must have ℓ(C ) = 2. If C ∈ R, then since C is a descendent of B, we must have ℓ( B) = 2, and
we are done by the previous paragraph.
Otherwise, suppose C ∈ L. We claim that B crosses C. This is because C is crossed on the right by some cut B′ and B is an ancestor of B′, so B ∩ C ≠ ∅ and C ⊈ B since ℓ(B) > 2.
Therefore, by Lemma 2.5 B ∪ C is a 2 + 4η near min cut. Since A crosses B ∪ C, by Lemma 2.6 we
have
x( E( A r ( B ∪ C ), A ∪ B ∪ C)) = x( E( a1 , a0 )) ≥ 1 − 2η
as desired.
Lemma 4.18. For any pair of atoms ai , ai+1 where 1 ≤ i ≤ m − 2 we have x(δ({ai , ai+1 })) ≤ 2 + 12η,
so x( E( ai , ai+1 )) ≥ 1 − 6η.
Proof. We prove the following claim: There exists j ≤ i such that x(δ({a j , . . . , ai+1 })) ≤ 2 + 6η.
Then, by a similar argument we can find j′ ≥ i + 1 such that x(δ({ai , . . . , a j′ })) ≤ 2 + 6η. By
Lemma 2.5 it follows that x(δ({ai , ai+1 })) ≤ 2 + 12η. Since x(δ( ai )), x(δ( ai+1 )) ≥ 2, we have
But due to the bound on x(δ({ai , ai+1 })) we must have x( E( ai , ai+1 )) ≥ 1 − 6η as desired.
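(For concreteness, a routine identity not verbatim from the source: 2x(E(ai, ai+1)) = x(δ(ai)) + x(δ(ai+1)) − x(δ({ai, ai+1})) ≥ 2 + 2 − (2 + 12η) = 2 − 12η, which gives x(E(ai, ai+1)) ≥ 1 − 6η.)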
It remains to prove the claim. First, observe that there is a cut A separating ai+1 , ai+2 (Note
that if i + 1 = m − 1 then ai+2 = a0 ); so, either ℓ( A) = i + 2 or r( A) = i + 2. If r( A) = i + 2 then,
A is the cut we are looking for and we are done. So, assume ℓ( A) = i + 2.
Case 1: A ∈ L. Let L ∈ L be the strict parent of A. If ℓ(L) ≤ i then we are done (since there is a cut R ∈ R crossing A, L on the right, so L r (A ∪ R) is the cut that we want). If ℓ(L) = i + 1, then let L′ be the strict parent of L. Then, there is a cut R ∈ R crossing A, L and a cut R′ crossing L, L′. First, since both R, R′ cross L (on the right) they have a non-empty intersection, so one of them, say R′, is an ancestor of the other (R) and therefore R′ must intersect A. On the other hand, since R′ crosses L and ℓ(L) = i + 1, ℓ(R′) ≥ i + 2 = ℓ(A). Since R′ intersects A, either they cross, or A ⊆ R′, so we must have x(δ(A ∪ R)) ≤ 2 + 4η. Finally, since R′ crosses L′ (on the right) we have x(δ(L′ r (A ∪ R))) ≤ 2 + 6η and L′ r (A ∪ R) is our desired set.
Case 2: A ∈ R. We know that A is crossed on the left by, say, L ∈ L. If ℓ( L) ≤ i, we are done,
since then L r A is the cut that we seek and we get x(δ( L r A)) ≤ 2 + 4η.
Suppose then that ℓ(L) = i + 1. Let L′ be the strict parent of L, which must have ℓ(L′) ≤ i. If L′ crosses A, then L′ r A is the cut we seek and we get x(δ(L′ r A)) ≤ 2 + 4η.
Finally, if L′ doesn't cross A, i.e., r(A) ≤ r(L′), then consider the cut R ∈ R that crosses L and L′ on the right. Since r(L) < r(A), and A is not crossed on the right, it must be that ℓ(R) = i + 2. In this case, L′ r R is the cut we want, and we get x(δ(L′ r R)) ≤ 2 + 4η.
Lemma 4.19 (Atoms are Near Minimum Cuts). For any 1 ≤ i ≤ m − 1, we have x(δ( ai )) ≤ 2 + 14η.
Proof. By Lemma 4.18, x(δ({ai, ai+1})) ≤ 2 + 12η (note that in the special case i = m − 1 we take the pair ai−1, ai). There must be a 2η-near minimum cut C (w.r.t. x) separating ai from ai+1. Then either ai = C ∩ {ai, ai+1} or ai = {ai, ai+1} r C. In either case, we get x(δ(ai)) ≤ 2 + 14η by Lemma 2.5.
Note that by Theorem 4.9, x( A), x( B) ≥ 1 − ǫη and x(C ) ≤ ǫη where ǫη = 14η is defined in
Theorem 4.9.
Definition 4.21 (Leftmost and Rightmost cuts). Let u be a polygon with atoms a0, . . . , am−1 and arcs labelled 1, . . . , m, corresponding to a connected component C of η-near minimum cuts (w.r.t. z). We call
any cut C ∈ C with ℓ(C ) = 1 a leftmost cut of u and any cut C ∈ C with r(C ) = m a rightmost cut of
u. We also call a1 the leftmost atom of u (resp. am−1 the rightmost atom).
Observe that by Corollary 4.15, any cut that is not a leftmost or a rightmost cut has a strict
ancestor.
Definition 4.22 (Happy Polygon). Let u be a polygon with polygon partition A, B, C. For a spanning tree T, we say that u is happy if
A T and BT are odd and CT = 0.
We say that u is left-happy if A T is odd and CT = 0, and right-happy if BT is odd and CT = 0.
Definition 4.23 (Relevant Cuts). Given a polygon u corresponding to a connected component C of cuts
crossed on one side with atoms a0 , . . . , am−1 , define a family of relevant cuts
C ′ = C ∪ {ai : 1 ≤ i ≤ m − 1, z(δ( ai )) ≤ 2 + η }.
Note that atoms of u are always ǫη /2-near minimum cuts w.r.t., z but not necessarily η-near
minimum cuts. The following theorem is the main result of this section.
Theorem 4.24 (Happy Polygons and Cuts Crossed on One Side). Let G = (V, E, x), where x is an LP solution, and let z = (x + OPT)/2. For a connected component C of near minimum cuts of z, let u be the polygon with atoms a0, a1, . . . , am−1 and polygon partition A, B, C. For µ an arbitrary distribution of spanning trees with marginals x, there is a random vector s∗ : E∗ → R≥0 (as a function of T ∼ µ) such that for any vector s : E → R where se ≥ −ηxe/8 for all e ∈ E the following holds:
• If u is happy, then for any cut S ∈ C′, if δ(S) T is odd we have s(δ(S)) + s∗(δ(S)) ≥ 0.
• If u is left-happy, then for any S ∈ C′ that is not a rightmost cut or the rightmost atom, if δ(S) T is odd, then we have s(δ(S)) + s∗(δ(S)) ≥ 0. Similarly, if u is right-happy, then for any cut S ∈ C′ that is not a leftmost cut or the leftmost atom, the same inequality holds.
• E[s∗e∗] ≤ 23η² for every OPT edge e∗.
Lemma 4.25 (Triangles as Degenerate Polygons). Let S = X ∪ Y where X, Y, S are ǫη-near min cuts (w.r.t. x) and each of these sets is a contiguous interval around the OPT cycle. Then, viewing X as a1, Y as a2, and V r (X ∪ Y) as a0, the above theorem holds with S viewed as a degenerate polygon.
Lemma 4.26. For every cut A ∈ C that is not a leftmost or a rightmost cut, P [ δ( A) T = 2] ≥ 1 − 22η.
Proof. Assume A ∈ R; the other case can be proven similarly. Let B be the strict parent of A. By
Lemma 4.12 there is a cut C ∈ L which crosses A, B on their left. It follows by Lemma 2.5 that C r
A, C ∩ A are 4η near minimum cuts (w.r.t., x). So, by Corollary 2.24, P [ E( A ∩ C, C r A) T = 1] ≥
1 − 5η. On the other hand, B r ( A ∪ C ) is a 6η near minimum cut and A r C, B r C are 4η near
min cuts (w.r.t., x). So, by Corollary 2.24 P [ E( A r C, B r ( A ∪ C )) T = 1] ≥ 1 − 7η.
Finally, by Lemma 2.6, x(E(A ∩ C, C r A)), x(E(A r C, B r (A ∪ C))) ≥ 1 − 3η. Since A is a 2η-near min cut (w.r.t. x), all remaining edges have fractional value at most 8η, so with probability 1 − 8η, T does not choose any of them. Taking a union bound over all of these events,
P [ δ( A) T = 2] ≥ 1 − 22η.
Lemma 4.27. For any atom ai ∈ C ′ that is not the leftmost or the rightmost atom we have
P [δ( ai ) T = 2] ≥ 1 − 42η.
Proof. By Lemma 4.18, x(δ({ai, ai+1})) ≤ 2 + 12η, and by Lemma 4.19, x(δ(ai+1)) ≤ 2 + 14η (also recall that by the assumption of the lemma, x(δ(ai)) ≤ 2 + 2η). Therefore, by Corollary 2.24,
where the second inequality holds similarly. Also, by Lemma 4.18, x(E(ai−1, ai)), x(E(ai, ai+1)) ≥ 1 − 6η. Since x(δ(ai)) ≤ 2 + 2η, the total fraction of edges from ai to atoms other than ai−1 and ai+1 is at most 14η. So,
Finally, by the union bound all events occur with probability at least 1 − 42η.
Let e1∗ , . . . , e∗m be the OPT edges mapped to the arcs 1, . . . , m of the component C respectively.
Lemma 4.28. There is a mapping9 of cuts in C ′ to OPT edges e2∗ , . . . e∗m−1 such that each OPT edge has
at most 4 cuts mapped to it, and every atom of the polygon in C ′ gets mapped to two (not necessarily
distinct) OPT edges.
to e∗i , where recall ℓ( A) is the OPT edge leaving A on the left side and r( A) the OPT edge leaving
on the right. By construction, each OPT edge gets at most two cuts mapped to it.
Furthermore, we claim every cut A ∈ C′R gets mapped to at least one OPT edge. For the sake of contradiction, let A ∈ C′R be a cut that is not mapped to any OPT edge. First note that a1 is mapped to edge e∗2 (in both hierarchies) and am−1 is mapped to edge e∗m−1. Otherwise, if A ∈ R, ℓ(A) ≠ 1. Furthermore, if A ∈ R and r(A) = m, then A is definitely the largest cut with left endpoint ℓ(A). So assume 1 < ℓ(A) and r(A) < m. Let B = argmax_{B∈C′R : ℓ(B)=ℓ(A)} |B| and let C = argmax_{C∈C′R : r(C)=r(A)} |C|. Since A is not mapped to any OPT edge but B, C are mapped by the above definition, we must have B, C ≠ A. But that implies A ⊊ B, C. And this means B, C cross; but this is a contradiction with R being a laminar family.
E( L, a0 ∪ L) T = 1
Note that, by definition, if a leftmost cut L is happy and u is left-happy then L is even, i.e., δ(L) T = 2. Similarly, a1 is even if it is happy and u is left-happy.
Lemma 4.30. For every leftmost or rightmost cut A in u that is an η-near min cut w.r.t. z, P [ A happy] ≥
1 − 10η, and for the leftmost atom a1 (resp. rightmost atom am−1 ), if it is an η-near min cut then
P [ a1 happy] ≥ 1 − 24η (resp. P [ am−1 happy] ≥ 1 − 24η).
9 Each cut will be mapped to one or two OPT edges.
Proof. Recall that if A is a η-near min cut w.r.t. z then it is a 2η-near min cut w.r.t. x. Also, recall
for a cut L ∈ L, L R is the near minimum cut crossing L on the right that minimizes the intersection
(see Definition 4.3). We prove this for the leftmost cuts and the leftmost atom; the other case can
be proven similarly. Consider a cut L ∈ L. Since by Lemma 2.5 L R ∩ L, L R r L are 4η near
min cuts (w.r.t., x) and L R is a 2η near min cut, by Corollary 2.24, P [ E( L R ∩ L, L R r L) T = 1] ≥
1 − 5η. On the other hand, by Lemma 2.6, x( E( L R ∩ L, L R r L)) ≥ 1 − η, and by Lemma 4.17,
x( E( L, a0 )) ≥ 1 − 2η. It follows that
x(δ(L) r E(L R ∩ L, L R r L) r E(L, a0)) ≤ 5η,
x(E(a1, a3 ∪ · · · ∪ am−1)) ≤ 2 + 2η − (1 − 6η) − (1 − 2η) ≤ 10η.
Observe, a1 is happy when both of these events occur; so, by the union bound, P [ a1 happy] ≥
1 − 24η as desired.
Proof of Theorem 4.24. Consider an OPT edge e∗i for 1 < i < m. For the at most four cuts mapped
to e∗i in Lemma 4.28, we define the following three events:
i) A leftmost cut assigned to e∗i is not happy. (Equivalently, a leftmost cut L ∈ L ∩ C ′ with
r( L) = i is not happy.)
ii) A rightmost cut assigned to e∗i is not happy. (Equivalently, a rightmost cut R ∈ R ∩ C ′ with
l ( R) = i is not happy.10 )
If I1(e∗i) does not occur we set s∗e∗i = 0. First, observe that for any non-atom cut S ∈ C′ that is not a leftmost or a rightmost cut, if δ(S) T is odd, then if e∗i is the OPT edge that S is mapped to, it satisfies s∗e∗i = η/3.9, so
for η < 1/100. The same inequality holds for non-leftmost/rightmost atom cuts a ∈ C ′ which are
doubly-mapped to e∗i . For non-leftmost/rightmost atom cuts a ∈ C ′ which are singly-mapped to
e∗i , a is mapped (possibly even twice) to another edge e∗j (note j = i − 1 or i + 1), and in this case
s∗ (e∗i ) + s∗ (e∗j ) ≥ η/3.9, and again the above inequality holds.
Now, suppose a leftmost cut S ∈ L ∩ C′ with r(S) = i has δ(S) T odd. If u is not left-happy there is nothing to prove. If u is left-happy, then S must not be happy (as otherwise δ(S) T would be even), so I1(e∗i) occurs, and similarly to the above inequality s(δ(S)) + s∗(δ(S)) ≥ 0. The
same holds for rightmost cuts and the leftmost/rightmost atoms in C ′ (note leftmost/rightmost
atoms are always doubly-mapped: a1 to e2∗ and am−1 to e∗m−1 ).
It remains to upper bound E [ s∗ (e∗i )] for 1 < i < m. By Lemma 4.28 at most four cuts are
mapped to e∗i . Then, either there is an atom which is doubly-mapped to e∗i or there is not.
First suppose exactly one atom is doubly-mapped to e∗i . Then there are at most three cuts
mapped to e∗i, including that atom. The probability of an event of type (i) or (ii) occurring for the leftmost or rightmost atom is at most 24η by Lemma 4.30. Atoms which are not leftmost or rightmost are even with probability at least 1 − 42η by Lemma 4.27. Therefore, in the worst
case, the doubly-mapped atom is not leftmost or rightmost. For the remaining two cuts, leftmost
and rightmost cuts are happy with probability at least 1 − 10η by Lemma 4.30, and (non-atom)
non leftmost/rightmost cuts are even with probability at least 1 − 22η by Lemma 4.26. Therefore
in the worst case the remaining two (non-atom) cuts mapped to e∗i are not leftmost/rightmost.
Therefore, if an atom is doubly-mapped to e∗i ,
Otherwise, any atoms mapped to e∗i are singly-mapped. In this case, if only an atom cut
is odd/unhappy, we set s∗(e∗i) = η/(2 · 3.9). The probability of an event of type (i) or (ii) occurring for the leftmost or rightmost atom is at most 24η by Lemma 4.30, so we can bound the contribution of this event to E[s∗(e∗i)] by 24η²/(2 · 3.9). Atoms which are not leftmost or
rightmost are even with probability at least 1 − 42η by Lemma 4.27, and so we can bound their
contribution by 42η 2 /(2 · 3.9). Therefore, in the worst case four non-leftmost/rightmost non-atom
cuts are mapped to e∗i , in which case,
as desired.
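(As a sketch of the final estimate, not verbatim from the source: in this last case P[I1(e∗i)] ≤ 4 · 22η by a union bound over the four cuts, so E[s∗(e∗i)] ≤ (η/3.9) · 88η = 88η²/3.9 ≤ 23η², matching the bound stated in Theorem 4.24.)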
• A = E(S, u1 ), B = E(um−1 , S) satisfy x( A), x( B) ≥ 1 − ǫη .
Theorem 4.33 (Main Payment Theorem). For an LP solution x0, let x be x0 restricted to E, and let H be a hierarchy for some ǫη ≤ 10−10. Then the maximum entropy distribution µ with marginals x satisfies the following:
i) There is a set of good edges Eg ⊆ E r δ({u0 , v0 }) such that any bottom edge e is in Eg and for any
(non-root) S ∈ H such that p(S) is a degree cut, we have x( Eg ∩ δ(S)) ≥ 3/4.
ii) There is a random vector s : Eg → R (as a function of T ∼ µ) such that for all e, se ≥ − xe η/8
(with probability 1), and
iii) If a polygon cut u with polygon partition A, B, C is not left happy, then for any set F ⊆ E with
p(e) = u for all e ∈ F and x( F ) ≥ 1 − ǫη /2, we have
s( A) + s( F ) + s− (C ) ≥ 0,
where s− (C ) = ∑e∈C min{se , 0}. A similar inequality holds if u is not right happy.
iv) For every cut S ∈ H such that p(S) is not a polygon cut, if δ(S) T is odd, then s(δ(S)) ≥ 0.
v) For a good edge e ∈ Eg , E [ se ] ≤ −ǫP ηxe (see Eq. (33) for definition of ǫP ) .
The above theorem is the main part of the paper in which we use that µ is an SR distribution. See Section 7 for the proof. We use this theorem to construct a random vector s such that, essentially, for all cuts S ∈ H in the hierarchy z/2 + s is feasible; furthermore, for a large fraction of “good” edges we have that E[se] is negative and bounded away from 0.
11 in the sense of the number of vertices that it contains
As we will see in the following subsection, using part (iii) of the theorem we will be able to
show that every leftmost and rightmost cut of any polygon is satisfied.
In the rest of this section we use the above theorem to prove Theorem 4.6. We start by
explaining how to construct H. Given the vector z = ( x + OPT )/2 run the following procedure
on the OPT cycle with the family of η-near minimum cuts of z that are crossed on at most one
side:
For every connected component C of η-near minimum cuts (w.r.t. z) crossed on at most one side, if |C| = 1 then add the unique cut in C to the hierarchy. Otherwise, C corresponds to a polygon u with atoms a0, . . . , am−1 (for some m > 3). Add a1, . . . , am−1 12 and a1 ∪ · · · ∪ am−1 to H. Note that since z({u0, v0}) = 2, the root of the hierarchy is always V r {u0, v0}.
Now, we name every cut in the hierarchy. For a cut S, if there is a connected component of at least two cuts with union equal to S, then call S a polygon cut with the A, B, C partitioning as defined in Definition 5.18. If S is a cut with exactly two children X, Y in the hierarchy, then also call S a polygon cut13, with A = E(X, V r (X ∪ Y)), B = E(Y, V r (X ∪ Y)) and C = ∅. Otherwise, call S a degree cut.
Proof. First observe that whenever |C| = 1 the unique cut in C is a 2η-near min cut (w.r.t. x) which is not crossed. For a polygon cut S in the hierarchy, by Lemma 4.16, the set S is an ǫη-near min cut w.r.t. x. If S is an atom of a polygon, then by Lemma 4.19 S is an ǫη-near min cut.
Now, it remains to show that for a polygon cut S we have a valid ordering u1, . . . , uk of the cuts in A(S). If S is a non-triangle polygon cut, then u1, . . . , uk are exactly the atoms of the polygon of S, and x(A), x(B) ≥ 1 − ǫη, x(C) ≤ ǫη and x(E(ui, ui+1)) ≥ 1 − ǫη follow by Theorem 4.9. For
a triangle cut S = X ∪ Y because S, X, Y are ǫη -near min cuts (by the previous paragraph), we
get x( A), x( B) ≥ 1 − ǫη as desired, by Lemma 2.7. Finally, since x(δ( X )), x(δ(Y )) ≥ 2 we have
x( E( X, Y )) ≥ 1 − ǫη .
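(For concreteness, a routine identity not verbatim from the source: 2x(E(X, Y)) = x(δ(X)) + x(δ(Y)) − x(δ(S)) ≥ 2 + 2 − (2 + ǫη) = 2 − ǫη, so x(E(X, Y)) ≥ 1 − ǫη/2 ≥ 1 − ǫη.)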
Observation 4.35. Each cut S ∈ H corresponds to a contiguous interval around the OPT cycle. For a polygon u (or a triangle) with atoms a0, . . . , am−1 for m ≥ 3, we say an OPT edge e∗ is interior to u if e∗ ∈ E∗(ai, ai+1) for some 1 ≤ i ≤ m − 2. Any OPT edge e∗ is interior to at most one polygon.
Theorem 4.6. Let x0 be a feasible solution of LP (1) with support E0 = E ∪ {e0 } and x be x0 restricted to E.
Let µ be the max entropy distribution with marginals x. For η ≤ 10−12 , there is a set Eg ⊂ E r δ({u0 , v0 })
of good edges and two functions s : E0 → R and s∗ : E∗ → R ≥0 (as functions of T ∼ µ) such that
(ii) For each η-near-min-cut S w.r.t. z, if δ(S) T is odd, then s(δ(S)) + s∗ (δ(S)) ≥ 0.
(iii) We have E[se] ≤ −ǫP ηxe for all edges e ∈ Eg and E[s∗e∗] ≤ 28η² for all OPT edges e∗ ∈ E∗, for ǫP defined in (33).
12 Notice that an atom may already correspond to a connected component; in such a case we do not add it in this step.
13 Think of such a set as a degenerate polygon with atoms a1 := X, a2 := Y, a0 := V r (X ∪ Y). So, for the rest of this section we call them triangles, and in later sections we just think of them as polygon cuts.
(iv) For every η-near minimum cut S of z crossed on (at most) one side such that S 6= V r {u0 , v0 },
x(δ(S) ∩ Eg ) ≥ 3/4.
Proof. Let Eg , s be as defined in Theorem 4.33, and let se0 = ∞. Also, let s∗ be the sum of the
s∗ vectors from Theorem 4.2 and Theorem 4.24. (i) follows from (ii) of Theorem 4.33. E[s∗e∗] ≤ 28η² follows from Theorem 4.2 and Theorem 4.24 and the fact that every OPT edge is interior to at most one polygon. Also, E[se] ≤ −ǫP ηxe for edges e ∈ Eg follows from (v) of Theorem 4.33.
Now, we verify (iv): For any (non-root) cut S ∈ H such that p(S) is not a polygon cut
x(δ(S) ∩ Eg ) ≥ 3/4 by (i) of Theorem 4.33. The only remaining η-near minimum cuts are sets S
which are either atoms or near minimum cuts in the component C corresponding to a polygon u.
So, by Lemma 2.7, x(δ(S) ∩ δ(u)) ≤ 1 + ǫη. By (i) of Theorem 4.33 all edges in δ(S) r δ(u) are in Eg. Therefore, x(δ(S) ∩ Eg) ≥ 1 − ǫη ≥ 3/4.
It remains to verify (ii): We consider 5 groups of cuts:
Type 1: Near minimum cuts S such that e0 ∈ δ(S). Then, since se0 = ∞, s(δ(S)) + s∗ (δ(S)) ≥
0.
Type 2: Near minimum cuts S ∈ H where p(S) is not a polygon cut. By (iv) of Theorem 4.33
and that s∗ ≥ 0 the inequality follows.
Type 3: Near minimum cuts S crossed on both sides. Then, the inequality follows by Theorem 4.2 and the fact that se ≥ −ηxe/8 for all e ∈ E.
Type 4: Near minimum cuts S that are crossed on one side (and not in H) or S ∈ H and p(S)
is a (non-triangle) polygon cut. In this case S must be an atom or a η-near minimum cut (w.r.t.,
z) in some polygon u ∈ H. If S is not a leftmost cut/atom or a rightmost cut/atom, then the
inequality follows by Theorem 4.24. Otherwise, say S is a leftmost cut. If u is left-happy then by
Theorem 4.24 the inequality is satisfied. Otherwise, for F = δ(S) r δ(u), by Lemma 2.7, we have
x( F ) ≥ 1 − ǫη /2. Therefore, by (iii) of Theorem 4.33 we have
s(δ(S)) + s∗ (δ(S)) ≥ s( A) + s( F ) + s− (C ) ≥ 0
as desired. Note that since S is a leftmost cut, we always have A ⊆ δ(S). But C may have
an unpredictable intersection with δ(S); in particular, in the worst case only edges of C with
negative slack belong to δ(S). A similar argument holds when S is the leftmost atom or a
rightmost cut/atom.
Type 5: Near min cut S is the leftmost atom or the rightmost atom of a triangle u. This is
similar to the previous case except we use Lemma 4.25 to argue that the inequality is satisfied
when u is left happy.
Definition 4.36 (Edge Bundles, Top Edges, and Bottom Edges). For every degree cut S and every
pair of atoms u, v ∈ A(S), we define a top edge bundle f = (u, v) such that
Note that in the above definition, u′ , v′ are actual vertices of G.
For every polygon cut S, we define the bottom edge bundle f = {e : p(e) = S}.
We will always use bold letters to distinguish top edge bundles from actual LP edges. Also,
we abuse notation and write xe := ∑ f ∈e x f to denote the total fractional value of all edges in this
bundle.
In the rest of the paper, unless otherwise specified, we work with edge bundles and sometimes
we just call them edges.
For any u ∈ H with p(u) = S we write
δ↑(u) := δ(u) ∩ δ(S),
δ→(u) := δ(u) r δ(S),
E→(S) := {e = (ui, uj) : ui, uj ∈ A(S), ui ≠ uj}.
Also, for a set of edges A ⊆ δ(u) we write A→ , A↑ to denote A ∩ δ→ (u), A ∩ δ↑ (u) respectively.
Note that E→ (S) ⊆ E(S) includes only edges between atoms of S and not all edges between
vertices in S.
5 Probabilistic statements
5.1 Gurvits’ Machinery and Generalizations
The following is the main result of this subsection.
it follows that
P[∀i : Ai = ni] ≥ f(ǫ) · P[A1 + · · · + Am = n1 + · · · + nm],
where f(ǫ) ≥ ǫ^{2^m} ∏_{k=2}^{m} 1/(max{nk, n1 + · · · + nk−1} + 1).
We remark that in applications of the above statement, it is enough to know that for any set S ⊆ [m], ∑i∈S ni − 1 < E[∑i∈S Ai] < ∑i∈S ni + 1. This is because, by Lemma 2.21, we can then prove a lower bound on the probability that ∑i∈S Ai = ∑i∈S ni.
We also remark the above lower bound of f (ǫ) is not tight; in particular, we expect the
dependency on m should only be exponential (not doubly exponential). We leave it as an open
problem to find a tight lower bound on f (ǫ).
Proof. Let E be the event A1 + · · · + Am = n1 + · · · + nm .
P[∀ 1 ≤ i ≤ m : Ai = ni] = P[E] · P[Am = nm | E] · P[Am−1 = nm−1 | Am = nm, E] · · · P[A2 = n2 | A3 = n3, . . . , Am = nm, E].
So, to prove the statement, it is enough to prove that for any 2 ≤ k ≤ m,
P[Ak = nk | Ak+1 = nk+1, . . . , Am = nm, E] ≥ ǫ^{2^{m−k+1}} · 1/(max{nk, n1 + · · · + nk−1} + 1).     (16)
By the following Claim 5.2,
P[Ak ≥ nk | Ak+1 = nk+1, . . . , Am = nm, E] ≥ ǫ^{2^{m−k+1}},
P[Ak ≤ nk | Ak+1 = nk+1, . . . , Am = nm, E] ≥ ǫ^{2^{m−k+1}}.
So, (16) simply follows by Lemma 5.3. Now we prove this claim.
Claim 5.2. Let [k] := {1, . . . , k}. For any 2 ≤ k ≤ m and any set S ⊊ [k],
P[∑i∈S Ai ≥ ∑i∈S ni | Ak+1 = nk+1, . . . , Am = nm, E] ≥ ǫ^{2^{m−k+1}},
P[∑i∈S Ai ≤ ∑i∈S ni | Ak+1 = nk+1, . . . , Am = nm, E] ≥ ǫ^{2^{m−k+1}}.
Proof. We prove this by induction. First, notice that for k = m the statement holds just by the lemma's assumption and Lemma 5.4. Now, suppose the statement holds for k + 1, and fix a set S ⊊ [k]. Define A = ∑i∈S Ai and B = ∑i∈[k]rS Ai, and similarly define nA, nB. By the induction hypothesis,
ǫ^{2^{m−k}} ≤ P[A ≤ nA | Ak+2 = nk+2, . . . , Am = nm, E].
The same statement holds for the events A ≥ nA, B ≤ nB, B ≥ nB, A + B ≥ nA + nB, A + B ≤ nA + nB.
Let Ek+1 be the event Ak+2 = nk+2, . . . , Am = nm, E. Then, by Lemma 5.3, P[A + B = nA + nB | Ek+1] > 0. Therefore, by Lemma 5.4,
P[A ≥ nA | A + B = nA + nB, Ek+1], P[A ≤ nA | A + B = nA + nB, Ek+1] ≥ (ǫ^{2^{m−k}})² = ǫ^{2^{m−k+1}},
as desired. Note that here we are using that A + B = nA + nB together with Ek+1 implies that Ak+1 = nk+1.
Proof. Since µ is SR, the sequence s0 , s1 , . . . , sd where si = P [|S| = i] is log-concave and unimodal.
So, either the mode is in the interval [0, k] or in [k, d]. We assume the former and prove the
lemma; the latter can be proven similarly. First, observe that since sk ≥ sk+1 ≥ · · · ≥ sd , we get
sk ≥ ǫ/(d − k + 1). In the rest of the proof, we show that sk ≥ ǫ(1 − (ǫ/pm )1/k ).
Suppose si is the mode. It follows that there is i ≤ j ≤ k − 1 such that sj+1/sj ≤ (sk/si)^{1/(k−i)}. So, by Eq. (6),
ǫ ≤ sk + · · · + sd ≤ sk / (1 − (sk/si)^{1/(k−i)}).
If sk ≥ pm or sk ≥ ǫ then we are done. Otherwise,
sk ≥ ǫ (1 − (sk/pm)^{1/(k−i)}) ≥ ǫ (1 − (ǫ/pm)^{1/k}).
Lemma 5.4. Given a strongly Rayleigh distribution µ : 2[n] → R ≥0 , let A, B be two (nonnegative)
random variables corresponding to the number of elements sampled from two disjoint sets such that
P [ A + B = n] > 0 where n = n A + n B . Then,
P [ A ≥ n A | A + B = n] = P [ B ≤ n B | A + B = n] ≥ P [ A ≥ n A ] P [ B ≤ n B ] , (17)
P [ A ≤ n A | A + B = n] = P [ B ≥ n B | A + B = n] ≥ P [ A ≤ n A ] P [ B ≥ n B ] . (18)
Proof. We prove the second statement. The first one can be proven similarly. First, notice
P [ A ≤ n A , A + B ≥ n] + P [ B ≥ n B , A + B < n]
=P [ B ≥ n B , A ≤ n A , A + B ≥ n ] + P [ A ≤ n A , B ≥ n B , A + B < n ]
=P [ B ≥ n B , A ≤ n A ] ≥ P [ B ≥ n B ] P [ A ≤ n A ] =: α,
where the last inequality follows by negative association. Say q = P [ A + B ≥ n]. From above,
either P [ A ≤ n A , A + B ≥ n] ≥ αq or P [ B ≥ n B , A + B < n] ≥ α(1 − q). In the former case, we
get P [ A ≤ n A | A + B ≥ n] ≥ α and in the latter we get P [ B ≥ n B | A + B < n] ≥ α. Now the
lemma follows by the stochastic dominance property
P [ A ≤ n A | A + B = n] ≥ P [ A ≤ n A | A + B ≥ n]
P [ B ≥ n B | A + B = n] ≥ P [ B ≥ n B | A + B < n]
Note that in the special case that A + B < n never happens, the lemma holds trivially.
where ǫ = ǫ1ǫ2 and pm ≤ max_{0≤k≤nA+nB} P[A = k | A + B = nA + nB] is a lower bound on the mode of A.
For nA = 1, nB = 1, if P[A = 1 | A + B = 2] ≤ ǫ, since the distribution of A is unimodal, we get pm ≥ 1 − 2ǫ. Therefore, if ǫ ≤ 1/3,
P[A = 1 | A + B = 2] ≥ max{ǫ/2, ǫ(1 − ǫ/(1 − 2ǫ))}.
Proposition 5.6. Let µ : 2E → R ≥0 be a homogeneous SR distribution. For any 300ǫ < ζ < 0.003
and disjoint sets A, B ⊆ E such that 1 − ǫ ≤ E [ A T ] , E [ BT ] ≤ 1 + ǫ (where T ∼ µ) there is an event
E A,B ( T ) such that P [E A,B ( T )] ≥ 0.002ζ 2 (1 − ζ/3 − ǫ) and it satisfies the following three properties.
i) P [ A T = BT = 1|E A,B ( T )] = 1,
In other words, under event E A,B which has a constant probability, A T = BT = 1 and the
marginals of all edges in A, B are preserved up to total variation distance ζ. We also remark that
above statement holds for a much larger value of ζ at the expense of a smaller lower bound on
P [E A,B ( T )].
Before proving the above statement we prove the following lemma.
Lemma 5.7. Let µ : 2E → R≥0 be a homogeneous SR distribution. Let A, B ⊆ E be two disjoint sets such that 1 − ǫ ≤ E[A T], E[BT] ≤ 1 + ǫ (where T ∼ µ), and let A′ ⊂ A and B′ ⊆ B with E[A′T + B′T] ≥ 1 + α for some α > 100ǫ. If α < 0.001, we have
• Case 1: E ν [ A′T + BT′ ] > 1.5. Since E ν [ A′T + BT′ ] ≤ 2 + 2ǫ, by Lemma 2.21, P ν [ A′T + BT′ = 2] ≥
0.25. Furthermore, by ,
Therefore, by Corollary 5.5 and using α ≤ 0.001, P [ A′T = 1| A′T + BT′ = 2] ≥ 0.45α2 . It
follows that
• Case 2: Eν[A′T + B′T] ≤ 1.5. Since Eν[A′T + B′T] ≥ 1 + α, by Lemma 2.21, P[A′T + B′T = 2] ≥ αe^{−α} ≥ 0.99α. But now E[A′T], E[B′T] ≤ 1.5 and therefore, by Markov's inequality,
Pν[A′T ≤ 1], Pν[B′T ≤ 1] ≥ 0.25.
as desired.
It is worth noting that the α³ dependency is necessary in the above lemma. For an explicit example, consider the following product distribution (which is strongly Rayleigh):
(αx1 + (1 − α)y2)(αy1 + (1 − α)z2)(αz1 + (1 − α)x2),
and let A = { x1 , x2 }, B′ = B = {y1 , y2 }, and A′ = { x1 }. Observe that
P A = B = A′ = B′ = 1 = P [ x1 = 1, y1 = 1, z1 = 1] = α3 .
Proof of Proposition 5.6. To prove the lemma, we construct an instance of the max-flow, min-cut
problem. Consider the following graph with vertex set {s, A, B, t}. For any e ∈ A, f ∈ B connect
e to f with a directed edge of capacity ye, f = P [e, f ∈ T | A T = BT = 1]. For any e ∈ E, let
xe := P [e ∈ T ]. Connect s to e ∈ A with an arc of capacity βxe and similarly connect f ∈ B to t
with arc of capacity βx f , where β is a parameter that we choose later. We claim that the min-cut
of this graph is at least β(1 − ǫ − ζ/3). Assuming this, we can prove the lemma as follows: let
z be the maximum flow, where ze, f is the flow on the edge from e to f . We define the event
E A,B ( T ) = E ( T ) to be the union of events ze, f . More precisely, conditioned on A T = BT = 1 the
events e, f ∈ T | A T = BT = 1 are disjoint for different pairs e ∈ A, f ∈ B, so we know that we
have a specific e, f in the tree T with probability ye, f . And, of course, ∑e∈ A, f ∈ B ye, f = 1. So, for
e ∈ A, f ∈ B we include a ze, f measure of trees, T, such that A T = BT = 1, e, f ∈ T. First, observe
that
P[E] = (∑_{e∈A, f∈B} ze,f) · P[A T = BT = 1] ≥ β(1 − ζ/3 − ǫ) P[A T = BT = 1].     (20)
Part (i) of the proposition follows from the definition of E . Now, we check part (ii): Say z =
∑_{e∈A, f∈B} ze,f, and the flow into e is ze. Then,
∑_{e∈A} |xe − P[e ∈ T | E]| = ∑_{e∈A} |xe − ∑_f ze,f/z| = ∑_{e∈A} |xe − ze/z|.
Note that both x and ze/z define a probability distribution on edges in A; so the RHS is just the total variation distance between these two distributions. We can write
∑_{e∈A} |xe − P[e ∈ T | E]| = 2 ∑_{e∈A : ze/z > xe} (ze/z − xe)
    ≤ 2 ∑_{e∈A : ze/z > xe} (βxe/(β(1 − ζ/3 − ǫ)) − xe)
    ≤ 2 ∑_e xe (ζ/3 + ǫ)/(1 − ζ/3 − ǫ) ≤ 2(1 + ǫ)(ζ/3 + ǫ)/(1 − ζ/3 − ǫ) ≤ ζ.
The first inequality uses that the max-flow is at least β(1 − ζ/3 − ǫ) and that the incoming flow of e is at most βxe, and the last inequality follows by ζ < 1/20 and ǫ < ζ/300. (iii) can be checked similarly.
It remains to lower-bound the max-flow, or equivalently the min-cut. Consider an s-t cut S, i.e., assume s ∈ S and t ∉ S. Define SA = A ∩ S and SB = B ∩ S. We write
and we are done. Otherwise, say x(SB ) + γ = x(S A ), for some γ > ζ/3. So,
as desired.
Definition 5.8 (Max-flow Event). For a polygon cut S ∈ H with polygon partition A, B, C, let ν be the max-entropy distribution conditioned on S being a tree and CT = 0. By Lemma 2.23, we can write ν = νS × νG/S, where νS is supported on trees in E(S) and νG/S on trees in E(G/S). For a sample (TS, TG/S) ∼ νS × νG/S, we say ES occurs if EA,B(TG/S) occurs, where EA,B(·) is the event defined in Proposition 5.6 for sets A, B with ζ = ǫM := 1/4000 and ǫ = 2ǫη.
Corollary 5.9. For a polygon cut S ∈ H with polygon partition A, B, C, we have,
i) P[ES] ≥ 0.001ǫM².
ii) For any set F ⊆ δ(S) conditioned on ES marginals of edges in F are preserved up to ǫ M + ǫη in
total variation distance.
This proves (ii). Also observe that since conditioned on ES , we choose at most one edge of
F ∩ δ(S), ( F ∩ δ(S)) T is a BS(q G/S ) for some q G/S = x( F ) ± (ǫ M + ǫη ).
On the other hand, observe that conditioned on ES , S is a tree, so
Since the distribution of ( F ∩ E(S)) T under ν|ES is SR, there is a random variable BS(qS ) =
( F ∩ E(S)) T where x( F ∩ E(S)) ≤ qS ≤ x( F ∩ E(S)) + ǫη /2.
Finally, FT |ES is exactly BS(qS ) + BS(q G/S ) = BS(q) for q = x( F ) ± (ǫ M + 2ǫη ).
Proof. First, notice by Observation 4.32, δ(u) ∩ δ(S) is either a subset of A, B, or C. Therefore,
we can write δ(u) T | ES as a BS(q) for q ∈ 2 ± 0.001 (where we use that ǫM + 3ǫη < 0.001). Furthermore, since δ(u) T ≠ 0 with probability 1, we can write this as 1 + BS(q − 1). Therefore, by Corollary 2.17,
P[δ(u) T odd | ES] = P[BS(q − 1) even] ≤ (1/2)(1 + e^{−2(q−1)}) ≤ (1/2)(1 + e^{−1.999}) ≤ 0.5678
as desired.
Corollary 5.11. For a polygon cut u ∈ H and a polygon cut S ∈ H that is an ancestor of u,
Proof. Let A, B, C be the polygon partition of u. Recall that for u to be left-happy, we need
CT = 0 and A T odd. Similarly to the previous statement, we can write A T | ES as a BS(qA) for qA ∈ 1 ± 0.00026 (where we used that ǫM = 1/4000 and ǫη ≤ ǫM/300). Therefore, by Corollary 2.17,
P[A T even | ES] ≤ (1/2)(1 + e^{−2qA}) ≤ (1/2)(1 + e^{−1.9997}) ≤ 0.56768.
Finally, E[CT | ES] ≤ x(C) + ǫM + 2ǫη ≤ 0.00026. Now using the union bound,
as desired.
Definition 5.13 (Good Edges). We say a top edge bundle e = (u, v) in a degree cut S ∈ H is (2-2)
good, if one of the following holds:
We say a top edge e is bad otherwise. We say every bottom edge bundle is good (but generally do not refer
to bottom edges as good or bad). We say any edge e that is a neighbor of u0 or v0 is bad.
In the next subsection we will see that for any top edge bundle e = (u, v) which is not a half
edge, P [(δ(u)) T = (δ(v)) T = 2|u, v trees] = Ω(1). The following theorem is the main result of
this subsection:
Theorem 5.14. For ǫ1/2 ≤ 0.0005 and ǫη ≤ ǫ1/2², a top edge bundle e = (u, v) is bad only if the following three conditions hold simultaneously:
• e is a half edge,
The proof of this theorem follows from Lemma 5.16 and Lemma 5.17 below.
In this subsection, we use repeatedly that for any atom u in a degree cut S, x(δ(u)) ≤ 2 + ǫη .
We also repeatedly use that for a half edge bundle e = (u, v) in a degree cut, conditioned on u, v
trees, e is in or out with probability at least 1/2 − ǫ1/2 − 3ǫη > 0.49.
Lemma 5.15. Let e = (u, v) be a good half edge bundle in a degree cut S ∈ H. Let A = δ(u)−e and
B = δ(v)−e . If ǫ1/2 ≤ 0.001 and ǫη < ǫ1/2 /100, then
Proof. Throughout the proof all probabilistic statements are with respect to the measure µ conditioned on u, v being trees. Let p≤2 = P[A T + BT ≤ 2] and similarly define p≥4. Observe that whenever δ(u) T = δ(v) T = 2, we must have A T + BT ≠ 3. Since e is 2-2 good, this event happens with probability at least 3ǫ1/2, i.e.,
p≤2 + p≥4 ≥ 3ǫ1/2.     (21)
By Lemma 2.21, using the fact that p0 = 0, we get p=3 ≥ 1/4.
First, we show that p≤2 ≥ 0.4ǫ1/2. We have
Again, we are using p0 = 0. By log-concavity p=2² ≥ p=3 · p=1, so since p=3 ≥ 1/4, p=1 ≤ 4p=2² ≤ 4p≤2². Therefore,
Finally, since ǫ1/2 < 0.001, plugging this upper bound on p≥4 into Eq. (21) we get p≤2 ≥ 0.4ǫ1/2.
Now, we show p≥4 ≥ 0.4ǫ1/2. Assume p≥4 < ǫ1/2/2 (otherwise we are done). Since p=3 ≥ 1/4, by Lemma 2.18 with γ ≤ (ǫ1/2/2)/(1/4) = 2ǫ1/2,
E[A T + BT | A T + BT ≥ 4] · p≥4 ≤ (4 + 3ǫ1/2) p≥4/(1 − 2ǫ1/2).
Therefore,
3 − 2ǫ1/2 − 2ǫη ≤ E[A T + BT] ≤ 2p≤2 + (4 + 3ǫ1/2) p≥4/(1 − 2ǫ1/2) + 3(1 − p≤2 − p≥4).
So, 1.01p≥4 ≥ p≤2 − 2.02ǫ1/2, where we used ǫ1/2 ≤ 0.001 and ǫη < ǫ1/2/100. Now, p≥4 ≥ 0.4ǫ1/2 follows by Eq. (21).
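(Concretely, a routine check not verbatim from the source: combining 1.01p≥4 ≥ p≤2 − 2.02ǫ1/2 with p≤2 + p≥4 ≥ 3ǫ1/2 from Eq. (21) gives 2.01p≥4 ≥ 0.98ǫ1/2, i.e., p≥4 ≥ 0.48ǫ1/2 ≥ 0.4ǫ1/2.)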
Lemma 5.16. Let e = (u, v) be a half edge bundle in a degree cut S ∈ H, and suppose x(δ↑ (u)) ≥
1/2 + kǫ1/2 . If k ≥ 9 and ǫ1/2 ≤ 0.001, then, e is 2-2 good.
x(δ(W )) = x(δ(S)) + x(δ(u)) − 2x(δ↑ (u)) ≤ 2(2 + ǫη ) − 2(1/2 + kǫ1/2 ) = 3 − 2kǫ1/2 + 2ǫη
So, by Lemma 2.23, P[W is a tree] ≥ 1/2 + kǫ1/2 − ǫη − ǫη. Note that the extra −ǫη comes from the fact that conditioning on u being a tree can decrease the marginals of edges in E(W) by at most ǫη.
Let ν be the measure in which we also condition on W to be a tree. Note that ν is a strongly
Rayleigh distribution on the set of edges in E(W ) ∪ E(u, W ) ∪ E( G/S); this is because ν is a
product of 3 SR distributions each supported on one of the aforementioned sets.
Let X = δ↑(u) T and Y = δ(v) T − 1. Observe that, under ν, X = Y = 1 iff δ(u) T = δ(v) T = 2. Furthermore, Y ≥ 0 with probability 1, since v is connected to the rest of the graph. So, we just need to lower bound Pν[X = Y = 1]. First, notice
E ν [ X ] ∈ [0.5 + kǫ1/2 − ǫη , 1 + ǫη ]
(22)
E ν [Y ] ∈ [0.5 + kǫ1/2 − 4ǫη , 1.5 − kǫ1/2 + 3ǫη ]
Note that using Proposition 5.1, we can immediately argue that Pν[X = Y = 1] ≥ Ω(ǫ1/2). We do the following more refined analysis to make sure that this probability is at least 3ǫ1/2 for ǫ1/2 ≤ 0.0005 and k ≥ 9.
Case 1: P ν [ X + Y = 2] ≥ 0.05. By Lemma 2.22, P ν [ X ≥ 1] P ν [Y ≥ 1] ≥ 1 − e−0.5 ≥ 0.4. On
the other hand, by Markov P ν [ X ≤ 1] , P ν [Y ≤ 1] ≥ 1/4. Therefore, by Corollary 5.5, P ν [ X = 1| X + Y = 2] ≥
0.1(1 − 1/8) = 0.0875. Therefore P ν [ X = 1, Y = 1] ≥ (0.05)(0.087) ≥ 0.004. Finally, removing
the conditioning on S and W being trees, we get P [ X = 1, Y = 1] ≥ (0.5)(0.004) = 0.002 ≥ 3ǫ1/2
since ǫ1/2 ≤ 0.0005.
Case 2: P ν [ X + Y = 2] < 0.05. We know that E ν [ X + Y ] ≤ 2.5; so if it is also at least 1.2, then
P [ X + Y = 2] ≥ 0.05 by Lemma 2.21.
So, from now on assume E ν [ X + Y ] < 1.2. Now, by Lemma 2.21, P [ X + Y = 1] ≥ 0.25. So,
since P ν [ X + Y = 2] < 0.05, by Lemma 2.18 (with γ = 0.2, i = 1, k = 3), P [ X + Y > 2] < 0.02.
On the other hand, by Eq. (22), since E ν [ X + Y ] < 1.2, we have E ν [ X ] , E ν [Y ] ≤ 0.7 (since
each of them is at least 0.5 by (22)). It follows by Lemma 2.22 that P ν [ X ≥ 1] , P ν [Y ≥ 1] ≥
1 − e−0.7 ≥ 0.5. In this case, applying stochastic dominance, we have
P ν [ X ≥ 1| X + Y = 2] ≥ P ν [ X ≥ 1| X + Y ≤ 2]
≥ P ν [X ≥ 1, X + Y ≤ 2]
≥ P ν [X ≥ 1] − P ν [ X + Y > 2] ≥ P ν [X ≥ 1] − 0.02 ≥ 0.48.
Lemma 5.17. Let e = (u, v), f = (v, w) be two half edge bundles in a degree cut S ∈ H. If ǫ1/2 < 0.0005,
then one of e or f is good.
Proof. We use the following notation: V = δ(v)−e−f, U = δ(u)−e, W = δ(w)−f. For a set A of edges and an edge bundle e we write A+e = A ∪ {e}. Furthermore, for a measure ν we write ν−e to denote ν conditioned on e ∉ T.
Condition u, v, w to be trees. This occurs with probability at least 1 − 3ǫη . Let ν be this
measure. By Lemma 2.27, without loss of generality, we can assume
Eν[WT | e ∉ T] ≤ Eν[WT] + 0.405.     (23)
Therefore, by Lemma 2.21, P ν−e [(V+f ) T + UT = 4] ≥ 0.029, where we use the fact that UT ≥ 1
and (V+f ) T ≥ 1 with probability 1 under ν−e and apply this and the remaining calculations to
UT − 1, (V+f ) T − 1. In addition, we have
Therefore,
The lemma follows (i.e., e is 2-2 good) since 0.0018 ≥ 3ǫ1/2 for ǫ1/2 ≤ 0.0005.
Otherwise, if Eν[VT | e ∉ T] ≤ Eν[VT] + 0.03 then we will show that f is 2-2 good. We have,
So, by Lemma 2.22 (with ǫ = 0.15, pm = 0.7), we get P ν+f [WT = 1|(V+e ) T + WT = 2] ≥ 0.11. On
the other hand,
To derive the last inequality, we show P ν+f−e [(V+e ) T + WT = 2] ≥ 0.0582. This is because by
negative association and Eq. (23)
Since (V+e) T + WT is always at least 1, by Theorem 2.15, in the worst case Pν−e+f[(V+e) T + WT = 2] is the probability that the sum of two Bernoullis with success probability 1.94/2 is 1, which is 0.0582.
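(For concreteness: two Bernoullis, each with success probability 1.94/2 = 0.97, sum to 1 with probability 2 · 0.97 · 0.03 = 0.0582.)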
Therefore, similar to the previous case,
occurs.
We say e is 2-1-1 good with respect to u if
Definition 5.20 (2-2-2 Happy/Good). Let e = (u, v), f = (v, w) be top half-edge bundles (with
p(e) = p(f)). We say e, f are 2-2-2 happy (with respect to v) if δ(u) T = δ(v) T = δ(w) T = 2 and u, v, w
are all trees.
We say e, f are 2-2-2 good with respect to v if P [e, f 2-2-2 happy] ≥ p.
We will use the following notation: For a set of edges D, and an edge bundle e, let e( D ) :=
e ∩ D.
The following theorem is the main result of this section.
Theorem 5.21. Let v, S ∈ H where p(v) = S, and let A, B, C be the degree partitioning of δ(v). For p ≥ 0.005ǫ1/2², with ǫ1/2 ≤ 0.0002, ǫ1/1 ≤ ǫ1/2/12 and ǫη ≤ ǫ1/2², at least one of the following is true:
ii) δ→ (v) has at least 1/2 − ǫ1/2 − ǫη fraction of 2/1/1 good edges with respect to v.
iii) There are two (top) half edge bundles e, f ∈ δ→ (v) such that xe( B) ≤ ǫ1/2 , xf( A) ≤ ǫ1/2 , and e, f
are 2/2/2 good (with respect to v).
We will prove this theorem after proving several intermediate lemmas (whose proofs can be
found in Appendix A).
Lemma 5.22. Let e = (u, v) be a top edge bundle such that xe ≤ 1/2 − ǫ1/2. If 12ǫ1/1 ≤ ǫ1/2 ≤ 0.001, then e is 2/1/1 happy with probability at least 0.005ǫ1/2².
Lemma 5.23. Let e = (u, v) be a top edge bundle such that xe ≥ 1/2 + ǫ1/2. If 12ǫ1/1 ≤ ǫ1/2 ≤ 0.001, then e is 2/1/1 happy with respect to u with probability at least 0.006ǫ1/2².
Lemma 5.24. For a good half top edge bundle e = (u, v), let A, B, C be the degree partitioning of δ(u), and let V = δ(v)−e (see Fig. 14). If xe(B) ≤ ǫ1/2 and P[(A−e) T + VT ≤ 1] ≥ 5ǫ1/2 then e is 2-1-1 good, i.e.,
P[e 2-1-1 happy w.r.t. u] ≥ 0.005ǫ1/2².
Lemma 5.25. Let e = (v, u) and f = (v, w) be good half top edge bundles and let A, B, C be the degree partitioning of δ(v) such that xe(B), xf(B) ≤ ǫ1/2. Then, one of e, f is 2-1-1 happy with probability at least 0.005ǫ1/2².
Lemma 5.26. Let e = (u, v) be a good half edge bundle and let A, B, C be the degree partitioning of δ(u) (see Fig. 15). If 12ǫ1/1 ≤ ǫ1/2 ≤ 0.001 and xe(A), xe(B) ≥ ǫ1/2, then
P[e 2-1-1 happy w.r.t. u] ≥ 0.02ǫ1/2².
Lemma 5.27. Let e = (u, v), f = (v, w) be two good top half edge bundles and let A, B, C be the degree partitioning of δ(v) such that xe(B), xf(A) ≤ ǫ1/2. If e, f are not 2-1-1 good with respect to v, and 12ǫ1/1 ≤ ǫ1/2 ≤ 0.0002, then e, f are 2-2-2 happy with probability at least 0.01.
Proof of Theorem 5.21. Suppose case (i) does not happen. Since every bad edge has fraction at least 1/2 − ǫ1/2, this means that δ(v) has no bad edges. First, notice that by Lemma 5.22 and Lemma 5.23 any non half-edge in δ→(v) is 2/1/1 good (with respect to v). If there is only one half edge in δ→(v), then at least a 1 − ǫη − (1/2 + ǫ1/2) fraction of δ→(v) consists of 2-1-1 good edges and we are done with case (ii). Otherwise, there are two good half edges e, f ∈ δ→(v).
First, by Lemma 5.26 if xe( A) , xe( B) ≥ ǫ1/2 , then e is 2/1/1 good (w.r.t., v) and we are done.
Similarly, if xf( A) , xf( B) ≥ ǫ1/2 , then f is good. So assume none of these happens.
Furthermore by Lemma 5.25 if xe( B) , xf( B) ≤ ǫ1/2 (or xe( A) , xf( A) ≤ ǫ1/2 ) then one of e, f is
2/1/1 good.
So, the only remaining case is when e, f are not 2-1-1 good and xe( B) , xf( A) ≤ ǫ1/2 . But in this
case by Lemma 5.27, e, f are 2/2/2 good; so (iii) holds.
Lemma 5.28. For a degree cut S ∈ H, and u ∈ A(S), let A, B, C be the degree partition of u. Then,
A ∩ δ→ (u) has fraction at most 1/2 + 4ǫ1/2 of good edges that are not 2-1-1 good (w.r.t., u).
Proof. Suppose by way of contradiction that there is a set D ⊆ A→ of good edges that are not 2-1-1 good w.r.t. u with x(D) ≥ 1/2 + 4ǫ1/2. By Lemma 5.22 and Lemma 5.23, every edge in D is part of a half edge bundle.
There are at least two half edge bundles e, f such that x(D ∩ e), x(D ∩ f) ≥ ǫ1/2, as there are at most four half edge bundles in δ→(u) (and using that for any half edge bundle e, xe ≤ 1/2 + ǫ1/2).
Since D ⊆ A→ , we have
x( A ∩ e), x( A ∩ f) ≥ ǫ1/2 .
Since x( A ∩ e) ≥ ǫ1/2 , if x( B ∩ e) ≥ ǫ1/2 then, by Lemma 5.26 e is 2-1-1 good. But since every
edge in D is not 2-1-1 good w.r.t u, we must have x( B ∩ e) < ǫ1/2 . The same also holds for f.
Finally, since x( B ∩ e) < ǫ1/2 and x( B ∩ f) < ǫ1/2 by Lemma 5.25 at least one of e, f is 2-1-1 good
w.r.t u. This is a contradiction.
6 Matching
Definition 6.1 (ǫF fractional edge). For z ≥ 0 we say that z is ǫF -fractional if ǫF ≤ z ≤ 1 − ǫF .
Lemma 6.2 (Matching Lemma). For a degree cut S ∈ H, let F (S) ⊆ A(S) denote the set of atoms u such
that x(δ↑ (u)) is ǫF -fractional. Then for any ǫF ≤ 1/10, ǫB ≥ 21ǫ1/2 , α ≥ 2ǫη , there is a matching from
good edges (see Definition 5.13) in E→ (S) to edges in δ(S) such that every good edge bundle e = (u, v)
(where u, v ∈ A(S)) is matched to a fraction me,u of edges in δ↑ (u) and a fraction me,v of δ↑ (v) where
Furthermore,
∑_{e∈δ→(u)} me,u = x(δ↑(u)) Zu.     (26)
Roughly speaking, the intention of the above lemma is to match edges in E→ (S) to a similar
fraction of edges from endpoints that go higher. Eq. (25) says that if x(δ↑ (u)) is fractional then
edges incident to u can be matched to a larger fraction of edges in δ↑(u). On the other hand,
Eq. (26) says that if x(δ↑ (u)) ≈ 0, then a larger fraction of edges will match to edges in δ↑ (u).
This is the matching that we use in order to decide which edges will have positive slack to
compensate for the negative slack of edges going higher.
Throughout this section we adopt the following notation: For a cut S ∈ H and a set W ⊆ A(S),
we write
x(δ→(W)) ≥ (1/2) ∑_{a∈W} x(δ(a)) − ǫ/2 ≥ |W| − ǫ/2.
Proof. We have
x(δ→(W)) = (1/2) ( ∑_{a∈W} x(δ(a)) + x(E(W, S r W)) − x(δ↑(W)) ).
which, after substituting into the above equation, completes the proof of the first inequality in the lemma statement. The second inequality follows from the fact that x(δ(a)) ≥ 2 for each atom a.
Lemma 6.4. For S ∈ H, if |A(S)| = 3 then there are no bad edges in E→ (S).
Proof. Suppose A(S) = {u, v, w} and e = (u, v) is a bad edge bundle. Then |xe − 1/2| ≤ ǫ1/2. In addition, by Theorem 5.14, x(δ↑(u)), x(δ↑(v)) ≤ 1/2 + 9ǫ1/2. Therefore,
Similarly, x(v,w) ≥ 1 − 10ǫ1/2 . Finally, since x(δ(S)) ≥ 2, and x(δ↑ (u)), x(δ↑ (v)) ≤ 1/2 + 9ǫ1/2 , we
must have x(δ↑ (w)) ≥ 1 − 18ǫ1/2 . But, this contradicts the assumption that w ∈ H must satisfy
x(δ(w)) ≤ 2 + ǫη .
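(For concreteness, a routine check not verbatim from the source: the bound x(u,w) ≥ 1 − 10ǫ1/2 follows since x(u,w) ≥ x(δ(u)) − xe − x(δ↑(u)) ≥ 2 − (1/2 + ǫ1/2) − (1/2 + 9ǫ1/2) = 1 − 10ǫ1/2.)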
Proof of Lemma 6.2. We will prove this by setting up a max-flow min-cut problem. Construct a
graph with vertex set {s, X, Y, t}, where s, t are the source and sink. We identify X with the set of
good edge bundles in E→ (S) and Y with the set of atoms in A(S). For every edge bundle e ∈ X,
add an arc from s to e of capacity c(s, e) := (1 + α) xe . For every u ∈ A(S), there is an arc (u, t)
with capacity
c(u, t) = x(δ↑ (u)) Fu Zu .
Finally, connect e = (u, v) ∈ X to nodes u and v ∈ Y with a directed edge of infinite capacity,
i.e., c(e, u) = c(e, v) = ∞. We will show below that there is a flow saturating t, i.e. there is a flow
of value
c(t) := ∑_{u∈A(S)} c(u, t) = ∑_{u∈A(S)} x(δ↑(u)) Fu Zu.
Suppose that in the corresponding max-flow, there is a flow of value fe,u on the edge (e, u). Define
me,u := fe,u / Fu.
Then (25) follows from the fact that the flow leaving e is at most the capacity of the edge from
s to e, and (26) follows by conservation of flow on the node u (after cancelling out Fu from both
sides).
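To make the flow instance concrete, the following minimal Python sketch (illustrative only, not part of the paper; the input names good_bundles, x_e, cap_up and alpha are hypothetical) builds the graph described above with networkx. Edges added without a capacity attribute are treated as having infinite capacity by nx.maximum_flow.

    import networkx as nx

    def build_flow_instance(good_bundles, x_e, cap_up, alpha):
        # good_bundles: list of pairs (u, v) of atoms, one per good edge bundle in E->(S)
        # x_e[i]: fractional value of the i-th bundle; cap_up[u]: x(delta^up(u)) * F_u * Z_u
        G = nx.DiGraph()
        for i, (u, v) in enumerate(good_bundles):
            G.add_edge("s", ("e", i), capacity=(1 + alpha) * x_e[i])  # c(s, e) = (1 + alpha) x_e
            G.add_edge(("e", i), ("a", u))  # no capacity attribute: infinite capacity
            G.add_edge(("e", i), ("a", v))
        for u, c in cap_up.items():
            G.add_edge(("a", u), "t", capacity=c)  # c(u, t) = x(delta^up(u)) F_u Z_u
        return G

    # flow_value, flow_dict = nx.maximum_flow(build_flow_instance(...), "s", "t")
    # The matching of Lemma 6.2 then sets m_{e,u} = flow_dict[("e", i)][("a", u)] / F_u.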
We have left to show that for any s-t cut (A, Ā), where s ∈ A and t ∈ Ā, the capacity of this cut is at least c(t).
Claim 6.5. If A = {s}, then capacity of ( A, A) is at least c(t).
Proof. If |A(S)| = 3 then Zu = 1 for all u ∈ A(S) and by Lemma 6.4 all edges are good. Also,
x(E→(S)) = (1/2) ∑_{a∈A(S)} (x(δ(a)) − x(δ↑(a))) ≥ (2|A(S)| − (2 + ǫη))/2 = 2 − ǫη/2.
as desired.
Now, suppose |A(S)| ≥ 4. By Theorem 5.14 there is at most one bad half edge adjacent to every vertex. Therefore there are at most |A(S)|/2 bad edges in total (the bound is met if they form a perfect matching), which contribute at most a total of (1/2 + ǫ1/2)|A(S)|/2 =: xB
fraction. So, there is a fraction of at least xG := x( E→ (S)) − xB ≥ |A(S)| − 1 − ǫη /2 − xB of good
edges. So,
(1 + α) xG ≥ (1 + α) (|A(S)|(3/4 − ǫ1/2/2) − 1 − ǫη/2)
    ≥ (1 + α) (|A(S)|(3/4 − ǫ1/2/2 − ǫF) − 1 − ǫη/2) + ǫF |A(S)|
    ≥ 2 + ǫη + ǫF |A(S)| ≥ c(t),
where the final inequality holds, e.g., for α ≥ 2ǫη and since |A(S)| ≥ 5 and ǫF ≤ 1/10.
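(For instance, a rough numerical check not verbatim from the source: with |A(S)| ≥ 5, ǫF ≤ 1/10 and ǫ1/2 ≤ 0.001, we have |A(S)|(3/4 − ǫ1/2/2 − ǫF) − 1 − ǫη/2 ≥ 5 · 0.6495 − 1 − ǫη/2 ≥ 2.24, which indeed exceeds 2 + ǫη.)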
If |A(S)| = 4 and we have 0 or 1 bad edges, then xG ≥ 2.5 − ǫη/2 − ǫ1/2, so (1 + α)xG ≥ (1 + 2ǫη)(2.5 − ǫη/2 − ǫ1/2) ≥ 2 + ǫη + 2ǫF for ǫF < 1/10 (and noting that with |A(S)| = 4 at most 2 nodes can have x(δ↑(u)) ≤ ǫF).
Finally, suppose that |A(S)| = 4 and there are two bad edges. Then they form a perfect
matching inside S and for each u ∈ A(S), x(δ↑ (u)) ≤ 1/2 + 9ǫ1/2 (see Theorem 5.14).
It also must be the case that x(δ↑ (u)) ≥ ǫF for each u ∈ A(S). If not, there would have to be
a node u′ ∈ A(S) such that x(δ↑ (u′ )) ≥ (2 − ǫF )/3 > 1/2 + 9ǫ1/2 , which is a contradiction to u′
having an incident bad edge. Thus, for each u ∈ A(S), x(δ↑ (u)) is ǫF -fractional, i.e., Fu = 1 − ǫB
and Zu = 1 implying that c(t) ≤ (2 + ǫη )(1 − ǫB ). Therefore, we have
and the rightmost quantity is at least c(t) for ǫB ≥ 2ǫ1/2 and α ≥ 2ǫη .
From now on, we assume that the min s-t cut A ≠ {s}. In the following we will prove that for any set of atoms W ⊊ S, we have:
where for a set F of edges we write xG ( F ) to denote the total fractional value of good edges in F.
Let A X = A ∩ X, AY = A ∩ Y and so on. Assuming the above inequality, let us prove the
lemma: First, for the set of edges A X chosen from X, let Q be the set of endpoints of all edge
bundles in A X (in A(S)).
Observe that we must choose all atoms in Q inside AY due to the infinite capacity arcs, i.e., Q ⊆ AY. Let W = S ∖ Q. Note that W ≠ S. Then:
    cap(A, Ā) = c(AY, t) + c(s, ĀX)
              ≥ c(δ↑(Q), t) + c(s, δ→(W))
              = c(δ↑(S), t) − c(δ↑(W), t) + c(s, δ→(W)) ≥ c(δ↑(S), t),
which by Lemma 6.3 and the fact that each bad edge has fraction at most 1/2 + ǫ1/2 , is
To upper bound c(δ↑(W), t), we observe that for any u ∈ A(S),
    c(u, t) ≤ x(δ↑(u)) Zu ≤ 1/5             if x(δ↑(u)) < ǫF,
    c(u, t) ≤ (1/2 + 9ǫ1/2)(1 − ǫB)         if x(δ↑(u)) > ǫF and u is incident to a bad edge,
    c(u, t) ≤ 1 + ǫη                        otherwise, using Lemma 2.7.
Reduction Events.
i) For each polygon cut S ∈ H, let RS be the indicator of a uniformly random subset of
measure p of the max flow event ES . Note that when RS = 1 then in particular we know
that the polygon S is happy.
ii) For a top edge bundle e = (u, v) define
    He,u = 1 if e is 2-1-1 happy and good w.r.t. u,
    He,u = 1 if e is 2-2-2 happy and good w.r.t. u, but not 2-1-1 good with respect to u,
    He,u = 1 if e is 2-2 happy and good, but not 2-1-1 or 2-2-2 good with respect to u,
    He,u = 0 otherwise,
14 Suppose that under the distribution µ on spanning trees, some event D′ has probability q ≥ p and we seek to define an event D ⊆ D′ that has probability exactly p. To this end, one can copy every tree T in the support of µ exactly ⌊kq/p⌋ times for some integer k > 0, and whenever we sample T we choose a copy uniformly at random. To get an event of probability exactly p, we say the event occurs if for a “feasible” tree T one of the first k copies is sampled; as k → ∞ the probability that D occurs converges to p. For a number of decreasing events D1, D2, . . . that occur with probabilities q1, q2, . . . (respectively), we just need to let k be the least common multiple of p/q1, p/q2, . . . and follow the above procedure. Another method is to choose an independent Bernoulli with success probability p/q for any such event D (a short illustrative sketch of this second method appears after item ii) below).
and let He,v be defined similarly. Since p is a lower bound on the probability an edge is
good, we may now let Re,u and Re,v be indicators of subsets of measure p of He,u and He,v
respectively (note Re,u and Re,v may overlap). In this way every top edge bundle e = (u, v)
is associated with indicators Re,u and Re,v .
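As a concrete illustration of the second method in footnote 14 (an independent Bernoulli coin with success probability p/q), here is a minimal sketch; the function name and inputs are hypothetical and not part of the algorithm's description.

```python
import random

def thin_to_probability(occurred: bool, q: float, p: float) -> bool:
    """Given the indicator of an event D' of probability q >= p, return the
    indicator of a sub-event D of probability exactly p by keeping an
    occurrence of D' independently with probability p/q."""
    return occurred and (random.random() < p / q)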
We set β = η/8 and τ = 0.571β. Define r : E → R≥0 as follows: For any (non-bundle) edge e,
    re = β xe RS                       if p(e) = S for a polygon cut S ∈ H,
    re = (1/2) τ xe (Rf,u + Rf,v)      if e ∈ f for a top edge bundle f = (u, v).
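The following minimal sketch spells out this case distinction; the containers polygon_parent, R_polygon, top_bundle, and R_top are hypothetical stand-ins for the parent map p(·) and the sampled indicators RS and Rf,u.

```python
# Sketch only: the reduction r_e of a single (non-bundle) edge e.
def reduction(e, x, beta, tau, polygon_parent, R_polygon, top_bundle, R_top):
    if e in polygon_parent:                 # bottom edge: p(e) = S is a polygon cut
        S = polygon_parent[e]
        return beta * x[e] * R_polygon[S]
    f, u, v = top_bundle[e]                 # e lies in the top edge bundle f = (u, v)
    return 0.5 * tau * x[e] * (R_top[(f, u)] + R_top[(f, v)])
```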
Increase Events. Let E be the set of edge bundles, i.e., top/bottom edge bundles. Now, we define the increase vector I : E → R≥0 as follows:
• Top edges. Let me,u be defined as in Lemma 6.2. For each top edge bundle e = (u, v), let
    Ie,u := ( ∑g∈δ↑(u) rg ) · ( me,u / ∑f∈δ→(u) mf,u ) · I{u is odd},    (29)
• Bottom edges. For each bottom edge bundle S with polygon partition A, B, C, let r( A) :=
∑ f ∈ A r f , r( B) := ∑ f ∈ B r f , and r(C ) := ∑ f ∈C r f . Then set
IS := (1 + ǫη ) max{r( A) · I {S not left happy} , r( B) · I {S not right happy}}
+ r(C )I {S not happy} . (30)
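The next sketch mirrors Eqs. (29) and (30); all inputs (the dictionaries r and m, the sets δ↑(u) and δ→(u), and the happiness indicators) are hypothetical placeholders for the quantities defined above.

```python
# Sketch only: the increase vector of Eqs. (29)-(30).
def increase_top(e, u, r, m, delta_up, delta_right, u_is_odd):
    """I_{e,u} of Eq. (29): if u is odd, reroute the reduction collected above u
    to the bundle e in proportion to m_{e,u}."""
    if not u_is_odd:
        return 0.0
    total_m = sum(m[(f, u)] for f in delta_right[u])
    return sum(r[g] for g in delta_up[u]) * m[(e, u)] / total_m

def increase_bottom(r, A, B, C, left_happy, right_happy, happy, eps_eta):
    """I_S of Eq. (30) for a bottom edge bundle S with polygon partition A, B, C."""
    rA, rB, rC = (sum(r[f] for f in part) for part in (A, B, C))
    return (1 + eps_eta) * max(rA * (not left_happy), rB * (not right_happy)) \
        + rC * (not happy)
```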
ii) There is a random vector s : Eg → R (as a function of T ∼ µ) such that for all e, se ≥ − xe η/8
(with probability 1), and
iii) If a polygon cut u with polygon partition A, B, C is not left happy, then for any set F ⊆ E with
p(e) = u for all e ∈ F and x( F ) ≥ 1 − ǫη /2, we have
s( A) + s( F ) + s− (C ) ≥ 0,
where s− (C ) = ∑e∈C min{se , 0}. A similar inequality holds if u is not right happy.
iv) For every cut S ∈ H such that p(S) is not a polygon cut, if δ(S) T is odd, then s(δ(S)) ≥ 0.
v) For a good edge e ∈ Eg , E [ se ] ≤ −ǫP ηxe (see Eq. (33) for definition of ǫP ) .
Now, we verify (ii): First, we observe that se = 0 (with probability 1) if e is part of a bad edge
bundle since we defined decrease events only for good edges and me,u is non-zero only for good
edge bundles. Since re ≤ βxe for bottom edges and re ≤ τxe for top edges, and τ ≤ β ≤ η/8, we
have that re ≤ xe η/8. It follows that se ≥ − xe η/8.
Now, we verify (iii): Suppose a polygon cut u is not left-happy. Since u is not happy we must
have Ru = 0 and re = 0 for any e ∈ F. Therefore,
    s(A) + s(F) + s−(C) = s(A) + IS · x(F) + s−(C) ≥ −r(A) + (1 + ǫη)(r(A) + r(C))(1 − ǫη/2) − r(C) ≥ 0.
Now, we verify (iv): for a cut S ∈ H such that p(S) is not a polygon cut and δ(S)T is odd,
    s(δ(S)) ≥ − ∑g∈δ↑(S) rg + ∑e∈δ→(S) Ie,S
            = − ∑g∈δ↑(S) rg + ∑e∈δ→(S) ( me,S / ∑f∈δ→(S) mf,S ) ∑g∈δ↑(S) rg = 0.
Finally, we verify (v): Here, we use Theorem 7.1. For a good top edge e that is part of a top
edge bundle f we have
    E[se] = −E[re] + (xe/xf) E[If] ≤ −τ p xe + (1 − ǫ1/1/6) p τ xe = −(ǫ1/1/6) p τ xe.
Finally, we can let
    ǫP := (ǫ1/1/6) p (τ/η) = (ǫ1/2/72) · 0.005ǫ1/2² · (0.571/8) ≥ 0.0000049 ǫ1/2³ ≥ 3.9 · 10^−17    (33)
as desired.
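As a sanity check of the arithmetic in Eq. (33), the following snippet plugs in the constants used in this section (ǫ1/2 = 0.0002, ǫ1/1 = ǫ1/2/12, p = 0.005ǫ1/2², τ/η = 0.571/8); the variable names are ours.

```python
eps_half = 0.0002                 # epsilon_{1/2}
eps_one_one = eps_half / 12       # epsilon_{1/1}
p = 0.005 * eps_half ** 2         # probability that a good edge is reduced
tau_over_eta = 0.571 / 8          # tau = 0.571*beta and beta = eta/8

eps_P = (eps_one_one / 6) * p * tau_over_eta
print(eps_P)                      # ~3.97e-17, consistent with >= 3.9e-17 in Eq. (33)
```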
In the rest of this section we prove Theorem 7.1. Throughout the proof, we will repeatedly
use the following facts proved in Section 5: If a top edge e = (u, v) that is part of a bundle f
is reduced (equivalently Hf,u = 1 or Hf,v = 1), then u and v are trees, which means that tree
sampling inside u and v is independent of the reduction of e.
Note however, that conditioning on a near-min-cut or atom to be a tree increases marginals
inside and reduces marginals outside as specified by Lemma 2.23. Since for any S ∈ H, x(δ(S)) ≤
2 + ǫη , the overall change is ±ǫη /2.
The proof of Theorem 7.1 simply follows from Lemma 7.2 and Lemma 7.7 that we will prove
in the following two sections.
A similar equation holds for E [ Ie,v ].
The case where x(δ↑ (u)) ≤ ǫF or x(δ↑ (v)) ≤ ǫF is dealt with in Lemma 7.6. So, consider the
case where x(δ↑ (u)), x(δ↑ (v)) > ǫF . Now recall that from (26),
Using P [Rf,u′ , δ(u) T odd] = pP [ δ(u) T odd|Rf,u′ ], and plugging (34) into (35) for u and v, we
get (and using Eq. (36)):
    E[Ie,u] + E[Ie,v] ≤ pτ (1 − ǫ1/1/5) ( x(δ↑(u)) Fu · me,u/x(δ↑(u)) + x(δ↑(v)) Fv · me,v/x(δ↑(v)) )    (37)
                      = pτ (1 − ǫ1/1/5) ( Fu me,u + Fv me,v )
                      ≤ pτ (1 − ǫ1/1/5)(1 + 2ǫη) xe < pτ xe (1 − ǫ1/1/6),
where on the final line we used (25) and ǫη < ǫ1/1/100.
Proof of Lemma 7.3. Suppose that Si ∈ H are the ancestors of S in the hierarchy (in order) such that S1 = S and for each i, Si+1 = p(Si). Let
    δ≥i := δ(u) ∩ δ(Si)   and   δi := δ(u) ∩ δ→(Si).
Each group of edges δi is either entirely top edges or entirely bottom edges. First note that if
g ∈ δi and g is a bottom edge, i.e., Si+1 is a polygon cut, then by Corollary 5.10,
    P[δ(u)T odd | RSi+1] = P[δ(u)T odd | ESi+1] ≤ 0.5678
(see Definition 5.8 and Item i) for the definition of ESi+1, RSi+1), where in the equality we used that RSi+1 is a uniformly random event chosen in ESi+1. Therefore, to prove Eq. (34) it is enough to
show
    ∑g∈δ↑good(u): g∈f=(u′,v′) top  (1/2) τ xg ( P[δ(u)T odd | Rf,u′] + P[δ(u)T odd | Rf,v′] )
        ≤ τ ( (1 − ǫ1/1/5) Fu x(δ↑good(u)) + x(δ↑bad(u)) + 0.0014 x(δ↑β(u)) )    (38)
where we write δβ (u), δgood (u), δbad (u) to denote the set of bottom edges, good top edges, and
bad (top) edges in δ(u) respectively and we used that
    τ (1 − ǫ1/1/5)(1 − ǫB) − 0.5678β ≥ 0.0014τ,
since τ = 0.571β, ǫ1/1 ≤ ǫ1/2/12, and ǫ1/2 ≤ 0.0002.
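A quick numeric check of the last inequality (with β factored out, so τ/β = 0.571), under the same constants; the variable names are ours.

```python
eps_half = 0.0002
eps_one_one = eps_half / 12
eps_B = 21 * eps_half
lhs = 0.571 * (1 - eps_one_one / 5) * (1 - eps_B) - 0.5678  # tau(1 - eps/5)(1 - eps_B) - 0.5678*beta, in units of beta
rhs = 0.0014 * 0.571                                        # 0.0014*tau, in units of beta
print(lhs >= rhs)                                           # True (about 0.0007999 >= 0.0007994)
```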
Since h(f) := (1/2)( P[δ(u)T odd | Rf,u′] + P[δ(u)T odd | Rf,v′] ) ≤ 1 and (1 − ǫ1/1/5) Fu is nearly 1, in each of the following cases
    x(δ↑β(u)) ≥ 0.003 when Fu = 1,    x(δ↑β(u)) ≥ (4/5) x(δ↑(u)) when Fu = 1 − ǫB,
    or x(δ↑bad(u)) ≥ 0.006 when Fu ≥ 1 − ǫB,    (39)
(38) holds. To see this, just plug ǫ1/1 ≤ ǫ1/2/12, ǫ1/2 ≤ 0.0002, ǫB = 21ǫ1/2, ǫη ≤ 10^−10, x(δ↑(u)) ≤ 1 + ǫη, and any inequality from (39) into (38), using the upper bound h(f) = 1.
Alternatively, letting δtop(u) = δgood(u) ∪ δbad(u) be the set of top edges in δ(u), if we can show the existence of a set D ⊆ δ↑top(u) such that
(Figure: the hierarchy S = S1 ⊆ S2 ⊆ S3 ⊆ · · · above u, with the edge groups δ1, δ2, . . . and the level j at which x(δ≥j) ≥ 1 − ǫ1/1.)
First, let
Just note j ≤ k ≤ ℓ. Note that levels ℓ and k exist since x(δ↑ (u)) ≥ ǫF , whereas level j may not
exist (if x(δ↑ (u)) < 1 − ǫ1/1 ). We consider three cases:
Case 1: x(δ↑ (u)) ≥ 1 − ǫ1/1 : Then j exists and S j has a valid A, B, C degree partitioning
(Definition 5.18) where A = δ(u′ ) ∩ δ(S j ) such that either u = u′ or u′ is a descendant of u
in H. Note that, x(δ(u) ∩ δ(S j )) ≥ 1 − ǫ1/1 , and that, in this case, x(δ↑ (u)) is not ǫF fractional
(see Lemma 6.2), so Fu = 1.
Case 1a: x(δ j ) ≥ 3/4. If δ j are bottom edges then (39) holds. So, suppose that δ j is a set of top
edges. By Lemma 5.28, at most 1/2 + 4ǫ1/2 fraction of edges in A ∩ δ j are good but not
2-1-1 good (w.r.t. u). So, the rest of the edges in A ∩ δj are either bad or 2-1-1 good. Hence δj either has a mass of (1/2)(1/4 − 2ǫ1/1 − ǫη − 4ǫ1/2) > 1/8 − 3ǫ1/2 of bad edges or of 2-1-1 good edges.15 The former case implies that (39) holds. In the latter case, by Claim 7.4, for any 2-1-1 good edge g ∈ δj with g ∈ f = (u′, v′) we have P[δ(u)T odd | Rf,u′] ≤ 2ǫη + ǫ1/1; so (40) holds for D defined as the set of 2-1-1 good edges in δj.
Case 1b: x(δ j ) < 3/4. If x(δβ↑ (u)) ≥ 0.003, then (39) holds. Otherwise, we apply Claim 7.5 with
ǫ = ǫ1/1 to all top edges in D = δ≥ j+1 r δ≥ℓ+1 and we get that
    (1/2)( P[δ(u)T odd | Rf,u′] + P[δ(u)T odd | Rf,v′] ) ≤ 1 − ǫ1/1 + ǫ1/1².
Since x( D ) ≥ 1 − ǫ1/1 − 3/4 − 2ǫη − ǫ1/1 − 0.003 > 0.24, (40) holds.
Case 2: 1 − ǫF < x(δ↑(u)) < 1 − ǫ1/1. Again we have Fu = 1. So we can either show that x(δ↑β(u)) ≥ 0.003 or take D to be the top edges in δ↑(u) ∖ δ≥ℓ+1 and use Claim 7.5 with ǫ = ǫ1/1. This will enable us to show that (40) holds as in the previous case.
Case 3: ǫF < x(δ↑(u)) < 1 − ǫF: In this case Fu = 1 − ǫB. If at least 4/5 of the edges in δ↑(u) are
bottom edges, then we are done by (39).
Otherwise, let u′ = p(u). For any top edge e ∈ δ↑ (u) where e ∈ f = (u′′ , v′′ ) we have
Using that u′ ⊆ u′′ is a tree under the conditional measure |Rf,u′′ with probability at least 1 − ǫη/2, and applying Claim 7.5 (to u and u′) with ǫ = ǫF, so that P[δ(u)T odd | u′ tree, Rf,u′′] ≤ 1 − ǫF + ǫF², we get
Now, let D be all top edges in δ↑ (u). Then, we apply Eq. (40) to this set of mass at least x(δ↑ (u))/5,
and we are done, using that (ǫF − 2ǫF²)/5 ≥ ǫ1/1/5 + ǫB, which holds for ǫF ≥ 1/10, ǫB = 21ǫ1/2, and ǫ1/2 ≤ 0.0002.
Claim 7.4. For u ∈ H and a top edge e ∈ f = (u′ , v′ ) for some u′ ∈ H that is an ancestor of u, if
x(δ(u) ∩ δ(u′ )) ≥ 1 − ǫ1/1 and f is 2-1-1 good, then
Proof. Let A, B, C be the degree partitioning of δ(u′ ). By the assumption of the claim, without
loss of generality, assume A ⊆ δ(u) ∩ δ(u′ ). This means that if Rf,u′ = 1 then u′ is a tree and
A T = 1 = (δ(u) ∩ δ(u′ )) T (also using CT = 0). Therefore,
15 We are using the fact that ǫ1/1 = ǫ1/2 /12 and that ǫη is tiny by comparison to these.
Under the conditional measure |Rf,u′, u′ is a tree, so u must be connected inside u′, i.e., (δ(u) ∖ δ(u′))T ≥ 1 with probability 1. Therefore,
as desired.
Claim 7.5. For u, u′ ∈ H such that u′ is an ancestor of u, let ν = νu′ × νG/u′ be the measure resulting from conditioning u′ to be a tree. If x(δ(u) ∩ δ(u′)) ∈ [ǫ, 1 − ǫ], then
where we removed the conditioning by taking the worst case over (δ(u) ∩ δ(u′ )) T even, (δ(u) ∩
δ(u′ )) T odd. First, observe by the assumption of the claim and that x(δ↑ (u)) ≤ 2 + ǫη we have
as desired.
Lemma 7.6. Let S ∈ H be a degree cut and e = (u, v) a good edge bundle with p(e) = S. If x(δ↑ (u)) <
ǫF , ǫ1/2 ≤ 0.0002, ǫ1/1 ≤ ǫ1/2 /10, then,
    E[Ie,u] + E[Ie,v] ≤ pτ xe (1 − ǫ1/1/6).
Proof. First, we consider the case |A(S)| = 3, where A(S) = {u, v, w}. Let f = (u, w), g = (v, w)
(and of course e = (u, v)). We will use the following facts below:
(Figure: the cut S with atoms u, v, w, edge bundles e = (u, v), f = (u, w), g = (v, w), and the edges δ↑(u), δ↑(v), δ↑(w) leaving S.)
    xe + xf ≥ 2 − ǫF                   (x(δ(u)) ≥ 2 and x(δ↑(u)) ≤ ǫF)
    x(δ↑(v)) + x(δ↑(w)) ≥ 2 − ǫF       (x(δ(S)) ≥ 2)
    xf, x(δ↑(w)) ≤ 1 + ǫη              (Lemma 2.7)
so we have
    xe, x(δ↑(v)) ≥ 1 − ǫF − ǫη.    (42)
Now we bound E [ Ie,u ] + E [ Ie,v ]. Since x(δ↑ (v)) ≥ ǫF , applying (34) and (35) to Ie,v and using
Zv = 1 we get
    E[Ie,v] ≤ pτ (1 − ǫ1/1/5) x(δ↑(v)) Fv · me,v / ∑f∈δ→(v) mf,v = me,v pτ (1 − ǫ1/1/5) Fv    (43)
On the other hand, since by Corollary 5.10 for any bottom edge g ∈ δ↑(u) with p(g) = S′ we have
    P[δ(u)T odd | RS′] = P[δ(u)T odd | ES′] ≤ 0.5678,
using 0.5678β ≤ τ and Zu = 1 we can write
    E[Ie,u] ≤ ∑h∈δ↑(u) xh pτ Fu · me,u / (Zu x(δ↑(u))) = pτ Fu me,u.    (44)
Therefore,
    E[Ie,u] + E[Ie,v] ≤ pτ Fu me,u + pτ (1 − ǫ1/1/5) Fv me,v
                      = pτ (Fu me,u + Fv me,v) − (ǫ1/1/5) pτ Fv me,v
                      ≤ pτ (1 + 2ǫη) xe − (ǫ1/1/5) pτ Fv me,v    (45)
where the final inequality follows from (25). To complete the proof, we lower bound me,v .
Using (26) for v and w, we can write,
where the second inequality follows from (42) and ǫB = 21ǫ1/2 and ǫη < ǫ1/2² and ǫF ≥ 1/10.
Plugging this back into (45) and using Fv ≥ 1 − ǫB = 1 − 21ǫ1/2 we get
    E[Ie,u] + E[Ie,v] ≤ pτ xe ( 1 + 2ǫη − (ǫ1/1/5)(1 − 1.2ǫF)(1 − 21ǫ1/2) ) ≤ pτ xe (1 − ǫ1/1/6)
as desired. In the last inequality we used ǫF ≤ 1/10 and ǫ1/2 ≤ 0.0002.
Case 2: |A(S)| ≥ 4. In this case, Zu = 2. Therefore, by Eq. (44)
    E[Ie,u] ≤ ∑h∈δ↑(u) xh pτ Fu · me,u / (Zu x(δ↑(u))) = (1/2) pτ Fu me,u.
Now, either x(δ↑ (v)) < ǫF and we get the same inequality for Ie,v or x(δ↑ (v)) ≥ ǫF in which case
by (43) we get E [ Ie,v ] ≤ me,v pτFv (1 − ǫ1/1 /5). Putting these together proves the lemma.
E [ IS ] ≤ 0.99994βp.
Note that by definition IS(δ(S)) = IS and for any two disjoint sets D1, D2, IS(D1 ∪ D2) ≤ IS(D1) + IS(D2). Also, define IS↑ = IS(δ↑(S)) and IS→ = IS(δ→(S)).
First, we upper bound E[IS↑]. Let f ∈ δ↑(S) and suppose that f with p(f) = S′ is a bottom edge. Say we have f ∈ A↑(S). We write,
since RS is a uniformly random subset of ES. On the other hand, if f is a top edge, then we use the trivial bound
    E[IS(f)] = (1 + ǫη) τ p xf.    (47)
Therefore,
    E[IS↑] ≤ (1 + ǫη) τ p x(δ↑(S)) ≤ (1 + ǫη)(0.571) βp x(δ↑(S))    (48)
Case 2: Ŝ = p(S) is a polygon cut with ordering u1, . . . , uk of A(Ŝ), and S = u1 or S = uk. Then, by Lemma 7.9 below,
where we use that x(δ↑ (S)) ≤ ǫη since we have a hierarchy. This concludes the proof.
Proof. Let A, B, C be the polygon partition of S. We will show that for a constant fraction of the
edges in δ→ (S), we can improve over the trivial bound in (47). To this end, consider the cases
given by Theorem 5.21.
Case 1: There is a set of 2-1-1 good edges (w.r.t. S) D ⊆ δ→(S) such that x(D) ≥ 1/2 − ǫ1/2 − ǫη. For any (top) edge e ∈ f = (S, u) such that e ∈ D, if Rf,S = 1, then S is happy, that is, AT = BT = 1, CT = 0. Therefore,
    E[IS(D)] ≤ ∑e∈D: e∈f=(S,u) ((1 + ǫη)/2) τ xe P[S not happy | Rf,u] P[Rf,u] ≤ ((1 + ǫη)/2) pτ x(D).
Using the trivial inequality Eq. (47) for edges in δ→ (S) r D we get
    E[IS→] ≤ (1 + ǫη) pτ ( x(D)/2 + x(δ→(S)) − x(D) ) ≤ (1 + ǫη) pτ ( x(δ→(S)) − (1/4 − ǫ1/2) )
as desired. In the last inequality we used x( D ) ≥ 1/2 − ǫ1/2 − ǫη .
Case 2: There are two 2-2-2 good top half edge bundles, e = (S, v), f = (S, w) in δ→ (S),
such that xe( B) , xf( A) ≤ ǫ1/2 . (Recall that e( A) = e ∩ A.) Let D = e( A) ∪ f( B). In this case, e and f
are reduced simultaneously by τ when they are 2-2-2 happy (w.r.t., S), i.e., when Re,S = Rf,S = 1.
In such a case we have δ(S) T = δ(v) T = δ(w) T = 2. Therefore,
    E[IS(D)] ≤ (1 + ǫη) E[max{r(A ∩ D), r(B ∩ D)}]
             ≤ (1 + ǫη) (τ/2) max{xe(A), xf(B)} ( P[e, f 2-2-2 happy] + P[Re,v] + P[Rf,w] )
             ≤ (1 + ǫη) τ (3p/2) ( x(D)/2 + 3ǫ1/2 ) = (1 + ǫη) τ p ( (3/4) x(D) + 4.5ǫ1/2 )
where we used that 1/2 − 2ǫ1/2 − x(C ) ≤ xe( A) , xf( B) ≤ 1/2 + ǫ1/2 and that x(C ) ≤ ǫη . Using the
trivial inequality Eq. (47) for edges in δ→ (S) r D we get
Lemma 7.9. Let S ∈ H be a polygon cut with p(S) = Ŝ also a polygon cut. Let u1 , . . . , uk be the ordering
of cuts in A(Ŝ) (as defined in Definition 4.31). If ǫM ≤ 0.001, ǫη ≤ ǫM², and S = u1 or S = uk, then
E [ IS→ ] ≤ 0.31βp.
Proof. Let S be the leftmost atom of Ŝ and let A, B, C be the polygon partition of δ(S). First, note
≤ (8/27 + 2ǫ M + 17ǫη ),
where in the final inequality we used that the function x 7→ x2 (1 − x) is maximized at x = 2/3,
and using ǫ M ≤ 0.001, ǫη < ǫ2M .
Plugging this back into (49), and using x(C ) ≤ ǫη , we get
    E[IS→] ≤ (1 + ǫη) βp ( 8/27 + 2ǫM + 18ǫη ) ≤ 0.31βp,
where the last inequality follows since ǫ M ≤ 0.001 and ǫη < ǫ2M .
Lemma 7.10. Let S ∈ H be a polygon cut with p(S) = Ŝ also a polygon cut. Let u1 , . . . , uk be the
ordering of cuts in A(Ŝ). If S = u1 , (or S = uk ) then
    P[S happy | RŜ] ≥ (1 − x(A→))² + (x(A→))² − 2ǫM − 17ǫη.
Proof. Let A, B, C, Â, B̂, Ĉ be the polygon partition of S, Ŝ respectively. Observe that since S = u1 ,
we have Â = E(u1, Ŝ) = A↑ ∪ B↑ ∪ C↑ and B̂, Ĉ ∩ (A ∪ B ∪ C) = ∅. Conditioned on RŜ, Ŝ is a tree, and the marginals of all edges in Â are changed by a total variation distance of at most ǫ′M := ǫM + 2ǫη from x (see Corollary 5.9), and they are independent of edges inside Ŝ. The tree conditioning
increases marginals inside by at most ǫη /2. Since after the changes just described
    E[CT] ≤ xC + ǫη + ǫ′M ≤ 4ǫη + ǫM,
it follows that P[CT = 0 | RŜ] ≥ 1 − 4ǫη − ǫM. So,
    P[S happy | RŜ] ≥ (1 − 4ǫη − ǫM) P[AT = BT = 1 | CT = 0, RŜ].    (50)
Let ν be the conditional measure CT = 0, RŜ . We see that
    Pν[AT = BT = 1] = Pν[A↑T = 1, B↑T = 0, A→T = 0, B→T = 1] + Pν[A↑T = 0, B↑T = 1, A→T = 1, B→T = 0]
                    ≥ (x(A↑) − ǫ′M) Pν[A→T = 0, B→T = 1] + (x(B↑) − ǫ′M) Pν[A→T = 1, B→T = 0].
In the final inequality, we used the fact that conditioned on RŜ, ÂT = (A↑ ∪ B↑ ∪ C↑)T = 1 and marginals in A↑ and B↑ are approximately preserved. Now, we lower bound Pν[A→T = 1, B→T = 0].
Let ǫ A , ǫB be such that
    Eν[A→T] = Pν[A→T = 1, B→T = 0] + ǫA,    Eν[B→T] = Pν[A→T = 0, B→T = 1] + ǫB.
Therefore,
    ǫA + ǫB = Eν[A→T + B→T] − Pν[A→T + B→T = 1] ≤ ( x(δ(S)) − x(δ↑(S)) + 1.5ǫη ) − (1 − 4ǫη)
            ≤ (2 + ǫη + 1.5ǫη − (1 − ǫη)) − (1 − 4ǫη) ≤ 8ǫη,
where in the inequality we used x(δ(S)) ≤ 2 + ǫη , that conditioning Ŝ to be a tree and C to 0
increases marginals by at most 1.5ǫη , that x(δ↑ (S)) ≥ 1 − ǫη (by Lemma 4.17) and Claim 7.11.
Therefore,
    Pν[AT = BT = 1] ≥ (x(A↑) − ǫ′M)(Eν[B→T] − ǫB) + (x(B↑) − ǫ′M)(Eν[A→T] − ǫA)
                    ≥ (x(A↑) − ǫ′M)(x(B→) − 8ǫη) + (x(B↑) − ǫ′M)(x(A→) − 8ǫη)
where the second inequality uses that the tree conditioning and CT→ = 0 can only increase the
marginals of edges in A→ and B→ . Simplify the above using x( A↑ ) + x( A→ ) ≥ 1 − ǫη , and
similarly for B,
P ν [ A T = B T = 1]
≥ (1 − x( A→ ) − ǫη − ǫ′M )( x( B→ ) − 8ǫη ) + (1 − x( B→ ) − ǫη − ǫ′M )( x( A→ ) − 8ǫη )
and since x(A→) + x(B→) ≥ 1 − 2ǫη (because x(A↑) + x(B↑) ≤ 1 + ǫη and xC ≤ ǫη), this is at least the bound claimed in the lemma statement.
Claim 7.11. For the measure ν defined above,
    Pν[A→T + B→T = 1] ≥ 1 − 4ǫη.
Proof. First, notice under ν, Ŝ is a tree, and δ→ (S) is independent of edges in G/Ŝ, though
E ν [δ→ (S) T ] may be increased by 2ǫη due to RŜ , CT = 0, so
    Pν[A→T + B→T = 1] = Pν[δ→(S)T = 1] ≥ 1 − 4ǫη
as desired.
Lemma 7.12. Let S ∈ H be a polygon cut with p(S) = Ŝ also a polygon cut, where u1, . . . , uk is the ordering of cuts in A(Ŝ). If S ≠ u1, uk, then
E [ IS→ ] ≤ 0.85βp.
    P[S happy | RŜ] ≥ (1 − 2ǫη) P[A→T = 1, B→T = 1 | CT = 0, RŜ]
                    = (1 − 2ǫη) Pν[A→T = 1 | A→T + B→T = 2] Pν[A→T + B→T = 2]    (51)
where we used E[C→T | RŜ] ≤ 2ǫη in the first inequality. So, it remains to lower-bound each of these terms.
    Eν[A→T] ∈ [1 − 2ǫη, 1 + 3ǫη].
The same bounds hold for Eν[B→T]. Therefore,
    Pν[A→T ≥ 1], Pν[B→T ≥ 1] ≥ 1 − e^(−1+2ǫη)    (Lemma 2.22)
    Pν[A→T ≤ 1], Pν[B→T ≤ 1] ≥ 0.495              (Markov)
    Pν[A→T = 1 | A→T + B→T = 2] ≥ 0.155.
So, by a union bound all of these events happen simultaneously and we get P ν [δ→ (ui ) T = 2] ≥
1 − 12ǫη . Therefore,
as desired.
References
[AOV18] Nima Anari, Shayan Oveis Gharan, and Cynthia Vinzant. “Log-Concave Polynomi-
als, Entropy, and a Deterministic Approximation Algorithm for Counting Bases of
Matroids”. In: FOCS. Ed. by Mikkel Thorup. IEEE Computer Society, 2018, pp. 35–46
(cit. on p. 4).
[App+07] David L. Applegate, Robert E. Bixby, Vasek Chvatal, and William J. Cook. The Trav-
eling Salesman Problem: A Computational Study (Princeton Series in Applied Mathematics).
Princeton, NJ, USA: Princeton University Press, 2007 (cit. on p. 1).
[Aro+98] Sanjeev Arora, Michelangelo Grigni, David Karger, Philip Klein, and Andrzej Woloszyn.
“A polynomial-time approximation scheme for weighted planar graph TSP”. In: SODA.
1998, pp. 33–41 (cit. on p. 1).
[Aro96] Sanjeev Arora. “Polynomial Time Approximation Schemes for Euclidean TSP and
Other Geometric Problems”. In: FOCS. 1996, pp. 2–11 (cit. on p. 1).
[Asa+10] Arash Asadpour, Michel X. Goemans, Aleksander Madry, Shayan Oveis Gharan, and
Amin Saberi. “An O(log n/ log log n)-approximation Algorithm for the Asymmetric
Traveling Salesman Problem”. In: SODA. 2010, pp. 379–389 (cit. on p. 2).
[BBL09] Julius Borcea, Petter Branden, and Thomas M. Liggett. “Negative dependence and
the geometry of polynomials.” In: Journal of American Mathematical Society 22 (2009),
pp. 521–567 (cit. on pp. 7, 9, 10).
[BC11] Sylvia Boyd and Robert Carr. “Finding low cost TSP and 2-matching solutions using
certain half-integer subtour vertices”. In: Discrete Optimization 8.4 (2011), pp. 525 –539
(cit. on p. 1).
[BEM10] S. Boyd and P. Elliott-Magwood. “Structure of the extreme points of the subtour elim-
ination polytope of the STSP”. In: Combinatorial Optimization and Discrete Algorithms.
Vol. B23. 2010, pp. 33–47 (cit. on p. 1).
[Ben95] András A. Benczúr. “A Representation of Cuts within 6/5 Times the Edge Connec-
tivity with Applications”. In: FOCS. 1995, pp. 92–102 (cit. on p. 3).
[Ben97] Andras A. Benczúr. “Cut structures and randomized algorithms in edge-connectivity
problems”. PhD thesis. MIT, 1997 (cit. on p. 7).
[BG08] András A. Benczúr and Michel X. Goemans. “Deformable Polygon Representation
and Near-Mincuts”. In: Building Bridges: Between Mathematics and Computer Science,
M. Groetschel and G.O.H. Katona, Eds., Bolyai Society Mathematical Studies 19 (2008),
pp. 103–135 (cit. on p. 3).
[BP91] S. C. Boyd and William R. Pulleyblank. “Optimizing over the subtour polytope of the
travelling salesman problem”. In: Math. Program. 49 (1991), pp. 163–187 (cit. on p. 1).
[BS20] René van Bevern and Viktoriia A. Slugina. “A historical note on the 3/2-approximation
algorithm for the metric traveling salesman problem”. 2020 (cit. on p. 1).
[Chr76] Nicos Christofides. Worst Case Analysis of a New Heuristic for the Traveling Salesman
Problem. Report 388. Pittsburgh, PA: Graduate School of Industrial Administration,
Carnegie-Mellon University, 1976 (cit. on p. 1).
[CV00] Robert D. Carr and Santosh Vempala. “Towards a 4/3 approximation for the asym-
metric traveling salesman problem”. In: SODA. 2000, pp. 116–125 (cit. on p. 1).
[Dar64] J. N. Darroch. “On the distribution of the number of successes in independent trials”.
In: Ann. Math. Stat. 36 (1964), pp. 1317–1321 (cit. on p. 10).
[DFJ59] G.B. Dantzig, D.R. Fulkerson, and S. Johnson. “On a Linear Programming Combi-
natorial Approach to the Traveling Salesman Problem”. In: OR 7 (1959), pp. 58–66
(cit. on p. 2).
[DHM07] Erik D. Demaine, MohammadTaghi Hajiaghayi, and Bojan Mohar. “Approximation
algorithms via contraction decomposition”. In: SODA. 2007, pp. 278–287 (cit. on p. 1).
[DKL76] E.A. Dinits, A.V. Karzanov, and M.V. Lomonosov. “On the structure of a family of
minimal weighted cuts in graphs”. In: Studies in Discrete Mathematics (in Russian), ed.
A.A. Fridman, 290-306, Nauka (Moskva) (1976) (cit. on p. 3).
[Edm70] Jack Edmonds. “Submodular functions, matroids and certain polyhedra”. In: Com-
binatorial Structures and Their Applications. New York, NY, USA: Gordon and Breach,
1970, pp. 69–87 (cit. on p. 6).
[EJ73] Jack Edmonds and Ellis L. Johnson. “Matching, Euler tours and the Chinese post-
man”. In: Mathematical Programming 5.1 (1973), pp. 88–124 (cit. on p. 6).
[FM92] Tomás Feder and Milena Mihail. “Balanced matroids”. In: Proceedings of the twenty-
fourth annual ACM symposium on Theory of Computing. Victoria, British Columbia,
Canada: ACM, 1992, pp. 26–38 (cit. on p. 9).
[GKP95] M. Grigni, E. Koutsoupias, and C. Papadimitriou. “An approximation scheme for pla-
nar graph TSP”. In: FOCS ’95: Proceedings of the 36th Annual Symposium on Foundations
of Computer Science. Washington, DC, USA: IEEE Computer Society, 1995, p. 640. isbn:
0-8186-7183-1 (cit. on p. 1).
[GLS05] David Gamarnik, Moshe Lewenstein, and Maxim Sviridenko. “An improved upper
bound for the TSP in cubic 3-edge-connected graphs”. In: Oper. Res. Lett. 33.5 (Sept.
2005), pp. 467–474 (cit. on p. 1).
[Goe95] Michel X. Goemans. “Worst-Case Comparison of Valid Inequalities for the TSP”. In:
MATH. PROG 69 (1995), pp. 335–349 (cit. on p. 1).
[Gur06] Leonid Gurvits. “Hyperbolic polynomials approach to Van der Waerden/Schrijver-
Valiant like conjectures: sharper bounds, simpler proofs and algorithmic applica-
tions”. In: STOC. Ed. by Jon M. Kleinberg. ACM, 2006, pp. 417–426 (cit. on p. 3).
[Gur08] Leonid Gurvits. “Van der Waerden/Schrijver-Valiant like Conjectures and Stable (aka
Hyperbolic) Homogeneous Polynomials: One Theorem for all”. In: Electr. J. Comb. 15.1
(2008) (cit. on p. 3).
[GW17] Kyle Genova and David P. Williamson. “An Experimental Evaluation of the Best-of-
Many Christofides’ Algorithm for the Traveling Salesman Problem”. In: Algorithmica
78.4 (2017), pp. 1109–1130 (cit. on p. 2).
[HK70] M. Held and R.M. Karp. “The traveling salesman problem and minimum spanning
trees”. In: Operations Research 18 (1970), pp. 1138–1162 (cit. on p. 2).
[HLP52] G. H. Hardy, J. E. Littlewood, and G. Polya. Inequalities. Cambridge Univ. Press, 1952
(cit. on p. 10).
[HN19] Arash Haddadan and Alantha Newman. “Towards Improving Christofides Algo-
rithm for Half-Integer TSP”. In: ESA. Ed. by Michael A. Bender, Ola Svensson, and
Grzegorz Herman. Vol. 144. LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Infor-
matik, 2019, 56:1–56:12 (cit. on p. 1).
[HNR17] Arash Haddadan, Alantha Newman, and R. Ravi. “Cover and Conquer: Augmenting
Decompositions for Connectivity Problems”. abs/1707.05387. 2017 (cit. on p. 1).
[Hoe56] W. Hoeffding. “On the distribution of the number of successes in independent trials”.
In: Ann. Math. Statist. 27 (1956), pp. 713–721 (cit. on pp. 10, 14).
[KKO20] Anna R. Karlin, Nathan Klein, and Shayan Oveis Gharan. “An improved approxi-
mation algorithm for TSP in the half integral case”. In: STOC. Ed. by Konstantin
Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Kamath, and Julia Chuzhoy.
ACM, 2020, pp. 28–39 (cit. on p. 1).
[Kle05] Philip N. Klein. “A linear-time approximation scheme for planar weighted TSP”. In:
FOCS. 2005, pp. 647–657 (cit. on p. 1).
[KLS15] Marek Karpinski, Michael Lampis, and Richard Schmied. “New inapproximability
bounds for TSP”. In: Journal of Computer and System Sciences 81.8 (2015), pp. 1665 –
1677. issn: 0022-0000 (cit. on p. 1).
[Mit99] Joseph SB Mitchell. “Guillotine subdivisions approximate polygonal subdivisions: A
simple polynomial-time approximation scheme for geometric TSP, k-MST, and re-
lated problems”. In: SIAM Journal on Computing 28.4 (1999), pp. 1298–1309 (cit. on
p. 1).
[MS11] Tobias Moemke and Ola Svensson. “Approximating Graphic TSP by Matchings”. In:
FOCS. 2011, pp. 560–569 (cit. on p. 1).
[Muc12] M. Mucha. “13/9-approximation for graphic TSP.” In: STACS. 2012, pp. 30–41 (cit. on p. 1).
[OSS11] Shayan Oveis Gharan, Amin Saberi, and Mohit Singh. “A Randomized Rounding
Approach to the Traveling Salesman Problem”. In: FOCS. IEEE Computer Society,
2011, pp. 550–559. isbn: 978-0-7695-4571-4 (cit. on pp. 1–4, 7, 17).
[Ser78] A. I. Serdyukov. “O nekotorykh ekstremal’nykh obkhodakh v grafakh”. In: Upravlyae-
mye sistemy 17 (1978), pp. 76–79 (cit. on p. 1).
[SV12] András Sebö and Jens Vygen. “Shorter Tours by Nicer Ears:” CoRR abs/1201.1870.
2012 (cit. on pp. 1, 2).
[SV19] Damian Straszak and Nisheeth K. Vishnoi. “Maximum Entropy Distributions: Bit
Complexity and Stability”. In: COLT. Ed. by Alina Beygelzimer and Daniel Hsu.
Vol. 99. Proceedings of Machine Learning Research. PMLR, 2019, pp. 2861–2891 (cit.
on p. 19).
[SW90] D. B. Shmoys and D. P. Williamson. “Analyzing the Held-Karp TSP bound: a mono-
tonicity property with application”. In: Inf. Process. Lett. 35.6 (Sept. 1990), pp. 281–285
(cit. on p. 1).
[SWZ12] Frans Schalekamp, David P. Williamson, and Anke van Zuylen. “A proof of the Boyd-
Carr conjecture”. In: SODA. 2012, pp. 1477–1486 (cit. on pp. 1, 24).
[SWZ13] Frans Schalekamp, David P. Williamson, and Anke van Zuylen. “2-Matchings, the
Traveling Salesman Problem, and the Subtour LP: A Proof of the Boyd-Carr Conjec-
ture”. In: Mathematics of Operations Research 39.2 (2013), pp. 403–417 (cit. on p. 24).
[TVZ20] Vera Traub, Jens Vygen, and Rico Zenklusen. “Reducing path TSP to TSP”. In: STOC.
Ed. by Konstantin Makarychev, Yury Makarychev, Madhur Tulsiani, Gautam Ka-
math, and Julia Chuzhoy. ACM, 2020, pp. 14–27 (cit. on p. 1).
[Wol80] Laurence A. Wolsey. “Heuristic analysis, linear programming and branch and bound”.
In: Combinatorial Optimization II. Vol. 13. Mathematical Programming Studies. Springer
Berlin Heidelberg, 1980, pp. 121–134 (cit. on pp. 1, 17).
Figure 13: The edge bundle e = (u, v) and the set V = δ(v)−e.
Proof. Let A, B, C be the degree partitioning of δ(u). Let V := δ(v)−e (see Fig. 13). Condition u, v to be trees, and e and C to 0, and let ν be the resulting measure. This happens with probability at least 0.5 and increases marginals in A−e, B−e, V by at most xe + 2ǫ1/1 + ǫη ≤ xe + 2.1ǫ1/1, and the tree conditioning decreases marginals by at most 2ǫη. After conditioning, we have
where we used ǫ1/2 ≤ 0.001 and 12ǫ1/1 < ǫ1/2 and xe( A) , xe( B) , xe( A) + xe( B) ≤ xe ≤ 1/2 − ǫ1/2 . It
immediately follows from Proposition 5.1 that P ν [ A T = BT = 1, VT = 2] is at least a constant. In
the rest of the proof, we do a more refined analysis. Using A T + BT ≥ 1, VT ≥ 1,
P ν [VT = 2| A T + BT + VT = 4] ≥ 0.13.
Putting these together we have
as desired.
Lemma 5.23. Let e = (u, v) be a top edge bundle such that xe ≥ 1/2 + ǫ1/2. If 12ǫ1/1 ≤ ǫ1/2 ≤ 0.001, then e is 2-1-1 happy with respect to u with probability at least 0.006ǫ1/2².
Proof. Let A, B, C be the degree partitioning of the edges in δ(u), and V = δ−e(v). Condition u, v to be trees, CT = 0, and u ∪ v to be a tree (in order). This happens with probability at least 1/2 + ǫ1/2 − 3ǫη − 2ǫ1/1 ≥ 0.5. Let ν be the resulting measure restricted to edges in A, B, V. Note that ν on edges in A, B, V is SR. This is because ν is a product of two strongly Rayleigh distributions on the following two disjoint sets of edges: (i) the edges between u, v and (ii) the edges in A−e, B−e, V.
Furthermore, observe that under ν, the marginal of every set of edges in A−e, B−e, V increases by at most 2ǫ1/1 + ǫη < 0.2ǫ1/2 (using 12ǫ1/1 ≤ ǫ1/2), and decreases by at most 1 − xe + 2ǫη. Therefore,
P ν [ A T + BT = 2| A T + BT + VT = 3] ≥ 0.12.
P ν [ A T = 1| A T + BT = 2, VT = 1] ≥ 0.09ǫ1/2 .
where we used ǫ1/2 < 0.001.
Finally,
    P[e 2-1-1 happy] ≥ (0.09ǫ1/2) · 0.12 · (ǫ1/2) · 0.5 ≥ 0.005ǫ1/2²,
as desired.
Figure 14: The degree partitioning A, B of δ(u), the parts e(A), e(B) of the bundle e = (u, v), and the set V = δ(v)−e.
Lemma 5.24. For a good half top edge bundle e = (u, v), let A, B, C be the degree partitioning of δ(u),
and let V = δ(v)−e (see Fig. 14). If xe( B) ≤ ǫ1/2 and P [( A−e ) T + VT ≤ 1] ≥ 5ǫ1/2 then e is 2-1-1
good, i.e.,
    P[e 2-1-1 happy w.r.t. u] ≥ 0.005ǫ1/2².
Proof. The proof is similar to Lemma 5.23. We condition u, v to be trees, CT = 0, u ∪ v to be a tree.
Let ν be the resulting SR measure on edges in A, B, V. The main difference is since xe 6≥ 1/2 + ǫ1/2
we use the lemma’s assumptions to lower bound P ν [ A T + BT + VT = 3] , P ν [ A T + VT ≤ 2] , P ν [ BT + VT ≤ 2].
First, since e is 2-2 good, by Lemma 5.15 and negative association,
    Pν[(δ(u)−e)T + VT ≤ 2] ≥ P[(δ(u)−e)T + VT ≤ 2] − P[CT ≠ 0] ≥ 0.4ǫ1/2 − 2ǫ1/1 − ǫη ≥ 0.22ǫ1/2,
where we used ǫ1/1 < 12ǫ1/2 . Letting pi = P [(δ(u)−e ) T + VT = i ], we therefore have p≤2 ≥
0.22ǫ1/2 . In addition, by Lemma 2.21, p3 ≥ 1/4. If p2 < 0.2ǫ1/2 , then from p2 /p3 ≤ 0.8ǫ1/2 , we
could use log-concavity to derive a contradiction to p≤2 ≥ 0.22ǫ1/2 (analogously to what’s done
in the proof of Lemma 2.18). Therefore, we must have
P ν [ A T + BT + VT = 3] = P ν [(δ(u)−e ) T + VT = 2] ≥ 0.2ǫ1/2 .
Next, notice since P [u, v, u ∪ v trees, CT = 0] ≥ 0.49, by the lemma’s assumption, P ν [e( B)] ≤
2.01ǫ1/2 . Therefore,
E ν [ BT + VT ] ≤ x(V ) + x( B) + 1.01ǫ1/2 + 2ǫ1/1 + ǫη ≤ 2.51.
So, by Markov, P ν [ BT + VT ≤ 2] ≥ 0.15. Finally, by negative association,
    Pν[AT + VT ≤ 2] ≥ Pν[(A−e)T + VT ≤ 1] ≥ P[(A−e)T + VT ≤ 1] − P[CT ≠ 0] ≥ 4.8ǫ1/2
where we used the lemma’s assumption.
Now, following the same line of arguments as in Lemma 5.23, we have Pν[AT + BT = 2 | AT + BT + VT = 3] ≥ 0.12. Also, Pν[AT ≥ 1 | AT + BT + VT = 3] ≥ 3.02ǫ1/2, which implies Pν[AT = 1 | AT + BT = 2, VT = 1] ≥ 0.42ǫ1/2. This implies
    P[e 2-1-1 happy] ≥ (0.42ǫ1/2) · 0.12 · (0.2ǫ1/2) · 0.498 ≥ 0.005ǫ1/2²
as desired.
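A one-line check of the final product above (our variable name; any ǫ1/2 ≤ 0.001 gives the same conclusion since both sides scale with ǫ1/2²):

```python
eps_half = 0.0002
print(0.42 * eps_half * 0.12 * 0.2 * eps_half * 0.498 >= 0.005 * eps_half ** 2)  # True
```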
Lemma 5.25. Let e = (v, u) and f = (v, w) be good half top edge bundles and let A, B, C be the
degree partitioning of δ(v) such that xe(B), xf(B) ≤ ǫ1/2. Then, one of e, f is 2-1-1 happy with probability at least 0.005ǫ1/2².
Proof. Let U = δ(u)−e . By Lemma 2.27, we can assume, without loss of generality, that
    E[UT | f ∉ T, u, v, w trees] ≤ x(U) + 0.405 + 3ǫη.    (52)
    E[(A−e−f)T] ≥ E[(A−e−f)T | f ∉ T, u, v, w trees] · P[f ∉ T, u, v, w trees]
                ≥ E[(A−e−f)T | f ∉ T, u, v, w trees] · 0.49.
So,
    E[(A−e−f)T | f ∉ T, u, v, w trees] ≤ (1/0.49) x(A−e−f) ≤ (1/0.49)(4ǫ1/2 + ǫη) ≤ 8.2ǫ1/2.    (53)
Combining (52) and (53), we get E[UT + (A−e)T | f ∉ T, u, v, w trees] ≤ 1.91, where we used ǫ1/2 ≤ 0.001. Therefore, using Lemma 2.21, we get
Figure 15: The degree partitioning A, B of δ(u), the parts e(A), e(B) of the bundle e, and the sets X = A−e ∪ B−e and Y = δ(v)−e.
Lemma 5.26. Let e = (u, v) be a good half edge bundle and let A, B, C be the degree partitioning of δ(u)
(see Fig. 15). If 12ǫ1/1 ≤ ǫ1/2 ≤ 0.001 and xe( A) , xe( B) ≥ ǫ1/2 , then
    P[e 2-1-1 happy w.r.t. u] ≥ 0.02ǫ1/2².
Proof. Condition CT to be zero and u, v, and u ∪ v to be trees. This happens with probability at least 0.49. Let ν be the resulting measure. Let X = A−e ∪ B−e and Y = δ(v)−e. Since e is 2-2 good, by Lemma 5.15 and stochastic dominance,
So,
Let E be the event {XT = YT = 1} under ν. Note that under ν we always choose exactly one edge from the bundle e, and this choice is independent of the edges in X, Y, in particular of the above event. Therefore, we can correct the parity of A, B by choosing from e(A) or e(B). It follows that
    P[e 2-1-1 happy w.r.t. u] ≥ Pν[E] · (1.99ǫ1/2) · 0.49 ≥ 0.02ǫ1/2²,
where we used that E ν [e( A) T ] ≥ 1.99ǫ1/2 , and the same fact for e( B) T . To see why this latter fact
is true, observe that conditioned on u, v trees, we always sample at most one edge between u, v.
Therefore, since under ν we choose exactly one edge between u, v, the probability of choosing
from e( A) (and similarly choosing from e( B)) is at least
as desired.
Figure 16: Setting of Lemma 5.27. We assume that the dotted green/blue edges are at most ǫ1/2 .
Note that edges of C are not shown.
Lemma 5.27. Let e = (u, v), f = (v, w) be two good top half edge bundles and let A, B, C be the degree partitioning of δ(v) such that xe(B), xf(A) ≤ ǫ1/2. If e, f are not 2-1-1 good with respect to v, and 12ǫ1/1 ≤ ǫ1/2 ≤
0.0002, then e, f are 2-2-2 happy with probability at least 0.01.
Proof. First, observe that by Lemma 5.24 if P[UT + (A−e)T ≤ 1] ≥ 0.25ǫ, where ǫ ≥ 20ǫ1/2 is a constant that we fix later, then e is 2-1-1 good and we are done. So, assume P[UT + (A−e)T ≥ 2] ≥
1 − 0.25ǫ. Furthermore, let q = P [UT + ( A−e ) T ≥ 3]. Since x(U ) + x( A−e ) ≤ 2 + 3ǫ1/2 + 2ǫ1/1 +
3ǫη ≤ 2 + 3.2ǫ1/2 (where we used xe( A) ≥ xe − xe( B) − xC ≥ 1/2 − 2ǫ1/2 − 2ǫ1/1 − ǫη and where
we used 12ǫ1/1 ≤ ǫ1/2 ),
2(1 − q − 0.25ǫ) + 3q ≤ 2 + 3.2ǫ1/2 .
This implies that q ≤ 0.5ǫ + 3.2ǫ1/2 ≤ 0.75ǫ (for ǫ ≥ 13ǫ1/2 ). Therefore,
    z = P[Z = 1] ≤ P[DT = 2, Z = 1] + P[DT + 2ZT ≠ 4] ≤ P[DT = 2] + 2.1ǫ,
    1 − z = P[Z = 0] ≤ P[DT = 4, Z = 0] + P[DT + 2ZT ≠ 4] ≤ P[DT = 4] + 2.1ǫ
So, for the rest of the proof we assume E[ZT | u, v, w trees] < 3ǫ. A similar proof shows e, f are 2-2-2 good when E[ZT | u, v, w trees] > 1 − 3ǫ. We run the following conditionings in order: u, v, w trees, ZT = 0, CT = 0, e(B), f ∉ T, e(A) ∈ T. Note that e(A) ∈ T is equivalent to u ∪ v being a tree. Call this event E (i.e., the event that all things we conditioned on happen). First, notice
Call this event E (i.e., the event that all things we conditioned on happen). First, notice
P [E ] ≥ (1 − 3ǫη )(1 − 3ǫ − 2ǫ1/1 − ǫη − ǫ1/2 − (1/2 + ǫ1/2 ))(1/2 − 3ǫ1/2 ) ≥ 0.22 ≥ 1/5 (55)
The main insight of the proof is that Eq. (54) holds (up to a larger constant times ǫ) even after conditioning on E, (B−f)T = 0, (A−e)T = 1; so, we can bound the preceding event by just a union bound. The main non-trivial statement is to argue that the expectations of (B−f)T and (A−e)T do not change much under E.
Combining (54) and (55),
We claim that
using ǫ1/2 < 0.0002 and ǫ = 20ǫ1/2. To see this, observe that after each conditioning in E either all marginals increase or all decrease. Furthermore, the events CT = 0, ZT = 0, e(B)T = 0 can increase marginals by at most 3ǫη + 3ǫ1/1 + ǫ1/2; the only other event that can increase (B−f)T is f ∉ T. Now we know P[(B−f)T + WT = 2 | E] ≥ 1 − 5ǫ before and after conditioning f ∉ T. Therefore, by Corollary 2.19, 2 − 10ǫ ≤ E[(B−f)T + WT] ≤ 2 + 25ǫ. But if E[(B−f)T] increased by more than 35ǫ, then either before conditioning f ∉ T, E[(B−f)T + WT] < 2 − 10ǫ, or afterwards it is more than 2 + 25ǫ, which is a contradiction, and completes the proof of (57). A similar argument shows that E[(A−e)T | E] ≤ 0.66.
We also claim that
    E[(A−e)T | E] ≥ x(A−e) − 3ǫη − 35ǫ ≥ 0.33.
As above, everything conditioned on in E increases E[(A−e)T] except for possibly e(A) ∈ T. As above, we know that P[UT + (A−e)T = 2 | E] ≥ 1 − 5ǫ before and after conditioning e(A) ∈ T. So again applying Corollary 2.19, we see that it can't decrease by more than 35ǫ.
It follows that
So, by Lemma 2.21 and Theorem 2.15, P[(A−e)T = 1 | E, (B−f)T = 0] ≥ 0.33e^(−0.33) ≥ 0.237.
Therefore, by (56)
Using ǫ = 20ǫ1/2 and ǫ1/2 ≤ 0.0002 this means both of the above events happen, so e, f are 2-2-2 happy with probability 0.019(1 − ǫ/0.009) > 0.01 as desired.