Lecture Notes in Computer Science 6080

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
TU Dortmund University, Germany
Madhu Sudan
Microsoft Research, Cambridge, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Friedrich Eisenbrand F. Bruce Shepherd (Eds.)

Integer Programming
and Combinatorial
Optimization

14th International Conference, IPCO 2010


Lausanne, Switzerland, June 9-11, 2010
Proceedings

Volume Editors

Friedrich Eisenbrand
École Polytechnique Fédérale de Lausanne
Institute of Mathematics
1015 Lausanne, Switzerland
E-mail: [email protected]

F. Bruce Shepherd
McGill University
Department of Mathematics and Statistics
805 Sherbrooke West, Montreal, Quebec, H3A 2K6, Canada
E-mail: [email protected]

Library of Congress Control Number: 2010926408

CR Subject Classification (1998): F.2, E.1, I.3.5, G.2, G.1.6, F.2.2

LNCS Sublibrary: SL 1 – Theoretical Computer Science and General Issues

ISSN 0302-9743
ISBN-10 3-642-13035-6 Springer Berlin Heidelberg New York
ISBN-13 978-3-642-13035-9 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
springer.com
© Springer-Verlag Berlin Heidelberg 2010
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper 06/3180
Preface

The idea of a refereed conference for the mathematical programming community
was proposed by Ravi Kannan and William Pulleyblank to the Mathematical
Programming Society (MPS) in the late 1980s. Thus IPCO was born, and MPS
has sponsored the conference as one of its main events since IPCO I at the
University of Waterloo in 1990. The conference has become the main forum for
recent results in Integer Programming and Combinatorial Optimization in the
non-Symposium years.
This volume compiles the papers presented at IPCO XIV held June 9-11,
2010, at EPFL in Lausanne. The scope of papers considered for IPCO XIV is
likely broader than at IPCO I. This is sometimes due to the wealth of new
questions and directions brought from related areas. It can also be due to the
successful application of “math programming” techniques to models not tradi-
tionally considered. In any case, the interest in IPCO is greater than ever and
this is reflected in both the number (135) and quality of the submissions. The
Programme Committee with 13 members was also IPCO’s largest. We thank the
members of the committee, as well as their sub-reviewers, for their exceptional
(and time-consuming) work and especially during the online committee meeting
held over January. The process resulted in the selection of 34 excellent research
papers which were presented in non-parallel sessions over three days in Lau-
sanne. Unavoidably, this has meant that many excellent submissions were not
able to be included. As is typical, we would expect to see full versions of many
of the IPCO papers in scientific journals in the not too distant future. Finally,
a sincere thanks to all authors who submitted their current research to IPCO.
It is this support that determines the excellence of the conference.

March 2010
Friedrich Eisenbrand
Bruce Shepherd
Conference Organization

Programme Committee
Alper Atamtürk UC Berkeley
David Avis McGill
Friedrich Eisenbrand EPFL
Marcos Goycoolea Adolfo Ibañez
Oktay Günlük IBM
Satoru Iwata Kyoto
Tamás Király Eötvös Budapest
François Margot CMU
Bruce Shepherd (Chair) McGill
Levent Tunçel Waterloo
Santosh Vempala Georgia Tech
Peter Winkler Dartmouth
Neal E. Young UC Riverside

Local Organization
Michel Bierlaire
Jocelyne Blanc
Friedrich Eisenbrand (Chair)
Thomas Liebling
Martin Niemeier
Thomas Rothvoß
Laura Sanità

External Reviewers
Tobias Achterberg John Birge
Ernst Althaus Jaroslaw Byrka
Reid Andersen Alberto Caprara
Matthew Andrews Deeparnab Chakrabarty
Elliot Anshelevich Chandra Chekuri
Gary Au Kevin Cheung
Mourad Baiou Marek Chrobak
Nina Balcan Jose Coelho de Pina
Nikhil Bansal Michelangelo Conforti
Andre Berger Miguel Constantino
Attila Bernáth Jose Correa
Dan Bienstock Sanjeeb Dash
Santanu Dey Tamas Kis
David Eppstein Robert Kleinberg
Daniel Espinoza Yusuke Kobayashi
Guy Even Jochen Könemann
Uriel Feige Lingchen Kong
Zsolt Fekete Nitish Korula
Christina Fernandes Christos Koufogiannakis
Carlo Filippi Erika Kovacs
Samuel Fiorini Marek Krcal
Nathan Fisher Sven Krumke
Lisa Fleischer Simge Kucukyavuz
Keith Frikken Lap Chi Lau
Tetsuya Fujie Monique Laurent
Toshihiro Fujito Adam Letchford
Ricardo Fukasawa Asaf Levin
Joao Gouveia Sven Leyffer
Marcos Goycoolea Christian Liebchen
Fabrizio Grandoni Jeff Linderoth
Bertrand Guenin Quentin Louveaux
Dave Hartvigsen James Luedtke
Christoph Helmberg Avner Magen
Hiroshi Hirai Dániel Marx
Dorit Hochbaum Monaldo Mastrolilli
Chien-Chung Huang Kurt Mehlhorn
Cor Hurkens Zoltan Miklos
Sungjin Im Hiroyoshi Miwa
Nicole Immorlica Atefeh Mohajeri
Toshimasa Ishii Eduardo Moreno
Takehiro Ito Yiannis Mourtos
Garud Iyengar Kiyohito Nagano
Kamal Jain Arkadi Nemirovski
Klaus Jansen Martin Niemeier
David Johnson Neil Olver
Tibor Jordan Gianpaolo Oriolo
Vincent Jost Gyula Pap
Alpár Jüttner Julia Pap
Satyen Kale Gabor Pataki
George Karakostas Sebastian Pokutta
Anna Karlin Imre Polik
Sean Kennedy David Pritchard
Rohit Khandekar Kirk Pruhs
Sanjeev Khanna Linxia Qin
Samir Khuller Maurice Queyranne
Shuji Kijima R Ravi
Zoltan Kiraly Gerhard Reinelt
Thomas Rothvoß Tamas Szantai
Laura Sanità Tami Tamir
Andreas S. Schulz Torsten Tholey
Andras Sebo Rekha Thomas
David Shmoys László Végh
Marcel Silva Juan Vera
Mohit Singh Adrian Vetta
Christian Sommer Juan Pablo Vielma
Gregory Sorkin Jan Vondrak
Frits Spieksma David Wagner
Clifford Stein Gerhard Woeginger
Ruediger Stephan Mihalis Yannakakis
Nicolas Stier-Moses Giacomo Zambelli
Zoya Svitkina Rico Zenklusen
Chaitanya Swamy Miklos Zoltan
Jacint Szabo
Table of Contents

Solving LP Relaxations of Large-Scale Precedence Constrained Problems
Daniel Bienstock and Mark Zuckerberg

Computing Minimum Multiway Cuts in Hypergraphs from Hypertree Packings
Takuro Fukunaga

Eigenvalue Techniques for Convex Objective, Nonconvex Optimization Problems
Daniel Bienstock

Restricted b-Matchings in Degree-Bounded Graphs
Kristóf Bérczi and László A. Végh

Zero-Coefficient Cuts
Kent Andersen and Robert Weismantel

Prize-Collecting Steiner Network Problems
MohammadTaghi Hajiaghayi, Rohit Khandekar, Guy Kortsarz, and Zeev Nutov

On Lifting Integer Variables in Minimal Inequalities
Amitabh Basu, Manoel Campelo, Michele Conforti, Gérard Cornuéjols, and Giacomo Zambelli

Efficient Edge Splitting-Off Algorithms Maintaining All-Pairs Edge-Connectivities
Lap Chi Lau and Chun Kong Yung

On Generalizations of Network Design Problems with Degree Bounds
Nikhil Bansal, Rohit Khandekar, Jochen Könemann, Viswanath Nagarajan, and Britta Peis

A Polyhedral Study of the Mixed Integer Cut
Steve Tyber and Ellis L. Johnson

Symmetry Matters for the Sizes of Extended Formulations
Volker Kaibel, Kanstantsin Pashkovich, and Dirk Oliver Theis

A 3-Approximation for Facility Location with Uniform Capacities
Ankit Aggarwal, L. Anand, Manisha Bansal, Naveen Garg, Neelima Gupta, Shubham Gupta, and Surabhi Jain

Secretary Problems via Linear Programming
Niv Buchbinder, Kamal Jain, and Mohit Singh

Branched Polyhedral Systems
Volker Kaibel and Andreas Loos

Hitting Diamonds and Growing Cacti
Samuel Fiorini, Gwenaël Joret, and Ugo Pietropaoli

Approximability of 3- and 4-Hop Bounded Disjoint Paths Problems
Andreas Bley and Jose Neto

A Polynomial-Time Algorithm for Optimizing over N-Fold 4-Block Decomposable Integer Programs
Raymond Hemmecke, Matthias Köppe, and Robert Weismantel

Universal Sequencing on a Single Machine
Leah Epstein, Asaf Levin, Alberto Marchetti-Spaccamela, Nicole Megow, Julián Mestre, Martin Skutella, and Leen Stougie

Fault-Tolerant Facility Location: A Randomized Dependent LP-Rounding Algorithm
Jaroslaw Byrka, Aravind Srinivasan, and Chaitanya Swamy

Integer Quadratic Quasi-polyhedra
Adam N. Letchford

An Integer Programming and Decomposition Approach to General Chance-Constrained Mathematical Programs
James Luedtke

An Effective Branch-and-Bound Algorithm for Convex Quadratic Integer Programming
Christoph Buchheim, Alberto Caprara, and Andrea Lodi

Extending SDP Integrality Gaps to Sherali-Adams with Applications to Quadratic Programming and MaxCutGain
Siavosh Benabbas and Avner Magen

The Price of Collusion in Series-Parallel Networks
Umang Bhaskar, Lisa Fleischer, and Chien-Chung Huang

The Chvátal-Gomory Closure of an Ellipsoid Is a Polyhedron
Santanu S. Dey and Juan Pablo Vielma

A Pumping Algorithm for Ergodic Stochastic Mean Payoff Games with Perfect Information
Endre Boros, Khaled Elbassioni, Vladimir Gurvich, and Kazuhisa Makino

On Column-Restricted and Priority Covering Integer Programs
Deeparnab Chakrabarty, Elyot Grant, and Jochen Könemann

On k-Column Sparse Packing Programs
Nikhil Bansal, Nitish Korula, Viswanath Nagarajan, and Aravind Srinivasan

Hypergraphic LP Relaxations for Steiner Trees
Deeparnab Chakrabarty, Jochen Könemann, and David Pritchard

Efficient Deterministic Algorithms for Finding a Minimum Cycle Basis in Undirected Graphs
Edoardo Amaldi, Claudio Iuliano, and Romeo Rizzi

Efficient Algorithms for Average Completion Time Scheduling
René Sitters

Experiments with Two Row Tableau Cuts
Santanu S. Dey, Andrea Lodi, Andrea Tramontani, and Laurence A. Wolsey

An OPT + 1 Algorithm for the Cutting Stock Problem with Constant Number of Object Lengths
Klaus Jansen and Roberto Solis-Oba

On the Rank of Cutting-Plane Proof Systems
Sebastian Pokutta and Andreas S. Schulz

Author Index


Solving LP Relaxations of Large-Scale
Precedence Constrained Problems

Daniel Bienstock¹ and Mark Zuckerberg²

¹ APAM and IEOR Depts., Columbia University
² Resource and Business Optimization Group Function, BHP Billiton Ltd.

Abstract. We describe new algorithms for solving linear programming
relaxations of very large precedence constrained production scheduling
problems. We present theory that motivates a new set of algorithmic
ideas that can be employed on a wide range of problems; on data sets
arising in the mining industry our algorithms prove effective on problems
with many millions of variables and constraints, obtaining provably
optimal solutions in a few minutes of computation.¹

1 Introduction
We consider problems involving the scheduling of jobs over several periods sub-
ject to precedence constraints among the jobs as well as side-constraints. We
must choose the subset of jobs to be performed, and, for each of these jobs,
how to perform it, choosing from among a given set of options (representing
facilities or modes of operation). Finally, there are side-constraints to be satis-
fied, including period-wise, per-facility processing capacity constraints, among
others. There are standard representations of these problems as (mixed) integer
programs.
Our data sets originate in the mining industry, where problems typically have
a small number of side constraints – often well under one hundred – but may
contain millions of jobs and tens of millions of precedences, as well as spanning
multiple planning periods. Appropriate formulations often achieve a small
integrality gap in practice; unfortunately, the linear programming relaxations are
far beyond the practical reach of commercial software.
We present a new iterative algorithm for solving the LP relaxation of this problem.
The algorithm incorporates, at a low level, ideas from Lagrangian relaxation
and column generation, but it is based on fundamental observations on
the underlying combinatorial structure of precedence constrained, capacitated
optimization problems. Rather than updating dual information, the algorithm
uses primal structure gleaned from the solution of subproblems in order to
accelerate convergence. The general version of our ideas should be applicable to
a wide class of problems. The algorithm can be proved to converge to optimality;
in practice we have found that even for problems with millions of variables
and tens of millions of constraints, convergence to proved optimality is usually
obtained in under twenty iterations, with each iteration requiring only a few
seconds on current computer hardware.

¹ The first author was partially funded by a gift from BHP Billiton Ltd., and ONR
Award N000140910327.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 1–14, 2010.
© Springer-Verlag Berlin Heidelberg 2010

2 Definitions and Preliminaries

2.1 The Precedence Constrained Production Scheduling Problem


Definition 1. We are given a directed graph $G = (N, A)$, where the elements
of $N$ represent jobs, and the arcs $A$ represent precedence relationships among
the jobs: for each $(i, j) \in A$, job $j$ can be performed no later than job $i$.
Denote by $F$ the number of facilities and by $T$ the number of scheduling periods.
Let $y_{j,t} \in \{0, 1\}$ represent the choice to process job $j$ in period $t$, and
let $x_{j,t,f} \in [0, 1]$ represent the proportion of job $j$ performed in period $t$
and processed according to processing option, or “facility”, $f$.
Let $c^T x$ be an objective function, and let $Dx \le d$ be a collection of arbitrary
“side” constraints.
The linear programming relaxation of the resulting problem, which we will
refer to as PCPSP, is as follows:

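\begin{align}
\text{(PCPSP):}\quad \max\ & c^T x \tag{1} \\
\text{subject to:}\quad & \sum_{\tau=1}^{t} y_{i,\tau} \le \sum_{\tau=1}^{t} y_{j,\tau}, \quad \forall (i,j) \in A,\ 1 \le t \le T \tag{2} \\
& Dx \le d \tag{3} \\
& y_{j,t} = \sum_{f=1}^{F} x_{j,t,f}, \quad \forall j \in N,\ 1 \le t \le T \tag{4} \\
& \sum_{t=1}^{T} y_{j,t} \le 1, \quad \forall j \in N \tag{5} \\
& x \ge 0. \tag{6}
\end{align}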

For precedence constrained production scheduling problems that occur in the
mining industry some typical numbers are as follows:
– 1 million – 10 million jobs, and 1 million – 100 million precedences,
– 20 – 200 side-constraints, 10 – 20 periods, and 2 – 3 facilities.
These numbers indicate that the number of constraints of the form (2), (4) and
(5) can be expected to be very large.
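To make these orders of magnitude concrete, the following minimal sketch counts the variables and constraints of formulation (1)-(6) for given instance dimensions. The function name `pcpsp_size` and the two sample instances (the low and high ends of the ranges quoted above) are illustrative assumptions, not part of the original text.

```python
def pcpsp_size(num_jobs, num_precedences, num_periods, num_facilities, num_side):
    """Count variables and constraints of formulation (1)-(6).

    Hypothetical helper: one y variable per (job, period), one x variable
    per (job, period, facility), and one row per index of (2), (4), (5).
    """
    y_vars = num_jobs * num_periods
    x_vars = num_jobs * num_periods * num_facilities
    return {
        "variables": x_vars + y_vars,
        "precedence (2)": num_precedences * num_periods,
        "side (3)": num_side,
        "linking (4)": num_jobs * num_periods,
        "assignment (5)": num_jobs,
    }

# Low end of the quoted ranges (illustrative choice).
small = pcpsp_size(10**6, 10**6, 10, 2, 20)
# High end of the quoted ranges (illustrative choice).
large = pcpsp_size(10**7, 10**8, 20, 3, 200)

for name, size in (("small", small), ("large", large)):
    print(name, size)
```

Even at the low end the precedence block (2) alone has ten million rows, while the side constraints (3) number only in the tens or hundreds.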

2.2 Background
The Open Pit Mine Scheduling Problem. The practical motivating prob-
lem behind our study is the open pit mine scheduling problem. We are given a
three-dimensional region representing a mine to be exploited; this region is di-
vided into “blocks” (jobs, from a scheduling perspective) corresponding to units
of earth (“cubes”) that can be extracted in one step. In order for a block to be
extracted, the set of blocks located (broadly speaking) in a cone above it must
be extracted first. This gives rise to a set of precedences, i.e. to a directed graph
whose vertices are the blocks, and whose arcs represent the precedences. Finally,
the extraction of a block entails a certain (net) profit or cost.
The problem of selecting which blocks to extract so as to maximize profit can
be stated as follows:

$$\max \{\, c^T x \;:\; x_i \le x_j \ \ \forall (i, j) \in A, \quad x_j \in \{0, 1\} \ \ \forall j \,\},$$

where as before A indicates the set of precedences. This is the so-called maximum
weight closure problem – in a directed graph, a closure is a set S of vertices such
that there exist no arcs (i, j) with i ∈ S and j ∉ S. It can be solved as a minimum
s-t cut problem in a related graph of roughly the same size. See [P76], and also
[J68], [Bal70] and [R70]. Further discussion can be found in [HC00], where the
authors note (at the end of Section 3.4) that it can be shown by reduction from
max clique that adding a single cardinality constraint to a max closure problem
is enough to make it NP-hard. For additional related material see [F06], [LG65],
[CH03], and references therein.
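The reduction from maximum weight closure to a minimum s-t cut can be sketched as follows. This is a textbook-style illustration under stated assumptions, not the implementation used by the authors: it uses a plain augmenting-path (Edmonds-Karp) max-flow routine, and the function name, the terminal labels and the five-block example are hypothetical. Each positive-weight vertex is attached to the source, each negative-weight vertex to the sink, and each precedence arc receives a capacity large enough that it never crosses a minimum cut.

```python
from collections import deque

def max_weight_closure(weights, arcs):
    """Return (best_weight, closure) via a minimum s-t cut.

    weights: dict vertex -> profit (may be negative).
    arcs: pairs (i, j) meaning "i in S forces j in S" (x_i <= x_j).
    """
    s, t = "_source", "_sink"  # hypothetical terminal labels
    big = sum(abs(w) for w in weights.values()) + 1  # stands in for infinity
    cap = {s: {}, t: {}}  # residual capacities cap[u][v]

    def add_edge(u, v, c):
        cap.setdefault(u, {})
        cap.setdefault(v, {})
        cap[u][v] = cap[u].get(v, 0) + c
        cap[v].setdefault(u, 0)  # reverse residual edge

    for v, w in weights.items():
        cap.setdefault(v, {})
        if w > 0:
            add_edge(s, v, w)
        elif w < 0:
            add_edge(v, t, -w)
    for i, j in arcs:
        add_edge(i, j, big)

    def reachable_from_source():
        parent = {s: None}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = reachable_from_source()
        if t not in parent:
            break  # no augmenting path left: flow is maximum
        path, v = [], t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][v] for u, v in path)
        for u, v in path:
            cap[u][v] -= push
            cap[v][u] += push
        flow += push

    # Source side of the minimum cut (minus the source) is an optimal closure.
    closure = {v for v in parent if v != s}
    total_positive = sum(w for w in weights.values() if w > 0)
    return total_positive - flow, closure

# Hypothetical example: ore block "a" lies under waste blocks "b" and "c";
# ore block "d" lies under waste block "e".
weights = {"a": 5, "b": -2, "c": -1, "d": 1, "e": -3}
arcs = [("a", "b"), ("a", "c"), ("d", "e")]
best, chosen = max_weight_closure(weights, arcs)
print(best, sorted(chosen))
```

In the example it pays to extract "a" together with its overburden, while "d" is left in the ground because removing "e" costs more than "d" earns.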
The problem we are concerned with here, by contrast, also incorporates pro-
duction scheduling. When a block is extracted it will be processed at one of
several facilities with different operating capabilities. The processing of a given
block i at a given facility f consumes a certain amount of processing capacity
vif and generates a certain net profit pif . This overall planning problem spans
several time periods; in each period we will have one or more knapsack (capac-
ity) constraints for each facility. We usually will also have additional, ad-hoc,
non-knapsack constraints. In this version the precedence constraints apply across
periods as per (2): if (i, j) ∈ A then i can only be extracted in the same period
as j or in a later period.
Typically, we need to produce schedules spanning 10 to 20 periods. Addi-
tionally, we may have tens of thousands (or many more) blocks; this can easily
make for an optimization problem with millions of variables and tens of millions
of precedence constraints, but with (say) on the order of one hundred or fewer
processing capacity constraints (since the total number of processing facilities is
typically small).

Previous work. A great deal of research has been directed toward algorithms
for the maximum weight closure problem, starting with [LG65] and culminating
in the very efficient method described in [H08] (also see [CH09]). A “nested
shells” heuristic for the capacitated, multiperiod problem, based on the work in
[LG65], is applicable to problems with a single capacity constraint, among other
simplifications. As commercial integer programming software has improved, mine
scheduling software packages have recently emerged that aggregate blocks in order
to yield a mixed integer program of tractable size. The required degree of
aggregation can however be enormous; this can severely compromise the validity and
the usefulness of the solution. For an overview of other heuristic approaches that
have appeared in the open mine planning literature, see [HC00] and [F06].
Recently (and independently of our work) there has been some new work relevant
to the solution of the LP relaxation of the open pit mine scheduling problem.
[BDFG09]² have suggested a new approach in which blocks are aggregated only
with respect to the digging decisions but not with respect to the processing
decisions, i.e. all original blocks in an aggregate must be extracted in a common
period, but the individual blocks comprising an aggregate can be processed
in different ways. This problem is referred to by the authors as the “Optimal
Binning Problem”. As long as there is more than one processing option this
approach still maintains variables for each block and period and is therefore still
very large, but the authors propose an algorithm for the LP relaxation of this
problem that is only required to solve a sequence of linear programs with a number
of variables on the order of the number of aggregates (times the number of
periods) in order to come to a solution of the large LP. Thus if the number of
aggregates is small the LP can be solved quickly.
Another development that has come to our attention recently is an algorithm
by [CEGMR09] which can solve the LP relaxation of even very large instances
of the open pit mine scheduling problem very efficiently. This algorithm is only
applicable however to problems for which there is a single processing option and
for which the only constraints are knapsacks and there is a single such constraint
in each scheduling period. The authors note however that more general problems
can be relaxed to have this form in order to yield an upper bound on the solution
value.
From a broad perspective, the method we give below uses dual information in
order to effectively reduce the size of the linear program; in this sense our work
is similar to that in [BDFG09]. In the full version of this paper we describe what
our algorithm would look like when applied to the aggregated problem treated
by [BDFG09], which is a special case of ours. The relationship between the max
closure problem and the LP is a theme in common with the work of [CEGMR09].

3 Our Results
Empirically, it can be observed that formulation (1)-(6) frequently has a small
integrality gap. We present a new algorithm for solving the continuous relaxation
of this formulation and generalizations. Our algorithm is applicable to problems
with an arbitrary number of process options and arbitrary side constraints, and
it requires no aggregation. On very large, real-world instances our algorithm
proves very efficient.

² We interacted with [BDFG09] as part of an industrial partnership, but our work was
performed independently.

Our algorithmic developments hinge on three ideas. In order to describe these
ideas, we will first recast PCPSP as a special case of a more general problem, to
which these results (and our solution techniques) apply.

Definition 2. Given a directed graph G = (N, A) with n vertices, and a system
Dx ≤ d of d constraints on n variables, the General Precedence Constrained
Problem is the following linear program:

\begin{align}
\text{(GPCP):}\quad \max\ & c^T x \tag{7} \\
& Dx \le d \tag{8} \\
& x_i - x_j \le 0, \quad \forall (i, j) \in A, \tag{9} \\
& 0 \le x_j \le 1, \quad \forall j \in N. \tag{10}
\end{align}

This problem is more general than PCPSP:

Lemma 1. Any instance of PCPSP can be reduced to an equivalent instance of
GPCP with the same number of variables and of constraints.

Proof. Consider an instance of PCPSP on G = (N, A), with T time periods,
F facilities and side constraints Dx ≤ d. Note that the y variables can be
eliminated. Consider the following system of inequalities on variables $z_{j,t,f}$
(j ∈ N, 1 ≤ t ≤ T, 1 ≤ f ≤ F):

\begin{align}
z_{j,t,f} - z_{j,t,f+1} &\le 0, \quad \forall j \in N,\ 1 \le t \le T,\ 1 \le f < F, \tag{11} \\
z_{j,t,F} - z_{j,t+1,1} &\le 0, \quad \forall j \in N,\ 1 \le t < T, \tag{12} \\
z_{j,T,F} &\le 1, \quad \forall j \in N, \tag{13} \\
z_{i,t,F} - z_{j,t,F} &\le 0, \quad \forall (i, j) \in A,\ 1 \le t \le T, \tag{14} \\
z &\ge 0. \tag{15}
\end{align}

Given a solution (x, y) to PCPSP, we obtain a solution z to (11)-(15) by setting,
for all j, t and f:

$$z_{j,t,f} = \sum_{\tau=1}^{t-1} \sum_{f'=1}^{F} x_{j,\tau,f'} + \sum_{f'=1}^{f} x_{j,t,f'},$$

and conversely. Thus, for an appropriate system $\bar{D} z \le \bar{d}$ (with the same number
of rows as Dx ≤ d) and objective $\bar{c}^T z$, PCPSP is equivalent to the linear program:

$$\max\{\bar{c}^T z : \bar{D} z \le \bar{d}, \text{ and constraints (11)-(15)}\}.$$
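The change of variables in the proof is a running sum of the x values of one job, taken in (period, facility) order. The sketch below illustrates it for a single job; the function names and the sample data are assumptions introduced for illustration, and exact rational arithmetic is used only so that the checks are free of rounding error.

```python
from fractions import Fraction

def x_to_z(x):
    """Cumulative transform for one job: x[t][f] -> z[t][f] (0-based indices)."""
    z, running = [], 0
    for row in x:              # periods in increasing order
        z_row = []
        for value in row:      # facilities in increasing order
            running += value
            z_row.append(running)
        z.append(z_row)
    return z

def z_to_x(z):
    """Inverse map: consecutive differences of z in (t, f) order."""
    x, previous = [], 0
    for row in z:
        x_row = []
        for value in row:
            x_row.append(value - previous)
            previous = value
        x.append(x_row)
    return x

def satisfies_11_to_13(z):
    """Check the chain constraints (11)-(13) and z >= 0 for one job."""
    flat = [value for row in z for value in row]
    nondecreasing = all(a <= b for a, b in zip(flat, flat[1:]))
    return nondecreasing and flat[0] >= 0 and flat[-1] <= 1

# Hypothetical job with T = 2 periods and F = 2 facilities.
x_job = [[Fraction(1, 4), Fraction(0)], [Fraction(1, 2), Fraction(1, 8)]]
z_job = x_to_z(x_job)
print(z_job, satisfies_11_to_13(z_job))
```

Because z is nondecreasing along the chain, constraints (11) and (12) are precedence constraints of the form (9), which is why the reduction adds precedences but no new rows.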

Note: In Lemma 1 the number of precedences in the instance of GPCP is larger
than in the original instance of PCPSP; nevertheless we stress that the number
of constraints (and variables) is indeed the same in both instances.
We will now describe ideas that apply to GPCP. First, we have the following
remark.

Observation 1. Consider an instance of problem GPCP, and let π ≥ 0 be a
given vector of dual variables for the side-constraints (8). Then the Lagrangian
obtained by dualizing (8) using π,

\begin{align}
\max\ & c^T x + \pi^T (d - Dx) \tag{16} \\
\text{subject to:}\quad & x_i - x_j \le 0, \quad \forall (i, j) \in A, \tag{17} \\
& 0 \le x_j \le 1, \quad \forall j \in N, \tag{18}
\end{align}

is a maximum closure problem with |A| precedences.
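One way to see Observation 1 is to rewrite the objective (16) as $(c - D^T \pi)^T x + \pi^T d$: the multipliers only change the vertex weights and add a constant, while the precedence structure is untouched. The following sketch computes these modified weights; the function name and the small numerical instance are hypothetical.

```python
def lagrangian_weights(c, D, d, pi):
    """Return (w, const) such that c^T x + pi^T (d - D x) = w^T x + const."""
    rows = range(len(D))
    w = [c[j] - sum(pi[k] * D[k][j] for k in rows) for j in range(len(c))]
    const = sum(pi[k] * d[k] for k in rows)
    return w, const

# Hypothetical instance: three jobs and one side constraint x_1 + x_2 + x_3 <= 2.
c = [5, -2, -1]
D = [[1, 1, 1]]
d = [2]
pi = [1]
w, const = lagrangian_weights(c, D, d, pi)

# Sanity check of the identity at an arbitrary point.
x = [1, 1, 0]
lhs = sum(cj * xj for cj, xj in zip(c, x)) + pi[0] * (d[0] - sum(D[0][j] * x[j] for j in range(3)))
rhs = sum(wj * xj for wj, xj in zip(w, x)) + const
print(w, const, lhs, rhs)
```

The weights w can then be handed to any maximum weight closure routine together with the original arc set A.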


Note: There is a stronger version of Observation 1 in the specific case of problem
PCPSP; namely, the x variables can be eliminated from the Lagrangian (details:
full paper, also see [BZ09]).
Observation 1 suggests that a Lagrangian relaxation algorithm for solving
problem GPCP – that is to say, an algorithm that iterates by solving problems
of the form (16)-(18) for various vectors π – would enjoy fast individual
iterations. This is correct, as our experiments confirm that even extremely large
max closure instances can be solved quite fast using the appropriate algorithm
(details below). However, in our experiments we also observed that traditional
Lagrangian relaxation methods (such as subgradient optimization), applied to
GPCP, performed quite poorly, requiring vast numbers of iterations and never
quite converging to solutions of the desired accuracy.
Our approach, instead, relies on leveraging combinatorial structure that optimal
solutions to GPCP must satisfy. Lemmas 2 and 3 are critical in suggesting
such structure.
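Lemma 2. Let $P = \{x \in \mathbb{R}^n : Ax \le b,\ Dx \le d\}$, where A, D, b, d are matrices
and vectors of appropriate dimensions. Let x̂ be an extreme point of P. Let
$\bar{A}x = \bar{b}$, $\bar{D}x = \bar{d}$ be the set of binding constraints at x̂. Assume $\bar{D}$ has q linearly
independent rows, and let $N^{\hat{x}}$ be the null space of $\bar{A}$. Then $\dim(N^{\hat{x}}) \le q$.
Proof: Since x̂ is an extreme point, the binding constraints have rank n. The rows
of $\bar{D}$ contribute at most q to this rank, so $\bar{A}$ must have at least n − q linearly
independent rows and thus its null space must have dimension ≤ q.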
Lemma 3. Let P be the feasible space of a GPCP with q side constraints.
Denote by Ax ≤ b the subset of constraints containing the precedence constraints
and the constraints 0 ≤ x ≤ 1, and let Dx ≤ d denote the side constraints. Let x̂
be an extreme point of P, and suppose the entries of x̂ attain k distinct fractional
values $\{\alpha_1, \ldots, \alpha_k\}$. For 1 ≤ r ≤ k, let $\theta^r \in \{0, 1\}^n$ be defined by:

$$\theta^r_j = \begin{cases} 1, & \text{if } \hat{x}_j = \alpha_r, \\ 0, & \text{otherwise,} \end{cases} \qquad \text{for } 1 \le j \le n.$$

Let $\bar{A}$ be the submatrix of A containing the binding constraints at x̂. Then the
vectors $\theta^r$ are linearly independent and belong to the null space of $\bar{A}$. As a
consequence, k ≤ q.
Proof: First we prove that $\bar{A}\theta^r = 0$. Given a precedence constraint $x_i - x_j \le 0$,
if the constraint is binding then $\hat{x}_i = \hat{x}_j$. Thus if $\hat{x}_i = \alpha_r$, so that $\theta^r_i = 1$, then
$\hat{x}_j = \alpha_r$ also, and so $\theta^r_j = 1$ as well, and so $\theta^r_i - \theta^r_j = 0$. By the same token if
$\hat{x}_i \ne \alpha_r$ then $\hat{x}_j \ne \alpha_r$ and again $\theta^r_i - \theta^r_j = 0$. If a constraint $x_i \ge 0$ or $x_i \le 1$ is
binding at x̂ then naturally $\theta^r_i = 0$ for all r as $\hat{x}_i$ is not fractional. The supports
of the $\theta^r$ vectors are disjoint, yielding linear independence. Finally, k ≤ q follows
from Lemma 2.
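The null-space part of this argument can be checked numerically on a small example. The sketch below builds the indicator vectors $\theta^r$ for a point satisfying (9)-(10) and multiplies them against the binding precedence and bound rows. The point `x_hat` and the arc list are hypothetical; the point is not claimed to be an extreme point of any particular instance, which is only needed for the final step k ≤ q.

```python
from fractions import Fraction

def theta_vectors(x_hat):
    """One 0/1 indicator vector per distinct fractional value of x_hat."""
    alphas = sorted({v for v in x_hat if 0 < v < 1})
    return alphas, [[1 if v == a else 0 for v in x_hat] for a in alphas]

def binding_rows(x_hat, arcs):
    """Rows of A-bar: binding precedences x_i - x_j <= 0 and binding bounds."""
    n, rows = len(x_hat), []
    for i, j in arcs:
        if x_hat[i] == x_hat[j]:
            row = [0] * n
            row[i], row[j] = 1, -1
            rows.append(row)
    for j, v in enumerate(x_hat):
        if v == 0 or v == 1:
            row = [0] * n
            row[j] = 1
            rows.append(row)
    return rows

# Hypothetical point on a path-shaped precedence graph (0-based vertex labels).
x_hat = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(0)]
arcs = [(1, 0), (2, 1), (3, 2), (4, 3)]

alphas, thetas = theta_vectors(x_hat)
products = [
    [sum(a * t for a, t in zip(row, theta)) for row in binding_rows(x_hat, arcs)]
    for theta in thetas
]
print(alphas, thetas, products)
```

Every product is zero, and the two indicator vectors have disjoint supports, matching the two distinct fractional values 1/3 and 1/2.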

Observation 1 implies that an optimal solution $x^*$ to an instance of GPCP can
be written as a weighted sum of incidence vectors of closures, i.e.,

$$x^* = \sum_{q=1}^{Q} \mu_q v^q, \tag{19}$$

where μ ≥ 0, and, for each q, $v^q \in \{0, 1\}^n$ is the incidence vector of a closure
$S^q \subset N$. [In fact, the $S^q$ can be assumed to be nested.] So for any i, j ∈ N,
$x^*_j = x^*_i$ if i and j belong to precisely the same family of sets $S^q$. Also, Lemma 3
states that the number of distinct values that $x^*_j$ can take is small, if the number
of side constraints is small. Therefore it can be shown that when the number
of side constraints is small the number of closures (terms) in (19) must also be
small. In the full paper we show that a rich relationship exists between the max
closures produced by Lagrangian problems and the optimal dual and primal
solutions to GPCP. Next, we will develop an algorithm that solves GPCP by
attempting to “guess” the correct representation (19).
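A representation of the form (19) with nested closures can be obtained from the level sets of any point satisfying (9)-(10): for each distinct positive value α, the set of indices j with $x_j \ge \alpha$ is a closure, and consecutive differences of the values serve as weights. The sketch below is one standard construction, offered as an illustration rather than as the authors' procedure; the function name and the sample point are assumptions.

```python
from fractions import Fraction

def nested_closure_decomposition(x):
    """Write x as sum_q mu_q * (indicator of S^q) with S^1 ⊂ S^2 ⊂ ... nested.

    Assumes x satisfies 0 <= x <= 1 and x_i <= x_j on every precedence arc,
    so that each level set {j : x_j >= alpha} is a closure.
    """
    levels = sorted({v for v in x if v > 0}, reverse=True)
    terms = []
    for q, alpha in enumerate(levels):
        next_alpha = levels[q + 1] if q + 1 < len(levels) else 0
        support = {j for j, v in enumerate(x) if v >= alpha}
        terms.append((alpha - next_alpha, support))
    return terms

def is_closure(support, arcs):
    """No arc (i, j) leaves the set: i in S implies j in S."""
    return all(j in support for i, j in arcs if i in support)

# Hypothetical point and precedence arcs (0-based vertex labels).
x = [Fraction(1), Fraction(1, 2), Fraction(1, 2), Fraction(1, 3), Fraction(0)]
arcs = [(1, 0), (2, 1), (3, 2), (4, 3)]

terms = nested_closure_decomposition(x)
rebuilt = [sum(mu for mu, s in terms if j in s) for j in range(len(x))]
print(terms, rebuilt)
```

The number of terms equals the number of distinct positive values of x, so a point with few distinct fractional values needs few closures.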
First, we present a result that partially generalizes Lemma 3.

Theorem 2. Let P, A, $\bar{A}$, D, q, x̂ and $N^{\hat{x}}$ be as in Lemma 2, and assume
additionally that A is totally unimodular and that b is integral. Define

$$I^{\hat{x}} = \{ y \in \mathbb{R}^n : y_i = 0,\ \forall i \text{ s.t. } \hat{x}_i \text{ is integer} \}. \tag{20}$$

Then there exists an integral vector $x^i \in \mathbb{R}^n$, and vectors $\theta^h \in \mathbb{R}^n$, 1 ≤ h ≤ q,
such that:
(a) $A x^i \le b$,
(b) $\bar{A} x^i = \bar{b}$,
(c) $x^i_j = \hat{x}_j$, ∀j s.t. $\hat{x}_j$ is integer,
(d) $\hat{x} = x^i + \sum_{r=1}^{q} \alpha_r \theta^r$, for some $\alpha \in \mathbb{R}^q$,
(e) the set $\{\theta^1, \ldots, \theta^q\}$ spans $N^{\hat{x}} \cap I^{\hat{x}}$,
(f) $|\theta^h_j| \le \mathrm{rank}(\bar{A})$, for all 1 ≤ h ≤ q and 1 ≤ j ≤ n.

In the special case of the GPCP, we can choose $x^i$ satisfying the additional
condition:
(g) $x^i_j = 0$, for all j such that $\hat{x}_j$ is fractional.

Proof sketch: Let us refer to the integer coordinates of x̂ as $\hat{x}_I$ and to the
corresponding columns of A as $A_I$, and to the fractional coordinates of x̂ as $\hat{x}_F$,
and to the corresponding columns of A as $A_F$. Let h be the number of columns
in $A_F$. Note that $b - A_I \hat{x}_I$ is integer, and so by total unimodularity there exists
an integer $y \in \mathbb{R}^h$ satisfying $A_F y \le b - A_I \hat{x}_I$, $\bar{A}_F y = \bar{b} - \bar{A}_I \hat{x}_I$. Defining now
$x^i = (\hat{x}_I, y)$, then $x^i$ is integer; it is equal to x̂ everywhere that x̂ is integer, and
it satisfies $A x^i \le b$ and $\bar{A} x^i = \bar{b}$. Clearly $\hat{x} - x^i$ belongs to $I^{\hat{x}}$, and moreover
$\bar{A}(\hat{x} - x^i) = 0$ so that it belongs to $N^{\hat{x}}$ as well, and so it can be decomposed as
$\hat{x} - x^i = \sum_{r=1}^{q} \alpha_r \theta^r$. For the special case of GPCP we have already described
a decomposition for which $x^i$ equals x̂ everywhere that x̂ is integer and is zero
elsewhere. See the full paper for other details.

Comment: Note that rank(Ā) can be high and thus condition (f) is not quite
as strong as Lemma 3; nevertheless q is small in any case and so we obtain
a decomposition of x̂ into “few” terms when the number of side-constraints is
“small”. Theorem 2 can be strengthened for specific families of totally unimodu-
lar matrices. For example, when A is the node-arc incidence matrix of a digraph,
the θ vectors are incidence vectors of cycles, which yields the following corollary.

Corollary 1. Let P be the feasible set for a minimum cost network flow problem
with integer data and side constraints. Let x̂ be an extreme point of P , and let
q be the number of linearly independent side constraints that are binding at x̂.
Let ζ = {j : x̂j integral}. Then x̂ can be decomposed into the sum of an integer
vector v satisfying all network flow (but not necessarily side) constraints, and
with vj = x̂j ∀j ∈ ζ, and a sum of no more than q fractional cycle flows, over a
set of cycles disjoint from ζ.

4 A General Algorithmic Template


Now we return to the generic algorithm for GPCP that attempts to guess the
right representation of an optimal solution as a weighted sum of incidence vectors
of “few” closures. To motivate our approach, we first consider a more general
situation. We are given a linear program:

$(P_1)$ : $\max \ c^T x$
          s.t. $Ax \le b$
               $Dx \le d$. (21)

Denote by $L(P_1, \mu)$ the Lagrangian relaxation in which constraints (21) are dualized
with penalties $\mu$, i.e. the problem $\max\{c^T x + \mu^T (d - Dx) : Ax \le b\}$.
One can approach problem P1 by means of Lagrangian relaxation, i.e. an algo-
rithm that iterates by solving multiple problems L(P1 , μ) for different choices of
μ; the multipliers μ are updated according to some procedure. A starting point
for our work concerns the fact that traditional Lagrangian relaxation schemes
(such as subgradient optimization) can prove frustratingly slow to achieve con-
vergence, often requiring seemingly instance-dependent choices of algorithmic
parameters. They also do not typically yield optimal feasible primal solutions;
in fact, they frequently fail to deliver sufficiently accurate solutions (primal or
dual). However, as observed in [B02] (and also see [BA00]) Lagrangian relaxation
schemes can discover useful “structure.”
For example, Lagrangian relaxation can provide early information on which


constraints from among those that were dualized are likely to be tight, and on
which variables x are likely to be nonzero, even if the actual numerical values
for primal or dual variables computed by the relaxation are inaccurate. The
question then is how to use such structure in order to accelerate convergence
and to obtain higher accuracy. In [B02] the following approach was used:

– Periodically, interrupt the Lagrangian relaxation scheme to solve a restricted
linear program consisting of P1 with some additional constraints used to impose
the desired structure. Then use the duals for constraints (21) obtained
in the solution to the restricted LP to restart the Lagrangian procedure.

The restricted linear program includes all constraints of P1, and thus could
(potentially) still be very hard; the idea is that the structure we have imposed
renders the LP much easier. Moreover, because it includes all constraints, the
solution we obtain is fully feasible for P1 and therefore provides a lower bound.
If our guess as to “structure” is correct, we also obtain a high-quality dual feasible
vector, and using this vector to restart the Lagrangian scheme should
result in accelerated convergence (as well as providing an upper bound on P1). In
[B02] these observations were experimentally verified in the context of several
problem classes.
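
To make the Lagrangian relaxation $L(P_1, \mu)$ concrete, the following Python sketch evaluates it on a toy instance. It assumes (our simplification, not the paper's setting) that the constraints $Ax \le b$ are only the box constraints $0 \le x \le 1$, so that the relaxed problem separates by variable; the function name and the data are hypothetical. For any $\mu \ge 0$ the returned value is an upper bound on the optimal value of the toy LP.

```python
def lagrangian_value(c, D, d, mu):
    """Evaluate max{c^T x + mu^T (d - D x) : 0 <= x <= 1} for mu >= 0.

    Assumption (ours, for illustration): the constraints Ax <= b are only
    the box constraints 0 <= x <= 1, so the problem separates by variable.
    Returns the optimal value and one optimal 0/1 vector.
    """
    rows, n = len(D), len(c)
    # Reduced profit of x_j after pricing out the dualized rows Dx <= d.
    reduced = [c[j] - sum(mu[i] * D[i][j] for i in range(rows)) for j in range(n)]
    # A variable goes to its upper bound exactly when its reduced profit is positive.
    x = [1 if r > 0 else 0 for r in reduced]
    value = sum(mu[i] * d[i] for i in range(rows)) + sum(r for r in reduced if r > 0)
    return value, x


# Toy instance: max 3*x1 + 2*x2  s.t.  x1 + x2 <= 1,  0 <= x <= 1 (optimal value 3).
c, D, d = [3, 2], [[1, 1]], [1]
for mu in ([0.0], [1.0], [2.0]):
    print(mu, lagrangian_value(c, D, d, mu))
```

On this instance the bound decreases from 5 at $\mu = 0$ to 3 at $\mu = 2$, where it coincides with the optimal value of the toy LP.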

1. Set $\mu^0 = 0$ and set $k = 1$.

2. Solve $L(P_1, \mu^{k-1})$. Let $w^k$ be an optimal solution.
   If $k > 1$ and $H^{k-1} w^k = h^{k-1}$, STOP.

3. Let $H^k x = h^k$ be a system of equations satisfied by $w^k$.

4. Define the restricted problem:

   $(P_2^k)$ : $\max \ c^T x$
               s.t. $Ax \le b$, $Dx \le d$, $H^k x = h^k$.

5. Solve $P_2^k$ to obtain $x^k$, an optimal primal vector (with value $z^k$) and
   $\mu^k$, an optimal dual vector corresponding to constraints $Dx \le d$.
   If $\mu^k = \mu^{k-1}$, STOP.

6. Set $k = k + 1$ and go to Step 2.

Fig. 1. Algorithmic template for solving P1

In this work we extend and recast these ideas in a generalized framework as an


algorithm to systematically extract information from the Lagrangian and from
restricted LP’s symbiotically so as to solve the Lagrangian and the primal LP
simultaneously.
In the template in Figure 1, at each iteration k we employ a linear system
$H^k x = h^k$ that represents a structure satisfied by the current iteration’s Lagrangian
solution and which can be interpreted as an educated guess for conditions
that an optimal solution to P1 should satisfy. This is problem-specific; we
will indicate later how this structure is discovered in the context of GPCP.
Notes:
1. Ideally, imposing $H^k x = h^k$ in Step 4 should result in an easier linear program.
2. For simplicity, in what follows we will assume that $P_2^k$ is always feasible,
though this is a requirement that can be easily circumvented in practice (full
paper).

Theorem 3. (a) If the algorithm stops at iteration k in Step 2, then $x^{k-1}$ is
optimal for $P_1$. (b) If it stops in Step 5 then $x^k$ is optimal for $P_1$.

Proof: (a) We have

$z^{k-1} = \max\{c^T x + (\mu^{k-1})^T (d - Dx) : Ax \le b, \ H^{k-1} x = h^{k-1}\} = c^T w^k + (\mu^{k-1})^T (d - D w^k)$,

where the first equality follows by duality and the second by definition of $w^k$ in
Step 2 since $H^{k-1} w^k = h^{k-1}$. Also, clearly $z^{k-1} \le z^*$, and so in summary

$z^* \le c^T w^k + (\mu^{k-1})^T (d - D w^k) = z^{k-1} \le z^*$. (22)

(b) $\mu^k = \mu^{k-1}$ implies that $w^k$ optimally solves $L(P_1, \mu^k)$, so that we could
choose $w^{k+1} = w^k$ and so $H^k w^{k+1} = h^k$, obtaining case (a) again.

4.1 Applying the Template


We will now apply the above template to a case where P1 is an instance of GPCP
where, as before, we denote by N the set of jobs; and we use Ax ≤ b to describe
the precedences and the box constraints 0 ≤ xj ≤ 1 (∀j ∈ N ), and Dx ≤ d
denotes the side-constraints.
Thus, in Step 2 of the template, $L(P_1, \mu^{k-1})$ can be solved as a max closure
problem (Observation 1); therefore its solution can be described by a 0/1-vector
which we will denote by $y^k$. Recall that Lemma 3 implies that where D has m
rows, an optimal extreme point solution to GPCP has $q \le m + 2$ distinct values
$0 \le \alpha_1 < \alpha_2 < \cdots < \alpha_q \le 1$ and can therefore be written $x^* = \sum_{r=1}^{q} \alpha_r \theta^r$,
where for $1 \le r \le q$, $V^r = \{j \in N : x^*_j = \alpha_r\}$, and $\theta^r$ is the incidence vector
of $V^r$.
The structure that we “guess” to have been exposed by the current iterate’s
Lagrangian solution is that the nodes inside the max closure should be
distinguished from those nodes outside, i.e. that the nodes inside should not be
required to take the same value in the LP solution as those outside. Given an
existing partition of the nodeset N that represented our previous guess as to the
sets {V r }, this guess at structure implies a refinement of this partition. We will
note later that this partition never needs more than a small number of elements
for the algorithm to converge.
At iteration k, we denote by $C^k = \{C^k_1, \ldots, C^k_{r_k}\}$ the constructed partition of
N. Our basic application of the template is as follows:
GPCP Algorithm
1. Set $\mu^0 = 0$. Set $r_0 = 1$, $C^0_1 = N$, $C^0 = \{C^0_1\}$, $z^0 = -\infty$, and $k = 1$.
2. Let $y^k$ be an optimal solution to $L(P_1, \mu^{k-1})$, and define
   $I^k = \{j \in N : y^k_j = 1\}$ (23)
   and define
   $O^k = \{j \in N : y^k_j = 0\}$. (24)
   If $k > 1$, and, for $1 \le h \le r_{k-1}$, either $C^{k-1}_h \cap I^k = \emptyset$ or $C^{k-1}_h \cap O^k = \emptyset$,
   then STOP.
3. Let $C^k = \{C^k_1, \ldots, C^k_{r_k}\}$ consist of all nonempty sets in the collection
   $\{I^k \cap C^{k-1}_h : 1 \le h \le r_{k-1}\} \cup \{O^k \cap C^{k-1}_h : 1 \le h \le r_{k-1}\}$.
   Let $H^k x = h^k$ consist of the constraints:
   $x_i = x_j$, for $1 \le h \le r_k$, and each pair $i, j \in C^k_h$.
4. Let $P_2^k$ consist of $P_1$, plus the additional constraints $H^k x = h^k$.
5. Solve $P_2^k$, with optimal solution $x^k$, and let $\mu^k$ denote the optimal duals
   corresponding to the side-constraints $Dx \le d$. If $\mu^k = \mu^{k-1}$, STOP.
6. Set $k = k + 1$ and go to Step 2.
We have:
Lemma 4. (a) For each k, problem $P_2^k$ is an instance of GPCP with $r_k$ variables
and the same number of side-constraints as in $Dx \le d$. (b) If $P_2^1$ is feasible, the
above algorithm terminates finitely with an optimal solution.
Proof: full paper.
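
The partition update in Steps 2 and 3 of the GPCP Algorithm can be sketched in a few lines of Python. This is only an illustration under our own conventions: the function name `refine_partition` and the sample data are hypothetical, the closure is passed as the set of jobs with $y^k_j = 1$, and the caller is assumed to apply the stopping test only when $k > 1$.

```python
def refine_partition(partition, closure):
    """One pass of Steps 2-3: split every class by membership in the max closure.

    partition: list of disjoint sets of jobs (the previous classes C^{k-1}_h).
    closure:   set of jobs j with y^k_j = 1 (the set I^k); all others form O^k.
    Returns (new_partition, stop), where stop is True when no class is split,
    which is the stopping test of Step 2.
    """
    new_partition = []
    stop = True
    for cls in partition:
        inside = cls & closure     # I^k intersected with C^{k-1}_h
        outside = cls - closure    # O^k intersected with C^{k-1}_h
        if inside and outside:
            stop = False           # this class is genuinely refined
        new_partition.extend(part for part in (inside, outside) if part)
    return new_partition, stop


# Hypothetical data: five jobs, starting from the trivial partition C^0 = {N}.
partition = [{1, 2, 3, 4, 5}]
for closure in ({1, 2}, {1, 2, 3}, {1, 2}):
    partition, stop = refine_partition(partition, closure)
    print(partition, stop)
```

In the last call the closure $\{1, 2\}$ does not split any class, so the stopping test fires and the partition is left unchanged.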

Comments: Since each problem $P_2^k$ is a GPCP, its extreme point solution $x^k$
never attains more than m + 2 distinct values (where m is the number of linearly
independent rows in D), and thus the partition $C^k$ can be coarsened while maintaining
the feasibility of $x^k$ by merging the sets $C^k_j$ with common $x^k$ values. Note
also that in choosing $C^{k+1}$ to be a refinement of $C^k$ the LP solution $x^k$ remains
available to the problem $P_2^{k+1}$. The above algorithm is a basic application of
the template. Finer partitions than $\{I^k, O^k\}$ may also be used. The feasibility
assumption in (b) of Lemma 4 can be bypassed. Details will be provided in the
full paper.
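
The coarsening step mentioned in the comments (merging classes that share a common LP value) might look as follows. This is a minimal sketch with hypothetical names; it compares values exactly, whereas a practical implementation would presumably need a numerical tolerance.

```python
def coarsen_partition(partition, x):
    """Merge classes of the partition on which the LP solution takes the same value.

    partition: list of disjoint sets of jobs.
    x:         dict job -> value, assumed constant on every class
               (as enforced by the constraints H^k x = h^k).
    """
    merged = {}
    for cls in partition:
        value = x[next(iter(cls))]              # the common value on this class
        merged.setdefault(value, set()).update(cls)
    return list(merged.values())


# Hypothetical LP values: classes {3} and {4, 5} share the value 0.5 and are merged.
x = {1: 1.0, 2: 1.0, 3: 0.5, 4: 0.5, 5: 0.5, 6: 0.0}
print(coarsen_partition([{1, 2}, {3}, {4, 5}, {6}], x))
```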
In the full paper an analysis is presented that explains why the structure
exposed by the Lagrangian solutions can be expected to point the algorithm in
the right direction. In particular, the solution to the Lagrangian obtained by
using optimal duals for the side constraints can be shown to exhibit significant
structure.
Table 1. Sample runs, 1

                                     Marvin    Mine1B     Mine2   Mine3,s   Mine3,b
Jobs                                   9400     29277     96821      2975    177843
Precedences                          145640   1271207   1053105      1748   2762864
Periods                                  14        14        25         8         8
Facilities                                2         2         2         8         8
Variables                            199626    571144   3782250     18970   3503095
Constraints                         2048388  17826203  26424496      9593  19935500
Problem arcs                        2229186  18338765  30013104     24789  23152350
Side-constraints                         28        28        50       120       132
Binding side-constr. at optimum          14        11        23        33        44
Cplex time (sec)                      55544         —         —         5         —

Algorithm Performance
Iterations to 10^-5 optimality            8         8         9        13        30
Time to 10^-5 optimality (sec)           10        60       344         1      1076
Iterations to comb. optimality           11        12        16        15        39
Time to comb. optimality (sec)           15        96       649         1      1583

5 Computational Experiments
In this section we present results from some of our experiments. A more complete
set of results will be presented in the full paper. All these tests were conducted
using a single core of a dual quad-core 3.2 GHz Xeon machine with 64 GB of
memory. The LP solver we used was Cplex, version 12 and the min cut solver
we used was our implementation of Hochbaum’s pseudoflow algorithm ([H08]).
The tests reported on in Tables 1 and 2 are based on three real-world examples
provided by BHP Billiton (the data was masked), to which we refer as 'Mine1',
'Mine2' and 'Mine3', and a synthetic but realistic model called 'Marvin' which is
included with Gemcom's Whittle [W] mine planning software. 'Mine1B' is a modification
of Mine1 with a denser precedence graph. Mine3 comes in two versions to
which we refer as 'big' and 'small'. Using Mine1, we also obtained smaller and
larger problems by modifying the data in a number of realistic ways. Some of
the row entries in these tables are self-explanatory; the others have the following
meaning:
– Problem arcs. The number of arcs in the graph that the algorithm creates to
represent the scheduling problem (i.e., the size of the min cut problems we solve).
– Iterations, time to $10^{-5}$ optimality. The number of iterations (resp.,
the CPU time) taken by the algorithm until it obtained a solution it could
certify as having a relative optimality error of at most $10^{-5}$.
– Iterations, time to combinatorial optimality. The number of iterations
(resp., the CPU time) taken by the algorithm to obtain a solution it could
certify as optimal as per the stopping criteria in Steps 2 or 5. Notice that this
implies that the solution is optimal as per the numerical tolerances of Cplex.
Finally, an entry of ”—” indicates that Cplex did not terminate within 100000
seconds of CPU time. More detailed analyses will appear in the full paper.

Table 2. Sample runs, 2

                                 Mine1 very       Mine1       Mine1       Mine1     Mine1,3
                                      small      medium       large        full      weekly
Jobs                                    755        7636       15003       29277       87831
Precedences                             222       22671      113703      985011     2955033
Periods                                  12          12          12          12         100
Facilities                                2           2           2           2           2
Variables                             14282      160944      292800      489552    12238800
Constraints                            8834      327628     1457684    11849433   295591331
Problem arcs                          22232      477632     1727565    12280407   307654269
Side-constraints                         24          24          24          24         200
Binding side-constr. at optimum          12          11          11          11         151
Cplex time (sec)                          1       12424           —           —           —

Algorithm Performance
Iterations to 10^-5 optimality            6           6           8           7          10
Time to 10^-5 optimality (sec)            0           1           7          45        2875
Iterations to comb. optimality            7           7          11           9          20
Time to comb. optimality (sec)            0           2          10          61        6633

References
[Bal70] Balinsky, M.L.: On a selection problem. Management Science 17, 230–
231 (1970)
[BA00] Barahona, F., Anbil, R.: The Volume Algorithm: producing primal solu-
tions with a subgradient method. Math. Programming 87, 385–399 (2000)
[B02] Bienstock, D.: Potential Function Methods for Approximately Solving
Linear Programming Problems, Theory and Practice. Kluwer Academic
Publishers, Boston (2002), ISBN 1-4020-7173-6
[BZ09] Bienstock, D., Zuckerberg, M.: A new LP algorithm for precedence con-
strained production scheduling, posted on Optimization Online (August
2009)
[BDFG09] Boland, N., Dumitrescu, I., Froyland, G., Gleixner, A.M.: LP-based
disaggregation approaches to solving the open pit mining production
scheduling problem with block processing selectivity. Computers and
Operations Research 36, 1064–1089 (2009)
[CH03] Caccetta, L., Hill, S.P.: An application of branch and cut to open pit
mine scheduling. Journal of Global Optimization 27, 349–365 (2003)
[CH09] Chandran, B., Hochbaum, D.: A Computational Study of the Pseud-
oflow and Push-Relabel Algorithms for the Maximum Flow Problem.
Operations Research 57, 358–376 (2009)
[CEGMR09] Chicoisne, R., Espinoza, D., Goycoolea, M., Morena, E., Rubio, E.: A
New Algorithm for the Open-Pit Mine Scheduling Problem (submitted
for publication), http://mgoycool.uai.cl/
[F06] Fricke, C.: Applications of integer programming in open pit mine plan-
ning, PhD thesis, Department of Mathematics and Statistics, The Uni-
versity of Melbourne (2006)
[H08] Hochbaum, D.: The pseudoflow algorithm: a new algorithm for the max-
imum flow problem. Operations Research 58, 992–1009 (2008)
[HC00] Hochbaum, D., Chen, A.: Improved planning for the open - pit mining
problem. Operations Research 48, 894–914 (2000)
[J68] Johnson, T.B.: Optimum open pit mine production scheduling, PhD the-
sis, Operations Research Department, University of California, Berkeley
(1968)
[LG65] Lerchs, H., Grossman, I.F.: Optimum design of open-pit mines. Trans-
actions C.I.M. 68, 17–24 (1965)
[P76] Picard, J.C.: Maximal Closure of a graph and applications to combina-
torial problems. Management Science 22, 1268–1272 (1976)
[R70] Rhys, J.M.W.: A selection problem of shared fixed costs and network
flows. Management Science 17, 200–207 (1970)
[W] Gemcom Software International, Vancouver, BC, Canada
Computing Minimum Multiway Cuts in
Hypergraphs from Hypertree Packings

Takuro Fukunaga

Department of Applied Mathematics and Physics,


Graduate School of Informatics, Kyoto University, Japan
[email protected]

Abstract. The hypergraph k-cut problem is the problem of finding a minimum
capacity set of hyperedges whose removal divides a given hypergraph
into k connected components. We present an algorithm for this
problem which runs in strongly polynomial time if both k and the rank
of the hypergraph are constants. Our algorithm extends the algorithm
due to Thorup (2008) for computing minimum k-cuts of graphs from
greedy packings of spanning trees.

1 Introduction

Let $Q_+$ denote the set of non-negative rationals. For a connected hypergraph
H = (V, E) with a non-negative hyperedge capacity $c : E \to Q_+$ and an integer
k ≥ 2, a k-cut of H is defined as a subset of E whose removal divides H into
k connected components. The hypergraph k-cut problem is the problem of finding a
minimum capacity k-cut of a hypergraph. If the given hypergraph is a graph,
then the problem is called the graph k-cut problem.
The graph k-cut problem is one of the fundamental problems in combinatorial
optimization. It is closely related to the reliability of networks, and has many
applications, for example, to the traveling salesperson problem, VLSI design,
and evolutionary tree construction [4,14]. Goldschmidt and Hochbaum [6]
showed that the problem is NP-hard when k is not fixed, and polynomial-time
solvable when k is fixed to a constant. Since their work, there have been many
studies of the algorithmic aspects of this problem.
In spite of these active studies on the graph k-cut problem, there are few works
on the hypergraph k-cut problem. If k is not fixed, the NP-hardness of the graph k-
cut problem implies that of the hypergraph k-cut problem. When k = 2, the k-cut
problem is usually called the minimum cut problem. Klimmek and Wagner [9] and
Mak and Wong [13] extended an algorithm proposed by Stoer and Wagner [16] for
the minimum cut problem in graphs to hypergraphs. Lawler [10] showed that the
(s, t)-cut problem in hypergraphs can be reduced to computing maximum flows
in digraphs. For the case of k = 3, Xiao [19] gave a polynomial-time algorithm.

This work was partially supported by Grant-in-Aid for Scientific Research from the
Ministry of Education, Culture, Sports, Science and Technology of Japan.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 15–28, 2010.
© Springer-Verlag Berlin Heidelberg 2010
However, it is not known whether the hypergraph k-cut problem is polynomial-time
solvable or NP-hard when k is a constant larger than 3.
In this paper, we partially answer this question. We present an algorithm
which runs in strongly polynomial-time if k and the rank γ of hyperedges (i.e.,
γ = maxe∈E |e|) are fixed to constants. Since graphs can be regarded as hyper-
graphs with γ = 2, this result extends the polynomial-solvability of the graph
k-cut problem.
Our algorithm is based on an idea due to Thorup [18], which was successfully
applied to the graph k-cut problem. He showed that a maximum spanning
tree packing of a graph contains a spanning tree sharing at most a constant
number of edges with a minimum k-cut of the graph. Although this fact by itself
gives a strongly polynomial-time algorithm for computing minimum k-cuts
of graphs, he also showed that a set of spanning trees constructed in a greedy
way has the same property. Based on this fact, he gave the fastest algorithm
for the graph k-cut problem. In this paper, we show that these facts can be
extended to hypergraphs with a hypertree packing theorem due to Frank, Király
and Kriesell [2] (see Section 3).
Let us mention previous work on problems related to the hypergraph
k-cut problem. As mentioned above, the first polynomial-time algorithm for the
graph k-cut problem with fixed k was presented by Goldschmidt and
Hochbaum [6]. Its running time is $O(n^{k^2} T(n, m))$ where $T(n, m)$ is the time for
computing a max-flow in a graph consisting of n vertices and m edges. $T(n, m)$ is
known to be $O(mn \log(n^2/m))$ for now [5]. After their work, many polynomial-time
algorithms for fixed k were obtained. An algorithm due to Kamidoi, Yoshida
and Nagamochi [7] runs in $O(n^{4k/(1-1.71/\sqrt{k})-34} T(n, m))$. An algorithm due
to Xiao [20] runs in $O(n^{4k - \log k})$. An algorithm due to Thorup [17] runs in
$\tilde{O}(n^{2k})$. In addition, Karger and Stein [8] gave a randomized algorithm running
in $O(n^{2(k-1)} \log^3 n)$.
For the hypergraph k-cut problem, Xiao [19] gave a polynomial-time
divide-and-conquer algorithm for k = 3. Zhao, Nagamochi and Ibaraki [21] gave an
approximation algorithm. It achieves the approximation factor $(1 - \frac{2}{k}) \min\{k, \gamma\}$
for k ≥ 4 by using Xiao’s algorithm for k = 3 as a subroutine. Moreover,
it is shown by Okumoto, Fukunaga and Nagamochi [15] that the problem can be
reduced to the terminal k-vertex cut problem in bipartite graphs (refer to [15]
for the definition of the terminal k-vertex cut problem). Hence the LP-rounding
algorithm due to Garg, Vazirani and Yannakakis [3] for the terminal k-vertex
cut problem achieves approximation factor $2 - \frac{2}{k}$ also for the hypergraph k-cut
problem. Recently Chekuri and Korula [1] claimed that the randomized algorithm
proposed by Karger and Stein [8] for the graph k-cut problem can be extended
to the hypergraph k-cut problem.
Okumoto, Fukunaga and Nagamochi [15] showed that the hypergraph k-cut
problem is a special case of the submodular system k-partition problem. Zhao,
Nagamochi and Ibaraki [21] presented a (k − 1)-approximation algorithm for this
problem. Okumoto, Fukunaga and Nagamochi [15] presented an approximation
algorithm whose approximation factor is 1.5 for k = 4 and $k + 1 - 2\sqrt{k-1}$ for
k ≥ 5. They also showed that, for the hypergraph 4-cut problem, their algorithm
achieves approximation factor 4/3.
The rest of this paper is organized as follows. Section 2 introduces basic facts
and notation. Section 3 outlines our result and presents our algorithm.
Section 4 shows that a maximum hypertree packing contains a hypertree
sharing at most a constant number of hyperedges with a minimum k-cut.
Section 5 discusses a property of a set of hypertrees constructed greedily.
Section 6 concludes this paper and mentions future work.

2 Preliminaries

Let H = (V, E) be a hypergraph with a capacity $c : E \to Q_+$. Throughout
this paper, we denote |V| by n, |E| by m, and $\max_{e \in E} |e|$ by γ. We sometimes
denote the vertex set of H by $V_H$, and the edge set of H by $E_H$, respectively.
For non-empty X ⊂ V and F ⊆ E, $\delta_F(X)$ denotes the set of hyperedges in F
intersecting both X and V \ X. When F = E, we may represent $\delta_F(X)$ by δ(X).
For a function $f : E \to Q_+$ and F ⊆ E, $\sum_{e \in F} f(e)$ is represented by f(F).
For non-empty X ⊂ V, E[X] denotes the set of hyperedges in E contained in
X, and H[X] denotes the sub-hypergraph (X, E[X]) of H.
It is sometimes convenient to associate a hypergraph H = (V, E) with a
bipartite graph $B_H = (V, V_E, E')$ as follows. Each vertex in $V_E$ corresponds to
an edge in E. Vertices x ∈ V and $y \in V_E$ are joined by an edge in $E'$ if and only if x is
contained in the hyperedge corresponding to y in H.
H is called $\ell$-connected if $c(\delta(X)) \ge \ell$ for all non-empty X ⊂ V. H is
called connected if |δ(X)| ≥ 1 for all non-empty X ⊂ V. Notice that
1-connectedness is not equivalent to connectedness.
A partition $\mathcal{V} = \{V_1, V_2, \ldots, V_k\}$ of V into k non-empty subsets is called a
k-partition of V. We let $\delta_F(\mathcal{V}) = \cup_{i=1}^{k} \delta_F(V_i)$. H is called $\ell$-partition-connected
if $c(\delta(\mathcal{V})) \ge \ell(|\mathcal{V}| - 1)$ for all partitions $\mathcal{V}$ of V into non-empty subsets. It is
easy to see that $\ell$-partition-connectivity is a stronger condition than
$\ell$-connectivity. We call a partition $\mathcal{V}$ achieving $\min\{c(\delta(\mathcal{V}))/(|\mathcal{V}| - 1)\}$ weakest.
$\min\{c(\delta(\mathcal{V}))/(|\mathcal{V}| - 1)\}$ is denoted by $\ell_H$. If $|\delta(\mathcal{V})| \ge |\mathcal{V}| - 1$ for all $\mathcal{V}$, H is called
partition-connected. Notice that 1-partition-connectedness is not equivalent
to partition-connectedness.
A minimal k-cut of H is represented by $\delta(\mathcal{V})$ where $\mathcal{V}$ is the k-partition of
V consisting of the connected components after removing the k-cut. Hence the
hypergraph k-cut problem is equivalent to the problem of finding a k-partition
$\mathcal{V}$ of H minimizing $c(\delta(\mathcal{V}))$. We call such a partition a minimum k-partition.
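
As a small illustration of the quantity $c(\delta(\mathcal{V}))$ that the k-cut problem minimizes, the following Python sketch sums the capacities of all hyperedges meeting at least two classes of a partition. The function name and the example hypergraph are hypothetical and chosen only for illustration.

```python
def cut_capacity(hyperedges, capacity, partition):
    """Return c(delta(V)) for a partition V: the total capacity of the
    hyperedges that intersect at least two classes of the partition."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    total = 0
    for e, cap in zip(hyperedges, capacity):
        if len({block_of[v] for v in e}) >= 2:    # e is cut by the partition
            total += cap
    return total


# Hypothetical hypergraph on the vertices 1..5 with four hyperedges.
hyperedges = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({4, 5}), frozenset({1, 5})]
capacity = [3, 1, 2, 2]
partition = [{1, 2}, {3, 4}, {5}]
print(cut_capacity(hyperedges, capacity, partition))   # capacity of this 3-partition
```

Here the hyperedge {3, 4} lies inside one class and is not counted, while the other three hyperedges are cut.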
A hyperforest in H = (V, E) is defined as F ⊆ E such that |F[X]| ≤ |X| − 1 for
every non-empty X ⊆ V. A hyperforest F is called a hypertree if |F| = |V| − 1 and
$\cup_{e \in F} e = V$. Notice that if H = (V, E) is a graph, F ⊆ E is a hypertree if and
only if F is a spanning tree. A hypertree is thus an extension of a spanning
tree that inherits many important properties of spanning trees. However, there
are also differences between them. For example, in contrast to spanning trees, a
connected hypergraph may contain no hypertree.
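
The definitions of hyperforest and hypertree can be checked directly on tiny examples. The brute-force Python sketch below (hypothetical function names, exponential in |V|, intended only for illustration) enumerates all non-empty X ⊆ V.

```python
from itertools import combinations


def is_hyperforest(vertices, F):
    """Brute-force check of |F[X]| <= |X| - 1 for every non-empty X within V.
    Exponential in |V|; meant only for tiny examples."""
    vertices = list(vertices)
    for size in range(1, len(vertices) + 1):
        for X in combinations(vertices, size):
            X = set(X)
            inside = sum(1 for e in F if set(e) <= X)    # |F[X]|
            if inside > len(X) - 1:
                return False
    return True


def is_hypertree(vertices, F):
    """A hyperforest with |F| = |V| - 1 whose hyperedges cover V."""
    vertices = set(vertices)
    covered = set().union(*map(set, F)) if F else set()
    return (len(F) == len(vertices) - 1 and covered == vertices
            and is_hyperforest(vertices, F))


V = {1, 2, 3}
print(is_hypertree(V, [(1, 2), (2, 3)]))          # a spanning tree of a graph
print(is_hypertree(V, [(1, 2, 3)]))               # one hyperedge: connected, but too few edges
print(is_hypertree(V, [(1, 2, 3), (1, 2, 3)]))    # two parallel copies of the hyperedge
```

The second call illustrates the remark above: the hypergraph with the single hyperedge {1, 2, 3} is connected but contains no hypertree, whereas two parallel copies of that hyperedge do form one.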
A hypertree packing of H is a pair of a set $\mathcal{T}$ of hypertrees in H and a
non-negative weight $\alpha : \mathcal{T} \to Q_+$ such that $\alpha(\mathcal{T}_e) \le c(e)$ holds for all e ∈ E, where
$\mathcal{T}_e$ denotes the set of hypertrees in $\mathcal{T}$ containing e. A hypertree packing is called
maximum if $\alpha(\mathcal{T})$ is maximum. We define the packing value of a hypergraph H
as the maximum of $\alpha(\mathcal{T})$ over all hypertree packings of H. If α is integer, then
the hypertree packing $(\mathcal{T}, \alpha)$ is called integer.
Frank, Király and Kriesell [2] characterized hypergraphs containing hypertrees
as follows.

Theorem 1 (Frank, Király, Kriesell [2]). Let H be a hypergraph with integer
hyperedge capacity. H has an integer hypertree packing $(\mathcal{T}, \alpha)$ such that $\alpha(\mathcal{T}) \ge \ell$
if and only if H is $\ell$-partition-connected.

Let $\mathcal{F}$ be the family of hyperforests. In the proof of Theorem 1, it is mentioned
that $(E, \mathcal{F})$ is a matroid, which was originally proven by Lorea [11]. Matroids
defined from hypergraphs in such a way are called hypergraphic matroids.
Hypertrees are bases of the hypergraphic matroid.
Independence testing in hypergraphic matroids needs to judge whether a given
F ⊆ E satisfies $|F[X]| \le |X| - 1$ for every $\emptyset \ne X \subseteq V$. By Hall’s theorem, this
condition holds if and only if the bipartite graph defined from (V, F) in the same
way as $B_H$ contains a matching covering all vertices that correspond to hyperedges
of F after removing any vertex v ∈ V.
Thus the independence testing can be done in $O(n\theta(n, m - 1, \gamma m))$ time where
$\theta(n, m - 1, \gamma m)$ denotes the time for computing maximum matchings in bipartite
graph $B_H = (V, V_E, E') - v$ with |V| = n, $|V_E| = m$ and $|E'| \le \gamma m$.
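
A matching-based independence test can be sketched as follows. The sketch uses the condition in the form we assume here: F is a hyperforest if and only if, for every vertex v ∈ V, each hyperedge of F can still be matched to a distinct vertex it contains once v is deleted. The simple augmenting-path matching routine and all names are our own choices for illustration, not the maximum matching algorithm behind the bound θ.

```python
def has_covering_matching(F, removed):
    """Augmenting-path bipartite matching: can every hyperedge of F be matched
    to a distinct vertex it contains, never using the vertex `removed`?"""
    match = {}                                   # vertex -> index of its hyperedge

    def augment(i, seen):
        for v in F[i]:
            if v == removed or v in seen:
                continue
            seen.add(v)
            # v is free, or the hyperedge currently holding v can move elsewhere.
            if v not in match or augment(match[v], seen):
                match[v] = i
                return True
        return False

    return all(augment(i, set()) for i in range(len(F)))


def is_independent(vertices, F):
    """F is a hyperforest iff a covering matching survives deleting any one vertex."""
    return all(has_covering_matching(F, v) for v in vertices)


print(is_independent([1, 2, 3], [(1, 2), (2, 3)]))           # path: independent
print(is_independent([1, 2, 3], [(1, 2), (2, 3), (1, 3)]))   # triangle: dependent
```

The test runs one matching computation per vertex, which mirrors the factor n in the running time stated above.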

3 Outline of Our Result

The first step of our result is to prove the following theorem originally proven for
graphs by Thorup [18]. A recursively maximum hypertree packing is a maximum
hypertree packing that satisfies some condition, which will be defined formally
in Section 4.

Theorem 2. A recursively maximum hypertree packing of H contains a hyper-


tree that shares at most γk − 3 hyperedges with a minimum k-cut of H.

We prove Theorem 2 in Section 4.


Assume that the hypertree and the h = γk − 3 hyperedges in Theorem 2 are
specified. Since each of the other n − 1 − h hyperedges in the hypertree intersects
only one element of the minimum k-partition, shrinking them into single vertices
preserves the minimum k-partition. If γ = 2, these n − 1 − h hyperedges form at
most h + 1 connected components. Hence the hypergraph obtained by shrinking
them contains at most h + 1 vertices, for which the minimum k-partition can
be found by enumerating all k-partitions. If γ ≥ 3, it may seem that the number of
connected components cannot be bounded in general, because one large deleted
hyperedge may connect many components. However, a characterization of hypertrees
due to Lovász [12] tells us that such a case does not occur even if γ ≥ 3.
Theorem 3 (Lovász [12]). Consider an operation that replaces each hyper-


edge by an edge joining two vertices chosen from the hyperedge. It is possible to
construct a spanning tree from a hypergraph by this operation if and only if the
hypergraph is a hypertree.

Corollary 1. After removing h hyperedges from a hypertree, there exist at most


h + 1 connected components.

Proof. Consider the spanning tree constructed from a hypertree as shown by
Theorem 3. After removing h edges from the spanning tree, the remaining edges
form h + 1 connected components. The vertices in the same connected component
are also connected by the hyperedges corresponding to the remaining
edges. Hence removing h hyperedges from a hypertree results in at most h + 1
connected components.


Another issue is the existence of hypertrees. As mentioned in Section 2,
there exist connected hypergraphs which contain no hypertrees. For such
hypergraphs, hypertree packings give no information on minimum k-cuts. We avoid
this situation by replacing each hyperedge e ∈ E by |e| copies of itself, each with capacity
c(e)/|e|. Obviously this replacement has no effect on the capacities of k-partitions,
while the obtained hypergraphs contain hypertrees. Notice that after the
replacement, the number of hyperedges increases to at most γm.
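
The replacement of each hyperedge by |e| copies of capacity c(e)/|e| is easy to check on an example. The Python sketch below (hypothetical names and data, exact arithmetic via `fractions.Fraction`) performs the replacement and compares the capacity of a few partitions before and after.

```python
from fractions import Fraction


def split_into_copies(hyperedges, capacity):
    """Replace each hyperedge e by |e| parallel copies of capacity c(e)/|e|."""
    new_edges, new_caps = [], []
    for e, cap in zip(hyperedges, capacity):
        for _ in range(len(e)):
            new_edges.append(e)
            new_caps.append(Fraction(cap) / len(e))
    return new_edges, new_caps


def cut_capacity(hyperedges, capacity, partition):
    """Total capacity of hyperedges meeting at least two classes of the partition."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    return sum(cap for e, cap in zip(hyperedges, capacity)
               if len({block_of[v] for v in e}) >= 2)


# Hypothetical example: one hyperedge of size 3 and one ordinary edge.
E = [frozenset({1, 2, 3}), frozenset({3, 4})]
c = [3, 1]
E2, c2 = split_into_copies(E, c)
for partition in ([{1, 2}, {3, 4}], [{1, 2, 3}, {4}], [{1}, {2}, {3}, {4}]):
    print(cut_capacity(E, c, partition), cut_capacity(E2, c2, partition))
```

Every copy of a hyperedge is cut by exactly the same partitions as the original, so the two printed capacities agree on each line.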

Theorem 4. Let $H' = (V, E')$ be the hypergraph obtained from a connected
hypergraph H = (V, E) by replacing each e ∈ E by |e| copies of e. Then $H'$
contains a hypertree.

Proof. Let $\mathcal{V}$ be a partition of V. Since each $e' \in \delta_{E'}(\mathcal{V})$ intersects at most $|e'|$
components of $\mathcal{V}$, $|\delta_{E'}(\mathcal{V})| \ge \sum_{U \in \mathcal{V}} \sum_{e' \in \delta_{E'}(U)} (1/|e'|)$. Since each e ∈ E has |e|
copies in $E'$, $\sum_{e' \in \delta_{E'}(U)} (1/|e'|) = \sum_{e \in \delta_E(U)} 1 = |\delta_E(U)|$. Moreover, $|\delta_E(U)| \ge 1$
because H is connected. Thus $|\delta_{E'}(\mathcal{V})| \ge \sum_{U \in \mathcal{V}} 1 = |\mathcal{V}|$, which implies that $H'$
is partition-connected. Hence by Theorem 1, $H'$ contains a hypertree.


Now we describe our algorithm for the hypergraph k-cut problem.

Algorithm 1: Hypergraph k-Cut Algorithm

Input: A connected hypergraph H = (V, E) with capacity $c : E \to Q_+$ and an
integer k ≥ 2
Output: A minimum k-cut of H
Step 1: For each e ∈ E, prepare |e| copies $e_1, e_2, \ldots, e_{|e|}$ of e with capacity
$c(e_i) = c(e)/|e|$, $i \in \{1, 2, \ldots, |e|\}$, and replace e by them.
Step 2: Define F = E. Compute a recursively maximum hypertree packing $(\mathcal{T}^*, \alpha^*)$ of
H.
Step 3: For each $T \in \mathcal{T}^*$ and each set $T'$ of h = γk − 3 hyperedges in T,
execute the following operations.
3-1: Compute a hypergraph $H'$ obtained by shrinking all hyperedges in
$T \setminus T'$.
3-2: Compute a minimum k-cut $F'$ of $H'$.
3-3: Let $F := F'$ if $c(F') \le c(F)$.
Step 4: Output F.

Let us discuss the running time of this algorithm. For each hypertree, there
are $O(n^h)$ ways to choose h hyperedges. By Corollary 1, shrinking n − 1 − h
hyperedges in a hypertree results in a hypergraph with at most h + 1 vertices. Hence
Step 3-2 can be done in $O(k^{h+1})$ time. It means that Step 3 of the algorithm
runs in $O(k^{h+1} n^h)$ time per hypertree in $\mathcal{T}^*$.
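
Step 3-2 is applied to a hypergraph with at most h + 1 vertices, so plain enumeration of k-partitions suffices. The following Python sketch shows one way such an enumeration could look; the function names and the small test hypergraph are hypothetical, and no attempt is made to match the stated $O(k^{h+1})$ bound exactly.

```python
def k_partitions(items, k):
    """Yield every partition of `items` into exactly k non-empty blocks."""
    items = list(items)

    def extend(i, blocks):
        if i == len(items):
            if len(blocks) == k:
                yield [set(b) for b in blocks]
            return
        if len(blocks) + (len(items) - i) < k:
            return                               # k blocks can no longer be reached
        for b in blocks:                         # put items[i] into an existing block
            b.append(items[i])
            yield from extend(i + 1, blocks)
            b.pop()
        if len(blocks) < k:                      # or open a new block with it
            blocks.append([items[i]])
            yield from extend(i + 1, blocks)
            blocks.pop()

    yield from extend(0, [])


def min_k_cut_bruteforce(vertices, hyperedges, capacity, k):
    """Return (capacity, partition) of a minimum k-partition by enumeration."""
    best = None
    for partition in k_partitions(vertices, k):
        block_of = {v: i for i, b in enumerate(partition) for v in b}
        value = sum(cap for e, cap in zip(hyperedges, capacity)
                    if len({block_of[v] for v in e}) >= 2)
        if best is None or value < best[0]:
            best = (value, partition)
    return best


# Hypothetical shrunken hypergraph on four vertices.
edges, caps = [(1, 2, 3), (3, 4), (1, 4)], [3, 1, 2]
print(min_k_cut_bruteforce([1, 2, 3, 4], edges, caps, 2))
print(min_k_cut_bruteforce([1, 2, 3, 4], edges, caps, 3))
```

Each partition is generated exactly once because every vertex either joins an existing block or opens the next new one.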
To bound the running time of all the steps, we must consider how to compute
a recursively maximum hypertree packing and how large its size is. A recursively
maximum hypertree packing can be computed in polynomial time. However we
know no algorithm to compute small recursively maximum hypertree packings.
Hence this paper follows the approach taken by Thorup [18] for the graph k-cut
problem. We show that a set of hypertrees constructed as below approximates a
recursively maximum hypertree packing well. It enables us to avoid computing
a recursively maximum hypertree packing.

Algorithm 2: Greedy Algorithm for Computing a Set of Hypertrees

Input: A connected hypergraph H = (V, E) with capacity $c : E \to Q_+$ and an
integer t.
Output: A set of t hypertrees of H.
Step 1: Let $\mathcal{T} := \emptyset$.
Step 2: Compute a minimum cost hypertree T of H with respect to the cost
defined as $|\mathcal{T}_e|/c(e)$ for each e ∈ E, and let $\mathcal{T} := \mathcal{T} \cup \{T\}$.
Step 3: If $|\mathcal{T}| = t$, then output $\mathcal{T}$. Otherwise, return to Step 2.

As mentioned in Section 2, hypertrees are bases of a hypergraphic matroid.
Hence a minimum cost hypertree can be computed by a greedy algorithm. The
running time of Algorithm 2 is $O(t\gamma m \log(\gamma m)\, n\, \theta(n, \gamma m - 1, \gamma^2 m))$. The set of
hypertrees computed by Algorithm 2 approximates the recursively maximum
hypertree packing well.
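
Algorithm 2 can be sketched in Python for small instances. The sketch below is an illustration only: it uses a simple matching-based independence test (in the form we assume, deleting each vertex in turn) instead of the maximum matching routine behind θ, it breaks cost ties by hyperedge index, and all function names are hypothetical.

```python
from fractions import Fraction


def is_hyperforest(vertices, F):
    """Matching-based test: after deleting any one vertex, every hyperedge of F
    can still be matched to a distinct vertex it contains."""
    def covers_all(removed):
        match = {}                               # vertex -> hyperedge index

        def augment(i, seen):
            for v in F[i]:
                if v != removed and v not in seen:
                    seen.add(v)
                    if v not in match or augment(match[v], seen):
                        match[v] = i
                        return True
            return False

        return all(augment(i, set()) for i in range(len(F)))

    return all(covers_all(v) for v in vertices)


def greedy_hypertrees(vertices, hyperedges, capacity, t):
    """Return t hypertrees (as lists of hyperedge indices), each of minimum cost
    with respect to the current costs |T_e| / c(e)."""
    load = [0] * len(hyperedges)                 # load[i] = number of chosen trees using i
    trees = []
    for _ in range(t):
        order = sorted(range(len(hyperedges)),
                       key=lambda i: Fraction(load[i]) / Fraction(capacity[i]))
        tree = []
        for i in order:                          # matroid greedy: keep i if still independent
            if is_hyperforest(vertices, [hyperedges[j] for j in tree + [i]]):
                tree.append(i)
        if len(tree) != len(vertices) - 1:
            raise ValueError("the hypergraph contains no hypertree")
        for i in tree:
            load[i] += 1
        trees.append(tree)
    return trees


# Hypothetical example: a triangle with unit capacities, three greedy rounds.
print(greedy_hypertrees([1, 2, 3], [(1, 2), (2, 3), (1, 3)], [1, 1, 1], 3))
```

On the triangle the three rounds use every edge exactly twice, which is the kind of balanced usage the greedy costs are meant to encourage.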

Theorem 5. Let H = (V, E) be a hypergraph such that each e ∈ E has at least
|e| − 1 copies in E \ {e} of the same capacity. For this H, Algorithm 2 with
$t = 24\gamma^4 m k^3 \ln(2\gamma^2 kmn)$ outputs a set of hypertrees which contains a hypertree
sharing at most h = γk − 2 hyperedges with a minimum k-cut of H.

Replace the computation of recursively maximum hypertree packings in Step 2
of Algorithm 1 by Algorithm 2 with $t = 24\gamma^4 m k^3 \ln(2\gamma^2 kmn)$. Moreover, change
the definition of h in Step 3 to γk − 2. Then we obtain another algorithm for
the hypergraph k-cut problem as summarized in the next corollary.

Corollary 2. The hypergraph k-cut problem is solvable in time
$O(k^{\gamma k+2} n^{\gamma k-1} \gamma^5 m^2 \theta(n, \gamma m - 1, \gamma^2 m) \log(k\gamma^2 mn) \log(\gamma m))$.


Computing Minimum Multiway Cuts in Hypergraphs 21

4 Proof of Theorem 2
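Let $\mathcal{V}$ be an arbitrary partition of $V$, and $(\mathcal{T}, \alpha)$ be an arbitrary hypertree
packing of $H$. Since every hypertree $T \in \mathcal{T}$ satisfies $|T| = |V| - 1$ and has at
most $|U| - 1$ hyperedges contained in $U$ for each $U \in \mathcal{V}$, we have

\[ |\delta_T(\mathcal{V})| = |T| - \sum_{U \in \mathcal{V}} |T[U]| \ge |V| - 1 - \sum_{U \in \mathcal{V}} (|U| - 1) = |\mathcal{V}| - 1. \tag{1} \]

Moreover,
\[ c(e) \ge \alpha(\mathcal{T}_e) \quad \text{for each } e \in \delta(\mathcal{V}) \tag{2} \]
by the definition of hypertree packings. Thus it follows that

\[ \frac{c(\delta(\mathcal{V}))}{|\mathcal{V}| - 1} \ge \frac{\sum_{e \in \delta(\mathcal{V})} \alpha(\mathcal{T}_e)}{|\mathcal{V}| - 1} = \frac{\sum_{T \in \mathcal{T}} \alpha(T)\,|\delta_T(\mathcal{V})|}{|\mathcal{V}| - 1} \ge \alpha(\mathcal{T}). \tag{3} \]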

Let $\mathcal{V}^*$ be a weakest partition of $V$ (i.e., it attains $\min_{\mathcal{V}} c(\delta(\mathcal{V}))/(|\mathcal{V}| - 1)$), and
$(\mathcal{T}^*, \alpha^*)$ be a maximum hypertree packing of $H$ (i.e., it attains $\max_{(\mathcal{T},\alpha)} \alpha(\mathcal{T})$).
From Theorem 1, we can derive their important properties.

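Lemma 1. $\mathcal{V}^*$ and $(\mathcal{T}^*, \alpha^*)$ satisfy $c(\delta(\mathcal{V}^*))/(|\mathcal{V}^*| - 1) = \alpha^*(\mathcal{T}^*)$. Moreover,
$|\delta_T(\mathcal{V}^*)| = |\mathcal{V}^*| - 1$ holds for each $T \in \mathcal{T}^*$, and $\alpha^*(\mathcal{T}^*_e) = c(e)$ holds for each
$e \in \delta(\mathcal{V}^*)$. $T[U]$ defined from any $T \in \mathcal{T}^*$ and $U \in \mathcal{V}^*$ is a hypertree on $H[U]$.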

Proof. Let $M$ be a positive integer such that all of $Mc(e)$, $e \in E$, and $M\alpha^*(T)$,
$T \in \mathcal{T}^*$, are integers. Notice that $(\mathcal{T}^*, M\alpha^*)$ is a maximum hypertree packing of
the hypergraph $H$ associated with hyperedge capacity $Mc$. Applying Theorem 1
to this hypergraph shows that $Mc(\delta(\mathcal{V}^*))/(|\mathcal{V}^*| - 1) = \sum_{T \in \mathcal{T}^*} M\alpha^*(T)$ holds.
That is to say, $\mathcal{V}^*$ and $(\mathcal{T}^*, \alpha^*)$ satisfy $c(\delta(\mathcal{V}^*))/(|\mathcal{V}^*| - 1) = \alpha^*(\mathcal{T}^*)$.
Since $\mathcal{V}^*$ and $(\mathcal{T}^*, \alpha^*)$ satisfy (3) with equality, they also satisfy (1) and
(2), used for deriving (3), with equality. This proves the remaining part of the
lemma.


Let $U \in \mathcal{V}^*$, $\mathcal{T}' = \{T[U] \mid T \in \mathcal{T}^*\}$, and $\alpha'$ be the weight defined on the
hypertrees in $\mathcal{T}'$ such that $\alpha'(T[U]) = \alpha^*(T)$ for $T \in \mathcal{T}^*$. Since $\mathcal{T}'$ consists of
hypertrees of $H[U]$ by Lemma 1, $(\mathcal{T}', \alpha')$ is a hypertree packing of $H[U]$. However,
it may not be a maximum hypertree packing of $H[U]$ since the partition-connectivity
of $H[U]$ may be larger than $\alpha'(\mathcal{T}')$. Let $(\mathcal{S}, \beta)$ be a maximum
hypertree packing of $H[U]$. For each $T \in \mathcal{T}^*$ and $S \in \mathcal{S}$, replacing hyperedges
in $T$ contained in $U$ with those in $S$ generates another hypertree of $H$ because
hypertrees are bases of hypergraphic matroids. Hence, from $(\mathcal{T}, \alpha)$ and $(\mathcal{S}, \beta)$,
we can construct another maximum hypertree packing $(\mathcal{U}, \zeta)$ of $H$ such that
$|\mathcal{U}| \le |\mathcal{T}| + |\mathcal{S}|$ and $(\mathcal{U}', \zeta')$ is a maximum hypertree packing of $H[U]$ where
$\mathcal{U}' = \{T[U] \mid T \in \mathcal{U}\}$ and $\zeta'(T[U]) = \zeta(T)\beta(\mathcal{S})/\zeta(\mathcal{U})$ for each $T[U] \in \mathcal{U}'$.
A maximum hypertree packing obtained by repeating this operation is called
recursively maximum. That is to say, a recursively maximum hypertree packing
is defined as a hypertree packing computed by the following algorithm.
22 T. Fukunaga

Algorithm 3: Computing a Recursively Maximum Hypertree Packing

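Input: A connected hypergraph $H = (V, E)$ with capacity $c : E \to \mathbb{Q}_+$.
Output: A recursively maximum hypertree packing of $H$.
Step 1: Compute a maximum hypertree packing $(\mathcal{T}^*, \alpha^*)$ of $H$, and a weakest
partition $\mathcal{V}^*$ of $H$.
Step 2: While there exists $U \in \mathcal{V}^*$ such that $|U| > 1$, execute the following
operations.
2-1: Compute a maximum hypertree packing $(\mathcal{S}, \beta)$ of $H[U]$ and a weakest
partition $\mathcal{V}$ of $H[U]$. Define $\mathcal{T} := \emptyset$, and $\beta'(S) := \beta(S)\alpha^*(\mathcal{T}^*)/\beta(\mathcal{S})$
for each $S \in \mathcal{S}$.
2-2: Choose $T \in \mathcal{T}^* \setminus \mathcal{T}$ and $S \in \mathcal{S}$. If $\alpha^*(T) < \beta'(S)$, then replace the
hyperedges in $T[U]$ by those in $S$, and set $\beta'(S) := \beta'(S) - \alpha^*(T)$ and
$\mathcal{T} := \mathcal{T} \cup \{T\}$. Otherwise (i.e., if $\alpha^*(T) \ge \beta'(S)$), construct a hypertree
$T' = (T \setminus T[U]) \cup S$ with $\alpha^*(T') := \beta'(S)$, and update $\alpha^*(T) := \alpha^*(T) - \beta'(S)$,
$\mathcal{T} := \mathcal{T} \cup \{T'\}$, and $\mathcal{S} := \mathcal{S} \setminus \{S\}$. If $\alpha^*(T) = \beta'(S)$ in the latter case,
remove $T$ from $\mathcal{T}^*$ in addition.
2-3: If $\mathcal{T}^* \setminus \mathcal{T} \ne \emptyset$, then return to Step 2-2.
2-4: $\mathcal{V}^* := (\mathcal{V}^* \setminus \{U\}) \cup \mathcal{V}$.
Step 3: Output $(\mathcal{T}^*, \alpha^*)$.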

From now on, we let $(\mathcal{T}^*, \alpha^*)$ stand for a recursively maximum hypertree packing.
For $U \in \mathcal{V}^*$, let $\mathcal{T}' = \{T[U] \mid T \in \mathcal{T}^*\}$ and $\alpha'(T[U]) = \alpha^*(T)\lambda_{H[U]}/\alpha^*(\mathcal{T}^*)$,
where $\lambda_{H[U]}$ is the partition-connectivity of $H[U]$. The definition of $(\mathcal{T}^*, \alpha^*)$ implies
that $(\mathcal{T}', \alpha')$ is a recursively maximum hypertree packing of $H[U]$ for any
$U \in \mathcal{V}^*$.
From $\mathcal{T}^*$ and given $k$, define $\mathcal{V}_k$ as the k-partition of $V$ constructed by the
following algorithm.

Algorithm 4: Computing Vk
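Input: A connected hypergraph $H = (V, E)$ with capacity $c : E \to \mathbb{Q}_+$, and an
integer $k \ge 2$.
Output: A k-partition of $V$.
Step 1: Define $\mathcal{V}_k := \{V\}$.
Step 2: Let $U \in \mathcal{V}_k$ be a set attaining $\min\{\lambda_{H[U]} \mid U \in \mathcal{V}_k, |U| \ge 2\}$, where $\lambda_{H[U]}$
denotes the partition-connectivity of $H[U]$. Compute a weakest partition
$\mathcal{U} = \{U_1, U_2, \ldots, U_{|\mathcal{U}|}\}$ of $H[U]$, where we assume that
$\sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(U_i) \cap T[U]| \le \sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(U_j) \cap T[U]|$ for $1 \le i < j \le |\mathcal{U}|$.
Step 3: If $|\mathcal{V}_k| - 1 + |\mathcal{U}| < k$, then $\mathcal{V}_k := (\mathcal{V}_k \setminus \{U\}) \cup \mathcal{U}$ and return to Step 2.
Step 4: If $|\mathcal{V}_k| - 1 + |\mathcal{U}| \ge k$, then
$\mathcal{V}_k := (\mathcal{V}_k \setminus \{U\}) \cup \{U_1, U_2, \ldots, U_{k-|\mathcal{V}_k|}, U \setminus \bigcup_{i=1}^{k-|\mathcal{V}_k|} U_i\}$, and output $\mathcal{V}_k$.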

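Lemma 2.
\[ \sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) < (\gamma k - 2)\,\alpha^*(\mathcal{T}^*). \]

Proof. In this proof, $\mathcal{V}'_k$ stands for $\mathcal{V}_k$ immediately before executing Step 4 of
Algorithm 4, and $\mathcal{V}_k$ stands for the partition output by Algorithm 4.
By the definition of recursively maximum hypertree packings, $T[U]$ is a hypertree
of $H[U]$ for every pair of $U \in \mathcal{V}'_k$ and $T \in \mathcal{T}^*$. Thus
\[ |\delta(\mathcal{V}'_k) \cap T| = |T| - \sum_{U \in \mathcal{V}'_k} |T[U]| = |V| - 1 - \sum_{U \in \mathcal{V}'_k} (|U| - 1) = |\mathcal{V}'_k| - 1 \]
holds for each $T \in \mathcal{T}^*$, and hence
\[ \sum_{e \in \delta(\mathcal{V}'_k)} \alpha^*(\mathcal{T}^*_e) = \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}'_k) \cap T| = (|\mathcal{V}'_k| - 1)\,\alpha^*(\mathcal{T}^*) \]
holds.
Let $U$ be the element of $\mathcal{V}'_k$ and $\mathcal{U} = \{U_1, U_2, \ldots, U_{|\mathcal{U}|}\}$ be the weakest partition
of $U$ computed in Step 2 immediately before executing Step 4. Note that
$|\mathcal{U}| > k - |\mathcal{V}'_k|$ holds by the condition of Step 4. For the same reason as above,
$|\delta(\mathcal{U}) \cap T[U]| = |\mathcal{U}| - 1$ holds for each $T \in \mathcal{T}^*$. Hence
$\sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(\mathcal{U}) \cap T[U]| = (|\mathcal{U}| - 1)\alpha^*(\mathcal{T}^*)$.
Let $\mathcal{V}_U = \{U_1, U_2, \ldots, U_{k-|\mathcal{V}'_k|}, U \setminus \bigcup_{j=1}^{k-|\mathcal{V}'_k|} U_j\}$. Then

\[ \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}_U) \cap T[U]| \le \sum_{j=1}^{k-|\mathcal{V}'_k|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(U_j) \cap T[U]|. \]

The elements in $\mathcal{U}$ are ordered so that they satisfy
$\sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(U_i) \cap T[U]| \le \sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(U_j) \cap T[U]|$ for $1 \le i < j \le |\mathcal{U}|$. Hence it holds that

\[ \sum_{j=1}^{k-|\mathcal{V}'_k|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(U_j) \cap T[U]| \le \frac{k - |\mathcal{V}'_k|}{|\mathcal{U}|} \sum_{j=1}^{|\mathcal{U}|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(U_j) \cap T[U]|. \]

Since each hyperedge intersects at most $\gamma$ elements of $\mathcal{U}$, it holds that

\[ \frac{k - |\mathcal{V}'_k|}{|\mathcal{U}|} \sum_{j=1}^{|\mathcal{U}|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(U_j) \cap T[U]| \le \gamma\,\frac{k - |\mathcal{V}'_k|}{|\mathcal{U}|} \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{U}) \cap T[U]| \]
\[ = \frac{k - |\mathcal{V}'_k|}{|\mathcal{U}|}\,\gamma(|\mathcal{U}| - 1)\,\alpha^*(\mathcal{T}^*) < (k - |\mathcal{V}'_k|)\,\gamma\,\alpha^*(\mathcal{T}^*). \]

Combining these implies that $\sum_{T \in \mathcal{T}^*} \alpha^*(T)|\delta(\mathcal{V}_U) \cap T[U]| < (k - |\mathcal{V}'_k|)\gamma\alpha^*(\mathcal{T}^*)$.
Notice that $\delta(\mathcal{V}_k) \cap T = (\delta(\mathcal{V}'_k) \cap T) \cup (\delta(\mathcal{V}_U) \cap T[U])$. Recall that $|\mathcal{V}'_k| \ge 1$
and $\gamma \ge 2$. Therefore it follows that

\[ \sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) = \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}_k) \cap T| < \{(|\mathcal{V}'_k| - 1) + (k - |\mathcal{V}'_k|)\gamma\}\,\alpha^*(\mathcal{T}^*) \le (\gamma k - 2)\,\alpha^*(\mathcal{T}^*). \]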




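Lemma 3. For each $e \in \delta(\mathcal{V}_k)$ and $f \in E \setminus \delta(\mathcal{V}_k)$, $\alpha^*(\mathcal{T}^*_e)/c(e) \ge \alpha^*(\mathcal{T}^*_f)/c(f)$
holds.

Proof. Let $U \in \mathcal{V}_k$ denote the set containing $f$ (i.e., $f \in E[U]$). Let $\mathcal{V}'_k$ denote
$\mathcal{V}_k$ immediately before $e$ enters $\delta(\mathcal{V}_k)$ in Algorithm 4. Assume that $e$ is contained
in $U' \in \mathcal{V}'_k$ (i.e., $e \in E[U']$). Moreover, let $\lambda_U$ (resp., $\lambda_{U'}$) denote the packing
value of $H[U]$ (resp., $H[U']$).
From $(\mathcal{T}^*, \alpha^*)$, define $\mathcal{T}' = \{T[U'] \mid T \in \mathcal{T}^*\}$. Moreover, define
$\alpha'(T[U']) = \alpha^*(T)\lambda_{U'}/\alpha^*(\mathcal{T}^*)$ for $T \in \mathcal{T}^*$. By the definition of recursively maximum
hypertree packings, $(\mathcal{T}', \alpha')$ is a maximum hypertree packing of $H[U']$. By Lemma 1,
the capacity constraint of edge $e$ is tight for any maximum hypertree packing
of $H[U']$, i.e., $c(e) = \alpha'(\mathcal{T}'_e)$. Since $\alpha'(\mathcal{T}'_e) = \alpha^*(\mathcal{T}^*_e)\lambda_{U'}/\alpha^*(\mathcal{T}^*)$, it holds that
$\lambda_{U'} = c(e)\alpha^*(\mathcal{T}^*)/\alpha^*(\mathcal{T}^*_e)$.
On the other hand, a maximum hypertree packing of $H[U]$ satisfies the capacity
constraint for edge $f$. Hence, similarly to the above, $\lambda_U \le c(f)\alpha^*(\mathcal{T}^*)/\alpha^*(\mathcal{T}^*_f)$.
$\mathcal{V}'_k$ contains $U''$ such that $U \subseteq U''$. In other words, $U = U''$, or $U$ is obtained by
dividing $U''$ in Algorithm 4. As explained when recursively maximum hypertree
packings are defined, $\lambda_{U''} \le \lambda_U$ holds. Since Step 2 chose $U'$ immediately before
$e$ enters $\delta(\mathcal{V}_k)$, $\lambda_{U'} \le \lambda_{U''}$ holds. These facts show that
\[ \frac{c(e)\,\alpha^*(\mathcal{T}^*)}{\alpha^*(\mathcal{T}^*_e)} = \lambda_{U'} \le \lambda_{U''} \le \lambda_U \le \frac{c(f)\,\alpha^*(\mathcal{T}^*)}{\alpha^*(\mathcal{T}^*_f)}, \]
implying the required inequality.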



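Let $\mathcal{V}^{opt}$ denote a minimum k-partition of $H$.
Lemma 4.
\[ \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T| \le \sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e). \]

Proof. Let $\eta = \min_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e)/c(e)$. By Lemma 3, each hyperedge
$e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)$ satisfies $\alpha^*(\mathcal{T}^*_e)/c(e) \le \eta$. Hence it holds that
\[ \sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) = \sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} c(e)\,\frac{\alpha^*(\mathcal{T}^*_e)}{c(e)} \le \eta\, c(\delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)). \]

The definition of $\mathcal{V}^{opt}$ implies that $c(\delta(\mathcal{V}^{opt})) \le c(\delta(\mathcal{V}_k))$, and hence
$c(\delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)) \le c(\delta(\mathcal{V}_k) \setminus \delta(\mathcal{V}^{opt}))$. Thus
\[ \sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T| = \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \sum_{e \in \delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) \]
\[ \le \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \eta\, c(\delta(\mathcal{V}^{opt}) \setminus \delta(\mathcal{V}_k)) \]
\[ \le \sum_{e \in \delta(\mathcal{V}^{opt}) \cap \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e) + \eta\, c(\delta(\mathcal{V}_k) \setminus \delta(\mathcal{V}^{opt})) \]
\[ \le \sum_{e \in \delta(\mathcal{V}_k)} \alpha^*(\mathcal{T}^*_e). \]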




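From Lemmas 2 and 4, we can observe that

\[ \frac{\sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T|}{\alpha^*(\mathcal{T}^*)} < \gamma k - 2. \]

This means that $|\delta(\mathcal{V}^{opt}) \cap T| < \gamma k - 2$ holds for some $T \in \mathcal{T}^*$. Therefore
Theorem 2 has been proven.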
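The final step is a weighted-averaging argument. A short expansion, assuming (as in the proof of Lemma 1) that $\alpha^*(\mathcal{T}^*) = \sum_{T \in \mathcal{T}^*} \alpha^*(T)$ and that all weights $\alpha^*(T)$ are positive, may be written as follows.

```latex
\[
\min_{T \in \mathcal{T}^*} |\delta(\mathcal{V}^{opt}) \cap T|
\;\le\;
\frac{\sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T|}
     {\sum_{T \in \mathcal{T}^*} \alpha^*(T)}
\;=\;
\frac{\sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T|}
     {\alpha^*(\mathcal{T}^*)}
\;<\; \gamma k - 2 .
\]
```

The first inequality holds because a weighted average with positive weights is never smaller than the minimum term. Since the left-hand side is an integer, the hypertree attaining the minimum shares at most $\gamma k - 3$ hyperedges with $\delta(\mathcal{V}^{opt})$.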

5 Proof of Theorem 5
In this section, we present a proof of Theorem 5. Although it is almost the same
as that for $\gamma = 2$ presented by Thorup [18], we sketch it for self-containment.
Throughout this section, we let $H = (V_H, E_H)$ be a hypergraph such that
each $e \in E_H$ has at least $|e| - 1$ copies in $E_H \setminus \{e\}$ of the same capacity. We
denote $|E_H|$ by $\gamma m$, and the capacity of hyperedges in $H$ by $c_H$ in order to avoid
confusion. Moreover, we assume that a recursively maximum hypertree packing
$(\mathcal{T}^*, \alpha^*)$ of $H$ satisfies $\alpha^*(\mathcal{T}^*_e) = \alpha^*(\mathcal{T}^*_{e'})$ for $e \in E_H$ and a copy $e' \in E_H$ of $e$.
For a set $\mathcal{T}$ of hypertrees of $H$ and $e \in E_H$, define $u^{\mathcal{T}}_H(e) = |\mathcal{T}_e|/(c_H(e)|\mathcal{T}|)$.
For each $e \in E_H$, we also define $u^*_H(e)$ as $\alpha^*(\mathcal{T}^*_e)/(c_H(e)\alpha^*(\mathcal{T}^*))$ from a recursively
maximum hypertree packing $(\mathcal{T}^*, \alpha^*)$ of $H$. Since $c_H(e) \ge \alpha^*(\mathcal{T}^*_e)$ for all
$e \in E_H$, $1/u^*_H(e)$ is at least the packing value of $H$, i.e., $1/u^*_H(e) \ge \alpha^*(\mathcal{T}^*)$.
Moreover, since $c_H(e) = \alpha^*(\mathcal{T}^*_e)$ holds for some $e \in E_H$ by the maximality of
$(\mathcal{T}^*, \alpha^*)$, $\min_{e \in E_H} 1/u^*_H(e) = \alpha^*(\mathcal{T}^*)$ holds.
Recall that Algorithm 3 updates $\mathcal{V}^*$ by partitioning non-singleton sets in
$\mathcal{V}^*$ repeatedly until no such sets exist. For $e \in E_H$, define $U_e$ as the last
set in $\mathcal{V}^*$ such that $e \in E_H[U_e]$ during the execution of the algorithm. Then
$\max_{e' \in E_H[U_e]} u^*_{H[U_e]}(e') = u^*_{H[U_e]}(e)$. The definition of recursively maximum
hypertree packings implies that $u^*_{H[U_e]}(e') = u^*_H(e')$ for each $e' \in E_H[U_e]$ because
$\alpha^*(\mathcal{T}^*_{e'})/\alpha^*(\mathcal{T}^*) = \beta(\mathcal{S}_{e'})/\beta(\mathcal{S})$ holds with a maximum hypertree packing
$(\mathcal{S}, \beta)$ of $H[U_e]$. Therefore, the partition-connectivity of $H[U_e]$ is $1/u^*_H(e)$.
Lemma 5. Let $I$ be a subgraph of $H$ and assume that each hyperedge $e$ in $I$
has capacity $c_I(e)$ such that $c_{\min} \le c_I(e) \le c_H(e)$. Let $C = \sum_{e \in E_I} c_I(e)$, and
$u_I = \max_{e \in E_I} u^*_I(e)$. Moreover, let $\epsilon$ be an arbitrary real such that $0 < \epsilon < 1/2$,
and $\mathcal{T}^g$ be a set of hypertrees in $H$ constructed by Algorithm 2 with
$t \ge 3\ln(C/c_{\min})/(c_{\min} u_I \epsilon^2)$. Then
\[ u^{\mathcal{T}^g}_H(e) < (1 + \epsilon)\,u_I \tag{4} \]
holds for each $e \in E_I$.
Proof. Scaling hyperedge capacities has no effect on the claim. Hence we assume
without loss of generality that $c_{\min} = 1$.
Let $\mathcal{T}$ denote the set of hypertrees kept by Algorithm 2 at some moment during
its execution for computing $\mathcal{T}^g$. The key is the following quantity:
\[ \sum_{e \in E_I} c_I(e)\,\frac{(1+\epsilon)^{|\mathcal{T}_e|/c_H(e)}\,(1+\epsilon u_I)^{t-|\mathcal{T}|}}{(1+\epsilon)^{(1+\epsilon)u_I t}}. \tag{5} \]

This quantity has the following properties:


(i) When T = ∅, (5) is less than 1;
(ii) If (5) is less than 1 when |T | = t, then (4) holds for all e ∈ EI ;
(iii) When a tree is added to T in Step 2 of Algorithm 2, then (5) is not
increased.
Clearly these three facts imply (4) for all e ∈ EI . We do not prove these proper-
ties here due to the space limitation. Refer to Thorup [18] or full version of this
paper for their proofs. We would like to note that an important fact for having
(iii) is that hypertrees are bases of the hypergraphic matroid. 
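To make quantity (5) concrete, the sketch below evaluates it for given per-hyperedge loads. It assumes $c_{\min} = 1$ as in the proof; the function name `potential` and the list-based input format are hypothetical, and the numbers in the example are made up purely for illustration.

```python
def potential(load, c_H, c_I, u_I, eps, t, num_trees):
    """Evaluate quantity (5).

    load[e] plays the role of |T_e|, num_trees the role of |T|;
    all lists are indexed by the hyperedges of I.
    """
    shift = (1.0 + eps) * u_I * t  # exponent of the common denominator
    total = sum(ci * (1.0 + eps) ** (le / ch - shift)
                for le, ch, ci in zip(load, c_H, c_I))
    return total * (1.0 + eps * u_I) ** (t - num_trees)


# Two unit-capacity hyperedges, u_I = 0.5, eps = 0.5, t = 40
# (t exceeds 3 ln(C / c_min) / (c_min u_I eps^2), which is about 16.6 here).
c_H, c_I, u_I, eps, t = [1.0, 1.0], [1.0, 1.0], 0.5, 0.5, 40

start = potential([0, 0], c_H, c_I, u_I, eps, t, num_trees=0)
balanced = potential([20, 20], c_H, c_I, u_I, eps, t, num_trees=t)
skewed = potential([35, 5], c_H, c_I, u_I, eps, t, num_trees=t)
```

In this toy instance the potential starts below 1, as in property (i). With balanced loads it stays below 1 and every relative load 20/40 is below $(1+\epsilon)u_I = 0.75$, as in property (ii). With skewed loads the potential exceeds 1, and indeed the load 35/40 violates (4).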

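By applying Lemma 5 to some subgraph of $H$, we obtain the next lemma. We
skip the proof due to the space limitation.
Lemma 6. Let $0 < \epsilon \le 1/2$, and $\mathcal{T}^g$ be a set of hypertrees of $H$ constructed by
Algorithm 2 with $|\mathcal{T}^g| = t \ge 3\gamma m \ln(\gamma mn/\epsilon)/\epsilon^3$. Then
\[ |\mathcal{T}^g_e| \le \frac{1+\epsilon}{1-\epsilon} \cdot \frac{\alpha^*(\mathcal{T}^*_e)}{\alpha^*(\mathcal{T}^*)} \cdot |\mathcal{T}^g| + 1 \]
holds for each $e \in E_H$.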
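Lemma 6 proves Theorem 5 as follows. Let $\mathcal{V}^{opt}$ stand for a minimum k-partition
of $H$. Lemma 6 shows that
\[ \sum_{T \in \mathcal{T}^g} |\delta(\mathcal{V}^{opt}) \cap T| = \sum_{e \in \delta(\mathcal{V}^{opt})} |\mathcal{T}^g_e| \le \sum_{e \in \delta(\mathcal{V}^{opt})} \Bigl(t \cdot \frac{1+\epsilon}{1-\epsilon} \cdot \frac{\alpha^*(\mathcal{T}^*_e)}{\alpha^*(\mathcal{T}^*)} + 1\Bigr) \le t \cdot \frac{1+\epsilon}{1-\epsilon} \cdot \frac{\sum_{e \in \delta(\mathcal{V}^{opt})} \alpha^*(\mathcal{T}^*_e)}{\alpha^*(\mathcal{T}^*)} + \gamma m. \]

At the end of Section 4, we observed that
\[ \frac{\sum_{e \in \delta(\mathcal{V}^{opt})} \alpha^*(\mathcal{T}^*_e)}{\alpha^*(\mathcal{T}^*)} = \frac{\sum_{T \in \mathcal{T}^*} \alpha^*(T)\,|\delta(\mathcal{V}^{opt}) \cap T|}{\alpha^*(\mathcal{T}^*)} < \gamma k - 2. \]
These mean that
\[ \frac{\sum_{T \in \mathcal{T}^g} |\delta(\mathcal{V}^{opt}) \cap T|}{t} < \frac{1+\epsilon}{1-\epsilon}(\gamma k - 2) + \frac{\gamma m}{t}. \]
Recall that $t = 3\gamma m \ln(\gamma mn/\epsilon)/\epsilon^3$. Assume that $n, m \ge 2$. Then $t \ge 6\gamma m/\epsilon^3$,
and hence the right-hand side of the above inequality is at most
\[ \frac{1+\epsilon}{1-\epsilon}(\gamma k - 2) + \frac{\epsilon^3}{6} = \gamma k - 2 + \frac{(2k\gamma - 4)\,\epsilon}{1-\epsilon} + \frac{\epsilon^3}{6}. \]
Setting $\epsilon$ to $1/(2\gamma k)$, the right-hand side is at most $\gamma k - 1$, which means that
\[ \frac{\sum_{T \in \mathcal{T}^g} |\delta(\mathcal{V}^{opt}) \cap T|}{t} < \gamma k - 1. \]
This implies that $\mathcal{T}^g$ contains a hypertree $T$ such that $|\delta(\mathcal{V}^{opt}) \cap T| < \gamma k - 1$.
Moreover, with this choice of $\epsilon$, $t = 3\gamma m \ln(\gamma mn/\epsilon)/\epsilon^3 = 24\gamma^4 m k^3 \ln(2\gamma^2 kmn)$.
Therefore the proof has been completed.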

6 Concluding Remarks

Our algorithm proposed in this paper is not polynomial if $\gamma$ is not fixed. A
reason for this fact is that the bound obtained in Theorem 2 depends on $\gamma$. If
we could remove $\gamma$ from the bound, we would have a polynomial algorithm even if $\gamma$
is not fixed. However, there exists a hypergraph in which every hypertree in a
recursively maximum hypertree packing shares $\gamma + k - 3$ hyperedges with any
minimum k-cut.
Define a set $V$ of vertices as $\{v_1, v_2, \ldots, v_n\}$. We identify $i$ with $i + n$ for
each $i \in \{1, 2, \ldots, n\}$ for convenience. We also define a set $E$ of hyperedges
as $\{e_1, e_2, \ldots, e_{n-1}\}$ where each hyperedge $e_i$ is defined as $\{v_i, v_{i+1}, \ldots, v_{i+\gamma-1}\}$.
Let $H = (V, E)$ be the hypergraph with uniform hyperedge capacity. Figure 1
illustrates $H$. The intervals represented by gray lines in the figure denote the
hyperedges of $H$.
Observe that $H$ is a hypertree. Hence a recursively maximum hypertree packing
of $H$ consists of the single hypertree $H$. On the other hand, any minimum k-partition
of $H$ is represented by $\{\{v_i\}, \{v_{i+1}\}, \{v_{i+2}\}, \ldots, \{v_{i+k-2}\}, V - \bigcup_{j=i}^{i+k-2} \{v_j\}\}$ with
some $i \in \{1, 2, \ldots, n\}$ because each hyperedge in $H$ contains vertices of consecutive
indices. Since fewer hyperedges contain $v_j$, $j \in \{n, 1, \ldots, \gamma - 1\}$,
$i \le n < \gamma - 1 \le i + k - 2$ holds. Hence any minimum k-cut of $H$ contains $\gamma + k - 3$
hyperedges (a minimum k-partition is represented by the dotted lines in Figure 1).
Therefore, any hypertree in a recursively maximum hypertree packing of $H$ and
any minimum k-cut share $\gamma + k - 3$ hyperedges.

Fig. 1. A hypergraph $H$ in which every hypertree in a recursively maximum hypertree
packing shares $\gamma + k - 3$ hyperedges with any minimum k-cut. Dotted lines represent
a minimum k-partition $\{\{v_n\}, \{v_1\}, \ldots, \{v_{k-2}\}, \{v_{k-1}, v_k, \ldots, v_{n-1}\}\}$.

References
1. Chekuri, C., Korula, N.: Personal Communication (2010)
2. Frank, A., Király, T., Kriesell, M.: On decomposing a hypergraph into k connected
sub-hypergraphs. Discrete Applied Mathematics 131(2), 373–383 (2003)
3. Garg, N., Vazirani, V.V., Yannakakis, M.: Multiway cuts in node weighted graphs.
Journal of Algorithms 50, 49–61 (2004)
4. Gasieniec, L., Jansson, J., Lingas, A., Östlin, A.: On the complexity of constructing
evolutionary trees. Journal of Combinatorial Optimization 3, 183–197 (1999)
5. Goldberg, A.V., Tarjan, R.E.: A new approach to the maximum flow problem.
Journal of the ACM 35, 921–940 (1988)
6. Goldschmidt, O., Hochbaum, D.: A polynomial algorithm for the k-cut problem
for fixed k. Mathematics of Operations Research 19, 24–37 (1994)
7. Kamidoi, Y., Yoshida, N., Nagamochi, H.: A deterministic algorithm for finding
all minimum k-way cuts. SIAM Journal on Computing 36, 1329–1341 (2006)
8. Karger, D.R., Stein, C.: A new approach to the minimum cut problem. Journal of
the ACM 43, 601–640 (1996)
9. Klimmek, R., Wagner, F.: A simple hypergraph min cut algorithm. Internal Report
B 96-02, Bericht FU Berlin Fachbereich Mathematik und Informatik (1995)
10. Lawler, E.L.: Cutsets and partitions of hypergraphs. Networks 3, 275–285 (1973)
11. Lorea, M.: Hypergraphes et matroides. Cahiers Centre Etudes Rech. Oper. 17,
289–291 (1975)
12. Lovász, L.: A generalization of König’s theorem. Acta. Math. Acad. Sci. Hungar. 21,
443–446 (1970)
13. Mak, W.-K., Wong, D.F.: A fast hypergraph min-cut algorithm for circuit partitioning.
Integ. VLSI J. 30, 1–11 (2000)
14. Nagamochi, H.: Algorithms for the minimum partitioning problems in graphs.
IEICE Transactions on Information and Systems J86-D-1, 53–68 (2003)
15. Okumoto, K., Fukunaga, T., Nagamochi, H.: Divide-and-conquer algorithms for
partitioning hypergraphs and submodular systems. In: Dong, Y., Du, D.-Z., Ibarra,
O. (eds.) ISAAC 2009. LNCS, vol. 5878, pp. 55–64. Springer, Heidelberg (2009)
16. Stoer, M., Wagner, F.: A simple min-cut algorithm. Journal of the ACM 44, 585–591 (1997)
17. Thorup, M.: Fully-dynamic min-cut. Combinatorica 27, 91–127 (2007)
18. Thorup, M.: Minimum k-way cuts via deterministic greedy tree packing. In: Pro-
ceedings of the 40th Annual ACM Symposium on Theory of Computing, pp. 159–
166 (2008)
19. Xiao, M.: Finding minimum 3-way cuts in hypergraphs. In: Agrawal, M., Du, D.-
Z., Duan, Z., Li, A. (eds.) TAMC 2008. LNCS, vol. 4978, pp. 270–281. Springer,
Heidelberg (2008)
20. Xiao, M.: An improved divide-and-conquer algorithm for finding all minimum k-
way cuts. In: Hong, S.-H., Nagamochi, H., Fukunaga, T. (eds.) ISAAC 2008. LNCS,
vol. 5369, pp. 208–219. Springer, Heidelberg (2008)
21. Zhao, L., Nagamochi, H., Ibaraki, T.: A unified framework for approximating mul-
tiway partition problems. In: Eades, P., Takaoka, T. (eds.) ISAAC 2001. LNCS,
vol. 2223, pp. 682–694. Springer, Heidelberg (2001)
Eigenvalue Techniques for Convex Objective,
Nonconvex Optimization Problems

Daniel Bienstock

APAM and IEOR Depts., Columbia University

Abstract. A fundamental difficulty when dealing with a minimization


problem given by a nonlinear, convex objective function over a nonconvex
feasible region, is that even if we can efficiently optimize over the con-
vex hull of the feasible region, the optimum will likely lie in the interior
of a high dimensional face, “far away” from any feasible point, yielding
weak bounds. We present theory and implementation for an approach
that relies on (a) the S-lemma, a major tool in convex analysis, (b) effi-
cient projection of quadratics to lower dimensional hyperplanes, and (c)
efficient computation of combinatorial bounds for the minimum distance
from a given point to the feasible set, in the case of several significant
optimization problems. On very large examples, we obtain significant
lower bound improvements at a small computational cost.¹

1 Introduction
We consider problems with the general form

\[ (\mathcal{F}):\quad \bar{F} := \min F(x), \tag{1} \]
\[ \text{s.t. } x \in \mathcal{P}, \tag{2} \]
\[ x \in \mathcal{K}, \tag{3} \]

where
– $F$ is a convex quadratic, i.e. $F(x) = x^T M x + v^T x$ (with $M \succeq 0$ and
$v \in \mathbb{R}^n$). Extensions to the non-quadratic case are possible (see below).
– $\mathcal{P} \subseteq \mathbb{R}^n$ is a convex set over which we can efficiently optimize $F$,
– $\mathcal{K} \subseteq \mathbb{R}^n$ is a non-convex set with “special structure”.
We assume that a given convex relaxation of the set described by (2), (3) is under
consideration. A fundamental difficulty is likely to be encountered: because of
the convexity of F , the optimum solution to the relaxation will frequently be
attained in the interior of a high-dimensional face of the relaxation, and far from
the set K. Thus, the lower bound proved by the relaxation will often be weak.
What is more, if one were to rely on branch-and-cut the proved lower bound may
improve little if at all when n is large, even after massive amounts of branching.
This stalling of the lower bounding procedure is commonly encountered in
practice and constitutes a significant challenge, the primary subject of our study.
¹ Work partially funded by ONR Award N000140910327 and a gift from BHP Billiton
Ltd.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 29–42, 2010.

© Springer-Verlag Berlin Heidelberg 2010

After obtaining the solution x∗ to the given relaxation for problem F , our meth-
ods will use techniques of convex analysis, of eigenvalue optimization, and com-
binatorial estimations, in order to quickly obtain a valid lower bound on F̄ which
is strictly larger than F (x∗ ). Our methods apply if F is not quadratic but there
is a convex quadratic G such that F (x) − F (x∗ ) ≥ G(x − x∗ ) for all feasible x.
We will describe an important class of problems where our method, applied
to a “cheap” but weak formulation, produces bounds comparable to or better
than those obtained by much more sophisticated formulations, and at a small
fraction of the computational cost.
Cardinality constrained optimization problems. Here, for some integer
$0 < K \le n$, $\mathcal{K} = \{x \in \mathbb{R}^n : \|x\|_0 \le K\}$, where the zero-norm $\|v\|_0$ of a vector
$v$ is used to denote the number of nonzeros in $v$. This constraint arises in portfolio
optimization (see e.g. [2]) but modern applications involving this constraint
arise in statistics, machine learning [13], and, especially, in engineering and biology
[19]. Problems related to compressive sensing have an explicit cardinality
constraint (see www.dsp.ece.rice.edu/cs for material). Also see [7].
The simplest canonical example of problem $\mathcal{F}$ is as follows:

\[ \bar{F} = \min F(x), \tag{4} \]
\[ \text{s.t. } \sum_j x_j = 1, \quad x \ge 0, \tag{5} \]
\[ \|x\|_0 \le K. \tag{6} \]

This problem is strongly NP-hard, and it does arise in practice, exactly as stated.
In spite of its difficulty, this example already incorporates the fundamental
difficulty alluded to above: clearly,
\[ \operatorname{conv}\Bigl\{x \in \mathbb{R}^n_+ : \sum_j x_j = 1,\ \|x\|_0 \le K\Bigr\} = \Bigl\{x \in \mathbb{R}^n_+ : \sum_j x_j = 1\Bigr\}. \]
In other words, from a convexity standpoint the cardinality
constraint disappears. Moreover, if the quadratic in $F$ is positive definite
and dominates the linear term, then the minimizer of $F$ over the unit simplex
will be an interior point (all coordinates positive) whereas $K \ll n$ in practice.
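The interior-point phenomenon can be observed on a tiny instance. The sketch below minimizes a positive-definite quadratic over the unit simplex by projected gradient descent; this solver is only an illustrative choice and is not the relaxation method used in the paper. The helper names `project_to_simplex` and `minimize_over_simplex` and the instance $M = \mathrm{diag}(1, \ldots, 6)$, $v = 0$ are assumptions made for the example.

```python
import numpy as np


def project_to_simplex(y):
    """Euclidean projection of y onto {x >= 0, sum(x) = 1} (sort-based method)."""
    u = np.sort(y)[::-1]
    css = np.cumsum(u) - 1.0
    idx = np.arange(1, len(y) + 1)
    rho = np.nonzero(u - css / idx > 0)[0][-1]
    tau = css[rho] / (rho + 1)
    return np.maximum(y - tau, 0.0)


def minimize_over_simplex(M, v, steps=2000):
    """Projected gradient descent for F(x) = x^T M x + v^T x on the unit simplex."""
    n = len(v)
    x = np.full(n, 1.0 / n)
    step = 1.0 / (2.0 * np.linalg.eigvalsh(M)[-1])  # 1/L, L = Lipschitz const. of 2Mx + v
    for _ in range(steps):
        x = project_to_simplex(x - step * (2.0 * M @ x + v))
    return x


M = np.diag(np.arange(1.0, 7.0))   # positive definite, no linear term
v = np.zeros(6)
x_relaxed = minimize_over_simplex(M, v)
support_size = int(np.count_nonzero(x_relaxed > 1e-9))
```

For this instance the optimality conditions give $x_j \propto 1/j$, so every coordinate is positive and the relaxed optimum violates any cardinality bound $K < 6$.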
A second relevant example is given by a system of multiple (linear) disjunctions,
such as split-cuts [6]. Also see [3], [4]. Details in full paper. To the extent
that disjunctive sets are a general-purpose technique for formulating combinatorial
constraints, the methods in this paper apply to a wide variety of optimization
problems.

1.1 Techniques
Our methods embody two primary techniques:
(a) The S-lemma (see [20], also [1], [5], [15]). Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be quadratic
functions and suppose there exists $\bar{x} \in \mathbb{R}^n$ such that $g(\bar{x}) > 0$. Then

\[ f(x) \ge 0 \ \text{ whenever } \ g(x) \ge 0 \]

if and only if there exists $\mu \ge 0$ such that $(f - \mu g)(x) \ge 0$ for all $x$.



Remark: here, a “quadratic” may contain a linear as well as a constant term.


The S-lemma can be used as an algorithmic framework for minimizing a quadratic
subject to a quadratic constraint. Let p, q be quadratic functions and let α, β be
reals. Then

\[ \min\{p(x) : q(x) \ge \beta\} \ge \alpha \quad \text{iff} \quad \exists\, \mu \ge 0 \ \text{s.t.} \ p(x) - \alpha - \mu q(x) + \mu\beta \ge 0 \ \ \forall x. \tag{7} \]

In other words, the minimization problem in (7) can be approached as a simul-


taneous search for two reals α and μ ≥ 0, with α largest possible such that the
last inequality in (7) holds. The S-lemma is significant in that it provides a good
characterization (i.e. polynomial-time) for a usually non-convex optimization
problem. See [14], [16], [17], [18], [21] and the references therein, in particular
regarding the connection to the trust-region subproblem.
(b) Consider a given nonconvex set $\mathcal{K}$. We will assume, as a primitive, that
(possibly after an appropriate change of coordinates), given a point $\hat{x} \in \mathbb{R}^n$, we
can efficiently compute a strong (combinatorial) lower bound for the Euclidean
distance between $\hat{x}$ and the nearest point in $\mathcal{P} \cap \mathcal{K}$. We will show that this is
indeed the case for the cardinality constrained case (see Section 1.4). Roughly,
we exploit the “structure” of a set $\mathcal{K}$ of interest. We will denote by $D(\hat{x})$ our
lower bound on the minimum distance from $\hat{x}$ to $\mathcal{P} \cap \mathcal{K}$.
Using (a) and (b), we can compute a lower bound for F̄ :

Simple Template
S.1 Compute an optimal solution x∗ to the given relaxation to problem F .
S.2 Obtain the quantity D(x∗ ).
S.3 Apply the S-lemma as in (7), using F (x) for p(x), and (the exterior of)
the ball centered at x∗ with radius D(x∗ ) for q(x) − β.

Fig. 1. A simple case

For a simple application of this template, consider Figure 1. This shows an


instance of problem (4)-(6), with n = 3 and K = 2 where all coordinates of x∗
are positive. The figure also assumes that D(x∗ ) is exact – it equals the minimum
distance from x∗ to the feasible region. If we minimize F (x), subject to being
on the exterior of this ball the optimum will be attained at y. Thus, F (y) ≤ F̄ ;
we have F (y) = F (x∗ ) + λ̃1 R2 , where R is the radius of the ball and λ̃1 is the
minimum eigenvalue of the restriction of F (x) to the unit simplex.
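The identity $F(y) = F(x^*) + \tilde{\lambda}_1 R^2$ can be checked numerically in a simplified setting. The sketch below assumes, unlike the figure, that there is no simplex constraint and that $x^*$ is the unconstrained minimizer of a positive-definite $F$, so that $\tilde{\lambda}_1$ is just the smallest eigenvalue of $M$; the function name `ball_exterior_bound` and the instance are hypothetical. With $\mu = \lambda_{\min}$, the certificate in (7) reduces to $(x - x^*)^T (M - \mu I)(x - x^*) \ge 0$.

```python
import numpy as np


def ball_exterior_bound(M, v, R):
    """Return (x_star, bound, y) for min{F(x) : ||x - x_star|| >= R},
    where F(x) = x^T M x + v^T x, M is positive definite and
    x_star is the unconstrained minimizer of F."""
    x_star = np.linalg.solve(2.0 * M, -v)
    eigvals, eigvecs = np.linalg.eigh(M)          # ascending eigenvalues
    f_star = x_star @ M @ x_star + v @ x_star
    bound = f_star + eigvals[0] * R ** 2          # F(x*) + lambda_min * R^2
    y = x_star + R * eigvecs[:, 0]                # a point attaining the bound
    return x_star, bound, y


M = np.array([[3.0, 1.0], [1.0, 2.0]])
v = np.array([-1.0, 4.0])
R = 0.5
F = lambda x: x @ M @ x + v @ x
x_star, bound, y = ball_exterior_bound(M, v, R)

# Sample points on and outside the sphere of radius R around x_star.
samples = [x_star + r * np.array([np.cos(a), np.sin(a)])
           for r in (R, 2 * R, 5 * R)
           for a in np.linspace(0.0, 2.0 * np.pi, 73)]
worst = min(F(x) for x in samples)
```

The value `bound` is attained at `y`, and no sampled point outside the ball does better, in line with the S-lemma characterization.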

Now consider the example in Figure 2, corresponding to the case of a single


disjunction. Here, xF is the optimizer of F (x) over the affine hull of the set P.
A straightforward application of the S-Lemma will yield as a lower bound (on
F̄ ) the value F (y), which is weak – weaker, in fact, than F (x∗ ). The problem is
caused by the fact that xF is not in the relative interior of the convex hull of
the feasible region. In summary, a direct use of our template will not work.

Fig. 2. The simple template fails (points shown: $y$, $x^F$, $x^*$)

1.2 Adapting the Template


In order to correct the general form of the difficulty depicted by Figure 2 we
would need to solve a problem of the form:
\[ V := \min\bigl\{F(x) : x - x^* \in C,\ (x - x^*)^T (x - x^*) \ge \delta^2\bigr\} \tag{8} \]
where $\delta > 0$, and $C$ is the cone of feasible directions (for $\mathcal{P}$) at $x^*$. We can view
this as a ’cone constrained’ version of the problem addressed by the S-Lemma.
Clearly, $F(x^*) \le V \le \bar{F}$ with the first inequality in general strict. If we are
dealing with polyhedral sets, (8) becomes (after some renaming):
\[ V = \min\bigl\{F(\omega) : C\omega \ge 0,\ \omega^T \omega \ge \delta^2\bigr\} \tag{9} \]
where $C$ is an appropriate matrix. However, we have (proof in full paper):
Theorem 1. Problem (9) is strongly NP-hard.
We stress that the NP-hardness result is not simply a consequence of the non-
convex constraint in (9) – without the linear constraints, the problem becomes
polynomially solvable (i.e., it is handled by the S-lemma, see the references).
To bypass this negative result, we will adopt a different approach. We assume
that there is a positive-definite quadratic function $q(x)$ such that for any $y \in \mathbb{R}^n$,
in polynomial time we can produce a (strong, combinatorial) lower bound
$D^2_{\min}(y, q)$ on the quantity
\[ \min\{q(y - x) : x \in \mathcal{P} \cap \mathcal{K}\}. \]

In Section 1.4 we will address how to produce the quadratic $q(x)$ and the value
$D^2(y, q)$ when $\mathcal{K}$ is defined by a cardinality constraint.
Let $c = \nabla F(x^*)$ (other choices for $c$ discussed in full paper). Note that for
any $x \in \mathcal{P} \cap \mathcal{K}$, $c^T (x - x^*) \ge 0$. For $\alpha \ge 0$, let $p^\alpha = x^* + \alpha c$, and let $H^\alpha$ be the
hyperplane through $p^\alpha$ orthogonal to $c$. Finally, define

\[ V(\alpha) := \min\{F(x) : q(x - p^\alpha) \ge D^2(p^\alpha, q),\ x \in H^\alpha\}, \tag{10} \]

and let $y^\alpha$ attain the minimum. Note: computing $V(\alpha)$ entails an application of
the S-lemma, “restricted” to $H^\alpha$. See Figure 3. Clearly, $V(\alpha) \le \bar{F}$. Then

– Suppose $\alpha = 0$, i.e. $p^\alpha = x^*$. Then $x^*$ is a minimizer of $F(x)$ subject to
$x \in H^0$. Thus $V(0) > F(x^*)$ when $F$ is positive-definite.
– Suppose $\alpha > 0$. Since $c^T (y^\alpha - x^*) > 0$, by convexity $V(\alpha) = F(y^\alpha) > F(x^*)$.

Thus, $F(x^*) \le \inf_{\alpha \ge 0} V(\alpha) \le \bar{F}$; the first inequality being strict in the positive-definite
case. [It can be shown that the “inf” is a “min”]. Each value $V(\alpha)$
incorporates combinatorial information (through the quantity $D^2(p^\alpha, q)$) and
thus the computation of $\min_{\alpha \ge 0} V(\alpha)$ cannot be obtained through direct convex
optimization techniques. As a counterpoint to Theorem 1 one can prove (using
the notation in eq. (8)):

Theorem 2. In (9), if $C$ has one row and $q(x) = \sum_j x_j^2$, $V \le \inf_{\alpha \ge 0} V(\alpha)$.

Fig. 3. A better paradigm (labels: $c$, $H^\alpha$, $x^*$, the bounding ellipsoid in $H^\alpha$, and the minimizer of $F(x)$ in $H^\alpha$)

In order to develop a computationally practicable approach that uses these observations,
let $0 = \alpha^{(0)} < \alpha^{(1)} < \ldots < \alpha^{(J)}$, such that for any $x \in \mathcal{P} \cap \mathcal{K}$,
$c^T (x - x^*) \le \alpha^{(J)} \|c\|_2^2$. Then:

Updated Template

1. For $0 \le i < J$, compute a value $\tilde{V}(i) \le \min\{V(\alpha) : \alpha^{(i)} \le \alpha \le \alpha^{(i+1)}\}$.
2. Output $\min_{0 \le i < J} \tilde{V}(i)$.

The idea here is that if (for all $i$) $\alpha^{(i+1)} - \alpha^{(i)}$ is small then $V(\alpha^{(i)}) \approx V(\alpha^{(i+1)})$.
Thus the quantity output in Step 2 will closely approximate $\min_{\alpha \ge 0} V(\alpha)$.
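The control flow of the Updated Template can be sketched as follows. The rule used here for $\tilde{V}(i)$ is an assumption made for the illustration and is not the interpolation scheme of the implementation: it supposes that $V$ is Lipschitz with a known constant, in which case $\min\{V(\alpha^{(i)}), V(\alpha^{(i+1)})\}$ minus half the Lipschitz constant times the interval length is a valid lower bound on the interval. The names `updated_template`, `V` and `lipschitz` are hypothetical.

```python
def updated_template(V, alphas, lipschitz):
    """Lower-bound min V over [alphas[0], alphas[-1]] from grid evaluations.

    V: callable alpha -> V(alpha), assumed Lipschitz with constant `lipschitz`.
    alphas: increasing grid alpha^(0) < alpha^(1) < ... < alpha^(J).
    """
    best = float("inf")
    for a_lo, a_hi in zip(alphas, alphas[1:]):
        # Step 1: a value below every V(alpha) with a_lo <= alpha <= a_hi.
        v_tilde = min(V(a_lo), V(a_hi)) - lipschitz * (a_hi - a_lo) / 2.0
        # Step 2: keep the smallest interval bound.
        best = min(best, v_tilde)
    return best


# Toy stand-in for V: true minimum 2 at alpha = 1, |V'| <= 4 on [0, 3].
V = lambda a: (a - 1.0) ** 2 + 2.0
alphas = [0.03 * i for i in range(101)]
result = updated_template(V, alphas, lipschitz=4.0)
```

As the grid is refined, the gap between `result` and the true minimum shrinks in proportion to the interval length, which matches the remark that a fine grid makes the output closely approximate $\min_{\alpha \ge 0} V(\alpha)$.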

In our implementation, we compute Ṽ (i) by appropriately interpolating be-


tween V (α(i) ) and V (α(i+1) ) (details, full paper). Thus our approach reduces to
computing quantities of the form V (α). We need a fast procedure for this task
(since J may be large). Considering eq. (10) we see that this involves an applica-
tion of the S-lemma, “restricted” to the hyperplane H α . An efficient realization
of this idea, which allows for additional leveraging of combinatorial information,
is obtained by computing the projection of the quadratic F (x) to H α . This is
the subject of the next section.

1.3 Projecting a Quadratic


Let $M = Q\Lambda Q^T$ be an $n \times n$ matrix. Here the columns of $Q$ are the eigenvectors
of $M$ and $\Lambda = \mathrm{diag}\{\lambda_1, \ldots, \lambda_n\}$ where the $\lambda_i$ are the eigenvalues of
$M$. We assume $\lambda_1 \le \ldots \le \lambda_n$. Let $c \ne 0$ be an arbitrary vector, and denote
$H = \{x \in \mathbb{R}^n : c^T x = 0\}$, and let $P$ be the projection matrix onto $H$.
In this section we describe an efficient algorithm for computing an eigenvalue-eigenvector
decomposition of the “projected quadratic” $PMP$. Note that if
$x \in H$, $x^T PMP x = x^T M x$. The vector $c$ could be dense (is dense in important
cases) and $Q$ could also be dense.
In [8] (also see Section 12.6 of [9] and references therein) the following “inverse”
problem is considered. Suppose $\lambda_i < \lambda_{i+1}$ ($1 \le i < n$) and that for
$1 \le i \le n - 1$ we are given a number $\tilde{\lambda}_i$ with $\lambda_i < \tilde{\lambda}_i < \lambda_{i+1}$. Then we want
to find $c$ (and hence, $P$) such that the $\tilde{\lambda}_i$ are the nonzero eigenvalues of $PMP$.
Our approach reverse engineers that of [8], and extends it so as to handle the
case where the $\lambda_i$ are not distinct.
Returning to our problem, clearly $c$ is an eigenvector of $PMP$ (corresponding
to eigenvalue 0). The remaining eigenvalues $\tilde{\lambda}_1, \ldots, \tilde{\lambda}_{n-1}$ are known to satisfy
$\lambda_1 \le \tilde{\lambda}_1 \le \lambda_2 \le \tilde{\lambda}_2 \le \ldots \le \lambda_{n-1} \le \tilde{\lambda}_{n-1} \le \lambda_n$.
Definition 1. An eigenvector $q$ of $M$ is called acute if $q^T c \ne 0$. An eigenvalue
$\lambda$ of $M$ is called acute if at least one eigenvector corresponding to $\lambda$ is acute.
In (e.2) below we will use the convention $0/0 = 0$.
Lemma 1. Let $\alpha_1 < \alpha_2 < \ldots < \alpha_q$ be the acute eigenvalues of $M$. Write
$d = Q^T c$. Then, for $1 \le i \le q - 1$,
(e.1) The equation $\sum_{j=1}^n \frac{d_j^2}{\lambda_j - \lambda} = 0$ has a unique solution $\hat{\lambda}_i$ in $(\alpha_i, \alpha_{i+1})$.
(e.2) Let $w_i = Q(\Lambda - \hat{\lambda}_i I)^{-1} d$. Then $c^T w_i = 0$ and $PMP w_i = \hat{\lambda}_i w_i$.
Proof. (e.2) Note that the expression in (e.1), evaluated at $\hat{\lambda}_i$, can be written as

\[ 0 = d^T (\Lambda - \hat{\lambda}_i I)^{-1} d = c^T Q (\Lambda - \hat{\lambda}_i I)^{-1} Q^T c = c^T w_i. \tag{11} \]

Thus, we have that $w_i$ is a linear combination of acute eigenvectors of $M$ and
that $w_i \in H$, and therefore $P w_i = w_i$. So
\[ (M - \hat{\lambda}_i I) w_i = Q(\Lambda - \hat{\lambda}_i I) Q^T w_i = Q Q^T c = c, \]
and therefore, since $Pc = 0$,
\[ PMP w_i = PM w_i = \hat{\lambda}_i P w_i = \hat{\lambda}_i w_i, \]
as desired.
Altogether, Lemma 1 produces q − 1 eigenvalue/eigenvector pairs of P M P . The
vector in (e.2) should not be explicitly computed; rather the factorized form in
(e.2) will suffice. The root of the equation in (e.1) can be quickly obtained using
numerical methods (such as golden section search) since the expression in (e.1)
is monotonically increasing in (αi , αi+1 ) (it may also be possible to adapt the basic
trust-region algorithm [14], which addresses a similar but not identical problem).
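As an illustration of Lemma 1, the following minimal sketch locates the roots of the equation in (e.1) on a toy instance and checks (e.2) numerically. It uses plain bisection rather than golden section search (our substitution, justified by the monotonicity noted above), and it assumes M is diagonal so that Q = I and d = c; the helper names `secular`, `secular_root` and `project` are hypothetical and not taken from the paper.

```python
def secular(lam, d, x):
    """Left-hand side of (e.1): sum_j d_j^2 / (lam_j - x); terms with d_j = 0 are skipped (0/0 = 0)."""
    return sum(dj * dj / (lj - x) for lj, dj in zip(lam, d) if dj != 0.0)


def secular_root(lam, d, lo, hi, tol=1e-13):
    """Bisection for the root of (e.1) strictly between lo and hi.

    lo < hi are assumed to be consecutive acute eigenvalues, so the
    expression increases from -infinity to +infinity on (lo, hi).
    """
    while hi - lo > tol * max(1.0, abs(hi)):
        mid = 0.5 * (lo + hi)
        if secular(lam, d, mid) < 0.0:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies at or to the left of mid
    return 0.5 * (lo + hi)


def project(c, v):
    """Apply P = I - c c^T / (c^T c), the projection onto H = {x : c^T x = 0}."""
    scale = sum(ci * vi for ci, vi in zip(c, v)) / sum(ci * ci for ci in c)
    return [vi - scale * ci for ci, vi in zip(c, v)]


# Toy instance (an assumption for illustration): M = diag(1, 2, 4), Q = I, d = c.
lam = [1.0, 2.0, 4.0]
c = [1.0, 1.0, 1.0]
d = c[:]

roots = [secular_root(lam, d, lam[i], lam[i + 1]) for i in range(len(lam) - 1)]

# Check (e.2): w = (Lambda - root*I)^{-1} d is orthogonal to c and satisfies
# P M P w = root * w.
residuals = []
for r in roots:
    w = [dj / (lj - r) for lj, dj in zip(lam, d)]
    MPw = [lj * x for lj, x in zip(lam, project(c, w))]
    PMPw = project(c, MPw)
    residuals.append(max(abs(a - r * b) for a, b in zip(PMPw, w)))
```

On this instance the equation in (e.1) reduces to the quadratic 3λ² − 14λ + 14 = 0, so the two computed roots can be compared with its closed-form solutions.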
Lemma 2. Let α be an eigenvalue of M , V^α the set of columns of Q with
eigenvalue α, and A = A(α) denote the acute members of V^α . If |A| > 0, then
we can construct |A| − 1 eigenvectors of P M P corresponding to eigenvalue α,
each of which is a linear combination of elements of A and is orthogonal to c.
Proof: Write m = |A|, let $d_A$ denote the subvector of d indexed by A, and let H
be the m × m Householder matrix [9] corresponding to $d_A$, i.e. H is a symmetric
matrix with $H^2 = I_m$ such that
$$H d_A = (\|d_A\|_2, 0, \ldots, 0)^T \in \mathbb{R}^m.$$
Let $Q_A$ be the n × m submatrix of Q consisting of the columns corresponding
to A, and define
$$W = Q_A H. \tag{12}$$
Then $c^T W = d_A^T H = (\|d_A\|_2, 0, \ldots, 0)$. In other words, the columns of the
submatrix $\hat W$ consisting of the last m − 1 columns of W are orthogonal to c.
Denoting by $\hat H$ the submatrix of H consisting of the last m − 1 columns of H,
we therefore have $\hat W = Q_A \hat H$, and
$$P M P \hat W = P Q \Lambda Q^T \hat W = P Q \Lambda Q^T Q_A \hat H = \alpha P Q_A \hat H = \alpha \hat W.$$
Finally, $\hat W^T \hat W = \hat H^T \hat H = I_{m-1}$, as desired.
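The Householder step in the proof of Lemma 2 can be sketched as follows. This is only an illustration under stated assumptions: the vector `d_A` holds made-up values, the function name `householder` is hypothetical, and the simple formula H = I − 2uu^T/(u^T u) with u = d_A − ‖d_A‖₂ e₁ is used without the sign safeguards that a numerically careful implementation would add.

```python
import math


def householder(d):
    """Symmetric H with H*H = I and H d = (||d||_2, 0, ..., 0)^T."""
    m = len(d)
    norm = math.sqrt(sum(x * x for x in d))
    u = d[:]          # u = d - ||d||_2 * e_1
    u[0] -= norm
    uu = sum(x * x for x in u)
    H = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    if uu > 0.0:      # otherwise d is already a nonnegative multiple of e_1 and H = I
        for i in range(m):
            for j in range(m):
                H[i][j] -= 2.0 * u[i] * u[j] / uu
    return H


d_A = [3.0, 4.0, 12.0]  # hypothetical entries of d for one acute eigenvalue
m = len(d_A)
H = householder(d_A)
Hd = [sum(H[i][j] * d_A[j] for j in range(m)) for i in range(m)]

# H_hat: the last m - 1 columns of H. Each is orthogonal to d_A, which is
# what makes the columns of W_hat = Q_A H_hat orthogonal to c.
H_hat = [[H[i][j] for j in range(1, m)] for i in range(m)]
dots = [sum(d_A[i] * H_hat[i][j] for i in range(m)) for j in range(m - 1)]
gram = [[sum(H_hat[i][a] * H_hat[i][b] for i in range(m)) for b in range(m - 1)]
        for a in range(m - 1)]
```

Here `gram` is the (m − 1) × (m − 1) matrix of inner products of the columns of `H_hat`, which should be the identity.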
Now suppose that
α1 < α2 < . . . < αq
denote the distinct acute eigenvalues of M (possibly q = 0). Let p denote the
number of columns of Q which are perpendicular eigenvectors. Writing mi =
|A(αi )| > 0 for 1 ≤ i ≤ q, we have that
$$n = \sum_{i=1}^{q} m_i + p.$$
(p.1) Using Lemma 1 we obtain q − 1 eigenvectors of P M P , each of which is
a linear combination of acute eigenvectors among Q. Any eigenvalue of
P M P constructed in this manner is different from all acute eigenvalues of
M.
(p.2) Using Lemma 2 we obtain, for each i, a set of mi − 1 eigenvectors of
P M P , orthogonal to c and with eigenvalue αi , each of which is a linear
combination of elements of A(αi ). In total, we obtain n − q − p eigenvectors
of P M P .
(p.3) Each of the p perpendicular vectors v among Q (with eigenvalue λ, say)
by definition satisfies P M P v = P M v = λP v = λv.

By construction, all eigenvectors of P M P constructed as per (p.1) and (p.2) are
distinct. Those arising in (p.3) are different from those in (p.1) and (p.2) since
no column of Q is a linear combination of other columns of Q. Thus, altogether,
(p.1)-(p.3) account for n − 1 distinct eigenvectors of P M P , all of them orthogonal
to c, by construction. Finally, the vector c itself is an eigenvector of P M P ,
corresponding to eigenvalue 0.
To conclude this section, we note that it is straightforward to iterate the
procedure in this section, so as to project a quadratic to hyperplanes of dimension
less than n − 1. More details will be provided in the full paper.

1.4 Combinatorial Bounds on Distance Functions

Here we take up the problem of computing strong lower bounds on the Euclidean
distance from a point to the set P ∩ K. In this abstract we will focus on the
cardinality constrained problem, but results of a similar flavor hold for the case
of disjunctive sets.
Let a ∈ Rn , b ∈ R, K < n be a positive integer, and ω ∈ Rn . Consider the
problem
$$D^2_{\min}(\omega, a) := \min\Big\{ \sum_{j=1}^{n} (x_j - \omega_j)^2 \;:\; a^T x = b \text{ and } \|x\|_0 \le K \Big\}. \tag{13}$$

Clearly, the sum of the n − K smallest values $\omega_j^2$ constitutes a (“naive”) lower bound
for problem (13). But it is straightforward to show that an exact solution to (13)
is obtained by choosing S ⊆ {1, . . . , n} with |S| ≤ K, so as to minimize
$$\frac{\big(b - \sum_{j \in S} a_j \omega_j\big)^2}{\sum_{j \in S} a_j^2} + \sum_{j \notin S} \omega_j^2. \tag{14}$$
[We use the convention that 0/0 = 0.] Empirically, the naive bound mentioned
above is very weak since the first term in (14) is typically at least an order of
magnitude larger than the second; and it is the bound, rather than the set S
itself, that matters.
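To make the comparison between (14) and the naive bound concrete, here is a small brute-force sketch. It enumerates every support S with |S| ≤ K, so it takes exponential time and is meant only as an illustration; the function names and the numerical data are assumptions, not part of the paper.

```python
from itertools import combinations


def support_value(S, a, omega, b):
    """Value of expression (14) for a fixed support S (a set of indices)."""
    outside = sum(omega[j] ** 2 for j in range(len(a)) if j not in S)
    num = (b - sum(a[j] * omega[j] for j in S)) ** 2
    den = sum(a[j] ** 2 for j in S)
    if den == 0.0:
        # 0/0 = 0 by convention; if num != 0 no x supported on S satisfies a^T x = b.
        return outside if num == 0.0 else float("inf")
    return num / den + outside


def d2_min_bruteforce(a, omega, b, K):
    """Minimum of (14) over all supports S with |S| <= K."""
    n = len(a)
    return min(support_value(set(S), a, omega, b)
               for k in range(K + 1)
               for S in combinations(range(n), k))


def naive_bound(omega, K):
    """Sum of the n - K smallest values omega_j^2."""
    squares = sorted(w * w for w in omega)
    return sum(squares[:len(omega) - K])


# Hypothetical data with a_j = 1 for all j.
a = [1.0, 1.0, 1.0, 1.0]
omega = [0.1, 0.2, 0.05, 0.3]
exact = d2_min_bruteforce(a, omega, b=1.0, K=2)
naive = naive_bound(omega, K=2)
```

On this instance the best support consists of the two largest entries of ω, and the first term of (14) is ten times the naive bound.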
Suppose aj = 1 for all j. It can be shown, using (14), that the optimal set S
has the following structure: S = P ∪ N , where |P | + |N | ≤ K, and P consists of
the indices of the |P | smallest nonnegative ωj (resp., N consists of the indices
of the |N | smallest |ωj | with ωj < 0). The optimal S can be computed in O(K)
time, after sorting the ωj . When ω ≥ 0 or ω ≤ 0 we recover the naive procedure
mentioned above (though again we stress that the first term in (14) dominates).
In general, however, we have:
Theorem 3. (a) It is NP-hard to compute $D^2_{\min}(\omega, a)$. (b) Let 0 < ε < 1. We
can compute a vector $\hat x$ with $\sum_j a_j \hat x_j = b$ and $\|\hat x\|_0 \le K$, and such that
$$\sum_{j=1}^{n} (\hat x_j - \omega_j)^2 \le (1 + \varepsilon)\, D^2_{\min}(\omega, a),$$
in time polynomial in n, $\varepsilon^{-1}$, and the number of bits needed to represent ω
and a.
In our current implementation we have not used the algorithm in part (b) of
Theorem 3, though we certainly plan to evaluate this option. Instead, we proceed
as follows. Assume aj ≠ 0 for all j. Rather than solving problem (13), instead
we consider
$$\min\Big\{ \sum_{j=1}^{n} a_j^2 (x_j - \omega_j)^2 \;:\; a^T x = b \text{ and } \|x\|_0 \le K \Big\}.$$
Writing $\bar\omega_j = a_j \omega_j$ (for all j) and rescaling each variable $x_j$ by $a_j$, this becomes
$$\min\Big\{ \sum_{j=1}^{n} (x_j - \bar\omega_j)^2 \;:\; \sum_{j} x_j = b \text{ and } \|x\|_0 \le K \Big\},$$
which as noted above can be efficiently solved.

1.5 Application of the S-Lemma


Let $M = Q \Lambda Q^T \succeq 0$ be a matrix given by its eigenvector factorization. Let H
be a hyperplane through the origin, $\hat x \in H$, $v \in \mathbb{R}^n$, $\delta_j > 0$ for 1 ≤ j ≤ n, and β > 0.
Here we solve the problem
$$\min\; x^T M x + v^T x, \quad \text{subject to } \sum_{i=1}^{n} \delta_i (x_i - \hat x_i)^2 \ge \beta, \text{ and } x \in H. \tag{15}$$
By rescaling, translating, and appropriately changing notation, the problem
becomes:
$$\min\; x^T M x + v^T x, \quad \text{subject to } \sum_{i=1}^{n} x_i^2 \ge \beta, \text{ and } x \in H. \tag{16}$$

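Let P be the n × n matrix corresponding to projection onto H. Using Section
1.3 we can produce a representation of P M P as $\tilde Q \tilde\Lambda \tilde Q^T$, where the nth
eigenvector $\tilde q_n$ is orthogonal to H, and $\tilde\lambda_1 = \min_{i<n} \{\tilde\lambda_i\}$. Thus, problem (16)
becomes, for appropriately defined $\tilde v$,
$$\Gamma := \min \sum_{j=1}^{n-1} \tilde\lambda_j y_j^2 + 2 \tilde v^T y, \quad \text{subject to } \sum_{j=1}^{n-1} y_j^2 \ge \beta. \tag{17}$$
Using the S-lemma, we have that Γ ≥ γ iff there exists μ ≥ 0 s.t.
$$\sum_{j=1}^{n-1} \tilde\lambda_j y_j^2 + 2 \tilde v^T y - \gamma - \mu \Big( \sum_{j=1}^{n-1} y_j^2 - \beta \Big) \ge 0 \quad \forall\, y \in \mathbb{R}^{n-1}. \tag{18}$$
Using some linear algebra, this is equivalent to
$$\Gamma = \max\Big\{ \mu\beta - \sum_{i=1}^{n-1} \frac{\tilde v_i^2}{\tilde\lambda_i - \mu} \;:\; 0 \le \mu < \tilde\lambda_1 \Big\}. \tag{19}$$
Computing this maximum is a simple task, since in $[0, \tilde\lambda_1)$ the objective in (19) is concave in μ.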
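Because the objective in (19) is concave on the interval in question, any one-dimensional search for a unimodal function will do. The sketch below uses ternary search, which is our choice and not a method prescribed by the text; the function names and the numerical data are hypothetical.

```python
def dual_objective(mu, lam, v, beta):
    """Objective of (19): mu*beta - sum_i v_i^2 / (lam_i - mu), defined for mu < min(lam)."""
    total = mu * beta
    for li, vi in zip(lam, v):
        if vi == 0.0:
            continue              # a zero numerator contributes nothing
        if li - mu <= 0.0:
            return float("-inf")  # outside the domain of (19)
        total -= vi * vi / (li - mu)
    return total


def s_lemma_bound(lam, v, beta, iters=200):
    """Ternary search for the maximum of the concave objective on [0, min(lam))."""
    lo, hi = 0.0, min(lam)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual_objective(m1, lam, v, beta) < dual_objective(m2, lam, v, beta):
            lo = m1   # a maximizer lies in [m1, hi]
        else:
            hi = m2   # a maximizer lies in [lo, m2]
    return lo, dual_objective(lo, lam, v, beta)


# One-dimensional check: min 2*y^2 + 2*y subject to y^2 >= 4 is attained at y = -2.
mu1, gamma1 = s_lemma_bound([2.0], [1.0], beta=4.0)

# Two-dimensional toy instance (hypothetical data).
lam2, v2, beta2 = [1.0, 3.0], [0.5, 1.0], 2.0
mu2, gamma2 = s_lemma_bound(lam2, v2, beta2)


def primal(y, lam, v):
    """Objective of (17) at the point y."""
    return sum(l * t * t for l, t in zip(lam, y)) + 2.0 * sum(s * t for s, t in zip(v, y))
```

In the one-dimensional case the maximizer is μ = 1.5 with value 4, which matches the primal optimum; in general `primal` evaluated at any y with Σ y_j² ≥ β is an upper bound on the value returned by `s_lemma_bound`.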
Remarks:
(1) Our updated template in Section 1.2 requires the solution of multiple problems
of the form (19) but just one computation of $\tilde Q$ and $\tilde\Lambda$.
(2) Consider any integer 1 ≤ p < n − 1. When $\mu < \tilde\lambda_1$, the expression maximized
in (19) is lower bounded by
$$\mu\beta - \sum_{i=1}^{p} \frac{\tilde v_i^2}{\tilde\lambda_i - \mu} - \frac{\sum_{i=p+1}^{n-1} \tilde v_i^2}{\lambda_{p+1} - \mu}.$$
This, and related facts, yield an approximate version of our approach which only asks for
the first p elements of the eigenspace of P M P (and M ).

Capturing the second eigenvalue. We see that $\Gamma < \tilde\lambda_1 \beta$ (and frequently this
bound is close). In experiments, the solution $y^*$ to (16) often “cheats” in that $y_1^*$
is close to zero. We can then improve on our procedure if the second projected
eigenvalue, $\tilde\lambda_2$, is significantly larger than $\tilde\lambda_1$. Assuming that is the case, pick a
value θ with $(y_1^*)^2/\beta < \theta < 1$.

(a) If we assert that $y_1^2 \ge \theta\beta$ then we may be able to strengthen the constraint
in (15) to $\sum_{i=1}^{n} \delta_i (x_i - \hat x_i)^2 \ge \gamma$, where γ = γ(θ) > β. See Lemma 3 below. So
the assertion amounts to applying the S-lemma, but using γ in place of β.
(b) Otherwise, we have that $\sum_{i=2}^{n-1} y_i^2 \ge (1 - \theta)\beta$. In this case, instead of the
right-hand side of (19), we will have
$$\max\Big\{ \mu (1 - \theta) \beta - \sum_{i=2}^{n-1} \frac{\tilde v_i^2}{\tilde\lambda_i - \mu} \;:\; 0 \le \mu \le \tilde\lambda_2 \Big\}. \tag{20}$$

The minimum of the quantities obtained in (a) and (b) yields a valid lower
bound on Γ ; we can evaluate several candidates for θ and choose the strongest
bound. When λ̃2 is significantly larger than λ̃1 we often obtain an improvement
over the basic approach as in Section 1.5.
Note: the approach in this section constitutes a form of branching and in our
testing has proved very useful when λ2 > λ1 . It is, intrinsically, a combinatorial
approach, and thus not easily reproducible using convexity arguments alone.
To complete this section, we point out that the construction of the quantities
γ(β) above is based on the following observation:

Lemma 3. Let $v \in \mathbb{R}^n$, let $H \subset \mathbb{R}^n$ be an (n − 1)-dimensional hyperplane with
$v \notin H$, and w be the projection of v onto H. Let $G \subset \mathbb{R}^n$ be a hyperplane of
dimension at most n − 1, K the intersection of G with the closed half-space of $\mathbb{R}^n$
separated from v by H, $D_{w,G}$ the distance from w to G and $\bar D_{v,K}$ the distance
from v to K ($\bar D_{v,K} = +\infty$ if K = ∅). Then $\bar D_{v,K}^2 \ge \|v - w\|^2 + D_{w,G}^2$.

2 Computational Experiments

We consider problems $\min\{ x^T M x + v^T x : \sum_j x_j = 1,\ x \ge 0,\ \|x\|_0 \le K \}$. The
matrix $M \succeq 0$ is given in its eigenvector/eigenvalue factorization $Q \Lambda Q^T$. To
stress-test our linear algebra routines, we construct Q as the product of random
rotations: as the number of rotations increases, so does the number of nonzeroes
in Q, and the overall “complexity” of M . We ran our procedure after computing
the solution to the (diagonalized) “weak” formulation
$$\min\Big\{ y^T \Lambda y + v^T x \;:\; Q^T x = y,\ \sum_j x_j = 1,\ x \ge 0 \Big\}.$$
We also ran the (again, diagonalized) perspective formulation [10], [12], a strong
conic formulation (here, $\lambda_{\min}$ is the minimum $\lambda_i$):
$$\begin{aligned}
\min\quad & \lambda_{\min} \sum_j w_j + \sum_j (\lambda_j - \lambda_{\min})\, y_j^2 \\
\text{s.t.}\quad & Q^T x = y, \quad \sum_j x_j = 1, \\
& x_j^2 - w_j z_j \le 0, \quad 0 \le z_j \le 1 \quad \forall\, j, \\
& \sum_j z_j \le K, \quad x_j \le z_j \quad \forall\, j, \quad x, w \in \mathbb{R}^n_+.
\end{aligned} \tag{21}$$
j

We used the Updated Template given above, with c = ∇(x∗ ) and with the
α(i) quantities set according to the following method: (a) J = 100, and (b)
α(J) = argmax{α ≥ 0 : H^α ∩ S^{n−1} ≠ ∅} (S^{n−1} is the unit simplex). The
improvement technique involving the second eigenvalue was applied in all cases.
For the experiments in Tables 1 and 2, we used Cplex 12.1 on a single core
of a 2.66 GHz quad-core Xeon machine with 16 GB of RAM, which was never
exceeded. In the tests in Table 1, n = 2443 and the eigenvalues are from a
finance application. Q is the product of 5000 random rotations, resulting in
142712 nonzeros in Q.
Here, rQMIP refers to the weak formulation, PRSP to the perspective for-
mulation, and SLE to the approach in this paper. “LB” is the lower bound
produced by a given approach, and “sec” is the CPU time in seconds. The second
eigenvalue technique proved quite effective in all these tests.
In Table 2 we consider examples with n = 10000 and random Λ. In the table,
Nonz indicates the number of nonzeroes in Q; as this number increases the
quadratic becomes less diagonal dominant.
Table 1. Examples with few nonzeroes

  K   rQMIP LB   PRSP LB   SLE LB   rQMIP sec   PRSP sec   SLE sec
200   0.031      0.0379    0.0382   14.02        59.30     5.3
100   0.031      0.0466    0.0482   13.98       114.86     5.8
 90   0.031      0.0485    0.0507   14.08       103.38     5.9
 80   0.031      0.0509    0.0537   14.02       105.02     6.2
 70   0.031      0.0540    0.0574   13.95       100.06     6.2
 60   0.031      0.0581    0.0624   15.64       111.63     6.4
 50   0.031      0.0638    0.0696   13.98       110.78     6.4
 40   0.031      0.0725    0.0801   14.03       104.48     6.5
 30   0.031      0.0869    0.0958   14.17       104.48     6.8
 20   0.031      0.1157    0.1299   15.69        38.13     6.9
 10   0.031      0.2020    0.2380   14.05        43.77     7.2

Table 2. Larger examples

Nonz in Q   rQMIP LB    PRSP LB     SLE LB      rQMIP sec   PRSP sec   SLE sec
5.3e+05     2.483e-03   1.209e-02   1.060e-02   332         961.95     57.69
3.7e+06     2.588e-03   1.235e-02   1.113e-02   705         2299.75    57.55
1.8e+07     2.671e-03   1.248e-02   1.117e-02   2.4e+03     1.3e+04    57.69
5.3e+07     2.781e-03   1.263e-02   1.120e-02   1.1e+04     8.5e+04    58.44
8.3e+07     2.758e-03   1.262e-02   1.211e-02   2.3e+04     1.4e+05    57.38

As in Table 1, SLE and PRSP provide similar improvements over rQMIP
(which is clearly extremely weak). SLE proves uniformly fast. In the examples
in Table 2, the smallest ten (or so) eigenvalues are approximately equal, with
larger values after that. As a result, on these examples our second eigenvalue
technique proved ineffective.
Also note that the perspective formulation quickly proves impractical. A
cutting-plane procedure that replaces the conic constraints in (21) with (outer
approximating) linear inequalities is outlined in [10], [12] and tested on random
problems with n ≤ 400. The procedure begins by solving rQMIP and then it-
eratively adds the inequalities; or it could simply solve a formulation consisting
of rQMIP, augmented with a set of pre-computed inequalities. In our experi-
ments with this linearized approximation, we found that (a) it can provide a very
good lower bound to the conic perspective formulation, (b) it can run signifi-
cantly faster than the full conic formulation, but, (c) it proves significantly slower
than rQMIP, and, in particular, still significantly slower than the combination
of rQMIP and SLE. A strengthened version of the perspective formulation,
which requires the solution of a semidefinite program, is given in [11].

Table 3. Detailed analysis of K = 70 case of Table 1

algorithm    setting      threads   nodes   wall-clock time (sec)   sec/node   LB       root     UB
QPMIP        mip emph 3    4        10000    41685                   16.67     0.0314            0.241
PRSP-MIP     mip emph 2   16        14000    39550                   90.4      0                 0.8265
PRSP-MIP     mip emph 3   16         7000    19817                   45.30     0                 0.8099
LPRSP-MIP    mip emph 0    4        39000   109333                   11.21     0.0554   0.0540   0.305
LPRSP-MIP    mip emph 1   16         7000    36751                   84.04     0.0542   0.0540   0.412
LPRSP-MIP    mip emph 2   16        16000    35222                   35.22     0.0543   0.0540   0.309
LPRSP-MIP    mip emph 3   16         6000    57469                  153        0.0564   0.0540   0.702
Note that the perspective formulation itself is an example of the paradigm that
we consider in this paper: a convex formulation for a nonconvex problem with a
convex objective; thus we expect it to exhibit stalling. Table 3 concerns the K =
70 case of Table 1, using Cplex 12.1 on a dual 2.93 GHz quad-core “Nehalem”
machine with 48GB of physical memory. [This CPU uses “hyperthreading” and
Cplex 12.1, as a default, will use 16 threads]. On this machine, rQMIP requires
4.35 seconds (using Cplex) and our method, 3.54 seconds (to prove a lower bound
of 0.0574).
In this table, QPMIP is the weak formulation, PRSP-MIP is the perspec-
tive formulation, and LPRSP-MIP is the linearized perspective version (con-
straint (21) is linearized at xj = 1/K which proved better than other choices).
[Comment: Cplex 12.1 states a lower bound of 0 for PRSP-MIP]. “wall-clock
time” indicates the observed running time. The estimates of CPU time per node
were computed using the formula (wall-clock time)*threads/nodes.

References
1. Ben-Tal, A., Nemirovsky, A.: Lectures on Modern Convex Optimization: Analysis,
Algorithms, and Engineering Applications. MPS-SIAM Series on Optimization.
SIAM, Philadelphia (2001)
2. Bienstock, D.: Computational study of a family of mixed-integer quadratic pro-
gramming problems. Math. Programming 74, 121–140 (1996)
3. Bienstock, D., Zuckerberg, M.: Subset algebra lift algorithms for 0-1 integer pro-
gramming. SIAM J. Optimization 105, 9–27 (2006)
4. Bienstock, D., McClosky, B.: Tightening simple mixed-integer sets with guaranteed
bounds (submitted 2008)
5. Boyd, S., El Ghaoui, L., Feron, E., Balakrishnan, V.: Linear matrix inequalities in
system and control theory. SIAM, Philadelphia (1994)
6. Cook, W., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer programs.
Math. Programming 47, 155–174 (1990)
7. De Farias, I., Johnson, E., Nemhauser, G.: A polyhedral study of the cardinality
constrained knapsack problem. Math. Programming 95, 71–90 (2003)
8. Golub, G.H.: Some modified matrix eigenvalue problems. SIAM Review 15, 318–
334 (1973)
9. Golub, G.H., van Loan, C.: Matrix Computations. Johns Hopkins University Press,
Baltimore (1996)
10. Frangioni, A., Gentile, C.: Perspective cuts for a class of convex 0-1 mixed integer
programs. Mathematical Programming 106, 225–236 (2006)
11. Frangioni, A., Gentile, C.: SDP Diagonalizations and Perspective Cuts for a Class
of Nonseparable MIQP. Oper. Research Letters 35, 181–185 (2007)
12. Günlük, O., Linderoth, J.: Perspective Relaxation of Mixed Integer Nonlinear Pro-
grams with Indicator Variables. In: Lodi, A., Panconesi, A., Rinaldi, G. (eds.)
IPCO 2008. LNCS, vol. 5035, pp. 1–16. Springer, Heidelberg (2008)
13. Moghaddam, B., Weiss, Y., Avidan, S.: Generalized spectral bounds for sparse
LDA. In: Proc. 23rd Int. Conf. on Machine Learning, pp. 641–648 (2006)
14. Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat.
Comput. 4, 553–572 (1983)
15. Pólik, I., Terlaky, T.: A survey of the S-lemma. SIAM Review 49, 371–418 (2007)
16. Rendl, F., Wolkowicz, H.: A semidefinite framework for trust region subproblems
with applications to large scale minimization. Math. Program 77, 273–299 (1997)
17. Stern, R.J., Wolkowicz, H.: Indefinite trust region subproblems and nonsymmetric
eigenvalue perturbations. SIAM J. Optim. 5, 286–313 (1995)
18. Sturm, J., Zhang, S.: On cones of nonnegative quadratic functions. Mathematics
of Operations Research 28, 246–267 (2003)
19. Miller, W., Wright, S., Zhang, Y., Schuster, S., Hayes, V.: Optimization methods for
selecting founder individuals for captive breeding or reintroduction of endangered
species (2009) (manuscript)
20. Yakubovich, V.A.: S-procedure in nonlinear control theory, vol. 1, pp. 62–77. Vest-
nik Leningrad University (1971)
21. Ye, Y., Zhang, S.: New results on quadratic minimization. SIAM J. Optim. 14,
245–267 (2003)
Restricted b-Matchings
in Degree-Bounded Graphs

Kristóf Bérczi and László A. Végh

MTA-ELTE Egerváry Research Group (EGRES),


Department of Operations Research, Eötvös Loránd University,
Pázmány Péter sétány 1/C, Budapest, Hungary, H-1117
{berkri,veghal}@cs.elte.hu

Abstract. We present a min-max formula and a polynomial time algorithm
for a slight generalization of the following problem: in a simple
undirected graph in which the degree of each node is at most t + 1, find a
maximum t-matching containing no member of a list K of forbidden Kt,t
and Kt+1 subgraphs. An analogous problem for bipartite graphs without
degree bounds was solved by Makai [15], while the special case of finding
a maximum square-free 2-matching in a subcubic graph was solved in [1].

Keywords: square-free, Kt,t -free, Kt+1 -free, b-matching, subcubic graph.

1 Introduction
Let G = (V, E) be an undirected graph and let b : V → Z+ be an upper bound
on the nodes. An edge set F ⊆ E is called a b-matching if dF (v), the number of
edges in F incident to v, is at most b(v) for each node v. (This is often called
simple b-matching in the literature.) For some integer t ≥ 2, by a t-matching
we mean a b-matching with b(v) = t for every v ∈ V . Let K be a set consisting
of Kt,t ’s, complete bipartite subgraphs of G on two colour classes of size t, and
Kt+1 ’s, complete subgraphs of G on t + 1 nodes. The node-set and the edge-set
of a subgraph K ∈ K are denoted by VK and EK , respectively. By a K-free b-
matching we mean a b-matching not containing any member of K. In this paper,
we give a min-max formula on the size of K-free b-matchings and a polynomial
time algorithm for finding one with maximum size (that is, a K-free b-matching
F ⊆ E with maximum cardinality) under the assumptions that for any K ∈ K
and any node v of K,
VK spans no parallel edges (1)
b(v) = t (2)
dG (v) ≤ t + 1. (3)
Note that this is a generalization of the problem mentioned in the abstract. The
most important special case of K-free b-matching is to find a maximum C3 -free
or C4 -free 2-matching in a graph where Ck stands for a cycle of length k. The
motivation for these problems is twofold. On the one hand, a natural relaxation
of the Hamiltonian cycle problem is to find a C≤k -free 2-factor, that is, a 2-factor
containing no cycle of length at most k. Cornuéjols and Pulleyblank [2] showed
this problem to be NP-complete for k ≥ 5. In his Ph.D. thesis [6], Hartvigsen
proposed a solution for the case k = 3. Hence the remaining question is to
find a maximum C≤4 -free 2-matching, and another natural question is to find a
maximum C4 -free 2-matching (possibly containing triangles).

Supported by the Hungarian National Foundation for Scientific Research (OTKA)
grant K60802.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 43–56, 2010.
© Springer-Verlag Berlin Heidelberg 2010

The other motivation comes from connectivity-augmentation, that is, when
one would like to make a graph G = (V, E) k-node-connected by the addition of
a minimum number of new edges. It is easy to see that for k = n − 2 (n = |V |)
this problem is equivalent to finding a maximum matching in the complement
graph of G. For k = n − 3 the problem is equivalent to finding a maximum
C4 -free 2-matching.
The C4 -free 2-matching problem admits two natural generalizations. The first
one is Kt,t -free t-matchings considered in this paper, while the second is t-
matchings containing no complete bipartite graph Ka,b with a + b = t + 2.
This latter problem is equivalent to connectivity augmentation for k = n − t − 1.
The complexity of connectivity augmentation for general k is yet open, while
connectivity augmentation by one, that is, when the input graph is already
(k − 1)-connected was recently solved in [20] (this corresponds to the case when
the graph contains no Ka,b with a + b = t + 3, in particular, d(v) ≤ t + 1).
The weighted versions of these problems are also of interest. The weighted
C≤k -free 2-matching problem asks for a C≤k -free 2-matching with maximum
weight for a weight function defined on the edge set. For k = 2 the problem
is just to find a 2-matching with maximum weight, while Király showed [11]
that the problem is NP-complete for k = 4 even in bipartite graphs with 0 − 1
weights on the edges. The case of k = 3 in general graphs is still open. Hartvigsen
and Li [9], and recently Kobayashi [12] gave polynomial-time algorithms for
the weighted C3 -free 2-matching problem in subcubic graphs with an arbitrary
weight function.
Let us now consider the special case of C4 -free 2-matchings in bipartite graphs.
This problem was solved by Hartvigsen [7,8] and Király [10]. A generalization of
the problem to maximum Kt,t -free t-matchings in bipartite graphs was given by
Frank [3] who observed that this is a special case of covering positively crossing
supermodular functions on set pairs, solved by Frank and Jordán in [4]. Makai
[15] generalized Frank’s theorem for the case when a list K of forbidden Kt,t ’s
is given (that is, a t-matching may contain Kt,t ’s not in K.) He gave a min-max
formula based on a polyhedral description for the minimum cost version for node-
induced cost functions. Pap [16] gave a further generalization of the maximum
cardinality version for excluded complete bipartite subgraphs and developed a
simple, purely combinatorial algorithm. For node induced cost functions, such
an algorithm was given by Takazawa [19] for Kt,t -free t-matching.
Much less is known when the underlying graph is not assumed to be bipartite
and finding a maximum C4 -free 2-matching is still open. The special case when
the graph is subcubic was solved by the first author and Kobayashi [1]. In terms
of connectivity augmentation, the equivalent problem is augmenting an (n − 4)-
connected graph to (n − 3) connected. Our theorem is a generalization of this
result.
It is worth mentioning that the polynomial solvability of the above problems
seems to show a strong connection with jump systems. In [18], Szabó proved that
for a list K of forbidden Kt,t and Kt+1 subgraphs the degree sequences of K-free t-
matchings form a jump system in any graph. Concerning bipartite graphs,
Kobayashi and Takazawa showed [14] that the degree sequences of C≤k -free 2-
matchings do not always form a jump system for k ≥ 6. These results are con-
sistent with the polynomial solvability of the C≤k -free 2-matching problem, even
when restricting it to bipartite graphs. Similar results are known about even fac-
tors due to [13]. Although Szabó’s result suggests that finding a maximum K-free
t-matching should be solvable in polynomial time, the problem is still open.
Among our assumptions, (1) and (2) may be considered as natural ones as
they hold for the maximum Kt,t -free t-matching problem in a simple graph.
We exclude parallel edges on the node sets of members of K in order to avoid
having two different Kt,t ’s on the same two colour classes or two Kt+1 ’s on the
same ground set. However, the degree bound (3) is a restrictive assumption and
dissipates essential difficulties. Our proof strongly relies on this and the theorem
cannot be straightforwardly generalized, as it can be shown by using the example
in Chapter 6 of [20].
The proof and algorithm use the contraction technique of [11], [16] and [1].
Our contribution on the one hand is the extension of this technique for t ≥ 2 and
forbidding Kt+1 ’s as well, while on the other hand the argument is significantly
simpler than the argument in [1].
Throughout the paper we use the following notation. For an undirected graph
G = (V, E), the set of edges induced by X ⊆ V is denoted by E[X]. For disjoint
subsets X, Y of V , E[X, Y ] denotes the set of edges between X and Y . The set of
nodes in V − X adjacent to X by some edge from F ⊆ E is denoted by ΓF (X).
We let dF (v) denote the number of edges in F ⊆ E incident to v, where loops
in G are counted twice, while dF (X, Y ) stands for the number of edges going
between disjoint subsets X and Y . For a node v ∈ V , we sometimes abbreviate
the set {v} by v, e.g. dF (v, X) is the number of edges between v and X. For a
set X ⊆ V , let $h_F(X) = \sum_{v \in X} d_F(v)$, the sum of the number of edges incident
to X and twice the number of edges spanned by X. We use $b(U) = \sum_{v \in U} b(v)$
for a function b : V → Z+ and a set U ⊆ V .
Let K be the list of forbidden Kt,t and Kt+1 subgraphs. For disjoint subsets
X, Y of V we denote by K[X] and K[X, Y ] the members of K contained in X
and having edges only between X and Y , respectively. That is, K[X, Y ] stands
for forbidden Kt,t ’s whose colour classes are subsets of X and Y . Recall that
VK and EK denote the node-set and edge-set of the forbidden graph K ∈ K,
respectively.
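The notation above translates directly into code. The following is a minimal sketch, assuming a graph is stored as a list of edges (u, v), the bound b as a dictionary, and each forbidden subgraph as the list of its edges; all function names are our own and purely illustrative.

```python
from collections import Counter


def degrees(F):
    """d_F(v) for every node v; a loop (v, v) is counted twice."""
    deg = Counter()
    for u, v in F:
        deg[u] += 1
        deg[v] += 1
    return deg


def is_b_matching(F, b):
    """True if d_F(v) <= b[v] for every node v touched by F."""
    return all(dv <= b[v] for v, dv in degrees(F).items())


def h(F, X):
    """h_F(X): the sum of d_F(v) over v in X."""
    deg = degrees(F)
    return sum(deg[v] for v in X)


def is_K_free(F, forbidden):
    """True if F contains the complete edge set of no forbidden subgraph.

    Edges are compared as unordered pairs, which is adequate here because,
    by assumption (1), the node set of a forbidden subgraph spans no
    parallel edges.
    """
    present = {frozenset(e) for e in F}
    return not any(all(frozenset(e) in present for e in EK) for EK in forbidden)
```

For example, with t = 2 and a forbidden square on nodes 0, 1, 2, 3, the full 4-cycle is a 2-matching that is not K-free, while any three of its edges form a K-free 2-matching.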
The rest of the paper is organized as follows. In Section 2 we formalize the
theorem and prove the trivial max ≤ min direction. Two shrinking operations
are introduced in Section 3, and Section 4 contains the proof of the max ≥ min
direction. Finally, the algorithm is presented in Section 5.

2 Main Theorem
Before stating our theorem, let us recall the well-known min-max formula on the
maximum size of a b-matching (see e.g. [17, Vol A, p. 562.]).

Theorem 1 (Maximum size of a b-matching). Let G = (V, E) be a graph
with an upper bound b : V → Z+ . The maximum size of a b-matching is equal to
the minimum value of
$$b(U) + |E[W]| + \sum_{T} \Big\lfloor \tfrac{1}{2} \big( b(T) + |E[T, W]| \big) \Big\rfloor \tag{4}$$
where U and W are disjoint subsets of V , and T ranges over the connected
components of G − U − W .
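On very small graphs, the min-max relation of Theorem 1 can be checked by exhaustive enumeration. The sketch below assumes simple graphs given as a node list and an edge list and reads the sum in (4) with each term rounded down; both routines take exponential time, and all names are hypothetical.

```python
from itertools import combinations, product


def max_b_matching(nodes, edges, b):
    """Size of a largest edge set F with d_F(v) <= b[v] for all v (brute force)."""
    for k in range(len(edges), -1, -1):
        for F in combinations(edges, k):
            deg = dict.fromkeys(nodes, 0)
            for u, v in F:
                deg[u] += 1
                deg[v] += 1
            if all(deg[v] <= b[v] for v in nodes):
                return k
    return 0


def components(nodes, edges):
    """Connected components of (nodes, edges), as a list of sets."""
    seen, comps = set(), []
    for s in nodes:
        if s in seen:
            continue
        cur, stack = {s}, [s]
        while stack:
            x = stack.pop()
            for u, v in edges:
                if u == x and v not in cur:
                    cur.add(v)
                    stack.append(v)
                elif v == x and u not in cur:
                    cur.add(u)
                    stack.append(u)
        seen |= cur
        comps.append(cur)
    return comps


def min_of_formula_4(nodes, edges, b):
    """Minimum of expression (4) over all disjoint U, W (brute force)."""
    best = None
    for labels in product("UWR", repeat=len(nodes)):
        part = dict(zip(nodes, labels))   # "R" marks the nodes of G - U - W
        rest = [v for v in nodes if part[v] == "R"]
        rest_edges = [(u, v) for u, v in edges if part[u] == "R" and part[v] == "R"]
        value = sum(b[v] for v in nodes if part[v] == "U")
        value += sum(1 for u, v in edges if part[u] == "W" and part[v] == "W")
        for T in components(rest, rest_edges):
            e_TW = sum(1 for u, v in edges
                       if (u in T and part[v] == "W") or (v in T and part[u] == "W"))
            value += (sum(b[v] for v in T) + e_TW) // 2
        best = value if best is None else min(best, value)
    return best


triangle = ([0, 1, 2], [(0, 1), (1, 2), (0, 2)])
star = ([0, 1, 2, 3], [(0, 1), (0, 2), (0, 3)])
```

For the triangle with b ≡ 1 the choice U = W = ∅ already attains the minimum, whereas for the star with b ≡ 1 the minimum is attained by taking U to be the centre.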

Let us now formulate our theorem. There are minor technical difficulties when
t = 2 that do not occur for larger t. In order to make both the formulation and
the proof simpler it is worth introducing the following definitions. We refer to
forbidden K2,2 and K3 subgraphs as squares and triangles, respectively.

Definition 2. For t = 2, we call a complete subgraph on four nodes square-full
if it contains three forbidden squares.

Note that, by assumption (3), every square-full subgraph is a connected component
of G. We denote the number of square-full components of G by S(G)
for t = 2, and define S(G) = 0 for t > 2. It is easy to see that a K-free b-matching
contains at most three edges from each square-full component of G.
The following definition will be used in the proof of the theorem.

Definition 3. For t = 2, a forbidden triangle is called square-covered if its node
set is contained in the node set of a forbidden square, otherwise uncovered.

The theorem is as follows.

Theorem 4. Let G = (V, E) be a graph with an upper bound b : V → Z+ and
K be a list of forbidden Kt,t and Kt+1 subgraphs of G so that (1), (2) and (3)
hold. Then the maximum size of a K-free b-matching is equal to the minimum
value of
$$b(U) + |E[W]| - |\dot{\mathcal{K}}[W]| + \sum_{T \in \mathcal{P}} \Big\lfloor \tfrac{1}{2} \big( b(T) + |E[T, W]| - |\dot{\mathcal{K}}[T, W]| \big) \Big\rfloor - S(G) \tag{5}$$
where U and W are disjoint subsets of V , P is a partition of the connected
components of G − U − W and K̇ ⊆ K is a collection of node-disjoint forbidden
subgraphs.


For fixed U, W, P and K̇ the value of (5) is denoted by τ (U, W, P, K̇). It is easy
to see that the contribution of a square-full component to (5) is always 3 and
a maximum K-free b-matching contains exactly 3 of its edges. Hence we may
count these components of G separately, so the following theorem immediately
implies the general one.

Theorem 5. Let G = (V, E) be a graph with an upper bound b : V → Z+ and
K be a list of forbidden Kt,t and Kt+1 subgraphs of G so that (1), (2) and (3)
hold. Furthermore, if t = 2, assume that G has no square-full component. Then
the maximum size of a K-free b-matching is equal to the minimum value of
$$b(U) + |E[W]| - |\dot{\mathcal{K}}[W]| + \sum_{T \in \mathcal{P}} \Big\lfloor \tfrac{1}{2} \big( b(T) + |E[T, W]| - |\dot{\mathcal{K}}[T, W]| \big) \Big\rfloor \tag{6}$$
where U and W are disjoint subsets of V , P is a partition of the connected
components of G − U − W and K̇ ⊆ K is a collection of node-disjoint forbidden
subgraphs.

Proof (of max ≤ min in Theorem 5). Let M be a K-free b-matching. Then clearly
|M ∩ (E[U ] ∪ E[U, V − U ])| ≤ b(U ) and |M ∩ E[W ]| ≤ |E[W ]| − |K̇[W ]|. Moreover,
for each T ∈ P we have
$$\begin{aligned}
2\,|M \cap (E[T] \cup E[T, W])| &= 2\,|M \cap E[T]| + 2\,|M \cap E[T, W]| \\
&\le 2\,|M \cap E[T]| + |M \cap E[T, W]| + |E[T, W]| - |\dot{\mathcal{K}}[T, W]| \\
&\le b(T) + |E[T, W]| - |\dot{\mathcal{K}}[T, W]|.
\end{aligned}$$
These together prove the inequality. □

3 Shrinking
In the proof of max ≥ min we use two shrinking operations to get rid of the Kt,t
and Kt+1 subgraphs in K.

Definition 6 (Shrinking a Kt,t subgraph). Let K be a Kt,t subgraph of
G = (V, E) with colour classes KA and KB . Shrinking K in G consists of the
following operations:
• identify the nodes in KA , and denote the corresponding node by ka ,
• identify the nodes in KB , and denote the corresponding node by kb , and
• replace the edges between KA and KB with t − 1 parallel edges between ka
and kb (we call the set of these edges a shrunk bundle between ka and kb ).

Fig. 1. Shrinking a Kt,t subgraph (the colour classes KA and KB become the nodes
ka and kb , joined by t − 1 edges)

When identifying the nodes in KA and KB , the edges (and also loops) spanned
by KA and KB are replaced by loops on ka and kb , respectively. Each edge
e ∈ E − EK is denoted by e again after shrinking a Kt,t subgraph and is called
the image of the original edge. By abuse of notation, for an edge set F ⊆ E − EK ,
the corresponding subset of edges in the contracted graph is also denoted by F .
Hence for an edge set F ⊆ E − EK we have hF (KA ) = dF (ka ), hF (KB ) = dF (kb ).
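The shrinking operation for a Kt,t subgraph can be sketched on a multigraph stored as a list of edges. This is an illustrative sketch only: the function name, the default labels "ka" and "kb", and the example graph are assumptions, and the code relies on assumption (1), under which every edge between KA and KB belongs to EK.

```python
def shrink_ktt(edges, KA, KB, t, ka="ka", kb="kb"):
    """Shrink the K_{t,t} with colour classes KA and KB.

    edges is a list of pairs (u, v); parallel edges and loops are allowed.
    The t*t edges between KA and KB are replaced by a shrunk bundle of
    t - 1 parallel edges (ka, kb). Every other edge is kept, with its
    endpoints relabelled, so edges inside KA or KB become loops.
    """
    KA, KB = set(KA), set(KB)

    def image(v):
        return ka if v in KA else kb if v in KB else v

    shrunk = []
    for u, v in edges:
        between = (u in KA and v in KB) or (u in KB and v in KA)
        if not between:
            shrunk.append((image(u), image(v)))
    shrunk.extend([(ka, kb)] * (t - 1))
    return shrunk


# Example with t = 2: a square 0-1-2-3 with two pendant edges.
example_edges = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 4), (1, 5)]
example_shrunk = shrink_ktt(example_edges, {0, 2}, {1, 3}, t=2)
```

In the example the four edges of the square are replaced by a single edge between ka and kb, and the two pendant edges keep their outer endpoints.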
Definition 7 (Shrinking a Kt+1 subgraph). Let K be a Kt+1 subgraph of
G = (V, E). Shrinking K in G consists of the following operations:
• identify the nodes in VK , and denote the corresponding node by k,
• replace the edges in EK by $\lfloor \frac{t+1}{2} \rfloor - 1$ loops on the new node k.

Fig. 2. Shrinking a Kt+1 subgraph (the node set VK becomes a single node carrying
$\lfloor \frac{t+1}{2} \rfloor - 1$ loops)

Again, for an edge set F ⊆ E − EK , the corresponding subset of edges in the
contracted graph is also denoted by F .
We usually denote the graph obtained by applying one of the shrinking operations
by G◦ = (V ◦ , E ◦ ). Throughout the section, the graph G, the function
b and the list K of forbidden subgraphs are supposed to satisfy the conditions
of Theorem 5. It is easy to see, by using (3), that two members of K are edge-disjoint
if and only if they are also node-disjoint, hence we simply call such pairs
disjoint.
The following two lemmas give the connection between the maximum size
of a K-free b-matching in G and a K◦ -free b◦ -matching in G◦ where b◦ is a
properly defined upper bound on V ◦ and K◦ is a list of forbidden subgraphs in
the contracted graph.
Restricted b-Matchings in Degree-Bounded Graphs 49

Lemma 8. Let $G^\circ = (V^\circ, E^\circ)$ be the graph obtained by shrinking a $K_{t,t}$ subgraph $K$.
Let $\mathcal{K}^\circ$ be the set of forbidden subgraphs disjoint from $K$ and define
$b^\circ$ as $b^\circ(v) = b(v)$ for $v \in V - V_K$ and $b^\circ(k_a) = b^\circ(k_b) = t$. Then the difference
between the maximum size of a $\mathcal{K}$-free $b$-matching in $G$ and the maximum size
of a $\mathcal{K}^\circ$-free $b^\circ$-matching in $G^\circ$ is exactly $t^2 - t$.
Lemma 9. Let $G^\circ = (V^\circ, E^\circ)$ be the graph obtained by shrinking a $K_{t+1}$ subgraph
$K \in \mathcal{K}$ where $K$ is uncovered if $t = 2$. Let $\mathcal{K}^\circ$ be the set of forbidden subgraphs
disjoint from $K$ and define $b^\circ$ as $b^\circ(v) = b(v)$ for $v \in V - V_K$, $b^\circ(k) = t$ if
$t$ is even and $b^\circ(k) = t + 1$ if $t$ is odd. Then the difference between the maximum
size of a $\mathcal{K}$-free $b$-matching in $G$ and the maximum size of a $\mathcal{K}^\circ$-free $b^\circ$-matching
in $G^\circ$ is exactly $\lfloor t^2/2 \rfloor$.

The proof of Lemma 8 is based on the following claim.


Claim 10. Assume that $K \in \mathcal{K}$ is a $K_{t,t}$ subgraph with colour classes $K_A$ and
$K_B$ and $M'$ is a $\mathcal{K}$-free $b$-matching of $G - E_K$. Then $M'$ can be extended to a
$\mathcal{K}$-free $b$-matching $M$ of $G$ with $|M| = |M'| + t^2 - \max\{1, h_{M'}(K_A), h_{M'}(K_B)\}$.
Proof. First we consider the case $t \ge 3$. Let $P$ be a minimum size matching of
$K$ covering each node $v \in V_K$ with $d_{M'}(v) = 1$ (note that $d_{M'}(v) \le 1$ for $v \in V_K$
as $d(v) \le t + 1$). If there is no such node, then let $P$ consist of an arbitrary
edge in $E_K$. We claim that $M = M' \cup (E_K - P)$ satisfies the above conditions.
Indeed, $M$ is a $b$-matching and $|M \cap E_K| = t^2 - \max\{1, h_{M'}(K_A), h_{M'}(K_B)\}$
clearly holds, so we only have to verify that it is also $\mathcal{K}$-free.
Assume that there is a forbidden $K_{t,t}$ subgraph $K'$ in $M$ with colour classes
$K'_A$, $K'_B$. $E_{K'}$ must contain an edge $uv \in E_K \cap M$ with $u \in K_A$ and $v \in K_B$.
By symmetry, we may assume that $u \in K'_A$. As $b(u) = t$, $\Gamma_M(u) = K'_B$ and also
$|\Gamma_M(u) \cap K_B| \ge t - 1$. Hence $|K_B \cap K'_B| \ge t - 1$. Consider a node $z \in K_A$.
Since $d_M(z, K_B) \ge t - 1$ and $t \ge 3$, we get $d_M(z, K'_B) > 0$, thus $K_A \subseteq \Gamma_M(K'_B)$.
Because of $\Gamma_M(K'_B) = K'_A$, this gives $K_A = K'_A$. $K_B = K'_B$ follows similarly,
giving a contradiction.
If there is a forbidden $K_{t+1}$ subgraph $K'$ in $M$, then $E_{K'}$ must contain an
edge $uv \in E_K \cap M$, $u \in K_A$. As above, $|V_{K'} \cap K_B| \ge t - 1$. Using $t \ge 3$
again, $K_A \subseteq \Gamma_M(V_{K'} \cap K_B) \subseteq V_{K'}$. But $K_A \subseteq V_{K'}$ is a contradiction since
$t + 1 = |V_{K'}| \ge |V_{K'} \cap K_A| + |V_{K'} \cap K_B| \ge t + t - 1 = 2t - 1 > t + 1$.
Now let $t = 2$ and $K_A = \{v_1, v_3\}$, $K_B = \{v_2, v_4\}$. If $\max\{h_{M'}(K_A), h_{M'}(K_B)\} \le 1$,
then we may assume by symmetry that $d_{M'}(v_1) = d_{M'}(v_2) = 0$. Clearly,
$M = M' \cup \{v_1v_2, v_1v_4, v_2v_3\}$ is a $\mathcal{K}$-free 2-matching. If $\max\{h_{M'}(K_A), h_{M'}(K_B)\} = 2$,
we claim that at least one of $M_1 = M' \cup \{v_1v_2, v_3v_4\}$ and
$M_2 = M' \cup \{v_1v_4, v_2v_3\}$ is $\mathcal{K}$-free. Assume $M_1$ contains a forbidden square or triangle $K'$;
by symmetry assume it contains the edge $v_1v_2$. If $K'$ contains $v_3v_4$ as well, then
$K'$ is the square $v_1v_3v_4v_2$. Otherwise, it consists of $v_1v_2$ and a path $L$ of length
2 or 3 between $v_1$ and $v_2$, not containing $v_3$ and $v_4$. In the first case, the only
forbidden subgraph possibly contained in $M_2$ is the square $v_1v_3v_2v_4$, implying
that $\{v_1, v_2, v_3, v_4\}$ is a square-full component, a contradiction. In the latter case,
it is easy to see that $M_2$ cannot contain a forbidden subgraph. □


Proof (of Lemma 8). First we show that if $M$ is a $\mathcal{K}$-free $b$-matching in $G$ then
there is a $\mathcal{K}^\circ$-free $b^\circ$-matching $M^\circ$ in $G^\circ$ with $|M^\circ| \ge |M| - (t^2 - t)$. Let
$M' = M - E_K$. Clearly, $|M \cap E_K| \le t^2 - \max\{1, h_{M'}(K_A), h_{M'}(K_B)\}$. In $G^\circ$, let $M^\circ$ be
the union of $M'$ and $t - \max\{1, d_{M'}(k_a), d_{M'}(k_b)\}$ parallel edges from the shrunk
bundle between $k_a$ and $k_b$. It is easy to see that $M^\circ$ is a $\mathcal{K}^\circ$-free $b^\circ$-matching in
$G^\circ$ with $|M^\circ| \ge |M| - (t^2 - t)$.
The proof is completed by showing that for an arbitrary $\mathcal{K}^\circ$-free $b^\circ$-matching
$M^\circ$ in $G^\circ$ there exists a $\mathcal{K}$-free $b$-matching $M$ in $G$ with $|M| \ge |M^\circ| + (t^2 - t)$.
Let $H$ denote the set of parallel edges in the shrunk bundle between $k_a$ and $k_b$,
and let $M' = M^\circ - H$. Now $|M^\circ \cap H| \le t - \max\{1, d_{M'}(k_a), d_{M'}(k_b)\}$ and, by
Claim 10, $M'$ may be extended to a $\mathcal{K}$-free $b$-matching $M$ in $G$ with
$|M \cap E_K| = t^2 - \max\{1, h_{M'}(K_A), h_{M'}(K_B)\}$, that is

$$|M| = |M^\circ| - |M^\circ \cap H| + |M \cap E_K| \ge |M^\circ| - \big(t - \max\{1, d_{M'}(k_a), d_{M'}(k_b)\}\big) + \big(t^2 - \max\{1, h_{M'}(K_A), h_{M'}(K_B)\}\big) \ge |M^\circ| + (t^2 - t).$$ □


Lemma 9 can be proved in a similar way by using the following claim.

Claim 11. Assume that $K \in \mathcal{K}$ is a $K_{t+1}$ subgraph and $M'$ is a $\mathcal{K}$-free
$b$-matching of $G - E_K$. If $t = 2$, then assume that $K$ is uncovered. Then $M'$ can
be extended to obtain a $\mathcal{K}$-free $b$-matching $M$ of $G$ with
$|M| = |M'| + \binom{t+1}{2} - \lceil \max\{1, h_{M'}(V_K)\}/2 \rceil$.

Proof. Let $P$ be a minimum size subgraph of $K$ covering each node $v \in V_K$ with
$d_{M'}(v) = 1$ (so $P$ is a matching or a matching and one more edge in $E_K$). If there
is no such node, then let $P$ consist of an arbitrary edge in $E_K$. For $t = 2$ and
$3$, we will choose $P$ in a specific way, as given later in the proof. We show that
$M = M' \cup (E_K - P)$ satisfies the above conditions. Indeed, $M$ is a $b$-matching
and $|M \cap E_K| = \binom{t+1}{2} - \lceil \max\{1, h_{M'}(V_K)\}/2 \rceil$ clearly holds, so we only have to show
that it is also $\mathcal{K}$-free.
Assume that there is a forbidden $K_{t+1}$ subgraph $K'$ in $M$. $E_{K'}$ must contain
an edge $uv \in E_K \cap M$. By the minimal choice of $P$ at least one of
$|\Gamma_M(u) \cap V_K| \ge t - 1$ and $|\Gamma_M(v) \cap V_K| \ge t - 1$ is satisfied, which implies
$|V_{K'} \cap V_K| \ge t - 1$. For $t \ge 3$ this immediately implies
$V_K \subseteq \Gamma_M(V_{K'} \cap V_K) \subseteq V_{K'}$, a contradiction.
If $t = 2$, then $|V_{K'} \cap V_K| \ge 1$ does not imply $V_K \subseteq V_{K'}$ and an improper
choice of $P$ may enable $M$ to contain a forbidden $K_3$. The only such case is
when $h_{M'}(V_K) = 3$, $V_K = \{v_1, v_2, v_3\}$, $V_{K'} = \{v_2, v_3, v_4\}$, $v_2v_4, v_3v_4 \in M'$ and
$P = \{v_1v_2, v_1v_3\}$ (Figure 3). In this case, we may remove the edge incident to $v_1$
from $M'$ and then $P = \{v_2v_3\}$ is a good choice. Indeed, the only problem could
be that $v_1v_2v_3v_4$ is a forbidden square, contradicting $K$ being uncovered.
Otherwise $h_{M'}(V_K) \le 2$ implies $|P| \le 1$. Hence at least one of $|\Gamma_M(u) \cap V_K| = 2$
and $|\Gamma_M(v) \cap V_K| = 2$ is satisfied, meaning $K' = K$, a contradiction again.
Now assume that there is a forbidden $K_{t,t}$ subgraph $K'$ in $M$ with colour
classes $K'_A$, $K'_B$. The same argument gives a contradiction for $t \ge 4$. However,
in case of $t = 3$, choosing $P$ arbitrarily may enable $M$ to contain a forbidden
$K_{3,3}$ in the following single configuration: $V_K = \{v_1, v_2, v_3, v_4\}$, $K'_A = \{v_1, v_2, x\}$,
$K'_B = \{v_3, v_4, y\}$, $xv_3, xv_4, yv_1, yv_2, xy \in M'$ and $P = \{v_1v_2, v_3v_4\}$ (Figure 4).
In this case, $P = \{v_1v_4, v_2v_3\}$ is a good choice.
Finally, for $t = 2$ no forbidden square appears if $h_{M'}(V_K) \le 2$ as otherwise
$K$ would be a square-covered triangle. If $h_{M'}(V_K) = 3$, then such a square $K'$
may appear only if $V_K = \{v_1, v_2, v_3\}$, $V_{K'} = \{v_2, v_3, v_4, v_5\}$, $v_3v_4, v_4v_5, v_5v_2 \in M'$,
$P = \{v_1v_2, v_1v_3\}$ ($v_1 \ne v_4, v_5$ as $K$ is uncovered). In this case both
$P = \{v_1v_2, v_2v_3\}$ and $P = \{v_1v_3, v_2v_3\}$ give a proper $M$ (Figure 5). □

Fig. 3. Choice of P for t = 2 in the proof of Claim 11 (legend: edges in M′, edges in P)

Fig. 4. Choice of P for t = 3 in the proof of Claim 11 (legend: edges in M′, edges in P, edges in E \ (P ∪ M′))

Fig. 5. Choice of P for t = 2 in the proof of Claim 11 (legend: edges in M′, edges in P)

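Proof (of Lemma 9). First we show that if $M$ is a $\mathcal{K}$-free $b$-matching in $G$ then
there is a $\mathcal{K}^\circ$-free $b^\circ$-matching $M^\circ$ in $G^\circ$ with $|M^\circ| \ge |M| - \lfloor t^2/2 \rfloor$. Let
$M' = M - E_K$. Clearly, $|M \cap E_K| \le \binom{t+1}{2} - \lceil \max\{1, h_{M'}(V_K)\}/2 \rceil$. In $G^\circ$, let $M^\circ$ be the
union of $M'$ and $\lfloor (t - \max\{1, d_{M'}(k)\})/2 \rfloor$ or $\lfloor (t + 1 - \max\{1, d_{M'}(k)\})/2 \rfloor$ loops on $k$ depending
on whether $t$ is even or not, respectively. It is easy to see that $M^\circ$ is a $\mathcal{K}^\circ$-free
$b^\circ$-matching in $G^\circ$ with $|M^\circ| \ge |M| - \lfloor t^2/2 \rfloor$.
The proof is completed by showing that for an arbitrary $\mathcal{K}^\circ$-free $b^\circ$-matching
$M^\circ$ in $G^\circ$ there exists a $\mathcal{K}$-free $b$-matching $M$ in $G$ with $|M| \ge |M^\circ| + \lfloor t^2/2 \rfloor$.
Let $H$ denote the set of loops on $k$ obtained when shrinking $K$, and let
$M' = M^\circ - H$. Now $|M^\circ \cap H| \le \lfloor (t - \max\{1, d_{M'}(k)\})/2 \rfloor$ if $t$ is even and
$|M^\circ \cap H| \le \lfloor (t + 1 - \max\{1, d_{M'}(k)\})/2 \rfloor$ if $t$ is odd. By Claim 11, $M'$ can be extended
(or modified) so as to get a $\mathcal{K}$-free $b$-matching $M$ in $G$ with
$|M \cap E_K| = \binom{t+1}{2} - \lceil \max\{1, h_{M'}(V_K)\}/2 \rceil$, that is

$$|M| = |M^\circ| - |M^\circ \cap H| + |M \cap E_K| \ge |M^\circ| - \left\lfloor \frac{t - \max\{1, d_{M'}(k)\}}{2} \right\rfloor + \binom{t+1}{2} - \left\lceil \frac{\max\{1, h_{M'}(V_K)\}}{2} \right\rceil \ge |M^\circ| + \left\lfloor \frac{t^2}{2} \right\rfloor$$

if $t$ is even and

$$|M| = |M^\circ| - |M^\circ \cap H| + |M \cap E_K| \ge |M^\circ| - \left\lfloor \frac{t + 1 - \max\{1, d_{M'}(k)\}}{2} \right\rfloor + \binom{t+1}{2} - \left\lceil \frac{\max\{1, h_{M'}(V_K)\}}{2} \right\rceil \ge |M^\circ| + \left\lfloor \frac{t^2}{2} \right\rfloor$$

if $t$ is odd. □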
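The counting behind Lemmas 8 and 9 can be checked numerically. The sketch below is not part of the proofs; it only evaluates, for small t, the number of edges gained inside K minus the number of bundle edges or loops given up, under the assumption that the degree m = max{1, d} of the shrunk node agrees with the corresponding h-value in G. The function names are hypothetical.

```python
from math import comb


def gain_bipartite(t, m):
    """Edges of K kept in G minus bundle edges used in the shrunk graph.

    m stands for max{1, h_{M'}(K_A), h_{M'}(K_B)}, which is assumed to
    equal max{1, d_{M'}(k_a), d_{M'}(k_b)}.
    """
    return (t * t - m) - (t - m)


def gain_complete(t, m):
    """Same difference for a K_{t+1}; m stands for max{1, h_{M'}(V_K)}."""
    cap = t if t % 2 == 0 else t + 1          # b°(k) as in Lemma 9
    loops = (cap - m) // 2                    # floor((cap - m) / 2)
    in_k = comb(t + 1, 2) - (m + 1) // 2      # binom(t+1, 2) - ceil(m / 2)
    return in_k - loops


def check(max_t):
    """Return True if both differences match the lemmas for 2 <= t <= max_t."""
    for t in range(2, max_t + 1):
        cap = t if t % 2 == 0 else t + 1
        if any(gain_bipartite(t, max(1, d)) != t * t - t for d in range(t + 1)):
            return False
        if any(gain_complete(t, max(1, d)) != t * t // 2 for d in range(cap + 1)):
            return False
    return True
```

For every admissible degree the difference is t² − t in the bipartite case and ⌊t²/2⌋ in the complete case, which is why the final inequalities in both proofs hold regardless of the value of the maximum.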


4 Proof of Theorem 5
We prove max ≥ min by induction on $|\mathcal{K}|$. For $\mathcal{K} = \emptyset$, this is simply a consequence
of Theorem 1.
Assume now that $\mathcal{K} \ne \emptyset$ and let $K$ be a forbidden subgraph such that $K$ is
uncovered if $t = 2$. Let $G^\circ = (V^\circ, E^\circ)$ denote the graph obtained by shrinking
$K$, let $b^\circ$ be defined as in Lemma 8 or 9 depending on whether $K$ is a $K_{t,t}$ or a
$K_{t+1}$. We denote by $\mathcal{K}^\circ$ the list of forbidden subgraphs disjoint from $K$.
By induction, the maximum size of a $\mathcal{K}^\circ$-free $b^\circ$-matching in $G^\circ$ is equal to the
minimum value of $\tau(U^\circ, W^\circ, P^\circ, \dot{\mathcal{K}}^\circ)$. Let us choose an optimal $U^\circ, W^\circ, P^\circ, \dot{\mathcal{K}}^\circ$
so that $|U^\circ|$ is minimal. The following claim gives a useful property of $U^\circ$.
Claim 12. Assume that $v \in U$ is such that $d(v, W) + |\Gamma(v) \cap (V - W)| \le b(v) + 1$.
Then $\tau(U - v, W, P', \dot{\mathcal{K}}) \le \tau(U, W, P, \dot{\mathcal{K}})$ where $P'$ is obtained from $P$ by replacing
its members incident to $v$ by their union plus $v$.
Proof. By removing $v$ from $U$, $b(U)$ decreases by $b(v)$. $|E[W]| - |\dot{\mathcal{K}}[W]|$ remains
unchanged, while the bound on $d(v, W) + |\Gamma(v) \cap (V - W)|$ implies that the
increment in the sum over the components of $G - U - W$ is at most $b(v)$. □


Case 1: K is a Kt,t with colour classes KA and KB .


By Lemma 8, the difference between the maximum size of a $\mathcal{K}$-free $b$-matching
in $G$ and the maximum size of a $\mathcal{K}^\circ$-free $b^\circ$-matching in $G^\circ$ is exactly $t^2 - t$. We
will define $U, W, P$ and $\dot{\mathcal{K}}$ such that

$$\tau(U, W, P, \dot{\mathcal{K}}) = \tau(U^\circ, W^\circ, P^\circ, \dot{\mathcal{K}}^\circ) + t^2 - t. \tag{7}$$

The shrinking replaces $K_A$ and $K_B$ by two nodes $k_a$ and $k_b$ with $t - 1$ parallel
edges between them. Let $U, W$ and $P$ denote the pre-images of $U^\circ, W^\circ, P^\circ$ in $G$,
respectively and let $\dot{\mathcal{K}} = \dot{\mathcal{K}}^\circ \cup \{K\}$. By (3), $d_{G^\circ - k_b}(k_a), d_{G^\circ - k_a}(k_b) \le t$. Since
$b^\circ(k_a) = b^\circ(k_b) = t$, Claim 12 and the minimal choice of $|U^\circ|$ imply that if
$k_a \in U^\circ$, then $k_b \in W^\circ$.
Hence we have the following cases ($T^\circ$ denotes a member of $P^\circ$). In each
case we are only considering those terms in $\tau(U^\circ, W^\circ, P^\circ, \dot{\mathcal{K}}^\circ)$ that change when
taking $\tau(U, W, P, \dot{\mathcal{K}})$ instead.
• $k_a \in U^\circ$, $k_b \in W^\circ$: $b(U) = b^\circ(U^\circ) + t^2 - t$.
• $k_a, k_b \in W^\circ$: $|E[W]| = |E^\circ[W^\circ]| + t^2 - t + 1$ and $|\dot{\mathcal{K}}[W]| = |\dot{\mathcal{K}}^\circ[W^\circ]| + 1$.
• $k_a \in W^\circ$, $k_b \in T^\circ$: $|E[T, W]| = |E^\circ[T^\circ, W^\circ]| + t^2 - t + 1$, $b(T) = b^\circ(T^\circ) + t^2 - t$
and $|\dot{\mathcal{K}}[T, W]| = |\dot{\mathcal{K}}^\circ[T^\circ, W^\circ]| + 1$ (see Figure 6 for an example).
• $k_a \in T^\circ$, $k_b \in W^\circ$: similar to the previous case.
• $k_a, k_b \in T^\circ$: $b(T) = b^\circ(T^\circ) + 2t^2 - 2t$.
(7) is satisfied in each of the above cases, hence we are done. Note that in the first
and the last case we may leave out $K$ from $\dot{\mathcal{K}}$ as it is not counted in any term.

Fig. 6. Extending M◦ (example with a forbidden K3,3 : τ(U◦, W◦, P◦, K̇◦) = 5 in the shrunk graph and τ(U, W, P, K̇) = 5 + 3² − 3 = 11 in the original graph)

Case 2: $K$ is a $K_{t+1}$.
By Lemma 9, the difference between the maximum size of a $\mathcal{K}$-free $b$-matching in
$G$ and the maximum size of a $\mathcal{K}^\circ$-free $b^\circ$-matching in $G^\circ$ is $\lfloor t^2/2 \rfloor$. We show that
the pre-images $U, W$ and $P$ of $U^\circ, W^\circ$ and $P^\circ$ with $\dot{\mathcal{K}} = \dot{\mathcal{K}}^\circ \cup \{K\}$ satisfy

$$\tau(U, W, P, \dot{\mathcal{K}}) = \tau(U^\circ, W^\circ, P^\circ, \dot{\mathcal{K}}^\circ) + \left\lfloor \frac{t^2}{2} \right\rfloor. \tag{8}$$

After shrinking $K = (V_K, E_K)$ we get a new node $k$ with $\lfloor (t+1)/2 \rfloor - 1$ loops on
it. (3) implies that there are at most $t + 1$ non-loop edges incident to $k$. Since
$b^\circ(k) \ge t$, Claim 12 implies $k \notin U^\circ$. Hence we have the following two cases ($T^\circ$
denotes a member of $P^\circ$).
• $k \in W^\circ$: $|E[W]| = |E^\circ[W^\circ]| + \binom{t+1}{2} - \lfloor (t+1)/2 \rfloor + 1$ and $|\dot{\mathcal{K}}[W]| = |\dot{\mathcal{K}}^\circ[W^\circ]| + 1$.
• $k \in T^\circ$: $b(T) = b^\circ(T^\circ) + t^2$ if $t$ is even and $b(T) = b^\circ(T^\circ) + t^2 - 1$ for an
odd $t$.

(8) is satisfied in both cases, hence we are done. We may also leave out $K$ from
$\dot{\mathcal{K}}$ in the second case as it is not counted in any term. □


5 Algorithm
In this section we show how the proof of Theorem 5 immediately yields an
algorithm for finding a maximum K-free b-matching in strongly polynomial time.
In such problems, an important question from an algorithmic point of view is how
K is represented. For example, in the K-free b-matching problem for bipartite
graphs solved by Pap in [16], the set of excluded subgraphs may be exponentially
large. Therefore Pap assumes that K is given by a membership oracle, that is,
a subroutine is given for determining whether a given subgraph is a member
of K. However, with such an oracle there is no general method for determining
whether K = ∅. Fortunately, we do not have to tackle such problems: by the next
claim, we may assume that K is given explicitly, as its size is linear in n. We use
n = |V |, m = |E| for the number of nodes and edges of the graph, respectively.

Claim 13. If the graph $G = (V, E)$ satisfies (1) and (3), then the total number
of $K_{t,t}$ and $K_{t+1}$ subgraphs is bounded by $\frac{(t+3)n}{2}$.

Proof. Assume that $v \in V$ is contained in a forbidden subgraph and so $d_G(v) = t + 1$.
If we select an edge incident to $v$, the remaining $t$ edges may be contained
in at most one $K_{t+1}$ subgraph, hence the number of $K_{t+1}$'s containing $v$ is at
most $t + 1$. However, these $t$ edges also determine one of the colour classes
of those $K_{t,t}$'s containing them. If we pick a node $v'$ from this colour class
(implying $d_G(v') = t + 1$), pick an edge incident to $v'$ (but not to $v$), then the
remaining $t$ edges, if they do so, exactly determine the other colour class of a $K_{t,t}$
subgraph. Therefore the number of $K_{t,t}$ subgraphs containing $v$ is bounded by
$(t + 1)t = t^2 + t$. Hence the total number of forbidden $K_{t,t}$ and $K_{t+1}$ subgraphs
is at most $\frac{(t^2+t)n}{2t} + \frac{(t+1)n}{t+1} = \frac{(t+3)n}{2}$. □
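The last step of the proof is a double-counting argument: summing the per-node bounds over all n nodes counts every Kt,t once for each of its 2t nodes and every Kt+1 once for each of its t + 1 nodes. A minimal exact-arithmetic check of the resulting expression, with the hypothetical helper name `claim13_bound`:

```python
from fractions import Fraction


def claim13_bound(t, n):
    """Upper bound on the number of forbidden subgraphs from Claim 13."""
    # at most t^2 + t copies of K_{t,t} per node, each counted 2t times
    bipartite = Fraction((t * t + t) * n, 2 * t)
    # at most t + 1 copies of K_{t+1} per node, each counted t + 1 times
    complete = Fraction((t + 1) * n, t + 1)
    return bipartite + complete
```

For all positive t and n the value equals (t + 3)n/2, so the list of forbidden subgraphs has size linear in n for fixed t.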


Now we turn to the algorithm. First we choose an inclusionwise maximal subset
$\mathcal{H} = \{H_1, \dots, H_k\}$ of disjoint forbidden subgraphs greedily. For $t = 2$, let us
always choose squares as long as possible and then go on with triangles. This
can be done in $O(t^3 n)$ time as follows. Maintain an array of size $m$ that encodes
for each edge whether it is used in one of the selected forbidden subgraphs or
not. When increasing $\mathcal{H}$, one only has to check whether any of the edges of the
examined forbidden subgraph is already used, which takes $O(t^2)$ time. This and
Claim 13 together give an $O(t^3 n)$ bound.
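The greedy selection with a used-edge array can be sketched as follows. The representation is an assumption made for illustration: every forbidden subgraph is given as a list of edge indices in range(m), and the function name `greedy_disjoint` and the flag `squares_first` are hypothetical. Marking edges suffices because, as noted earlier, two members of the list are edge-disjoint if and only if they are node-disjoint.

```python
def greedy_disjoint(candidates, m, squares_first=False):
    """Pick an inclusionwise maximal family of pairwise disjoint subgraphs.

    candidates: list of forbidden subgraphs, each a list of edge indices.
    m: number of edges of the graph.
    squares_first: for t = 2, examine 4-edge subgraphs (squares) before
    3-edge subgraphs (triangles).
    Returns the indices of the chosen candidates in selection order.
    """
    order = list(range(len(candidates)))
    if squares_first:
        # stable sort: larger edge sets (squares) come first
        order.sort(key=lambda i: -len(candidates[i]))
    used = [False] * m          # used[e] is True once edge e is taken
    chosen = []
    for i in order:
        edge_ids = candidates[i]
        if any(used[e] for e in edge_ids):
            continue            # overlaps an already selected subgraph
        for e in edge_ids:
            used[e] = True
        chosen.append(i)
    return chosen
```

Each candidate is examined once and the check touches only its own edges, which is where the O(t²) cost per subgraph comes from.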
Let us shrink the members of $\mathcal{H}$ simultaneously (this can be easily done since
they are disjoint), resulting in a graph $G' = (V', E')$ with a bound $b' : V' \to \mathbb{Z}_+$
and no forbidden subgraphs since $\mathcal{H}$ was maximal. One can find a maximum
$b'$-matching $M'$ in $G'$ in $O(|V'||E'| \log |V'|) = O(nm \log m)$ time as in [5]. Using the
constructions described in Lemmas 8 and 9 for $H_k, \dots, H_1$, this can be modified
into a maximum $\mathcal{K}$-free $b$-matching $M$. Note that, for $t = 2$, $H_i$ is always uncovered
in the actual graph by the selection rule. A dual optimal solution $U, W, P, \dot{\mathcal{K}}$ can
be constructed simultaneously as in the proof of Theorem 5. The best time bound
of the shrinking and extension steps may depend on the data structure used and
the representation of the graph. In any case, one such step may be performed in
$O(m)$ time and $|\mathcal{H}| = O(n)$, hence the total running time is $O(t^3 n + nm \log m)$.

References
1. Bérczi, K., Kobayashi, Y.: An Algorithm for (n − 3)–Connectivity Augmentation
Problem: Jump System Approach. Technical report, Department of Mathematical
Engineering, University of Tokyo, METR 2009-12
2. Cornuéjols, G., Pulleyblank, W.: A Matching Problem With Side Conditions. Dis-
crete Math. 29, 135–139 (1980)
3. Frank, A.: Restricted t-matchings in Bipartite Graphs. Discrete Appl. Math. 131,
337–346 (2003)
4. Frank, A., Jordán, T.: Minimal Edge-Coverings of Pairs of Sets. J. Comb. Theory
Ser. B 65, 73–110 (1995)
5. Gabow, H.N.: An Efficient Reduction Technique for Degree-Constrained Subgraph
and Bidirected Network Flow Problems. In: STOC ’83: Proceedings of the fifteenth
annual ACM symposium on Theory of computing, pp. 448–456. ACM, New York
(1983)
6. Hartvigsen, D.: Extensions of Matching Theory. PhD Thesis, Carnegie-Mellon Uni-
versity (1984)
7. Hartvigsen, D.: The Square-Free 2-factor Problem in Bipartite Graphs. In:
Cornuéjols, G., Burkard, R.E., Woeginger, G.J. (eds.) IPCO 1999. LNCS, vol. 1610,
pp. 234–241. Springer, Heidelberg (1999)
8. Hartvigsen, D.: Finding maximum square-free 2-matchings in bipartite graphs. J.
Comb. Theory Ser. B 96, 693–705 (2006)
9. Hartvigsen, D., Li, Y.: Triangle-Free Simple 2-matchings in Subcubic Graphs (Ex-
tended Abstract). In: Fischetti, M., Williamson, D.P. (eds.) IPCO 2007. LNCS,
vol. 4513, pp. 43–52. Springer, Heidelberg (2007)
10. Király, Z.: C4 -free 2-factors in Bipartite Graphs. Technical report, Egerváry Re-
search Group, Department of Operations Research, Eötvös Loránd University, Bu-
dapest, TR-2001-13 (2001)
11. Király, Z.: Restricted t-matchings in Bipartite Graphs. Technical report, Egerváry
Research Group, Department of Operations Research, Eötvös Loránd University,
Budapest, TR-2009-04 (2009)
12. Kobayashi, Y.: A Simple Algorithm for Finding a Maximum Triangle-Free
2-matching in Subcubic Graphs. Technical report, Department of Mathematical
Engineering, University of Tokyo, METR 2009-26 (2009)
13. Kobayashi, Y., Takazawa, K.: Even Factors, Jump Systems, and Discrete Con-
vexity. Technical report, Department of Mathematical Engineering, University of
Tokyo, METR 2007-36 (2007)
14. Kobayashi, Y., Takazawa, K.: Square-Free 2-matchings in Bipartite Graphs and
Jump Systems. Technical report, Department of Mathematical Engineering, Uni-
versity of Tokyo, METR 2008-40 (2008)
15. Makai, M.: On Maximum Cost Kt,t -free t-matchings of Bipartite Graphs. SIAM J.
Discret. Math. 21, 349–360 (2007)
16. Pap, G.: Alternating Paths Revisited II: Restricted b-matchings in Bipartite
Graphs. Technical report, Egerváry Research Group, Department of Operations
Research, Eötvös Loránd University, Budapest, TR-2005-13 (2005)
17. Schrijver, A.: Combinatorial Optimization - Polyhedra and Efficiency. Springer,
Heidelberg (2003)
18. Szabó, J.: Jump systems and matroid parity (in Hungarian). Master’s Thesis,
Eötvös Loránd University, Budapest (2002)
19. Takazawa, K.: A Weighted Kt,t -free t-factor Algorithm for Bipartite Graphs. Math.
Oper. Res. 34, 351–362 (2009) (INFORMS)
20. Végh, L. A.: Augmenting Undirected Node-Connectivity by One. Technical report,
Egerváry Research Group, Department of Operations Research, Eötvös Loránd
University, Budapest, TR-2009-10 (2009)
Zero-Coefficient Cuts

Kent Andersen and Robert Weismantel

Otto-von-Guericke-University of Magdeburg,
Department of Mathematics/IMO, Universitätsplatz 2,
39106 Magdeburg, Germany
{andersen,weismant}@mail.math.uni-magdeburg.de

Abstract. Many cuts used in practice to solve mixed integer programs


are derived from a basis of the linear relaxation. Every such cut is of
the form αT x ≥ 1, where x ≥ 0 is the vector of non-basic variables and
α ≥ 0. For a point x̄ of the linear relaxation, we call αT x ≥ 1 a zero-
coefficient cut wrt. x̄ if αT x̄ = 0, since this implies αj = 0 when x̄j > 0.
We consider the following problem: Given a point x̄ of the linear relax-
ation, find a basis, and a zero-coefficient cut wrt. x̄ derived from this
basis, or provide a certificate that shows no such cut exists. We show
that this problem can be solved in polynomial time. We also test the
performance of zero-coefficient cuts on a number of test problems. For
several instances zero-coefficient cuts provide a substantial strengthening
of the linear relaxation.

Keywords: Mixed integer program; Lattice basis; Cutting plane; Split cut.

1 Introduction
This paper concerns mixed integer linear sets of the form:

PI := {x ∈ Rn : Ax = b, x ≥ 0 and xj ∈ Z for j ∈ NI }, (1)

where A ∈ Qm×n , b ∈ Qm , N := {1, 2, . . . , n} is an index set for the variables,


and NI ⊆ N is an index set for the integer constrained variables. The linear
relaxation of PI is denoted P . For simplicity we assume A has full row rank. A
basis of A is a subset B ⊆ N of m variables such that the columns {a.j }j∈B
of A are linearly independent. From a basis B one obtains the basic polyhedron
associated with B:

P (B) := {x ∈ Rn : Ax = b and xj ≥ 0 for all j ∈ N \ B}, (2)

and the corresponding corner polyhedron:

PI (B) := {x ∈ P (B) : xj ∈ Z for all j ∈ NI }. (3)

Observe that P (B) can be obtained from P by deleting the non-negativity con-
straints on the basic variables xi for i ∈ B. Also observe that P (B) can be

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 57–70, 2010.

c Springer-Verlag Berlin Heidelberg 2010
58 K. Andersen and R. Weismantel

written in the form:

$$P(B) = \Big\{x \in \mathbb{R}^n : x = \bar{x}^B + \sum_{j \in N \setminus B} x_j r^{j,B} \text{ and } x_j \ge 0 \text{ for } j \in N \setminus B\Big\}, \tag{4}$$

where $\bar{x}^B \in \mathbb{R}^n$ is the basic solution associated with $B$, and the vectors $r^{j,B} \in \mathbb{R}^n$
for $j \in N \setminus B$ are the extreme rays of $P(B)$. Finally observe that every non-trivial
valid inequality for $P_I(B)$ is of the form $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ with $\alpha_j \ge 0$ for all
$j \in N \setminus B$. We say that a valid inequality $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ for $P_I(B)$ is a valid
cut for $P_I$ that can be derived from the basis $B$.
Several classes of cuts can be derived from a basis. Some of these cuts are
derived from a single equation. This equation may be one of the equalities
$x = \bar{x}^B + \sum_{j \in N \setminus B} x_j r^{j,B}$, or an integer combination of these equalities. The
integrality constraints on the variables are then used to obtain a valid cut. Single-row
cuts are named either Mixed Integer Gomory (MIG) cuts [10], Mixed Integer
Rounding (MIR) cuts [11] or Split Cuts [8].
Recent research has attempted to use several equations simultaneously to
generate valid cuts from a basis. In [3,9], two equations were considered, and
cuts named dissection cuts and lifted two-variable cuts were derived. All these
cuts are intersection cuts [4], and their validity is based on lattice-free polyhedra.
This paper is motivated by the following question: Which properties should
cuts derived from bases of the linear relaxation have in order to be effective
cutting planes for mixed integer programs? Such properties could be useful for
identifying classes of cuts that are effective in practice.
A first realization is that such cuts must be sparse, i.e., the cuts must have
many zero coefficients. Dense cuts are hard for linear programming solvers to
handle, and they have not been shown to be effective in closing the integrality
gap of mixed integer programs.
Secondly, deciding exactly which variables should have a zero coefficient in a
cut seems hard. It therefore seems natural to consider a specific point $\bar{x} \in P$ and
aim at cuts that form solutions to the following variant of a separation problem:

$$\min\Big\{ \sum_{j \in N \setminus B} \alpha_j \bar{x}_j : \sum_{j \in N \setminus B} \alpha_j x_j \ge 1 \text{ is valid for } P_I(B) \text{ for some basis } B \Big\} \tag{5}$$

with many zero coefficients on variables $j \in N \setminus B$ for which $\bar{x}_j > 0$. Ideally a
cut $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ should be maximally violated, i.e., the cut should satisfy
$\alpha_j = 0$ for all $j \in N \setminus B$ with $\bar{x}_j > 0$. We call a maximally violated cut
obtained from a basis of the linear relaxation a zero-coefficient cut wrt. $\bar{x}$.
Zero-coefficient cuts are optimal solutions to the above separation problem when
they exist, and they necessarily have coordinates with zero coefficients.
Zero-coefficient cuts therefore seem to be a class of cuts of high quality for solving
mixed integer programs in practice.
The main result in this paper is to show that a zero-coefficient cut wrt. a point
x̄ ∈ P can be identified in polynomial time if such a cut exists. In other words,
given a point x̄ ∈ P , it is possible in polynomial time to find a basis B, and a
Zero-Coefficient Cuts 59

 
valid inequality $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ for $P_I(B)$ which satisfies $\sum_{j \in N \setminus B} \alpha_j \bar{x}_j = 0$ if
such an inequality exists. The cuts we identify are split cuts, and we show that, if
there exists a zero-coefficient cut wrt. x̄, then there also exists a zero-coefficient
cut wrt. x̄ which is a split cut. The cut is computed by first pivoting to an
appropriate basis, and then computing a lattice basis of a well chosen lattice.
It has been shown that, in general, the separation problem for split cuts is
NP-hard [7]. Our result demonstrates that, if one insists on a maximally violated
split cut, then the separation problem can be solved in polynomial time. Zero-
coefficient cuts therefore seem to provide a reasonably efficient alternative to
optimizing over the split closure of a mixed integer set. The quality of the split
closure as an approximation of a mixed integer set was demonstrated in [5].
The performance of zero-coefficient cuts is tested computationally on instances
from miplib 3.0 [6] and miplib 2003 [1]. We restrict our experiments to the corner
polyhedron PI (B ∗ ) obtained from an optimal basis B ∗ of the LP relaxation. In
other words we do not examine the effect of pivoting in our experiments. On
several test problems, zero-coefficient cuts close substantially more integrality gap
than the MIG cuts obtained from the equations defining the optimal simplex
tableau.
The remainder of the paper is organized as follows. In Sect. 2 we derive an
infeasibility certificate for the set PI (B) for a given basis B. This certificate
is key for deriving zero-coefficient cuts. Zero-coefficient cuts are motivated and
presented in Sect. 3. Our main theorem is proved in Sect. 4. Finally our compu-
tational results are presented in Sect. 5.

2 Infeasibility Certificates for Corner Polyhedra


We now consider a fixed basis B, and we address the question of when PI (B) is
empty. We will present a certificate that proves PI (B) = ∅ whenever this is the
case. We first derive the representation (4) of P (B). Since we consider a fixed
basis B throughout this section, we let x̄ := x̄B and rj := rj,B for j ∈ N \ B.
Let AB denote the induced sub-matrix of A composed of the columns in B, and
define ā.j := (AB )−1 a.j for j ∈ N \ B. We may write P (B) in the form:

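$$P(B) = \Big\{x \in \mathbb{R}^n : \; x_i = \bar{x}_i - \sum_{j \in N \setminus B} \bar{a}_{i,j} x_j \text{ for } i \in B, \;\; x_j = x_j \text{ for } j \in N \setminus B, \;\; x_j \ge 0 \text{ for } j \in N \setminus B \Big\}. \tag{6}$$

Defining the following vectors $r^j \in \mathbb{R}^n$ for $j \in N \setminus B$:

$$r^j_k := \begin{cases} -\bar{a}_{k,j} & \text{if } k \in B, \\ 1 & \text{if } k = j, \\ 0 & \text{otherwise,} \end{cases} \tag{7}$$

the representation (6) of $P(B)$ can be re-written in the form

$$P(B) = \Big\{x \in \mathbb{R}^n : x = \bar{x} + \sum_{j \in N \setminus B} s_j r^j \text{ and } s_j \ge 0 \text{ for } j \in N \setminus B\Big\}. \tag{8}$$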
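The basic solution and the rays of representation (8) can be computed directly from A, b and a basis. The following sketch uses exact rational arithmetic; the helper names `solve` and `corner_data`, the dense list-of-lists matrix format and the 0-based column indices are assumptions made for this example only.

```python
from fractions import Fraction


def solve(M, rhs):
    """Solve the nonsingular square system M y = rhs by Gauss-Jordan."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(M, rhs)]
    for col in range(n):
        # find a pivot row; one exists because M is nonsingular
        piv = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        pivot = aug[col][col]
        aug[col] = [v / pivot for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


def corner_data(A, b, B):
    """Return (xbar, rays) for the basis B (list of column indices).

    xbar is the basic solution; rays[j] is the vector r^j of (7) for
    every nonbasic column j.
    """
    m, n = len(A), len(A[0])
    AB = [[A[i][j] for j in B] for i in range(m)]
    xB = solve(AB, b)
    xbar = [Fraction(0)] * n
    for pos, j in enumerate(B):
        xbar[j] = xB[pos]
    rays = {}
    for j in range(n):
        if j in B:
            continue
        abar = solve(AB, [A[i][j] for i in range(m)])  # (A_B)^{-1} a_j
        r = [Fraction(0)] * n
        for pos, k in enumerate(B):
            r[k] = -abar[pos]
        r[j] = Fraction(1)
        rays[j] = r
    return xbar, rays
```

By construction A·xbar = b and A·r^j = 0 for every nonbasic j, so every point xbar + Σ s_j r^j satisfies Ax = b, as required by (8).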

Hence PI (B) is empty if and only if the translated cone x̄ + cone({rj }j∈N \B )
does not contain mixed integer points. Our certificate for proving PI (B) is empty
is a split disjunction. A split disjunction is of the form π T x ≤ π0 ∨ π T x ≥ π0 + 1,
where (π, π0 ) ∈ Zn+1 and πj = 0 for all j ∈ N \ NI . All mixed integer points
x ∈ PI (B) satisfy all split disjunctions.
Our point of departure is a result characterizing when an affine set contains
integer points (see [2]). Specifically, consider the affine set:

$$T^a := f + \mathrm{span}(\{q^j\}_{j \in J}), \tag{9}$$

where $f \in \mathbb{Q}^n$, $J$ is a finite index set and $\{q^j\}_{j \in J} \subset \mathbb{Q}^n$. A result in [2] shows
that $T^a$ does not contain integer points if and only if there exists $\pi \in \mathbb{Z}^n$ such
that $\pi^T f \notin \mathbb{Z}$ and $\pi^T q^j = 0$ for all $j \in J$. Observe that such a vector $\pi \in \mathbb{Z}^n$
gives a split disjunction $\pi^T x \le \lfloor \pi^T f \rfloor \vee \pi^T x \ge \lceil \pi^T f \rceil$ which proves $T^a \cap \mathbb{Z}^n = \emptyset$,
i.e., we have $T^a \subseteq \{x \in \mathbb{R}^n : \lfloor \pi^T f \rfloor < \pi^T x < \lceil \pi^T f \rceil\}$.
We first generalize this result from integer points in affine sets to mixed integer
points in affine sets.
Lemma 1. The set $T^a$ does not contain mixed integer points if and only if there
exists $\pi \in \mathbb{Z}^n$ such that $\pi^T f \notin \mathbb{Z}$, $\pi^T q^j = 0$ for all $j \in J$ and $\pi_j = 0$ for all
$j \in N \setminus N_I$.
Proof. We have that $\{x \in T^a : x_j \in \mathbb{Z} \text{ for all } j \in N_I\}$ is empty if and only if
$\{x \in \mathbb{R}^n : x_i = f_i + \sum_{j \in J} s_j q^j_i \text{ and } x_i \in \mathbb{Z} \text{ for all } i \in N_I\}$ is empty if and only
if there exists a vector $\pi \in \mathbb{Z}^n$ such that $\pi^T f \notin \mathbb{Z}$, $\pi^T q^j = 0$ for all $j \in J$ and
$\pi_j = 0$ for all $j \in N \setminus N_I$. □
Lemma 1 shows that, if $T^a$ does not contain mixed integer points, then there
exists a split disjunction $\pi^T x \le \lfloor \pi^T f \rfloor \vee \pi^T x \ge \lceil \pi^T f \rceil$, where $\pi \in \mathbb{Z}^n$ satisfies
$\pi_j = 0$ for all $j \in N \setminus N_I$, such that $T^a \subset \{x \in \mathbb{R}^n : \lfloor \pi^T f \rfloor < \pi^T x < \lceil \pi^T f \rceil\}$,
i.e., this split disjunction provides a certificate that shows that $T^a$ does not
contain any mixed integer points.
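Checking whether a given integer vector π is such a certificate only requires the three conditions of Lemma 1. The sketch below is an illustration with hypothetical names (`is_certificate`, `split_disjunction`), 0-based indices, and the set of integer-constrained indices passed as `NI`; it verifies a candidate π but does not search for one.

```python
from fractions import Fraction
from math import ceil, floor


def _dot(u, v):
    """Exact inner product of two rational vectors."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def is_certificate(pi, f, Q, NI):
    """True if pi certifies that f + span(Q) has no mixed integer point.

    pi: integer vector, f: rational vector, Q: list of rational vectors,
    NI: set of indices of the integer-constrained coordinates.
    """
    if any(pi[j] != 0 for j in range(len(pi)) if j not in NI):
        return False                       # pi must vanish outside NI
    if any(_dot(pi, q) != 0 for q in Q):
        return False                       # pi must be orthogonal to every q^j
    return _dot(pi, f).denominator != 1    # pi^T f must be fractional


def split_disjunction(pi, f):
    """Return (floor(pi^T f), ceil(pi^T f)), the two sides of the split."""
    value = _dot(pi, f)
    return floor(value), ceil(value)
```

For f = (1/2, 0) and the single direction q = (0, 1), the vector π = (1, 0) is a certificate when both coordinates are integer constrained, and the affine set lies strictly between π^T x = 0 and π^T x = 1.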
We next generalize Lemma 1 from mixed integer points in affine sets to mixed
integer points in translates of polyhedral cones.
Lemma 2. The set $f + \mathrm{cone}(\{q^j\}_{j \in J})$ contains no mixed integer points if and
only if the set $f + \mathrm{span}(\{q^j\}_{j \in J})$ contains no mixed integer points.
Proof. Let $T^c := f + \mathrm{cone}(\{q^j\}_{j \in J})$. We have to show that $T^c$ does not contain
mixed integer points if and only if $T^a$ does not contain mixed integer points.
Clearly, if $T^c$ contains mixed integer points, then $T^a$ also contains mixed integer
points since $T^c \subseteq T^a$. Hence we only need to show the other direction.
Therefore suppose $T^c$ does not contain mixed integer points, and assume, for a
contradiction, that $T^a$ contains mixed integer points. Let $x \in T^a$ satisfy $x_j \in \mathbb{Z}$
for all $j \in N_I$, and let $s \in \mathbb{R}^{|J|}$ be such that $x = f + \sum_{j \in J} s_j q^j$. Choose an
integer $d > 0$ such that $dq^j \in \mathbb{Z}^n$ for all $j \in J$ and define $x' := x - \sum_{j \in J} \lfloor s_j/d \rfloor \, dq^j$.
We have $x' \in \{y \in \mathbb{R}^n : y_j \in \mathbb{Z} \text{ for } j \in N_I\}$ and $x' = f + \sum_{j \in J} (s_j/d - \lfloor s_j/d \rfloor) \, dq^j$.
Hence $x' \in T^c$ which is a contradiction. □
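The rounding step in this proof is constructive and can be traced on small data. The sketch below, with the hypothetical name `shift_into_cone`, takes multipliers of arbitrary sign and rational generators, picks d as the least common multiple of all denominators, and returns the shifted nonnegative multipliers together with the integer vector that was subtracted.

```python
from fractions import Fraction
from math import floor, gcd


def shift_into_cone(s, Q):
    """Shift span multipliers s into the cone without leaving the lattice.

    Returns (d, s_new, shift) where d > 0 is an integer with d*q integral
    for every q in Q, s_new[j] = s[j] - floor(s[j]/d)*d lies in [0, d),
    and shift = sum_j floor(s[j]/d) * d * q^j is an integer vector.
    """
    d = 1
    for q in Q:
        for v in q:
            den = Fraction(v).denominator
            d = d * den // gcd(d, den)     # running least common multiple
    steps = [floor(Fraction(sj) / d) for sj in s]
    s_new = [Fraction(sj) - k * d for sj, k in zip(s, steps)]
    n = len(Q[0])
    shift = [sum(k * d * Fraction(q[i]) for k, q in zip(steps, Q))
             for i in range(n)]
    return d, s_new, shift
```

Because the subtracted vector is integral, the integrality of the constrained coordinates is preserved, while the new multipliers are nonnegative, which places the shifted point in the translated cone.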

Since PI (B) is the set of mixed integer points in a translate of a polyhedral cone,
we now have the following certificate for when PI (B) is empty.
Corollary 1. We have $P_I(B) = \emptyset$ if and only if there exists $\pi \in \mathbb{Z}^n$ such that
$\pi^T \bar{x} \notin \mathbb{Z}$, $\pi^T r^j = 0$ for all $j \in N \setminus B$ and $\pi_j = 0$ for all $j \in N \setminus N_I$.

3 Zero-Coefficient Cuts from Corner Polyhedra

We now use the certificate obtained in Sect. 2 to derive zero-coefficient cuts for
a corner polyhedron PI (B) for a fixed basis B. As in Sect. 2, we let x̄ := x̄B and
rj := rj,B for j ∈ N \ B. We consider an optimization problem (MIP):

$$\min\Big\{ \sum_{j \in N \setminus B} c_j x_j : x \in P_I(B) \Big\},$$

where $c \in \mathbb{R}^{|N \setminus B|}$ denotes the objective coefficients. The linear programming
relaxation of (MIP) is denoted (LP). The set of feasible solutions to (LP) is the
set $P(B)$. We assume $c_j \ge 0$ for all $j \in N \setminus B$ since otherwise (LP) is unbounded.
To motivate zero-coefficient cuts, we first consider a generic cutting plane
algorithm for strengthening the LP relaxation (LP) of (MIP). Let $V \subset \mathbb{Q}_+^{|N \setminus B|}$
be a family of valid inequalities for $P_I(B)$, i.e., we have that $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ is
valid for $P_I(B)$ for all $\alpha \in V$. Let $x' \in P(B)$ be arbitrary. A separation problem
(SEP) wrt. $x'$ can be formulated as:

$$\min\Big\{ \sum_{j \in N \setminus B} \alpha_j x'_j : \alpha \in V \Big\}.$$

A cutting plane algorithm (CPalg) for using $V$ to strengthen the LP relaxation
(LP) of (MIP) can now be designed by iteratively solving (SEP) wrt. various
points $x' \in P(B)$:

Cutting plane algorithm (CPalg):

(1) Set $k := 0$. Let $x^k := \bar{x}$ be an optimal solution to (LP).
(2) Solve (SEP) wrt. $x^k$. Let $\alpha^k \in V$ be an optimal solution.
(3) While $\sum_{j \in N \setminus B} \alpha^k_j x^k_j < 1$:
    (a) Add $\sum_{j \in N \setminus B} \alpha^k_j x_j \ge 1$ to (LP) and re-optimize.
        Let $x^{k+1}$ be an optimal solution.
    (b) Solve (SEP) wrt. $x^{k+1}$.
        Let $\alpha^{k+1} \in V$ be an optimal solution.
    (c) Set $k := k + 1$.
End.
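The loop structure of (CPalg) can be written down independently of any particular LP solver or cut family. In the sketch below the callables `solve_lp` and `separate` are hypothetical placeholders supplied by the caller, points and cut coefficients are dictionaries indexed by the nonbasic variables, and the iteration cap `max_iter` is an added safeguard (not part of the algorithm as stated) since termination of the loop is not guaranteed in general.

```python
def cutting_plane(solve_lp, separate, max_iter=100):
    """Generic cutting plane loop in the spirit of (CPalg).

    solve_lp(cuts): returns an optimal point x of (LP) plus the given
        cuts, as a dict {j: value}.
    separate(x): returns an optimal solution alpha of (SEP) wrt. x,
        as a dict {j: coefficient}.
    Returns the final point and the list of cuts that were added.
    """
    cuts = []
    x = solve_lp(cuts)                      # step (1)
    alpha = separate(x)                     # step (2)
    for _ in range(max_iter):               # step (3)
        if sum(alpha[j] * x[j] for j in alpha) >= 1:
            break                           # x satisfies the best cut
        cuts.append(alpha)                  # (a) add the violated cut
        x = solve_lp(cuts)                  #     and re-optimize
        alpha = separate(x)                 # (b) separate the new point
    return x, cuts
```

Keeping the LP solver and the separation oracle behind callables makes it possible to exercise the loop on toy data before plugging in a real solver.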

In (CPalg) above, only one cut is added in every iteration. It is also possible
to add several optimal and sub-optimal solutions to (SEP). Furthermore, for
many classes $V$ of valid cutting planes for $P_I(B)$, (SEP) can not necessarily be
solved in polynomial time, and finite convergence of (CPalg) is not guaranteed.
For instance, if $V$ is the class of split cuts, (SEP) is NP-hard [7].
For $\alpha \in V$ and $x' \in P(B)$, the inequality $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ is maximally
violated by $x'$ when $\sum_{j \in N \setminus B} \alpha_j x'_j = 0$. We call $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ a zero-coefficient
cut wrt. $x'$ when $\sum_{j \in N \setminus B} \alpha_j x'_j = 0$. Observe that if a zero-coefficient cut wrt. $x^k$
exists in the family $V$ of valid inequalities for $P_I(B)$ in the $k$th iteration of (CPalg),
then this cut is an optimal solution to (SEP).
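Since both the coefficients and the point are nonnegative, the defining condition of a zero-coefficient cut is equivalent to a statement about supports. A small sketch with hypothetical function names, using dictionaries indexed by the nonbasic variables:

```python
def is_zero_coefficient_cut(alpha, x):
    """True if sum_j alpha_j * x_j == 0 for nonnegative alpha and x."""
    return sum(alpha[j] * x.get(j, 0) for j in alpha) == 0


def supports_are_disjoint(alpha, x):
    """True if alpha_j == 0 for every j with x_j > 0."""
    return all(alpha[j] == 0 for j in alpha if x.get(j, 0) > 0)
```

For nonnegative data the two functions always return the same answer, which is the observation that gives zero-coefficient cuts their name.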
Since (SEP) always returns a zero-coefficient cut wrt. the point that is being
separated whenever such a cut exists, the structure of (CPalg) is such that zero-
coefficient cuts are separated first, i.e., the first iterations of (CPalg) consist of
the following cutting plane algorithm (InitCPalg):

Cutting plane algorithm (InitCPalg):

(1) Set $k := 0$. Let $x^k := \bar{x}$ be an optimal solution to (LP).
(2) While there exists $\alpha^k \in V$ such that $\sum_{j \in N \setminus B} \alpha^k_j x_j \ge 1$
    is a zero-coefficient cut wrt. $x^k$:
    (a) Add $\sum_{j \in N \setminus B} \alpha^k_j x_j \ge 1$ to (LP) and re-optimize.
        Let $x^{k+1}$ be an optimal solution.
    (b) Set $k := k + 1$.
End.

When (InitCPalg) terminates, a point $x^* \in P(B)$ is obtained that satisfies
$\sum_{j \in N \setminus B} \alpha_j x^*_j > 0$ for all $\alpha \in V$, i.e., there does not exist any zero-coefficient cut
wrt. $x^*$ in the family $V$. Following (InitCPalg), if possible and desirable, (CPalg)
can be continued in order to strengthen the LP relaxation of (MIP) further with
valid cuts that are not maximally violated.
In the following we show that (InitCPalg) can be implemented to run in
polynomial time by using only split cuts. Since the initial phase (InitCPalg)
of (CPalg) can be implemented with split cuts only, this could suggest an
explanation of why split cuts have been observed to close a large amount of the
integrality gap on many instances [5].
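We first review how split cuts are derived for $P_I(B)$. Every split cut is derived
from a vector $\pi \in \mathbb{Z}^n$ satisfying $\pi^T \bar{x} \notin \mathbb{Z}$ and $\pi_j = 0$ for all $j \in N \setminus N_I$. Define
$f_0(\pi) := \pi^T \bar{x} - \lfloor \pi^T \bar{x} \rfloor$ and $f_j(\pi) := \pi^T r^j - \lfloor \pi^T r^j \rfloor$ for $j \in N_I \setminus B$. The inequality:

$$\sum_{j \in N \setminus B} \frac{x_j}{\alpha_j(\pi)} \ge 1 \qquad (10)$$

is the (strengthened) split cut defined by $\pi$, where $\alpha_j(\pi)$ for $j \in N \setminus B$ is:

$$\alpha_j(\pi) := \begin{cases}
\dfrac{f_0(\pi)}{1 - f_j(\pi)} & \text{if } j \in N_I \text{ and } 0 < f_j(\pi) < 1 - f_0(\pi), \\[2ex]
\dfrac{1 - f_0(\pi)}{f_j(\pi)} & \text{if } j \in N_I \text{ and } 1 - f_0(\pi) \le f_j(\pi) < 1, \\[2ex]
\dfrac{1 - f_0(\pi)}{\pi^T r^j} & \text{if } j \notin N_I \text{ and } \pi^T r^j > 0, \\[2ex]
-\dfrac{f_0(\pi)}{\pi^T r^j} & \text{if } j \notin N_I \text{ and } \pi^T r^j < 0, \\[2ex]
+\infty & \text{otherwise.}
\end{cases} \qquad (11)$$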

We next prove that, for a given point $x' \in P(B)$, if there exists any valid
inequality for $P_I(B)$ which is maximally violated wrt. $x'$, then there also exists
a split cut which is maximally violated wrt. $x'$.
Theorem 1. Let $x' \in P(B)$ be arbitrary. If there exists a valid inequality for
$P_I(B)$ which is a zero-coefficient cut wrt. $x'$, then there also exists a split cut for
$P_I(B)$ which is a zero-coefficient cut wrt. $x'$.

Proof. Let $x' \in P(B)$, and let $\sum_{j \in N \setminus B} \alpha_j x_j \ge 1$ be a valid inequality for $P_I(B)$
which is a zero-coefficient cut wrt. $x'$. Since $\sum_{j \in N \setminus B} \alpha_j x'_j = 0$ and $\alpha_j, x'_j \ge 0$
for all $j \in N \setminus B$, we must have $\alpha_j = 0$ for all $j \in N \setminus B$ satisfying $x'_j > 0$. Let
$X' := \{j \in N \setminus B : x'_j > 0\}$. It follows that $0 \ge 1$ is valid for:

$$Q_I := \{x \in P_I(B) : x_j = 0 \text{ for all } j \in (N \setminus B) \setminus X'\}
= \{x \in \bar{x} + \mathrm{cone}(\{r^j\}_{j \in X'}) : x_j \in \mathbb{Z} \text{ for all } j \in N_I\}.$$

Since $Q_I = \emptyset$, Lemma 2 shows there exists $\bar{\pi} \in \mathbb{Z}^n$ such that $\bar{\pi}^T r^j = 0$ for
$j \in X'$, $\bar{\pi}_j = 0$ for $j \in N \setminus N_I$ and $\bar{\pi}^T \bar{x} \notin \mathbb{Z}$. From (10) and (11) it now follows
that the split cut derived from $\bar{\pi}$ is a zero-coefficient cut wrt. $x'$.
In general it is NP-hard to separate a split cut for $P_I(B)$ [7]. However, as we
will show next, it is possible to separate a zero-coefficient split cut wrt. a given
point in polynomial time whenever such a split cut exists.
Let $x' \in P(B)$. Define $X' := \{j \in N \setminus B : x'_j > 0\}$. From (11) we have that
$\pi \in \mathbb{Z}^n$ defines a maximally violated split cut wrt. $x'$ if and only if

$$\begin{aligned}
\pi^T r^j &= 0 \quad \text{for all } j \in X', &\qquad (12)\\
\pi_j &= 0 \quad \text{for all } j \in N \setminus N_I, &\qquad (13)\\
\pi^T \bar{x} &\notin \mathbb{Z}. &\qquad (14)
\end{aligned}$$

If $\pi \in \mathbb{Z}^n$ satisfies (12)-(14), then a split cut can be derived from $\pi$ since $\pi^T \bar{x} \notin \mathbb{Z}$,
and we have $\alpha_j(\pi) = +\infty$ for all $j \in X'$, which implies that the coefficients on
the variables $x_j$ for $j \in X'$ in the split cut (10) are all zero. Hence any $\pi \in \mathbb{Z}^n$
that satisfies (12)-(14) defines a zero-coefficient split cut wrt. $x'$. Conversely, if
there exists a valid inequality for $P_I(B)$ which is maximally violated wrt. $x'$,
then Theorem 1 shows there exists $\pi \in \mathbb{Z}^n$ that satisfies (12)-(14).
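Conditions (12)-(14) are simple to test for a candidate vector. The sketch below is one possible way to do so with exact rational arithmetic; the function name, the 0-based indexing, the dictionary layout for the rays and the sample data are all assumptions made for illustration.

```python
from fractions import Fraction


def dot(u, v):
    """Exact inner product of two equally long vectors."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))


def defines_zero_coefficient_split_cut(pi, x_bar, rays, X_prime, N_I):
    """Check conditions (12)-(14) for an integer vector pi.

    rays maps a non-basic index j to the ray r^j; X_prime holds the non-basic
    indices with x'_j > 0; N_I holds the integer-constrained indices.
    """
    cond12 = all(dot(pi, rays[j]) == 0 for j in X_prime)
    cond13 = all(pi[j] == 0 for j in range(len(pi)) if j not in N_I)
    cond14 = dot(pi, x_bar).denominator != 1  # pi^T x_bar is not an integer
    return cond12 and cond13 and cond14


# Hypothetical instance: variables 0 and 1 are basic and integer,
# variable 2 is non-basic and continuous with x'_2 > 0.
x_bar = [Fraction(1, 2), Fraction(1), Fraction(0)]
rays = {2: [Fraction(1), Fraction(-1), Fraction(1)]}
X_prime = {2}
N_I = {0, 1}

works = defines_zero_coefficient_split_cut([1, 1, 0], x_bar, rays, X_prime, N_I)
fails = defines_zero_coefficient_split_cut([1, 0, 0], x_bar, rays, X_prime, N_I)
```

In this instance the unit vector fails condition (12), whereas the sum of the two unit vectors is orthogonal to the ray and has a fractional inner product with the basic solution.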
Let $L(x') \subseteq \mathbb{Z}^n$ denote the set of $\pi \in \mathbb{Z}^n$ that satisfy (12) and (13):

$$L(x') := \{\pi \in \mathbb{Z}^n : \pi^T r^j = 0 \text{ for all } j \in X' \text{ and } \pi_j = 0 \text{ for all } j \in N \setminus N_I\}.$$

Observe that $L(x')$ is a lattice, i.e., for any $\pi^1, \pi^2 \in L(x')$ and $k \in \mathbb{Z}$, we have
$k\pi^1 \in L(x')$ and $\pi^1 + \pi^2 \in L(x')$. For any lattice it is possible to compute a basis
for the lattice in polynomial time. Hence we can find vectors $\pi^1, \ldots, \pi^p \in L(x')$
in polynomial time such that:

$$L(x') = \Big\{\pi \in \mathbb{Z}^n : \pi = \sum_{i=1}^{p} \lambda_i \pi^i \text{ and } \lambda_i \in \mathbb{Z} \text{ for } i = 1, 2, \ldots, p\Big\}.$$

Now, if there exists a lattice basis vector $\pi^{\bar{i}} \in L(x')$ with $\bar{i} \in \{1, 2, \ldots, p\}$ such
that $(\pi^{\bar{i}})^T \bar{x} \notin \mathbb{Z}$, then the split cut derived from $\pi^{\bar{i}}$ is maximally violated wrt.
$x'$. Conversely, if we have $(\pi^i)^T \bar{x} \in \mathbb{Z}$ for all $i \in \{1, 2, \ldots, p\}$, then $\pi^T \bar{x} \in \mathbb{Z}$ for
all $\pi \in L(x')$. We therefore have the following.
Corollary 2. Let $x' \in P(B)$ be arbitrary. If there exists a valid inequality for
$P_I(B)$ that is maximally violated wrt. $x'$, then it is possible to find such an
inequality in polynomial time.
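The argument reduces the search over the whole lattice to a scan of a basis. A minimal sketch of that scan follows, assuming a lattice basis has already been computed by some other routine; the function name and the sample bases are hypothetical.

```python
from fractions import Fraction


def first_fractional_basis_vector(basis, x_bar):
    """Return the index of the first basis vector pi with pi^T x_bar not an
    integer, or None if every basis vector gives an integer value.

    If None is returned, every integer combination of the basis vectors also
    gives an integer value, so no vector of the lattice satisfies (14).
    """
    for i, pi in enumerate(basis):
        value = sum(Fraction(p) * Fraction(x) for p, x in zip(pi, x_bar))
        if value.denominator != 1:
            return i
    return None


# Hypothetical data: the basis is assumed to be given.
x_bar = [Fraction(1, 2), Fraction(1, 2), Fraction(0)]
basis_without_cut = [(1, 1, 0), (1, -1, 0)]
basis_with_cut = [(1, 1, 0), (1, 0, 0)]

no_cut = first_fractional_basis_vector(basis_without_cut, x_bar)
cut_index = first_fractional_basis_vector(basis_with_cut, x_bar)
```

The scan costs one inner product per basis vector, so the polynomial running time is dominated by the lattice basis computation itself.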
Based on the above results, we have the following implementation of the cutting
plane algorithm (InitCPalg) presented earlier:

Implementation of (InitCPalg):
(1) Set $k := 0$. Let $x^k := \bar{x}$ be an optimal solution to (LP).
(2) Find a lattice basis $\pi^1, \ldots, \pi^{p_k}$ for $L(x^k)$.
    Let $I(x^k) := \{i \in \{1, \ldots, p_k\} : (\pi^i)^T \bar{x} \notin \mathbb{Z}\}$.
(3) While $I(x^k) \ne \emptyset$:
    (a) Add all split cuts generated from vectors $\pi^i$
        with $i \in I(x^k)$ to (LP) and re-optimize.
        Let $x^{k+1}$ be an optimal solution.
    (b) Find a lattice basis $\pi^1, \ldots, \pi^{p_{k+1}}$ for $L(x^{k+1})$.
        Let $I(x^{k+1}) := \{i \in \{1, \ldots, p_{k+1}\} : (\pi^i)^T \bar{x} \notin \mathbb{Z}\}$.
    (c) Set $k := k + 1$.
End.

We next argue that mixed integer Gomory cuts play a natural role in the above
implementation of (InitCPalg). Consider the computation of the lattice basis for
L(x0 ) in step (2) of (InitCPalg). Observe that, since x0 = x̄, we have L(x0 ) = Zn ,
and therefore π 1 := e1 , . . . , π n := en is a lattice basis for L(x0 ), where e1 , . . . , en
denote the unit vectors in Rn . Since a split cut (10) obtained from a unit vector
is a mixed integer Gomory cut, the first cuts added in step (3).(a) of the above
implementation of (InitCPalg) are the mixed integer Gomory cuts. A natural
computational question therefore seems to be how much more integrality gap
can be closed by continuing (InitCPalg) and generating the remaining zero-
coefficient cuts.

4 Zero-Coefficient Cuts from Mixed Integer Polyhedra


In Sect. 3 we considered a fixed basis $B$ and a point $x' \in P(B)$, and we
demonstrated how to obtain a zero-coefficient cut wrt. $x'$ from $P_I(B)$ whenever such a
cut exists. Given $x' \in P$, we now consider how to obtain an appropriate basis,
i.e., we show how to identify a basis $B$ such that a zero-coefficient cut wrt. $x'$
can be derived from $P_I(B)$. For this, we first relate the emptiness of two corner
polyhedra $P_I(B)$ and $P_I(B')$ obtained from two adjacent bases $B$ and $B'$ for $P$.

Lemma 3. Let $B$ be a basis for $P$, and let $B' := (B \setminus \{\bar{i}\}) \cup \{\bar{j}\}$ be an adjacent
basis to $B$, where $\bar{i} \in B$ and $\bar{j} \in N \setminus B$. Then $P_I(B) \ne \emptyset$ if and only if $P_I(B') \ne \emptyset$.


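Proof. For simplicity let $\bar{x} := \bar{x}^B$ and $\bar{x}' := \bar{x}^{B'}$. Also let $\bar{a}_{.j} := (A_B)^{-1} a_{.j}$ for all
$j \in N \setminus B$ and $\bar{a}'_{.j} := (A_{B'})^{-1} a_{.j}$ for all $j \in N \setminus B'$, where $A_B$ and $A_{B'}$ denote
the basis matrices associated with the bases $B$ and $B'$ respectively.
Suppose $z \in P_I(B)$. We have $z_i = \bar{x}'_i + \sum_{j \in N \setminus B'} \bar{a}'_{i,j} z_j$ for all $i \in B'$, $z_j \ge 0$ for
all $j \in N \setminus (B' \cup \{\bar{i}\})$ and $z_j \in \mathbb{Z}$ for all $j \in N_I$. If $z_{\bar{i}} \ge 0$, we are done, so suppose
$z_{\bar{i}} < 0$. Choose an integer $k > 0$ such that $k\bar{a}'_{.\bar{i}} \in \mathbb{Z}^m$ and $z_{\bar{i}} + k \ge 0$. Defining
$z'_i := z_i + k\bar{a}'_{i,\bar{i}}$ for all $i \in B'$, $z'_{\bar{i}} := z_{\bar{i}} + k$ and $z'_j := z_j$ for all $j \in N \setminus (B' \cup \{\bar{i}\})$, we
have $z' \in P_I(B')$. Hence $P_I(B) \ne \emptyset$ implies $P_I(B') \ne \emptyset$. The opposite direction
is symmetric.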
From Lemma 3 it follows that either all corner polyhedra $P_I(B)$ associated with
bases $B$ for $P$ are empty, or they are all non-empty. We next present a pivot
operation from a basis $B$ to an adjacent basis $B'$ with the property that, if a
zero-coefficient cut wrt. a point $x' \in P$ can be derived from $B$, then a zero-coefficient
cut wrt. $x'$ can also be derived from $B'$.
Lemma 4. Let $B$ be a basis for $P$, let $x' \in P$ and define $X' := \{j \in N : x'_j > 0\}$.
Also let $B' := (B \setminus \{\bar{i}\}) \cup \{\bar{j}\}$ be an adjacent basis to $B$, where $\bar{i} \in B \setminus X'$ and
$\bar{j} \in X' \setminus B$. If a zero-coefficient cut wrt. $x'$ can be derived from $B$, then a
zero-coefficient cut wrt. $x'$ can also be derived from $B'$.
Proof. Given a set $S \subseteq N$, we will use sets obtained from $P$, $P_I$, $P(B)$ and
$P_I(B)$ by setting $x_j = 0$ for all $j \in N \setminus S$. For $S \subseteq N$, define $Q(S) := \{x \in P :
x_j = 0 \text{ for } j \in N \setminus S\}$ and $Q_I(S) := \{x \in P_I : x_j = 0 \text{ for } j \in N \setminus S\}$. Also,
given a basis $B \subseteq S$, define $Q(B, S) := \{x \in P(B) : x_j = 0 \text{ for } j \in N \setminus S\}$ and
$Q_I(B, S) := \{x \in P_I(B) : x_j = 0 \text{ for } j \in N \setminus S\}$.
Assume a zero-coefficient cut wrt. $x'$ can be derived from $B$. Observe that this
implies $Q_I(B, B \cup X') = \emptyset$. Now, $Q_I(B, B \cup X')$ is a corner polyhedron associated
with $Q_I(B \cup X')$, and $Q_I(B', B \cup X')$ is also a corner polyhedron associated with
$Q_I(B \cup X')$. Since any two bases of $Q(B \cup X')$ can be obtained from each other
by pivoting, it follows from Lemma 3 that also $Q_I(B', B \cup X') = \emptyset$. Corollary 1
now gives a split cut which is a zero-coefficient cut wrt. $x'$ derived from $B'$.
Lemma 4 shows that, for the purpose of identifying zero-coefficient cuts wrt. $x'$,
the interesting bases to consider are those bases for which it is not possible to
pivot a variable $x_j$ with $j \in X'$ into the basis.
Definition 1. Let $x' \in P$, and define $X' := \{j \in N : x'_j > 0\}$. A basis $B$ for $P$
is called maximal wrt. $x'$ if $(B \setminus \{\bar{i}\}) \cup \{\bar{j}\}$ is not a basis for $P$ for all $\bar{i} \in B \setminus X'$
and $\bar{j} \in X' \setminus B$.
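Definition 1 can be checked directly on a simplex tableau. The sketch below relies on the standard linear algebra fact that exchanging a basic index i for a non-basic index j yields a basis exactly when the tableau entry in row i and column j is nonzero; the function name and the nested-dictionary layout of the tableau are assumptions made for illustration.

```python
def is_maximal_basis(tableau, B, X_prime):
    """Check whether the basis B is maximal wrt. a point with support X_prime.

    tableau[i][j] is the coefficient of non-basic variable j in the row of
    basic variable i. B is maximal when no basic index outside X_prime can be
    exchanged for a non-basic index inside X_prime, i.e. when all such
    tableau entries are zero.
    """
    for i in B - X_prime:
        for j, coeff in tableau[i].items():
            if j in X_prime and coeff != 0:
                return False
    return True


# Hypothetical tableau with basic variables {0, 1} and non-basic {2, 3}.
B = {0, 1}
X_prime = {1, 2}
maximal_tableau = {0: {2: 0, 3: 1}, 1: {2: 2, 3: 0}}
pivotable_tableau = {0: {2: 1, 3: 1}, 1: {2: 2, 3: 0}}

maximal = is_maximal_basis(maximal_tableau, B, X_prime)
not_maximal = is_maximal_basis(pivotable_tableau, B, X_prime)
```

When the check fails, the offending pair of indices identifies a pivot of the kind used in Lemma 4, and repeating such pivots until none remains is one way to reach a maximal basis.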
From the above results it is not clear whether it is necessary to investigate all
maximal bases wrt. $x'$ in order to identify a zero-coefficient cut wrt. $x'$. However,
the following lemma shows that it is sufficient to examine just a single arbitrarily
chosen maximal basis wrt. $x'$. In other words, if there exists a basis from which
a zero-coefficient cut wrt. $x'$ can be derived, then a zero-coefficient cut wrt. $x'$
can be derived from every maximal basis wrt. $x'$.

Lemma 5. If there exists a basis $B$ for $P$ from which a zero-coefficient cut wrt.
$x'$ can be derived, then a zero-coefficient cut can be derived from every basis for
$P$ which is maximal wrt. $x'$.
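Proof. Suppose $B$ is a basis from which a zero-coefficient cut wrt. $x'$ can be
derived. Let $J := N \setminus B$, $B_x := B \cap X'$ and $J_x := J \cap X'$. Also let $\bar{x} := \bar{x}^B$ and
$\bar{a}_{.j} := (A_B)^{-1} a_{.j}$ for $j \in J$, where $A_B$ denotes the basis matrix associated with
$B$. Lemma 4 shows that we may assume $B$ is maximal, i.e., we may assume that
the simplex tableau associated with $B$ is of the form:

$$\begin{aligned}
x_i &= 0 + \sum_{j \in J \setminus J_x} \bar{a}_{i,j} x_j && \text{for all } i \in B \setminus B_x, &\qquad (15)\\
x_i &= \bar{x}_i + \sum_{j \in J_x} \bar{a}_{i,j} x_j + \sum_{j \in J \setminus J_x} \bar{a}_{i,j} x_j && \text{for all } i \in B_x, &\qquad (16)\\
x_j &\ge 0 && \text{for all } j \in J. &\qquad (17)
\end{aligned}$$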


Observe that $\bar{x}_i = 0$ for all $i \in B \setminus B_x$, since $x'$ satisfies (15). The set $P(B)$ is
the set of solutions to (15)-(17), and $P_I(B)$ is the set of mixed integer solutions
to (15)-(17). Furthermore, from (15)-(17) it follows that a zero-coefficient cut
can be derived from $B$ if and only if the following set does not contain mixed
integer points:

$$T(B) := \Big\{x \in \mathbb{R}^n : x_i = \bar{x}_i + \sum_{j \in J_x} \bar{a}_{i,j} x_j \text{ for } i \in B_x, \text{ and } x_j \ge 0 \text{ for } j \in J_x\Big\}.$$

Now, $T(B)$ is a basic polyhedron associated with the set:

$$T := \Big\{x \in \mathbb{R}^n : x_i = \bar{x}_i + \sum_{j \in J_x} \bar{a}_{i,j} x_j \text{ for } i \in B_x, \text{ and } x_j \ge 0 \text{ for } j \in J_x \cup B_x\Big\}.$$

Furthermore, with any basis $B'$ for $P$ which is maximal wrt. $x'$, a basic
polyhedron $T(B')$ of $T$ of the above form can be associated, and a zero-coefficient
cut wrt. $x'$ can be derived from $B'$ if and only if $T(B')$ does not contain mixed
integer points. Since $T(B)$ does not contain mixed integer points, it follows from
Lemma 3 that no basic polyhedron $T(B')$ for $T$ contains mixed
integer points. Hence a zero-coefficient cut can be derived from every basis $B'$ for
$P$ which is maximal wrt. $x'$.
Since a maximal basis wrt. $x' \in P$ can be obtained in polynomial time, we
immediately have our main theorem.
Theorem 2. Let $x' \in P$ be arbitrary. If there exists a basis $B$ and a valid
inequality for $P_I(B)$ which is a zero-coefficient cut wrt. $x'$, then such a zero-coefficient
cut can be obtained in polynomial time.

5 Computational Results
We now test the performance of the cutting plane algorithm (InitCPalg) de-
scribed in Sect. 3. In our implementation, we use CPLEX 9.1 for solving linear
programs, and the open source software NTL for the lattice computations. We
use instances from miplib 3.0 [6] and miplib 2003 [1] in our experiments. All in-
stances are minimization problems, and we use the preprocessed version of each
instance, i.e., when we refer to an instance, we refer to the instance obtained
after applying the preprocessor of CPLEX 9.1.
For each instance, we formulate the optimization problem over the corner
polyhedron associated with an optimal basis of the LP relaxation. To distin-
guish the optimization problem over the corner polyhedron from the original
mixed integer program, we use the following notation: The original mixed
integer program is denoted (MIP), and the mixed integer program over the corner
polyhedron is denoted (MIP^c). The optimal objective of (MIP) is denoted $z^{\mathrm{MIP}}$,
and the optimal objective value of (MIP^c) is denoted $z^{\mathrm{MIP}^c}$. The LP relaxation
of (MIP) is denoted (LP), and the optimal objective value of (LP) is denoted
$z^{\mathrm{LP}}$.
We assume the (original) mixed integer program (MIP) has n variables, and
includes slack, surplus and artificial variables in the formulation:
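$$\begin{aligned}
\min\ & c^T x \\
\text{such that}\ & a_{i.}^T x = b_i, && \text{for all } i \in M, &\qquad (18)\\
& l_j \le x_j \le u_j, && \text{for all } j \in N, &\qquad (19)\\
& x_j \in \mathbb{Z}, && \text{for all } j \in N_I, &\qquad (20)
\end{aligned}$$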
where M is an index set for the constraints, c ∈ Qn+|M| denotes the objective
coefficients, N := {1, 2, . . . , (n + |M |)} is an index set for the variables, NI ⊆ N
denotes those variables that are integer constrained, l and u are the lower and
upper bounds on the variables respectively and (ai. , bi ) ∈ Q|N |+1 for i ∈ M
denotes the coefficients in the ith constraint. The variables xn+i for i ∈ M are
either slack, surplus or artificial variables.
The problem (MIP^c) is formulated as follows. An optimal basis for (LP) is an
$|M|$-subset $B^* \subseteq N$ of basic variables. Let $J^* := N \setminus B^*$ denote the non-basic
variables. The problem (MIP^c) can be constructed from (MIP) by eliminating
certain bounds on the variables. Let $J_A^*$ denote the non-basic artificial variables,
let $J_L^* \subseteq J^* \setminus J_A^*$ denote the non-basic structural variables on lower bound, and
let $J_U^* \subseteq J^* \setminus J_A^*$ denote the non-basic structural variables on upper bound. By
re-defining the bounds on the variables $x_j$ for $j \in N$ to:

$$l_j(B^*) := \begin{cases} 0 & \text{if } j \in J_A^*, \\ l_j & \text{if } j \in J_L^*, \\ -\infty & \text{otherwise,} \end{cases}
\qquad \text{and} \qquad
u_j(B^*) := \begin{cases} 0 & \text{if } j \in J_A^*, \\ u_j & \text{if } j \in J_U^*, \\ +\infty & \text{otherwise,} \end{cases} \qquad (21)$$

the problem (MIP^c) associated with $B^*$ is given by:

$$\begin{aligned}
\min\ & c^T x \\
\text{such that}\ & a_{i.}^T x = b_i, && \text{for all } i \in M, &\qquad (22)\\
& l_j(B^*) \le x_j \le u_j(B^*), && \text{for all } j \in N, &\qquad (23)\\
& x_j \in \mathbb{Z}, && \text{for all } j \in N_I. &\qquad (24)
\end{aligned}$$
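The bound redefinition in (21) amounts to a small case distinction per variable. A minimal sketch follows, assuming bounds are stored in dictionaries keyed by variable index; the function name and the sample data are hypothetical and do not reflect the CPLEX-based implementation used in the experiments.

```python
import math


def corner_bounds(lower, upper, J_A, J_L, J_U):
    """Return the bounds l_j(B*) and u_j(B*) of (21).

    J_A: non-basic artificial variables (fixed to zero).
    J_L: non-basic structural variables on lower bound (keep lower bound).
    J_U: non-basic structural variables on upper bound (keep upper bound).
    Every other bound, including all bounds on basic variables, is dropped.
    """
    new_lower, new_upper = {}, {}
    for j in lower:
        if j in J_A:
            new_lower[j], new_upper[j] = 0.0, 0.0
        else:
            new_lower[j] = lower[j] if j in J_L else -math.inf
            new_upper[j] = upper[j] if j in J_U else math.inf
    return new_lower, new_upper


# Hypothetical instance: variable 1 is basic, 2 is at its lower bound,
# 3 is at its upper bound, and 4 is a non-basic artificial variable.
lower = {1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0}
upper = {1: 10.0, 2: 5.0, 3: 1.0, 4: 0.0}
new_lower, new_upper = corner_bounds(lower, upper, J_A={4}, J_L={2}, J_U={3})
```

Only the bounds that are active at the optimal basic solution survive, which is what turns the feasible region into a translated cone.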

We evaluate the performance of zero-coefficient cuts by calculating how much
more of the integrality gap of (MIP^c) can be closed by continuing (InitCPalg)
beyond the generation of the mixed integer Gomory (MIG) cuts. For this, we
first evaluate the quality of (MIP^c) as an approximation to (MIP). Our measure
of quality is the size of the integrality gap. The integrality gap of (MIP^c) is
the number $\mathrm{Gap}_I(\mathrm{MIP}^c) := z^{\mathrm{MIP}^c} - z^{\mathrm{LP}}$, and the integrality gap of (MIP) is
the number $\mathrm{Gap}_I(\mathrm{MIP}) := z^{\mathrm{MIP}} - z^{\mathrm{LP}}$. The relationship between the numbers
$\mathrm{Gap}_I(\mathrm{MIP})$ and $\mathrm{Gap}_I(\mathrm{MIP}^c)$ gives information on the quality of (MIP^c) as an
object for cutting plane generation for (MIP).
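As a small illustration of how the two gaps are compared, the sketch below computes the ratio of the corner polyhedron gap to the full integrality gap as a percentage. The function name is hypothetical; the sample numbers are the objective values reported for qnet1 and vpm1 in Table 1.

```python
def gap_ratio_percent(z_corner, z_mip, z_lp):
    """Return Gap_I(MIP^c) / Gap_I(MIP) as a percentage.

    z_corner: optimal value over the corner polyhedron, z^{MIP^c}.
    z_mip:    optimal value of the original problem, z^{MIP}.
    z_lp:     optimal value of the LP relaxation, z^{LP}.
    """
    gap_corner = z_corner - z_lp
    gap_mip = z_mip - z_lp
    if gap_mip == 0:
        raise ValueError("the LP relaxation already has zero integrality gap")
    return 100.0 * gap_corner / gap_mip


qnet1 = gap_ratio_percent(15997.04, 16029.69, 14274.10)
vpm1 = gap_ratio_percent(19.0, 20.0, 16.43)
```

Rounded to two decimals, the two values agree with the last column of Table 1 for these instances.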
Table 1 contains our results for evaluating the quality of (MIPc ). The first
three columns contain the problem name, the number of constraints and the
number of variables for each instance respectively.
There are six instances (not included in Table 1) for which (MIPc ) does not
close any of the integrality gap between (LP) and (MIP). This means that the
bounds deleted from (MIP) to create (MIPc ) are important constraints of (MIP),
although these bounds are not active in defining an optimal solution to (LP).
This seems to indicate that, for these instances, (MIPc ) is not the right relaxation
of (MIP) from which to derive strong cuts.
For the first seven instances in Table 1 the opposite is true, i.e., for these
instances the optimal objective of (MIP^c) is the same as the optimal objective
of (MIP). Therefore, for these instances, if we can identify all facet defining
inequalities associated with (MIP^c), and add these inequalities to (LP), then all
of the integrality gap between (LP) and (MIP) will be closed. Hence, for these
instances, (MIP^c) seems to be the right object from which to derive strong cuts
for (MIP). For the remaining instances in Table 1, not all the integrality gap
between (LP) and (MIP) is closed by valid cuts from (MIP^c). However, for most
instances in Table 1, a large amount of the integrality gap between (LP)
and (MIP) can still potentially be closed with valid cuts for (MIP^c).
We next evaluate the performance of zero-coefficient cuts. Table 2 contains
the main results. Before considering the results in Table 2, we first make a few
comments on those instances that are not in Table 2.
For three instances, MIG cuts close all the integrality gap of (MIPc ). For these
instances, zero-coefficient cuts can therefore not close any additional integrality
gap, and we did not include these instances in our test of the performance of
zero-coefficient cuts. Furthermore, for another nine instances, no further zero-
coefficient cuts were generated besides the MIG cuts. Observe that (InitCPalg)
does not do much work for these instances, since this is detected after the first
lattice basis has been computed.
For the remaining instances, we divided them into those instances where MIG
cuts closed less than 80% of the total integrality gap that can be closed with
zero-coefficient cuts, and those where MIG cuts closed more than 80% of the
total integrality gap that can be closed with zero-coefficient cuts.
Table 1. Strength of corner polyhedron

Problem  # Constr.  # Var.  z^{MIP^c}  z^{MIP}  z^{LP}  Gap_I(MIP^c) / Gap_I(MIP) × 100%
10teams 210 1600 904 904 897 100.00 %
air04 614 7564 54632 54632 54030.44 100.00 %
egout 35 47 299.01 299.01 242.52 100.00 %
l152lav 97 1988 4722 4722 4656.36 100.00 %
mas76 12 148 40005.05 40005.05 38893.90 100.00 %
mod008 6 319 307 307 290.93 100.00 %
p0282 160 200 258401 258401 179990.30 100.00 %
qnet1 363 1417 15997.04 16029.69 14274.10 98.14 %
flugpl 13 14 759600 760500 726875 97.32 %
nsrand-ipx 76 4162 50880.00 51200.00 49667.89 79.11 %
vpm1 128 188 19 20 16.43 71.99 %
vpm2 128 188 13 13.75 11.14 71.26 %
pp08a 133 234 5870 7350 2748.35 67.84 %
p2756 702 2642 2893 3124 2701.67 45.30 %
swath 482 6260 378.07 467.41 334.50 32.78 %
modglob 286 384 19886358 20099766 19790206 31.06 %
fixnet6 477 877 3357 3981 3190.04 21.11 %
p0201 107 183 7185 7615 7155 6.52 %
rout 290 555 -1388.42 -1297.69 -1393.39 5.19 %

Table 2. Instances where the increase in objective with all ZC cuts was substantially
larger than the increase in objective with only MIG cuts

Problem  # Constr.  # Var.  # MIG cuts  # Additional ZC cuts  (ΔObj. MIG cuts / ΔObj. All cuts) × 100%
l152lav 97 1988 53 6 0.00%
mkc∗ 1286 3230 119 29 0.00%
p0201 107 183 42 27 0.00%
p2756 702 2642 201 7 6.02%
rout 290 555 52 36 6.24%
swath 482 6260 80 26 7.58%
vpm1 114 188 15 40 22.38%
vpm2 128 188 29 29 22.73%
flugpl 13 14 13 5 23.36%
fixnet6 477 877 12 19 27.06%
timtab2∗ 287 648 237 653 29.85%
timtab1∗ 166 378 133 342 30.93%
egout 35 47 8 5 41.47%
qnet1 363 1417 55 47 45.05%
p0282 160 200 34 53 49.60%
air04 614 7564 290 30 50.53%
modglob 286 384 60 28 56.77%
mas76 12 148 11 11 65.77%
pp08a 133 234 51 34 66.16%
10teams 210 1600 179 76 66.67%
mod008 6 319 5 10 69.77%
nsrand-ipx 590 4162 226 91 73.17%

Table 2 contains those instances where MIG cuts closed less than 80% of the
total integrality gap that can be closed with zero-coefficient cuts. We observe
that for the first 16 instances in Table 2, continuing (InitCPalg) beyond MIG
cuts closed at least twice as much integrality gap as would have been achieved
by using only MIG cuts. For the remaining instances in Table 2, the improvement
was less than a factor of two, but still substantial.
The instances marked with an asterisk in Table 2 are instances where we were
unable to solve (MIP^c). For those instances, the results are based on the best
possible solution we were able to find.
The remaining class of instances are those instances where MIG cuts closed
more than 80% of the total integrality gap that can be closed with zero-coefficient
cuts. There were 28 of these instances. For these instances, continuing (InitC-
Palg) beyond MIG cuts was therefore not beneficial. However, we observe that
for all except two of these instances (markshare1 and markshare2), this was
because very few zero-coefficient cuts were generated that are not MIG cuts.
Detecting that it is not beneficial to continue (InitCPalg) beyond the generation
of MIG cuts was therefore done after only very few lattice basis computations.

References
1. Achterberg, T., Koch, T.: MIPLIB 2003. Operations Research Letters 34, 361–372
(2006)
2. Andersen, K., Louveaux, Q., Weismantel, R.: Certificates of linear mixed integer
infeasibility. Operations Research Letters 36, 734–738 (2008)
3. Andersen, K., Louveaux, Q., Weismantel, R., Wolsey, L.A.: Inequalities from Two
Rows of a Simplex Tableau. In: Fischetti, M., Williamson, D.P. (eds.) IPCO 2007.
LNCS, vol. 4513, pp. 1–15. Springer, Heidelberg (2007)
4. Balas, E.: Intersection Cuts - a new type of cutting planes for integer programming.
Operations Research 19, 19–39 (1971)
5. Balas, E., Saxena, A.: Optimizing over the split closure. Mathematical Program-
ming, Ser. A 113, 219–240 (2008)
6. Bixby, R.E., Ceria, S., McZeal, C.M., Savelsbergh, M.W.P.: An updated mixed
integer programming library: MIPLIB 3.0. Optima 58, 12–15 (1998)
7. Caprara, A., Letchford, A.: On the separation of split cuts and related inequalities.
Mathematical Programming, Ser. A 94, 279–294 (2003)
8. Cook, W.J., Kannan, R., Schrijver, A.: Chvátal closures for mixed integer pro-
gramming problems. Mathematical Programming 47, 155–174 (1990)
9. Cornuéjols, G., Margot, F.: On the Facets of Mixed Integer Programs with Two
Integer Variables and Two Constraints. Mathematical Programming, Ser. A 120,
429–456 (2009)
10. Gomory, R.E.: An algorithm for the mixed integer problem. Technical Report RM-
2597, The Rand Corporation (1960a)
11. Nemhauser, G., Wolsey, L.A.: A recursive procedure to generate all cuts for 0-1
mixed integer programs. Mathematical Programming, Ser. A 46, 379–390 (1990)
Prize-Collecting Steiner Network Problems

MohammadTaghi Hajiaghayi¹, Rohit Khandekar², Guy Kortsarz³, and Zeev Nutov⁴

¹ AT&T Research Lab Research
  [email protected]
² IBM T.J.Watson Research Center
  [email protected]
³ Rutgers University, Camden
  [email protected]
⁴ The Open University of Israel
  [email protected]

Abstract. In the Steiner Network problem we are given a graph G with


edge-costs and connectivity requirements ruv between node pairs u, v.
The goal is to find a minimum-cost subgraph H of G that contains ruv
edge-disjoint paths for all u, v ∈ V . In Prize-Collecting Steiner Network
problems we do not need to satisfy all requirements, but are given a
penalty function for violating the connectivity requirements, and the goal
is to find a subgraph H that minimizes the cost plus the penalty. The case
when ruv ∈ {0, 1} is the classic Prize-Collecting Steiner Forest problem.
In this paper we present a novel linear programming relaxation for
the Prize-Collecting Steiner Network problem, and by rounding it, obtain
the first constant-factor approximation algorithm for submodular and
monotone non-decreasing penalty functions. In particular, our setting
includes all-or-nothing penalty functions, which charge the penalty even
if the connectivity requirement is slightly violated; this resolves an open
question posed in [SSW07]. We further generalize our results for element-
connectivity and node-connectivity.

1 Introduction
Prize-collecting Steiner problems are well-known network design problems with
several applications in expanding telecommunications networks (see for exam-
ple [JMP00, SCRS00]), cost sharing, and Lagrangian relaxation techniques (see
e.g. [JV01, CRW01]). A general form of these problems is the Prize-Collecting
Steiner Forest problem¹: given a network (graph) G = (V, E), a set of
source-sink pairs P = {{s_1, t_1}, {s_2, t_2}, . . . , {s_k, t_k}}, a non-negative cost function
c : E → ℝ+, and a non-negative penalty function π : P → ℝ+, our goal is to find
a minimum-cost way of installing (buying) a set of links (edges) and paying the
penalty for those pairs which are not connected via installed links. When all
penalties are ∞, the problem is the classic APX-hard Steiner Forest problem, for
which the best known approximation ratio is 2 − 2/n (n is the number of nodes of
the graph) due to Agrawal, Klein, and Ravi [AKR95] (see also [GW95] for a more
general result and a simpler analysis). The case of Prize-Collecting Steiner Forest
problem when all sinks are identical is the classic Prize-Collecting Steiner Tree
problem. Bienstock, Goemans, Simchi-Levi, and Williamson [BGSLW93] first
considered this problem (based on a problem earlier proposed by Balas [Bal89])
and gave for it a 3-approximation algorithm. The current best ratio for this
problem is 1.992 by Archer, Bateni, Hajiaghayi, and Karloff [ABHK09],
improving upon a primal-dual (2 − 1/(n−1))-approximation algorithm of Goemans
and Williamson [GW95]. When in addition all penalties are ∞, the problem is
the classic Steiner Tree problem, which is known to be APX-hard [BP89] and
for which the best approximation ratio is 1.55 [RZ05]. Very recently, Byrka et
al. [BGRS10] have announced an improved approximation algorithm for the
Steiner tree problem.

Part of this work was done while the authors were meeting at DIMACS. We would
like to thank DIMACS for hospitality.
Partially supported by NSF Award Grant number 0829959.
¹ In the literature, this problem is also called “prize-collecting generalized Steiner
tree”.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 71–84, 2010.
© Springer-Verlag Berlin Heidelberg 2010

The general form of the Prize-Collecting Steiner Forest problem was first
formulated by Hajiaghayi and Jain [HJ06]. They showed how applying a primal-dual
algorithm to a novel integer programming formulation of the problem with
doubly-exponential variables yields a 3-approximation algorithm for
the problem. In addition, they show that the factor 3 in the analysis of their
algorithm is tight. However, they show how a direct randomized LP-rounding
algorithm with approximation factor 2.54 can be obtained for this problem. Their
approach has been generalized by Sharma, Swamy, and Williamson [SSW07] for
network design problems where violated arbitrary 0-1 connectivity constraints
are allowed in exchange for a very general penalty function. The work of Ha-
jiaghayi and Jain has also motivated a game-theoretic version of the problem
considered by Gupta et al. [GKL+ 07].
In this paper, we consider a much more general high-connectivity version of
Prize-Collecting Steiner Forest, called Prize-Collecting Steiner Network, in which
we are also given connectivity requirements ruv for pairs of nodes u and v and
a penalty function in case we do not satisfy all $r_{uv}$. Our goal is to find a
minimum-cost way of constructing a network (graph) in which we connect u and v with
$r'_{uv} \le r_{uv}$ edge-disjoint paths, paying a penalty for all violated connectivity
requirements between source-sink pairs. This problem can arise in real-world network design,
in which a typical client not only might want to connect to the network but
also might want to connect via a few disjoint paths (e.g., to have a higher band-
width or redundant connections in case of edge failures) and a penalty might
be charged if we cannot satisfy its connectivity requirement. When all penalties
are ∞, the problem is the classic Steiner Network problem. Improving on a long
line of earlier research that applied primal-dual methods, Jain [Jai01] obtained
a 2-approximation algorithm for Steiner Network using the iterative rounding
method. This algorithm was generalized to so called “element-connectivity” by
Fleischer, Jain, and Williamson [FJW01] and by Cheriyan, Vempala, and Vetta
[CVV06]. Recently, some results were obtained for the node-connectivity version;
the currently best known ratios for the node-connectivity case are $O(R^3 \log n)$
for general requirements [CK09] and $O(R^2)$ for rooted requirements [Nut09],
where $R = \max_{u,v \in V} r_{uv}$ is the maximum requirement. See also the survey by
Kortsarz and Nutov [KN07] for various min-cost connectivity problems.
Hajiaghayi and Nasri [HN10] generalize the iterative rounding approach of
Jain to Prize-Collecting Steiner Network when there is a separate non-increasing
marginal penalty function for each pair u, v whose ruv -connectivity requirement
is not satisfied. They obtain an iterative rounding 3-approximation algorithm for
this case. For the special case when penalty functions are linear in the violation
of the connectivity requirements, Nagarajan, Sharma, and Williamson [NSW08],
using Jain's iterative rounding algorithm as a black box, give a 2.54-factor
approximation algorithm. They also generalize the 0-1 requirements of the Prize-Collecting
Steiner Forest problem introduced by Sharma, Swamy, and Williamson [SSW07]
to include general connectivity requirements. Assuming the monotone
submodular penalty function of Sharma et al. is generalized to a multiset function that
can be decomposed into functions of the same type as that of Sharma et al.,
they give an O(log R)-approximation algorithm (recall that R is the maximum
connectivity requirement). In this algorithm, they assume that each edge can be
used possibly many times (without bound). They raise the question whether a
constant ratio can be obtained without all these assumptions, when the penalty is a
submodular multi-set function of the set of disconnected pairs. More importantly,
they pose as an open problem the design of a good approximation algorithm for the
all-or-nothing version of penalty functions: penalty functions which charge the
penalty even if the connectivity requirement is slightly violated. In this paper,
we answer affirmatively all these open problems by giving the first constant-factor
(2.54) approximation algorithm, which is based on a novel LP formulation
of the problem. We further generalize our results for element-connectivity and
node-connectivity. In fact, for all types of connectivities, we prove a very gen-
eral result (see Theorem 1) stating that if Steiner Network (the version without
penalties) admits an LP-based ρ-approximation algorithm, then the correspond-
ing prize-collecting version admits a (ρ + 1)-approximation algorithm.

1.1 Problems We Consider


In this section, we define formally the terms used in the paper. For a subset
S of nodes in a graph H, let $\lambda_H^S(u, v)$ denote the S-connectivity between u
and v in H, namely, the maximum number of edge-disjoint uv-paths in H so
that no two of them have a node in S − {u, v} in common. In the Generalized
Steiner-Network (GSN) problem we are given a graph G = (V, E) with edge-costs
$\{c_e \ge 0 \mid e \in E\}$, a node subset S ⊆ V, a collection $\{u_1, v_1\}, \ldots, \{u_k, v_k\}$ of node
pairs from V, and S-connectivity requirements $r_1, \ldots, r_k$. The goal is to find a
minimum cost subgraph H of G so that $\lambda_H^S(u_i, v_i) \ge r_i$ for all i. Extensively
studied particular cases of GSN are: the Steiner Network problem, also called
Edge-Connectivity GSN (S = ∅), Node-Connectivity GSN (S = V), and
Element-Connectivity GSN ($S \cap \{u_i, v_i\} = \emptyset$ for all i). The case of rooted requirements
is when there is a “root” s that belongs to all pairs $\{u_i, v_i\}$. We consider the
following “prize-collecting” version of GSN.

All-or-Nothing Prize Collecting Generalized Steiner Network (PC-GSN):

Instance: A graph $G = (V, E)$ with edge-costs $\{c_e \ge 0 \mid e \in E\}$,
$S \subseteq V$, a collection $\{u_1, v_1\}, \ldots, \{u_k, v_k\}$ of node pairs
from $V$, $S$-connectivity requirements $r_1, \ldots, r_k > 0$, and a penalty
function $\pi : 2^{\{1,\ldots,k\}} \to \mathbb{R}_+$.

Objective: Find a subgraph $H$ of $G$ that minimizes the value

$$\mathrm{val}(H) = c(H) + \pi(\mathrm{unsat}(H))$$

of $H$, where $\mathrm{unsat}(H) = \{i \mid \lambda_H^S(u_i, v_i) < r_i\}$ is the
set of requirements not satisfied by $H$.
We will assume that the penalty function π is given by an evaluation oracle.
We will also assume that π is submodular, namely, that π(A) + π(B) ≥ π(A ∩
B) + π(A ∪ B) for all A, B and that it is monotone non-decreasing, namely,
π(A) ≤ π(B) for all A, B with A ⊆ B. As was mentioned, approximating the
edge-connectivity variant of PC-GSN was posed as the main open problem by
Nagarajan, Sharma, and Williamson [NSW08]. We resolve this open problem for
the submodular penalty functions π considered here.
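
The two assumptions on π can be checked by brute force on a small ground set. The following is a minimal sketch, not part of the paper: the helper names and the two sample penalty functions (`sqrt_penalty`, `squared_penalty`) are hypothetical, and the enumeration is exponential in k, so it is only meant for toy instances.

```python
from itertools import combinations

def subsets(ground):
    """Yield every subset of `ground` as a frozenset."""
    items = list(ground)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def is_submodular(pi, ground, tol=1e-9):
    """Check pi(A) + pi(B) >= pi(A & B) + pi(A | B) for all A, B."""
    all_sets = list(subsets(ground))
    return all(pi(a) + pi(b) + tol >= pi(a & b) + pi(a | b)
               for a in all_sets for b in all_sets)

def is_monotone(pi, ground, tol=1e-9):
    """Check pi(A) <= pi(B) whenever A is a subset of B."""
    all_sets = list(subsets(ground))
    return all(pi(a) <= pi(b) + tol
               for a in all_sets for b in all_sets if a <= b)

# Hypothetical penalty oracle: concave in the number of unsatisfied pairs.
def sqrt_penalty(unsat):
    return len(unsat) ** 0.5

# Hypothetical non-example: convex in the cardinality, hence not submodular.
def squared_penalty(unsat):
    return len(unsat) ** 2

K = {1, 2, 3, 4}
print(is_submodular(sqrt_penalty, K), is_monotone(sqrt_penalty, K))
print(is_submodular(squared_penalty, K), is_monotone(squared_penalty, K))
```

The second function is monotone but violates submodularity already for two disjoint singletons, which is why it falls outside the assumptions made here.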
We next define the second problem we consider.

Generalized Steiner Network with Generalized Penalties (GSN-GP):

Instance: A graph $G = (V, E)$ with edge-costs $\{c_e \ge 0 \mid e \in E\}$,
$S \subseteq V$, a collection $\{u_1, v_1\}, \ldots, \{u_k, v_k\}$ of node pairs
from $V$, and non-increasing penalty functions
$p_1, \ldots, p_k : \{0, 1, \ldots, n-1\} \to \mathbb{R}_+$.

Objective: Find a subgraph $H$ of $G$ that minimizes the value

$$\mathrm{val}'(H) = c(H) + \sum_{i=1}^{k} p_i\bigl(\lambda_H^S(u_i, v_i)\bigr).$$

The above problem captures general penalty functions of the $S$-connectivity
$\lambda^S(u_i, v_i)$ for given pairs $\{u_i, v_i\}$. It is natural to assume
that the penalty functions are non-increasing, i.e., we pay less in the
objective function if the achieved connectivity is more. This problem was posed
as an open question by Nagarajan et al. [NSW08]. In this paper, we use the
convention that $p_i(n) = 0$ for all $i$.
We need some definitions to introduce our results. A pair $T = \{T', T''\}$ of
subsets of $V$ is called a setpair (of $V$) if $T' \cap T'' = \emptyset$. Let
$K = \{1, \ldots, k\}$. Let $T = \{T', T''\}$ be a setpair of $V$. We denote by
$\delta(T)$ the set of edges in $E$ between $T'$ and $T''$. For $i \in K$ we use
$T \odot (i, S)$ to denote that $|T' \cap \{u_i, v_i\}| = 1$,
$|T'' \cap \{u_i, v_i\}| = 1$ and $V \setminus (T' \cup T'') \subseteq S$. While
in the case of edge-connectivity a “cut” consists of edges only, in the case of
$S$-connectivity a cut that separates $u$ and $v$ is “mixed”, meaning it may
contain both edges in the graph and nodes from $S$. Note that if
$T \odot (i, S)$ then $\delta(T) \cup (V \setminus (T' \cup T''))$ is such a
mixed cut that separates $u_i$ and $v_i$. Intuitively, Menger’s Theorem for
$S$-connectivity (cf. [KN07]) states that the $S$-connectivity between $u_i$ and
$v_i$ equals the minimum size of such a mixed cut. Formally, for a node pair
$u_i, v_i$ of a graph $H = (V, E)$ and $S \subseteq V$ we have:

$$\lambda_H^S(u_i, v_i) = \min_{T \odot (i,S)} \bigl(|\delta(T)| + |V \setminus (T' \cup T'')|\bigr) = \min_{T \odot (i,S)} \bigl(|\delta(T)| + |V| - (|T'| + |T''|)\bigr).$$

Hence if $\lambda_H^S(u_i, v_i) \ge r_i$ for a graph $H = (V, E)$, then for any
setpair $T$ with $T \odot (i, S)$ we must have $|\delta(T)| \ge r_i(T)$, where
$r_i(T) = \max\{r_i + |T'| + |T''| - |V|, 0\}$. Consequently, a standard
“cut-type” LP-relaxation of the GSN problem is as follows (cf. [KN07]):

$$\min \Bigl\{ \sum_{e \in E} c_e x_e \;\Big|\; \sum_{e \in \delta(T)} x_e \ge r_i(T) \ \ \forall T \odot (i,S),\ \forall i \in K, \quad x_e \in [0,1] \ \ \forall e \Bigr\}. \tag{1}$$

1.2 Our Results


We introduce a novel LP relaxation of the problem which is shown to be bet-
ter, in terms of the integrality gap, than a “natural” LP relaxation considered
in [NSW08]. Using our LP relaxation, we prove the following main result.
Theorem 1. Suppose that there exists a polynomial time algorithm that computes
an integral solution to LP (1) of cost at most ρ times the optimal value of
LP (1) for any subset of node pairs. Then PC-GSN admits a
$(1 - e^{-1/\rho})^{-1}$-approximation algorithm, provided that the penalty
function π is submodular and monotone non-decreasing.
Note that since $1 - \frac{1}{\rho} < e^{-1/\rho} < 1 - \frac{1}{\rho+1}$ holds
for $\rho \ge 1$, we have $\rho < (1 - e^{-1/\rho})^{-1} < \rho + 1$.
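
As a quick numerical illustration of the bound in Theorem 1, the following sketch evaluates the ratio for a few values of ρ and confirms that it lies strictly between ρ and ρ + 1. The function name is an assumption made for this example only.

```python
import math

def prize_collecting_ratio(rho):
    """Return (1 - e^{-1/rho})^{-1}, the ratio stated in Theorem 1."""
    return 1.0 / (1.0 - math.exp(-1.0 / rho))

for rho in (1, 2, 5, 10):
    ratio = prize_collecting_ratio(rho)
    # The note after Theorem 1 says rho < ratio < rho + 1 for every rho >= 1.
    print(rho, round(ratio, 4), rho < ratio < rho + 1)
```

For ρ = 2 the value is roughly 2.54, the constant quoted for edge- and element-connectivity.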
Let $R = \max_i r_i$ denote the maximum requirement. The best known values of
ρ are as follows: 2 for Edge-GSN [Jai01], 2 for Element-GSN [FJW01, CVV06],
$O(R^3 \log |V|)$ for Node-GSN [CK09], and $O(R^2)$ for Node-GSN with rooted
requirements [Nut09]. Substituting these values in Theorem 1, we obtain:
Corollary 1. PC-GSN problems admit the following approximation ratios provided
that the penalty function π is submodular and monotone non-decreasing:
2.54 for edge- and element-connectivity, $O(R^3 \log |V|)$ for
node-connectivity, and $O(R^2)$ for node-connectivity with rooted requirements.
Our results for GSN-GP follow from Corollary 1.
Corollary 2. GSN-GP problems admit the following approximation ratios: 2.54
for edge- and element-connectivity, $O(R^3 \log |V|)$ for node-connectivity, and
$O(R^2)$ for node-connectivity with rooted requirements. Here
$R = \max_{1 \le i \le k} \min\{\lambda \ge 0 \mid p_i(\lambda) = 0\}$.

Proof. We present an approximation ratio preserving reduction from the GSN-GP
problem to the corresponding PC-GSN problem. Given an instance of the GSN-GP
problem, we create an instance of the PC-GSN problem as follows. The PC-GSN
instance inherits the graph $G$, its edge-costs, and the set $S$. Let
$(u_i, v_i)$ be a pair in GSN-GP and let
$R_i = \min\{\lambda \ge 0 \mid p_i(\lambda) = 0\}$. We introduce $R_i$ copies of
this pair, $\{(u_i^1, v_i^1), \ldots, (u_i^{R_i}, v_i^{R_i})\}$, to the set of
pairs in the PC-GSN instance. We set the edge-connectivity requirement of a pair
$(u_i^t, v_i^t)$ to be $t$ for $1 \le t \le R_i$. We also set the penalty
function for singleton sets as follows:
$\pi(\{(u_i^t, v_i^t)\}) = p_i(t-1) - p_i(t)$ for all $1 \le t \le R_i$.
Finally, we extend this function π to a set of pairs $P$ by linearity, i.e.,
$\pi(P) = \sum_{p \in P} \pi(\{p\})$. Note that such a function π is clearly
submodular and monotone non-decreasing.
It is sufficient to show that for any subgraph $H$ of $G$, its value in the
GSN-GP instance equals its value in the PC-GSN instance, i.e.,
$\mathrm{val}'(H) = \mathrm{val}(H)$; then we can use the algorithm from
Corollary 1 to complete the proof. Fix a pair $(u_i, v_i)$ in the GSN-GP
instance. Let $\lambda_H^S(u_i, v_i) = t_i$. Thus the contribution of pair
$(u_i, v_i)$ to the objective function $\mathrm{val}'(H)$ of the GSN-GP instance
is $p_i(t_i)$. On the other hand, since π is linear, the total contribution of
pairs $\{(u_i^1, v_i^1), \ldots, (u_i^{R_i}, v_i^{R_i})\}$ to the objective
function $\mathrm{val}(H)$ of the PC-GSN instance is

$$\sum_{t=t_i+1}^{R_i} \pi(\{(u_i^t, v_i^t)\}) = \sum_{t=t_i+1}^{R_i} \bigl(p_i(t-1) - p_i(t)\bigr) = p_i(t_i),$$

where the last equality is a telescoping sum that uses $p_i(R_i) = 0$. Note that
the pairs $(u_i^t, v_i^t)$ for $1 \le t \le t_i$ do not incur any penalty.
Summing up over all pairs, we conclude that $\mathrm{val}'(H) = \mathrm{val}(H)$,
as claimed.
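
The telescoping argument in the reduction can be replayed on a single pair. This is a minimal sketch under stated assumptions: the function name and the sample penalty table `p_i` are hypothetical, and penalties are stored as a list indexed by the achieved connectivity.

```python
def pc_gsn_penalty_from_gsn_gp(p, achieved):
    """Penalty paid in the PC-GSN instance built from one GSN-GP pair.

    p        -- non-increasing list with p[t] = p_i(t); it must contain a 0
    achieved -- the connectivity t_i attained by the subgraph H
    """
    R = min(t for t, value in enumerate(p) if value == 0)  # R_i
    # Copy t (requirement t) carries singleton penalty p(t-1) - p(t).
    singleton = {t: p[t - 1] - p[t] for t in range(1, R + 1)}
    # Copies with requirement t > achieved are unsatisfied; pi is linear.
    return sum(singleton[t] for t in range(achieved + 1, R + 1))

p_i = [10, 7, 3, 0, 0]   # hypothetical penalties p_i(0), ..., p_i(4)
for t_i in range(len(p_i)):
    print(t_i, pc_gsn_penalty_from_gsn_gp(p_i, t_i), p_i[t_i])
```

For every achieved connectivity the two printed penalties agree, which is exactly the per-pair equality used in the proof.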

2 A New LP Relaxation

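We use the following LP-relaxation for the PC-GSN problem. We introduce
variables $x_e$ for $e \in E$ ($x_e = 1$ if $e \in H$), $f_{i,e}$ for $i \in K$
and $e \in E$ ($f_{i,e} = 1$ if $i \notin \mathrm{unsat}(H)$ and $e$ appears on a
chosen set of $r_i$ $S$-disjoint $\{u_i, v_i\}$-paths in $H$), and $z_I$ for
$I \subseteq K$ ($z_I = 1$ if $I = \mathrm{unsat}(H)$).

$$
\begin{array}{lll}
\text{Minimize} & \sum_{e \in E} c_e x_e + \sum_{I \subseteq K} \pi(I) z_I & \\[4pt]
\text{Subject to} & \sum_{e \in \delta(T)} f_{i,e} \ge \bigl(1 - \sum_{I : i \in I} z_I\bigr)\, r_i(T) & \forall i\ \ \forall T \odot (i, S) \\
& f_{i,e} \le 1 - \sum_{I : i \in I} z_I & \forall i\ \ \forall e \\
& x_e \ge f_{i,e} & \forall i\ \ \forall e \\
& \sum_{I \subseteq K} z_I = 1 & \\
& x_e, f_{i,e}, z_I \in [0, 1] & \forall i\ \ \forall e\ \ \forall I
\end{array}
\tag{2}
$$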

We first prove that (2) is a valid LP-relaxation of the PC-GSN problem.


Lemma 1. The optimal value of LP (2) is at most the optimal solution value
to the PC-GSN problem. Moreover, if π is monotone non-decreasing, the opti-
mum solution value to the PC-GSN problem is at most the value of the optimum
integral solution of LP (2).

Proof. Given a feasible solution $H$ to the PC-GSN problem define a feasible
solution to LP (2) as follows. Let $x_e = 1$ if $e \in H$ and $x_e = 0$
otherwise. Let $z_I = 1$ if $I = \mathrm{unsat}(H)$ and $z_I = 0$ otherwise. For
each $i \in \mathrm{unsat}(H)$ set $f_{i,e} = 0$ for all $e \in E$, while for
$i \notin \mathrm{unsat}(H)$ the variables $f_{i,e}$ take values as follows: fix
a set of $r_i$ pairwise $S$-disjoint $\{u_i, v_i\}$-paths, and let $f_{i,e} = 1$
if $e$ belongs to one of these paths and $f_{i,e} = 0$ otherwise. The defined
solution is feasible for LP (2): the first set of constraints is satisfied by
Menger’s Theorem for $S$-connectivity, while the remaining constraints are
satisfied by the above definition of variables. It is also easy to see that the
above solution has value exactly $\mathrm{val}(H)$.
If π is monotone non-decreasing, we prove that for any integral solution
$\{x_e, f_{i,e}, z_I\}$ to (2), the graph $H$ with edge-set
$\{e \in E \mid x_e = 1\}$ has $\mathrm{val}(H)$ at most the value of the
solution $\{x_e, f_{i,e}, z_I\}$. To see this, first note that there is a unique
set $I \subseteq K$ with $z_I = 1$, since the variables $z_I$ are integral and
$\sum_{I \subseteq K} z_I = 1$. Now consider an index $i \notin I$. Since
$\sum_{J : i \in J} z_J = 0$, we have
$\sum_{e \in \delta(T)} x_e \ge \sum_{e \in \delta(T)} f_{i,e} \ge r_i(T)$ for all
$T \odot (i, S)$. This implies that $i \notin \mathrm{unsat}(H)$, by Menger’s
Theorem for $S$-connectivity. Consequently, $\mathrm{unsat}(H) \subseteq I$,
hence $\pi(\mathrm{unsat}(H)) \le \pi(I)$ by the monotonicity of π. Thus
$\mathrm{val}(H) = c(H) + \pi(\mathrm{unsat}(H)) \le \sum_{e \in E} c_e x_e + \sum_{I \subseteq K} \pi(I) z_I$
and the lemma follows.

2.1 Why Does a “Natural” LP Relaxation Not Work?


One may be tempted to consider a natural LP without using the flow variables
$f_{i,e}$, namely, the LP obtained from LP (2) by replacing the first three sets
of constraints by the set of constraints

$$\sum_{e \in \delta(T)} x_e \ge \Bigl(1 - \sum_{I : i \in I} z_I\Bigr)\, r_i(T)$$

for all $i$ and $T \odot (i, S)$. Here is an example demonstrating that the
integrality gap of this LP can be as large as $R = \max_i r_i$ even for
edge-connectivity. Let $G$ consist of $R - 1$ edge-disjoint paths between two
nodes $s$ and $t$. All the edges have cost 0. There is only one pair
$\{u_1, v_1\} = \{s, t\}$ that has requirement $r_1 = R$ and penalty
$\pi(\{1\}) = 1$. Let $\pi(\emptyset) = 0$. Clearly, π is submodular and
monotone non-decreasing. We have $S = \emptyset$. No integral solution can
satisfy the requirement $r_1$, hence an optimal integral solution pays the
penalty $\pi(\{1\})$ and has value 1. A feasible fractional solution (without
the flow variables) sets $x_e = 1$ for all $e$, and sets $z_{\{1\}} = 1/R$,
$z_\emptyset = 1 - 1/R$. The new set of constraints is satisfied since
$\sum_{e \in \delta(T)} x_e \ge R - 1 = (1 - 1/R) \cdot R = (1 - z_{\{1\}})\, r_1(T)$
for any $\{s, t\}$-cut $T$. Thus the optimal LP-value is at most $1/R$, giving a
gap of at least $R$.
With flow variables, however, we have an upper bound
$f_{1,e} \le 1 - z_{\{1\}}$. Since there is an $\{s, t\}$-cut $T$ with
$|\delta(T)| = R - 1$, we cannot satisfy the constraints
$\sum_{e \in \delta(T)} f_{1,e} \ge (1 - z_{\{1\}})\, r_1(T)$ and
$f_{1,e} \le 1 - z_{\{1\}}$ simultaneously unless we set $z_{\{1\}} = 1$. Thus in
this case, our LP (2) with flow variables has the same optimal value as the
integral optimum.
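
The gap example can be checked with exact arithmetic. The sketch below makes one simplifying assumption that is not in the text: each of the R − 1 paths is taken to be a single edge, so the minimum {s, t}-cut consists of exactly R − 1 edges. The function names are hypothetical.

```python
from fractions import Fraction

def natural_lp_feasible(R, z):
    """Natural LP on R-1 parallel s-t edges with x_e = 1: sum x_e >= (1 - z) R ?"""
    cut_value = R - 1                      # x_e = 1 on each of the R-1 edges
    return cut_value >= (1 - z) * R

def flow_lp_feasible(R, z):
    """LP (2) on the same instance: needs f_e <= 1 - z and sum f_e >= (1 - z) R."""
    best_cut_flow = (R - 1) * (1 - z)      # largest possible sum of f_e over the cut
    return best_cut_flow >= (1 - z) * R

R = 5
z_frac = Fraction(1, R)
print(natural_lp_feasible(R, z_frac))   # fractional solution of value 1/R is accepted
print(flow_lp_feasible(R, z_frac))      # the same z is rejected once f is bounded
print(flow_lp_feasible(R, 1))           # only z = 1 (pay the full penalty) survives
```

Since all edge costs are 0 and π({1}) = 1, the accepted fractional solution has objective value z = 1/R, while LP (2) forces value 1.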

2.2 Some Technical Results Regarding LP (2)

We will prove the following two statements that together imply Theorem 1.
Lemma 2. Any basic feasible solution to (2) has a polynomial number of non-
zero variables. Furthermore, an optimal basic solution to (2) (the non-zero en-
tries) can be computed in polynomial time.

Lemma 3. Suppose that there exists a polynomial time algorithm that computes
an integral solution to LP (1) of cost at most ρ times the optimal value of LP (1)
for any subset of node pairs. Then there exists a polynomial time algorithm that
given a feasible solution to (2) computes as a solution to PC-GSN a subgraph H
of G so that val(H) = c(H) + π(unsat(H)) is at most $(1 - e^{-1/\rho})^{-1}$ times the
value of this solution, assuming π is submodular and monotone non-decreasing.

Before proving these lemmas, we prove some useful results. The following
statement can be deduced from a theorem of Edmonds for polymatroids (cf. [KV02,
Chapter 14.2]), as the dual LP d(γ) in the lemma seeks to optimize a linear
function over a polymatroid. We provide a direct proof for completeness of
exposition.
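Lemma 4. Let $\gamma \in [0, 1]^k$ be a vector. Consider a primal LP

$$p(\gamma) := \min \Bigl\{ \sum_{I \subseteq K} \pi(I) z_I \;\Big|\; \sum_{I : i \in I} z_I \ge \gamma_i \ \ \forall i \in K, \quad z_I \ge 0 \ \ \forall I \subseteq K \Bigr\}$$

and its dual LP

$$d(\gamma) := \max \Bigl\{ \sum_{i \in K} \gamma_i y_i \;\Big|\; \sum_{i \in I} y_i \le \pi(I) \ \ \forall I \subseteq K, \quad y_i \ge 0 \ \ \forall i \in K \Bigr\}.$$

Let σ be a permutation of $K$ such that
$\gamma_{\sigma(1)} \le \gamma_{\sigma(2)} \le \ldots \le \gamma_{\sigma(k)}$. Let
us also use the notation that $\gamma_{\sigma(0)} = 0$. The optimum solutions to
$p(\gamma)$ and $d(\gamma)$ respectively are given by

$$z_I = \begin{cases} \gamma_{\sigma(i)} - \gamma_{\sigma(i-1)}, & \text{for } I = \{\sigma(i), \ldots, \sigma(k)\},\ i \in K \\ 0, & \text{otherwise;} \end{cases}$$

and

$$y_{\sigma(i)} = \pi(\{\sigma(i), \ldots, \sigma(k)\}) - \pi(\{\sigma(i+1), \ldots, \sigma(k)\}), \quad \text{for } i \in K.$$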

Proof. To simplify the notation, we assume without loss of generality that
$\gamma_1 \le \gamma_2 \le \cdots \le \gamma_k$, i.e., that σ is the identity
permutation.
We argue that the above defined $\{z_I\}$ and $\{y_i\}$ form feasible solutions
to the primal and dual LPs respectively. Note that $z_I \ge 0$ for all $I$ and
$\sum_{I : i \in I} z_I = \sum_{j=1}^{i} (\gamma_j - \gamma_{j-1}) = \gamma_i$ for
all $i$. Since π is monotone non-decreasing, the above defined $y_i$ satisfy
$y_i \ge 0$ for all $i$. Now fix $I \subseteq K$. Let $I = \{i_1, \ldots, i_p\}$
where $i_1 < \cdots < i_p$. Therefore

$$
\begin{aligned}
\sum_{i \in I} y_i = \sum_{j=1}^{p} y_{i_j} &= \sum_{j=1}^{p} \bigl[\pi(\{i_j, \ldots, k\}) - \pi(\{i_j + 1, \ldots, k\})\bigr] \\
&\le \sum_{j=1}^{p} \bigl[\pi(\{i_j, i_{j+1}, \ldots, i_p\}) - \pi(\{i_{j+1}, i_{j+2}, \ldots, i_p\})\bigr] \\
&= \pi(\{i_1, \ldots, i_p\}) = \pi(I).
\end{aligned}
$$

The above inequality holds because of the submodularity of π. Next observe that
the solutions $\{z_I\}$ and $\{y_i\}$ satisfy

$$\sum_{I} \pi(I) z_I = \sum_{i=1}^{k} \pi(\{i, \ldots, k\}) \cdot (\gamma_i - \gamma_{i-1}) = \sum_{i=1}^{k} \gamma_i \cdot \bigl(\pi(\{i, \ldots, k\}) - \pi(\{i+1, \ldots, k\})\bigr) = \sum_{i=1}^{k} \gamma_i \cdot y_i.$$

Thus from weak LP duality, they in fact form optimum solutions to primal and
dual LPs respectively.
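
The closed-form solutions of Lemma 4 are easy to compute directly. The following is a minimal sketch, assuming π is given as a Python function on frozensets of indices 0, ..., k−1 with π(∅) = 0 (an assumption of this example); the function name and the sample penalty `sqrt_penalty` are hypothetical.

```python
def greedy_primal_dual(pi, gamma):
    """Closed-form optimal solutions of p(gamma) and d(gamma) from Lemma 4.

    pi    -- set function on frozensets of indices 0..k-1 (assumed submodular,
             monotone non-decreasing, with pi(empty set) = 0)
    gamma -- list of k numbers in [0, 1]
    Returns (z, y): z maps suffix sets to weights, y is the dual vector.
    """
    k = len(gamma)
    order = sorted(range(k), key=lambda i: gamma[i])   # the permutation sigma
    z, y = {}, [0.0] * k
    previous = 0.0                                      # gamma_{sigma(0)} = 0
    for pos, i in enumerate(order):
        suffix = frozenset(order[pos:])                 # {sigma(pos), ..., sigma(k)}
        tail = frozenset(order[pos + 1:])
        if gamma[i] > previous:
            z[suffix] = gamma[i] - previous
        y[i] = pi(suffix) - pi(tail)
        previous = gamma[i]
    return z, y

def sqrt_penalty(unsat):            # hypothetical submodular, monotone penalty
    return len(unsat) ** 0.5

gamma = [0.2, 0.9, 0.5]
z, y = greedy_primal_dual(sqrt_penalty, gamma)
primal_value = sum(sqrt_penalty(I) * w for I, w in z.items())
dual_value = sum(g * yi for g, yi in zip(gamma, y))
print(primal_value, dual_value)
```

The two printed values coincide up to rounding, mirroring the equality of primal and dual objectives established in the proof.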
Recall that a sub-gradient of a convex function $g : \mathbb{R}^k \to \mathbb{R}$
at a point $\gamma \in \mathbb{R}^k$ is a vector $d \in \mathbb{R}^k$ such that
for any $\gamma' \in \mathbb{R}^k$, we have
$g(\gamma') - g(\gamma) \ge d \cdot (\gamma' - \gamma)$. For a differentiable
convex function $g$, the sub-gradient corresponds to the gradient $\nabla g$.
The function $p(\gamma)$ defined in Lemma 4 is essentially Lovász’s continuous
extension of the submodular function π. The fact that $p$ is convex and its
subgradient can be computed efficiently is given in [Fuj05]. We provide a full
proof for completeness of exposition.
Lemma 5. The function $p(\gamma)$ in Lemma 4 is convex and given
$\gamma \in [0, 1]^k$, both $p(\gamma)$ and its sub-gradient $\nabla p(\gamma)$
can be computed in polynomial time.
Proof. We first prove that $p$ is convex. Fix $\gamma_1, \gamma_2 \in [0, 1]^k$
and $\alpha \in [0, 1]$. To show that $p$ is convex, we will show
$p(\alpha \gamma_1 + (1 - \alpha) \gamma_2) \le \alpha p(\gamma_1) + (1 - \alpha) p(\gamma_2)$.
Let $\{z_I^1\}$ and $\{z_I^2\}$ be the optimum solutions of the primal LP
defining $p$ for $\gamma_1$ and $\gamma_2$ respectively. Note that the solution
$\{\alpha z_I^1 + (1 - \alpha) z_I^2\}$ is feasible for this LP for
$\gamma = \alpha \gamma_1 + (1 - \alpha) \gamma_2$. Thus the optimum solution has
value not greater than the value of this solution, which is
$\alpha p(\gamma_1) + (1 - \alpha) p(\gamma_2)$.
From Lemma 4, it is clear that given $\gamma \in [0, 1]^k$, the value
$p(\gamma)$ can be computed in polynomial time. Lemma 4 also implies that the
optimum dual solution $y^* = (y_1^*, \ldots, y_k^*) \in \mathbb{R}_+^k$ can be
computed in polynomial time. We now argue that $y^*$ is a sub-gradient of $p$ at
γ. Fix any $\gamma' \in \mathbb{R}^k$. First note that, from LP duality,
$p(\gamma) = y^* \cdot \gamma$. Thus we have

$$p(\gamma) + y^* \cdot (\gamma' - \gamma) = y^* \cdot \gamma + y^* \cdot (\gamma' - \gamma) = y^* \cdot \gamma' \le p(\gamma').$$

The last inequality holds from weak LP duality since $y^*$ is a feasible
solution for the dual LP $d(\gamma')$ as well. The lemma follows.
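
The sub-gradient inequality of Lemma 5 can be spot-checked on random points. This is a sketch only: the coverage-style penalty below is a hypothetical example of a monotone submodular function with π(∅) = 0, and p(γ) is evaluated through the identity p(γ) = y* · γ from the proof.

```python
import random

def lovasz_value_and_subgradient(pi, gamma):
    """Return p(gamma) and the dual optimum y* of Lemma 4 (a sub-gradient)."""
    k = len(gamma)
    order = sorted(range(k), key=lambda i: gamma[i])
    y = [0.0] * k
    for pos, i in enumerate(order):
        y[i] = pi(frozenset(order[pos:])) - pi(frozenset(order[pos + 1:]))
    value = sum(g * yi for g, yi in zip(gamma, y))      # p(gamma) = y* . gamma
    return value, y

def coverage_penalty(unsat):
    """Hypothetical submodular penalty: number of regions touched by unsat."""
    regions = {0: {"a"}, 1: {"a", "b"}, 2: {"b", "c"}, 3: {"c"}}
    touched = set()
    for i in unsat:
        touched |= regions[i]
    return float(len(touched))

rng = random.Random(0)
violations = 0
for _ in range(200):
    g1 = [rng.random() for _ in range(4)]
    g2 = [rng.random() for _ in range(4)]
    p1, y1 = lovasz_value_and_subgradient(coverage_penalty, g1)
    p2, _ = lovasz_value_and_subgradient(coverage_penalty, g2)
    # Sub-gradient inequality: p(g2) >= p(g1) + y1 . (g2 - g1).
    lower_bound = p1 + sum(yi * (b - a) for yi, a, b in zip(y1, g1, g2))
    if p2 < lower_bound - 1e-9:
        violations += 1
print(violations)
```

A count of zero violations is what the lemma predicts; a positive count would indicate that the supplied penalty is not submodular.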

3 Proof of Lemma 3

We now describe how to round LP (2) solutions to obtain a (ρ+1)-approximation
for PC-GSN. Later we show how to improve it to $(1 - e^{-1/\rho})^{-1}$. Let
$\{x_e^*, f_{i,e}^*, z_I^*\}$ be a feasible solution to LP (2). Let
$\alpha \in (0, 1)$ be a parameter to be fixed later. We partition the
requirements into two classes: we call a requirement $i \in K$ good if
$\sum_{I : i \in I} z_I^* \le \alpha$ and bad otherwise. Let $K_g$ denote the set
of good requirements. The following statement shows how to satisfy the good
requirements.

Lemma 6. There exists a polynomial time algorithm that computes a subgraph
$H$ of $G$ of cost $c(H) \le \frac{\rho}{1-\alpha} \cdot \sum_{e} c_e x_e^*$ that
satisfies all good requirements.

Proof. Consider the LP-relaxation (1) of the GSN problem with good requirements
only, with $K$ replaced by $K_g$; namely, we seek a minimum cost subgraph $H$ of
$G$ that satisfies the set $K_g$ of good requirements. We claim that
$x_e^{**} = \min\{1, x_e^*/(1 - \alpha)\}$ for each $e \in E$ is a feasible
solution to LP (1). Thus the optimum value of LP (1) is at most
$\sum_{e \in E} c_e x_e^{**}$. Consequently, using the algorithm that computes an
integral solution to LP (1) of cost at most ρ times the optimal value of LP (1),
we can construct a subgraph $H$ that satisfies all good requirements and has
cost $c(H) \le \rho \sum_{e \in E} c_e x_e^{**} \le \frac{\rho}{1-\alpha} \sum_{e} c_e x_e^*$,
as desired.
We now show that $\{x_e^{**}\}$ is a feasible solution to LP (1), namely, that
$\sum_{e \in \delta(T)} x_e^{**} \ge r_i(T)$ for any $i \in K_g$ and any
$T \odot (i, S)$. Let $i \in K_g$ and let
$\zeta_i = 1 - \sum_{I : i \in I} z_I^*$. Note that $\zeta_i \ge 1 - \alpha$, by
the definition of $K_g$. By the second and the third sets of constraints in
LP (2), for every $e \in E$ we have $\min\{\zeta_i, x_e^*\} \ge f_{i,e}^*$. Thus
we obtain:

$$x_e^{**} = \min\Bigl\{1, \frac{x_e^*}{1-\alpha}\Bigr\} = \frac{1}{\zeta_i} \min\Bigl\{\zeta_i, \frac{\zeta_i}{1-\alpha}\, x_e^*\Bigr\} \ge \frac{1}{\zeta_i} \min\{\zeta_i, x_e^*\} \ge \frac{f_{i,e}^*}{\zeta_i} = \frac{f_{i,e}^*}{1 - \sum_{I : i \in I} z_I^*}.$$

Consequently, combining with the first set of constraints in LP (2), for any
$T \odot (i, S)$ we obtain that

$$\sum_{e \in \delta(T)} x_e^{**} \ge \frac{\sum_{e \in \delta(T)} f_{i,e}^*}{1 - \sum_{I : i \in I} z_I^*} \ge r_i(T).$$
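
The preprocessing used in this proof, splitting the requirements by the threshold α and scaling the edge variables, can be sketched as follows. The function name, the dictionary-based encoding of x* and z*, and the sample data are assumptions made for illustration; the ρ-approximation for LP (1) itself is not shown.

```python
def split_and_scale(x_star, z_star, k, alpha):
    """Split requirements into good/bad and scale x* as in the proof of Lemma 6.

    x_star -- dict {edge: x*_e}
    z_star -- dict {frozenset I: z*_I} over subsets of {0, ..., k-1}
    Returns (good, x_scaled) with x_scaled[e] = min(1, x*_e / (1 - alpha)).
    """
    gamma = [sum(w for I, w in z_star.items() if i in I) for i in range(k)]
    good = [i for i in range(k) if gamma[i] <= alpha]
    x_scaled = {e: min(1.0, xe / (1.0 - alpha)) for e, xe in x_star.items()}
    return good, x_scaled

# Hypothetical fractional solution with two requirements and two edges.
z_example = {frozenset(): 0.6, frozenset({1}): 0.3, frozenset({0, 1}): 0.1}
x_example = {"e1": 0.5, "e2": 0.9}
good, x_scaled = split_and_scale(x_example, z_example, 2, 1.0 / 3.0)
print(good, x_scaled)
```

Here requirement 0 has total z-mass 0.1 and is good, while requirement 1 has mass 0.4 and is bad for α = 1/3.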

Let H be as in Lemma 6, and recall that unsat(H) denotes the set of require-
ments not satisfied by H. Clearly each requirement i ∈ unsat(H) is bad. The
following lemma bounds the total penalty we pay for unsat(H).

Lemma 7. $\pi(\mathrm{unsat}(H)) \le \frac{1}{\alpha} \cdot \sum_{I} \pi(I) z_I^*$.

Proof. Define $\gamma \in [0, 1]^k$ as follows: $\gamma_i = 1$ if
$i \in \mathrm{unsat}(H)$ and 0 otherwise. Now consider LP $p(\gamma)$ defined in
Lemma 4. Since each $i \in \mathrm{unsat}(H)$ is bad, from the definition of bad
requirements, it is clear that $\{z_I^*/\alpha\}$ is a feasible solution to LP
$p(\gamma)$. Furthermore, from Lemma 4, the solution $\{z_I\}$ defined as
$z_I = 1$ if $I = \mathrm{unsat}(H)$ and 0 otherwise is the optimum solution to
$p(\gamma)$. The cost of this solution, $\pi(\mathrm{unsat}(H))$, is therefore at
most the cost of the feasible solution $\{z_I^*/\alpha\}$, which is
$\frac{1}{\alpha} \cdot \sum_{I} \pi(I) z_I^*$. The lemma thus follows.

Combining Lemmas 6 and 7, we obtain a
$\max\{\frac{\rho}{1-\alpha}, \frac{1}{\alpha}\}$-approximation. If we
substitute $\alpha = 1/(\rho + 1)$, we obtain a (ρ + 1)-approximation for PC-GSN.
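
The choice α = 1/(ρ + 1) balances the two terms of the maximum. A small numerical sketch (the function name and the grid search are illustrative assumptions):

```python
def two_phase_ratio(rho, alpha):
    """max{rho / (1 - alpha), 1 / alpha}: the guarantee from Lemmas 6 and 7."""
    return max(rho / (1.0 - alpha), 1.0 / alpha)

rho = 2.0
balanced = two_phase_ratio(rho, 1.0 / (rho + 1.0))     # both terms equal rho + 1
# Coarse grid search over alpha in (0, 1) for comparison.
grid_best = min(two_phase_ratio(rho, a / 1000.0) for a in range(1, 1000))
print(balanced, grid_best)
```

No grid point does better than the balanced choice, since the first term increases in α and the second decreases.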

Improving the Approximation to $(1 - e^{-1/\rho})^{-1}$. We use a technique
introduced by Goemans as follows. We pick α uniformly at random from the
interval $(0, \beta]$ where $\beta = 1 - e^{-1/\rho}$. From Lemmas 6 and 7, the
expected cost of the solution is at most

$$E_\alpha\Bigl[\frac{\rho}{1-\alpha}\Bigr] \cdot \sum_{e \in E} c_e x_e^* + E_\alpha\bigl[\pi(\mathrm{unsat}(H))\bigr]. \tag{3}$$

To complete the proof of the $\frac{1}{\beta}$-approximation, we now argue that
the above expectation is at most
$\frac{1}{\beta} \cdot \bigl(\sum_{e \in E} c_e x_e^* + \sum_{I} \pi(I) z_I^*\bigr)$.
Since $E_\alpha\bigl[\frac{\rho}{1-\alpha}\bigr] = \frac{1}{\beta}$, the first
term in (3) is at most $\frac{1}{\beta} \cdot \sum_{e \in E} c_e x_e^*$. Since
$\mathrm{unsat}(H) \subseteq \{i \mid \sum_{I : i \in I} z_I^* \ge \alpha\}$ and
since π is monotone non-decreasing, the second term in (3) is at most
$E_\alpha\bigl[\pi\bigl(\{i \mid \sum_{I : i \in I} z_I^* \ge \alpha\}\bigr)\bigr]$.
Lemma 8 bounds this quantity as follows. The ideas used here are also presented
in Sharma et al. [SSW07].
Lemma 8. We have

$$E_\alpha\Bigl[\pi\Bigl(\Bigl\{i \;\Big|\; \sum_{I : i \in I} z_I^* \ge \alpha\Bigr\}\Bigr)\Bigr] \le \frac{1}{\beta} \cdot \sum_{I} \pi(I) z_I^*. \tag{4}$$

Proof. Let $\gamma_i = \sum_{I : i \in I} z_I^*$ for all $i \in K$. Let us,
without loss of generality, order the elements $i \in K$ such that
$\gamma_1 \le \gamma_2 \le \cdots \le \gamma_k$. We also use the notation
$\gamma_0 = 0$. Note that $\{z_I^*\}$ forms a feasible solution to the primal LP
$p(\gamma)$ given in Lemma 4. Therefore, from Lemma 4, its objective value is at
least that of the optimum solution:

$$\sum_{I} \pi(I) z_I^* \ge \sum_{i=1}^{k} \bigl[(\gamma_i - \gamma_{i-1}) \cdot \pi(\{i, \ldots, k\})\bigr]. \tag{5}$$

We now observe that the LHS of (4) can be expressed as follows. Since α is
picked uniformly at random from $(0, \beta]$, we have that for all
$1 \le i \le k$, with probability at most $\frac{\gamma_i - \gamma_{i-1}}{\beta}$,
the random variable α lies in the interval $(\gamma_{i-1}, \gamma_i]$. When this
event happens, we get that
$\{i' \mid \sum_{I : i' \in I} z_I^* \ge \alpha\} = \{i' \mid \gamma_{i'} \ge \alpha\} = \{i, \ldots, k\}$.
Thus the expectation in the LHS of (4) is at most

$$\sum_{i=1}^{k} \Bigl[\frac{\gamma_i - \gamma_{i-1}}{\beta} \cdot \pi(\{i, \ldots, k\})\Bigr]. \tag{6}$$

From expressions (5) and (6), the lemma follows.

Thus the proof of the $(1 - e^{-1/\rho})^{-1}$-approximation is complete. It is
worth mentioning that so far in this section we obtain a solution with a bound
on its expected cost. However, the choice of α can be simply derandomized by
trying out all the breakpoints where a good demand pair becomes a bad one (plus
0 and β).
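
Both expectations used in the randomized analysis can be checked numerically. The sketch below is illustrative only: the function names, the midpoint-rule integration and the sample data (`sqrt_penalty`, `gamma`) are assumptions. It also shows why finitely many values of α suffice for derandomization: the threshold set is constant between consecutive breakpoints.

```python
import math

def expected_scaling(rho, steps=20000):
    """Midpoint-rule estimate of E[rho / (1 - alpha)] for alpha uniform on (0, beta]."""
    beta = 1.0 - math.exp(-1.0 / rho)
    h = beta / steps
    integral = sum(rho / (1.0 - (j + 0.5) * h) for j in range(steps)) * h
    return integral / beta, 1.0 / beta

def expected_threshold_penalty(pi, gamma, beta):
    """Exact E[pi({i : gamma_i >= alpha})] for alpha uniform on (0, beta]."""
    cuts = sorted({0.0, beta, *[g for g in gamma if g < beta]})
    expectation = 0.0
    for lo, hi in zip(cuts, cuts[1:]):
        # For alpha in (lo, hi] the threshold set does not change.
        chosen = frozenset(i for i, g in enumerate(gamma) if g >= hi)
        expectation += (hi - lo) / beta * pi(chosen)
    return expectation

def lemma8_bound(pi, gamma, beta):
    """(1 / beta) * sum_i (gamma_i - gamma_{i-1}) * pi({i, ..., k}), gamma sorted."""
    order = sorted(range(len(gamma)), key=lambda i: gamma[i])
    total, previous = 0.0, 0.0
    for pos, i in enumerate(order):
        total += (gamma[i] - previous) * pi(frozenset(order[pos:]))
        previous = gamma[i]
    return total / beta

def sqrt_penalty(unsat):            # hypothetical submodular, monotone penalty
    return len(unsat) ** 0.5

rho = 2.0
beta = 1.0 - math.exp(-1.0 / rho)
gamma = [0.1, 0.3, 0.9]
estimate, exact = expected_scaling(rho)
lhs = expected_threshold_penalty(sqrt_penalty, gamma, beta)
rhs = lemma8_bound(sqrt_penalty, gamma, beta)
print(estimate, exact, lhs, rhs)
```

The first two numbers agree up to integration error, and the third is bounded by the fourth, as expressions (5) and (6) require.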

4 Proof of Lemma 2
We next show that even though LP (2) has an exponential number of variables and
constraints, the following lemma holds.
Lemma 9. Any basic feasible solution to LP (2) has a polynomial number of
non-zero variables.
Proof. Fix a basic feasible solution $\{x_e^*, f_{i,e}^*, z_I^*\}$ to (2). For
$i \in K$, let

$$\underline{\gamma}_i = 1 - \min_{T : T \odot i} \frac{\sum_{e \in \delta(T)} f_{i,e}^*}{r_i} \quad \text{and} \quad \overline{\gamma}_i = 1 - \max_{e} f_{i,e}^*.$$

Now fix the values of variables $\{x_e, f_{i,e}\}$ to $\{x_e^*, f_{i,e}^*\}$ and
project the LP (2) onto variables $\{z_I\}$ as follows.

$$\sum_{e \in E} c_e x_e^* + \min \Bigl\{ \sum_{I \subseteq K} \pi(I) z_I \;\Big|\; \sum_{I \subseteq K} z_I = 1, \quad \underline{\gamma}_i \le \sum_{I : i \in I} z_I \le \overline{\gamma}_i \ \ \forall i \in K, \quad z_I \ge 0 \ \ \forall I \subseteq K \Bigr\}. \tag{7}$$

Since $\{x_e^*, f_{i,e}^*, z_I^*\}$ is a basic feasible solution to (2), it
cannot be written as a convex combination of two distinct feasible solutions to
(2). Thus we get that $\{z_I^*\}$ cannot be written as a convex combination of
two distinct feasible solutions to (7), and hence it forms a basic feasible
solution to (7). Since there are $1 + 2|K|$ non-trivial constraints in (7), at
most $1 + 2|K|$ variables $z_I$ can be non-zero in any basic feasible solution
of (7). Thus the lemma follows.
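We prove that LP (2) can be solved in polynomial time. Introduce variables
$\gamma \in [0, 1]^k$ and obtain the following program (the function $p$ is as
in Lemma 4).

$$
\begin{array}{lll}
\text{Minimize} & \sum_{e \in E} c_e x_e + p(\gamma) & \\[4pt]
\text{Subject to} & \sum_{e \in \delta(T)} f_{i,e} \ge (1 - \gamma_i)\, r_i(T) & \forall i\ \ \forall T \odot (i, S) \\
& f_{i,e} \le 1 - \gamma_i & \forall i\ \ \forall e \\
& x_e \ge f_{i,e} & \forall i\ \ \forall e \\
& x_e, f_{i,e}, \gamma_i \in [0, 1] & \forall i\ \ \forall e
\end{array}
\tag{8}
$$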
It is clear that solving (8) is enough to solve (2). Now note that this is a
convex program since $p$ is a convex function. To solve (8), we convert its
objective function into a constraint
$\sum_{e \in E} c_e x_e + p(\gamma) \le \mathrm{opt}$ where opt is the target
objective value and thus reduce it to a feasibility problem. Now to find a
feasible solution using an ellipsoid algorithm, we need to show a polynomial
time separation oracle. The separation oracle for the first set of constraints
can be reduced to a minimum u-v cut problem using standard techniques. The
separation oracle for the remaining constraints is trivial.
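
For the edge-connectivity case (S = ∅, so r_i(T) = r_i), the cut constraints for one pair hold exactly when the minimum cut under capacities f_{i,e} is at least (1 − γ_i) r_i. The following sketch illustrates that check with a plain Edmonds-Karp max-flow. It is an assumption-laden illustration rather than the oracle used in the paper: the function names are hypothetical, the graph is a small dense example, and a full oracle would also return the violated cut.

```python
from collections import deque

def min_cut_value(n, capacity, s, t):
    """Max-flow = min-cut value between s and t (Edmonds-Karp).

    capacity -- dict {(u, v): c} for an undirected graph on nodes 0..n-1
    """
    residual = [[0.0] * n for _ in range(n)]
    for (u, v), c in capacity.items():
        residual[u][v] += c
        residual[v][u] += c
    flow = 0.0
    while True:
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 1e-12:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow
        # Find the bottleneck along the augmenting path, then push flow.
        bottleneck, v = float("inf"), t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            residual[parent[v]][v] -= bottleneck
            residual[v][parent[v]] += bottleneck
            v = parent[v]
        flow += bottleneck

def cut_constraints_hold(n, f_i, pair, gamma_i, r_i, tol=1e-9):
    """Edge-connectivity case (S empty): is the min cut of f_i >= (1 - gamma_i) r_i ?"""
    u, v = pair
    return min_cut_value(n, f_i, u, v) + tol >= (1.0 - gamma_i) * r_i

# Hypothetical instance: two 2-edge paths between nodes 0 and 3, f = 0.5 everywhere.
f_example = {(0, 1): 0.5, (1, 3): 0.5, (0, 2): 0.5, (2, 3): 0.5}
print(cut_constraints_hold(4, f_example, (0, 3), 0.5, 2))
print(cut_constraints_hold(4, f_example, (0, 3), 0.25, 2))
```

In the example the minimum cut has value 1, so the constraints hold for γ_i = 0.5 (right-hand side 1) and fail for γ_i = 0.25 (right-hand side 1.5).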
The separation oracle for the objective function is as follows. Given a point
$(x, \gamma) = \{x_e, \gamma_i\}$ that satisfies
$\sum_{e \in E} c_e x_e + p(\gamma) > \mathrm{opt}$, we compute a sub-gradient of
the function $\sum_{e \in E} c_e x_e + p(\gamma)$ w.r.t. variables
$\{x_e, \gamma_i\}$. The sub-gradient of $\sum_{e \in E} c_e x_e$ w.r.t. $x$ is
simply the cost vector $c$. The sub-gradient of $p(\gamma)$ w.r.t. γ is computed
using Lemma 5; denote it by $y \in \mathbb{R}_+^k$. From the definition of
sub-gradient, we have that the sub-gradient $(c, y)$ to the objective function
at point $(x, \gamma)$ satisfies

$$\Bigl(\sum_{e \in E} c_e x'_e + p(\gamma')\Bigr) - \Bigl(\sum_{e \in E} c_e x_e + p(\gamma)\Bigr) \ge (c, y) \cdot \bigl((x', \gamma') - (x, \gamma)\bigr).$$

Now fix any feasible solution $(x^*, \gamma^*)$, i.e., one that satisfies
$\sum_{e \in E} c_e x_e^* + p(\gamma^*) \le \mathrm{opt}$. Substituting
$(x', \gamma') = (x^*, \gamma^*)$ in the above inequality we get

$$0 = \mathrm{opt} - \mathrm{opt} > \Bigl(\sum_{e \in E} c_e x_e^* + p(\gamma^*)\Bigr) - \Bigl(\sum_{e \in E} c_e x_e + p(\gamma)\Bigr) \ge (c, y) \cdot (x^*, \gamma^*) - (c, y) \cdot (x, \gamma).$$

Thus $(c, y)$ defines a separating hyperplane between the point $(x, \gamma)$
and any point $(x^*, \gamma^*)$ that satisfies
$\sum_{e \in E} c_e x_e^* + p(\gamma^*) \le \mathrm{opt}$. Hence we have a
polynomial time separation oracle for the objective function as well.
Thus we can solve (8) using the ellipsoid algorithm. The proof of Lemma 2 is
hence complete.

References
[ABHK09] Archer, A., Bateni, M., Hajiaghayi, M., Karloff, H.: A technique for im-
proving approximation algorithms for prize-collecting problems. In: Proc.
50th IEEE Symp. on Foundations of Computer Science, FOCS (2009)
[AKR95] Agrawal, A., Klein, P., Ravi, R.: When trees collide: an approximation
algorithm for the generalized Steiner problem on networks. SIAM J. Com-
put. 24(3), 440–456 (1995)
[Bal89] Balas, E.: The prize collecting traveling salesman problem. Networks 19(6),
621–636 (1989)
[BGRS10] Byrka, J., Grandoni, F., Rothvoss, T., Sanità, L.: An improved LP-based
approximation for Steiner tree. In: Proceedings of the 42nd Annual ACM
Symposium on Theory of Computing, STOC (2010)
[BGSLW93] Bienstock, D., Goemans, M., Simchi-Levi, D., Williamson, D.: A note on
the prize collecting traveling salesman problem. Math. Programming 59(3,
Ser. A), 413–420 (1993)
[BP89] Bern, M., Plassmann, P.: The Steiner problem with edge lengths 1 and 2.
Information Processing Letters 32, 171–176 (1989)
[CK09] Chuzhoy, J., Khanna, S.: An O(k^3 log n)-approximation algorithm for
vertex-connectivity network design. In: Proceedings of the 50th Annual
IEEE Symposium on Foundations of Computer Science, FOCS (2009)
[CRW01] Chudak, F., Roughgarden, T., Williamson, D.: Approximate k-MSTs and
k-Steiner trees via the primal-dual method and Lagrangean relaxation. In:
Aardal, K., Gerards, B. (eds.) IPCO 2001. LNCS, vol. 2081, pp. 60–70.
Springer, Heidelberg (2001)
[CVV06] Cheriyan, J., Vempala, S., Vetta, A.: Network design via iterative rounding
of setpair relaxations. Combinatorica 26(3), 255–275 (2006)
[FJW01] Fleischer, L., Jain, K., Williamson, D.: An iterative rounding 2-
approximation algorithm for the element connectivity problem. In: Proc.
of the 42nd IEEE Symp. on Foundations of Computer Science (FOCS), pp.
339–347 (2001)
[Fuj05] Fujishige, S.: Submodular functions and optimization. Elsevier, Amster-
dam (2005)
[GKL+07] Gupta, A., Könemann, J., Leonardi, S., Ravi, R., Schäfer, G.: An efficient
cost-sharing mechanism for the prize-collecting Steiner forest problem. In:
Proc. of the 18th ACM-SIAM Symposium on Discrete Algorithms (SODA),
pp. 1153–1162 (2007)
[GW95] Goemans, M., Williamson, D.: A general approximation technique for con-
strained forest problems. SIAM J. Comput. 24(2), 296–317 (1995)
[HJ06] Hajiaghayi, M., Jain, K.: The prize-collecting generalized Steiner tree prob-
lem via a new approach of primal-dual schema. In: Proc. of the 17th ACM-
SIAM Symp. on Discrete Algorithms (SODA), pp. 631–640 (2006)
[HN10] Hajiaghayi, M., Nasri, A.: Prize-collecting Steiner networks via iterative
rounding. In: LATIN (to appear, 2010)
[Jai01] Jain, K.: A factor 2 approximation algorithm for the generalized Steiner
network problem. Combinatorica 21(1), 39–60 (2001)
[JMP00] Johnson, D., Minkoff, M., Phillips, S.: The prize collecting Steiner tree
problem: theory and practice. In: Proceedings of the Eleventh Annual
ACM-SIAM Symposium on Discrete Algorithms, pp. 760–769 (2000)
[JV01] Jain, K., Vazirani, V.: Approximation algorithms for metric facility loca-
tion and k-median problems using the primal-dual schema and Lagrangian
relaxation. J. ACM 48(2), 274–296 (2001)
[KN07] Kortsarz, G., Nutov, Z.: Approximating minimum cost connectivity prob-
lems. In: Gonzalez, T.F. (ed.) Approximation Algorithms and Metaheuris-
tics, ch. 58. CRC Press, Boca Raton (2007)
[KV02] Korte, B., Vygen, J.: Combinatorial Optimization: Theory and Algorithms.
Springer, Berlin (2002)
[NSW08] Nagarajan, C., Sharma, Y., Williamson, D.: Approximation algorithms for
prize-collecting network design problems with general connectivity require-
ments. In: Bampis, E., Skutella, M. (eds.) WAOA 2008. LNCS, vol. 5426,
pp. 174–187. Springer, Heidelberg (2009)
[Nut09] Nutov, Z.: Approximating minimum cost connectivity problems via un-
crossable bifamilies and spider-cover decompositions. In: Proc. of the 50th
IEEE Symposium on Foundations of Computer Science, FOCS (2009)
[RZ05] Robins, G., Zelikovsky, A.: Tighter bounds for graph Steiner tree approx-
imation. SIAM J. on Discrete Mathematics 19(1), 122–134 (2005)
[SCRS00] Salman, F., Cheriyan, J., Ravi, R., Subramanian, S.: Approximating the
single-sink link-installation problem in network design. SIAM J. on Opti-
mization 11(3), 595–610 (2000)
[SSW07] Sharma, Y., Swamy, C., Williamson, D.: Approximation algorithms for
prize collecting forest problems with submodular penalty functions. In:
Proceedings of the 18th ACM-SIAM Symposium on Discrete Algorithms
(SODA), pp. 1275–1284 (2007)
On Lifting Integer Variables in Minimal Inequalities

Amitabh Basu¹,⋆, Manoel Campelo²,⋆⋆, Michele Conforti³,
Gérard Cornuéjols¹,⁴,⋆⋆⋆, and Giacomo Zambelli³

¹ Tepper School of Business, Carnegie Mellon University, Pittsburgh, PA 15213
² Departamento de Estatística e Matemática Aplicada,
  Universidade Federal do Ceará, Brazil
³ Dipartimento di Matematica Pura e Applicata, Università di Padova, Via Trieste
  63, 35121 Padova, Italy
⁴ LIF, Faculté des Sciences de Luminy, Université de Marseille, France

Abstract. This paper contributes to the theory of cutting planes for
mixed integer linear programs (MILPs). Minimal valid inequalities are
well understood for a relaxation of an MILP in tableau form where all
the nonbasic variables are continuous. In this paper we study lifting
functions for the nonbasic integer variables starting from such minimal
valid inequalities. We characterize precisely when the lifted coefficient is
equal to the coefficient of the corresponding continuous variable in every
minimal lifting. The answer is a nonconvex region that can be obtained
as the union of convex polyhedra.

1 Introduction
There has been a renewed interest recently in the study of cutting planes for
general mixed integer linear programs (MILPs) that cut off a basic solution
of the linear programming relaxation. More precisely, consider a mixed integer
linear set in which the variables are partitioned into a basic set B and a nonbasic
set N , and K ⊆ B ∪ N indexes the integer variables:

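$$
\begin{array}{ll}
x_i = f_i - \sum_{j \in N} a_{ij} x_j & \text{for } i \in B \\
x \ge 0 & \\
x_k \in \mathbb{Z} & \text{for } k \in K.
\end{array}
\tag{1}
$$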

Let X be the relaxation of (1) obtained by dropping the nonnegativity restriction
on all the basic variables $x_i$, $i \in B$. The convex hull of X is the corner
polyhedron introduced by Gomory [11] (see also [12]). Note that, for any
$i \in B \setminus K$, the equation $x_i = f_i - \sum_{j \in N} a_{ij} x_j$ can be
removed from the formulation of X since it just defines variable $x_i$.
Therefore, throughout the paper, we will assume $B \subseteq K$, i.e. all basic
variables are integer. Andersen, Louveaux, Weismantel and Wolsey [1] studied the
corner polyhedron when $|B| = 2$ and $B = K$, i.e. all nonbasic variables are
continuous. They give a complete characterization of the corner polyhedron using
intersection cuts (Balas [2]) arising from splits, triangles and quadrilaterals.
This very elegant result has been extended to $|B| > 2$ and $B = K$ by showing a
correspondence between minimal valid inequalities and maximal lattice-free
convex sets [5], [7]. These results and their extensions [6], [10] are best
described in an infinite model, which we motivate next.

⋆ Supported by a Mellon Fellowship and NSF grant CMMI0653419.
⋆⋆ Partially supported by CNPq Brazil.
⋆⋆⋆ Supported by NSF grant CMMI0653419, ONR grant N00014-09-1-0133 and ANR
grant ANR06-BLAN-0375.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 85–95, 2010.
© Springer-Verlag Berlin Heidelberg 2010

A classical family of cutting planes for (1) is that of Gomory mixed integer
cuts. For a given row of the tableau, the Gomory mixed integer cut is of the
form $\sum_{j \in N \setminus K} \psi(a_{ij}) x_j + \sum_{j \in N \cap K} \pi(a_{ij}) x_j \ge 1$
where ψ and π are functions given by simple formulas. A nice feature of the
Gomory mixed integer cut is that, for fixed $f_i$, the same functions ψ, π are
used for any possible choice of the $a_{ij}$'s in (1). It is well known that the
Gomory mixed integer cuts are also valid for X. More generally, let $a_j$ be the
vector with entries $a_{ij}$, $i \in B$; we are interested in pairs (ψ, π) of
functions such that the inequality
$\sum_{j \in N \setminus K} \psi(a_j) x_j + \sum_{j \in N \cap K} \pi(a_j) x_j \ge 1$
is valid for X for any possible choice of the nonbasic coefficients $a_{ij}$.
Since we are interested in nonredundant inequalities, we can assume that the
function (ψ, π) is pointwise minimal. While a general characterization of
minimal valid functions seems hopeless (see for example [4]), when
$N \cap K = \emptyset$ the minimal valid functions ψ are well understood in terms
of maximal lattice-free convex sets, as already mentioned. Starting from such a
minimal valid function ψ, an interesting question is how to generate a function
π such that (ψ, π) is valid and minimal. Recent papers [8], [9] study when such
a function π is unique. Here we prove a theorem that generalizes and unifies
results from these two papers.
In order to formalize the concept of valid function (ψ, π), we introduce the
following infinite model. In the setting below, we also allow further linear con-
straints on the basic variables. Let S be the set of integral points in some rational
polyhedron in Rn such that dim(S) = n (for example, S could be the set of non-
negative integer points). Let f ∈ Rn \ S. Consider the following semi-infinite
relaxation of (1), introduced in [10].
 
    x = f + Σ_{r∈R^n} r s_r + Σ_{r∈R^n} r y_r,        (2)
    x ∈ S,
    s_r ∈ R_+, ∀r ∈ R^n,
    y_r ∈ Z_+, ∀r ∈ R^n,
    s, y have finite support,

where the nonbasic continuous variables have been renamed s and the nonbasic
integer variables have been renamed y. Given two functions ψ, π : R^n → R, (ψ, π)
is said to be valid for (2) if the inequality
Σ_{r∈R^n} ψ(r) s_r + Σ_{r∈R^n} π(r) y_r ≥ 1
holds for every (x, s, y) satisfying (2). We also consider the semi-infinite model
where we only have continuous nonbasic variables.


    x = f + Σ_{r∈R^n} r s_r,        (3)
    x ∈ S,
    s_r ∈ R_+, ∀r ∈ R^n,
    s has finite support.

A function ψ : R^n → R is said to be valid for (3) if the inequality
Σ_{r∈R^n} ψ(r) s_r ≥ 1 holds for every (x, s) satisfying (3). Given a valid
function ψ for (3), a function
π is a lifting of ψ if (ψ, π) is valid for (2). One is interested only in (pointwise)
minimal valid functions, since non-minimal ones are implied by some minimal
valid function. If ψ is a minimal valid function for (3) and π is a lifting of ψ such
that (ψ, π) is a minimal valid function for (2) then we say that π is a minimal
lifting of ψ.
While minimal valid functions for (3) have a simple characterization [6], min-
imal valid functions for (2) are not well understood. A general idea to derive
minimal valid functions for (2) is to start from some minimal valid function ψ
for (3), and construct a minimal lifting π of ψ. While there is no general tech-
nique to compute such minimal lifting π, it is known that there exists a region
Rψ , containing the origin in its interior, where ψ coincides with π for any mini-
mal lifting π. This latter fact was observed by Dey and Wolsey [9] for the case
of S = Z2 and by Conforti, Cornuéjols and Zambelli [8] for the general case.
Furthermore, it is remarked in [8] and [10] that, if π is a minimal lifting of ψ,
then π(r) = π(r′) for every r, r′ ∈ R^n such that r − r′ ∈ Z^n ∩ lin(conv(S)).
Therefore the coefficients of any minimal lifting π are uniquely determined in
the region Rψ + (Zn ∩ lin(conv(S))). In particular, whenever translating Rψ by
integer vectors in lin(conv(S)) covers Rn , ψ has a unique minimal lifting. The
purpose of this paper is to give a precise description of the region Rψ .
To state our main result, we need to explain the characterization of minimal
valid functions for (3). We say that a convex set B ⊆ Rn is S-free if B does not
contain any point of S in its interior. A set B is a maximal S-free convex set if it
is an S-free convex set that is not properly contained in any S-free convex set.
It was proved in [6] that maximal S-free convex sets are polyhedra containing a
point of S in the relative interior of each facet.
Given an S-free polyhedron B ⊆ R^n containing f in its interior, B can be
uniquely written in the form

    B = {x ∈ R^n : a_i (x − f) ≤ 1, i ∈ I},

where I is a finite set of indices and a_i (x − f) ≤ 1 is facet-defining for B
for every i ∈ I.
Let ψ_B : R^n → R be the function defined by

    ψ_B(r) = max_{i∈I} a_i r, ∀r ∈ R^n.


Note in particular that, since maximal S-free convex sets are polyhedra, the
above function is defined for all maximal S-free convex sets B.
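
The definition of ψ_B can be tried out numerically. The following is a minimal
sketch, assuming S = Z^2 and a hypothetical example that is not taken from the
paper: the triangle with vertices (0,0), (2,0), (0,2) and f = (1/2, 1/2), whose
facet rows a_i are worked out by hand in the comments.

```python
from fractions import Fraction as F

# Assumed example (not from the paper): the triangle with vertices
# (0,0), (2,0), (0,2) and f = (1/2, 1/2).
f = (F(1, 2), F(1, 2))

# Rows a_i of B = {x : a_i (x - f) <= 1}:
#   x1 >= 0       ->  -2 (x1 - 1/2) <= 1
#   x2 >= 0       ->  -2 (x2 - 1/2) <= 1
#   x1 + x2 <= 2  ->  (x1 - 1/2) + (x2 - 1/2) <= 1
A = [(F(-2), F(0)), (F(0), F(-2)), (F(1), F(1))]


def psi(r):
    """psi_B(r) = max_i a_i r."""
    return max(a[0] * r[0] + a[1] * r[1] for a in A)


def in_B(x):
    """x lies in B exactly when psi_B(x - f) <= 1."""
    return psi((x[0] - f[0], x[1] - f[1])) <= 1


print(psi((F(1, 4), F(1, 4))))  # 1/2
print(in_B((1, 1)), in_B((2, 1)))  # True False
```

Exact rational arithmetic is used so that boundary tests such as ψ(x − f) = 1
are not affected by rounding.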

Theorem 1. [6] Let ψ be a minimal valid function for (3). Then the set

    Bψ := {x ∈ R^n | ψ(x − f) ≤ 1}

is a maximal S-free convex set containing f in its interior, and ψ = ψ_{Bψ}.
Conversely, if B is a maximal S-free convex set containing f in its interior, then
ψ_B is a minimal valid function for (3).
We are now ready to state the main result of the paper. Given a minimal valid
function ψ for (3), by Theorem 1 Bψ is a maximal S-free convex set containing
f in its interior, thus it can be uniquely written as
Bψ = {x ∈ R^n | a_i (x − f) ≤ 1, i ∈ I}. For every r ∈ R^n, let
I(r) = {i ∈ I | ψ(r) = a_i r}. Given x ∈ S, let

    R(x) := {r ∈ R^n | I(r) ⊇ I(x − f) and I(x − f − r) ⊇ I(x − f)}.

We define

    Rψ := ∪_{x∈S∩Bψ} R(x).

Theorem 2. Let ψ be a minimal valid function for (3). If π is a minimal lifting
of ψ, then π(r) = ψ(r) for every r ∈ Rψ.
Conversely, for every r̄ ∉ Rψ, there exists a lifting π of ψ such that π(r̄) < ψ(r̄).
Figure 1 illustrates the region Rψ for several examples. We conclude the intro-
duction presenting a different characterization of the regions R(x).
Proposition 1. Let ψ be a minimal valid function for (3), and let x ∈ S. Then
R(x) = {r ∈ R^n | ψ(r) + ψ(x − f − r) = ψ(x − f)}.

Proof. We can uniquely write Bψ = {x ∈ R^n | a_i (x − f) ≤ 1, i ∈ I}. Let
h ∈ I(x − f). Then

    ψ(x − f) = a_h (x − f) = a_h r + a_h (x − f − r)
             ≤ max_{i∈I} a_i r + max_{i∈I} a_i (x − f − r) = ψ(r) + ψ(x − f − r).

In the above expression, equality holds if and only if h ∈ I(r) and h ∈ I(x−f−r).
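
Proposition 1 gives two equivalent membership tests for R(x), which can be
compared on a small instance. The sketch below is an illustration only: it
assumes S = Z^2 and a hypothetical triangle (vertices (0,0), (2,0), (0,2),
f = (1/2, 1/2)) that is not taken from the paper, and the function names are
invented for this example.

```python
from fractions import Fraction as F

# Assumed example (not from the paper): triangle (0,0), (2,0), (0,2).
f = (F(1, 2), F(1, 2))
A = [(F(-2), F(0)), (F(0), F(-2)), (F(1), F(1))]


def dot(a, r):
    return a[0] * r[0] + a[1] * r[1]


def psi(r):
    return max(dot(a, r) for a in A)


def active(r):
    """I(r): indices i with psi(r) = a_i r."""
    return {i for i, a in enumerate(A) if dot(a, r) == psi(r)}


def in_R_by_definition(r, x):
    """r in R(x) via I(r) >= I(x - f) and I(x - f - r) >= I(x - f)."""
    d = (x[0] - f[0], x[1] - f[1])
    rest = (d[0] - r[0], d[1] - r[1])
    return active(r) >= active(d) and active(rest) >= active(d)


def in_R_by_sum(r, x):
    """r in R(x) via psi(r) + psi(x - f - r) = psi(x - f)."""
    d = (x[0] - f[0], x[1] - f[1])
    rest = (d[0] - r[0], d[1] - r[1])
    return psi(r) + psi(rest) == psi(d)


print(in_R_by_sum((F(1, 4), F(1, 4)), (1, 1)))  # True
print(in_R_by_sum((F(-1, 2), F(0)), (1, 1)))    # False
```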

2 Minimum Lifting Coefficient of a Single Variable


Given r∗ ∈ R^n, we consider the set of solutions to

    x = f + Σ_{r∈R^n} r s_r + r∗ y_{r∗}
    x ∈ S
    s ≥ 0                                        (4)
    y_{r∗} ≥ 0, y_{r∗} ∈ Z
    s has finite support.

[Figure 1, four panels: (a) a maximal Z^2-free triangle with three integer
points; (b) a wedge; (c) a maximal Z^2-free triangle with integer vertices;
(d) a truncated wedge.]

Fig. 1. Regions R(x) for some maximal S-free convex sets in the plane. The thick dark
line indicates the boundary of Bψ . For a particular x, the dark gray regions denote
R(x). The jagged lines in a region indicate that it extends to infinity. For example, in
Figure 1(b), R(x1 ) is the strip between lines l1 and l. Figure 1(c) shows an example
where R(x) is full-dimensional for x2 , x4 , x6 , but is not full-dimensional for x1 , x3 , x5 .

Given a minimal valid function ψ for (3) and scalar λ, we say that the inequality
Σ_{r∈R^n} ψ(r) s_r + λ y_{r∗} ≥ 1 is valid for (4) if it holds for every
(x, s, y_{r∗}) satisfying (4). We denote by ψ∗(r∗) the minimum value of λ for
which Σ_{r∈R^n} ψ(r) s_r + λ y_{r∗} ≥ 1 is valid for (4).
We observe that, for any lifting π of ψ, we have

ψ ∗ (r∗ ) ≤ π(r∗ ).

Indeed, Σ_{r∈R^n} ψ(r) s_r + π(r∗) y_{r∗} ≥ 1 is valid for (4), since, for any
(s̄, ȳ_{r∗}) satisfying (4), the vector (s̄, ȳ), defined by ȳ_r = 0 for all
r ∈ R^n \ {r∗}, satisfies (2).
Moreover, the following fact was shown in [8].
Lemma 1. If ψ is a minimal valid function for (3) and π is a minimal lifting
of ψ, then π ≤ ψ.
So we have the following relation for every minimal lifting π of ψ :

ψ ∗ (r) ≤ π(r) ≤ ψ(r) for all r ∈ Rn .

In general ψ ∗ is not a lifting of ψ, but if it is, then the above relation implies
that it is the unique minimal lifting of ψ.
Remark 1. For any r ∈ Rn such that ψ ∗ (r) = ψ(r), we have π(r) = ψ(r) for
every minimal lifting π of ψ. Conversely, if ψ ∗ (r∗ ) < ψ(r∗ ) for some r∗ ∈ Rn ,
then there exists some lifting π of ψ such that π(r∗ ) < ψ(r∗ ).

Proof. The first part follows from ψ∗ ≤ π ≤ ψ. For the second part, given
r∗ ∈ R^n such that ψ∗(r∗) < ψ(r∗), we can define π by π(r∗) = ψ∗(r∗) and
π(r) = ψ(r) for all r ∈ R^n, r ≠ r∗. Since ψ is valid for (3), it follows by the
definition of ψ∗(r∗) that π is a lifting of ψ.

By the above remark, in order to prove Theorem 2 we need to show that
Rψ = {r ∈ R^n | ψ(r) = ψ∗(r)}. We will need the following results.
Theorem 3. [6] A full-dimensional convex set B is a maximal S-free convex
set if and only if it is a polyhedron such that B does not contain any point
of S in its interior and each facet of B contains a point of S in its relative
interior. Furthermore if B ∩ conv(S) has nonempty interior, lin(B) contains
rec(B ∩ conv(S)).

Remark 2. The proof of Theorem 3 in [6] implies the following. Given a maximal
S-free convex set B, there exists δ > 0 such that there is no point of S \ B at
distance less than δ from B.

Let r∗ ∈ R^n. Given a maximal S-free convex set
B = {x ∈ R^n | a_i (x − f) ≤ 1, i ∈ I}, for any λ ∈ R, we define the set
B(λ) ⊂ R^{n+1} as follows:

    B(λ) = {(x, x_{n+1}) ∈ R^{n+1} | a_i (x − f) + (λ − a_i r∗) x_{n+1} ≤ 1, i ∈ I}.   (5)

Theorem 4. [8] Let r∗ ∈ R^n. Given a maximal S-free convex set B, let ψ = ψ_B.
Given λ ∈ R, the inequality Σ_{r∈R^n} ψ(r) s_r + λ y_{r∗} ≥ 1 is valid for (4)
if and only if B(λ) is (S × Z+)-free.
Remark 3. Let r∗ ∈ Rn , and let B be a maximal S-free convex set. For every λ
such that B(λ) is (S × Z+ )-free, B(λ) is maximal (S × Z+ )-free.
Proof. Since B is a maximal S-free convex set, then by Theorem 3 each facet of B
contains a point x̄ of S in its relative interior. Therefore the corresponding
facet of B(λ) contains the point (x̄, 0) in its relative interior. If B(λ) is
(S × Z+)-free, by Theorem 3 it is a maximal (S × Z+)-free convex set.

3 Characterizing the Region Rψ


Next we state and prove a theorem that characterizes when ψ ∗ (r∗ ) = ψ(r∗ ). The
main result of this paper, Theorem 2, will then follow easily.
Theorem 5. Given a maximal S-free convex set B, let ψ = ψ_B. Given r∗ ∈ R^n,
the following are equivalent:
(i) ψ∗(r∗) = ψ(r∗).
(ii) There exists a point x̄ ∈ S such that (x̄, 1) ∈ B(ψ(r∗)).

Proof. Let λ∗ = ψ(r∗). Note that the inequality
Σ_{r∈R^n} ψ(r) s_r + λ∗ y_{r∗} ≥ 1 is valid for (4). Thus, it follows from
Theorem 4 that B(λ∗) is (S × Z+)-free.
We first show that (ii) implies (i). Assume there exists x̄ ∈ S such that
(x̄, 1) ∈ B(λ∗). Then, for every ε > 0, (x̄, 1) is in the interior of B(λ∗ − ε),
because a_i (x̄ − f) + (λ∗ − ε − a_i r∗) ≤ 1 − ε < 1 for all i ∈ I. Theorem 4
then implies that ψ∗(r∗) = λ∗.
Next we show that (i) implies (ii). Assume that ψ∗(r∗) = ψ(r∗) = λ∗. We
recall that λ∗ = max_{i∈I} a_i r∗.
Note that, if a_i r∗ = λ∗ for all i ∈ I, then B(λ∗) = B × R, so given any point
x̄ in B ∩ S, (x̄, 1) is in B(λ∗). Thus we assume that there exists an index h such
that a_h r∗ < λ∗.
By Remark 3, B(λ∗ ) is maximal (S × Z+ )-free. Theorem 3 implies the
following,
a) rec(B ∩ conv(S)) ⊆ lin(B),
b) rec(B(λ∗ ) ∩ conv(S × Z+ )) ⊆ lin(B(λ∗ )).
Lemma 2. rec(B(λ∗) ∩ conv(S × Z+)) = rec(B ∩ conv(S)) × {0}.
Let (r̄, r̄_{n+1}) ∈ rec(B(λ∗) ∩ conv(S × Z+)). Note that
rec(conv(S × Z+)) = rec(conv(S)) × Z+, thus r̄ ∈ rec(conv(S)) and r̄_{n+1} ≥ 0.
We only need to show that r̄_{n+1} = 0.

By b), (r̄, r̄_{n+1}) satisfies

    a_i r̄ + (λ∗ − a_i r∗) r̄_{n+1} = 0, i ∈ I.        (6)

Since λ∗ − ai r∗ ≥ 0 and r̄n+1 ≥ 0,

ai r̄ ≤ 0, i ∈ I,

therefore r̄ ∈ rec(B). Thus r̄ ∈ rec(B ∩ conv(S)) which, by a), is contained in
lin(B). This implies
    a_i r̄ = 0, i ∈ I.
It follows from the above and from (6) that (λ∗ − ai r∗ )r̄n+1 = 0 for i ∈ I. Since
λ∗ − ah r∗ > 0 for some index h, it follows that r̄n+1 = 0. This concludes the
proof of Lemma 2.
Lemma 3. There exists ε̄ > 0 such that rec(B(λ∗ − ε) ∩ conv(S × Z+ )) =
rec(B ∩ conv(S)) × {0} for every ε ∈ [0, ε̄].
Since conv(S) is a rational polyhedron, conv(S) = {x ∈ R^n | Cx ≤ d} for some
rational matrix (C, d). By Lemma 2, there is no vector (r, 1) in
rec(B(λ∗) ∩ conv(S × Z+)). Thus the system

    a_i r + (λ∗ − a_i r∗) ≤ 0, i ∈ I
    Cr ≤ 0

is infeasible. By Farkas' Lemma, there exist scalars μ_i ≥ 0, i ∈ I, and a
nonnegative vector γ such that

    Σ_{i∈I} μ_i a_i + γC = 0
    λ∗ (Σ_{i∈I} μ_i) − (Σ_{i∈I} μ_i a_i) r∗ > 0.

This implies that there exists some ε̄ > 0 such that for all ε ≤ ε̄,

    Σ_{i∈I} μ_i a_i + γC = 0
    (λ∗ − ε)(Σ_{i∈I} μ_i) − (Σ_{i∈I} μ_i a_i) r∗ > 0,

thus the system

    a_i r + (λ∗ − ε − a_i r∗) ≤ 0, i ∈ I
    Cr ≤ 0

is infeasible. This implies that
rec(B(λ∗ − ε) ∩ conv(S × Z+)) = rec(B ∩ conv(S)) × {0}.
Lemma 4. B(λ∗) contains a point (x̄, x̄_{n+1}) ∈ S × Z+ such that x̄_{n+1} > 0.
By Lemma 3, there exists ε̄ such that, for every ε ∈ [0, ε̄],
rec(B(λ∗ − ε) ∩ conv(S × Z+)) = rec(B ∩ conv(S)) × {0}. This implies that there
exists a scalar M such that, for every ε ∈ [0, ε̄] and every point
(x̄, x̄_{n+1}) ∈ B(λ∗ − ε) ∩ (S × Z+), it follows x̄_{n+1} ≤ M.
Remark 2 and Remark 3 imply that there exists δ > 0 such that, for every
(x̄, x̄_{n+1}) ∈ (S × Z+) \ B(λ∗), there exists h ∈ I such that
a_h (x̄ − f) + (λ∗ − a_h r∗) x̄_{n+1} ≥ 1 + δ. Choose ε > 0 such that ε ≤ ε̄ and
εM ≤ δ.
Since ψ∗(r∗) = λ∗, by Theorem 4, B(λ∗ − ε) has a point (x̄, x̄_{n+1}) ∈ S × Z+ in
its interior. Thus a_i (x̄ − f) + (λ∗ − ε − a_i r∗) x̄_{n+1} < 1, i ∈ I.
We show that (x̄, x̄_{n+1}) is also in B(λ∗). Suppose not. Then, by our choice of
δ, there exists h ∈ I such that a_h (x̄ − f) + (λ∗ − a_h r∗) x̄_{n+1} ≥ 1 + δ. By
our choice of M and ε,

    1 + δ ≤ a_h (x̄ − f) + (λ∗ − a_h r∗) x̄_{n+1}
          ≤ a_h (x̄ − f) + (λ∗ − ε − a_h r∗) x̄_{n+1} + εM < 1 + εM ≤ 1 + δ,

a contradiction.
Hence (x̄, x̄_{n+1}) is in B(λ∗). Since B is S-free and
B(λ∗ − ε) ∩ (R^n × {0}) = B × {0}, it follows that B(λ∗ − ε) does not contain
any point of S × {0} in its interior. Thus x̄_{n+1} > 0. This concludes the proof
of Lemma 4.

By the previous lemma, B(λ∗) contains a point (x̄, x̄_{n+1}) ∈ S × Z+ such that
x̄_{n+1} > 0. Note that B(λ∗) contains (x̄, 1), since

    a_i (x̄ − f) + (λ∗ − a_i r∗) ≤ a_i (x̄ − f) + (λ∗ − a_i r∗) x̄_{n+1} ≤ 1, i ∈ I,

because λ∗ − a_i r∗ ≥ 0, i ∈ I.

Corollary 1. Let ψ be a minimal valid function for (3). Then ψ ∗ (r∗ ) = ψ(r∗ )
if and only if there exists x̄ ∈ S such that

ψ(r∗ ) + ψ(x̄ − f − r∗ ) = 1. (7)

Proof. We first show that, if there exists x̄ ∈ S satisfying (7), then
ψ∗(r∗) = ψ(r∗). Indeed, since Σ_{r∈R^n} ψ(r) s_r + ψ∗(r∗) y_{r∗} ≥ 1 is valid
for (4),

    1 ≤ ψ∗(r∗) + ψ(x̄ − f − r∗) ≤ ψ(r∗) + ψ(x̄ − f − r∗) = 1.

We show the converse. Since ψ is a valid function for (3),
ψ(x̄ − f − r∗) + ψ(r∗) ≥ 1. Since ψ is a minimal valid function for (3), by
Theorem 1 there exists a maximal S-free convex set B such that ψ = ψ_B. Let
Bψ = {x ∈ R^n | a_i (x − f) ≤ 1, i ∈ I}.
Assume ψ∗(r∗) = ψ(r∗). By Theorem 5, there exists a point x̄ ∈ S such that
(x̄, 1) ∈ B(ψ(r∗)). Therefore

    a_i (x̄ − f) + ψ(r∗) − a_i r∗ ≤ 1, i ∈ I.

Thus

    max_{i∈I} a_i (x̄ − f − r∗) ≤ 1 − ψ(r∗),

which implies ψ(x̄ − f − r∗) + ψ(r∗) ≤ 1. Hence ψ(x̄ − f − r∗) + ψ(r∗) = 1.
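
Condition (7) suggests a direct numerical check: search for an integer point
x̄ with ψ(r∗) + ψ(x̄ − f − r∗) = 1. The sketch below is illustrative only. It
assumes S = Z^2, a hypothetical triangle (vertices (0,0), (2,0), (0,2),
f = (1/2, 1/2)) that is not taken from the paper, and a finite search box whose
size is an arbitrary choice; a returned point certifies ψ∗(r∗) = ψ(r∗), while a
miss is only conclusive when the box is known to contain every candidate.

```python
from fractions import Fraction as F
from itertools import product

# Assumed example (not from the paper): triangle (0,0), (2,0), (0,2).
f = (F(1, 2), F(1, 2))
A = [(F(-2), F(0)), (F(0), F(-2)), (F(1), F(1))]


def psi(r):
    return max(a[0] * r[0] + a[1] * r[1] for a in A)


def find_witness(r_star, box=3):
    """Return an integer xbar with psi(r*) + psi(xbar - f - r*) == 1, else None.

    Only points in [-box, box]^2 are examined (hypothetical search window).
    """
    for xbar in product(range(-box, box + 1), repeat=2):
        u = (xbar[0] - f[0] - r_star[0], xbar[1] - f[1] - r_star[1])
        if psi(r_star) + psi(u) == 1:
            return xbar
    return None


print(find_witness((F(1, 4), F(1, 4))))  # (1, 1)
print(find_witness((F(-1, 2), F(0))))    # None
```

For a bounded set B such as this triangle, ψ(u) ≤ 1 confines u to a bounded
region, so a sufficiently large box makes the search exhaustive.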

Proof of Theorem 2. By Remark 1, we only need to show that
Rψ = {r ∈ R^n | ψ(r) = ψ∗(r)}. For every x ∈ S, we have ψ(x − f) = 1 if and only
if x ∈ S ∩ Bψ. Therefore, by Proposition 1,
R(x) = {r ∈ R^n | ψ(r) + ψ(x − f − r) = ψ(x − f) = 1} if and only if
x ∈ S ∩ Bψ. The latter fact and Corollary 1 imply that a vector r ∈ R^n
satisfies ψ∗(r) = ψ(r) if and only if r ∈ R(x) for some x ∈ S ∩ Bψ. The
statement now follows from the definition of Rψ.

4 Conclusion
In this paper we give an exact characterization of the region where a minimal
valid inequality ψ and any minimal lifting π of ψ coincide. This was exhibited in
Theorem 2, which generalizes results from [8] and [9] about liftings of minimal
valid inequalities.
As already mentioned in the introduction, the following theorem was proved
in [8].
Theorem 6. Let ψ be a minimal valid function for (3). If Rψ + (Zn ∩ lin
(conv(S))) covers all of Rn , then there exists a unique minimal lifting π of ψ.
We conjecture that the converse also holds.
Conjecture 7. Let ψ be a minimal valid function for (3). There exists a unique
minimal lifting π of ψ if and only if Rψ + (Zn ∩ lin(conv(S))) covers all of Rn .

Acknowledgements
The authors would like to thank Marco Molinaro for helpful discussions about
the results presented in this paper.

References
1. Andersen, K., Louveaux, Q., Weismantel, R., Wolsey, L.A.: Cutting Planes from
Two Rows of a Simplex Tableau. In: Fischetti, M., Williamson, D.P. (eds.) IPCO
2007. LNCS, vol. 4513, pp. 1–15. Springer, Heidelberg (2007)
2. Balas, E.: Intersection Cuts - A New Type of Cutting Planes for Integer Program-
ming. Operations Research 19, 19–39 (1971)
3. Barvinok, A.: A Course in Convexity. In: Graduate Studies in Mathematics, vol. 54.
American Mathematical Society, Providence (2002)
4. Basu, A., Conforti, M., Cornuéjols, G., Zambelli, G.: A Counterexample to a Con-
jecture of Gomory and Johnson. Mathematical Programming Ser. A (to appear
2010)
5. Basu, A., Conforti, M., Cornuéjols, G., Zambelli, G.: Maximal Lattice-free Convex
Sets in Linear Subspaces (2009) (manuscript)
6. Basu, A., Conforti, M., Cornuéjols, G., Zambelli, G.: Minimal Inequalities for an
Infinite Relaxation of Integer Programs. SIAM Journal on Discrete Mathematics
(to appear 2010)
7. Borozan, V., Cornuéjols, G.: Minimal Valid Inequalities for Integer Constraints.
Mathematics of Operations Research 34, 538–546 (2009)
8. Conforti, M., Cornuéjols, G., Zambelli, G.: A Geometric Perspective on Lifting
(May 2009) (manuscript)
9. Dey, S.S., Wolsey, L.A.: Lifting Integer Variables in Minimal Inequalities corre-
sponding to Lattice-Free Triangles. In: Lodi, A., Panconesi, A., Rinaldi, G. (eds.)
IPCO 2008. LNCS, vol. 5035, pp. 463–475. Springer, Heidelberg (2008)
10. Dey, S.S., Wolsey, L.A.: Constrained Infinite Group Relaxations of MIPs (March
2009) (manuscript)

11. Gomory, R.E.: Some Polyhedra related to Combinatorial Problems. Linear Algebra
and its Applications 2, 451–558 (1969)
12. Gomory, R.E., Johnson, E.L.: Some Continuous Functions Related to Corner Poly-
hedra, Part I. Mathematical Programming 3, 23–85 (1972)
13. Johnson, E.L.: On the Group Problem for Mixed Integer Programming. In: Math-
ematical Programming Study, pp. 137–179 (1974)
14. Schrijver, A.: Theory of Linear and Integer Programming. John Wiley & Sons,
New York (1986)
15. Meyer, R.R.: On the Existence of Optimal Solutions to Integer and Mixed-Integer
Programming Problems. Mathematical Programming 7, 223–235 (1974)
Efficient Edge Splitting-Off Algorithms
Maintaining All-Pairs Edge-Connectivities

Lap Chi Lau and Chun Kong Yung

Department of Computer Science and Engineering


The Chinese University of Hong Kong

Abstract. In this paper we present new edge splitting-off results maintaining
all-pairs edge-connectivities of a graph. We first give an alternate proof of
Mader’s theorem, and use it to obtain a deterministic Õ(r_max^2 · n^2)-time
complete edge splitting-off algorithm for unweighted graphs, where r_max denotes
the maximum edge-connectivity requirement. This improves upon the best known
algorithm by Gabow by a factor of Ω̃(n). We then prove a new structural property,
and use it to further speed up the algorithm to obtain a randomized
Õ(m + r_max^3 · n)-time algorithm. These edge splitting-off algorithms can be
used directly to speed up various graph algorithms.

1 Introduction
The edge splitting-off operation plays an important role in many basic graph
problems, both in proving theorems and obtaining efficient algorithms. Splitting-
off a pair of edges (xu, xv) means deleting these two edges and adding a new
edge uv if u ≠ v. This operation was introduced by Lovász [18], who showed that
splitting-off can be performed to maintain the global edge-connectivity of a graph.
Mader extended Lovász’s result significantly to prove that splitting-off can be
performed to maintain the local edge-connectivity for all pairs:
Theorem 1 (Mader [19]). Let G = (V, E) be an undirected graph that has at
least r(s, t) edge-disjoint paths between s and t for all s, t ∈ V − x. If there is
no cut edge incident to x and d(x) ≠ 3, then some edge pair (xu, xv) can be
split-off so that in the resulting graph there are still at least r(s, t) edge-disjoint
paths between s and t for all s, t ∈ V − x.
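
The splitting-off operation itself is a small multigraph update. The following
is a minimal sketch, assuming the multigraph is stored as a `Counter` mapping a
sorted vertex pair to its multiplicity; the helper names `edge` and `split_off`
are invented for this illustration.

```python
from collections import Counter


def edge(u, v):
    """Canonical key for an undirected edge."""
    return (u, v) if u <= v else (v, u)


def split_off(g, x, u, v):
    """Return a new multigraph after splitting off the pair (xu, xv).

    Deletes one copy of xu and one copy of xv, and adds uv when u != v.
    """
    h = Counter(g)
    needed = Counter([edge(x, u), edge(x, v)])
    for e, k in needed.items():
        if h[e] < k:
            raise ValueError(f"not enough copies of edge {e}")
    h.subtract(needed)
    if u != v:
        h[edge(u, v)] += 1
    return +h  # unary plus drops edges whose multiplicity fell to zero


g = Counter({edge("a", "x"): 1, edge("b", "x"): 1, edge("a", "b"): 1})
print(split_off(g, "x", "a", "b"))  # Counter({('a', 'b'): 2})
```

Whether the resulting graph still satisfies the requirements r(s, t) is a
separate check that this helper does not perform.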
These splitting-off theorems have applications in various graph problems. Lovász
[18] and Mader [19] used their splitting-off theorems to derive Nash-Williams’ graph
orientation theorems [23]. Subsequently these theorems and their extensions have
found applications in a number of problems, including edge-connectivity augmen-
tation problems [4, 8, 9], network design problems [7, 13, 16], tree packing problems
[1, 6, 17], and graph orientation problems [11].
Efficient splitting-off algorithms have been developed to give fast algorithms
for the above problems [4, 6, 12, 20, 22]. However, most of the efficient algorithms
are developed only in the global edge-connectivity setting, although there are
important applications in the more general local edge-connectivity setting.

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 96–109, 2010.

© Springer-Verlag Berlin Heidelberg 2010

In this paper we present new edge splitting-off results maintaining all-pairs
edge-connectivities. First we give an alternate proof of Mader’s theorem
(Theorem 1). Based on this, we develop a faster deterministic algorithm for edge
splitting-off maintaining all-pairs edge-connectivities (Theorem 2). Then we prove
a new structural property (Theorem 3), and use it to design a randomized pro-
cedure to further speedup the splitting-off algorithm (Theorem 2). These algo-
rithms improve the best known algorithm by a factor of Ω̃(n), and can be applied
directly to speedup various graph algorithms using edge splitting-off.

1.1 Efficient Complete Edge Splitting-Off Algorithm

Mader’s theorem can be applied repeatedly until d(x) = 0 when d(x) is even and
there is no cut edge incident to x. This is called a complete edge splitting-off at
x, which is a key subroutine in algorithms for connectivity augmentation, graph
orientation, and tree packing.
A straightforward algorithm to compute a complete splitting-off sequence is
to split-off (xu, xv) for every pair u, v ∈ N (x) where N (x) is the neighbor set
of x, and then check whether the connectivity requirements are violated by
computing all-pairs edge-connectivities in the resulting graph, and repeat this
procedure until d(x) = 0.
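
The straightforward procedure can be sketched as follows. This is a naive
illustration under stated assumptions, not the algorithm developed in this
paper: the multigraph is a `Counter` of sorted vertex pairs, λ(s, t) is
computed with simple BFS augmenting paths, and all function names are
hypothetical.

```python
from collections import Counter, deque
from itertools import combinations_with_replacement


def edge(u, v):
    return (u, v) if u <= v else (v, u)


def connectivity(g, s, t):
    """lambda(s, t): number of edge-disjoint s-t paths (unit augmentations)."""
    cap = Counter()
    for (u, v), k in g.items():
        cap[(u, v)] += k
        cap[(v, u)] += k
    adj = {}
    for u, v in cap:
        adj.setdefault(u, set()).add(v)
    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        v = t
        while parent[v] is not None:  # push one unit along the path found
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1


def split_off(g, x, u, v):
    h = Counter(g)
    h.subtract(Counter([edge(x, u), edge(x, v)]))
    if u != v:
        h[edge(u, v)] += 1
    return +h


def neighbors(g, x):
    return sorted({v for e, k in g.items() if k > 0 and x in e
                   for v in e if v != x})


def satisfies(g, req):
    return all(connectivity(g, s, t) >= k for (s, t), k in req.items())


def complete_split_off(g, x, req):
    """Try pairs of x-neighbors until none can be split off any more."""
    g, sequence = +Counter(g), []
    while True:
        for u, v in combinations_with_replacement(neighbors(g, x), 2):
            if u == v and g[edge(x, u)] < 2:
                continue  # a pair (xu, xu) needs two parallel copies
            h = split_off(g, x, u, v)
            if satisfies(h, req):  # all-pairs check after every attempt
                g = h
                sequence.append((u, v))
                break
        else:
            return g, sequence
```

Each accepted pair triggers a full recheck of every requirement, which is what
makes this baseline slow.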
Several efficient algorithms are proposed for the complete splitting-off
problem, but only Gabow’s algorithm [12] can be used in the local
edge-connectivity setting, with running time O(r_max^2 · n^3). Our algorithms
improve the running time of Gabow’s algorithm by a factor of Ω̃(n). In
applications where r_max is small, the improvement of the randomized algorithm
could be a factor of Ω̃(n^2).
Theorem 2. In the local edge-connectivity setting, there is a deterministic
Õ(r_max^2 · n^2)-time algorithm and a randomized Õ(m + r_max^3 · n)-time
algorithm for the complete edge splitting-off problem in unweighted graphs.
These edge splitting-off algorithms can be used directly to improve the running
time of various graph algorithms [7, 9, 12, 13, 17, 23]. For instance, using
Theorem 2 in Gabow’s local edge-connectivity augmentation algorithm [12]
in unweighted graphs, the running time can be improved from Õ(r_max^2 n^3) to
Õ(r_max^2 n^2) time. Similarly, using Theorem 2 in Gabow’s orientation algorithm
[12], one can find a well-balanced orientation in unweighted graphs in
Õ(r_max^3 n^2) expected time, improving the O(r_max^2 n^3) result by Gabow [12].
We will not discuss the details of these applications in this paper.
Our edge splitting-off algorithms are conceptually very simple, which can be
seen as refinements of the straightforward algorithm. The improvements come
from some new structural results, and a recent fast Gomory-Hu tree construc-
tion algorithm by Bhalgat, Hariharan, Kavitha, and Panigrahi [5]. First, in
Section 3.2, we show how to find a complete edge splitting-off sequence by using
at most O(|N(x)|) splitting-off attempts, instead of O(|N(x)|^2) attempts by the
straightforward algorithm. This is based on an alternative proof of Mader’s the-
orem in Section 3.1. Then, in Section 3.4, we show how to reduce the problem of
checking local edge-connectivities for all pairs, to the problem of checking local

edge-connectivities from a particular vertex (i.e. checking at most O(n) pairs
instead of checking O(n^2) pairs). This allows us to use the recent fast Gomory-
Hu tree algorithm [5] to check connectivities efficiently. Finally, using a new
structural property (Theorem 3), we show how to speedup the algorithm by a
randomized edge splitting-off procedure in Section 4.

1.2 Structural Property and Randomized Algorithm


Mader’s theorem shows the existence of one admissible edge pair, whose splitting-
off maintains the local edge-connectivity requirements of the graph. Given an
edge xv, we say an edge xw is a non-admissible partner of xv if (xv, xw) is not
admissible. We prove a tight upper bound on the number of non-admissible part-
ners of a given edge xv, which may be of independent interest. In the following
r_max := max_{s,t∈V−x} r(s, t) is the maximum edge-connectivity requirement.
Theorem 3. Suppose there is no cut edge incident to x and rmax ≥ 2. Then the
number of non-admissible partners for any given edge xv is at most 2rmax − 2.
This improves the result of Bang-Jensen and Jordán [2] by a factor of rmax , and
the bound is best possible as there are examples achieving it. Theorem 3 implies
that when d(x) is considerably larger than rmax , most of the edge pairs incident
to x are admissible. Therefore, we can split-off edge pairs randomly to speedup
our efficient splitting-off algorithm. The proof of Theorem 3 is based on a new
inductive argument and will be presented in Section 4.
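
As a rough illustration of why random pairing helps, the following bound is a
back-of-the-envelope consequence of Theorem 3 worked out here rather than a
statement from the paper. It assumes the partner xw of a fixed edge xv is drawn
uniformly from the other d(x) − 1 edges incident to x, counted with
multiplicity.

```latex
\Pr\bigl[(xv, xw)\ \text{is admissible}\bigr]
  \;\ge\; 1 - \frac{2 r_{\max} - 2}{d(x) - 1}.
```

For example, with r_max = 3 and d(x) = 41, at most 4 of the 40 candidate
partners are non-admissible, so a uniformly random partner is admissible with
probability at least 0.9.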

2 Preliminaries
Let G = (V, E) be a graph. For X, Y ⊆ V, denote by δ(X, Y) the set of edges
with one endpoint in X − Y and the other endpoint in Y − X and
d(X, Y) := |δ(X, Y)|, and also define d̄(X, Y) := d(X ∩ Y, V − (X ∪ Y)). For
X ⊆ V, define δ(X) := δ(X, V − X) and the degree of X as d(X) := |δ(X)|. Denote
the degree of a vertex as d(v) := d({v}). Also denote the set of neighbors of v
by N(v), and call a vertex in N(v) a v-neighbor.
Let λ(s, t) be the maximum number of edge-disjoint paths between s and t in
G, and let r(s, t) be an edge-connectivity requirement for s, t ∈ V. The
connectivity requirement is global if r(s, t) = k for all s, t ∈ V, otherwise it
is local. We say a graph G satisfies the connectivity requirements if
λ(s, t) ≥ r(s, t) for any s, t ∈ V. The requirement r(X) of a set X ⊆ V is the
maximum edge-connectivity requirement between u and v with u ∈ X and v ∈ V − X.
By Menger’s theorem, to satisfy the requirements, it suffices to guarantee that
d(X) ≥ r(X) for all X ⊂ V. The surplus s(X) of a set X ⊆ V is defined as
d(X) − r(X). A graph satisfies the edge-connectivity requirements if s(X) ≥ 0
for all ∅ ≠ X ⊂ V. For X ⊂ V − x, X is called dangerous if s(X) ≤ 1 and tight if
s(X) = 0. The following proposition will be used throughout our proofs.
Proposition 4 ([10] Proposition 2.3). For X, Y ⊆ V at least one of the
following inequalities holds:

    s(X) + s(Y) ≥ s(X ∩ Y) + s(X ∪ Y) + 2d(X, Y)        (4a)
    s(X) + s(Y) ≥ s(X − Y) + s(Y − X) + 2d̄(X, Y)        (4b)
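
The quantities d, d̄, r and s, and the dichotomy of Proposition 4, can be
checked by brute force on a small instance. The sketch below uses a
hypothetical 4-vertex multigraph and requirement function chosen only for
illustration, with the convention (an assumption of this sketch) that
r(∅) = r(V) = 0.

```python
from itertools import combinations

# Hypothetical instance: a 4-cycle a-b-c-d with the edge ab doubled.
V = {"a", "b", "c", "d"}
edges = [("a", "b"), ("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
req = {frozenset(p): 1 for p in combinations(sorted(V), 2)}
req[frozenset(("a", "b"))] = 2


def d_between(P, Q):
    """Number of edges with one endpoint in P and the other in Q."""
    return sum(1 for u, v in edges
               if (u in P and v in Q) or (u in Q and v in P))


def d(X):
    return d_between(X, V - X)


def d_pair(X, Y):          # d(X, Y)
    return d_between(X - Y, Y - X)


def d_bar(X, Y):           # d-bar(X, Y)
    return d_between(X & Y, V - (X | Y))


def r(X):
    """Largest requirement separated by X (0 for the empty set and for V)."""
    return max((req[frozenset((u, v))] for u in X for v in V - X), default=0)


def s(X):
    """Surplus of X."""
    return d(X) - r(X)


def prop4_holds(X, Y):
    lhs = s(X) + s(Y)
    return (lhs >= s(X & Y) + s(X | Y) + 2 * d_pair(X, Y)      # (4a)
            or lhs >= s(X - Y) + s(Y - X) + 2 * d_bar(X, Y))   # (4b)


X = frozenset("ab")
print(d(X), r(X), s(X))  # 2 1 1
```

Here s({a, b}) = 1, so this set meets the surplus condition s(X) ≤ 1 used in
the definition of a dangerous set.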
In edge splitting-off problems, the objective is to split-off a pair of edges incident
to a designated vertex x to maintain the edge-connectivity requirements for all
other pairs in V −x. For this purpose, we may assume that the edge-connectivity
requirements between x and other vertices are zero. In particular, we may assume
that r(V − x) = 0 and thus the set V − x is not a dangerous set. Two edges
xu, xv form an admissible pair if the graph after splitting-off (xu, xv) does not
violate s(X) ≥ 0 for all X ⊂ V . Given an edge xv, we say an edge xw is a
non-admissible partner of xv if (xv, xw) is not admissible. The following simple
proposition characterizes when a pair is admissible.
Proposition 5 ([10] Claim 3.1). A pair xu, xv is not admissible if and only
if u, v are contained in a dangerous set.
A vertex subset S ⊆ N (x) is called a non-admissible set if (xu, xv) is non-
admissible for every u, v ∈ S. We define the capacity of an edge pair to be the
number of copies of the edge pair that can be split-off while satisfying edge-
connectivity requirements. In our algorithms we will always split-off an edge
pair to its capacity (which could be zero), and only attempt at most O(|N (x)|)
many pairs. Following the definition of Gabow [12], we say that a splitting-off
operation voids a vertex u if d(x, u) = 0 after the splitting-off.
Throughout the complete splitting-off algorithm, we assume that there is no
cut edge incident to x. This holds at the beginning by our assumption, and so
the local edge-connectivity between x and v is at least two for each x-neighbor
v. Therefore, we can reset the connectivity requirement between u and v as
max{r(u, v), 2}, and hence splitting-off any admissible pair would maintain the
property that there is no cut edge incident to x at each step.

2.1 Some Useful Results


The first lemma is about a reduction step of contracting tight sets. Suppose
there is a non-trivial tight set T , i.e. T is a tight set and |T | ≥ 2. Clearly there
are no admissible pairs xu, xv with u, v ∈ T . Let G/T be the graph obtained
by contracting T into a single vertex t, and define the connectivity requirement
r(t, v) as maxu∈T r(u, v), while other connectivity requirements remain the same.
The following lemma says that one can consider the admissible pairs in G/T ,
without losing any information about the admissible pairs in G. This lemma is
useful in proofs to assume that every tight set is a singleton, and is useful in
algorithms to allow us to make progress by contracting non-trivial tight sets.
Lemma 6 ([19], [10] Claim 3.2). Let T be a non-trivial tight set. For an
x-neighbor w in G/T, let w′ be the corresponding vertex in G if w ≠ t, and let w′
be any x-neighbor in T in G if w = t. Suppose (xu, xv) is an admissible pair in
G/T, then (xu′, xv′) is an admissible pair in G.
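
The contraction G/T and the updated requirements r(t, v) = max_{u∈T} r(u, v)
can be sketched mechanically. The code below is an illustration under
assumptions: the multigraph is a `Counter` of sorted vertex pairs, requirements
are keyed by `frozenset` pairs, the new vertex is named "t", and the function
does not verify that T is tight.

```python
from collections import Counter


def edge(u, v):
    return (u, v) if u <= v else (v, u)


def contract(g, req, T, t="t"):
    """Contract the vertex set T into one vertex t; return (graph, requirements)."""
    def rename(v):
        return t if v in T else v

    h = Counter()
    for (u, v), k in g.items():
        a, b = rename(u), rename(v)
        if a != b:  # edges with both endpoints in T disappear
            h[edge(a, b)] += k

    new_req = {}
    for pair, k in req.items():
        u, v = tuple(pair)
        a, b = rename(u), rename(v)
        if a != b:  # r(t, v) is the maximum of r(u, v) over u in T
            key = frozenset((a, b))
            new_req[key] = max(new_req.get(key, 0), k)
    return h, new_req


g = Counter({("a", "b"): 3, ("a", "x"): 1, ("b", "x"): 1,
             ("b", "c"): 1, ("c", "x"): 2})
req = {frozenset("ab"): 3, frozenset("ac"): 1, frozenset("bc"): 2}
h, new_req = contract(g, req, {"a", "b"})
print(h[("t", "x")], new_req[frozenset(("t", "c"))])  # 2 2
```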
The next lemma proved in [7] shows that if the conditions in Mader’s theorem are
satisfied, then there is no “3-dangerous-set structure”. This lemma is important
in the efficient edge splitting-off algorithm.

Lemma 7 ([7] Lemma 2.7). If d(x) ≠ 3 and there is no cut edge incident to
x, then there are no maximal dangerous sets X, Y, Z and u, v, w ∈ N(x) with
u ∈ X ∩ Y, v ∈ X ∩ Z, w ∈ Y ∩ Z and u, v, w ∉ X ∩ Y ∩ Z.
Nagamochi and Ibaraki [21] gave a fast algorithm to find a sparse subgraph that
satisfies edge-connectivity requirements, which will be used in Section 3.3 as a
preprocessing step.
Theorem 8 ([21] Lemma 2.1). There is an O(m)-time algorithm to construct
a subgraph with O(rmax · n) edges that satisfies all the connectivity requirements.
As a key tool in checking local edge-connectivities, we need to construct a
Gomory-Hu tree, which is a compact representation of all pairwise min-cuts
of an undirected graph. Let G = (V, E) be an undirected graph, a Gomory-Hu
tree is a weighted tree T = (V, F ) with the following property. Consider any
s, t ∈ V , the unique s-t path P in T , an edge e = uv on P with minimum
weight, and any component K of T − e. Then the local edge-connectivity be-
tween s and t in G is equal to the weight of e in T , and δ(K) is a minimum s-t
cut in G. To check whether the connectivity requirements are satisfied, we only
need to check the pairs with λ(u, v) ≤ rmax . A partial Gomory-Hu tree Tk of G is
obtained from a Gomory-Hu tree T of G by contracting all edges with weight at
least k. Therefore, each node in Tk represents a subset of vertices S in G, where
the local edge-connectivity between each pair of vertices in S is at least k. For
vertices u, v ∈ G in different nodes of Tk , their local edge-connectivity (which
is less than k) is determined in the same way as in an ordinary Gomory-Hu
tree. Bhalgat et al. [5] gave a fast randomized algorithm to construct a partial
Gomory-Hu tree. We will use the following theorem by setting k = rmax . The
following result can be obtained by using the algorithm in [15], with the fast tree
packing algorithm in [5].
Theorem 9 ([5, 15]). A partial Gomory-Hu tree Tk can be constructed in
Õ(km) expected time.
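The two tree operations described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tree is given as a list of (u, v, weight) triples, and the function names `path_min_weight` and `contract_heavy_edges` are hypothetical, not taken from [5] or [15]. The sketch queries an already computed Gomory-Hu tree; it does not construct one.

```python
def path_min_weight(tree_edges, s, t):
    """Return the minimum edge weight on the unique s-t path of a weighted tree.

    For a Gomory-Hu tree this value equals the local edge-connectivity of s and t.
    """
    adj = {}
    for u, v, w in tree_edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    # Depth-first search; in a tree it suffices to avoid stepping back to the parent.
    stack = [(s, None, float("inf"))]
    while stack:
        node, parent, best = stack.pop()
        if node == t:
            return best
        for nxt, w in adj.get(node, []):
            if nxt != parent:
                stack.append((nxt, node, min(best, w)))
    raise ValueError("s and t are not connected in the tree")


def contract_heavy_edges(vertices, tree_edges, k):
    """Contract all tree edges of weight >= k (the partial tree T_k).

    Returns (groups, light_edges): groups maps a representative to the set of
    original vertices merged into that node, and light_edges are the remaining
    edges (weight < k) between representatives.
    """
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v, w in tree_edges:
        if w >= k:
            parent[find(u)] = find(v)
    groups = {}
    for v in vertices:
        groups.setdefault(find(v), set()).add(v)
    light_edges = [(find(u), find(v), w) for u, v, w in tree_edges if w < k]
    return groups, light_edges


# Toy tree on four vertices: a -3- b -1- c -2- d.
edges = [("a", "b", 3), ("b", "c", 1), ("c", "d", 2)]
groups, light = contract_heavy_edges(["a", "b", "c", "d"], edges, 2)
```

In the toy tree, contracting with k = 2 merges {a, b} and {c, d} into two nodes joined by the single edge of weight 1, which matches the description that vertices inside one node of T_k have local edge-connectivity at least k.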

3 Efficient Complete Edge Splitting-Off Algorithm


In this section we present the deterministic splitting-off algorithm as stated
in Theorem 2. First we present an alternative proof of Mader’s theorem in
Section 3.1. Extending the ideas in the alternative proof we show how to find
a complete edge splitting-off sequence by only O(|N (x)|) edge splitting-off at-
tempts in Section 3.2. Then, in Section 3.3, we show how to efficiently perform
one edge splitting-off attempt, by doing some preprocessing and applying some
fast algorithms to check edge-connectivities. Combining these two steps yields
an Õ(rmax² · n²) randomized algorithm for the complete splitting-off problem.
Finally, in Section 3.5, we describe how to modify some steps in Section 3.3 to
obtain an Õ(rmax² · n²) deterministic algorithm for the problem.
Efficient Edge Splitting-Off Algorithms 101

3.1 Mader’s Theorem


We present an alternative proof of Mader’s theorem, which can be extended to
obtain an efficient algorithm. The following lemma about non-admissible sets
can be used directly to derive Mader’s theorem.
Lemma 10. Suppose there is no 3-dangerous set structure. Then, for any non-
admissible set U ⊆ N (x) with |U | ≥ 2, there is a dangerous set containing U .

Proof. We prove the lemma by induction on |U|. The statement holds trivially
for |U| = 2 by Proposition 5. Consider U = {u_1, u_2, ..., u_{k+1}} ⊆ N(x) where
every pair (u_i, u_j) is non-admissible. By induction, there are maximal
dangerous sets X, Y such that {u_1, ..., u_{k−1}, u_k} ⊆ X and
{u_1, ..., u_{k−1}, u_{k+1}} ⊆ Y. Since (u_k, u_{k+1}) is non-admissible, by
Proposition 5, there is a dangerous set Z containing u_k and u_{k+1}. If
u_{k+1} ∉ X and u_k ∉ Y and there is some u_i ∉ Z, then X, Y and Z form a
3-dangerous-set structure with u = u_i, v = u_k, w = u_{k+1}, contradicting the
assumption. Hence one of X, Y or Z contains U.

To prove Mader's theorem, consider a vertex x ∈ V such that d(x) is even and
there is no cut edge incident to it. By Lemma 7, there is no 3-dangerous set
structure in G. Suppose that there is no admissible pair incident to x. Then, by
Lemma 10, there is a dangerous set D containing all the vertices in N(x). But
this is impossible since r(V − D − x) = r(D) ≥ d(D) − 1 = d(V − D − x) + d(x) − 1 ≥
d(V − D − x) + 1, contradicting that the connectivity requirements are satisfied
in G. This completes the proof.
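For readers who want each step of the final chain justified, the following annotated restatement may help. It is a sketch that assumes the usual conventions of this setting, which are not restated in this section: a dangerous set D satisfies d(D) − r(D) ≤ 1, no requirement involves x, and d(x) ≥ 2 because d(x) is even and positive.

```latex
\begin{align*}
r(V - D - x) &= r(D)
  && \text{(requirements only involve pairs in } V - x\text{)}\\
&\ge d(D) - 1
  && \text{($D$ is dangerous)}\\
&= d(V - D - x) + d(x) - 1
  && \text{($N(x) \subseteq D$, so every $x$-edge enters $D$)}\\
&\ge d(V - D - x) + 1
  && \text{($d(x) \ge 2$)}
\end{align*}
```

The last line says that the set V − D − x has a requirement exceeding its degree, which is the stated contradiction.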

3.2 An Upper Bound on Splitting-Off Attempts


Extending the ideas in the proof of Lemma 10, we present an algorithm to
find a complete splitting-off sequence by making at most O(|N (x)|) splitting-off
attempts (to split-off to capacity). In the algorithm we maintain a non-admissible
set C; initially C = ∅. In each step the algorithm applies one of the three
operations guaranteed by the following lemma. Here we assume that {u} is a
non-admissible set for every u ∈ N(x). This can be achieved by a pre-processing
step that splits off every (u, u) to capacity.
Lemma 11. Suppose that C is a non-admissible set and there is a vertex u ∈
N (x) − C. Then, using at most three splitting-off attempts, at least one of the
following operations can be applied:
1. Splitting-off an edge pair to capacity that voids an x-neighbor.
2. Deducing that every pair in C ∪ {u} is non-admissible, and add u to C.
3. Contracting a tight set T containing at least two x-neighbors.

Proof. We consider three cases based on the size of C. When |C| = 0, we simply
assign C = {u}. When |C| = 1, pick the vertex v ∈ C, and split-off (u, v) to
capacity. Either case (1) applies when either u or v becomes void, or case (2)
applies in the resulting graph after (u, v) is split-off to capacity. Hence, when
|C| ≤ 1, either case (1) or case (2) applies after only one splitting-off attempt.
The interesting case is when |C| ≥ 2; let v1, v2 ∈ C. Since C is a
non-admissible set, by Lemma 10, there is a maximal dangerous set D containing C.
First, we split-off (u, v1) and (u, v2) to capacity. If case (1) applies then we
are done, so we assume that none of the three x-neighbors voids, implying that
(u, v1) and (u, v2) are non-admissible in the resulting graph G′ after
splitting-off these edge pairs to capacity. Note that the edge pair (v1, v2) is
also non-admissible since a non-admissible edge pair in G remains non-admissible
in G′. By Lemma 10, there exists a maximal dangerous set D′ covering the
non-admissible set {u, v1, v2}. Then inequality (4b) cannot hold for D and D′,
since 1 + 1 = s(D) + s(D′) ≥ s(D − D′) + s(D′ − D) + 2d̄(D, D′) ≥
0 + 0 + 2d(x, {v1, v2}) ≥ 2 · 2. Therefore inequality (4a) must hold for D and
D′, hence 1 + 1 = s(D) + s(D′) ≥ s(D ∩ D′) + s(D ∪ D′).
This implies that either D ∪ D′ is a dangerous set for which case (2) applies,
since C ∪ {u} is contained in a dangerous set and hence every pair is a
non-admissible pair by Proposition 5, or D ∩ D′ is a tight set for which case (3)
applies since v1 and v2 are x-neighbors. Note that v1, v2 are contained in a
tight set if and only if after splitting-off one copy of (xv1, xv2) the
connectivity requirement of some pair is violated by two. Hence this can be
checked by one splitting-off attempt, and thus we can distinguish between
case (2) and case (3), and in case (3) we can find such a tight set efficiently.
Therefore, by making at most three splitting-off attempts ((xu, xv1), (xu, xv2),
(xv1, xv2)), one of the three operations can be applied.

The following result can be obtained by applying Lemma 11 repeatedly.
Lemma 12. The algorithm computes a complete edge splitting-off sequence using
O(|N(x)|) splitting-off attempts.
Proof. The algorithm maintains the property that C is a non-admissible set,
which holds at the beginning when C = ∅. It is clear that in case (2) the set
C remains non-admissible. In case (1), by splitting-off an admissible pair, every
pair of vertices in C remains non-admissible. Also, in case (3), by contracting a
tight set, every pair of vertices in C remains non-admissible by Lemma 6.
The algorithm terminates when there is no vertex in N(x) − C. At that time,
if C = ∅, then we have found a complete splitting-off sequence; if C ≠ ∅, then by
Mader's theorem (or by the proof in Section 3.1), this only happens if d(x) = 3
and d(x) was odd at the beginning. In any case, the longest splitting-off sequence
is found and the given complete edge splitting-off problem is solved.
It remains to prove that the total number of splitting-off attempts in the whole
algorithm is O(|N(x)|). To see this, we claim that each of the operations
in Lemma 11 will be performed at most |N(x)| times. Indeed, cases (1) and (3)
will be applied at most |N(x)| times since each application reduces the number
of x-neighbors by at least one, and case (2) will be applied at most |N(x)| times
since each application reduces the number of x-neighbors in N(x) − C by one.
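The counting in this proof can be made explicit. The constant below is not stated in the text; it is only what follows from combining the bound of three attempts per application in Lemma 11 with the bound of |N(x)| applications per operation, and no attempt is made to optimize it.

```latex
\[
\#\text{attempts}
\;\le\; 3 \cdot \bigl(\#\text{op.\,(1)} + \#\text{op.\,(2)} + \#\text{op.\,(3)}\bigr)
\;\le\; 3 \cdot 3\,|N(x)|
\;=\; 9\,|N(x)|
\;=\; O(|N(x)|).
\]
```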

3.3 Algorithm Outline


The following is an outline of the whole algorithm for the complete splitting-off
problem. First we use the O(m) time algorithm in Theorem 8 to construct a
subgraph of G with O(rmax · n) edges satisfying the connectivity requirements.
To find a complete splitting-off sequence, we can thus restrict our attention to
maintaining the local edge-connectivities in this subgraph.
In the next preprocessing step, we will reduce the problem further to an
instance where there is a particular indicator vertex t ≠ x, with the property
that λ(u, v) = min{λ(u, t), λ(v, t)} for any pair of vertices u, v ∈ V − x with
λ(u, v) ≤ rmax. With this indicator vertex, to check the local
edge-connectivity for all pairs with λ(u, v) ≤ rmax, we only need to check the
local edge-connectivities from t to every vertex v with λ(v, t) ≤ rmax. This
allows us to make only O(n) queries (instead of O(n²) queries) to check the local
edge-connectivities. This reduction step can be done by computing a partial
Gomory-Hu tree and contracting appropriate tight sets; see the details in
Section 3.4. The total preprocessing time is at most Õ(m + rmax² · n), by using
the fast Gomory-Hu tree algorithm in Theorem 9.
After these two preprocessing steps, we can perform a splitting-off attempt
(split-off a pair to capacity) efficiently. For a vertex pair (u, v), we replace
min{d(x, u), d(x, v)} copies of xu and xv by copies of uv, and then determine
the maximum violation of connectivity requirements by constructing a partial
Gomory-Hu tree and checking the local edge-connectivities from the indicator
vertex t to every other vertex. If q is the maximum violation of the connectivity
requirements, then exactly min{d(x, u), d(x, v)} − ⌈q/2⌉ copies of (xu, xv) are
admissible. Therefore, using Theorem 9, one splitting-off attempt can be performed
in Õ(rmax · m + n) = Õ(rmax² · n) expected time. By Lemma 12, the complete
splitting-off problem can be solved by at most O(|N(x)|) = O(n) splitting-off
attempts. Hence we obtain the following result.
Theorem 13. The complete edge splitting-off problem can be solved in
Õ(rmax² · |N(x)| · n) = Õ(rmax² · n²) expected time.
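A single splitting-off attempt can be illustrated with a small brute-force sketch. It is not the efficient procedure of this section: instead of a partial Gomory-Hu tree and an indicator vertex, it recomputes the local edge-connectivity of every required pair with a plain augmenting-path max-flow. The representation (a dict of dicts of edge multiplicities), the requirement dict `req`, and the names `local_connectivity` and `split_off_attempt` are assumptions made for the example, and the halved violation is rounded up, which is our reading of the rule above. The sketch assumes u ≠ v.

```python
from collections import deque
from math import ceil


def local_connectivity(cap, s, t):
    """Maximum number of edge-disjoint s-t paths in an undirected multigraph.

    cap[a][b] is the number of parallel edges between a and b (symmetric).
    """
    res = {a: dict(nbrs) for a, nbrs in cap.items()}  # residual capacities
    flow = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            a = queue.popleft()
            for b, c in res[a].items():
                if c > 0 and b not in parent:
                    parent[b] = a
                    queue.append(b)
        if t not in parent:
            return flow
        # Find the bottleneck on the path, then push that much flow.
        bottleneck, b = float("inf"), t
        while parent[b] is not None:
            bottleneck = min(bottleneck, res[parent[b]][b])
            b = parent[b]
        b = t
        while parent[b] is not None:
            a = parent[b]
            res[a][b] -= bottleneck
            res[b][a] = res[b].get(a, 0) + bottleneck
            b = a
        flow += bottleneck


def split_off_attempt(cap, x, u, v, req):
    """Number of admissible copies of the pair (xu, xv), following the text:
    split off k = min{d(x,u), d(x,v)} copies, measure the maximum violation q,
    and report k - ceil(q / 2)."""
    k = min(cap[x].get(u, 0), cap[x].get(v, 0))
    trial = {a: dict(nbrs) for a, nbrs in cap.items()}
    for a, b, delta in ((x, u, -k), (x, v, -k), (u, v, k)):
        trial[a][b] = trial[a].get(b, 0) + delta
        trial[b][a] = trial[b].get(a, 0) + delta
    q = max([0] + [r - local_connectivity(trial, a, b) for (a, b), r in req.items()])
    return k - ceil(q / 2)


# Example 1: x has two parallel edges to each of a and b; both copies can be split.
double = {"x": {"a": 2, "b": 2}, "a": {"x": 2}, "b": {"x": 2}}
# Example 2: a star with d(x) = 3; splitting (xa, xb) would disconnect c from a.
star = {"x": {"a": 1, "b": 1, "c": 1}, "a": {"x": 1}, "b": {"x": 1}, "c": {"x": 1}}
star_req = {("a", "b"): 1, ("a", "c"): 1, ("b", "c"): 1}
```

The second example is the familiar d(x) = 3 situation in which no pair at x is admissible, so the attempt reports zero admissible copies.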

3.4 Indicator Vertex


We show how to reduce the problem into an instance with a particular indicator
vertex t ≠ x, with the property that if λ(u, v) ≤ rmax for u, v ≠ x, then λ(u, v) =
min{λ(u, t), λ(v, t)}. Hence if we could maintain the local edge-connectivity from
t to v for every v ∈ V − x with λ(v, t) ≤ rmax, then the connectivity requirements
for every pair in V − x will be satisfied. Furthermore, by maintaining the local
edge-connectivity, the indicator vertex t will remain an indicator vertex,
and therefore this procedure needs to be executed only once. Without loss of
generality, we assume that the connectivity requirement for each pair of vertices
u, v ∈ V − x is equal to min{λ(u, v), rmax}, and r(x, v) = 0 for every v ∈ V − x.
First we compute a partial Gomory-Hu tree Trmax in Õ(rmax · m) time by
Theorem 9, which is Õ(rmax² · n) after applying the sparsifying algorithm in
Theorem 8. Recall that each node in Trmax represents a subset of vertices in G.
In the following we will use a capital letter (say U) to denote both a node in
Trmax and the corresponding subset of vertices in G. If Trmax has only one node,
then this means that the local edge-connectivity between every pair of vertices in
G is at least rmax. In this case, any vertex t ≠ x is an indicator vertex. So assume
that Trmax has at least two nodes. Let X be the node in Trmax that contains x
in G, and U1, ..., Up be the nodes adjacent to X in Trmax, and let XU1 be the
edge in Trmax with largest weight among XUi for 1 ≤ i ≤ p. See Figure (a).

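[Figure: panel (a), the node X (containing x and t) with adjacent nodes
U1, U2, ..., Up and the sets U1*, U2*, ..., Up*; panel (b), the node X = {x}
with adjacent node U1 (containing t), the nodes U2, ..., Up and W1, ..., Wq,
and the sets U2*, ..., Up*, W1*, ..., Wq*.]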

Suppose X contains a vertex t ≠ x in G. The idea is to contract tight sets so
that t will become an indicator vertex in the resulting graph. For any edge XUi
in Trmax, let Ti be the component of Trmax that contains Ui when XUi is removed
from Trmax. We claim that each Ui* := ∪_{U ∈ Ti} U is a tight set in G; see
Figure (a). By the definition of a Gomory-Hu tree, the local edge-connectivity
between any vertex ui ∈ Ui and t is equal to the edge weight of XUi in Trmax.
Also, by the definition of a Gomory-Hu tree, d(Ui*) is equal to the weight of
edge XUi in Trmax. Therefore, Ui* is a tight set in G, because
r(ui, t) = λ(ui, t) = d(Ui*) for some pair ui, t ∈ V − x. By Proposition 5, we
can contract each Ui* into a single vertex ui for 1 ≤ i ≤ p without losing any
information about admissible pairs in G. Since each Ui* becomes a single vertex,
the vertex t becomes an indicator vertex in the resulting graph.
Suppose X contains only x in G. Then U1* may not be a tight set, since there
may not exist a pair u, v ∈ V − x with r(u, v) = λ(u, v) = d(U1*) (note that
there is a vertex v with λ(x, v) = d(U1*), but r(x, v) = 0 for every vertex v). In
this case, we will contract some tight sets so that any vertex in U1 will become
an indicator vertex. Let W1 ≠ X, ..., Wq ≠ X be the nodes (if any) adjacent
to U1 in Trmax; see Figure (b). By using similar arguments as before, it can be
shown that each Ui* is a tight set for 2 ≤ i ≤ p (through ui ∈ Ui and u1 ∈ U1).
Therefore we can contract each Ui* into a single vertex ui for 2 ≤ i ≤ p. Similarly,
we can argue that each Wj* (defined analogously to Ui*) is a tight set, and hence
we can contract each Wj* into a single vertex wj for each 1 ≤ j ≤ q. We can
see that any vertex t ∈ U1 is an indicator vertex in the resulting graph, because
λ(t, v) ≥ min{λ(w, v), rmax} for any pair of vertices v, w.
Henceforth we can consider this resulting graph instead of G for the purpose of
computing a complete splitting-off sequence, and use t as the indicator vertex
to check connectivities. The running time of this procedure is dominated by the
partial Gomory-Hu tree computation, which is at most Õ(rmax² · n).
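The defining property of an indicator vertex is easy to test by brute force on a small instance. The following sketch assumes the pairwise local edge-connectivities are already available in a dict keyed by unordered pairs; the function name `is_indicator` and this input format are hypothetical and serve only to restate the definition in executable form.

```python
from itertools import combinations


def is_indicator(lam, vertices, t, x, rmax):
    """Check that every pair u, v != x with lam(u, v) <= rmax satisfies
    lam(u, v) == min(lam(u, t), lam(v, t))."""

    def conn(a, b):
        # A vertex is treated as infinitely connected to itself.
        return float("inf") if a == b else lam[frozenset((a, b))]

    others = [w for w in vertices if w != x]
    for u, v in combinations(others, 2):
        if conn(u, v) <= rmax and conn(u, v) != min(conn(u, t), conn(v, t)):
            return False
    return True


# Connectivities induced by a star-shaped Gomory-Hu tree centred at t,
# with edge weights 2 (to a) and 3 (to b); x is left out of the check.
lam = {
    frozenset(("t", "a")): 2,
    frozenset(("t", "b")): 3,
    frozenset(("a", "b")): 2,
}
vertices = ["x", "t", "a", "b"]
```

In this toy instance the centre t passes the check, whereas a does not: the pair (t, b) has connectivity 3, but routing the check through a would report only 2.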

3.5 Deterministic Algorithm


We describe how to modify the randomized algorithm in Theorem 13 to obtain a
deterministic algorithm with the same running time. Every step in the algorithm
is deterministic except the Gomory-Hu tree construction in Theorem 9. The
randomized Gomory-Hu tree construction is used in two places. First it is used in
finding an indicator vertex in Section 3.4, and for this purpose it is executed only
once. Here we can replace it by a slower deterministic partial Gomory-Hu tree
construction algorithm. It is well-known that a Gomory-Hu tree can be computed
using at most n − 1 max-flow computations [14]. By using the Ford-Fulkerson
flow algorithm, one can obtain an O(rmax² · n²)-time deterministic algorithm
to construct a partial Gomory-Hu tree Trmax. The randomized partial Gomory-Hu
construction is also used in every splitting-off attempt to check whether the
connectivity requirements are satisfied. With the indicator vertex t, this task
reduces to checking the local edge-connectivities from t to other vertices, and
there is a fast deterministic algorithm for this simpler task by Bhalgat et al. [5].

Theorem 14 ([5]). Given an undirected graph G and a vertex t, there is an
Õ(rmax · m)-time deterministic algorithm to compute min{λG(t, v), rmax} for all
vertices v ∈ G.
Thus we can replace the randomized partial Gomory-Hu tree algorithm by this
algorithm, and so Theorem 13 still holds deterministically. Hence there is a
deterministic Õ(rmax² · n²) time algorithm for the complete splitting-off problem.
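The bound for the deterministic partial Gomory-Hu tree construction can be checked with a short calculation. It rests on two assumptions that the text suggests but does not spell out: each Ford-Fulkerson computation is stopped after at most rmax augmenting paths (only connectivities up to rmax matter), and m = O(rmax · n) after the sparsification of Theorem 8.

```latex
\[
\underbrace{(n-1)}_{\text{max-flow computations [14]}}
\cdot
\underbrace{O(r_{\max} \cdot m)}_{\le\, r_{\max}\ \text{augmentations of cost } O(m)}
\;=\; O\bigl(n \cdot r_{\max} \cdot r_{\max}\, n\bigr)
\;=\; O\bigl(r_{\max}^{2} \cdot n^{2}\bigr).
\]
```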

4 Structural Property and Randomized Algorithm


Before we give the proof of Theorem 3, we first show how to use it in a randomized
edge splitting-off procedure to speed up the algorithm. By Theorem 3, when the
degree of x is much larger than 2rmax, even a random edge pair will be admissible
with probability at least 1 − 2rmax/(d(x) − 1). Using this observation, we show
how to reduce d(x) to O(rmax) in Õ(rmax³ · n) time. Then, by Theorem 13, the
remaining edges can be split-off in Õ(rmax² · d(x) · n) = Õ(rmax³ · n) time. So
the total running time of the complete splitting-off algorithm is improved to
Õ(m + rmax³ · n), proving Theorem 2.
The idea is to split-off many random edge pairs in parallel, before checking if
some connectivity requirement is violated. Suppose that 2^{l+q−1} < d(x) ≤ 2^{l+q}
and 2^{l−1} < rmax ≤ 2^l for some positive integers l and q. To reduce d(x) to
2^{l+q−1}, we need to split-off at most 2^{l+q−1} x-edges. Since each x-edge has
fewer than 2rmax non-admissible partners by Theorem 3, the probability that a
random edge pair is admissible is at least

    ((d(x) − 1) − 2rmax) / (d(x) − 1) ≥ (2^{l+q−1} − 2^{l+1}) / 2^{l+q−1} = (2^{q−2} − 1) / 2^{q−2}.

Now, consider a random splitting-off operation that splits off at most 2^{q−2}
edge pairs at random in parallel. The operation is successful if all the edge
pairs are admissible. The probability for the operation to succeed is at least
((2^{q−2} − 1) / 2^{q−2})^{2^{q−2}} = Ω(1).
After each operation, we run the checking algorithm to determine whether this
operation is successful or not. Consider an iteration that consists of c · log n
operations, for some constant c. The iteration is successful if it finds a set of
2^{q−2} admissible pairs, i.e. any of its operations succeeds. The probability for
an iteration to fail is hence at most 1/n^c for q ≥ 3. The time complexity of an
iteration is Õ(rmax² · n).
Since each iteration reduces the degree of x by 2^{q−2}, with at most 2^{l+1} =
O(rmax) successful iterations, we can then reduce d(x) to 2^{l+q−1}, i.e. reduce
d(x) by half. This procedure is applicable as long as q ≥ 3. Therefore, we can
reduce d(x) to 2^{l+2} by using this procedure O(log n) times. The total running
time is thus Õ(2^{l+1} · log n · rmax² · n) = Õ(rmax³ · n). Note that there are at
most Õ(rmax) iterations and the failure probability of each iteration is at most
1/n^c. By the union bound, the probability for the above randomized algorithm to
fail is at most 1/n^{c−1}. Therefore, with high probability, the algorithm succeeds
in Õ(rmax³ · n) time to reduce d(x) to O(rmax). Since the correctness of the
solution can be verified by a Gomory-Hu tree, this also gives a Las Vegas
algorithm with the same expected runtime.
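The probability estimates of this section can be checked numerically. The sketch below evaluates the per-operation lower bound from the text and computes how many operations make an iteration fail with probability at most 1/n^c. The function names are hypothetical, and the second function assumes, as the analysis above implicitly does, that the operations within one iteration succeed independently with at least the stated probability.

```python
import math


def operation_success_lower_bound(q):
    """Lower bound ((2^(q-2) - 1) / 2^(q-2))^(2^(q-2)) on the probability that
    all 2^(q-2) random pairs of one operation are admissible (needs q >= 3)."""
    if q < 3:
        raise ValueError("the bound is only meaningful for q >= 3")
    pairs = 2 ** (q - 2)
    return ((pairs - 1) / pairs) ** pairs


def operations_per_iteration(n, c, q=3):
    """Smallest k with (1 - p)^k <= n^(-c), where p is the bound above."""
    p = operation_success_lower_bound(q)
    return math.ceil(c * math.log(n) / -math.log(1.0 - p))


# q = 3 is the worst case among q >= 3 and gives the bound (1/2)^2 = 1/4.
k = operations_per_iteration(1024, 2)
```

Because (1 − 1/N)^N increases with N, the bound never drops below 1/4 for q ≥ 3, so a constant multiple of log n operations per iteration suffices, in line with the c · log n operations used above.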

4.1 Proof of Theorem 3


In this subsection we will prove that each edge has at most 2rmax − 2 non-
admissible partners. Given an edge pair (xv, xw), if it is a non-admissible pair,
then there is a dangerous set D with {xv, xw} ⊆ δ(D) by Proposition 5, and we
say such a dangerous set D covers xv and xw. Let P be the set of non-admissible
partners of xv in the initial graph. Our goal is to show that |P | ≤ 2rmax − 2.
Proposition 15 ([2] Lemma 5.4). Suppose there is no cut edge incident to
x. For any disjoint vertex sets S1 , S2 with d(S1 , S2 ) = 0 and d(x, S1 ) ≥ 1 and
d(x, S2 ) ≥ 1, then S1 ∪ S2 is not a dangerous set.
We first present an outline of the proof. Let DP be a minimal set of maximal
dangerous sets such that (i) each set D ∈ DP covers the edge xv and (ii) each
edge in P is covered by some set D ∈ DP . First, we consider the base case
with |DP | ≤ 2. The theorem follows immediately if |DP | = 1, so assume DP =
{D1 , D2 }. By Proposition 15, d(D1 − D2 , D1 ∩ D2 ) ≥ 1 as DP is minimal. Hence
d(D, V − x − D) ≥ 1 for each D ∈ DP . Since d(D) ≤ rmax + 1 and D covers
xv for each D ∈ DP, each set in DP can cover at most rmax − 1 non-admissible
partners of xv, proving |P| ≤ 2rmax − 2.
The next step is to show that |DP| ≤ rmax − 1 when |DP| ≥ 3, where the
proofs of this step use very similar ideas as in [2, 24]. When |DP| ≥ 3, we show
in Lemma 16 that inequality (4a) must hold for each pair of dangerous sets in
DP. Since each dangerous set is connected by Proposition 15, this allows us to
conclude in Lemma 17 that |DP| ≤ rmax − 1. This implies that |P| < rmax².
To improve this bound, we use a new inductive argument to show that |P| ≤
rmax − 1 + |DP| ≤ 2rmax − 2. First we prove in Lemma 18 that there is an
admissible pair (xa, xb) in P (so by definition a, b ≠ v). After splitting-off
(xa, xb), let P′ = P − {xa, xb} with |P′| = |P| − 2. In the resulting graph, we
prove in Lemma 19 that |DP′| ≤ |DP| − 2. Hence, by repeating this reduction, we
can show that after splitting-off ⌊|DP|/2⌋ pairs of edges in P, the remaining
edges in P are covered by one dangerous set. Therefore, we can conclude that
|P| ≤ rmax − 1 + |DP| ≤ 2rmax − 2. In the following we will first prove the upper
bound on |DP|, then we will provide the details of the inductive argument.

An Upper Bound on |DP|: By contracting non-trivial tight sets, each edge in P
is still a non-admissible partner of xv by Lemma 6. Henceforth, we will assume that
all tight sets in G are singletons. Also we assume there is no cut edge incident to x
and rmax ≥ 2, as required by Theorem 3. Recall that DP is a minimal
set of maximal dangerous sets such that (i) each set D ∈ DP covers the edge xv
and (ii) each edge in P is covered by some set D ∈ DP. We use the following result.

Lemma 16 ([2] Lemma 5.4, [24] Lemma 2.6). If |DP | ≥ 3, then inequal-
ity (4a) holds for every X, Y ∈ DP . Furthermore, X ∩ Y = {v} and is a tight
set for any X, Y ∈ DP .
Lemma 17. |DP | ≤ rmax − 1 when |DP | ≥ 3.

Proof. By Lemma 16, we have X ∩ Y = {v} for any X, Y ∈ DP . For each set
X ∈ DP , we have d(x, v) ≥ 1 and d(x, X − v) ≥ 1 by the minimality of DP .
Therefore, we must have d(v, X − v) ≥ 1 by Proposition 15. By Lemma 16, X − v
and Y − v are disjoint for each pair X, Y ∈ DP . Since d(v, X − v) ≥ 1 for each
X ∈ DP and d(x, v) ≥ 1, it follows that |DP | ≤ d(v) − 1. By Lemma 16, {v} is
a tight set, and thus |DP | ≤ d(v) − 1 ≤ rmax − 1. 

An Inductive Argument: The goal is to prove that |P | ≤ rmax − 1 + |DP |.
By Lemma 17, this holds if d(x, X − v) = 1 for every dangerous set X ∈ DP .
Hence we assume that there is a dangerous set A ∈ DP with d(x, A − v) ≥ 2;
this property will only be used at the very end of the proof. By Lemma 16,
inequality (4a) holds for A and B for every B ∈ DP . By the minimality of DP ,
there exists an x-neighbor a ∈ A which is not contained in any other set in DP.
Similarly, there exists b ∈ B which is not contained in any other set in DP . The
following lemma shows that the edge pair (xa, xb) is admissible.
Lemma 18. For any A, B ∈ DP satisfying inequality (4a), an edge pair (xa, xb)
is admissible if a ∈ A − B and b ∈ B − A.
Proof. Suppose, by way of contradiction, that (xa, xb) is non-admissible. Then,
by Proposition 5, there exists a maximal dangerous set C containing a and b. We
claim that v ∈ C; otherwise there exists a 3-dangerous-set structure,
contradicting Lemma 7. Then d(x, A ∩ C) ≥ d(x, {v, a}) ≥ 2, and so inequality (4b)
cannot hold for A and C, since 1 + 1 ≥ s(A) + s(C) ≥ s(A − C) + s(C − A) +
2d̄(A, C) ≥ 0 + 0 + 2 · 2. Therefore, inequality (4a) must hold for A and C. Since
A and C are maximal dangerous sets, A ∪ C cannot be a dangerous set, and thus
1 + 1 ≥ s(A) + s(C) ≥ s(A ∪ C) + s(A ∩ C) + 2d(A, C) ≥ 2 + s(A ∩ C) + 0, which
implies that A ∩ C is a tight set, but this contradicts the assumption that each
tight set is a singleton as {v, a} ⊆ A ∩ C.

After splitting-off (xa, xb), let the resulting graph be G′ and P′ = P − {xa, xb}.
Clearly, since each edge in P′ is a non-admissible partner of xv in G, every edge
in P′ is still a non-admissible partner of xv in G′. Furthermore, by contracting
non-trivial tight sets in G′, each edge in P′ is still a non-admissible partner of
xv by Lemma 6. Hence we assume all tight sets in G′ are singletons. Let DP′ be a
minimal set of maximal dangerous sets such that (i) each set D ∈ DP′ covers the
edge xv and (ii) each edge in P′ is covered by some set D ∈ DP′. The following
lemma shows that there exists DP′ with |DP′| ≤ |DP| − 2.
Lemma 19. When |DP| ≥ 3, the edges in P′ can be covered by a set DP′ of
maximal dangerous sets in G′ such that (i) each set in DP′ covers xv, and (ii)
each edge in P′ is covered by some set in DP′, and (iii) |DP′| ≤ |DP| − 2.
Proof. We will use the dangerous sets in DP to construct DP′. Since each pair of
sets in DP satisfies inequality (4a), we have s(A ∪ D) = 2 before splitting-off
(xa, xb) for each D ∈ DP. Also, before splitting-off (xa, xb), for A, B, C ∈ DP,
inequality (4b) cannot hold for A ∪ B and C because 2 + 1 = s(A ∪ B) + s(C) ≥
s((A ∪ B) − C) + s(C − (A ∪ B)) + 2d̄(A ∪ B, C) ≥ 2 + 0 + 2 · 1, where the last
inequality follows since v ∈ A ∩ B ∩ C and (A ∪ B) − C is not dangerous (as it
covers the admissible edge pair (xa, xb)). Therefore inequality (4a) must hold for
A ∪ B and C, which implies that s(A ∪ B ∪ C) ≤ 3 since 2 + 1 = s(A ∪ B) + s(C) ≥
s((A ∪ B) ∪ C) + s((A ∪ B) ∩ C).
For A and B as defined before Lemma 18, since s(A ∪ B) = 2 before splitting-off
(xa, xb), A ∪ B becomes a tight set after splitting-off (xa, xb). For any other
set C ∈ DP − A − B, since s(A ∪ B ∪ C) ≤ 3 before splitting-off (xa, xb),
A ∪ B ∪ C becomes a dangerous set after splitting-off (xa, xb). Hence, after
splitting-off (xa, xb) and contracting the tight set A ∪ B into v, each set in
DP − A − B becomes a dangerous set. Then DP′ = DP − A − B is a set of dangerous
sets covering each edge in P′, satisfying properties (i)-(iii). By replacing a
dangerous set C ∈ DP′ by a maximal dangerous set C′ ⊇ C and removing redundant
dangerous sets in DP′ so that it minimally covers P′, we have found DP′ as
required by the lemma.

Recall that we chose A with d(x, A − v) ≥ 2, and hence d(x, v) ≥ 2 after the
splitting-off and contraction of tight sets. Therefore, inequality (4a) holds for
every two maximal dangerous sets in DP′. By induction, when |DP| ≥ 3, we
have |P| = |P′| + 2 ≤ rmax − 1 + |DP′| + 2 ≤ rmax − 1 + |DP|. In the base case
when |DP| = 2 and A, B ∈ DP satisfy (4a), the same argument in Lemma 19 can
be used to show that the edges in P′ are covered by one tight set after
splitting-off (xa, xb), and thus |P| = |P′| + 2 ≤ rmax − 1 + 2 ≤ rmax − 1 + |DP|.
This completes the proof that |P| ≤ rmax − 1 + |DP|, proving the theorem.

5 Concluding Remarks
Theorem 3 can be applied to constrained edge splitting-off problems, and gives
additive approximation algorithms for constrained augmentation problems. The
efficient algorithms can also be adapted to these problems. We refer the reader
to [25] for these results.

References
1. Bang-Jensen, J., Frank, A., Jackson, B.: Preserving and increasing local edge-
connectivity in mixed graphs. SIAM J. Disc. Math. 8(2), 155–178 (1995)
2. Bang-Jensen, J., Jordán, T.: Edge-connectivity augmentation preserving simplicity.
SIAM Journal on Discrete Mathematics 11(4), 603–623 (1998)
3. Bernáth, A., Király, T.: A new approach to splitting-off. In: Lodi, A., Panconesi, A.,
Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 401–415. Springer, Heidelberg
(2008)
4. Benczúr, A.A., Karger, D.R.: Augmenting undirected edge connectivity in Õ(n²)
time. Journal of Algorithms 37(1), 2–36 (2000)
5. Bhalgat, A., Hariharan, R., Kavitha, T., Panigrahi, D.: An Õ(mn) Gomory-Hu
tree construction algorithm for unweighted graphs. In: STOC 2007, pp. 605–614
(2007)
6. Bhalgat, A., Hariharan, R., Kavitha, T., Panigrahi, D.: Fast edge splitting and
Edmonds’ arborescence construction for unweighted graphs. In: SODA ’08, pp.
455–464 (2008)
7. Chan, Y.H., Fung, W.S., Lau, L.C., Yung, C.K.: Degree Bounded Network Design
with Metric Costs. In: FOCS ’08, pp. 125–134 (2008)
8. Cheng, E., Jordán, T.: Successive edge-connectivity augmentation problems. Math-
ematical Programming 84(3), 577–593 (1999)
9. Frank, A.: Augmenting graphs to meet edge-connectivity requirements. SIAM
Journal on Discrete Mathematics 5(1), 25–53 (1992)
10. Frank, A.: On a theorem of Mader. Ann. of Disc. Math. 101, 49–57 (1992)
11. Frank, A., Király, Z.: Graph orientations with edge-connection and parity con-
straints. Combinatorica 22(1), 47–70 (2002)
12. Gabow, H.N.: Efficient splitting off algorithms for graphs. In: STOC ’94, pp. 696–
705 (1994)
13. Goemans, M.X., Bertsimas, D.J.: Survivable networks, linear programming relax-
ations and the parsimonious property. Math. Prog. 60(1), 145–166 (1993)
14. Gomory, R.E., Hu, T.C.: Multi-terminal network flows. Journal of the Society for
Industrial and Applied Mathematics 9(4), 551–570 (1961)
15. Hariharan, R., Kavitha, T., Panigrahi, D.: Efficient algorithms for computing all
low st edge connectivities and related problems. In: SODA ’07, pp. 127–136 (2007)
16. Jordán, T.: On minimally k-edge-connected graphs and shortest k-edge-connected
Steiner networks. Discrete Applied Mathematics 131(2), 421–432 (2003)
17. Lau, L.C.: An approximate max-Steiner-tree-packing min-Steiner-cut theorem.
Combinatorica 27(1), 71–90 (2007)
18. Lovász, L.: Lecture. Conference of Graph Theory, Prague (1974); See also Combi-
natorial problems and exercises. North-Holland (1979)
19. Mader, W.: A reduction method for edge-connectivity in graphs. Annals of Discrete
Mathematics 3, 145–164 (1978)
20. Nagamochi, H.: A fast edge-splitting algorithm in edge-weighted graphs. IEICE
Transactions on Fundamentals of Electronics, Communications and Computer Sci-
ences, 1263–1268 (2006)
21. Nagamochi, H., Ibaraki, T.: Linear time algorithm for finding a sparse k-connected
spanning subgraph of a k-connected graph. Algorithmica 7(1), 583–596 (1992)
22. Nagamochi, H., Ibaraki, T.: Deterministic O(nm) time edge-splitting in undirected
graphs. Journal of Combinatorial Optimization 1(1), 5–46 (1997)
23. Nash-Williams, C.S.J.A.: On orientations, connectivity and odd vertex pairings in
finite graphs. Canadian Journal of Mathematics 12, 555–567 (1960)
24. Szigeti, Z.: Edge-splittings preserving local edge-connectivity of graphs. Discrete
Applied Mathematics 156(7), 1011–1018 (2008)
25. Yung, C.K.: Edge splitting-off and network design problems. Master's thesis, The
Chinese University of Hong Kong (2009)
On Generalizations of Network Design Problems with
Degree Bounds

Nikhil Bansal¹, Rohit Khandekar¹, Jochen Könemann²,
Viswanath Nagarajan¹, and Britta Peis³

¹ IBM T.J. Watson Research Center, Yorktown Heights, NY 10598, USA
² University of Waterloo
³ Technische Universität Berlin

Abstract. Iterative rounding and relaxation have arguably become the method of
choice in dealing with unconstrained and constrained network design problems.
In this paper we extend the scope of the iterative relaxation method in two direc-
tions: (1) by handling more complex degree constraints in the minimum spanning
tree problem (namely laminar crossing spanning tree), and (2) by incorporating
‘degree bounds’ in other combinatorial optimization problems such as matroid
intersection and lattice polyhedra. We give new or improved approximation al-
gorithms, hardness results, and integrality gaps for these problems.

1 Introduction
Iterative rounding and relaxation have arguably become the method of choice in dealing
with unconstrained and constrained network design problems. Starting with Jain’s ele-
gant iterative rounding scheme for the generalized Steiner network problem in [14], an
extension of this technique (iterative relaxation) has more recently led to breakthrough
results in the area of constrained network design, where a number of linear constraints
are added to a classical network design problem. Such constraints arise naturally in
a wide variety of practical applications, and model limitations in processing power,
bandwidth or budget. The design of powerful techniques to deal with these problems is
therefore an important goal.
The most widely studied constrained network design problem is the minimum-cost
degree-bounded spanning tree problem. In an instance of this problem, we are given an
undirected graph, non-negative costs for the edges, and positive, integral degree-bounds
for each of the nodes. The problem is easily seen to be NP-hard, even in the absence
of edge-costs, since finding a spanning tree with maximum degree two is equivalent to
finding a Hamiltonian Path. A variety of techniques have been applied to this problem
[5,6,11,17,18,23,24], culminating in Singh and Lau’s breakthrough result in [27]. They
presented an algorithm that computes a spanning tree of at most optimum cost whose
degree at each vertex v exceeds its bound by at most 1, using the iterative relaxation
framework developed in [20,27].
The iterative relaxation technique has been applied to several constrained network
design problems: spanning tree [27], survivable network design [20,21], directed graphs
with intersecting and crossing super-modular connectivity [20,2]. It has also been ap-
plied to degree bounded versions of matroids and submodular flow [15].

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 110–123, 2010.
© Springer-Verlag Berlin Heidelberg 2010

In this paper we further extend the applicability of iterative relaxation, and obtain
new or improved bicriteria approximation results for minimum crossing spanning tree
(MCST), crossing matroid intersection, and crossing lattice polyhedra. We also provide
hardness results and integrality gaps for these problems.
Notation. As is usual, when dealing with an undirected graph G = (V, E), for any
S ⊆ V we let δ_G(S) := {(u, v) ∈ E | u ∈ S, v ∉ S}. When the graph is clear from
context, the subscript is dropped. A collection {U1 , · · · , Ut } of vertex-sets is called
laminar if for every pair Ui , Uj in this collection, we have Ui ⊆ Uj , Uj ⊆ Ui , or
Ui ∩ Uj = ∅. A (ρ, f (b)) approximation for minimum cost degree bounded problems
refers to a solution that (1) has cost at most ρ times the optimum that satisfies the degree
bounds, and (2) satisfies the relaxed degree constraints in which a bound b is replaced
with a bound f (b).
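The two notions just defined, the cut δ(S) and laminarity, can be made concrete with a minimal sketch. The function names `cut_edges` and `is_laminar` and the small example graph are illustrative assumptions, not part of the paper.

```python
from itertools import combinations

def cut_edges(edges, S):
    """Return delta(S): the edges with exactly one endpoint in S."""
    S = set(S)
    return [(u, v) for (u, v) in edges if (u in S) != (v in S)]

def is_laminar(family):
    """True if every pair of sets in the family is nested or disjoint."""
    sets = [frozenset(U) for U in family]
    for A, B in combinations(sets, 2):
        if not (A <= B or B <= A or A.isdisjoint(B)):
            return False
    return True

# Hypothetical example: a 4-cycle 1-2-3-4 with the chord (1, 3).
edges = [(1, 2), (2, 3), (3, 4), (4, 1), (1, 3)]
print(cut_edges(edges, {1, 2}))
print(is_laminar([{1, 2}, {1, 2, 3}, {4}]))  # nested or disjoint pairs only
print(is_laminar([{1, 2}, {2, 3}]))          # these two sets cross
```

Treating an edge as crossing S when exactly one endpoint lies in S matches the undirected definition of δ(S) used throughout.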

1.1 Our Results, Techniques and Paper Outline


Laminar MCST. Our main result is for a natural generalization of bounded-degree MST
(called Laminar Minimum Crossing Spanning Tree or laminar MCST), where we are
given an edge-weighted undirected graph with a laminar family L = {S_i}_{i=1}^m of vertex-sets
having bounds {b_i}_{i=1}^m; and the goal is to compute a spanning tree of minimum cost
that contains at most b_i edges from δ(S_i) for each i ∈ [m].
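As a small illustration of the feasibility condition in this definition, the following sketch checks whether a given edge set is a spanning tree that respects the crossing bounds. The helper names and the 4-vertex instance are hypothetical; costs are ignored here.

```python
def is_spanning_tree(vertices, tree_edges):
    """Check |V| - 1 edges and acyclicity with a simple union-find."""
    vertices = list(vertices)
    if len(tree_edges) != len(vertices) - 1:
        return False
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for u, v in tree_edges:
        ru, rv = find(u), find(v)
        if ru == rv:          # adding (u, v) would close a cycle
            return False
        parent[ru] = rv
    return True

def crossing_count(tree_edges, S):
    """Number of tree edges in delta(S)."""
    S = set(S)
    return sum(1 for u, v in tree_edges if (u in S) != (v in S))

def is_feasible(vertices, tree_edges, family, bounds):
    """Spanning tree with at most b_i edges crossing each S_i."""
    return is_spanning_tree(vertices, tree_edges) and all(
        crossing_count(tree_edges, S) <= b for S, b in zip(family, bounds))

# Hypothetical instance: laminar family {1,2} inside {1,2,3}, both with bound 1.
V = {1, 2, 3, 4}
family, bounds = [{1, 2}, {1, 2, 3}], [1, 1]
print(is_feasible(V, [(1, 2), (2, 3), (3, 4)], family, bounds))  # path
print(is_feasible(V, [(1, 2), (1, 3), (1, 4)], family, bounds))  # star at 1
```

The path crosses each set once and is feasible; the star sends two edges out of {1, 2} and is not.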
The motivation behind this problem is in designing a network where there is a hi-
erarchy (i.e. laminar family) of service providers that control nodes (i.e. vertices). The
number of edges crossing the boundary of any service provider (i.e. its vertex-cut) rep-
resents some cost to this provider, and is therefore limited. The laminar MCST problem
precisely models the question of connecting all nodes in the network while satisfying
bounds imposed by all the service providers.
From a theoretical viewpoint, cut systems induced by laminar families are well stud-
ied, and are known to display rich structure. For example, one-way cut-incidence ma-
trices are matrices whose rows are incidence vectors of directed cuts induced by the
vertex-sets of a laminar family; it is well known (e.g., see [19]) that such matrices are
totally unimodular. Using the laminar structure of degree-constraints and the iterative
relaxation framework, we obtain the following main result, and present its proof in
Section 2.
Theorem 1. There is a polynomial time (1, b + O(log n)) bicriteria approximation al-
gorithm for laminar MCST. That is, the cost is no more than the optimum cost and the
degree violation is at most additive O(log n). This guarantee is relative to the natural
LP relaxation.
This guarantee is substantially stronger than what follows from known results for the
general minimum crossing spanning tree (MCST) problem: where the degree bounds
could be on arbitrary edge-subsets E1 , . . . , Em . In particular, for general MCST a
(1, b + Δ − 1) approximation [2,15] is known where Δ is the maximum number of degree-bounds an
edge appears in. However, this guarantee is not useful for laminar MCST as Δ can be as
large as Ω(n) in this case. If a multiplicative factor in the degree violation is allowed,
Chekuri et al. [8] recently gave a very elegant (1, (1 + ε)b + O((1/ε) log m)) guarantee
(which subsumes the previous best (O(log n), O(log m) b) [4] result). However, these
112 N. Bansal et al.

results also cannot be used to obtain a small additive violation, especially if b is large.
In particular, both the results [4,8] for general MCST are based on the natural LP relaxation,
for which there is an integrality gap of b + Ω(√n) even without regard to costs
and when m = O(n) [26] (see also [3]). On the other hand, Theorem 1 shows that a
purely additive O(log n) guarantee on degree (relative to the LP relaxation and even in
presence of costs) is indeed achievable for MCST, when the degree-bounds arise from
a laminar cut-family.
The algorithm in Theorem 1 is based on iterative relaxation and uses two main new
ideas. Firstly, we drop a carefully chosen constant fraction of degree-constraints in each
iteration. This is crucial as it can be shown that dropping one constraint at a time as in
the usual applications of iterative relaxation can indeed lead to a degree violation of
Ω(Δ). Secondly, the algorithm does not just drop degree constraints, but in some itera-
tions it also generates new degree constraints, by merging existing degree constraints.
All previous applications of iterative relaxation to constrained network design treat
connectivity and degree constraints rather asymmetrically. While the structure of the
connectivity constraints of the underlying LP is used crucially (e.g., in the ubiquitous
uncrossing argument), the handling of degree constraints is remarkably simple. Con-
straints are dropped one by one, and the final performance of the algorithm is good only
if the number of side constraints is small (e.g., in recent work by Grandoni et al. [12]),
or if their structure is simple (e.g., if the ‘frequency’ of each element is small). In con-
trast, our algorithm for laminar MCST exploits the structure of degree constraints in a
non-trivial manner.
Hardness Results. We obtain the following hardness of approximation for the general
MCST problem (and its matroid counterpart). In particular this rules out any algorithm
for MCST that has additive constant degree violation, even without regard to costs.
Theorem 2. Unless NP has quasi-polynomial time algorithms, the MCST problem
admits no polynomial time O(log^α m) additive approximation for the degree bounds
for some constant α > 0; this holds even when there are no costs.
The proof for this theorem is given in Section 3, and uses a two-step reduction from
the well-known Label Cover problem. First, we show hardness for a uniform matroid
instance. In a second step, we then demonstrate how this implies the result for MCST
claimed in Theorem 2.
Note that our hardness bound nearly matches the result obtained by Chekuri et al.
in [8]. We note however that in terms of purely additive degree guarantees, a large gap
remains. As noted above, there is a much stronger lower bound of b + Ω(√n) for LP-
based algorithms [26] (even without regard to costs), which is based on discrepancy. In
light of the small number of known hardness results for discrepancy type problems, it
is unclear how our bounds for MCST could be strengthened.
Degree Bounds in More General Settings. We consider crossing versions of other clas-
sic combinatorial optimization problems, namely matroid intersection and lattice poly-
hedra. We discuss our results briefly and defer the proofs to the full version of the
paper [3].
Definition 1 (Minimum crossing matroid intersection problem). Let r1, r2 : 2^E → Z
be two supermodular functions, c : E → R and {E_i}_{i∈I} be a collection of subsets of
E with corresponding bounds {b_i}_{i∈I}. Then the goal is to minimize:

{c^T x | x(S) ≥ max{r1(S), r2(S)}, ∀ S ⊆ E;
         x(E_i) ≤ b_i, ∀ i ∈ [m]; x ∈ {0, 1}^E}.

We remark that there are alternate definitions of matroid intersection (e.g., see Schri-
jver [25]) and that our result below extends to those as well.
Let Δ = maxe∈E |{i ∈ [m] | e ∈ Ei }| be the largest number of sets Ei that any
element of E belongs to, and refer to it as frequency.
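The frequency Δ is a simple counting quantity; a minimal sketch of its computation follows. The function name `frequency` and the example sets are illustrative assumptions.

```python
from collections import Counter

def frequency(subsets):
    """Delta: the largest number of sets E_i that any single element belongs to."""
    counts = Counter(e for E_i in subsets for e in set(E_i))
    return max(counts.values(), default=0)

# Hypothetical ground-set elements a, b, c, d with four degree-bound sets.
subsets = [{"a", "b"}, {"b", "c"}, {"b", "d"}, {"d"}]
print(frequency(subsets))  # element "b" lies in three of the sets
```

When the sets E_i are pairwise disjoint, as for vertex out-degree bounds in a digraph, this returns 1.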
Theorem 3. Any optimal basic solution x∗ of the linear relaxation of the minimum
crossing matroid intersection problem can be rounded into an integral solution x̂ such
that x̂(S) ≥ max{r1 (S), r2 (S)} for all S ⊆ E and

c^T x̂ ≤ 2c^T x∗ and x̂(E_i) ≤ 2b_i + Δ − 1 ∀i ∈ I.

The algorithm for this theorem again uses iterative relaxation, and its proof is based on
a ‘fractional token’ counting argument similar to the one used in [2].
An interesting special case is for the bounded-degree arborescence problem (where
Δ = 1). As the set of arborescences in a digraph can be expressed as the intersection
of partition and graphic matroids, Theorem 3 readily implies a (2, 2b) approximation
for this problem. This is an improvement over the previously best-known (2, 2b + 2)
bound [20] for this problem.
The bounded-degree arborescence problem is potentially of wider interest since it is
a relaxation of ATSP, and it is hoped that ideas from this problem lead to new ideas
for ATSP. In fact Theorem 3 also implies an improved (2, 2b)-approximation for the
bounded-degree arborescence packing problem, where the goal is to pack a given num-
ber of arc-disjoint arborescences while satisfying degree-bounds on vertices (arbores-
cence packing can again be phrased as matroid intersection). The previously best known
bound for this problem was (2, 2b + 4) [2]. We also give the following integrality gap.
Theorem 4. For any ε > 0, there exists an instance of unweighted minimum crossing
arborescence for which the LP is feasible, and any integral solution must violate the
bound on some set in {E_i}_{i=1}^m by a multiplicative factor of at least 2 − ε. Moreover, this
instance has Δ = 1, and just one non-degree constraint.
Thus Theorem 3 is the best one can hope for, relative to the LP relaxation. First,
Theorem 4 implies that the multiplicative factor in the degree cannot be improved be-
yond 2 (even without regard to costs). Second, the lower bound for arborescences with
costs presented in [2] implies that no cost-approximation ratio better than 2 is possible,
without violating degrees by a factor greater than 2.
Crossing Lattice Polyhedra. Classical lattice polyhedra form a unified framework for
various discrete optimization problems and go back to Hoffman and Schwartz [13] who
proved their integrality. They are polyhedra of type

{x ∈ [0, 1]^E | x(ρ(S)) ≥ r(S), ∀S ∈ F }


where F is a consecutive submodular lattice, ρ : F → 2^E is a mapping from F to
subsets of the ground-set E, and r ∈ R^F is supermodular. A key property of lattice
polyhedra is that the uncrossing technique can be applied which turns out to be cru-
cial in almost all iterative relaxation approaches for optimization problems with degree
bounds. We refer the reader to [25] for a more comprehensive treatment of this subject.
We generalize our work further to crossing lattice polyhedra which arise from clas-
sical lattice polyhedra by adding “degree-constraints” of the form ai ≤ x(Ei ) ≤ bi
for a given collection {E_i ⊆ E | i ∈ I} and lower and upper bounds a, b ∈ R^I. We
mention that this model covers several important applications including the crossing
matroid basis and crossing planar mincut problems, among others.
We can show that the standard LP relaxation for the general crossing lattice polyhe-
dron problem is weak; details are deferred to the full version of the paper in [3]. For
this reason, we henceforth focus on a restricted class of crossing lattice polyhedra in
which the underlying lattice (F , ≤) satisfies the following monotonicity property

(∗) S < T =⇒ |ρ(S)| < |ρ(T )| ∀ S, T ∈ F.


We obtain the following theorem whose proof is given in [3].
Theorem 5. For any instance of the crossing lattice polyhedron problem in which F
satisfies property (∗), there exists an algorithm that computes an integral solution of
cost at most the optimal, where all rank constraints are satisfied, and each degree bound
is violated by at most an additive 2Δ − 1.
We note that the above property (∗) is satisfied for matroids, and hence Theorem 5
matches the previously best-known bound [15] for degree bounded matroids (with both
upper/lower bounds). Also note that property (∗) holds whenever F is ordered by inclu-
sion. In this special case, we can improve the result to an additive Δ − 1 approximation
if only upper bounds are given.

1.2 Related Work


As mentioned earlier, the basic bounded-degree MST problem has been extensively stud-
ied [5,6,11,17,18,23,24,27]. The iterative relaxation technique for degree-constrained
problems was developed in [20,27].
MCST was first introduced by Bilo et al. [4], who presented a randomized-rounding
algorithm that computes a tree of cost O(log n) times the optimum where each degree
constraint is violated by a multiplicative O(log n) factor and an additive O(log m) term.
Subsequently, Bansal et al. [2] gave an algorithm that attains an optimal cost guarantee
and an additive Δ − 1 guarantee on degree; recall that Δ is the maximum number of de-
gree constraints that an edge lies in. This algorithm used iterative relaxation as its main

tool. Recently, Chekuri et al. [8] obtained an improved (1, (1 + ε)b + O((1/ε) log m)) approximation
algorithm for MCST, for any ε > 0; this algorithm is based on pipage
rounding.
The minimum crossing matroid basis problem was introduced in [15], where the au-
thors used iterative relaxation to obtain (1) (1, b + Δ − 1)-approximation when there
are only upper bounds on degree, and (2) (1, b + 2Δ − 1)-approximation in the presence
of both upper and lower degree-bounds. The [8] result also holds in this matroid
setting. [15] also considered a degree-bounded version of the submodular flow problem
and gave a (1, b + 1) approximation guarantee.
The bounded-degree arborescence problem was considered in Lau et al. [20], where
a (2, 2b + 2) approximation guarantee was obtained. Subsequently Bansal et al. [2]
designed an algorithm that for any 0 < ε ≤ 1/2, achieves a (1/ε, b_v/(1 − ε) + 4)
approximation guarantee. They also showed that this guarantee is the best one can hope
for via the natural LP relaxation (for every 0 < ε ≤ 1/2). In the absence of edge-costs,
[2] gave an algorithm that violates degree bounds by at most an additive two. Recently
Nutov [22] studied the arborescence problem under weighted degree constraints, and
gave a (2, 5b) approximation for it.
Lattice polyhedra were first investigated by Hoffman and Schwartz [13] and the nat-
ural LP relaxation was shown to be totally dual integral. Even though greedy-type algo-
rithms are known for all examples mentioned earlier, so far no combinatorial algorithm
has been found for lattice polyhedra in general. Two-phase greedy algorithms have been
established only in cases where an underlying rank function satisfies a monotonicity
property [10], [9].

2 Crossing Spanning Tree with Laminar Degree Bounds


In this section we prove Theorem 1 by presenting an iterative relaxation-based algo-
rithm with the stated performance guarantee. During its execution, the algorithm selects
and deletes edges, and it modifies the given laminar family of degree bounds. A generic
iteration starts with a subset F of edges already picked in the solution, a subset E of
undecided edges, i.e., the edges not yet picked or dropped from the solution, a laminar
family L on V , and residual degree bounds b(S) for each S ∈ L.
The laminar family L has a natural forest-like structure with nodes corresponding
to each element of L. A node S ∈ L is called the parent of node C ∈ L if S is the
inclusion-wise minimal set in L \ {C} that contains C; and C is called a child of S.
Node D ∈ L is called a grandchild of node S ∈ L if S is the parent of D’s parent.
Nodes S, T ∈ L are siblings if they have the same parent node. A node that has no
parent is called root. The level of any node S ∈ L is the length of the path in this forest
from S to the root of its tree. We also maintain a linear ordering of the children of
each L-node. A subset B ⊆ L is called consecutive if all nodes in B are siblings (with
parent S) and they appear consecutively in the ordering of S’s children. In any iteration
(F, E, L, b), the algorithm solves the following LP relaxation of the residual problem.

min  Σ_{e∈E} c_e x_e                                        (1)
s.t. x(E(V)) = |V| − |F| − 1
     x(E(U)) ≤ |U| − |F(U)| − 1      ∀U ⊂ V
     x(δ_E(S)) ≤ b(S)                ∀S ∈ L
     x_e ≥ 0                         ∀e ∈ E

For any vertex-subset W ⊆ V and edge-set H, we let H(W) := {(u, v) ∈ H | u, v ∈ W}
denote the edges induced on W; and δ_H(W) := {(u, v) ∈ H | u ∈ W, v ∉ W}
the set of edges crossing W . The first two sets of constraints are spanning tree con-
straints while the third set corresponds to the degree bounds. Let x denote an optimal
extreme point solution to this LP. By reducing degree bounds b(S), if needed, we as-
sume that x satisfies all degree bounds at equality (the degree bounds may therefore be
fractional-valued). Let α := 24.
Definition 2. An edge e ∈ E is said to be local for S ∈ L if e has at least one end-point
in S but is neither in E(C) nor in δ(C) ∩ δ(S) for any grandchild C of S. Let local(S)
denote the set of local edges for S. A node S ∈ L is said to be good if |local(S)| ≤ α.
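The forest structure on L (parent, level, grandchild) and the notion of a local edge in Definition 2 can be sketched as follows. This is only an illustration under assumed representations: sets are `frozenset`s, edges are vertex pairs, and all function names and the example family are hypothetical.

```python
ALPHA = 24  # the constant alpha fixed in the text

def build_forest(family):
    """Map each set of a laminar family to its parent (None for a root)."""
    sets = sorted({frozenset(S) for S in family}, key=len)
    parent = {}
    for i, C in enumerate(sets):
        # The first strictly larger superset in size order is the minimal one.
        parent[C] = next((S for S in sets[i + 1:] if C < S), None)
    return parent

def level(S, parent):
    """Length of the path from S up to the root of its tree."""
    depth = 0
    while parent[S] is not None:
        S, depth = parent[S], depth + 1
    return depth

def grandchildren(S, parent):
    return [C for C, P in parent.items() if P is not None and parent[P] == S]

def local_edges(S, edges, parent):
    """Edges with an endpoint in S that are neither inside a grandchild C
    nor in both delta(C) and delta(S) for any grandchild C of S."""
    S = frozenset(S)
    grand = grandchildren(S, parent)

    def inside(e, W):
        return e[0] in W and e[1] in W

    def crosses(e, W):
        return (e[0] in W) != (e[1] in W)

    local = []
    for e in edges:
        if e[0] not in S and e[1] not in S:
            continue
        if any(inside(e, C) or (crosses(e, C) and crosses(e, S)) for C in grand):
            continue
        local.append(e)
    return local

def is_good(S, edges, parent):
    return len(local_edges(S, edges, parent)) <= ALPHA

# Hypothetical family: S = {1..6}, children {1,2,3} and {4,5}, grandchildren {1,2} and {4}.
S = frozenset(range(1, 7))
parent = build_forest([S, {1, 2, 3}, {4, 5}, {1, 2}, {4}])
edges = [(1, 2), (2, 3), (1, 7), (3, 7), (4, 5), (6, 7), (7, 8)]
print(local_edges(S, edges, parent))
```

In the example, (1, 2) lies inside a grandchild and (1, 7) crosses both a grandchild and S, so neither is local for S; (7, 8) has no endpoint in S.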
[Figure: a set S, its children B1 and B2, and grand-children C1, . . . , C4; edges in
local(S) are drawn solid, non-local ones are shown dashed.]
Initially, E is the set of edges in the given graph, F ← ∅, L is the original
laminar family of vertex sets for which there are degree bounds, and an arbitrary
linear ordering is chosen on the children of each node in L. In a generic iteration
(F, E, L, b), the algorithm performs one of the following steps (see also Figure 1):

1. If x_e = 1 for some edge e ∈ E then F ← F ∪ {e}, E ← E \ {e}, and set
b(S) ← b(S) − 1 for all S ∈ L with e ∈ δ(S).
2. If xe = 0 for some edge e ∈ E then E ← E \ {e}.
3. DropN: Suppose there are at least |L|/4 good non-leaf nodes in L. Then either odd-
levels or even-levels contain a set M ⊆ L of |L|/8 good non-leaf nodes. Drop
the degree bounds of all children of M and modify L accordingly. The ordering of
siblings also extends naturally.
4. DropL: Suppose there are more than |L|/4 good leaf nodes in L, denoted by N.
Then partition N into parts corresponding to siblings in L. For any part {N_1, · · · ,
N_k} ⊆ N consisting of ordered (not necessarily contiguous) children of some node
S:
(a) Define M_i = N_{2i−1} ∪ N_{2i} for all 1 ≤ i ≤ ⌊k/2⌋ (if k is odd N_k is not used).
(b) Modify L by removing leaves {N_1, · · · , N_k} and adding new leaf-nodes {M_1,
· · · , M_{⌊k/2⌋}} as children of S (if k is odd N_k is removed). The children of S in
the new laminar family are ordered as follows: each node M_i takes the position
of either N_{2i−1} or N_{2i}, and other children of S are unaffected.
(c) Set the degree bound of each M_i to b(M_i) = b(N_{2i−1}) + b(N_{2i}).
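The merging performed by the DropL step on the children of a single node S can be sketched as below. This is an illustration only: the function `drop_l_merge`, the list-based sibling ordering, and the choice to place each merged node at the position of the first leaf of its pair are assumptions (the text allows either position).

```python
def drop_l_merge(children, bounds, good):
    """Pair up consecutive good leaves among the ordered children of one node.

    children: ordered list of frozensets (the children of S)
    bounds:   dict mapping each child to its residual degree bound
    good:     set of children that are good leaves
    Returns the new ordered children and their bounds."""
    good_leaves = [N for N in children if N in good]
    removed = set(good_leaves)      # every good leaf leaves the family
    replacement = {}                # first leaf of a pair -> merged node M_i
    new_bounds = {N: bounds[N] for N in children if N not in removed}
    for i in range(0, len(good_leaves) - 1, 2):
        first, second = good_leaves[i], good_leaves[i + 1]
        merged = first | second
        replacement[first] = merged
        new_bounds[merged] = bounds[first] + bounds[second]
    new_children = []
    for N in children:
        if N in replacement:
            new_children.append(replacement[N])   # M_i sits where N_{2i-1} was
        elif N not in removed:
            new_children.append(N)                # other children are unaffected
    return new_children, new_bounds

# Hypothetical children in the order N1, T, N2, N3, N4, N5 (T is not a good leaf).
N1, T, N2, N3, N4, N5 = (frozenset(s) for s in ({1}, {2, 3}, {4}, {5}, {6}, {7}))
bounds = {N1: 1, T: 3, N2: 2, N3: 1, N4: 1, N5: 2}
new_children, new_bounds = drop_l_merge(
    [N1, T, N2, N3, N4, N5], bounds, {N1, N2, N3, N4, N5})
print(new_children)
print(new_bounds)
```

With five good leaves, two merged nodes are created and the unpaired last leaf is dropped, so the number of degree constraints under S falls from six to three.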

Assuming that one of the above steps applies at each iteration, the algorithm terminates
when E = ∅ and outputs the final set F as a solution. It is clear that the algorithm
outputs a spanning tree of G. An inductive argument (see e.g. [20]) can be used to show
that the LP (1) is feasible at each iteration and c(F) + z_cur ≤ z_o where z_o is
the original LP value, z_cur is the current LP value, and F is the chosen edge-set at the
current iteration. Thus the cost of the final solution is at most the initial LP optimum z_o.
Next we show that one of the four iterative steps always applies.
Lemma 1. In each iteration, one of the four steps above applies.
Fig. 1. Examples of the degree constraint modifications DropN and DropL. (Panels: DropN
step with a good non-leaf S; DropL step with good leaves {N_i}_{i=1}^5 of S merged into M1 and M2.)

Proof. Let x∗ be the optimal basic solution of (1), and suppose that the first two steps
do not apply. Hence, we have 0 < x∗e < 1 for all e ∈ E. The fact that x∗ is a basic
solution together with a standard uncrossing argument (e.g., see [14]) implies that x∗ is
uniquely defined by

x(E(U)) = |U| − |F(U)| − 1 ∀ U ∈ 𝒮, and x(δ_E(S)) = b(S), ∀ S ∈ L′,

where 𝒮 is a laminar subset of the tight spanning tree constraints, and L′ ⊆ L is a subset of
tight degree constraints, and where |E| = |𝒮| + |L′|.
A simple counting argument (see, e.g., [27]) shows that there are at least 2 edges
induced on each S ∈ 𝒮 that are not induced on any of its children; so 2|𝒮| ≤ |E|. Thus
we obtain |E| ≤ 2|L′| ≤ 2|L|.
From the definition of local edges, we get that any edge e = (u, v) is local to at most
the following six sets: the smallest set S1 ∈ L containing u, the smallest set S2 ∈ L
containing v, the parents P1 and P2 of S1 and S2 resp., the least-common-ancestor A
of P1 and P2, and the parent of A. Thus Σ_{S∈L} |local(S)| ≤ 6|E|. From the above,
we conclude that Σ_{S∈L} |local(S)| ≤ 12|L|. Thus at least |L|/2 sets S ∈ L must have
|local(S)| ≤ α = 24, i.e., must be good. Now either at least |L|/4 of them must be
non-leaves or at least |L|/4 of them must be leaves. In the first case, step 3 applies and in
the second case, step 4 applies.
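The averaging step at the end of this proof can be spelled out as follows; this is only a restatement of the argument above, using |E| ≤ 2|L| and the fact that each edge is local to at most six sets.

```latex
\[
  \sum_{S \in \mathcal{L}} |\mathrm{local}(S)|
  \;\le\; 6\,|E| \;\le\; 6 \cdot 2\,|\mathcal{L}| \;=\; 12\,|\mathcal{L}| .
\]
% If more than |L|/2 sets had |local(S)| > 24, the sum would exceed
% (|L|/2) * 24 = 12|L|, a contradiction.
\[
  \bigl|\{\, S \in \mathcal{L} : |\mathrm{local}(S)| > 24 \,\}\bigr|
  \;\le\; \frac{12\,|\mathcal{L}|}{24} \;=\; \frac{|\mathcal{L}|}{2},
  \qquad\text{so at least } \frac{|\mathcal{L}|}{2} \text{ sets are good.}
\]
```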
It remains to bound the violation in the degree constraints, which turns out to be rather
challenging. We note that this is unlike usual applications of iterative rounding/relaxation,
where the harder part is in showing that one of the iterative steps applies.
It is clear that the algorithm reduces the size of L by at least |L|/8 in each DropN or
DropL iteration. Since the initial number of degree constraints is at most 2n − 1, we get
the following lemma.
Lemma 2. The number of drop iterations (DropN and DropL) is T := O(log n).
Performance guarantee for degree constraints. We begin with some notation. The
iterations of the algorithm are broken into periods between successive drop iterations:
there are exactly T drop-iterations (Lemma 2). In what follows, the t-th drop iteration
is called round t. The time t refers to the instant just after round t; time 0 refers to the
start of the algorithm. At any time t, consider the following parameters.
– L_t denotes the laminar family of degree constraints.
– E_t denotes the undecided edge set, i.e., support of the current LP optimal solution.
– For any set B of consecutive siblings in L_t, Bnd(B, t) = Σ_{N∈B} b(N) equals the
sum of the residual degree bounds on nodes of B.
– For any set B of consecutive siblings in L_t, Inc(B, t) equals the number of edges
from δ_{E_t}(∪_{N∈B} N) included in the final solution.
Recall that b denotes the residual degree bounds at any point in the algorithm. The
following lemma is the main ingredient in bounding the degree violation.
Lemma 3. For any set B of consecutive siblings in Lt (at any time t), Inc(B, t) ≤
Bnd(B, t) + 4α · (T − t).
Observe that this implies the desired bound on each original degree constraint S: using
t = 0 and B = {S}, the violation is bounded by an additive 4α · T term.
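To get a feel for the constants behind the O(log n) violation, the following sketch simulates the shrinkage of |L| and evaluates the 4α · T term. It assumes, as a worst case, that each drop round removes exactly ⌈|L|/8⌉ constraints starting from 2n − 1, and that rounds continue until no constraint is left; the function names are hypothetical.

```python
ALPHA = 24  # the constant alpha from the text

def drop_rounds(n):
    """Upper estimate of the number T of drop rounds for an n-vertex instance,
    assuming each round removes exactly ceil(|L|/8) of the remaining constraints."""
    size = 2 * n - 1            # a laminar family on n vertices has at most 2n - 1 sets
    rounds = 0
    while size > 0:
        size -= -(-size // 8)   # subtract ceil(size / 8)
        rounds += 1
    return rounds

def additive_violation_bound(n):
    """The additive term 4 * alpha * T obtained from Lemma 3 with t = 0."""
    return 4 * ALPHA * drop_rounds(n)

for n in (4, 100, 10**6):
    print(n, drop_rounds(n), additive_violation_bound(n))
```

The round count grows logarithmically in n, but the hidden constant 4α = 96 per round is large, so the guarantee is mainly of asymptotic interest.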
Proof. The proof of this lemma is by induction on T − t. The base case t = T is trivial
since the only iterations after this correspond to including 1-edges: hence there is no
violation in any degree bound, i.e. Inc({N}, T) ≤ b(N) for all N ∈ L_T. Hence for any
B ⊆ L, Inc(B, T) ≤ Σ_{N∈B} Inc({N}, T) ≤ Σ_{N∈B} b(N) = Bnd(B, T).
Now suppose t < T , and assume the lemma for t + 1. Fix a consecutive B ⊆ Lt . We
consider different cases depending on what kind of drop occurs in round t + 1.
DropN round. Here either all nodes in B get dropped or none gets dropped.
Case 1: None of B is dropped. Then observe that B is consecutive in Lt+1 as well;
so the inductive hypothesis implies Inc(B, t + 1) ≤ Bnd(B, t + 1) + 4α · (T − t − 1).
Since the only iterations between round t and round t + 1 involve edge-fixing, we have
Inc(B, t) ≤ Bnd(B, t) − Bnd(B, t + 1) + Inc(B, t + 1) ≤ Bnd(B, t) + 4α · (T − t − 1) ≤
Bnd(B, t) + 4α · (T − t).
Case 2: All of B is dropped. Let C denote the set of all children (in Lt ) of nodes in
B. Note that C consists of consecutive siblings in Lt+1 , and inductively Inc(C, t + 1) ≤
Bnd(C, t + 1) + 4α · (T − t − 1). Let S ∈ Lt denote the parent of the B-nodes;
so C are grand-children of S in L_t. Let x denote the optimal LP solution just before
round t + 1 (when the degree bounds are still given by L_t), and H = E_{t+1} the support
edges of x. At that point, we have b(N) = x(δ(N)) for all N ∈ B ∪ C. Also let
Bnd′(B, t + 1) := Σ_{N∈B} b(N) be the sum of bounds on B-nodes just before round
t + 1. Since S is a good node in round t + 1, |Bnd′(B, t + 1) − Bnd(C, t + 1)| =
|Σ_{N∈B} b(N) − Σ_{M∈C} b(M)| = |Σ_{N∈B} x(δ(N)) − Σ_{M∈C} x(δ(M))| ≤ 2α. The
last inequality follows since S is good; the factor of 2 appears since some edges, e.g.,
the edges between two children or two grandchildren of S, may get counted twice. Note
also that the symmetric difference of δ_H(∪_{N∈B} N) and δ_H(∪_{M∈C} M) is contained in
local(S). Thus δ_H(∪_{N∈B} N) and δ_H(∪_{M∈C} M) differ in at most α edges.
Again since all iterations between time t and t + 1 are edge-fixing:

Inc(B, t) ≤ Bnd(B, t) − Bnd′(B, t + 1) + |δ_H(∪_{N∈B} N) \ δ_H(∪_{M∈C} M)| + Inc(C, t + 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + α + Inc(C, t + 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + α + Bnd(C, t + 1) + 4α · (T − t − 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + α + Bnd′(B, t + 1) + 2α + 4α · (T − t − 1)
          ≤ Bnd(B, t) + 4α · (T − t)
The first inequality above follows from simple counting; the second follows since
δH (∪N ∈B N ) and δH (∪M∈C M ) differ in at most α edges; the third is the induction
hypothesis, and the fourth is Bnd(C, t + 1) ≤ Bnd′(B, t + 1) + 2α (as shown above).
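For completeness, the arithmetic behind the final inequality of this chain is the following; the Bnd′(B, t + 1) terms cancel and the remaining constants are collected.

```latex
\[
  \alpha + 2\alpha + 4\alpha\,(T - t - 1)
  \;=\; 4\alpha\,(T - t) - \alpha
  \;\le\; 4\alpha\,(T - t).
\]
```

So each DropN round consumes at most 3α of the 4α budget allotted per round.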
DropL round. In this case, let S be the parent of B-nodes in L_t, and 𝒩 = {N_1, · · · , N_p}
be all the ordered children of S, of which B is a subsequence (since it is consecutive).
Suppose indices 1 ≤ π(1) < π(2) < · · · < π(k) ≤ p correspond to good leaf-nodes
in 𝒩. Then for each 1 ≤ i ≤ ⌊k/2⌋, nodes N_{π(2i−1)} and N_{π(2i)} are merged in this
round. Let {π(i) | e ≤ i ≤ f} (possibly empty) denote the indices of good leaf-nodes
in B. Then it is clear that the only nodes of B that may be merged with nodes outside
B are N_{π(e)} and N_{π(f)}; all other B-nodes are either not merged or merged with another
B-node. Let C be the inclusion-wise minimal set of children of S in L_{t+1} s.t.
– C is consecutive in L_{t+1},
– C contains all nodes of B \ {N_{π(i)}}_{i=1}^k, and
– C contains all new leaf nodes resulting from merging two good leaf nodes of B.
Note that ∪_{M∈C} M consists of some subset of B and at most two good leaf-nodes in
𝒩 \ B. These two extra nodes (if any) are those merged with the good leaf-nodes N_{π(e)}
and N_{π(f)} of B. Again let Bnd′(B, t + 1) := Σ_{N∈B} b(N) denote the sum of bounds
on B just before drop round t + 1, when degree constraints are L_t. Let H = E_{t+1} be
the undecided edges in round t + 1. By the definition of bounds on merged leaves, we
have Bnd(C, t + 1) ≤ Bnd′(B, t + 1) + 2α. The term 2α is present due to the two extra
good leaf-nodes described above.
Claim 6. We have |δ_H(∪_{N∈B} N) \ δ_H(∪_{M∈C} M)| ≤ 2α.
Proof. We say that a child N ∈ 𝒩 of S is represented in C if either N ∈ C or N is contained
in some node of C. Let D be the set of nodes of B that are not represented in C and the
nodes of 𝒩 \ B that are represented in C. Observe that by definition of C, the set D ⊆
{N_{π(e−1)}, N_{π(e)}, N_{π(f)}, N_{π(f+1)}}; in fact it can be easily seen that |D| ≤ 2. Moreover
D consists of only good leaf nodes. Thus, we have |∪_{L∈D} δ_H(L)| ≤ 2α. Now note that
the edges in δ_H(∪_{N∈B} N) \ δ_H(∪_{M∈C} M) must be in ∪_{L∈D} δ_H(L). This completes the
proof.
As in the previous case, we have:
Inc(B, t) ≤ Bnd(B, t) − Bnd′(B, t + 1) + |δ_H(∪_{N∈B} N) \ δ_H(∪_{M∈C} M)| + Inc(C, t + 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + 2α + Inc(C, t + 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + 2α + Bnd(C, t + 1) + 4α · (T − t − 1)
          ≤ Bnd(B, t) − Bnd′(B, t + 1) + 2α + Bnd′(B, t + 1) + 2α + 4α · (T − t − 1)
          = Bnd(B, t) + 4α · (T − t)
The first inequality follows from simple counting; the second uses Claim 6, the third
is the induction hypothesis (since C is consecutive), and the fourth is Bnd(C, t + 1) ≤
Bnd′(B, t + 1) + 2α (from above).
This completes the proof of the inductive step and hence Lemma 3.

3 Hardness Results
We now prove Theorem 2. The first step to proving this result is a hardness for the more
general minimum crossing matroid basis problem: given a matroid M on a ground set
V of elements, a cost function c : V → R+, and degree bounds specified by pairs
{(E_i, b_i)}_{i=1}^m (where each E_i ⊆ V and b_i ∈ N), find a minimum cost basis I in M
such that |I ∩ E_i| ≤ b_i for all i ∈ [m].
Theorem 7. Unless NP has quasi-polynomial time algorithms, the unweighted minimum
crossing matroid basis problem admits no polynomial time O(log^c m) additive
approximation for the degree bounds for some fixed constant c > 0.
Proof. We reduce from the label cover problem [1]. The input is a graph G = (U, E)
where the vertex set U is partitioned into pieces U1 , · · · , Un each having size q, and all
edges in E are between distinct pieces. We say that there is a superedge between Ui and
Uj if there is an edge connecting some vertex in Ui to some vertex in Uj. Let t denote
the total number of superedges; i.e.,

t = |{(i, j) ∈ ([n] choose 2) : there is an edge in E between U_i and U_j}|.
The goal is to pick one vertex from each part {U_i}_{i=1}^n so as to maximize the number of
induced edges. This is called the value of the label cover instance and is at most t.
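The label cover objective as defined here can be illustrated by brute force on a tiny instance. This sketch is exponential in the number of parts and is meant only to make the definitions of value and superedge concrete; the function names and the two example instances are hypothetical.

```python
from itertools import product

def count_superedges(parts, edges):
    """t: the number of pairs of parts joined by at least one edge."""
    part_of = {v: i for i, part in enumerate(parts) for v in part}
    return len({frozenset((part_of[u], part_of[v])) for u, v in edges})

def label_cover_value(parts, edges):
    """Maximum number of edges induced by picking one vertex from each part."""
    best = 0
    for choice in product(*parts):
        chosen = set(choice)
        value = sum(1 for u, v in edges if u in chosen and v in chosen)
        best = max(best, value)
    return best

parts = [["a1", "a2"], ["b1", "b2"], ["c1", "c2"]]
# An instance whose value equals the number of superedges t.
full = [("a1", "b1"), ("b1", "c1"), ("a2", "c2"), ("a1", "c1")]
# An instance in which no choice of vertices induces more than one edge.
weak = [("a1", "b1"), ("b2", "c1"), ("a2", "c2")]
print(count_superedges(parts, full), label_cover_value(parts, full))
print(count_superedges(parts, weak), label_cover_value(parts, weak))
```

Since at most one edge can be induced between the two chosen vertices of any pair of parts, the value never exceeds t.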
It is well known that there exists a universal constant γ > 1 such that for every
k ∈ N, there is a reduction from any instance of SAT (having size N) to a label cover
instance ⟨G = (U, E), q, t⟩ such that:
– If the SAT instance is satisfiable, the label cover instance has optimal value t.
– If the SAT instance is not satisfiable, the label cover instance has optimal value
< t/γ^k.
– |G| = N^{O(k)}, q = 2^k, |E| ≤ t^2, and the reduction runs in time N^{O(k)}.
We consider a uniform matroid M with rank t on ground set E (recall that any subset
of t edges is a basis in a uniform matroid). We now construct a crossing matroid basis
instance I on M. There is a set of degree bounds corresponding to each i ∈ [n]: for
every collection C of edges incident to vertices in Ui such that no two edges in C are
incident to the same vertex in Ui , there is a degree bound in I requiring at most one
element to be chosen from C. Note that the number of degree bounds m is at most
|E|^q ≤ N^{O(k 2^k)}. The following claim links the SAT and crossing matroid instances.
Its proof is deferred to the full version of this paper.
Claim 8. [Yes instance] If the SAT instance is satisfiable, there is a basis (i.e. subset
B ⊆ E with |B| = t) satisfying all degree bounds.
[No instance] If the SAT instance is unsatisfiable, every subset B′ ⊆ E with |B′| ≥ t/2
violates some degree bound by an additive ρ = γ^{k/2}/√2.
The steps described in the above reduction can be done in time polynomial in m and
|G|. Also, instead of randomly choosing vertices from the sets W_i, we can use conditional
expectations to derive a deterministic algorithm that recovers at least t/ρ^2 edges.
Setting k = Θ(log log N) (recall that N is the size of the original SAT instance), we
obtain an instance of bounded-degree matroid basis of size max{m, |G|} = N^{log^a N}
and ρ = log^b N, where a, b > 0 are constants. Note that log m = log^{a+1} N, which
implies ρ = log^c m for c = b/(a+1) > 0, a constant. Thus it follows that for this constant
c > 0 the bounded-degree matroid basis problem has no polynomial time O(log^c m)
additive approximation for the degree bounds, unless NP has quasi-polynomial time
algorithms.
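The relation between the exponents a, b and c used in this parameter setting follows by solving for log N; the short derivation below only restates that step.

```latex
\[
  \log m = \log^{a+1} N
  \;\Longrightarrow\;
  \log N = (\log m)^{\frac{1}{a+1}}
  \;\Longrightarrow\;
  \rho = \log^{b} N = (\log m)^{\frac{b}{a+1}} = \log^{c} m,
  \qquad c = \frac{b}{a+1} .
\]
```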
We now prove Theorem 2.
Proof. [Proof of Theorem 2] We show how the bases of a uniform matroid can be
represented in a suitable instance of the crossing spanning tree problem. Let the uniform
matroid from Theorem 7 consist of e elements and have rank t ≤ e; recall that t ≥ √e
and clearly m ≤ 2^e. We construct a graph as in Figure 2, with vertices v_1, · · · , v_e
corresponding to elements in the uniform matroid. Each vertex v_i is connected to the
root r by two vertex-disjoint paths: ⟨v_i, u_i, r⟩ and ⟨v_i, w_i, r⟩. There are no costs in
this instance. Corresponding to each degree bound (in the uniform matroid) of b(C)
on a subset C ⊆ [e], there is a constraint to pick at most |C| + b(C) edges from
δ({u_i | i ∈ C}). Additionally, there is a special degree bound of 2e − t on the edge-set
E′ = ∪_{i=1}^e δ(w_i); this corresponds to picking a basis in the uniform matroid.
Observe that for each $i \in [e]$, any spanning tree must choose exactly three edges
amongst $\{(r, u_i), (u_i, v_i), (r, w_i), (w_i, v_i)\}$; in fact any three edges suffice.
Hence every spanning tree $T$ in this graph corresponds to a subset $X \subseteq [e]$ such
that: (I) $T$ contains both edges in $\delta(u_i)$ and one edge from $\delta(w_i)$, for each
$i \in X$, and (II) $T$ contains both edges in $\delta(w_i)$ and one edge from $\delta(u_i)$
for each $i \in [e] \setminus X$.

Fig. 2. The crossing spanning tree instance used in the reduction

From Theorem 7, for the crossing matroid problem, we obtain the two cases:
Yes instance. There is a basis $B^*$ (i.e. $B^* \subseteq [e]$, $|B^*| = t$) satisfying all
degree bounds. Consider the spanning tree
$$T^* = \{(r, u_i), (u_i, v_i), (r, w_i) \mid i \in B^*\} \cup \{(r, w_i), (w_i, v_i), (r, u_i) \mid i \in [e] \setminus B^*\}.$$
Since $B^*$ satisfies its degree bounds, $T^*$ satisfies all degree bounds derived from the
crossing matroid instance. For the special degree bound on $E'$, note that
$|T^* \cap E'| = 2e - |B^*| = 2e - t$; so this is also satisfied. Thus there is a spanning
tree satisfying all the degree bounds.
No instance. Every subset $B' \subseteq [e]$ with $|B'| \ge t/2$ (i.e. near basis) violates
some degree bound by an additive $\rho = \Omega(\log^c m)$ term, where $c > 0$ is a fixed
constant. Consider any spanning tree $T$ that corresponds to subset $X \subseteq [e]$ as
described above.
1. Suppose that $|X| \le t/2$; then we have $|T \cap E'| = 2e - |X| \ge 2e - t + \frac{t}{2}$,
i.e. the special degree bound is violated by $t/2 \ge \Omega(\sqrt{e}) = \Omega(\log^{1/2} m)$.
2. Now suppose that $|X| \ge t/2$. Then by the guarantee on the no-instance, $T$ violates
some degree bound derived from the crossing matroid instance by additive $\rho$.
Thus in either case, every spanning tree violates some degree bound by additive
$\rho = \Omega(\log^c m)$.
By Theorem 7, it is hard to distinguish the above cases and we obtain the correspond-
ing hardness result for crossing spanning tree, as claimed in Theorem 2.
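
The correspondence between spanning trees and subsets $X \subseteq [e]$ used in this proof can be checked mechanically on small instances. The following Python sketch is illustrative only: the vertex labels and the helper names `tree_for_subset`, `is_spanning_tree` and `edges_touching` are assumptions made here, not part of the paper. It builds the tree associated with a subset $X$ and counts the edges that enter the two kinds of degree bounds.

```python
def tree_for_subset(e, X):
    """Edges of the spanning tree associated with X, a subset of {1, ..., e}."""
    T = []
    for i in range(1, e + 1):
        # Both root edges are always present; v_i hangs off u_i iff i is in X.
        T += [("r", f"u{i}"), ("r", f"w{i}")]
        T.append((f"u{i}", f"v{i}") if i in X else (f"w{i}", f"v{i}"))
    return T


def is_spanning_tree(e, T):
    """Union-find check: T is acyclic and has (#vertices - 1) edges."""
    vertices = ["r"] + [f"{c}{i}" for c in "uvw" for i in range(1, e + 1)]
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for a, b in T:
        ra, rb = find(a), find(b)
        if ra == rb:
            return False  # adding (a, b) would close a cycle
        parent[ra] = rb
    return len(T) == len(vertices) - 1


def edges_touching(T, prefix, indices):
    """Number of tree edges incident to the vertices prefix_i, i in indices."""
    targets = {f"{prefix}{i}" for i in indices}
    return sum(1 for a, b in T if a in targets or b in targets)


e, X = 5, {2, 4}
T = tree_for_subset(e, X)
print(is_spanning_tree(e, T))                    # T is a spanning tree
print(edges_touching(T, "w", range(1, e + 1)))   # |T ∩ E'| = 2e - |X|
print(edges_touching(T, "u", [1, 2, 3]))         # |C| + |X ∩ C| for C = {1, 2, 3}
```

With these counts, the special bound $2e - t$ on $E'$ forces $|X| \ge t$, while a bound $b(C)$ on $C$ translates into $|X \cap C| \le b(C)$.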

References
1. Arora, S., Babai, L., Stern, J., Sweedyk, Z.: The hardness of approximate optima in lattices,
codes, and systems of linear equations. J. Comput. Syst. Sci. 54(2), 317–331 (1997)
2. Bansal, N., Khandekar, R., Nagarajan, V.: Additive guarantees for degree bounded network
design. In: STOC, pp. 769–778 (2008)
3. Bansal, N., Khandekar, R., Könemann, J., Nagarajan, V., Peis, B.: On Generalizations of
Network Design Problems with Degree Bounds (full version),Technical Report (2010)
4. Bilo, V., Goyal, V., Ravi, R., Singh, M.: On the crossing spanning tree problem. In: Jansen,
K., Khanna, S., Rolim, J.D.P., Ron, D. (eds.) RANDOM 2004 and APPROX 2004. LNCS,
vol. 3122, pp. 51–60. Springer, Heidelberg (2004)
5. Chaudhuri, K., Rao, S., Riesenfeld, S., Talwar, K.: What would Edmonds do? Augment-
ing paths and witnesses for degree-bounded MSTs. In: Chekuri, C., Jansen, K., Rolim,
J.D.P., Trevisan, L. (eds.) APPROX 2005 and RANDOM 2005. LNCS, vol. 3624, pp. 26–39.
Springer, Heidelberg (2005)
6. Chaudhuri, K., Rao, S., Riesenfeld, S., Talwar, K.: Push relabel and an improved approxima-
tion algorithm for the bounded-degree MST problem. In: Bugliesi, M., Preneel, B., Sassone,
V., Wegener, I. (eds.) ICALP 2006. LNCS, vol. 4051, pp. 191–201. Springer, Heidelberg
(2006)
7. Chazelle, B.: The Discrepancy Method: Randomness and Complexity. Cambridge University
Press, Cambridge (2000)
8. Chekuri, C., Vondrák, J., Zenklusen, R.: Dependent Randomized Rounding for Matroid Poly-
topes and Applications (2009), http://arxiv.org/abs/0909.4348
9. Faigle, U., Peis, B.: Two-phase greedy algorithms for some classes of combinatorial linear
programs. In: SODA, pp. 161–166 (2008)
10. Frank, A.: Increasing the rooted connectivity of a digraph by one. Math. Programming 84,
565–576 (1999)
11. Goemans, M.X.: Minimum Bounded-Degree Spanning Trees. In: FOCS, pp. 273–282 (2006)
12. Grandoni, F., Ravi, R., Singh, M.: Iterative Rounding for Multiobjective Optimization Prob-
lems. In: Fiat, A., Sanders, P. (eds.) ESA 2009. LNCS, vol. 5757, pp. 95–106. Springer,
Heidelberg (2009)
13. Hoffman, A., Schwartz, D.E.: On lattice polyhedra. In: Hajnal, A., Sos, V.T. (eds.) Proceed-
ings of Fifth Hungarian Combinatorial Coll, pp. 593–598. North-Holland, Amsterdam (1978)
14. Jain, K.: A factor 2 approximation algorithm for the generalized Steiner network problem.
In: Combinatorica, pp. 39–61 (2001)
15. Király, T., Lau, L.C., Singh, M.: Degree bounded matroids and submodular flows. In: Lodi,
A., Panconesi, A., Rinaldi, G. (eds.) IPCO 2008. LNCS, vol. 5035, pp. 259–272. Springer,
Heidelberg (2008)
16. Klein, P.N., Krishnan, R., Raghavachari, B., Ravi, R.: Approximation algorithms for finding
low degree subgraphs. Networks 44(3), 203–215 (2004)
17. Könemann, J., Ravi, R.: A matter of degree: Improved approximation algorithms for degree
bounded minimum spanning trees. SIAM J. on Computing 31, 1783–1793 (2002)
18. Könemann, J., Ravi, R.: Primal-Dual meets local search: approximating MSTs with nonuni-
form degree bounds. SIAM J. on Computing 34(3), 763–773 (2005)
19. Korte, B., Vygen, J.: Combinatorial Optimization, 4th edn. Springer, New York (2008)
20. Lau, L.C., Naor, J., Salavatipour, M.R., Singh, M.: Survivable network design with degree or
order constraints (full version). In: STOC, pp. 651–660 (2007)
21. Lau, L.C., Singh, M.: Additive Approximation for Bounded Degree Survivable Network
Design. In: STOC, pp. 759–768 (2008)
22. Nutov, Z.: Approximating Directed Weighted-Degree Constrained Networks. In: Goel, A.,
Jansen, K., Rolim, J.D.P., Rubinfeld, R. (eds.) APPROX 2008 and RANDOM 2008. LNCS,
vol. 5171, pp. 219–232. Springer, Heidelberg (2008)
23. Ravi, R., Marathe, M.V., Ravi, S.S., Rosenkrantz, D.J., Hunt, H.B.: Many birds with one
stone: Multi-objective approximation algorithms. In: STOC, pp. 438–447 (1993)
24. Ravi, R., Singh, M.: Delegate and Conquer: An LP-based approximation algorithm for Min-
imum Degree MSTs. In: Bugliesi, M., Preneel, B., Sassone, V., Wegener, I. (eds.) ICALP
2006. LNCS, vol. 4051, pp. 169–180. Springer, Heidelberg (2006)
25. Schrijver, A.: Combinatorial Optimization. Springer, Heidelberg (2003)
26. Singh, M.: Personal Communication (2008)
27. Singh, M., Lau, L.C.: Approximating minimum bounded degree spanning trees to within one
of optimal. In: STOC, pp. 661–670 (2007)
A Polyhedral Study of the Mixed Integer Cut

Steve Tyber and Ellis L. Johnson

H. Milton Stewart School of Industrial and Systems Engineering,


Georgia Institute of Technology, Atlanta, GA USA
{styber,ejohnson}@isye.gatech.edu

Abstract. General purpose cutting planes have played a central role in


modern IP solvers. In practice, the Gomory mixed integer cut has proven
to be among the most useful general purpose cuts. One may obtain this
inequality from the group relaxation of an IP, which arises by relaxing
non-negativity on the basic variables. We study the mixed integer cut as a
facet of the master cyclic group polyhedron and characterize its extreme
points and adjacent facets in this setting. Extensions are provided under
automorphic and homomorphic mappings.

Keywords: Integer Programming, Group Relaxation, Master Cyclic


Group Polyhedron, Master Knapsack Polytope, Cutting Planes.

1 Introduction
Consider the integer program
$$\min\{cx : Ax = b,\ x \in \mathbb{Z}^n_+\}, \tag{1}$$
where $A \in \mathbb{Z}^{m \times n}$, $b \in \mathbb{Z}^m$, and $c \in \mathbb{R}^n$. Given a
basis $B$ of the LP relaxation of (1), the group relaxation $X_{GR}$ of the feasible set of (1)
is obtained by relaxing non-negativity on $x_B$, i.e.
$$X_{GR} = \{x : Bx_B + Nx_N = b,\ x_B \in \mathbb{Z}^m,\ x_N \in \mathbb{Z}^{n-m}_+\}.$$
It follows that for an integer vector $x_N$, $x_B$ is integral if and only if
$Nx_N \equiv b \pmod{B}$; that is, $Nx_N - b$ belongs to the lattice generated by the
columns of $B$.
Consider the group $G$ of equivalence classes of $\mathbb{Z}^m$ modulo $B$. Let
$\mathcal{N}$ be the set of distinct equivalence classes represented by the columns of $N$,
and let $g_0$ be the equivalence class represented by $b$. The group polyhedron is given by
$$P(\mathcal{N}, g_0) = \mathrm{conv}\Big\{ t \in \mathbb{Z}_+^{|\mathcal{N}|} : \sum_{g \in \mathcal{N}} g\, t(g) = g_0 \Big\},$$
where equality is taken modulo $B$. Letting $G_+ = G \setminus 0$, i.e. the set of equivalence
classes distinct from the lattice generated by $B$, the master group polyhedron is given by
$$P(G, g_0) = \mathrm{conv}\Big\{ t \in \mathbb{Z}_+^{|G|-1} : \sum_{g \in G_+} g\, t(g) = g_0 \Big\}.$$

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 124–134, 2010.

© Springer-Verlag Berlin Heidelberg 2010
When $A$ consists of a single row, the master group polyhedron is of the form
$$P(C_D, r) = \mathrm{conv}\Big\{ t \in \mathbb{Z}_+^{|D|-1} : \sum_{i=1}^{|D|-1} i\, t_i \equiv r \pmod{D} \Big\}$$
and is called the master cyclic group polyhedron.


In [3], Gomory introduces the group polyhedron and studies its facets. In
particular, he shows that one can obtain facets of the group polyhedron from
facets of its corresponding master polyhedron, and that these can be used to
obtain valid inequalities for $P$. Further, Gomory identifies the mixed integer cut
as a facet of the master cyclic group polyhedron $P(C_n, r)$ where $r \neq 0$:
$$\frac{1}{r} t_1 + \cdots + \frac{r-1}{r} t_{r-1} + t_r + \frac{n-r-1}{n-r} t_{r+1} + \cdots + \frac{1}{n-r} t_{n-1} \ge 1.$$
Indeed, by dropping the coefficients in the above inequality for elements not
appearing in the group problem for a tableau row, one readily obtains the familiar
Gomory mixed integer cut.
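
To make the shape of this inequality concrete, the following Python sketch computes its coefficients exactly for given $n$ and $r$. The function name `mic_coefficients` is an assumption introduced here for illustration; the formula is the one displayed above (coefficient $i/r$ for $i \le r$ and $(n-i)/(n-r)$ for $i > r$).

```python
from fractions import Fraction


def mic_coefficients(n, r):
    """Return [mu_1, ..., mu_{n-1}], the mixed integer cut coefficients for P(C_n, r)."""
    if not 0 < r < n:
        raise ValueError("need 0 < r < n")
    # The coefficients rise linearly to 1 at i = r, then fall linearly towards 0.
    return [Fraction(i, r) if i <= r else Fraction(n - i, n - r)
            for i in range(1, n)]


mu = mic_coefficients(7, 3)
print([str(c) for c in mu])  # ['1/3', '2/3', '1', '3/4', '1/2', '1/4']
```

Exact rational arithmetic is used so that later equality tests of the form $\mu t = 1$ are not affected by floating-point rounding.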
Empirically, this cut has been effective in solving integer programs [2], and
shooting experiments indicate that this facet describes a large portion of the
master cyclic group polyhedron [4].
We continue this investigation of the mixed integer cut. In Section 2, we
characterize its extreme points and identify the adjacent facets of the master
cyclic group polyhedron; in Section 3, we extend our characterization of extreme
points to all integer points of the mixed integer cut; and in Section 4, we discuss
mappings of the mixed integer cut under automorphisms and homomorphisms of
groups and provide extensions of our results. We conclude with future research
directions.

2 Facets and Extreme Points of the Mixed Integer Cut


Throughout, we consider the master cyclic group polyhedron:
$$P(C_n, r) = \mathrm{conv}\Big\{ t \in \mathbb{Z}_+^{n-1} : \sum_{i=1}^{n-1} i\, t_i \equiv r \pmod{n} \Big\}.$$
We will also frequently refer to the master knapsack polyhedron,
$$P(K_m) = \mathrm{conv}\Big\{ x \in \mathbb{Z}_+^{m} : \sum_{i=1}^{m} i\, x_i = m \Big\}.$$

Further, we will always assume that r > 0 and that n ≥ 3. By observing that the
recession cone of P (Cn , r) is the non-negative orthant, one notes that P (Cn , r) is
of dimension n − 1. It is also easily observed that P (Km ) is of dimension m − 1.
By the assumption that n ≥ 3, it follows that the non-negativity constraints
are facet defining. In our discussion, these shall be referred to as the trivial facets.
Let $(\pi, \pi_0)$ denote the inequality
$$\sum_{i=1}^{n-1} \pi_i t_i \ge \pi_0.$$
When speaking of valid inequalities for the master knapsack polyhedra, we shall
use the same notation where entries are understood to be of appropriate dimension.
Denote the mixed integer cut by $(\mu, 1)$, where
$$\mu_i = \begin{cases} \dfrac{i}{r} & i \le r, \\[1ex] \dfrac{n-i}{n-r} & i > r. \end{cases}$$
For completeness, we include the following theorem to which we have already
referred:
Theorem 1 (Gomory [3]). (μ, 1) is a facet of P (Cn , r).
We consider the mixed integer cut as the polytope
PMIC (n, r) = P (Cn , r) ∩ {t : μt = 1}.
Since (μ, 1) is a facet of P (Cn , r) and P (Cn , r) is integral, PMIC (n, r) is also
integral. Note that a facet (π, π0 ) is adjacent to (μ, 1) if and only if it is a facet
of PMIC (n, r). We assume that 1 < r < n − 1, since otherwise the non-trivial
facets of PMIC (n, r) are clearly knapsack facets.
We shall now discuss the connection between PMIC (n, r) and the master knap-
sack polytopes P (Kr ) and P (Kn−r ). The following proposition highlights an
operation that we will call extending a knapsack solution.
Proposition 1. If x ∈ P (Kr ), x = (x1 , . . . , xr ), then t = (x1 , . . . , xr , 0, . . . , 0)
belongs to PMIC (n, r). Likewise, if x ∈ P (Kn−r ), x = (x1 , . . . , xn−r ), then t =
(0, . . . , 0, xn−r , . . . , x1 ) belongs to PMIC (n, r).
Proof. For x ∈ P (Kr ), the result is trivial. So take x ∈ P (Kn−r ). Since P (Kn−r )
is convex and integral, we may assume that x is integral. Rewriting i = n−(n−i)
for i = 1, . . . , r and applying the assumption that x is an integral knapsack
solution, the proposition follows.
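
The extension operation of Proposition 1 can be illustrated on integer knapsack solutions. The sketch below is a minimal check under stated assumptions: the helper names (`knapsack_solutions`, `extend_left`, `extend_right`, `in_mic`) are introduced here and do not appear in the paper, and membership is tested only for integer points, via the group equation and tightness of $(\mu, 1)$.

```python
from fractions import Fraction


def knapsack_solutions(m):
    """All x in Z_+^m with sum_i i * x_i = m, i.e. the integer points of P(K_m)."""
    def rec(i, remaining):
        if i == 0:
            if remaining == 0:
                yield ()
            return
        for xi in range(remaining // i + 1):
            for rest in rec(i - 1, remaining - i * xi):
                yield rest + (xi,)
    return list(rec(m, m))


def extend_left(x, n):
    """Extend x in P(K_r) to t = (x_1, ..., x_r, 0, ..., 0) of length n - 1."""
    return tuple(x) + (0,) * (n - 1 - len(x))


def extend_right(x, n):
    """Extend x in P(K_{n-r}) to t = (0, ..., 0, x_{n-r}, ..., x_1) of length n - 1."""
    return (0,) * (n - 1 - len(x)) + tuple(reversed(x))


def in_mic(t, n, r):
    """Integer t >= 0 solves the group equation and satisfies mu . t = 1."""
    mu = [Fraction(i, r) if i <= r else Fraction(n - i, n - r) for i in range(1, n)]
    congruent = sum(i * ti for i, ti in enumerate(t, start=1)) % n == r % n
    tight = sum(m * ti for m, ti in zip(mu, t)) == 1
    return congruent and tight and all(ti >= 0 for ti in t)


n, r = 7, 3
left = [extend_left(x, n) for x in knapsack_solutions(r)]
right = [extend_right(x, n) for x in knapsack_solutions(n - r)]
print(all(in_mic(t, n, r) for t in left + right))
```

For the right-hand extension, the index reversal places $x_j$ at position $n - j$, which is exactly the rewriting $i = n - (n - i)$ used in the proof.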
In terms of facets, we shall focus on a family of facets introduced in [1]. Before
stating the theorem, we note that for any non-trivial knapsack facet, by taking
an appropriate linear combination with the knapsack equation, we may assume
the following:
Proposition 2. Let $(\rho, \rho_0)$ be a non-trivial facet of $P(K_m)$. Without loss of
generality we may assume that $(\rho, \rho_0) \ge 0$, $\rho_0 = \rho_m = 1$. Moreover, we may
assume there exists some $i \neq m$ such that $\rho_i = 0$.
Theorem 2 (Aráoz et al. [1]). Let $(\rho, \rho_r)$ be a non-trivial facet of $P(K_r)$
such that $\rho \ge 0$, $\rho_i = 0$ for at least one $i$, and $\rho_r = 1$. Let
$$\rho' = \Big(\rho_1, \dots, \rho_r = 1, \frac{n-r-1}{n-r}, \dots, \frac{1}{n-r}\Big).$$
Then there exists some $\alpha \in \mathbb{R}$ such that
$(\pi, \pi_0) = (\rho' + \alpha\mu, 1 + \alpha)$ is a facet of $P(C_n, r)$.

Although not explicitly stated in [1], as an easy consequence of Theorem 2 and


Theorem 6 (Section 4), this operation can also be performed using non-trivial
facets of P (Kn−r ).
Proposition 3. Let $(\rho, \rho_{n-r})$ be a non-trivial facet of $P(K_{n-r})$ such that
$\rho \ge 0$, $\rho_i = 0$ for at least one $i$, and $\rho_{n-r} = 1$. Let
$$\rho' = \Big(\frac{1}{r}, \dots, \frac{r-1}{r}, 1 = \rho_{n-r}, \rho_{n-r-1}, \dots, \rho_1\Big).$$
Then there exists some $\alpha \in \mathbb{R}$ such that
$(\pi, \pi_0) = (\rho' + \alpha\mu, 1 + \alpha)$ is a facet of $P(C_n, r)$.
In particular, given any non-trivial facet of P (Kr ) or P (Kn−r ) we can construct
a facet of P (Cn , r). Such facets are called tilted knapsack facets and the α is
called the tilting coefficient. Details for calculating α are given in [1]. Applying
Proposition 1 we arrive at the following:
Lemma 1. The tilted knapsack facets are facets of PMIC (n, r).

Proof. We argue for facets tilted from $P(K_r)$; an analogous argument proves the
result for facets tilted from $P(K_{n-r})$.
Let $(\pi, \pi_0)$ be tilted from $(\rho, 1)$, and let $\rho'$ be as described in Theorem 2
and $\alpha$ be the corresponding tilting coefficient. Since $(\rho, 1)$ is a facet of
$P(K_r)$, there exist $r - 1$ affinely independent extreme points $x^1, \dots, x^{r-1}$
satisfying $(\rho, 1)$ at equality. As described in Proposition 1, these points may be
extended to points $t^1, \dots, t^{r-1} \in P_{MIC}(n, r)$, and clearly this preserves affine
independence. Moreover, for $i = 1, \dots, r - 1$, $\mu t^i = 1$ and
$\rho' t^i = \rho x^i = 1$, thus
$$\pi t^i = (\rho' + \alpha\mu) t^i = \rho' t^i + \alpha \cdot \mu t^i = 1 + \alpha = \pi_r.$$
Now consider $n - r$ affinely independent extreme points $y^1, \dots, y^{n-r}$ of
$P(K_{n-r})$, and again as in Proposition 1, extend them to points
$s^1, \dots, s^{n-r} \in P_{MIC}(n, r)$. Then
$$\pi s^i = (\rho' + \alpha\mu) s^i = \rho' s^i + \alpha \cdot \mu s^i = 1 + \alpha = \pi_r.$$
It is easily seen that $\{t^1, \dots, t^{r-1}\} \cap \{s^1, \dots, s^{n-r}\} = \{e_r\}$.
Therefore we have produced $n - 2$ affinely independent points, proving the claim.

Consider a tilted knapsack facet $(\pi, \pi_0)$ arising from the facet $(\rho, 1)$ of
$P(K_r)$ with tilting coefficient $\alpha$. Letting $\mu'$ denote the first $r$ coefficients
of $\mu$, the same facet of $P(K_r)$ is described by $(\gamma, 0) = (\rho, 1) - (\mu', 1)$.
In particular, letting
$$(\bar\gamma, 0) = (\gamma_1, \dots, \gamma_r = 0, 0, \dots, 0),$$
it follows that $(\pi, \pi_0) = (\bar\gamma, 0) + (1 + \alpha)(\mu, 1)$. The same applies to
tilted knapsack facets arising from $P(K_{n-r})$.
Therefore we will think of tilted knapsack facets as arising from facets of


the form (ρ, 0), and by subtracting off the mixed integer cut we think of tilted
knapsack facets in the form (ρ̄, 0).
We now prove our main result.
Theorem 3. The convex hull of PMIC (n, r) is given by the tilted knapsack facets
and the non-negativity constraints.

Proof. For convenience, say that $P(K_r)$ has non-trivial facets
$(\rho^1, 0), \dots, (\rho^M, 0)$ and that $P(K_{n-r})$ has non-trivial facets
$(\gamma^1, 0), \dots, (\gamma^N, 0)$. Let $(\bar\rho^i, 0)$ and $(\bar\gamma^i, 0)$ denote the
tilted knapsack facets from $(\rho^i, 0)$ and $(\gamma^i, 0)$ respectively.
We shall show that the system
$$\begin{array}{rll}
\min & c \cdot t & \\
\text{s.t.} & \mu \cdot t = 1 & \\
 & \bar\rho^i \cdot t \ge 0 & i = 1, \dots, M \\
 & \bar\gamma^i \cdot t \ge 0 & i = 1, \dots, N \\
 & t \ge 0 &
\end{array} \tag{2}$$
attains an integer optimum that belongs to $P_{MIC}(n, r)$ for every $c$.
Let $c' = (c_1, \dots, c_r)$ and $c'' = (c_{n-1}, \dots, c_r)$,
$\mu' = \big(\frac{1}{r}, \dots, \frac{r-1}{r}, 1\big)$, and
$\mu'' = \big(\frac{1}{n-r}, \dots, \frac{n-r-1}{n-r}, 1\big)$. Consider the systems

$$\begin{array}{rll}
\min & c' \cdot x & \\
\text{s.t.} & \mu' \cdot x = 1 & \\
 & \rho^i \cdot x \ge 0 & i = 1, \dots, M \\
 & x \ge 0 &
\end{array} \tag{3}$$
and
$$\begin{array}{rll}
\min & c'' \cdot x & \\
\text{s.t.} & \mu'' \cdot x = 1 & \\
 & \gamma^i \cdot x \ge 0 & i = 1, \dots, N \\
 & x \ge 0 &
\end{array} \tag{4}$$
representing $P(K_r)$ and $P(K_{n-r})$ respectively. Since both systems are integral,
the minima are obtained at integer extreme points $\hat{x}'$ and $\hat{x}''$ respectively.
Now let $t^*$ be obtained by extending the solution achieving the smaller objective value
to a feasible point of $P_{MIC}(n, r)$. Indeed this $t^*$ is feasible and integral; it
remains to show that it is optimal.
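We now consider the duals. The dual of (3) is given by
$$\max_{\lambda_1, \alpha} \big(\lambda_1 : \lambda_1 \mu' + \alpha_1 \rho^1 + \cdots + \alpha_M \rho^M \le c',\ \alpha \ge 0\big), \tag{5}$$
and the dual of (4) is given by
$$\max_{\lambda_2, \beta} \big(\lambda_2 : \lambda_2 \mu'' + \beta_1 \gamma^1 + \cdots + \beta_N \gamma^N \le c'',\ \beta \ge 0\big). \tag{6}$$
Lastly the dual of (2) is given by
$$\max_{\lambda, \alpha, \beta} \Big\{ \lambda : \lambda\mu + \alpha_1 \bar\rho^1 + \cdots + \alpha_M \bar\rho^M + \beta_1 \bar\gamma^1 + \cdots + \beta_N \bar\gamma^N \le c,\ \ \alpha, \beta \ge 0 \Big\}. \tag{7}$$
Let $(\hat\lambda_1, \hat\alpha)$ and $(\hat\lambda_2, \hat\beta)$ attain the maxima in (5) and
(6) respectively. Setting $\hat\lambda = \min(\hat\lambda_1, \hat\lambda_2)$, it easily follows
from the zero pattern of (2) and non-negativity of $\mu$ that
$(\hat\lambda, \hat\alpha, \hat\beta)$ is feasible to (7). Moreover
$\hat\lambda = c \cdot t^*$, proving optimality.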

Further observe that PMIC (n, r) is pointed, and so from this same proof we get
the following characterization of extreme points:
Theorem 4. A point t is an extreme point of PMIC (n, r) if and only if it can
be obtained by extending an extreme point of P (Kr ) or P (Kn−r ).

3 Integer Points of the Mixed Integer Cut

In this section we highlight a noteworthy extension of Theorem 4 to all integer


points of PMIC (n, r).
Theorem 5. $t \in P_{MIC}(n, r) \cap \mathbb{Z}^{n-1}$ if and only if $t$ can be obtained by
extending an integer solution of $P(K_r)$ or $P(K_{n-r})$.

Proof. If $t = e_r$ the claim is obvious. So we suppose that $t_r = 0$. We shall show
that if $t \in P_{MIC}(n, r) \cap \mathbb{Z}^{n-1}$, $t_r = 0$, then either
(I) $(t_1, \dots, t_r) > 0$ or (II) $(t_r, \dots, t_{n-1}) > 0$ but not both.
Since t ∈ P (Cn , r),

t1 + · · · + (r − 1)tr−1 + rtr + (r + 1)tr+1 + · · · + (n − 1)tn−1 ≡ r (mod n).

Thus there exists some β ∈ Z such that

t1 + · · · + (r − 1)tr−1 + rtr + (r + 1)tr+1 + · · · + (n − 1)tn−1 = r + βn

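and since $r > 0$, we may rewrite this as
$$\frac{1}{r} t_1 + \cdots + \frac{r-1}{r} t_{r-1} + t_r + \frac{r+1}{r} t_{r+1} + \cdots + \frac{n-1}{r} t_{n-1} = 1 + \beta \frac{n}{r}. \tag{8}$$
Now, $t \in P_{MIC}(n, r)$, therefore
$$\frac{1}{r} t_1 + \cdots + \frac{r-1}{r} t_{r-1} + t_r + \frac{n-r-1}{n-r} t_{r+1} + \cdots + \frac{1}{n-r} t_{n-1} = 1$$
or
$$\frac{1}{r} t_1 + \cdots + \frac{r-1}{r} t_{r-1} + t_r = 1 - \Big[ \frac{n-r-1}{n-r} t_{r+1} + \cdots + \frac{1}{n-r} t_{n-1} \Big]. \tag{9}$$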
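Substituting (9) into (8), we obtain
$$\begin{aligned}
& 1 - \Big[ \tfrac{n-r-1}{n-r} t_{r+1} + \cdots + \tfrac{1}{n-r} t_{n-1} \Big] + \tfrac{r+1}{r} t_{r+1} + \cdots + \tfrac{n-1}{r} t_{n-1} = 1 + \beta \tfrac{n}{r} \\
\Rightarrow\ & \Big( \tfrac{r+1}{r} - \tfrac{n-r-1}{n-r} \Big) t_{r+1} + \cdots + \Big( \tfrac{n-1}{r} - \tfrac{1}{n-r} \Big) t_{n-1} = \beta \tfrac{n}{r} \\
\Rightarrow\ & \tfrac{n}{r} \cdot \tfrac{1}{n-r} t_{r+1} + \cdots + \tfrac{n}{r} \cdot \tfrac{n-r-1}{n-r} t_{n-1} = \beta \tfrac{n}{r} \\
\Rightarrow\ & \tfrac{1}{n-r} t_{r+1} + \cdots + \tfrac{n-r-1}{n-r} t_{n-1} = \beta \\
\Rightarrow\ & \underbrace{[t_{r+1} + \cdots + t_{n-1}]}_{(*)} - \underbrace{\Big[ \tfrac{n-r-1}{n-r} t_{r+1} + \cdots + \tfrac{1}{n-r} t_{n-1} \Big]}_{(**)} = \beta.
\end{aligned}$$
Because $t$ was assumed to be integral, $(*)$ is necessarily integral. Suppose
conversely that both (I) and (II) hold; by the assumption that $t_r = 0$ and because
$t$ is necessarily non-negative, the relation
$$\frac{n-r-1}{n-r} t_{r+1} + \cdots + \frac{1}{n-r} t_{n-1} = 1 - \Big[ \frac{1}{r} t_1 + \cdots + \frac{r-1}{r} t_{r-1} + t_r \Big]$$
implies that $(**)$ must be fractional. But this contradicts that $\beta$ is integral.
Therefore (I) and (II) cannot simultaneously hold.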
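
Theorem 5 lends itself to a brute-force sanity check for small parameters. The Python sketch below is only an illustration under stated assumptions: all function names are introduced here, and the enumeration covers the non-negative integer solutions of the group equation that satisfy $\mu t = 1$ (the points to which the proof above applies). It compares this set with the set of extended integer knapsack solutions.

```python
from fractions import Fraction
from itertools import product


def mic_coefficients(n, r):
    """Coefficients mu_1..mu_{n-1} of the mixed integer cut for P(C_n, r)."""
    return [Fraction(i, r) if i <= r else Fraction(n - i, n - r)
            for i in range(1, n)]


def group_points_on_cut(n, r):
    """Non-negative integer t with sum i*t_i = r (mod n) and mu . t = 1."""
    mu = mic_coefficients(n, r)
    # Every mu_i is positive, so mu . t = 1 forces t_i <= 1 / mu_i.
    boxes = [range(int(1 / m) + 1) for m in mu]
    return {t for t in product(*boxes)
            if sum(m * ti for m, ti in zip(mu, t)) == 1
            and sum(i * ti for i, ti in enumerate(t, start=1)) % n == r}


def knapsack_solutions(m):
    """Non-negative integer x of length m with sum i*x_i = m."""
    def rec(i, remaining):
        if i == 0:
            if remaining == 0:
                yield ()
            return
        for xi in range(remaining // i + 1):
            for rest in rec(i - 1, remaining - i * xi):
                yield rest + (xi,)
    return list(rec(m, m))


def extensions(n, r):
    """Extended solutions of P(K_r) (padded right) and P(K_{n-r}) (reversed, padded left)."""
    left = {x + (0,) * (n - 1 - r) for x in knapsack_solutions(r)}
    right = {(0,) * (r - 1) + x[::-1] for x in knapsack_solutions(n - r)}
    return left | right


for n, r in [(7, 3), (8, 4), (9, 2)]:
    assert group_points_on_cut(n, r) == extensions(n, r)
print("Theorem 5 confirmed for the tested (n, r) pairs")
```

The only point shared by the two families of extensions is $e_r$, which is why the proof treats it separately.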

4 Extensions under Automorphisms and Homomorphisms

Here we review some general properties of facets of the master group polyhedra
and discuss extensions of our previous results. Throughout, some basic knowledge
of algebra is assumed.
Let $G$ be an abelian group with identity 0, $G_+ = G \setminus 0$, and $g_0 \in G_+$. The
master group polyhedron $P(G, g_0)$ is defined by
$$P(G, g_0) = \mathrm{conv}\Big\{ t \in \mathbb{Z}_+^{|G|-1} : \sum_{g \in G_+} g\, t(g) = g_0 \Big\}.$$
Because $|G| g = 0$ for all $g \in G_+$, the recession cone of $P(G, g_0)$ is the
non-negative orthant, and since $P(G, g_0)$ is nonempty, the polyhedron is of full dimension.
As before, let $(\pi, \pi_0)$ denote the inequality
$$\sum_{g \in G_+} \pi(g) t(g) \ge \pi_0.$$
If $|G| - 1 \ge 2$, then the inequality $t(g) \ge 0$ is facet defining for all $g \in G_+$,
and it is easily verified that these are the only facets with $\pi_0 = 0$. Likewise, we call
these the trivial facets of $P(G, g_0)$.
4.1 Automorphisms

We are able to use automorphisms of G to obtain facets of P (G, g0 ) from other


master group polyhedra. Throughout, let φ be an automorphism of G.
Theorem 6 (Gomory [3], Theorem 14). If $(\pi, \pi_0)$ is a facet of $P(G, g_0)$,
with components $\pi(g)$, then $(\pi', \pi_0)$ with components
$\pi'(g) = \pi(\phi^{-1}(g))$ is a facet of $P(G, \phi(g_0))$.
Similarly if $t$ satisfies $(\pi, \pi_0)$ at equality, then $t'$ with components
$t'(g) = t(\phi^{-1}(g))$ satisfies $(\pi', \pi_0)$ at equality, and since $\phi$ is an
automorphism of $G$, $t'$ necessarily satisfies the group equation for $P(G, \phi(g_0))$.
As an obvious consequence, a point $t$ lies on the facet $(\pi, \pi_0)$ of $P(G, g_0)$ if
and only if the corresponding point $t'$ lies on the facet $(\pi', \pi_0)$ of
$P(G, \phi(g_0))$. Hence we obtain the following:
Proposition 4. If $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ are facets of $P(G, g_0)$, then
$(\gamma, \gamma_0)$ is adjacent to $(\pi, \pi_0)$ if and only if $(\pi', \pi_0)$ and
$(\gamma', \gamma_0)$ are adjacent facets of $P(G, \phi(g_0))$, where
$\pi'(g) = \pi(\phi^{-1}(g))$ and $\gamma'(g) = \gamma(\phi^{-1}(g))$.

Proof. Since $(\gamma, \gamma_0)$ and $(\pi, \pi_0)$ define adjacent facets, there exist
affinely independent points $t^1, \dots, t^{|G|-2}$ satisfying both at equality. By the
previous remarks, we may define points $(t^1)', \dots, (t^{|G|-2})'$ satisfying both
$(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ at equality. Since these are all defined by the
same permutation of the indices of $t^1, \dots, t^{|G|-2}$, affine independence is preserved.
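
For the cyclic group $C_n$, every automorphism has the form $\phi(g) = ag \bmod n$ with $\gcd(a, n) = 1$, so the index permutation used above is easy to write down. The following Python sketch is illustrative: the helper name `apply_automorphism` and the chosen values of $n$, $r$, $a$ are assumptions made here. It transports both an inequality and a point, and confirms that tightness and the group equation carry over.

```python
from fractions import Fraction
from math import gcd


def apply_automorphism(vec, n, a):
    """Return vec' with vec'(phi(g)) = vec(g) for phi(g) = a*g mod n.

    vec is indexed by g = 1, ..., n-1 and stored as vec[g - 1]; equivalently,
    vec'(g) = vec(phi^{-1}(g)).
    """
    if gcd(a, n) != 1:
        raise ValueError("a must be coprime to n for phi to be an automorphism")
    out = [None] * (n - 1)
    for g in range(1, n):
        out[(a * g) % n - 1] = vec[g - 1]
    return out


n, r, a = 7, 3, 3
mu = [Fraction(i, r) if i <= r else Fraction(n - i, n - r) for i in range(1, n)]
t = [1, 1, 0, 0, 0, 0]            # t_1 = t_2 = 1 lies on the mixed integer cut

mu_img = apply_automorphism(mu, n, a)
t_img = apply_automorphism(t, n, a)

lhs = sum(m * ti for m, ti in zip(mu_img, t_img))
residue = sum(g * tg for g, tg in enumerate(t_img, start=1)) % n
print(lhs, residue, (a * r) % n)  # tight, and the residue equals phi(r)
```

Because both vectors are permuted by the same map, the inner product $\pi' t'$ equals $\pi t$ term by term.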

Now consider the case when $G = C_n$, $g_0 = r$. Let $(\mu', 1)$ be obtained by applying
$\phi$ to $(\mu, 1)$. Our previous results extend in the following sense:
Theorem 7. The non-trivial facets of $P(C_n, \phi(r))$ adjacent to $(\mu', 1)$ are exactly
those obtained by applying $\phi$ to tilted knapsack facets.

Theorem 8. An integer point $t' \in P(C_n, \phi(r))$ satisfies $(\mu', 1)$ at equality if and
only if $t'$ is obtained by extending a knapsack solution of $P(K_r)$ or $P(K_{n-r})$ and
applying $\phi$ to the indices of $t$.

4.2 Homomorphisms

Additionally one can obtain facets from homomorphisms of $G$ by homomorphic
lifting. Let $\psi : G \to H$ be a homomorphism with kernel $K$ such that $g_0 \notin K$. For
convenience let $h_0 = \psi(g_0)$.
Theorem 9 (Gomory [3], Theorem 19). Let $(\pi, \pi_0)$ be a non-trivial facet
of $P(H, h_0)$. Then $(\pi', \pi_0)$ is a facet of $P(G, g_0)$ where
$\pi'(g) = \pi(\psi(g))$ for all $g \in G \setminus K$, and $\pi'(k) = 0$ for all $k \in K$.
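
As a concrete instance of homomorphic lifting, take the cyclic groups $C_{n'}$ and $C_n$ with $n$ dividing $n'$ and $\psi(g) = g \bmod n$. The Python sketch below is a minimal illustration under these assumptions; the names `lift`, `n_big` and `r_big` are introduced here and are not from the paper. It lifts the mixed integer cut and checks by enumeration over a small box that the lifted inequality is not violated by any solution of the lifted group equation (a validity check only, not a facet proof).

```python
from fractions import Fraction
from itertools import product


def mic_coefficients(n, r):
    """Coefficients mu_1..mu_{n-1} of the mixed integer cut for P(C_n, r)."""
    return [Fraction(i, r) if i <= r else Fraction(n - i, n - r)
            for i in range(1, n)]


def lift(pi, n, n_big):
    """Lift pi on C_n to C_{n_big} via psi(g) = g mod n; kernel elements get 0."""
    if n_big % n != 0:
        raise ValueError("n must divide n_big for psi to be a homomorphism")
    return [pi[g % n - 1] if g % n != 0 else Fraction(0)
            for g in range(1, n_big)]


n, r = 4, 2            # target group C_4 with right-hand side r
n_big, r_big = 8, 6    # source group C_8; psi(6) = 2 = r
mu_lifted = lift(mic_coefficients(n, r), n, n_big)

# Enumerate s in {0, 1, 2}^7 solving the group equation of P(C_8, 6)
# and collect those violating the lifted inequality.
violations = [
    s for s in product(range(3), repeat=n_big - 1)
    if sum(g * sg for g, sg in enumerate(s, start=1)) % n_big == r_big
    and sum(c * sg for c, sg in zip(mu_lifted, s)) < 1
]
print([str(c) for c in mu_lifted], len(violations))
```

The zero coefficient on the kernel element $g = 4$ reflects that lifting carries no information about the kernel.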
Unlike automorphisms, it is not clear that homomorphic lifting preserves the
adjacency of facets. We show next that it in fact does preserve adjacency.
First we prove the following useful proposition:


Proposition 5. Let $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ be adjacent non-trivial facets in
$P(H, h_0)$ ($h_0 \neq 0$). Then the affine subspace
$$T = P(H, h_0) \cap \{t \in \mathbb{R}^{|H|-1} : \pi t = \pi_0,\ \gamma t = \gamma_0\}$$
does not lie in the hyperplane $H(h) = \{t \in \mathbb{R}^{|H|-1} : t(h) = 0\}$ for any
$h \in H_+$.
Proof. In [3], Gomory shows that every non-trivial facet $(\pi, \pi_0)$ of $P(H, h_0)$
satisfies $\pi(h) + \pi(h_0 - h) = \pi(h_0) = \pi_0$. In particular for all
$h \in H_+ \setminus h_0$, the point $t = e_h + e_{h_0 - h}$ belongs to $T$, and has
$t(h) > 0$. Similarly, the point $t = e_{h_0}$ belongs to $T$ and has $t(h_0) > 0$.
Using this proposition we obtain the following:
Lemma 2. Let $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ be adjacent non-trivial facets of
$P(H, h_0)$, and let $(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ be facets of $P(G, g_0)$
obtained by homomorphic lifting using the homomorphism $\psi$. Then $(\pi', \pi_0)$ and
$(\gamma', \gamma_0)$ are adjacent.
Proof. Let $K = \ker(\psi)$. Let $\varphi$ be a function selecting one element from each
coset of $G/K$ distinct from $K$, and let $\varphi(H)$ denote the set of coset
representatives chosen by $\varphi$.
Since we are assuming that $(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ are obtained by
homomorphic lifting, $h_0 \neq 0$. Since $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ are adjacent
there exist affinely independent points $t^1, \dots, t^{|H|-2}$ in $P(H, h_0)$ satisfying
$(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ at equality. By Proposition 5, for all $h \in H_+$,
there exists an $i \in \{1, \dots, |H| - 2\}$ such that $t^i(h) > 0$.
Using these points, we will construct $|G| - 2$ affinely independent points belonging
to $P(G, g_0)$ that satisfy both $(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ at equality. We
proceed as follows:
1. Set $N = H_+$
2. For $i = 1, \dots, |H| - 2$
– Set $N(i) = \{h \in H_+ : t^i(h) > 0\} \cap N$
– Define $s^i$ as follows:
$s^i(\varphi(h)) = t^i(h)$, $\forall h \in H_+$
$s^i(g) = 0$, $g \in G \setminus (K \cup \varphi(H))$
$s^i(k) = 1$, $k = g_0 - \sum_{g \in G_+ \setminus K} s^i(g) \cdot g$
$s^i(k) = 0$, $k \neq g_0 - \sum_{g \in G_+ \setminus K} s^i(g) \cdot g$
– For each $h' \in N(i)$, $k' \in K_+$, define the point $s^i_{k',h'}$ as follows:
$s^i_{k',h'}(\varphi(h') + k') = t^i(h')$
$s^i_{k',h'}(\varphi(h')) = 0$
$s^i_{k',h'}(g) = s^i(g)$, $g \in G_+ \setminus K$, $g \neq \varphi(h')$, $g \neq \varphi(h') + k'$
$s^i_{k',h'}(k) = 1$, $k = g_0 - \sum_{g \in G_+ \setminus K} s^i_{k',h'}(g) \cdot g$
$s^i_{k',h'}(k) = 0$, $k \neq g_0 - \sum_{g \in G_+ \setminus K} s^i_{k',h'}(g) \cdot g$
– Set $N = N \setminus N(i)$
3. For each $k \in K_+$, define $s_k$ by $s_k = s^1 + |G| e_k$
By construction these points satisfy $(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ at equality.
It remains to verify that the above procedure indeed produces $|G| - 2$ affinely
independent points belonging to $P(G, g_0)$.
First we show that the above points belong to $P(G, g_0)$. Let $s$ be one of the
above points. Then
$$\psi\Big( \sum_{g \in G_+ \setminus K} g\, s(g) \Big) = \sum_{g \in G_+ \setminus K} \psi(g)\, s(g) = \sum_{h \in H_+} h \cdot \Big( \sum_{g \in G_+ : \psi(g) = h} s(g) \Big) = h_0,$$
where the first equality comes from the fact that $\psi$ is a homomorphism and the
second equality follows by how we defined the above points. Therefore,
$$\sum_{g \in G_+ \setminus K} g\, s(g) \in g_0 + K,$$
and by construction,
$$\sum_{k \in K} k\, s(k) = g_0 - \Big( \sum_{g \in G_+ \setminus K} g\, s(g) \Big).$$
Thus $s \in P(G, g_0)$.
Note that we have the $|H| - 2$ points $s^1, \dots, s^{|H|-2}$. By Proposition 5, we
obtain $(|H| - 1)(|K| - 1)$ points of the form $s_{k,h}$ for $k \in K_+$ and $h \in H_+$, and
lastly, we obtain $|K| - 1$ points $s_k$ for $k \in K_+$. Using the identity
$|G| = |K||H|$, it immediately follows that we have $|G| - 2$ points.
The affine independence of these points is easily verified. By constructing a
matrix for which the first $|K| - 1$ columns correspond to $K$, the next $|H| - 1$
columns correspond to $\varphi(H)$, and the remaining columns are arranged in
blocks by the cosets, it is readily observed by letting each row be one of the
above points and using the affine independence of $t^1, \dots, t^{|H|-2}$ that the newly
defined points are affinely independent.
Given a point $s \in P(G, g_0)$ that satisfies the lifted facets at equality, we can
obtain a point $t \in P(H, h_0)$ that satisfies $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ at
equality under the mapping $t(h) = \sum_{g \in G : \psi(g) = h} s(g)$. By a fairly routine
exercise in linear algebra, one can use this to verify that $s$ is in the affine hull of the
points described above. Hence we obtain the following theorem:
Theorem 10. Let $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ be non-trivial facets of $P(H, h_0)$,
and let $(\pi', \pi_0)$ and $(\gamma', \gamma_0)$ be facets of $P(G, g_0)$ obtained by
homomorphic lifting using the homomorphism $\psi$. Then $(\pi', \pi_0)$ and
$(\gamma', \gamma_0)$ are adjacent if and only if $(\pi, \pi_0)$ and $(\gamma, \gamma_0)$ are
adjacent.
Now consider $G = C_{n'}$, $g_0 = r'$, a homomorphism $\psi : C_{n'} \to C_n$ with
$\psi(r') = r \neq 0$, and let $(\mu', 1)$ be obtained by applying homomorphic lifting to
$(\mu, 1)$. Similarly by applying Theorem 10, we know that the only lifted facets under
$\psi$ that are adjacent to $(\mu', 1)$ come from tilted knapsack facets. Stated precisely:
Theorem 11. Let $(\pi', \pi_0)$ be obtained by homomorphic lifting using $\psi$ applied
to $(\pi, \pi_0)$. Then $(\pi', \pi_0)$ is adjacent to $(\mu', 1)$ if and only if
$(\pi, \pi_0)$ is a tilted knapsack facet.
Moreover, for the integer points we obtain the following:
Theorem 12. If an integer point $s \in P(C_{n'}, r')$ satisfies $(\mu', 1)$ at equality, then
the point $t$ defined by the mapping
$$t_i = \sum_{j : \psi(j) = i} s_j$$
is an integer point of $P(C_n, r)$ and satisfies $(\mu, 1)$ at equality. In particular it is
obtained from extending a knapsack solution of $P(K_r)$ or $P(K_{n-r})$.
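
The aggregation map of Theorem 12 can be traced on a small example. The Python sketch below again assumes the homomorphism $\psi(j) = j \bmod n$ from $C_{n'}$ to $C_n$ with $n$ dividing $n'$; the helper name `aggregate` and the sample point `s` are illustrative choices made here, not taken from the paper.

```python
from fractions import Fraction


def mic_coefficients(n, r):
    """Coefficients mu_1..mu_{n-1} of the mixed integer cut for P(C_n, r)."""
    return [Fraction(i, r) if i <= r else Fraction(n - i, n - r)
            for i in range(1, n)]


def lift(pi, n, n_big):
    """Lifted coefficients on C_{n_big}: pi(j mod n), and 0 on the kernel."""
    return [pi[j % n - 1] if j % n != 0 else Fraction(0)
            for j in range(1, n_big)]


def aggregate(s, n):
    """t_i = sum of s_j over all j with psi(j) = j mod n = i, for i = 1..n-1."""
    t = [0] * (n - 1)
    for j, sj in enumerate(s, start=1):
        if j % n != 0:          # kernel elements do not contribute to t
            t[j % n - 1] += sj
    return t


n, r, n_big, r_big = 4, 2, 8, 6
mu = mic_coefficients(n, r)
mu_lifted = lift(mu, n, n_big)

s = [1, 0, 0, 0, 1, 0, 0]       # s_1 = s_5 = 1; 1 + 5 = 6 = r_big (mod 8)
t = aggregate(s, n)

print(sum(c * sj for c, sj in zip(mu_lifted, s)))          # s is tight for the lifted cut
print(t, sum(i * ti for i, ti in enumerate(t, start=1)) % n)
print(sum(c * ti for c, ti in zip(mu, t)))                 # t is tight for (mu, 1)
```

Here $t = (2, 0, 0)$ extends the solution $x = (2, 0)$ of $P(K_2)$, as the theorem predicts.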

5 Future Work and Conclusions


Several questions remain for both the group polyhedron and knapsack polytope.
One worthy avenue of research is to expand the existing library of knapsack
facets, which in turn will provide even more information about the mixed integer
cut.
Another interesting problem is to obtain non-trivial necessary and sufficient
conditions to describe the extreme points of the master knapsack polytope and
the master group polyhedron. A natural idea was considered for the group poly-
hedron in terms of irreducibility. This condition is necessary for all vertices, but
insufficient. One might hope that this condition becomes sufficient for the master
knapsack polytope; however, it again fails.
Lastly, a closer inspection will reveal that in homomorphic lifting we gain no
information about the kernel of our homomorphism. If we consider the lifted
mixed integer cut as a polyhedron, it is no longer sufficient to characterize its
extreme points in terms of two related knapsacks. Similarly, it is easy to see that
lifted tilted knapsack facets are not the only adjacent non-trivial facets of the
lifted mixed integer cut. One might address whether there exists a family of facets
that when added to the lifted tilted knapsack facets completely characterizes the
adjacent facets of the lifted mixed integer cut.

References
1. Aráoz, J., Evans, L., Gomory, R.E., Johnson, E.L.: Cyclic group and knapsack facets.
Math. Program. 96(2), 377–408 (2003)
2. Dash, S., Günlük, O.: On the strength of gomory mixed-integer cuts as group cuts.
Math. Program. 115(2), 387–407 (2008)
3. Gomory, R.E.: Some polyhedra related to combinatorial problems. Linear Algebra
and Its Applications (2), 451–558 (1969)
4. Gomory, R.E., Johnson, E.L., Evans, L.: Corner polyhedra and their connection
with cutting planes. Math. Program. 96(2), 321–339 (2003)
Symmetry Matters for the Sizes of Extended
Formulations

Volker Kaibel, Kanstantsin Pashkovich, and Dirk O. Theis

Otto-von-Guericke-Universität Magdeburg, Institut für Mathematische Optimierung


Universitätsplatz 2, 39108 Magdeburg, Germany
{kaibel,pashkovich,theis}@ovgu.de

Abstract. In 1991, Yannakakis [17] proved that no symmetric extended


formulation for the matching polytope of the complete graph Kn with n
nodes has a number of variables and constraints that is bounded subex-
ponentially in n. Here, symmetric means that the formulation remains
invariant under all permutations of the nodes of Kn . It was also conjec-
tured in [17] that “asymmetry does not help much,” but no correspond-
ing result for general extended formulations has been found so far. In
this paper we show that for the polytopes associated with the matchings
in Kn with log n edges there are non-symmetric extended formulations
of polynomial size, while nevertheless no symmetric extended formula-
tion of polynomial size exists. We furthermore prove similar statements
for the polytopes associated with cycles of length log n. Thus, with
respect to the question for smallest possible extended formulations, in
general symmetry requirements may matter a lot.

1 Introduction
Linear Programming techniques have proven to be extremely fruitful for combinatorial optimization problems with respect to both structural analysis and the design of algorithms. In this context, the paradigm is to represent the problem by a polytope $P \subseteq \mathbb{R}^m$ whose vertices correspond to the feasible solutions of the problem in such a way that the objective function can be expressed by a linear functional $x \mapsto \langle c, x \rangle$ on $\mathbb{R}^m$ (with some $c \in \mathbb{R}^m$). If one succeeds in finding a description of $P$ by means of linear constraints, then algorithms as well as structural results from Linear Programming can be exploited. In many cases, however, the polytope $P$ has exponentially (in $m$) many facets, thus $P$ can only be described by exponentially many inequalities. Also it may be that the inequalities needed to describe $P$ are too complicated to be identified.
In some of these cases one may find an extended formulation for $P$, i.e., a (preferably small and simple) description by linear constraints of another polyhedron $Q \subseteq \mathbb{R}^d$ in some higher dimensional space that projects to $P$ via some (simple) linear map $p : \mathbb{R}^d \to \mathbb{R}^m$ with $p(y) = Ty$ for all $y \in \mathbb{R}^d$ (and some matrix $T \in \mathbb{R}^{m \times d}$). Indeed, if $p^* : \mathbb{R}^m \to \mathbb{R}^d$ with $p^*(x) = T^t x$ for all $x \in \mathbb{R}^m$ denotes the linear map that is adjoint to $p$ (with respect to the standard bases), then we have $\max\{\langle c, x \rangle : x \in P\} = \max\{\langle p^*(c), y \rangle : y \in Q\}$.
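The identity between the two maxima can be spelled out in one chain. The following sketch uses only $p(Q) = P$, $p(y) = Ty$, and the adjoint map $p^*(x) = T^t x$ from the paragraph above:

```latex
\max\{\langle c, x\rangle : x \in P\}
  = \max\{\langle c, p(y)\rangle : y \in Q\}
  = \max\{\langle c, Ty\rangle : y \in Q\}
  = \max\{\langle T^{t}c, y\rangle : y \in Q\}
  = \max\{\langle p^{*}(c), y\rangle : y \in Q\}.
```

In other words, optimizing $c$ over $P$ amounts to optimizing the lifted objective $T^t c$ over $Q$.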

F. Eisenbrand and B. Shepherd (Eds.): IPCO 2010, LNCS 6080, pp. 135–148, 2010.

© Springer-Verlag Berlin Heidelberg 2010

As an example, let us consider the spanning tree polytope $P_{\mathrm{spt}}(n) = \operatorname{conv}\{\chi(T) \in \{0,1\}^{E_n} : T \subseteq E_n \text{ spanning tree of } K_n\}$, where $K_n = ([n], E_n)$ denotes the complete graph with node set $[n] = \{1, \dots, n\}$ and edge set $E_n = \{\{v,w\} : v, w \in [n], v \ne w\}$, and $\chi(A) \in \{0,1\}^B$ is the characteristic vector of the subset $A \subseteq B$ of $B$, i.e., for all $b \in B$, we have $\chi(A)_b = 1$ if and only if $b \in A$. Thus, $P_{\mathrm{spt}}(n)$ is the polytope associated with the bases of the graphical matroid of $K_n$, and hence (see [7]), it consists of all $x \in \mathbb{R}^{E_n}_+$ satisfying $x(E_n) = n - 1$ and $x(E_n(S)) \le |S| - 1$ for all $S \subseteq [n]$ with $2 \le |S| \le n - 1$, where $\mathbb{R}^{E_n}_+$ is the nonnegative orthant of $\mathbb{R}^{E_n}$, we denote by $E_n(S)$ the subset of all edges with both nodes in $S$, and $x(F) = \sum_{e \in F} x_e$ for $F \subseteq E_n$. This linear description of $P_{\mathrm{spt}}(n)$ has an exponential (in $n$) number of constraints, and as all the inequalities define pairwise disjoint facets, none of them is redundant.
The following much smaller extended formulation for $P_{\mathrm{spt}}(n)$ (with $O(n^3)$ variables and constraints) appears in [5] (and a similar one in [17], who attributes it to [13]). Let us introduce additional 0/1-variables $z_{e,v,u}$ for all $e \in E_n$, $v \in e$, and $u \in [n] \setminus e$. While each spanning tree $T \subseteq E_n$ is represented by its characteristic vector $x^{(T)} = \chi(T)$ in $P_{\mathrm{spt}}(n)$, in the extended formulation it will be represented by the vector $y^{(T)} = (x^{(T)}, z^{(T)})$ with $z^{(T)}_{e,v,u} = 1$ (for $e \in E_n$, $v \in e$, $u \in [n] \setminus e$) if and only if $e \in T$ and $u$ is contained in the component of $v$ in $T \setminus e$. The polyhedron $Q_{\mathrm{spt}}(n) \subseteq \mathbb{R}^d$ defined by the nonnegativity constraints $x \ge 0$, $z \ge 0$, the equations $x(E_n) = n - 1$, $x_{\{v,w\}} - z_{\{v,w\},v,u} - z_{\{v,w\},w,u} = 0$ for all pairwise distinct $v, w, u \in [n]$, as well as $x_{\{v,w\}} + \sum_{u \in [n] \setminus \{v,w\}} z_{\{v,u\},u,w} = 1$ for all distinct $v, w \in [n]$, satisfies $p(Q_{\mathrm{spt}}(n)) = P_{\mathrm{spt}}(n)$, where $p : \mathbb{R}^d \to \mathbb{R}^{E_n}$ is the orthogonal projection onto the $x$-variables.
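As a sanity check of the construction above, the following Python sketch enumerates all spanning trees of $K_4$ by brute force, builds the lifted vector $y^{(T)} = (x^{(T)}, z^{(T)})$ for each tree, and tests the three families of equations. The function names (`component`, `spanning_trees`, `lift`, `satisfies_formulation`) are illustrative choices and do not come from the paper. The check covers only one inclusion: every lifted tree lies in $Q_{\mathrm{spt}}(n)$. It does not verify that $Q_{\mathrm{spt}}(n)$ projects onto $P_{\mathrm{spt}}(n)$.

```python
from itertools import combinations, permutations

def component(edges, start):
    """Nodes reachable from `start` using the given edges (2-element frozensets)."""
    seen, stack = {start}, [start]
    while stack:
        a = stack.pop()
        for e in edges:
            if a in e:
                (b,) = e - {a}
                if b not in seen:
                    seen.add(b)
                    stack.append(b)
    return seen

def spanning_trees(n):
    """All spanning trees of K_n as sets of frozenset edges (brute force)."""
    nodes = range(1, n + 1)
    all_edges = [frozenset(e) for e in combinations(nodes, 2)]
    for cand in combinations(all_edges, n - 1):
        if len(component(cand, 1)) == n:   # n-1 edges and connected => tree
            yield set(cand)

def lift(tree, n):
    """Return (x, z): x = chi(T), z[e, v, u] = 1 iff e in T and u lies in v's component of T minus e."""
    nodes = range(1, n + 1)
    x = {frozenset(e): int(frozenset(e) in tree) for e in combinations(nodes, 2)}
    z = {}
    for e in x:
        rest = tree - {e}
        for v in e:
            comp_v = component(rest, v) if e in tree else set()
            for u in nodes:
                if u not in e:
                    z[e, v, u] = int(u in comp_v)
    return x, z

def satisfies_formulation(x, z, n):
    """Check the equations of the extended formulation (nonnegativity is automatic here)."""
    nodes = range(1, n + 1)
    if sum(x.values()) != n - 1:
        return False
    for v, w, u in permutations(nodes, 3):
        e = frozenset((v, w))
        if x[e] - z[e, v, u] - z[e, w, u] != 0:
            return False
    for v, w in permutations(nodes, 2):
        total = x[frozenset((v, w))]
        total += sum(z[frozenset((v, u)), u, w] for u in nodes if u not in (v, w))
        if total != 1:
            return False
    return True

trees = list(spanning_trees(4))
all_ok = all(satisfies_formulation(*lift(T, 4), 4) for T in trees)
```

The second family of equations says that a node $u$ outside a tree edge lies on exactly one side of that edge. The third says that, seen from $v$, the node $w$ is either adjacent to $v$ or reached through exactly one tree edge at $v$.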
For many other polytopes (with exponentially many facets) associated with
polynomial time solvable combinatorial optimization problems polynomially sized
extended formulations can be constructed as well (see, e.g., the recent survey [5]).
Probably the most prominent problem in this class for which, however, no such
small formulation is known, is the matching problem. In fact, Yannakakis [17]
proved that no symmetric polynomially sized extended formulation of the match-
ing polytope exists.
Here, symmetric refers to the symmetric group S(n) of all permutations
π : [n] → [n] of the node set [n] of Kn acting on En via π.{v, w} = {π(v), π(w)}
for all π ∈ S(n) and {v, w} ∈ En . Clearly, this action of S(n) on En induces
an action on the set of all subsets of En . For instance, this yields an action
on the spanning trees of Kn , and thus, on the vertices of Pspt (n). The ex-
tended formulation of Pspt (n) discussed above is symmetric in the sense that,
for every π ∈ S(n), replacing all indices associated with edges e ∈ En and
nodes v ∈ [n] by π.e and π.v, respectively, does not change the set of constraints
in the formulation. Phrased informally, all subsets of nodes of Kn of equal cardi-
nality play the same role in the formulation. For a general definition of symmetric
extended formulations see Section 2.
In order to describe the main results of Yannakakis’ paper [17] and the contributions of the present paper, let us denote by $\mathcal{M}^{\ell}(n) = \{M \subseteq E_n : M \text{ matching in } K_n, |M| = \ell\}$ the set of all matchings of size $\ell$ (a matching being a subset of edges no two of which share a node), and by $P^{\ell}_{\mathrm{match}}(n) = \operatorname{conv}\{\chi(M) \in \{0,1\}^{E_n} : M \in \mathcal{M}^{\ell}(n)\}$ the associated polytope. According to Edmonds [6] the perfect matching polytope $P^{n/2}_{\mathrm{match}}(n)$ (for even $n$) is described by

$$P^{n/2}_{\mathrm{match}}(n) = \{x \in \mathbb{R}^{E_n}_+ : x(\delta(v)) = 1 \text{ for all } v \in [n],\ x(E_n(S)) \le (|S| - 1)/2 \text{ for all } S \subseteq [n],\ 3 \le |S| \text{ odd}\} \qquad (1)$$

(with $\delta(v) = \{e \in E_n : v \in e\}$). Yannakakis [17, Thm. 1 and its proof] shows that there is a constant $C > 0$ such that, for every extended formulation for $P^{n/2}_{\mathrm{match}}(n)$ (with $n$ even) that is symmetric in the sense above, the number of variables and constraints is at least $C \cdot \binom{n}{\lfloor n/4 \rfloor} = 2^{\Omega(n)}$. This in particular implies that there is no polynomial size symmetric extended formulation for the matching polytope of $K_n$ (the convex hull of the characteristic vectors of all matchings in $K_n$), of which the perfect matching polytope is a face.
Yannakakis [17] also obtains a similar (maybe less surprising) result on traveling salesman polytopes. Denoting the set of all (simple) cycles of length $\ell$ in $K_n$ by $\mathcal{C}^{\ell}(n) = \{C \subseteq E_n : C \text{ cycle in } K_n, |C| = \ell\}$, and the associated polytopes by $P^{\ell}_{\mathrm{cycl}}(n) = \operatorname{conv}\{\chi(C) \in \{0,1\}^{E_n} : C \in \mathcal{C}^{\ell}(n)\}$, the traveling salesman polytope is $P^{n}_{\mathrm{cycl}}(n)$. Identifying $P^{n/2}_{\mathrm{match}}(n)$ (for even $n$) with a suitable face of $P^{3n}_{\mathrm{cycl}}(3n)$, Yannakakis concludes that all symmetric extended formulations for $P^{n}_{\mathrm{cycl}}(n)$ have size at least $2^{\Omega(n)}$ as well [17, Thm. 2 and its proof].
Yannakakis’ results in a fascinating way illuminate the borders of our principal
abilities to express combinatorial optimization problems like the matching or the
traveling salesman problem by means of linear constraints. However, they only
refer to linear descriptions that respect the inherent symmetries in the problems.
In fact, the second open problem mentioned in the concluding section of [17] is
described as follows: “We do not think that asymmetry helps much. Thus, prove
that the matching and TSP polytopes cannot be expressed by polynomial size
LP’s without the asymmetry assumption.”
The contribution of our paper is to show that, in contrast to the assumption expressed in the quotation above, asymmetry can help much, or, phrased differently, that symmetry requirements on extended formulations indeed can matter significantly with respect to the minimal sizes of extended formulations. Our main results are that both $P^{\lfloor \log n \rfloor}_{\mathrm{match}}(n)$ and $P^{\lfloor \log n \rfloor}_{\mathrm{cycl}}(n)$ do not admit symmetric extended formulations of polynomial size, while they have non-symmetric extended formulations of polynomial size (see Cor. 1 and 2 for matchings, as well as Cor. 3 and 4 for cycles). The corresponding theorems from which these corollaries are derived provide some more general and more precise results for $P^{\ell}_{\mathrm{match}}(n)$ and $P^{\ell}_{\mathrm{cycl}}(n)$. In order to establish the lower bounds for symmetric extensions, we generalize the techniques developed by Yannakakis [17]. The constructions of the compact non-symmetric extended formulations rely on small families of perfect hash functions [1,8,15].
The paper is organized as follows. In Section 2, we provide definitions of extensions, extended formulations, their sizes, the crucial notion of a section of an extension, and we give some auxiliary results. In Section 3, we present Yannakakis’ method to derive lower bounds on the sizes of symmetric extended formulations for perfect matching polytopes in a general setting, which we then exploit in Section 4 in order to derive lower bounds on the sizes of symmetric extended formulations for the polytopes $P^{\ell}_{\mathrm{match}}(n)$ associated with cardinality restricted matchings. In Section 5, we describe our non-symmetric extended formulations for these polytopes. Finally, in Section 6 we present the results on $P^{\ell}_{\mathrm{cycl}}(n)$. Some remarks conclude the paper in Section 7.

2 Extended Formulations, Extensions, and Symmetry

An extension of a polytope $P \subseteq \mathbb{R}^m$ is a polyhedron $Q \subseteq \mathbb{R}^d$ together with a projection (i.e., a linear map) $p : \mathbb{R}^d \to \mathbb{R}^m$ with $p(Q) = P$; it is called a subspace extension if $Q$ is the intersection of an affine subspace of $\mathbb{R}^d$ and the nonnegative orthant $\mathbb{R}^d_+$. For instance, the polyhedron $Q_{\mathrm{spt}}(n)$ defined in the Introduction is a subspace extension of the spanning tree polytope $P_{\mathrm{spt}}(n)$. A (finite) system of linear equations and inequalities whose solutions are the points in an extension $Q$ of $P$ is an extended formulation for $P$. The size of an extension is the number of its facets plus the dimension of the space it lies in. The size of an extended formulation is its number of inequalities (including nonnegativity constraints, but not equations) plus its number of variables. Clearly, the size of an extended formulation is at least as large as the size of the extension it describes. Conversely, every extension is described by an extended formulation of at most its size.
Extensions or extended formulations of a family of polytopes P ⊆ Rm (for
varying m) are compact if their sizes and the encoding lengths of the coeffi-
cients needed to describe them can be bounded by a polynomial in m and the
maximal encoding length of all components of all vertices of P . Clearly, the
extension Qspt (n) of Pspt (n) from the Introduction is compact.
In our context, sections $s : X \to Q$ play a crucial role, i.e., maps that assign to every vertex $x \in X$ of $P$ some point $s(x) \in Q \cap p^{-1}(x)$ in the intersection of the polyhedron $Q$ and the fiber $p^{-1}(x) = \{y \in \mathbb{R}^d : p(y) = x\}$ of $x$ under the projection $p$. Such a section induces a bijection between $X$ and its image $s(X) \subseteq Q$, whose inverse is given by $p$. In the spanning tree example from the Introduction, the assignment $\chi(T) \mapsto y^{(T)} = (x^{(T)}, z^{(T)})$ defines such a section. Note that, in general, sections will not be induced by linear maps. In fact, if a section is induced by a linear map $s : \mathbb{R}^m \to \mathbb{R}^d$, then the intersection of $Q$ with the affine subspace of $\mathbb{R}^d$ generated by $s(X)$ is isomorphic to $P$, thus $Q$ has at least as many facets as $P$.
For a family $\mathcal{F}$ of subsets of $X$, an extension $Q \subseteq \mathbb{R}^d$ is said to be indexed by $\mathcal{F}$ if there is a bijection between $\mathcal{F}$ and $[d]$ such that (identifying $\mathbb{R}^{\mathcal{F}}$ with $\mathbb{R}^d$ via this bijection) the map $1_{\mathcal{F}} = (1_F)_{F \in \mathcal{F}} : X \to \{0,1\}^{\mathcal{F}}$ whose component functions are the characteristic functions $1_F : X \to \{0,1\}$ (with $1_F(x) = 1$ if and only if $x \in F$), is a section for the extension, i.e., $1_{\mathcal{F}}(X) \subseteq Q$ and $p(1_{\mathcal{F}}(x)) = x$ hold for all $x \in X$. For instance, the extension $Q_{\mathrm{spt}}(n)$ of $P_{\mathrm{spt}}(n)$ is indexed by the family $\{\mathcal{T}(e) : e \in E_n\} \cup \{\mathcal{T}(e,v,u) : e \in E_n, v \in e, u \in [n] \setminus e\}$, where $\mathcal{T}(e)$ contains all spanning trees using edge $e$, and $\mathcal{T}(e,v,u)$ consists of all spanning trees $T$ in $\mathcal{T}(e)$ for which $u$ and $v$ are in the same component of $T \setminus \{e\}$.
In order to define the notion of symmetry of an extension precisely, let the group $S(d)$ of all permutations of $[d] = \{1, \dots, d\}$ act on $\mathbb{R}^d$ by coordinate permutations. Thus we have $(\sigma.y)_j = y_{\sigma^{-1}(j)}$ for all $y \in \mathbb{R}^d$, $\sigma \in S(d)$, and $j \in [d]$.
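The inverse in the index is what makes coordinate permutation a (left) group action: the entry at position $i$ moves to position $\sigma(i)$. A minimal Python sketch, with 0-based indices and the hypothetical helpers `act` and `compose` introduced only for illustration, checks this on $S(3)$:

```python
from itertools import permutations

def act(sigma, y):
    """Coordinate permutation: result[j] = y[sigma^{-1}(j)], where sigma[i] is the image of i."""
    out = [None] * len(y)
    for i, yi in enumerate(y):
        out[sigma[i]] = yi          # the entry at position i moves to position sigma(i)
    return tuple(out)

def compose(sigma, tau):
    """Composition (sigma tau)(i) = sigma(tau(i))."""
    return tuple(sigma[tau[i]] for i in range(len(tau)))

y = ("a", "b", "c")
sigma = (1, 2, 0)                   # 0 -> 1, 1 -> 2, 2 -> 0
moved = act(sigma, y)

# Group action property: (sigma tau).y == sigma.(tau.y) for all sigma, tau in S(3)
is_action = all(
    act(compose(s, t), y) == act(s, act(t, y))
    for s in permutations(range(3))
    for t in permutations(range(3))
)
```

Without the inverse (i.e., with `out[i] = y[sigma[i]]`) the composition rule would hold only in reversed order.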
Let $P \subseteq \mathbb{R}^m$ be a polytope and $G$ be a group acting on $\mathbb{R}^m$ with $\pi.P = P$ for all $\pi \in G$, i.e., the action of $G$ on $\mathbb{R}^m$ induces an action of $G$ on the set $X$ of vertices of $P$. An extension $Q \subseteq \mathbb{R}^d$ of $P$ with projection $p : \mathbb{R}^d \to \mathbb{R}^m$ is symmetric (with respect to the action of $G$), if for every $\pi \in G$ there is a permutation $\kappa_\pi \in S(d)$ with $\kappa_\pi.Q = Q$ and

$$p(\kappa_\pi.y) = \pi.p(y) \quad \text{for all } y \in \mathbb{R}^d. \qquad (2)$$

The prime examples of symmetric extensions arise from extended formulations that “look symmetric”. To be more precise, we define an extended formulation $A^{=} y = b^{=}$, $A^{\le} y \le b^{\le}$ describing the polyhedron $Q = \{y \in \mathbb{R}^d : A^{=} y = b^{=}, A^{\le} y \le b^{\le}\}$ extending $P \subseteq \mathbb{R}^m$ as above to be symmetric (with respect to the action of $G$ on the set $X$ of vertices of $P$), if for every $\pi \in G$ there is a permutation $\kappa_\pi \in S(d)$ satisfying (2) and there are two permutations $\varrho^{=}_\pi$ and $\varrho^{\le}_\pi$ of the rows of $(A^{=}, b^{=})$ and $(A^{\le}, b^{\le})$, respectively, such that the corresponding simultaneous permutations of the columns and the rows of the matrices $(A^{=}, b^{=})$ and $(A^{\le}, b^{\le})$ leave them unchanged. Clearly, in this situation the permutations $\kappa_\pi$ satisfy $\kappa_\pi.Q = Q$, which implies the following.
Lemma 1. Every symmetric extended formulation describes a symmetric
extension.
One example of a symmetric extended formulation is the extended formulation
for the spanning tree polytope described in the Introduction (with respect to
the group G of all permutations of the nodes of the complete graph).
For the proof of the central result on the non-existence of certain symmetric
subspace extensions (Theorem 1), a weaker notion of symmetry will be sufficient.
We call an extension as above weakly symmetric (with respect to the action
of G) if there is a section s : X → Q for which the action of G on s(X)
induced by the bijection s works by permutation of variables, i.e., for every
π ∈ G there is a permutation κπ ∈ S(d) with s(π.x) = κπ .s(x) for all x ∈ X.
The following statement (and its proof, for which we refer to [12]) generalizes
the construction of sections for symmetric extensions of matching polytopes
described in Yannakakis’ paper [17, Claim 1 in the proof of Thm. 1].
Lemma 2. Every symmetric extension is weakly symmetric.
Finally, the following result (again, we refer to [12] for a proof) will turn out to
be useful in order to derive lower bounds on the sizes of symmetric extensions
for one polytope from bounds for another one.

Lemma 3. Let $Q \subseteq \mathbb{R}^d$ be an extension of the polytope $P \subseteq \mathbb{R}^m$ with projection $p : \mathbb{R}^d \to \mathbb{R}^m$, and let the face $P'$ of $P$ be an extension of a polytope $R \subseteq \mathbb{R}^k$ with projection $q : \mathbb{R}^m \to \mathbb{R}^k$. Then the face $Q' = p^{-1}(P') \cap Q \subseteq \mathbb{R}^d$ of $Q$ is an extension of $R$ via the composed projection $q \circ p : \mathbb{R}^d \to \mathbb{R}^k$.
If the extension $Q$ of $P$ is symmetric with respect to an action of a group $G$ on $\mathbb{R}^m$ (with $\pi.P = P$ for all $\pi \in G$), and a group $H$ acts on $\mathbb{R}^k$ such that, for every $\tau \in H$, we have $\tau.R = R$, and there is some $\pi_\tau \in G$ with $\pi_\tau.P' = P'$ and $q(\pi_\tau.x) = \tau.q(x)$ for all $x \in \mathbb{R}^m$, then the extension $Q'$ of $R$ is symmetric (with respect to the action of the group $H$).

3 Yannakakis’ Method
Here, we provide an abstract view on the method used by Yannakakis [17] in or-
der to bound from below the sizes of symmetric extensions for perfect matching
polytopes, without referring to these concrete polytopes. That method is capable
of establishing lower bounds on the number of variables of weakly symmetric
subspace extensions of certain polytopes. By the following lemma, which is ba-
sically Step 1 in the proof of [17, Theorem 1], such bounds imply similar lower
bounds on the dimension of the ambient space and the number of facets for
general symmetric extensions (that are not necessarily subspace extensions).

Lemma 4. If, for a polytope $P$, there is a symmetric extension in $\mathbb{R}^{\tilde{d}}$ with $f$ facets, then $P$ has also a symmetric subspace extension in $\mathbb{R}^d$ with $d \le 2\tilde{d} + f$.
The following simple lemma provides the strategy for Yannakakis’ method, which
we need to extend slightly by allowing restrictions to affine subspaces.
Lemma 5. Let $Q \subseteq \mathbb{R}^d$ be a subspace extension of the polytope $P \subseteq \mathbb{R}^m$ with vertex set $X \subseteq \mathbb{R}^m$, and let $s : X \to Q$ be a section for the extension. If $S \subseteq \mathbb{R}^m$ is an affine subspace, and, for some $X^\star \subseteq X \cap S$, the coefficients $c_x \in \mathbb{R}$ ($x \in X^\star$) yield an affine combination of a nonnegative vector

$$\sum_{x \in X^\star} c_x s(x) \ge 0_d \quad \text{with} \quad \sum_{x \in X^\star} c_x = 1, \qquad (3)$$

from the section images of the vertices in $X^\star$, then $\sum_{x \in X^\star} c_x x \in P \cap S$ holds.

Proof. Since $Q$ is a subspace extension, we obtain $\sum_{x \in X^\star} c_x s(x) \in Q$ from $s(x) \in Q$ (for all $x \in X^\star$): the affine combination stays in the affine subspace defining $Q$, and it is nonnegative by (3). Thus, if $p : \mathbb{R}^d \to \mathbb{R}^m$ is the projection of the extension, we derive

$$P \ni p\Bigl(\sum_{x \in X^\star} c_x s(x)\Bigr) = \sum_{x \in X^\star} c_x p(s(x)) = \sum_{x \in X^\star} c_x x. \qquad (4)$$

As $S$ is an affine subspace containing $X^\star$, we also have $\sum_{x \in X^\star} c_x x \in S$.

Due to Lemma 5 one can prove that subspace extensions of some polytope $P$ with certain properties do not exist by finding, for such a hypothetical extension, a subset $X^\star$ of vertices of $P$ and an affine subspace $S$ containing $X^\star$, for which one can construct coefficients $c_x \in \mathbb{R}$ satisfying (3) such that $\sum_{x \in X^\star} c_x x$ violates some inequality that is valid for $P \cap S$.
Actually, following Yannakakis, we will not apply Lemma 5 directly to a hy-
pothetical small weakly symmetric subspace extension, but we will rather first
construct another subspace extension from the one assumed to exist that is in-
dexed by some convenient family F . We say that an extension Q of a polytope P
is consistent with a family F of subsets of the vertex set X of P if there is a
section s : X → Q for the extension such that, for every component function sj
of s, there is a subfamily Fj of F such that sj is constant on every set in Fj , and
the sets in Fj partition X. In this situation, we also call the section s consistent
with F . The proof of the following lemma can be found in [12].
Lemma 6. If $P \subseteq \mathbb{R}^m$ is a polytope and $\mathcal{F}$ is a family of vertex sets of $P$ for which there is some extension $Q$ of $P$ that is consistent with $\mathcal{F}$, then there is some extension $Q'$ for $P$ that is indexed by $\mathcal{F}$. If $Q$ is a subspace extension, then $Q'$ can be chosen to be a subspace extension as well.
Lemmas 5 and 6 suggest the following strategy for proving that subspace extensions of some polytope $P$ with certain properties (e.g., being weakly symmetric and using at most $B$ variables) do not exist: (a) exhibit a family $\mathcal{F}$ of subsets of the vertex set $X$ of $P$ with which such an extension would be consistent, and (b) determine a subset $X^\star \subset X$ of vertices and an affine subspace $S$ containing $X^\star$, for which one can construct coefficients $c_x \in \mathbb{R}$ satisfying

$$\sum_{x \in X^\star} c_x 1_{\mathcal{F}}(x) \ge 0_{\mathcal{F}} \quad \text{with} \quad \sum_{x \in X^\star} c_x = 1, \qquad (5)$$

such that $\sum_{x \in X^\star} c_x x$ violates some inequality that is valid for $P \cap S$.
Let us finally investigate more closely the sections that come with weakly symmetric extensions. In particular, we will discuss an approach to find suitable families $\mathcal{F}$ within the strategy mentioned above in the following setting. Let $Q \subseteq \mathbb{R}^d$ be a weakly symmetric extension of the polytope $P \subseteq \mathbb{R}^m$ (with respect to an action of the group $G$ on the vertex set $X$ of $P$) along with a section $s : X \to Q$ such that for every $\pi \in G$ there is a permutation $\kappa_\pi \in S(d)$ that satisfies $s(\pi.x) = \kappa_\pi.s(x)$ for all $x \in X$ (with $(\kappa_\pi.s(x))_j = s_{\kappa_\pi^{-1}(j)}(x)$).
In this setting, we can define an action of $G$ on the set $\mathcal{S} = \{s_1, \dots, s_d\}$ of the component functions of the section $s : X \to Q$ with $\pi.s_j = s_{\kappa_{\pi^{-1}}^{-1}(j)} \in \mathcal{S}$ for each $j \in [d]$. In order to see that this action indeed is well-defined (note that $s_1, \dots, s_d$ need not be pairwise distinct functions) and yields a group action, observe that, for each $j \in [d]$ and $\pi \in G$, we have

$$(\pi.s_j)(x) = s_{\kappa_{\pi^{-1}}^{-1}(j)}(x) = (\kappa_{\pi^{-1}}.s(x))_j = s_j(\pi^{-1}.x) \quad \text{for all } x \in X, \qquad (6)$$

from which one deduces $1.s_j = s_j$ for the identity element $1$ of $G$ as well as $(\pi\pi').s_j = \pi.(\pi'.s_j)$ for all $\pi, \pi' \in G$. The isotropy group of $s_j \in \mathcal{S}$ under this action is $\operatorname{iso}_G(s_j) = \{\pi \in G : \pi.s_j = s_j\}$. From (6) one sees that, for all $x \in X$ and $\pi \in \operatorname{iso}_G(s_j)$, we have $s_j(x) = s_j(\pi^{-1}.x)$. Thus, $s_j$ is constant on every orbit of the action of the subgroup $\operatorname{iso}_G(s_j)$ of $G$ on $X$. We conclude the following.
Remark 1. In the setting described above, if F is a family of subsets of X such
that, for each j ∈ [d], there is a sub-family Fj partitioning X and consisting of
vertex sets each of which is contained in an orbit under the action of isoG (sj )
on X, then s is consistent with F .
In general, it will be impossible to identify the isotropy groups isoG (sj ) without
more knowledge on the section s. However, for each isotropy group isoG (sj ), one
can at least bound its index (G : isoG (sj )) in G.
Lemma 7. In the setting described above, we have (G : isoG (sj )) ≤ d .
Proof. This follows readily from the fact that the index (G : isoG (sj )) of the
isotropy group of the element sj ∈ S under the action of G on S equals the
cardinality of the orbit of sj under that action, which due to |S| ≤ d, clearly is
bounded from above by d.
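The fact used in this proof (the index of an isotropy group equals the size of the corresponding orbit) can be checked on a small example. The sketch below is only an illustration of that general fact: it lets $S(4)$ act on the edges of $K_4$ rather than on component functions of a section, and the helper `act` is a name chosen here, not notation from the paper.

```python
from itertools import permutations

n = 4
G = list(permutations(range(n)))      # S(n); pi[i] is the image of node i

def act(pi, edge):
    """Action on edges: pi.{v, w} = {pi(v), pi(w)}."""
    return frozenset(pi[v] for v in edge)

edge = frozenset((0, 1))
orbit = {act(pi, edge) for pi in G}                      # orbit of the edge under S(n)
stabilizer = [pi for pi in G if act(pi, edge) == edge]   # isotropy group of the edge
index = len(G) // len(stabilizer)                        # index of the isotropy group in G
```

Here the orbit is the whole edge set, so the index is bounded by the number of objects acted upon, which is exactly the shape of the bound in Lemma 7.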
The bound provided in Lemma 7 becomes useful if one is able to establish a statement like “if $\operatorname{iso}_G(s_j)$ has index less than $\tau$ in $G$ then it contains a certain subgroup $H_j$”. Choosing $\mathcal{F}_j$ as the family of orbits of $X$ under the action of the subgroup $H_j$ of $G$, the union $\mathcal{F} = \mathcal{F}_1 \cup \dots \cup \mathcal{F}_d$ is a family as in Remark 1. If this family (or any refinement of it) can be used to perform Step (b) in the strategy outlined in the paragraph right after the statement of Lemma 6, then one can conclude the lower bound $d \ge \tau$ on the number of variables $d$ in an extension as above.

4 Bounds on Symmetric Extensions of $P^{\ell}_{\mathrm{match}}(n)$


In this section, we use Yannakakis’ method described in Section 3 to prove the following result.
Theorem 1. For every $n \ge 3$ and odd $\ell$ with $\ell \le \frac{n}{2}$, there exists no weakly symmetric subspace extension for $P^{\ell}_{\mathrm{match}}(n)$ with at most $\binom{n}{(\ell-1)/2}$ variables (with respect to the group $S(n)$ acting via permuting the nodes of $K_n$ as described in the Introduction).
From Theorem 1, we can derive the following more general lower bounds. Since we need it in the proof of the next result, and also for later reference, we state a simple fact on binomial coefficients first.
Lemma 8. For each constant $b \in \mathbb{N}$ there is some constant $\beta > 0$ with $\binom{M-b}{N} \ge \beta \binom{M}{N}$ for all large enough $M \in \mathbb{N}$ and $N \le \frac{M}{2}$.
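One way to verify Lemma 8 is the following short computation. It is a sketch: the explicit constant $\beta = 4^{-b}$ and the threshold $M \ge 4b$ are choices made here for concreteness and are not taken from the paper.

```latex
\frac{\binom{M-b}{N}}{\binom{M}{N}}
  \;=\; \prod_{i=0}^{b-1} \frac{M-N-i}{M-i}
  \;\ge\; \Bigl(\frac{M/2-b+1}{M}\Bigr)^{b}
  \;\ge\; 4^{-b}
  \qquad \text{for } M \ge 4b,\ N \le M/2 .
```

Each factor is bounded using $M - N - i \ge M/2 - (b-1)$ and $M - i \le M$, and $M \ge 4b$ gives $M/2 - b + 1 > M/4$.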
Theorem 2. There is a constant $C > 0$ such that, for all $n$ and $1 \le \ell \le \frac{n}{2}$, the size of every extension for $P^{\ell}_{\mathrm{match}}(n)$ that is symmetric (with respect to the group $S(n)$ acting via permuting the nodes of $K_n$ as described in the Introduction) is bounded from below by $C \cdot \binom{n}{\lfloor (\ell-1)/2 \rfloor}$.

Proof. For odd $\ell$, this follows from Theorem 1 using Lemmas 1, 2, and 4. For even $\ell$, the polytope $P^{\ell-1}_{\mathrm{match}}(n-2)$ is (isomorphic to) a face of $P^{\ell}_{\mathrm{match}}(n)$ defined by $x_e = 1$ for an arbitrary edge $e$ of $K_n$. From this, as $\ell - 1$ is odd (and not larger than $(n-2)/2$) with $\lfloor (\ell-2)/2 \rfloor = \lfloor (\ell-1)/2 \rfloor$, and due to Lemma 8, the theorem follows by Lemma 3.
For even $n$ and $\ell = n/2$, Theorem 2 provides a similar bound to Yannakakis’ result (see Step 2 in the proof of [17, Theorem 1]) that no weakly symmetric subspace extension of the perfect matching polytope of $K_n$ has a number of variables that is bounded by $\binom{n}{k}$ for any $k < n/4$.
Theorem 2 in particular implies that the size of every symmetric extension for $P^{\ell}_{\mathrm{match}}(n)$ with $\Omega(\log n) \le \ell \le n/2$ is bounded from below by $n^{\Omega(\log n)}$, which has the following consequence.
Corollary 1. For $\Omega(\log n) \le \ell \le n/2$, there is no compact extended formulation for $P^{\ell}_{\mathrm{match}}(n)$ that is symmetric (with respect to the group $G = S(n)$ acting via permuting the nodes of $K_n$ as described in the Introduction).
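The step from the binomial bound of Theorem 2 to a superpolynomial bound can be made explicit. The following is a sketch under stated assumptions: $k = \lfloor (\ell-1)/2 \rfloor$ satisfies $c \log n \le k \le n/4$ for some constant $c > 0$, and $n$ is large enough that $k_0 = \lceil c \log n \rceil \le \sqrt{n}$. The symbols $c$ and $k_0$ are introduced here for illustration only.

```latex
\binom{n}{k} \;\ge\; \binom{n}{k_0}
  \;\ge\; \Bigl(\frac{n}{k_0}\Bigr)^{k_0}
  \;\ge\; \bigl(\sqrt{n}\bigr)^{k_0}
  \;=\; n^{k_0/2}
  \;=\; n^{\Omega(\log n)},
  \qquad k_0 = \lceil c \log n \rceil \le k \le \tfrac{n}{4}.
```

The first inequality uses that $\binom{n}{j}$ is nondecreasing in $j$ for $j \le n/2$; the second is the standard estimate $\binom{n}{j} \ge (n/j)^j$.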
The rest of this section is devoted to outlining the proof of Theorem 1. Throughout, with $\ell = 2k + 1$, we assume that $Q \subseteq \mathbb{R}^d$ with $d \le \binom{n}{k}$ is a weakly symmetric subspace extension of $P^{2k+1}_{\mathrm{match}}(n)$ for $4k + 2 \le n$. We will only consider the case $k \ge 1$, as for $\ell = 1$ the theorem trivially is true (note that we restrict to $n \ge 3$). Weak symmetry is meant with respect to the action of $G = S(n)$ on the set $X$ of vertices of $P^{2k+1}_{\mathrm{match}}(n)$ as described in the Introduction, and we assume $s : X \to Q$ to be a section as required in the definition of weak symmetry. Thus, we have $X = \{\chi(M) \in \{0,1\}^{E_n} : M \in \mathcal{M}^{2k+1}(n)\}$, where $\mathcal{M}^{2k+1}(n)$ is the set of all matchings $M \subseteq E_n$ with $|M| = 2k + 1$ in the complete graph $K_n = (V, E)$ (with $V = [n]$), and $(\pi.\chi(M))_{\{v,w\}} = \chi(M)_{\{\pi^{-1}(v), \pi^{-1}(w)\}}$ holds for all $\pi \in S(n)$, $M \in \mathcal{M}^{2k+1}(n)$, and $\{v, w\} \in E$.
In order to identify suitable subgroups of the isotropy groups $\operatorname{iso}_{S(n)}(s_j)$ (see the remarks at the end of Section 3), we use the following result on subgroups of the symmetric group $S(n)$, where $A(n) \subseteq S(n)$ is the alternating group formed by all even permutations of $[n]$. This result is Claim 2 in the proof of Thm. 1 of Yannakakis’ paper [17]. Its proof relies on a theorem of Bochert’s [3] stating that any subgroup of $S(m)$ that acts primitively on $[m]$ and does not contain $A(m)$ has index at least $\lfloor (m+1)/2 \rfloor!$ in $S(m)$ (see [16, Thm. 14.2]).
 
Lemma 9. For each subgroup $U$ of $S(n)$ with $(S(n) : U) \le \binom{n}{k}$ for $k < \frac{n}{4}$, there is a $W \subseteq [n]$ with $|W| \le k$ and $\{\pi \in A(n) : \pi(v) = v \text{ for all } v \in W\} \subseteq U$.

As we assumed $d \le \binom{n}{k}$ (with $k < \frac{n}{4}$ due to $4k + 2 \le n$), Lemmas 7 and 9 imply that, for each $j \in [d]$, there is a subset $V_j \subseteq V$ with $|V_j| \le k$ such that $H_j = \{\pi \in A(n) : \pi(v) = v \text{ for all } v \in V_j\} \subseteq \operatorname{iso}_{S(n)}(s_j)$. For each $j \in [d]$, two vertices $\chi(M)$ and $\chi(M')$ of $P^{2k+1}_{\mathrm{match}}(n)$ (with $M, M' \in \mathcal{M}^{2k+1}(n)$) are in the same orbit under the action of the group $H_j$ if and only if we have

$$M \cap E(V_j) = M' \cap E(V_j) \quad \text{and} \quad V_j \setminus V(M) = V_j \setminus V(M'), \qquad (7)$$

where $V(M)$ denotes the set of nodes covered by $M$. Indeed, it is clear that (7) holds if we have $\chi(M') = \pi.\chi(M)$ for some permutation $\pi \in H_j$. In turn, if (7) holds, then there clearly is some permutation $\pi \in S(n)$ with $\pi(v) = v$ for all $v \in V_j$ and $M' = \pi.M$. Due to $|M| = 2k + 1 > 2|V_j|$ there is some edge $\{u, w\} \in M$ with $u, w \notin V_j$. Denoting by $\tau \in S(n)$ the transposition of $u$ and $w$, we thus also have $\pi\tau(v) = v$ for all $v \in V_j$ and $M' = \pi\tau.M$. As one of the permutations $\pi$ and $\pi\tau$ is even, say $\pi'$, we find $\pi' \in H_j$ and $M' = \pi'.M$, proving that $M$ and $M'$ are contained in the same orbit under the action of $H_j$.
As it will be convenient for Step (b) (referring to the strategy described after the statement of Lemma 6), we will use the following refinements of the partitionings of $X$ into orbits of $H_j$ (as mentioned at the end of Section 3). Clearly, for $j \in [d]$ and $M, M' \in \mathcal{M}^{2k+1}(n)$,

$$M \setminus E(V \setminus V_j) = M' \setminus E(V \setminus V_j) \qquad (8)$$

implies (7). Thus, for each $j \in [d]$, the equivalence classes of the equivalence relation defined by (8) refine the partitioning of $X$ into orbits under $H_j$, and we may use the collection of all these equivalence classes (for all $j \in [d]$) as the family $\mathcal{F}$ in Remark 1. With

$$\Lambda = \{(A, B) : A \subseteq E \text{ matching and there is some } j \in [d] \text{ with } A \subseteq E \setminus E(V \setminus V_j),\ B = V_j \setminus V(A)\}$$

(with $V(A) = \bigcup_{a \in A} a$) we hence have $\mathcal{F} = \{F(A, B) : (A, B) \in \Lambda\}$, where

$$F(A, B) = \{\chi(M) : M \in \mathcal{M}^{2k+1}(n),\ A \subseteq M \subseteq E(V \setminus B)\}.$$

In order to construct a subset $X^\star \subseteq X$ which will be used to derive a contradiction as mentioned after Equation (5), we choose two arbitrary disjoint subsets $V_\star, V^\star \subset V$ of nodes with $|V_\star| = |V^\star| = 2k + 1$, and define $\mathcal{M}^\star = \{M \in \mathcal{M}^{2k+1}(n) : M \subseteq E(V_\star \cup V^\star)\}$ as well as $X^\star = \{\chi(M) : M \in \mathcal{M}^\star\}$. Thus, $\mathcal{M}^\star$ is the set of perfect matchings on $K(V_\star \cup V^\star)$. Clearly, $X^\star$ is contained in the affine subspace $S$ of $\mathbb{R}^E$ defined by $x_e = 0$ for all $e \in E \setminus E(V_\star \cup V^\star)$. In fact, $X^\star$ is the vertex set of the face $P^{2k+1}_{\mathrm{match}}(n) \cap S$ of $P^{2k+1}_{\mathrm{match}}(n)$, and for this face the inequality $x(V_\star : V^\star) \ge 1$ is valid (where $(V_\star : V^\star)$ is the set of all edges having one node in $V_\star$ and the other one in $V^\star$), since every matching $M \in \mathcal{M}^\star$ intersects $(V_\star : V^\star)$ in an odd number of edges. Therefore, in order to derive the desired contradiction, it suffices to find $c_x \in \mathbb{R}$ (for all $x \in X^\star$) with $\sum_{x \in X^\star} c_x = 1$, $\sum_{x \in X^\star} c_x \cdot 1_{\mathcal{F}}(x) \ge 0_{\mathcal{F}}$, and $\sum_{x \in X^\star} c_x \sum_{e \in (V_\star : V^\star)} x_e = 0$. For the details on how this can be done we refer to [12].
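The parity argument behind the valid inequality can be checked by brute force in the smallest case $k = 1$. The sketch below takes two disjoint node sets of size $2k + 1 = 3$ (called `lower` and `upper` here, names chosen only for this illustration), enumerates all perfect matchings on their union, and counts crossing edges. Since each side has an odd number of nodes and edges inside one side cover an even number of its nodes, an odd number of edges must cross.

```python
def perfect_matchings(nodes):
    """Yield all perfect matchings of the complete graph on `nodes` (a list of even length)."""
    if not nodes:
        yield []
        return
    first, rest = nodes[0], nodes[1:]
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for m in perfect_matchings(remaining):
            yield [(first, partner)] + m

k = 1
lower = list(range(0, 2 * k + 1))            # first node set, size 2k + 1
upper = list(range(2 * k + 1, 4 * k + 2))    # second node set, size 2k + 1
matchings = list(perfect_matchings(lower + upper))

def crossing(m):
    """Number of edges of m with one end in `lower` and the other in `upper`."""
    return sum((u in lower) != (v in lower) for u, v in m)

crossings = [crossing(m) for m in matchings]
all_odd = all(c % 2 == 1 for c in crossings)
```

In particular every such matching has at least one crossing edge, which is the inequality that the affine combination constructed in the proof is meant to violate.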

5 A Non-symmetric Extension for $P^{\ell}_{\mathrm{match}}(n)$


We shall establish the following result on the existence of extensions for cardinality restricted matching polytopes in this section.
Theorem 3. For all $n$ and $\ell$, there are extensions for $P^{\ell}_{\mathrm{match}}(n)$ whose sizes can be bounded by $2^{O(\ell)} n^2 \log n$ (and for which the encoding lengths of the coefficients needed to describe the extensions by linear systems can be bounded by a constant).
In particular, Theorem 3 implies the following, although, according to Corollary 1, no compact symmetric extended formulations exist for $P^{\ell}_{\mathrm{match}}(n)$ with $\ell = \Theta(\log n)$.
Corollary 2. For all $n$ and $\ell \le O(\log n)$, there are compact extended formulations for $P^{\ell}_{\mathrm{match}}(n)$.
The proof of Theorem 3 relies on the following result on the existence of small families of perfect-hash functions, which is from [1, Sect. 4]. Its proof is based on results from [8,15].
Theorem 4 (Alon, Yuster, Zwick [1]). There are maps $\phi_1, \dots, \phi_{q(n,r)} : [n] \to [r]$ with $q(n, r) \le 2^{O(r)} \log n$ such that, for every $W \subseteq [n]$ with $|W| = r$, there is some $i \in [q(n, r)]$ for which the map $\phi_i$ is bijective on $W$.
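To make the defining property of such a family concrete, the following Python sketch tests it by brute force on a tiny instance. It does not implement the construction of [1]: the hypothetical helper `random_family` simply adds random maps until the property holds, so it illustrates the definition only and says nothing about the size bound $2^{O(r)} \log n$. Nodes and colors are 0-based here.

```python
import random
from itertools import combinations

def is_perfect_hash_family(maps, n, r):
    """True iff every r-subset W of range(n) is mapped bijectively onto range(r) by some map."""
    return all(
        any(len({phi[w] for w in W}) == r for phi in maps)
        for W in combinations(range(n), r)
    )

def random_family(n, r, seed=0):
    """Illustration only: add random maps range(n) -> range(r) until the family is perfect."""
    rng = random.Random(seed)
    maps = []
    while not is_perfect_hash_family(maps, n, r):
        maps.append(tuple(rng.randrange(r) for _ in range(n)))
    return maps

n, r = 8, 3
family = random_family(n, r)
```

Because $|W| = r$ and the maps take values in a set of size $r$, being bijective on $W$ is the same as being injective on $W$, which is what the check tests.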
Furthermore, we will use the following two auxiliary results that can be derived from general results on polyhedral branching systems [11, see Cor. 3 and Sect. 4.4]. The first one (Lemma 10) provides a construction of an extension of a polytope that is specified as the convex hull of some polytopes of which extensions are already available. In fact, in this section it will be needed only for the case that these extensions are the polytopes themselves (this is a special case of a result of Balas’, see [2, Thm. 2.1]). However, we will face the slightly more general situation in our treatment of cycle polytopes in Section 6.
Lemma 10. If the polytopes $P_i \subseteq \mathbb{R}^m$ (for $i \in [q]$) have extensions $Q_i$ of size $s_i$, respectively, then $P = \operatorname{conv}(P_1 \cup \dots \cup P_q)$ has an extension of size $\sum_{i=1}^{q} (s_i + 2) + 1$.
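The second auxiliary result that we need deals with describing a 0/1-polytope that is obtained by splitting variables of a 0/1-polytope of which a linear description is already available.
Lemma 11. Let $\mathcal{S}$ be a set of subsets of $[t]$, $P = \operatorname{conv}\{\chi(S) \in \{0,1\}^t : S \in \mathcal{S}\} \subseteq \mathbb{R}^t$ the corresponding 0/1-polytope, $J = J(1) \uplus \dots \uplus J(t)$ a disjoint union of finite sets $J(i)$,

$$\mathcal{S}' = \{S' \subseteq J : \text{There is some } S \in \mathcal{S} \text{ with } |S' \cap J(i)| = 1 \text{ for all } i \in S,\ |S' \cap J(i)| = 0 \text{ for all } i \notin S\}, \qquad (9)$$

and $P' = \operatorname{conv}\{\chi(S') \in \{0,1\}^J : S' \in \mathcal{S}'\}$. If $P = \{y \in [0,1]^t : Ay \le b\}$ for some $A \in \mathbb{R}^{s \times t}$ and $b \in \mathbb{R}^s$, then

$$P' = \Bigl\{x \in [0,1]^J : \sum_{i=1}^{t} A_{h,i} \cdot \sum_{j \in J(i)} x_j \le b_h \text{ for all } h \in [s]\Bigr\}. \qquad (10)$$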

In order to prove Theorem 3, let $\phi_1, \dots, \phi_q$ be maps as guaranteed to exist by Theorem 4 with $r = 2\ell$ and $q = q(n, 2\ell) \le 2^{O(\ell)} \log n$, and denote $\mathcal{M}_i = \{M \in \mathcal{M}^{\ell}(n) : \phi_i \text{ is bijective on } V(M)\}$ for each $i \in [q]$. By Theorem 4, we have $\mathcal{M}^{\ell}(n) = \mathcal{M}_1 \cup \dots \cup \mathcal{M}_q$. Consequently,

$$P^{\ell}_{\mathrm{match}}(n) = \operatorname{conv}(P_1 \cup \dots \cup P_q) \qquad (11)$$

with $P_i = \operatorname{conv}\{\chi(M) : M \in \mathcal{M}_i\}$ for all $i \in [q]$, where we have

$$P_i = \{x \in \mathbb{R}^{E}_+ : x_{E \setminus E_i} = 0,\ x(\delta(\phi_i^{-1}(s))) = 1 \text{ for all } s \in [2\ell],\ x(E_i(\phi_i^{-1}(S))) \le (|S| - 1)/2 \text{ for all } S \subseteq [2\ell],\ |S| \text{ odd}\},$$

where $E_i = E \setminus \bigcup_{j \in [2\ell]} E(\phi_i^{-1}(j))$. This follows by Lemma 11 from Edmonds’ linear description (1) of the perfect matching polytope $P^{\ell}_{\mathrm{match}}(2\ell)$ of $K_{2\ell}$. As the sum of the number of variables and the number of inequalities in the description of $P_i$ is at most $2^{O(\ell)} + n^2$ (the summand $n^2$ comes from the nonnegativity constraints on $x \in \mathbb{R}^{E}_+$ and the constant in $O(\ell)$ is independent of $i$), we obtain an extension of $P^{\ell}_{\mathrm{match}}(n)$ of size $2^{O(\ell)} n^2 \log n$ by Lemma 10. This proves Theorem 3.

6 Extensions for Cycle Polytopes


By a modification of Yannakakis’ construction for the derivation of lower bounds on the sizes of symmetric extensions for traveling salesman polytopes from the corresponding lower bounds for matching polytopes [17, Thm. 2], we obtain lower bounds on the sizes of symmetric extensions for $P^{\ell}_{\mathrm{cycl}}(n)$. The lower bound $\ell \ge 42$ in the statement of the theorem (whose proof can be found in [12]) is convenient with respect to both formulating the bound and proving its validity.
Theorem 5. There is a constant $C' > 0$ such that, for all $n$ and $42 \le \ell \le n$, the size of every extension for $P^{\ell}_{\mathrm{cycl}}(n)$ that is symmetric (with respect to the group $S(n)$ acting via permuting the nodes of $K_n$ as described in the Introduction) is bounded from below by $C' \cdot \binom{\lfloor n/3 \rfloor}{\lfloor (\lfloor \ell/6 \rfloor - 1)/2 \rfloor}$.

Corollary 3. For $\Omega(\log n) \le \ell \le n$, there is no compact extended formulation for $P^{\ell}_{\mathrm{cycl}}(n)$ that is symmetric (with respect to the group $S(n)$ acting via permuting the nodes of $K_n$ as described in the Introduction).

On the other hand, if we drop the symmetry requirement, we find extensions of


the following size.
Theorem 6. For all $n$ and $\ell$, there are extensions for $P^{\ell}_{\mathrm{cycl}}(n)$ whose sizes can
be bounded by $2^{O(\ell)} n^3 \log n$ (and for which the encoding lengths of the coefficients
needed to describe the extensions by linear systems can be bounded by a constant).
Before we prove Theorem 6, we state a consequence that is similar to Corollary 1
for matching polytopes. It shows that, despite the non-existence of symmetric
extensions for the polytopes associated with cycles of length Θ(log n) (Corol-
lary 3), there are non-symmetric compact extensions of these polytopes.
Corollary 4. For all $n$ and $\ell \le O(\log n)$, there are compact extended formulations
for $P^{\ell}_{\mathrm{cycl}}(n)$.
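
To see how Corollary 4 follows from Theorem 6, one can substitute the bound on $\ell$ into the size bound. In the sketch below, $K$ and $c$ are hypothetical names for the constants hidden in $2^{O(\ell)}$ and in $\ell \le O(\log n)$, respectively.

```latex
\[
\ell \le c \log_2 n
\quad\Longrightarrow\quad
2^{K\ell}\, n^3 \log n \;\le\; 2^{Kc \log_2 n}\, n^3 \log n \;=\; n^{Kc + 3} \log n ,
\]
```

which is polynomial in $n$; the coefficients are of constant encoding length by Theorem 6.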
The rest of the section is devoted to proving Theorem 6, i.e., to constructing an
extension of $P^{\ell}_{\mathrm{cycl}}(n)$ whose size is bounded by $2^{O(\ell)} n^3 \log n$. We proceed similarly
to the proof of Theorem 3 (the construction of extensions for matching polytopes),
this time starting with maps $\varphi_1, \dots, \varphi_q$ as guaranteed to exist by Theorem 4
with $r = \ell$ and $q = q(n, \ell) \le 2^{O(\ell)} \log n$, and defining
$\mathcal{C}_i = \{C \in \mathcal{C}^{\ell}(n) : \varphi_i \text{ is bijective on } V(C)\}$ for each $i \in [q]$.
Thus, we have $\mathcal{C}^{\ell}(n) = \mathcal{C}_1 \cup \dots \cup \mathcal{C}_q$, and hence,
$P^{\ell}_{\mathrm{cycl}}(n) = \operatorname{conv}(P_1 \cup \dots \cup P_q)$ with $P_i = \operatorname{conv}\{\chi(C) : C \in \mathcal{C}_i\}$
for all $i \in [q]$. Due to Lemma 10, it suffices to exhibit, for each $i \in [q]$, an
extension of $P_i$ of size bounded by $O(2^{\ell} \cdot n^3)$ (with the constant independent
of $i$). Towards this end, let, for $i \in [q]$, $V_c = \varphi_i^{-1}(c)$ for all $c \in [\ell]$, and define
$P_i(v^\star) = \operatorname{conv}\{\chi(C) : C \in \mathcal{C}_i,\ v^\star \in V(C)\}$ for each $v^\star \in V_\ell$. Thus, we have
$P_i = \operatorname{conv} \bigcup_{v^\star \in V_\ell} P_i(v^\star)$, and hence, due to Lemma 10, it suffices to construct
extensions of the $P_i(v^\star)$ whose sizes are bounded by $O(2^{\ell} \cdot n^2)$.
In order to derive such extensions, define, for each $i \in [q]$ and $v^\star \in V_\ell$, a
directed acyclic graph $D$ with nodes $(A, v)$ for all $A \subseteq [\ell - 1]$ and $v \in \varphi_i^{-1}(A)$, as
well as two additional nodes $s$ and $t$, and arcs $(s, (\{\varphi_i(v)\}, v))$ and $(([\ell - 1], v), t)$
for all $v \in \varphi_i^{-1}([\ell - 1])$, as well as $((A, v), (A \cup \{\varphi_i(w)\}, w))$ for all $A \subseteq [\ell - 1]$,
$v \in \varphi_i^{-1}(A)$, and $w \in \varphi_i^{-1}([\ell - 1] \setminus A)$. This is basically the dynamic programming
digraph (using an idea going back to [10]) from the color-coding method
for finding paths of prescribed lengths described in [1]. Each $s$-$t$-path in $D$ corresponds
to a cycle in $\mathcal{C}_i$ that visits $v^\star$, and each such cycle, in turn, corresponds
to two $s$-$t$-paths in $D$ (one for each of the two directions of traversal).
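
The digraph $D$ can be built explicitly for a small instance. The following is a minimal Python sketch and not the authors' implementation: the function names (`build_dp_digraph`, `st_paths`, `cycle_of`), the representation of a coloring $\varphi_i$ as a dict from nodes to colors $1, \dots, \ell$, and the 6-node example are all assumptions made for illustration. It enumerates the $s$-$t$-paths and counts how often each cycle through $v^\star$ arises.

```python
from collections import Counter
from itertools import combinations


def build_dp_digraph(phi, ell):
    """Return successor lists of D for a coloring phi: node -> color in 1..ell.

    Nodes of D are "s", "t", and pairs (A, v) where A is a nonempty set of
    colors from {1, ..., ell - 1} and phi[v] lies in A.
    """
    colors = range(1, ell)
    low = [v for v in phi if phi[v] < ell]  # nodes colored from [ell - 1]
    succ = {"s": [], "t": []}
    for k in range(1, ell):
        for A in map(frozenset, combinations(colors, k)):
            for v in low:
                if phi[v] in A:
                    succ[(A, v)] = []
    full = frozenset(colors)
    for v in low:
        succ["s"].append((frozenset([phi[v]]), v))  # arc s -> ({phi(v)}, v)
        succ[(full, v)].append("t")                 # arc ([ell - 1], v) -> t
    for node in list(succ):
        if node in ("s", "t"):
            continue
        A, v = node
        for w in low:
            if phi[w] not in A:                     # color of w not used yet
                succ[node].append((A | {phi[w]}, w))
    return succ


def st_paths(succ):
    """Yield each s-t-path of D as the list of graph nodes it visits, in order."""
    stack = [("s", [])]
    while stack:
        node, visited = stack.pop()
        if node == "t":
            yield visited
            continue
        for nxt in succ[node]:
            stack.append((nxt, visited if nxt == "t" else visited + [nxt[1]]))


def cycle_of(path, v_star):
    """Edge set of the cycle obtained by closing the path through v_star."""
    closed = [v_star] + path + [v_star]
    return frozenset(frozenset(e) for e in zip(closed, closed[1:]))


# Hypothetical instance: 6 nodes, ell = 4, node 5 is the only node of color 4.
phi = {0: 1, 1: 1, 2: 2, 3: 3, 4: 3, 5: 4}
ell, v_star = 4, 5
D = build_dp_digraph(phi, ell)
paths = list(st_paths(D))
hits = Counter(cycle_of(p, v_star) for p in paths)
print(len(paths), len(hits), set(hits.values()))
```

On this instance the sketch finds 24 $s$-$t$-paths and 12 distinct cycles, each arising from exactly two paths, in line with the correspondence stated above. The node set of $D$ has at most $2^{\ell-1} n + 2$ elements, so the arc count stays within the $O(2^{\ell} \cdot n^2)$ budget.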
Defining $Q_i(v^\star)$ as the convex hull of the characteristic vectors of all $s$-$t$-paths
in $D$ in the arc space of $D$, we find that $P_i(v^\star)$ is the image of $Q_i(v^\star)$ under the
projection whose component function corresponding to the edge $\{v, w\}$ of $K_n$
is given by the sum of all arc variables corresponding to arcs $((A, v), (A', w))$
(for $A, A' \subseteq [\ell - 1]$) if $v^\star \notin \{v, w\}$,